In [None]:
-----World's Oldest Businesses Dataset------

BusinessFinancing.co.uk researched the oldest company that is still in 
business in (almost) every country and compiled the results into a dataset. 
In this project, you'll explore that dataset to see what they found.

This dataset contains three tables:
	1) businesses
	2) categories
	3) countries
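
For orientation, here is a rough sketch of how the three tables relate. The column names are taken from the combined output at the end of this notebook; the data types and key constraints are assumptions, so this is not the dataset's actual DDL.

```sql
-- Hypothetical schema sketch: column names match the joined output,
-- but every type and constraint here is an assumption.
CREATE TABLE countries (
    country_code TEXT PRIMARY KEY,
    country      TEXT,
    continent    TEXT
);

CREATE TABLE categories (
    category_code TEXT PRIMARY KEY,
    category      TEXT
);

CREATE TABLE businesses (
    business      TEXT,
    year_founded  INT,
    category_code TEXT REFERENCES categories (category_code),
    country_code  TEXT REFERENCES countries (country_code)
);
```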

----Questions answered using the World's Oldest Businesses Dataset:

1. What is the range of the founding years throughout the World?
2. How many businesses were founded before the year 1000?
3. Which businesses were founded before the year 1000?
4. What category of businesses are those that have lasted over a millennium?
5. What other industries constitute the oldest companies around the World, 
	and which industries are most common?
6. How old is the oldest business on each continent?
7. Which are the most common categories for the oldest businesses on each continent?
8. Which are the continent/category pairs with a high count i.e. >5?
9. Combine all three tables together to make viewing all columns easier.

In [2]:
%%sql 
postgresql:///oldestbusinesses

'Connected: @oldestbusinesses'

In [None]:
SELECT *
FROM countries;

SELECT * 
FROM businesses;

SELECT * 
FROM categories;

--Data cleaning

UPDATE categories
SET category = 'Cafes, Restaurants & Bars'
WHERE category_code = 'CAT4';

UPDATE categories
SET category = 'Distillers, Vineries, & Breweries'
WHERE category_code = 'CAT9';

--removing " from both ends of country names that start with "
--(TRIM without LEADING/TRAILING strips both ends)
SELECT country, TRIM('"' FROM country)
FROM countries
WHERE country LIKE '"%';

UPDATE countries
SET country = TRIM('"' FROM country)
WHERE country LIKE '"%';

--checking: this should now return no rows
SELECT country
FROM countries
WHERE country LIKE '"%';

--removing " and , from continent col. in countries
SELECT continent , REPLACE(REPLACE(continent, ',', ' '), '"', '')
FROM countries
WHERE continent LIKE '%"%,%';   -- this method worked

UPDATE countries
SET continent = REPLACE(REPLACE(continent, ',', ' '), '"', '')
WHERE continent LIKE '%"%,%';

--check for update: this should now return no rows
SELECT continent
FROM countries
WHERE continent LIKE '%"%,%';


--cleaning continents col.
SELECT CASE WHEN continent LIKE '%Asia%' THEN 'Asia'
            WHEN continent LIKE '%Africa%' THEN 'Africa'
            WHEN continent LIKE '%Europe%' THEN 'Europe'
            WHEN continent LIKE '%North America%' THEN 'North America'
            WHEN continent LIKE '%Oceania%' THEN 'Oceania'
            WHEN continent LIKE '%South America%' THEN 'South America'
            ELSE continent  -- leave unmatched values unchanged
       END AS continent
FROM countries;

UPDATE countries
SET continent = CASE WHEN continent LIKE '%Asia%' THEN 'Asia'
                     WHEN continent LIKE '%Africa%' THEN 'Africa'
                     WHEN continent LIKE '%Europe%' THEN 'Europe'
                     WHEN continent LIKE '%North America%' THEN 'North America'
                     WHEN continent LIKE '%Oceania%' THEN 'Oceania'
                     WHEN continent LIKE '%South America%' THEN 'South America'
                     ELSE continent  -- avoid overwriting unmatched values with NULL
                END;

SELECT DISTINCT continent
FROM countries;
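
-- A quick sanity check after the continent cleanup, sketched under the assumption
-- that every country should now carry one of the six continent names. A NULL or
-- an unexpected value in the result would point to a row the CASE expression missed.
-- The alias n_countries is only illustrative.

```sql
-- Count countries per cleaned continent value.
SELECT continent, COUNT(*) AS n_countries
FROM countries
GROUP BY continent
ORDER BY n_countries DESC;
```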

--SQL queries

-- 1. What is the range of the founding years throughout the World?

In [4]:
%%sql

SELECT MIN(CAST(year_founded AS int)) AS min, 
       MAX(CAST(year_founded AS int)) AS max
FROM businesses;


 * postgresql:///oldestbusinesses
1 rows affected.


min,max
578,1999


-- Answer: Founding years range from 578 (the oldest) to 1999 (the most recent).
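
-- The same two aggregates can also give the width of the range in a single value.
-- With the min and max reported above, that is 1999 - 578 = 1421 years.
-- This is a small sketch; the alias founding_year_span is only illustrative.

```sql
-- Difference between the latest and earliest founding year.
SELECT MAX(CAST(year_founded AS int)) - MIN(CAST(year_founded AS int)) AS founding_year_span
FROM businesses;
```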

--2. How many businesses were founded before the year 1000?

In [5]:
%%sql

SELECT COUNT(business)
FROM businesses
WHERE year_founded < 1000;

 * postgresql:///oldestbusinesses
1 rows affected.


count
6


--Answer: Six businesses were founded before the year 1000, so each has been operating for more than a millennium.

--3. Which businesses were founded before the year 1000?

In [6]:
%%sql

SELECT business, year_founded
FROM businesses
WHERE year_founded < 1000;

 * postgresql:///oldestbusinesses
6 rows affected.


business,year_founded
Kongō Gumi,578
St. Peter Stifts Kulinarium,803
The Royal Mint,886
Monnaie de Paris,864
Staffelter Hof Winery,862
Sean's Bar,900


--4. What category of businesses are those that have lasted over a millennium?

In [19]:
%%sql

SELECT business, category, country_code,
	   (EXTRACT(YEAR FROM NOW()) - year_founded) AS no_years_in_business
FROM categories AS cat
INNER JOIN businesses AS b
ON b.category_code = cat.category_code
WHERE year_founded < 1000
ORDER BY no_years_in_business DESC;

 * postgresql:///oldestbusinesses
6 rows affected.


business,category,country_code,no_years_in_business
Kongō Gumi,Construction,JPN,1446.0
St. Peter Stifts Kulinarium,"Cafes, Restaurants & Bars",AUT,1221.0
Staffelter Hof Winery,"Distillers, Vineries, & Breweries",DEU,1162.0
Monnaie de Paris,Manufacturing & Production,FRA,1160.0
The Royal Mint,Manufacturing & Production,GBR,1138.0
Sean's Bar,"Cafes, Restaurants & Bars",IRL,1124.0
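
-- To answer the category question directly, the six rows above can be grouped by
-- category. This sketch reuses the same join; n_businesses is an illustrative alias.
-- Going by the rows shown above, it should report two businesses each for
-- "Cafes, Restaurants & Bars" and "Manufacturing & Production", and one each for
-- Construction and "Distillers, Vineries, & Breweries".

```sql
-- Count pre-1000 businesses per category.
SELECT cat.category, COUNT(*) AS n_businesses
FROM categories AS cat
INNER JOIN businesses AS b
ON b.category_code = cat.category_code
WHERE b.year_founded < 1000
GROUP BY cat.category
ORDER BY n_businesses DESC, cat.category;
```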


--5. What other industries constitute the oldest companies around the World, 
--	and which industries are most common?

In [20]:
%%sql

SELECT category, COUNT(b.category_code) AS count
FROM categories AS cat
INNER JOIN businesses AS b
ON b.category_code = cat.category_code
GROUP BY cat.category
ORDER BY count DESC;

 * postgresql:///oldestbusinesses
19 rows affected.


category,count
Banking & Finance,37
"Distillers, Vineries, & Breweries",22
Aviation & Transport,19
Postal Service,16
Manufacturing & Production,15
Media,7
Agriculture,6
Food & Beverages,6
"Cafes, Restaurants & Bars",6
Retail,4
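
-- Raw counts are easier to compare as shares of the whole. The sketch below adds a
-- percentage column with a window function over the grouped result; the aliases
-- n_businesses and pct_of_total are illustrative and not part of the dataset.

```sql
-- SUM(COUNT(*)) OVER () totals the per-category counts across all groups,
-- so each row can be expressed as a percentage of all joined businesses.
SELECT cat.category,
       COUNT(*) AS n_businesses,
       ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 1) AS pct_of_total
FROM categories AS cat
INNER JOIN businesses AS b
ON b.category_code = cat.category_code
GROUP BY cat.category
ORDER BY n_businesses DESC;
```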


-- 6. How old is the oldest business on each continent?

In [21]:
%%sql

SELECT continent, MIN(CAST(year_founded AS int)) AS oldest_business
FROM countries AS c
INNER JOIN businesses AS b
ON b.country_code = c.country_code
GROUP BY continent
ORDER BY oldest_business;

 * postgresql:///oldestbusinesses
6 rows affected.


continent,oldest_business
Asia,578
Europe,803
North America,1534
South America,1565
Africa,1772
Oceania,1809
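
-- The query above returns the founding year only. As one possible extension, the
-- PostgreSQL-specific DISTINCT ON clause can keep the whole row of the oldest
-- business per continent, which also allows showing its name and its age.
-- The alias years_in_business is illustrative; if two businesses on a continent
-- share the earliest year, which one is returned is not determined by this sketch.

```sql
-- DISTINCT ON keeps the first row per continent according to ORDER BY,
-- i.e. the business with the smallest year_founded.
SELECT DISTINCT ON (c.continent)
       c.continent,
       b.business,
       b.year_founded,
       EXTRACT(YEAR FROM NOW()) - b.year_founded AS years_in_business
FROM countries AS c
INNER JOIN businesses AS b
ON b.country_code = c.country_code
ORDER BY c.continent, b.year_founded;
```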


--7. Which are the most common categories for the oldest businesses on each continent?

In [23]:
%%sql

SELECT continent, category,  COUNT(b.category_code) AS count
FROM countries AS c
INNER JOIN businesses AS b
ON c.country_code = b.country_code
INNER JOIN categories AS cat
ON cat.category_code = b.category_code
GROUP BY continent, category
ORDER BY continent, count DESC;

 * postgresql:///oldestbusinesses
56 rows affected.


continent,category,count
Africa,Banking & Finance,17
Africa,Aviation & Transport,10
Africa,Postal Service,9
Africa,Media,4
Africa,Agriculture,3
Africa,"Distillers, Vineries, & Breweries",3
Africa,Mining,1
Africa,Food & Beverages,1
Africa,Manufacturing & Production,1
Africa,Energy,1
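
-- The full listing has 56 rows, so the top category per continent has to be read
-- off by eye. A sketch of one way to return only the leaders is to rank the counts
-- within each continent. The CTE names counts and ranked, and the aliases
-- n_businesses and category_rank, are illustrative. RANK() keeps ties, so a
-- continent can appear more than once.

```sql
WITH counts AS (
    -- Same grouping as the query above.
    SELECT c.continent, cat.category, COUNT(*) AS n_businesses
    FROM countries AS c
    INNER JOIN businesses AS b
    ON c.country_code = b.country_code
    INNER JOIN categories AS cat
    ON cat.category_code = b.category_code
    GROUP BY c.continent, cat.category
),
ranked AS (
    -- Rank categories within each continent, highest count first.
    SELECT continent, category, n_businesses,
           RANK() OVER (PARTITION BY continent ORDER BY n_businesses DESC) AS category_rank
    FROM counts
)
SELECT continent, category, n_businesses
FROM ranked
WHERE category_rank = 1
ORDER BY continent;
```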


--8. Which are the continent/category pairs with a high count i.e. >5?

In [24]:
%%sql 

SELECT continent, category,  COUNT(b.category_code) AS count
FROM countries AS c
INNER JOIN businesses AS b
ON c.country_code = b.country_code
INNER JOIN categories AS cat
ON cat.category_code = b.category_code
GROUP BY continent, category
HAVING COUNT(b.category_code) > 5
ORDER BY count DESC;

 * postgresql:///oldestbusinesses
7 rows affected.


continent,category,count
Africa,Banking & Finance,17
Europe,"Distillers, Vineries, & Breweries",12
Africa,Aviation & Transport,10
Africa,Postal Service,9
Europe,Manufacturing & Production,8
Asia,Aviation & Transport,7
Asia,Banking & Finance,6


--9. Combine all three tables together to make viewing all columns easier.

In [25]:
%%sql 

SELECT *
FROM countries AS c
INNER JOIN businesses AS b
ON c.country_code = b.country_code
INNER JOIN categories AS cat
ON cat.category_code = b.category_code;

 * postgresql:///oldestbusinesses
163 rows affected.


country_code,country,continent,business,year_founded,category_code,country_code_1,category_code_1,category
AFG,Afghanistan,Asia,Spinzar Cotton Company,1930,CAT1,AFG,CAT1,Agriculture
ALB,Albania,Europe,ALBtelecom,1912,CAT18,ALB,CAT18,Telecommunications
AND,Andorra,Europe,Andbank,1930,CAT3,AND,CAT3,Banking & Finance
ARE,United Arab Emirates,Asia,Liwa Chemicals,1939,CAT12,ARE,CAT12,Manufacturing & Production
ARG,Argentina,South America,Bank of the Province of Buenos Aires,1822,CAT3,ARG,CAT3,Banking & Finance
ARM,Armenia,Asia,Yerevan Ararat Brandy-Wine-Vodka Factory,1877,CAT9,ARM,CAT9,"Distillers, Vineries, & Breweries"
AUS,Australia,Oceania,Australia Post,1809,CAT16,AUS,CAT16,Postal Service
AUT,Austria,Europe,St. Peter Stifts Kulinarium,803,CAT4,AUT,CAT4,"Cafes, Restaurants & Bars"
AZE,Azerbaijan,Asia,Azerbaijan Caspian Shipping Company,1858,CAT2,AZE,CAT2,Aviation & Transport
BDI,Burundi,Africa,Brarudi,1955,CAT9,BDI,CAT9,"Distillers, Vineries, & Breweries"
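
-- With SELECT * and ON, the join keys appear twice in the output (country_code_1
-- and category_code_1 above). A possible alternative, sketched here, is to join
-- with USING, which returns each join column only once. This assumes the key
-- columns have identical names in both tables, as the output above suggests.

```sql
-- USING merges the shared key columns, so country_code and category_code
-- each appear a single time in the result.
SELECT *
FROM countries
INNER JOIN businesses USING (country_code)
INNER JOIN categories USING (category_code);
```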
