# The oldest businesses in the world

## 1. The oldest businesses in the world

In [1]:
# Import all libraries required under their usual alias (for this work, only pandas will be required)
import pandas as pd

# Load the businesses.csv file as a DataFrame called businesses
businesses = pd.read_csv('Datasets/businesses.csv')

# Sort businesses from oldest to youngest
sorted_businesses = businesses.sort_values(by='year_founded', ascending=True)

# Display the first few lines of sorted_businesses
print(sorted_businesses.head())

                        business  year_founded category_code country_code
64                    Kongō Gumi           578          CAT6          JPN
94   St. Peter Stifts Kulinarium           803          CAT4          AUT
107        Staffelter Hof Winery           862          CAT9          DEU
106            Monnaie de Paris            864         CAT12          FRA
103               The Royal Mint           886         CAT12          GBR


The sorted output shows that Kongō Gumi, founded in 578, is the world's oldest continuously operating business.
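
When only the oldest few rows are needed, a full sort is not strictly required. The following is a minimal sketch, assuming a small sample DataFrame built from three of the rows shown above (the name `sample` is illustrative and not part of the notebook). It uses `nsmallest` and `idxmin` as alternatives to `sort_values(...).head()`.

```python
import pandas as pd

# Sample rows copied from the output above (not the full dataset)
sample = pd.DataFrame({
    "business": ["The Royal Mint", "Kongō Gumi", "Monnaie de Paris"],
    "year_founded": [886, 578, 864],
    "country_code": ["GBR", "JPN", "FRA"],
})

# nsmallest returns the n rows with the smallest values, ordered oldest first
oldest_two = sample.nsmallest(2, "year_founded")

# idxmin returns the index label of the single oldest row
oldest = sample.loc[sample["year_founded"].idxmin()]

print(oldest_two)
print(oldest["business"])
```

`nsmallest` suits a "top n" question, while `idxmin` suits the case where exactly one row is wanted.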

## 2. Oldest business on each continent

The next step is to find the oldest business on each continent. This requires the countries.csv dataset, which maps each country code to its country name and continent.

In [2]:
# Load countries.csv to a DataFrame
countries = pd.read_csv('Datasets/countries.csv')

# Merge sorted_businesses with countries
businesses_countries = sorted_businesses.merge(countries, how='inner', on='country_code')

print(businesses_countries['continent'].unique())

['Asia' 'Europe' 'North America' 'South America' 'Africa' 'Oceania']
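
To illustrate what the inner merge does, here is a small sketch under stated assumptions: the two sample DataFrames below are illustrative stand-ins for `businesses` and `countries`, with column names taken from the outputs in this notebook. The optional `validate` argument of `merge` is used as a sanity check that each country code appears only once on the right-hand side.

```python
import pandas as pd

# Illustrative stand-in for businesses (two rows from the output above)
businesses_sample = pd.DataFrame({
    "business": ["Kongō Gumi", "The Royal Mint"],
    "year_founded": [578, 886],
    "country_code": ["JPN", "GBR"],
})

# Illustrative stand-in for countries; Fiji has no matching business here
countries_sample = pd.DataFrame({
    "country_code": ["JPN", "GBR", "FJI"],
    "country": ["Japan", "United Kingdom", "Fiji"],
    "continent": ["Asia", "Europe", "Oceania"],
})

# Inner merge keeps only country codes present in both DataFrames.
# validate="many_to_one" raises an error if a code is duplicated on the right.
merged = businesses_sample.merge(
    countries_sample, on="country_code", how="inner", validate="many_to_one"
)

print(merged)
```

Because the merge is an inner join, the Fiji row is dropped: it has no partner in the businesses sample.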


In [3]:
# Filter businesses_countries to include countries in Asia
asia = businesses_countries[businesses_countries['continent'] == 'Asia']
asia.head()

,business,year_founded,category_code,country_code,country,continent
0,Kongō Gumi,578,CAT6,JPN,Japan,Asia
9,Ma Yu Ching's Bucket Chicken House,1153,CAT4,CHN,China,Asia
24,Çemberlitaş Hamamı,1584,CAT19,TUR,Turkey,Asia
38,Wadia Group,1736,CAT12,IND,India,Asia
44,Pos Malaysia,1800,CAT16,MYS,Malaysia,Asia


In [4]:
# Filter businesses_countries to include countries in Europe
europe = businesses_countries[businesses_countries['continent'] == 'Europe']
europe.head()

,business,year_founded,category_code,country_code,country,continent
1,St. Peter Stifts Kulinarium,803,CAT4,AUT,Austria,Europe
2,Staffelter Hof Winery,862,CAT9,DEU,Germany,Europe
3,Monnaie de Paris,864,CAT12,FRA,France,Europe
4,The Royal Mint,886,CAT12,GBR,United Kingdom,Europe
5,Sean's Bar,900,CAT4,IRL,Ireland,Europe


In [5]:
# Filter businesses_countries to include countries in North America only
north_america = businesses_countries[businesses_countries['continent'] == 'North America']
north_america.head()

,business,year_founded,category_code,country_code,country,continent
22,La Casa de Moneda de México,1534,CAT12,MEX,Mexico,North America
28,Shirley Plantation,1638,CAT1,USA,United States,North America
33,Hudson's Bay Company,1670,CAT17,CAN,Canada,North America
35,Mount Gay Rum,1703,CAT9,BRB,Barbados,North America
40,Rose Hall,1770,CAT19,JAM,Jamaica,North America


In [6]:
# Filter businesses_countries to include countries in South America only
south_america = businesses_countries[businesses_countries['continent'] == 'South America']
south_america.head()

,business,year_founded,category_code,country_code,country,continent
23,Casa Nacional de Moneda,1565,CAT3,PER,Peru,South America
27,Casa de Moneda de Colombia,1621,CAT12,COL,Colombia,South America
31,Hacienda Chuao,1660,CAT11,VEN,"Venezuela, Bolivarian Republic of",South America
34,Casa da Moeda do Brasil,1694,CAT12,BRA,Brazil,South America
47,Famae,1811,CAT8,CHL,Chile,South America


In [7]:
# Filter businesses_countries to include countries in Africa
africa = businesses_countries[businesses_countries['continent'] == 'Africa']
africa.head()

,business,year_founded,category_code,country_code,country,continent
41,Mauritius Post,1772,CAT16,MUS,Mauritius,Africa
48,NamPost,1814,CAT16,NAM,Namibia,Africa
50,Premier FMCG,1820,CAT12,ZAF,South Africa,Africa
60,La Poste Tunisienne,1847,CAT16,TUN,Tunisia,Africa
61,Correios de Cabo Verde,1849,CAT16,CPV,Cabo Verde,Africa


In [8]:
# Filter businesses_countries to include countries in Oceania
oceania = businesses_countries[businesses_countries['continent'] == 'Oceania']
oceania.head()

,business,year_founded,category_code,country_code,country,continent
46,Australia Post,1809,CAT16,AUS,Australia,Oceania
65,Bank of New Zealand,1861,CAT3,NZL,New Zealand,Oceania
159,European Trust Company,1991,CAT3,VUT,Vanuatu,Oceania


Because businesses_countries keeps the oldest-first order of sorted_businesses, the first row of each filtered DataFrame is the oldest business on that continent: Kongō Gumi (578) in Asia, St. Peter Stifts Kulinarium (803) in Europe, La Casa de Moneda de México (1534) in North America, Casa Nacional de Moneda (1565) in South America, Mauritius Post (1772) in Africa, and Australia Post (1809) in Oceania.

Filtering one continent at a time works, but it repeats the same code six times. Grouping by continent and merging the result back gives the same information in a single cell.

In [9]:
# Create continent, which lists only the continent and oldest year_founded
continent = businesses_countries.groupby('continent').agg({'year_founded':'min'})

# Merge continent with businesses_countries
merged_continent = continent.merge(businesses_countries, on=["continent", "year_founded"])

# Subset merged_continent so that only the four columns of interest are included
subset_merged_continent = merged_continent[['continent', 'country', 'business', 'year_founded']]
subset_merged_continent

,continent,country,business,year_founded
0,Africa,Mauritius,Mauritius Post,1772
1,Asia,Japan,Kongō Gumi,578
2,Europe,Austria,St. Peter Stifts Kulinarium,803
3,North America,Mexico,La Casa de Moneda de México,1534
4,Oceania,Australia,Australia Post,1809
5,South America,Peru,Casa Nacional de Moneda,1565
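
The groupby-then-merge pattern can also be written with `idxmin`, which avoids the second merge. This is a sketch on an illustrative sample built from rows shown earlier in this section, not a replacement for the notebook's method. One difference worth noting: `idxmin` returns a single row per group, whereas the merge approach returns every row that ties for the minimum year.

```python
import pandas as pd

# Illustrative sample built from rows shown in the outputs above
sample = pd.DataFrame({
    "business": [
        "Kongō Gumi", "Ma Yu Ching's Bucket Chicken House",
        "St. Peter Stifts Kulinarium", "Staffelter Hof Winery",
        "Australia Post", "Bank of New Zealand",
    ],
    "year_founded": [578, 1153, 803, 862, 1809, 1861],
    "country": ["Japan", "China", "Austria", "Germany", "Australia", "New Zealand"],
    "continent": ["Asia", "Asia", "Europe", "Europe", "Oceania", "Oceania"],
})

# For each continent, find the index label of the row with the earliest year
oldest_idx = sample.groupby("continent")["year_founded"].idxmin()

# Use those labels to pull the full rows, keeping the four columns of interest
oldest_per_continent = sample.loc[
    oldest_idx, ["continent", "country", "business", "year_founded"]
]

print(oldest_per_continent)
```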


## 3. Unknown oldest business

The provided datasets can also be used to check whether every country has an oldest business recorded.

In [10]:
# Use .merge() to create a DataFrame, all_countries
all_countries = businesses.merge(countries, on='country_code', how='right', indicator=True)

# Filter to include only countries without oldest businesses
missing_countries = all_countries[all_countries['_merge'] != 'both']

# Create a series of the country names with missing oldest business data
missing_countries_series = missing_countries['country']

# Display the series
missing_countries_series

1                                Angola
7                   Antigua and Barbuda
18                              Bahamas
48                   Dominican Republic
50                              Ecuador
57                                 Fiji
59      Micronesia, Federated States of
63                                Ghana
65                               Gambia
69                              Grenada
79            Iran, Islamic Republic of
89                           Kyrgyzstan
91                             Kiribati
92                Saint Kitts and Nevis
107                              Monaco
108                Moldova, Republic of
110                            Maldives
112                    Marshall Islands
131                               Nauru
138                               Palau
139                    Papua New Guinea
143                            Paraguay
144                 Palestine, State of
153                     Solomon Islands
160                            Suriname


In [11]:
missing_countries_series.count()

32

This analysis shows that 32 countries have no oldest business in the dataset, because BusinessFinancing.co.uk was not able to determine one.
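
The same "countries without a match" question can be answered without a merge, using `isin` as an anti-join. The sketch below assumes two small illustrative DataFrames whose column names follow the notebook. The sample rows are chosen only to show the mechanics.

```python
import pandas as pd

# Illustrative stand-in for businesses
businesses_sample = pd.DataFrame({
    "business": ["Kongō Gumi", "St. Peter Stifts Kulinarium"],
    "country_code": ["JPN", "AUT"],
})

# Illustrative stand-in for countries; Angola and Fiji have no business row
countries_sample = pd.DataFrame({
    "country_code": ["JPN", "AUT", "AGO", "FJI"],
    "country": ["Japan", "Austria", "Angola", "Fiji"],
})

# True where the country's code appears in the businesses sample
has_business = countries_sample["country_code"].isin(businesses_sample["country_code"])

# Negate the mask to keep only countries without a recorded business
missing = countries_sample.loc[~has_business, "country"]

print(missing.tolist())
print(missing.count())
```

The merge with `indicator=True` is more informative when rows can be missing on either side. The `isin` mask is shorter when only one direction matters.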

## 4. Adding new oldest business data

Some countries lack their oldest business in this dataset. A second file, new_businesses.csv, supplies some of the missing entries, so the two datasets can be combined and the check repeated.

In [12]:
# Import new_businesses.csv
new_businesses = pd.read_csv('Datasets/new_businesses.csv')

# Add the data in new_businesses to the existing businesses
all_businesses = pd.concat([new_businesses, businesses])

# Merge and filter to find countries with missing business data
new_all_countries = all_businesses.merge(countries, how='outer', on='country_code', indicator=True)
new_missing_countries = new_all_countries[new_all_countries['_merge'] != 'both']

# Group by continent and create a "count_missing" column
count_missing = new_missing_countries.groupby('continent').agg({'country':'count'})
count_missing.columns = ['count_missing']
count_missing

continent,count_missing
Africa,3
Asia,7
Europe,2
North America,5
Oceania,10
South America,3
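
One detail of `pd.concat` is worth keeping in mind when stacking two DataFrames as above: by default each input keeps its own index labels, so the combined index can contain duplicates. The sketch below is illustrative only. The row in `additions` is a hypothetical placeholder, not data from new_businesses.csv, and `ignore_index=True` is shown as an option rather than something the notebook requires.

```python
import pandas as pd

# Two rows from the businesses output shown earlier
existing = pd.DataFrame({
    "business": ["Kongō Gumi", "The Royal Mint"],
    "country_code": ["JPN", "GBR"],
})

# Hypothetical placeholder row; the name and code are NOT real data
additions = pd.DataFrame({
    "business": ["Placeholder Business A"],
    "country_code": ["AAA"],
})

# Default behaviour: original index labels are kept, so label 0 appears twice
stacked = pd.concat([additions, existing])

# ignore_index=True discards the old labels and renumbers from 0
reindexed = pd.concat([additions, existing], ignore_index=True)

print(list(stacked.index))
print(list(reindexed.index))
```

Duplicate labels do not affect the merge on `country_code`, but they can cause surprises with label-based lookups such as `.loc`.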


## 5. The oldest industries

With the category information, it is possible to count how many of the oldest businesses fall into each industry category.

In [13]:
# Import categories.csv and merge to businesses
categories = pd.read_csv("Datasets/categories.csv")
businesses_categories = businesses.merge(categories, on='category_code')

# Create a DataFrame which lists the number of oldest businesses in each category
count_business_cats = businesses_categories.groupby('category').agg({'category':'count'})

# Rename column and display the first five rows of the DataFrame
count_business_cats.columns = ['count']
display(count_business_cats.head())

category,count
Agriculture,6
Aviation & Transport,19
Banking & Finance,37
"Cafés, Restaurants & Bars",6
Conglomerate,3
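
A per-category tally like the one above can also be produced with `value_counts` or `groupby(...).size()`. The sketch below uses an illustrative sample keyed on `category_code` (the codes are taken from the outputs in this notebook), since the category names for most of these businesses are not shown here.

```python
import pandas as pd

# Illustrative sample: businesses and category codes from the outputs above
sample = pd.DataFrame({
    "business": [
        "Kongō Gumi", "St. Peter Stifts Kulinarium", "Staffelter Hof Winery",
        "Monnaie de Paris", "The Royal Mint", "Sean's Bar",
    ],
    "category_code": ["CAT6", "CAT4", "CAT9", "CAT12", "CAT12", "CAT4"],
})

# value_counts tallies each distinct value, most frequent first
counts = sample["category_code"].value_counts()

# groupby(...).size() gives the same tallies, ordered by group key
sizes = sample.groupby("category_code").size()

print(counts)
print(sizes)
```

Both return a Series rather than a DataFrame. Calling `.to_frame("count")` on either converts it if a DataFrame is preferred.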


## 6. Restaurant representation

This section finds which cafés, restaurants, and bars have been in business since before 1800.

In [14]:
# Filter using .query() for CAT4 businesses founded before 1800
old_restaurants = businesses_categories.query('category_code == "CAT4" and year_founded < 1800')

# Sort the DataFrame
old_restaurants = old_restaurants.sort_values('year_founded', ascending=True)
old_restaurants

,business,year_founded,category_code,country_code,category
142,St. Peter Stifts Kulinarium,803,CAT4,AUT,"Cafés, Restaurants & Bars"
143,Sean's Bar,900,CAT4,IRL,"Cafés, Restaurants & Bars"
139,Ma Yu Ching's Bucket Chicken House,1153,CAT4,CHN,"Cafés, Restaurants & Bars"


## 7. Categories and continents

This part finds the founding year of the oldest business in each category of commerce on each continent.

In [15]:
# Merge all businesses, countries, and categories together
businesses_categories_countries = businesses.merge(categories, on='category_code').merge(countries, on='country_code')

# Sort businesses_categories_countries from oldest to most recent
businesses_categories_countries = businesses_categories_countries.sort_values('year_founded', ascending=True)

# Create the oldest by continent and category DataFrame
oldest_by_continent_category = businesses_categories_countries.groupby(['continent', 'category']).agg({'year_founded':'min'})
oldest_by_continent_category.head()

continent,category,year_founded
Africa,Agriculture,1947
Africa,Aviation & Transport,1854
Africa,Banking & Finance,1892
Africa,"Distillers, Vintners, & Breweries",1933
Africa,Energy,1968
