# What and Where are the World's Oldest Businesses?

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adamelliotfields/datacamp/blob/main/notebooks/projects/worlds_oldest_businesses_python/notebook.ipynb)

<figure>
  <img
    src="MKn_Staffelter_Hof.jpg"
    alt="Staffelter Hof"
    width="480"
  />
  <figcaption style="font-size:0.625em">This is Staffelter Hof Winery, Germany's oldest business, which was established in 862 under the Carolingian dynasty. It has continued to serve customers through dramatic changes in Europe such as the Holy Roman Empire, the Ottoman Empire, and both world wars. Image credit:&nbsp;<a href="https://commons.wikimedia.org/wiki/User:Martin_Kraft" target="_blank" rel="noopener noreferrer">Martin Kraft</a>.</figcaption>
</figure>

_What characteristics enable a business to stand the test of time?_

To help answer this question, BusinessFinancing.co.uk [researched](https://businessfinancing.co.uk/the-oldest-company-in-almost-every-country) the oldest company that is still in business in almost every country and compiled the results into a dataset. Let's explore this work to better understand these historic businesses.

**Contents**
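1. [The oldest businesses in the world](#The-oldest-businesses-in-the-world)
2. [The oldest businesses in North America](#The-oldest-businesses-in-North-America)
3. [The oldest business on each continent](#The-oldest-business-on-each-continent)
4. [Unknown oldest businesses](#Unknown-oldest-businesses)
5. [Adding new business data](#Adding-new-business-data)
6. [The oldest industries](#The-oldest-industries)
7. [Restaurant representation](#Restaurant-representation)
8. [Categories and continents](#Categories-and-continents)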

In [1]:
import pandas as pd

BASE_URL = "https://raw.githubusercontent.com/adamelliotfields/datacamp/main/notebooks/projects/worlds_oldest_businesses_python/"

pd.set_option("display.width", 200)

businesses = pd.read_pickle(BASE_URL + "businesses.pkl")
categories = pd.read_pickle(BASE_URL + "categories.pkl")
countries = pd.read_pickle(BASE_URL + "countries.pkl")

In [2]:
display(businesses.head())

Unnamed: 0,business,year_founded,category_code,country_code
0,Hamoud Boualem,1878,CAT11,DZA
1,Communauté Électrique du Bénin,1968,CAT10,BEN
2,Botswana Meat Commission,1965,CAT1,BWA
3,Air Burkina,1967,CAT2,BFA
4,Brarudi,1955,CAT9,BDI


In [3]:
display(categories.head())

Unnamed: 0,category_code,category
0,CAT1,Agriculture
1,CAT2,Aviation & Transport
2,CAT3,Banking & Finance
3,CAT4,"Cafés, Restaurants & Bars"
4,CAT5,Conglomerate


In [4]:
display(countries.head())

Unnamed: 0,country_code,country,continent
0,AFG,Afghanistan,Asia
1,AGO,Angola,Africa
2,ALB,Albania,Europe
3,AND,Andorra,Europe
4,ARE,United Arab Emirates,Asia


## The oldest businesses in the world

Now let's learn about some of the world's oldest businesses still in operation!

In [5]:
# sort businesses oldest first
sorted_businesses = businesses.sort_values(by="year_founded", ascending=True)
display(sorted_businesses.head())

Unnamed: 0,business,year_founded,category_code,country_code
64,Kongō Gumi,578,CAT6,JPN
94,St. Peter Stifts Kulinarium,803,CAT4,AUT
107,Staffelter Hof Winery,862,CAT9,DEU
106,Monnaie de Paris,864,CAT12,FRA
103,The Royal Mint,886,CAT12,GBR


## The oldest businesses in North America

So far we've learned that Kongō Gumi is the world's oldest continuously operating business, beating out the second oldest business by more than 200 years! It's a little hard to read the country codes, though. Wouldn't it be nice if we had a list of country names to go along with the country codes?

In [6]:
# merge sorted_businesses with countries
businesses_countries = sorted_businesses.merge(countries, on="country_code")

# only North America
north_america = businesses_countries[businesses_countries["continent"] == "North America"]
display(north_america.head())

Unnamed: 0,business,year_founded,category_code,country_code,country,continent
22,La Casa de Moneda de México,1534,CAT12,MEX,Mexico,North America
28,Shirley Plantation,1638,CAT1,USA,United States,North America
33,Hudson's Bay Company,1670,CAT17,CAN,Canada,North America
35,Mount Gay Rum,1703,CAT9,BRB,Barbados,North America
40,Rose Hall,1770,CAT19,JAM,Jamaica,North America


## The oldest business on each continent

Now we can see that the oldest company in North America is La Casa de Moneda de México, founded in 1534. Why stop there, though, when we could easily find out the oldest business on every continent?

In [7]:
# get the oldest `year_founded` for each continent
continent = businesses_countries.groupby("continent").agg({"year_founded": "min"})

# merge on `continent` and `year_founded`
merged_continent = continent.merge(businesses_countries, on=["continent", "year_founded"])

# subset only the `continent`, `country`, `business`, and `year_founded` columns
subset_merged_continent = merged_continent[["continent", "country", "business", "year_founded"]]
display(subset_merged_continent)

Unnamed: 0,continent,country,business,year_founded
0,Africa,Mauritius,Mauritius Post,1772
1,Asia,Japan,Kongō Gumi,578
2,Europe,Austria,St. Peter Stifts Kulinarium,803
3,North America,Mexico,La Casa de Moneda de México,1534
4,Oceania,Australia,Australia Post,1809
5,South America,Peru,Casa Nacional de Moneda,1565
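
As an aside, the same "oldest per group" lookup can be sketched without the second merge by using `groupby(...).idxmin()` and then selecting those rows by label. This is only an illustration on a small hand-built sample (the `sample` DataFrame below is an assumption copied from the outputs above, not the full dataset). Note one difference: if two businesses on a continent share the earliest year, the merge keeps both, while `idxmin` keeps only the first.

```python
import pandas as pd

# hand-built stand-in for `businesses_countries` (rows copied from earlier outputs)
sample = pd.DataFrame(
    {
        "business": [
            "Kongō Gumi",
            "St. Peter Stifts Kulinarium",
            "Staffelter Hof Winery",
            "La Casa de Moneda de México",
            "Shirley Plantation",
        ],
        "year_founded": [578, 803, 862, 1534, 1638],
        "country": ["Japan", "Austria", "Germany", "Mexico", "United States"],
        "continent": ["Asia", "Europe", "Europe", "North America", "North America"],
    }
)

# for each continent, get the row label of the smallest `year_founded`
oldest_idx = sample.groupby("continent")["year_founded"].idxmin()

# select the full rows by label and keep the columns of interest
oldest_per_continent = sample.loc[oldest_idx, ["continent", "country", "business", "year_founded"]]
print(oldest_per_continent)
```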


## Unknown oldest businesses

BusinessFinancing.co.uk wasn't able to determine the oldest business for some countries, and those countries are simply left out of `businesses`. However, the `countries` DataFrame that we loaded *does* include all countries in the world, regardless of whether the oldest business is known.

In [8]:
all_countries = businesses.merge(countries, how="right", on="country_code", indicator=True)
missing_countries = all_countries[all_countries["_merge"] != "both"]
missing_countries_series = missing_countries["country"]
print(missing_countries_series.head())

1                  Angola
7     Antigua and Barbuda
18                Bahamas
48     Dominican Republic
50                Ecuador
Name: country, dtype: object
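
To make the `indicator=True` trick a little more concrete, here is a minimal sketch on two tiny hand-built frames (`left` and `right` are illustrative stand-ins, not the real data). With a right join, every row of the right frame is kept, and the `_merge` column records whether a match was found in the left frame: `"both"` when it was, `"right_only"` when it wasn't.

```python
import pandas as pd

# illustrative stand-ins for `businesses` and `countries`
left = pd.DataFrame(
    {
        "country_code": ["JPN", "AUT"],
        "business": ["Kongō Gumi", "St. Peter Stifts Kulinarium"],
    }
)
right = pd.DataFrame(
    {
        "country_code": ["AGO", "AUT", "JPN"],
        "country": ["Angola", "Austria", "Japan"],
    }
)

# right join keeps every country; `_merge` flags where each row came from
merged = left.merge(right, how="right", on="country_code", indicator=True)

# countries with no matching business are flagged "right_only"
missing = merged.loc[merged["_merge"] == "right_only", "country"]
print(merged)
print(missing.tolist())
```

Because this is a right join, `_merge` can never be `"left_only"` here, which is why filtering on `!= "both"` in the cell above gives the same result as filtering on `== "right_only"`.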


## Adding new business data

It looks like we've got some holes in our dataset! Fortunately, we've taken it upon ourselves to improve upon BusinessFinancing.co.uk's work and find the oldest businesses in a few of the missing countries.

In [9]:
new_businesses = pd.DataFrame(
    {
        "business": ["Fiji Times", "J. Armando Bermúdez & Co."],
        "year_founded": [1869, 1852],
        "category_code": ["CAT13", "CAT9"],
        "country_code": ["FJI", "DOM"],
    }
)

# stack vertically (using concat)
all_businesses = pd.concat([new_businesses, businesses], axis=0)

# merge with countries and find missing
new_all_countries = all_businesses.merge(countries, how="right", on="country_code", indicator=True)
new_missing_countries = new_all_countries[new_all_countries["_merge"] != "both"]

# group by continent and create a `count_missing` column
count_missing = new_missing_countries.groupby("continent").agg({"continent": "count"})
count_missing.columns = ["count_missing"]
display(count_missing)

continent,count_missing
Africa,3
Asia,7
Europe,2
North America,5
Oceania,10
South America,3
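
If the `agg({"continent": "count"})` step followed by a column rename feels roundabout, one possible alternative is `groupby(...).size()`, which counts rows per group and can be named in a single step. The sketch below uses a small hand-built `missing_sample` frame (an assumption for illustration, using a few of the missing countries printed earlier), so the counts will not match the full table above.

```python
import pandas as pd

# illustrative sample of countries without a known oldest business
missing_sample = pd.DataFrame(
    {
        "country": ["Angola", "Bahamas", "Ecuador", "Antigua and Barbuda"],
        "continent": ["Africa", "North America", "South America", "North America"],
    }
)

# size() counts rows per continent; to_frame() names the resulting column
count_missing_sample = missing_sample.groupby("continent").size().to_frame("count_missing")
print(count_missing_sample)
```

Keep in mind that `size()` counts every row in the group, while `count` skips missing values in the counted column. Since the grouping key itself is being counted, the two should agree here.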


## The oldest industries

Remember our oldest business in the world, Kongō Gumi?

We know Kongō Gumi was founded in the year 578 in Japan, but it's a little hard to decipher which industry it's in. Let's use `categories` to understand how many oldest businesses are in each category of industry.

In [10]:
# merge businesses and categories (default join is "inner")
businesses_categories = businesses.merge(categories, on="category_code")

# count the businesses in each category
count_business_cats = businesses_categories.groupby("category").agg({"business": "count"})

# rename columns
count_business_cats.columns = ["count"]
display(count_business_cats.head())

category,count
Agriculture,6
Aviation & Transport,19
Banking & Finance,37
"Cafés, Restaurants & Bars",6
Conglomerate,3


## Restaurant representation

Judging by these counts, Banking & Finance looks like an excellent industry to be in if longevity is our goal! Let's zoom in on another industry: cafés, restaurants, and bars. Which restaurants in our dataset have been around since before the year 1800?

In [11]:
# query for CAT4 businesses founded before 1800 (sort oldest first)
old_restaurants = businesses_categories.query('year_founded < 1800 and category_code == "CAT4"')
old_restaurants = old_restaurants.sort_values("year_founded")
display(old_restaurants)

Unnamed: 0,business,year_founded,category_code,country_code,category
142,St. Peter Stifts Kulinarium,803,CAT4,AUT,"Cafés, Restaurants & Bars"
143,Sean's Bar,900,CAT4,IRL,"Cafés, Restaurants & Bars"
139,Ma Yu Ching's Bucket Chicken House,1153,CAT4,CHN,"Cafés, Restaurants & Bars"
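
For anyone less familiar with `query`, the same filter can be written as a boolean mask. The sketch below runs both forms on a small hand-built `sample` frame and checks that they agree. The first three rows are copied from earlier outputs, and "Hypothetical Café" is a made-up row added only so that something gets filtered out by year.

```python
import pandas as pd

# small illustrative sample; "Hypothetical Café" is invented for this example
sample = pd.DataFrame(
    {
        "business": [
            "St. Peter Stifts Kulinarium",
            "Sean's Bar",
            "Staffelter Hof Winery",
            "Hypothetical Café",
        ],
        "year_founded": [803, 900, 862, 1950],
        "category_code": ["CAT4", "CAT4", "CAT9", "CAT4"],
    }
)

# string-expression form, as used in the cell above
via_query = sample.query('year_founded < 1800 and category_code == "CAT4"')

# equivalent boolean mask: each comparison needs parentheses because of `&` precedence
mask = (sample["year_founded"] < 1800) & (sample["category_code"] == "CAT4")
via_mask = sample[mask]

print(via_mask)
print(via_query.equals(via_mask))
```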


## Categories and continents

St. Peter Stifts Kulinarium is old enough that the restaurant is believed to have served Mozart, and it would already have been over 900 years old when he was a patron! Let's finish by looking at the oldest business in each category of commerce for each continent.

In [12]:
# merge all businesses, countries, and categories together
businesses_categories_countries = businesses.merge(categories, on="category_code").merge(
    countries, on="country_code"
)

# sort oldest first
businesses_categories_countries = businesses_categories_countries.sort_values("year_founded")

# get the earliest `year_founded` for each continent and category
oldest_by_continent_category = businesses_categories_countries.groupby(
    ["continent", "category"]
).agg({"year_founded": "min"})
display(oldest_by_continent_category)

continent,category,year_founded
Africa,Agriculture,1947
Africa,Aviation & Transport,1854
Africa,Banking & Finance,1892
Africa,"Distillers, Vintners, & Breweries",1933
Africa,Energy,1968
Africa,Food & Beverages,1878
Africa,Manufacturing & Production,1820
Africa,Media,1943
Africa,Mining,1962
Africa,Postal Service,1772
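
The aggregation above gives us the earliest year for each continent and category, but not the name of the business. One way to keep the name, sketched here under the assumption that the frame is sorted oldest first, is to use `drop_duplicates` on the grouping columns so that only the first (oldest) row per group survives. The `sample` frame is hand-built from rows shown in earlier outputs and is not the full dataset.

```python
import pandas as pd

# hand-built stand-in for `businesses_categories_countries`
sample = pd.DataFrame(
    {
        "business": [
            "Mauritius Post",
            "Sean's Bar",
            "St. Peter Stifts Kulinarium",
            "Ma Yu Ching's Bucket Chicken House",
        ],
        "year_founded": [1772, 900, 803, 1153],
        "category": [
            "Postal Service",
            "Cafés, Restaurants & Bars",
            "Cafés, Restaurants & Bars",
            "Cafés, Restaurants & Bars",
        ],
        "continent": ["Africa", "Europe", "Europe", "Asia"],
    }
)

oldest_named = (
    sample.sort_values("year_founded")  # oldest first
    .drop_duplicates(subset=["continent", "category"], keep="first")  # one row per group
    .sort_values(["continent", "category"])  # tidy ordering for display
    .reset_index(drop=True)
)
print(oldest_named)
```

As with any "keep the first row" approach, ties on `year_founded` within a group are resolved arbitrarily by sort order, so this is best treated as a convenience rather than a definitive ranking.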
