<center><img src="MKn_Staffelter_Hof.jpeg" alt="Picture of old business"></center>
<!--Image Credit: Martin Kraft https://commons.wikimedia.org/wiki/File:MKn_Staffelter_Hof.jpg -->

Staffelter Hof Winery is Germany's oldest business, established in 862 under the Carolingian dynasty. It has kept serving customers through centuries of upheaval in Europe, spanning the Holy Roman Empire, the rise and fall of the Ottoman Empire, and both world wars. What characteristics enable a business to stand the test of time?

To help answer this question, BusinessFinancing.co.uk researched the oldest company still in business in **almost** every country and compiled the results into several CSV files. This dataset has been cleaned.

Having useful information in different files is a common problem. While it's better to keep different types of data separate for data storage, you'll want all the data in one place for analysis. You'll use joining and data manipulation to work with this data and better understand the world's oldest businesses.
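
As a minimal sketch of the kind of join this analysis depends on, the snippet below loads a few sample rows into an in-memory SQLite database and combines the three tables in one query. SQLite is only a stand-in for the project's PostgreSQL database (an assumption, so that the example runs without credentials). The business and category rows are copied from the query outputs shown later in this notebook; the country names and continents come from ISO 3166-1 and general geography, not from the dataset itself.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE businesses (business TEXT, year_founded INTEGER,
                             category_code TEXT, country_code TEXT);
    CREATE TABLE categories (category_code TEXT, category TEXT);
    CREATE TABLE countries  (country_code TEXT, country TEXT, continent TEXT);
""")

# Sample rows copied from the outputs shown in this notebook.
conn.executemany("INSERT INTO businesses VALUES (?, ?, ?, ?)", [
    ("Hamoud Boualem", 1878, "CAT11", "DZA"),
    ("Botswana Meat Commission", 1965, "CAT1", "BWA"),
    ("Brarudi", 1955, "CAT9", "BDI"),
])
conn.executemany("INSERT INTO categories VALUES (?, ?)", [
    ("CAT1", "Agriculture"),
    ("CAT9", "Distillers, Vintners, & Breweries"),
    ("CAT11", "Food & Beverages"),
])
# Country names and continents are not shown in the notebook (assumption).
conn.executemany("INSERT INTO countries VALUES (?, ?, ?)", [
    ("DZA", "Algeria", "Africa"),
    ("BWA", "Botswana", "Africa"),
    ("BDI", "Burundi", "Africa"),
])

# Each business row picks up its category and country through the shared keys.
joined = conn.execute("""
    SELECT b.business, b.year_founded, cat.category, c.country, c.continent
    FROM businesses AS b
    JOIN categories AS cat ON b.category_code = cat.category_code
    JOIN countries  AS c   ON b.country_code  = c.country_code
    ORDER BY b.year_founded
""").fetchall()

for row in joined:
    print(row)
```

The storage tables stay separate, and the join brings the readable category and country names together only at analysis time.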

## The Data
`businesses` and `new_businesses`
|Column|Description|
|------|-----------|
|`business`|Name of the business (varchar)|
|`year_founded`|Year the business was founded (int)|
|`category_code`|Code for the business category (varchar)|
|`country_code`|ISO 3166-1 three-letter country code (char)|
---
`countries`
|Column|Description|
|------|-----------|
|`country_code`|ISO 3166-1 three-letter country code (varchar)|
|`country`|Name of the country (varchar)|
|`continent`|Name of the continent the country exists in (varchar)|
---
`categories`
|Column|Description|
|------|-----------|
|`category_code`|Code for the business category (varchar)|
|`category`|Description of the business category (varchar)|

In [8]:
# connect to the database
import os
import psycopg2

conn = psycopg2.connect(
    host=os.getenv("DB_HOST", "localhost"),
    port=os.getenv("DB_PORT", "5432"),
    user=os.getenv("DB_USER", "postgres"),
    password=os.getenv("DB_PASS"),
    database=os.getenv("DB_NAME", "Oldest_Businesses_DB")
)

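def run_query(query):
    """Execute a query and return all rows as a list of tuples."""
    with conn.cursor() as cursor:
        cursor.execute(query)
        return cursor.fetchall()

# run_query returns plain tuples here, not a DataFrame
rows = run_query("SELECT * FROM categories")
display(rows)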

[(0, 'CAT1', 'Agriculture'),
 (1, 'CAT2', 'Aviation & Transport'),
 (2, 'CAT3', 'Banking & Finance'),
 (3, 'CAT4', 'Cafés, Restaurants & Bars'),
 (4, 'CAT5', 'Conglomerate'),
 (5, 'CAT6', 'Construction'),
 (6, 'CAT7', 'Consumer Goods'),
 (7, 'CAT8', 'Defense'),
 (8, 'CAT9', 'Distillers, Vintners, & Breweries'),
 (9, 'CAT10', 'Energy'),
 (10, 'CAT11', 'Food & Beverages'),
 (11, 'CAT12', 'Manufacturing & Production'),
 (12, 'CAT13', 'Media'),
 (13, 'CAT14', 'Medical'),
 (14, 'CAT15', 'Mining'),
 (15, 'CAT16', 'Postal Service'),
 (16, 'CAT17', 'Retail'),
 (17, 'CAT18', 'Telecommunications'),
 (18, 'CAT19', 'Tourism & Hotels')]
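
The `run_query` helper above returns bare tuples, so the column names are lost. One possible variation, sketched here under assumptions, reads the names from `cursor.description` (part of the Python DB-API 2.0 that both `psycopg2` and `sqlite3` implement) and accepts query parameters. The function name `run_query_dicts` is hypothetical, and `sqlite3` stands in for the PostgreSQL connection. Note that `sqlite3` uses `?` placeholders, whereas `psycopg2` uses `%s`.

```python
import sqlite3

def run_query_dicts(conn, query, params=()):
    """Run a query and return rows as dicts keyed by column name.

    Hypothetical variant of the notebook's run_query helper.
    """
    cursor = conn.cursor()
    try:
        cursor.execute(query, params)
        # DB-API 2.0: the first item of each description entry is the column name.
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()

# Tiny stand-in table with two rows copied from the output above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (category_code TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)",
    [("CAT1", "Agriculture"), ("CAT2", "Aviation & Transport")],
)

rows = run_query_dicts(
    conn, "SELECT * FROM categories WHERE category_code = ?", ("CAT2",)
)
print(rows)
```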

In [None]:
-- What is the oldest business on each continent?
WITH ranking AS (
    SELECT
        c.continent,
        c.country,
        b.business,
        b.year_founded,
        ROW_NUMBER() OVER (
            PARTITION BY c.continent
            ORDER BY b.year_founded ASC
        ) AS rn
    FROM businesses AS b
    LEFT JOIN countries AS c
      ON b.country_code = c.country_code
)

SELECT continent, country, business, year_founded
FROM ranking
WHERE rn = 1;
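
To see how the `ROW_NUMBER()` ranking behaves, here is a small sketch that runs essentially the same query on three sample rows in an in-memory SQLite database. This assumes SQLite 3.25 or newer, which is needed for window functions, and uses SQLite only as a stand-in for PostgreSQL. The businesses are taken from this notebook; the country names and continents are assumptions based on ISO 3166-1. Because only a handful of rows are loaded, the result says nothing about the real answer for each continent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE businesses (business TEXT, year_founded INTEGER, country_code TEXT);
    CREATE TABLE countries  (country_code TEXT, country TEXT, continent TEXT);
    -- Sample rows only; not the full dataset.
    INSERT INTO businesses VALUES
        ('Staffelter Hof Winery', 862, 'DEU'),
        ('Hamoud Boualem', 1878, 'DZA'),
        ('Brarudi', 1955, 'BDI');
    INSERT INTO countries VALUES
        ('DEU', 'Germany', 'Europe'),
        ('DZA', 'Algeria', 'Africa'),
        ('BDI', 'Burundi', 'Africa');
""")

# Rank businesses within each continent by founding year, then keep rank 1.
oldest = conn.execute("""
    WITH ranking AS (
        SELECT c.continent, c.country, b.business, b.year_founded,
               ROW_NUMBER() OVER (
                   PARTITION BY c.continent
                   ORDER BY b.year_founded ASC
               ) AS rn
        FROM businesses AS b
        JOIN countries AS c ON b.country_code = c.country_code
    )
    SELECT continent, country, business, year_founded
    FROM ranking
    WHERE rn = 1
    ORDER BY continent
""").fetchall()

for row in oldest:
    print(row)
```

`ROW_NUMBER()` keeps exactly one row per continent even when two businesses share the earliest year; `RANK()` would keep all tied rows instead.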

In [None]:
-- How many countries per continent lack data on the oldest businesses?
-- Does including the `new_businesses` data change this?

-- Count of countries per continent with no business data (including new_businesses)
WITH all_businesses AS (
    -- UNION already removes duplicate country codes
    SELECT country_code FROM businesses
    UNION
    SELECT country_code FROM new_businesses
),
missing_countries AS (
    SELECT c.continent, c.country_code
    FROM countries c
    LEFT JOIN all_businesses a
      ON c.country_code = a.country_code
    WHERE a.country_code IS NULL
)

SELECT
    continent,
    COUNT(DISTINCT country_code) AS countries_without_businesses
FROM missing_countries
GROUP BY continent
ORDER BY continent;
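
The anti-join above (`LEFT JOIN ... WHERE a.country_code IS NULL`) is equivalent to a set difference, which gives a quick way to compare the counts with and without `new_businesses`. The sketch below uses hypothetical placeholder codes and continent names, not rows from the real dataset, and the helper name `missing_per_continent` is an assumption for illustration.

```python
from collections import Counter

# Hypothetical toy inputs: placeholder codes, not rows from the real dataset.
countries = {  # country_code -> continent
    "AAA": "Continent X",
    "BBB": "Continent X",
    "CCC": "Continent X",
    "DDD": "Continent Y",
    "EEE": "Continent Y",
}
businesses = {"AAA", "DDD"}   # country codes present in businesses
new_businesses = {"BBB"}      # country codes present in new_businesses

def missing_per_continent(countries, covered):
    """Count countries per continent whose code is absent from `covered`."""
    missing = set(countries) - covered  # anti-join expressed as a set difference
    return Counter(countries[code] for code in missing)

# Without new_businesses, then with both tables combined (the UNION in the query).
before = missing_per_continent(countries, businesses)
after = missing_per_continent(countries, businesses | new_businesses)

print("businesses only:    ", dict(before))
print("with new_businesses:", dict(after))
```

Comparing `before` and `after` answers the second question in the cell directly: any continent whose count drops gained coverage from `new_businesses`.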


In [None]:
-- Which business categories are best suited to last over the course of centuries?

-- Oldest founding year per continent-category
SELECT
  c.continent,
  cat.category,
  MIN(b.year_founded) AS year_founded
	
FROM businesses b
JOIN countries c
  ON b.country_code = c.country_code
JOIN categories cat
  ON b.category_code = cat.category_code
	
WHERE b.year_founded IS NOT NULL
GROUP BY c.continent, cat.category;
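
To read the grouped result as a ranking, it can help to sort categories by their earliest founding year and to see how many businesses each one contains. The sketch below does this in plain Python on the five businesses shown in this notebook's preview output; all five are African, so the continent dimension is left out, and the variable names are assumptions for illustration. With so few rows it demonstrates the grouping only, not a conclusion about which categories last.

```python
from collections import defaultdict

# Category labels copied from the categories output in this notebook.
categories = {
    "CAT1": "Agriculture",
    "CAT2": "Aviation & Transport",
    "CAT9": "Distillers, Vintners, & Breweries",
    "CAT10": "Energy",
    "CAT11": "Food & Beverages",
}

# (business, year_founded, category_code), copied from the businesses preview.
businesses = [
    ("Hamoud Boualem", 1878, "CAT11"),
    ("Communauté Électrique du Bénin", 1968, "CAT10"),
    ("Botswana Meat Commission", 1965, "CAT1"),
    ("Air Burkina", 1967, "CAT2"),
    ("Brarudi", 1955, "CAT9"),
]

# Group founding years by category name (the GROUP BY step).
years_by_category = defaultdict(list)
for _name, year, code in businesses:
    years_by_category[categories[code]].append(year)

# One (category, earliest year, business count) tuple per category, oldest first.
summary = sorted(
    ((cat, min(years), len(years)) for cat, years in years_by_category.items()),
    key=lambda item: item[1],
)

for category, earliest, count in summary:
    print(f"{category}: earliest {earliest}, {count} business(es)")
```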


In [2]:
import os
import psycopg2
import pandas as pd
from dotenv import load_dotenv

# Load DB credentials from .env
load_dotenv()

DB_NAME = os.getenv("DB_NAME")
DB_USER = os.getenv("DB_USER")
DB_PASS = os.getenv("DB_PASS")
DB_HOST = os.getenv("DB_HOST")
DB_PORT = os.getenv("DB_PORT")


# Check if any are None
if None in [DB_NAME, DB_USER, DB_PASS, DB_HOST, DB_PORT]:
    raise ValueError("One or more database credentials are missing. Check your .env file and variable names.")

# Connect without exposing password
conn = psycopg2.connect(
    dbname=DB_NAME,
    user=DB_USER,
    password=DB_PASS,
    host=DB_HOST,
    port=DB_PORT
)

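# Helper: run SQL and return a DataFrame.
# Note: pandas emits a UserWarning for raw psycopg2 connections because it only
# tests SQLAlchemy connectables and sqlite3 connections; the query still runs.
def run_query(sql: str) -> pd.DataFrame:
    return pd.read_sql(sql, conn)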

# Example: show first 5 businesses
df = run_query("SELECT * FROM businesses LIMIT 5;")
display(df)




| |index|business|year_founded|category_code|country_code|
|-|-----|--------|------------|-------------|------------|
|0|0|Hamoud Boualem|1878|CAT11|DZA|
|1|1|Communauté Électrique du Bénin|1968|CAT10|BEN|
|2|2|Botswana Meat Commission|1965|CAT1|BWA|
|3|3|Air Burkina|1967|CAT2|BFA|
|4|4|Brarudi|1955|CAT9|BDI|
