# Exercise

## Calculate the mean altitude per department

The mean altitude of each French department is surprisingly hard to find on the web. So let's compute it and see which department is the highest on average.

For those who don't know, France is divided into several administrative levels, the main ones being:   

- Régions (13 + 5 overseas)
- Départements (96 + 5 overseas)
- Communes (about 35,000), which can be translated as "municipalities", "towns" or "cities".

## Instructions

- Find a dataset on the web that contains the data you need for this calculation.
- Then use pandas to load the dataset and perform the necessary calculations.
- Verify that your results match the file `mean_altitude_dept_output.csv`, which is located in the same folder as this notebook. Results may vary slightly depending on the dataset you used, but the differences should be minimal.

In [None]:
import pandas as pd

# Code here!



In [None]:
# Dataset source: https://www.data.gouv.fr/datasets/communes-et-villes-de-france-en-csv-excel-json-parquet-et-feather/

import pandas as pd

# Map each source column to [dtype, English column name].
# Codes are read as strings so that leading zeros ("01") are preserved.
d = {
    "code_insee": ["string", "insee_code"],
    "nom_standard": ["string", "name"],
    "dep_code": ["string", "dep_code"],
    "dep_nom": ["string", "dep_name"],
    "superficie_km2": ["float32", "area_km2"],
    "altitude_moyenne": ["float32", "mean_altitude_m"],
}

# Load only the columns listed above, then rename them to English.
df = pd.read_csv(
    "data/communes-france-2025.csv.gz",
    compression="gzip",
    usecols=list(d),
    dtype={k: v[0] for k, v in d.items()}
).rename(columns={k: v[1] for k, v in d.items()})

df
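
Reading the code columns as strings matters here. The minimal sketch below uses a tiny made-up CSV (the two rows and their altitudes are invented for illustration, only the column names come from the dataset) to show what happens to department codes when pandas is left to infer the type.

```python
import io

import pandas as pd

# Made-up sample rows; only the column names match the real dataset.
csv_text = "dep_code,altitude_moyenne\n01,250\n02,120\n"

# Without an explicit dtype, pandas parses the codes as integers.
inferred = pd.read_csv(io.StringIO(csv_text))

# With dtype "string", the codes are kept exactly as written.
explicit = pd.read_csv(io.StringIO(csv_text), dtype={"dep_code": "string"})

print(inferred["dep_code"].tolist())  # [1, 2]
print(explicit["dep_code"].tolist())  # ['01', '02']
```

With inferred types, the code `01` becomes the integer `1`, which would no longer match the department codes used in other files.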

In [None]:
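import pandas as pd

# Same load as the previous cell, written with explicit dtype and rename mappings.
# Reading the codes as strings keeps leading zeros ("01") and the Corsican codes ("2A", "2B").
dtypes = {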
    "code_insee": "string",
    "nom_standard": "string",
    "dep_code": "string",
    "dep_nom": "string",
    "superficie_km2": "float32",
    "altitude_moyenne": "float32",
}

df = pd.read_csv(
    "data/communes-france-2025.csv.gz",
    compression="gzip",
    usecols=dtypes.keys(),
    dtype=dtypes
).rename(columns={
    "code_insee": "insee_code",
    "nom_standard": "standard_name",
    "dep_nom": "dep_name",
    "superficie_km2": "area_km2",
    "altitude_moyenne": "mean_altitude_m",
})

df

In [None]:
# Total area of each department: the sum of the areas of its communes.
total_area_per_department = df.groupby(["dep_code", "dep_name"]).agg(
    department_area_km2=("area_km2", "sum")
).sort_values("department_area_km2", ascending=False).reset_index()

total_area_per_department

In [None]:
# Attach the total area of its department to every commune row.
df = df.merge(
    total_area_per_department,
    on=["dep_code", "dep_name"],
    how="left")

In [None]:
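# Weight of each commune: its share of the department's total area.
df['area_share'] = df['area_km2'] / df['department_area_km2']
# Contribution of each commune to the area-weighted mean altitude.
df['mean_altitude_weighted'] = df['mean_altitude_m'] * df['area_share']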
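
The two columns above implement an area-weighted mean: each commune contributes its mean altitude multiplied by its share of the department's area, and summing those contributions per department gives the result. As a rough sanity check, the sketch below reproduces the same steps on made-up data (the departments "A" and "B" and all numbers are invented) and compares them with `numpy.average`, which computes a weighted mean directly.

```python
import numpy as np
import pandas as pd

# Made-up toy data: two hypothetical departments with two communes each.
toy = pd.DataFrame({
    "dep_code": ["A", "A", "B", "B"],
    "area_km2": [10.0, 30.0, 5.0, 5.0],
    "mean_altitude_m": [100.0, 200.0, 50.0, 150.0],
})

# Two-step approach: area share per commune, then sum of weighted altitudes.
dept_area = toy.groupby("dep_code")["area_km2"].transform("sum")
toy["area_share"] = toy["area_km2"] / dept_area
toy["mean_altitude_weighted"] = toy["mean_altitude_m"] * toy["area_share"]
two_step = toy.groupby("dep_code")["mean_altitude_weighted"].sum()

# Cross-check: weighted average computed directly for each department.
one_step = {
    code: np.average(group["mean_altitude_m"], weights=group["area_km2"])
    for code, group in toy.groupby("dep_code")
}

print(two_step.to_dict())  # {'A': 175.0, 'B': 100.0}
```

For department "A" this is (10 × 100 + 30 × 200) / 40 = 175 m, whereas a plain unweighted mean of the two communes would give 150 m.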

In [None]:

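# Area-weighted mean altitude per department, truncated to whole meters,
# sorted from the highest department to the lowest.
df.groupby(["dep_code", "dep_name"], as_index=False).agg(
    mean_altitude_m=("mean_altitude_weighted", lambda x: int(x.sum()))
).sort_values("mean_altitude_m", ascending=False).reset_index(drop=True)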
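
The instructions ask you to check the result against `mean_altitude_dept_output.csv`. The layout of that file is not shown here, so the sketch below makes assumptions: it supposes both tables have a `dep_code` and a `mean_altitude_m` column, and it uses made-up numbers in place of the real file. The helper `compare_altitudes` and its `tolerance_m` parameter are hypothetical names introduced for illustration.

```python
import pandas as pd


def compare_altitudes(result, reference, tolerance_m=5.0):
    """Return the departments whose mean altitudes differ by more than tolerance_m.

    Assumes both frames have 'dep_code' and 'mean_altitude_m' columns.
    """
    merged = result.merge(
        reference, on="dep_code", how="outer", suffixes=("_mine", "_ref")
    )
    merged["abs_diff_m"] = (
        merged["mean_altitude_m_mine"] - merged["mean_altitude_m_ref"]
    ).abs()
    # Departments missing from either side have a NaN difference; report them too.
    mask = merged["abs_diff_m"].isna() | (merged["abs_diff_m"] > tolerance_m)
    return merged[mask].reset_index(drop=True)


# Made-up values standing in for your result and the reference file.
mine = pd.DataFrame({"dep_code": ["01", "02", "03"], "mean_altitude_m": [300, 120, 410]})
ref = pd.DataFrame({"dep_code": ["01", "02", "03"], "mean_altitude_m": [302, 119, 450]})

mismatches = compare_altitudes(mine, ref)
print(mismatches)
```

With the real data, you would load the reference with `pd.read_csv` (reading `dep_code` as a string) and adjust the column names to whatever the file actually contains. An empty result means every department agrees within the chosen tolerance.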
