# Analysis of SNCF Train Punctuality

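This notebook shows how to work with the open punctuality data that SNCF publishes for Intercités and TER trains. It loads a table of monthly performance metrics that has already been aggregated, plots the punctuality rate of each service over time, and computes the overall average for each service.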

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter

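# Load the monthly aggregated punctuality table
combined = pd.read_csv('train_performance_monthly.csv')
# The 'date' column holds year-month strings; parse them as timestamps
combined['date'] = pd.to_datetime(combined['date'], format='%Y-%m')
# Preview the first rows
combined.head()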
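
Before plotting, it can help to confirm that the loaded table looks as expected. The function below is a minimal sketch of such a check, not part of the original workflow. The name `check_monthly_frame` is hypothetical, and the 0 to 100 range test assumes the rates are stored as percentages.

```python
import pandas as pd


def check_monthly_frame(df, rate_cols=("regularite_intercites", "regularite_ter")):
    """Return basic quality indicators for a monthly punctuality table.

    Hypothetical helper: assumes a datetime 'date' column set to the first
    day of each month and rate columns expressed as percentages (0-100).
    """
    report = {}

    # Months between the first and last date that have no row at all
    expected = pd.date_range(df["date"].min(), df["date"].max(), freq="MS")
    missing = expected.difference(pd.DatetimeIndex(df["date"]))
    report["missing_months"] = [d.strftime("%Y-%m") for d in missing]

    # Months that appear more than once
    report["duplicate_months"] = int(df["date"].duplicated().sum())

    for col in rate_cols:
        values = df[col]
        report[col] = {
            # Empty cells in the rate column
            "missing_values": int(values.isna().sum()),
            # Values outside the assumed percentage range (NaN is not counted)
            "out_of_range": int(((values < 0) | (values > 100)).sum()),
        }
    return report
```

Calling `check_monthly_frame(combined)` returns a dictionary that can be inspected directly in the notebook. An empty `missing_months` list and zero counts elsewhere suggest the table is ready for plotting.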

## Trend of Monthly Punctuality

We plot the average punctuality (taux de régularité) of Intercités and TER over time.

In [None]:
plt.figure(figsize=(10,6))
plt.plot(combined['date'], combined['regularite_intercites'], label='Intercités', marker='o')
plt.plot(combined['date'], combined['regularite_ter'], label='TER', marker='o')
plt.title('Monthly punctuality of Intercités and TER trains')
plt.xlabel('Date')
plt.ylabel('Punctuality rate (%)')
plt.gca().xaxis.set_major_formatter(DateFormatter('%Y-%m'))
plt.xticks(rotation=45)
plt.grid(True, linestyle='--', alpha=0.5)
plt.legend()
plt.tight_layout()
plt.show()
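
Monthly rates can fluctuate from one month to the next, which may hide the longer-term trend. One optional way to smooth the series is a trailing rolling mean. The sketch below is an illustration rather than the notebook's own method. The helper name `rolling_trend` and the default 12-month window are assumptions.

```python
import pandas as pd


def rolling_trend(df, cols=("regularite_intercites", "regularite_ter"), window=12):
    """Return a date-indexed frame of trailing rolling means.

    Hypothetical helper: `window` is a number of rows, so it equals a
    number of months only if the table has exactly one row per month.
    """
    # Sort chronologically so the window always covers the preceding rows
    series = df.sort_values("date").set_index("date")[list(cols)]
    # min_periods=window leaves NaN until a full window is available
    return series.rolling(window=window, min_periods=window).mean()
```

The result can be drawn on the same axes as the raw series, for example with `trend = rolling_trend(combined)` followed by `plt.plot(trend.index, trend['regularite_ter'])`.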

## Average Punctuality

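We calculate the mean punctuality rate of each service across the entire time span. This is an unweighted mean of the monthly rates: every month counts equally, regardless of how many trains ran that month, and months with missing values are skipped.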

In [None]:
avg_ic = combined['regularite_intercites'].mean()
avg_ter = combined['regularite_ter'].mean()
avg_ic, avg_ter
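
A single overall mean hides year-to-year variation. As a possible extension, the same calculation can be grouped by calendar year. This is a minimal sketch and the helper name `yearly_means` is hypothetical. It assumes the `date` column has already been parsed as datetimes, as in the loading cell.

```python
import pandas as pd


def yearly_means(df, cols=("regularite_intercites", "regularite_ter")):
    """Return the unweighted mean of each rate column per calendar year.

    Hypothetical helper: assumes df['date'] has a datetime dtype.
    """
    # Group rows by the year of their date, then average the monthly rates
    return df.groupby(df["date"].dt.year)[list(cols)].mean()
```

`yearly_means(combined)` returns one row per year and one column per service, which makes it easy to see whether the overall averages are driven by a few unusual years.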