# 📊 Student Enrollment Analysis

**Author:** Diallo Naby Moussa  
**Date:** January 2026  
**Objective:** Exploratory data analysis

---

## 🔧 Initial setup

In [None]:
import os
os.makedirs('visualizations', exist_ok=True)
os.makedirs('data', exist_ok=True)
print("✅ Folders created!")

In [None]:
import urllib.request
url = 'https://raw.githubusercontent.com/nabydiallo49-gif/analyse-inscriptions-etudiants/main/data/inscriptions_etudiants.csv'
try:
    urllib.request.urlretrieve(url, 'data/inscriptions_etudiants.csv')
    print("✅ Dataset downloaded!")
except OSError as exc:  # urllib's URLError and HTTPError are OSError subclasses
    print(f"⚠️ Download error: {exc}")
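
Re-running the download cell fetches the file again every time. A minimal sketch of a guard that skips the download when the file is already on disk; `download_if_missing` is a hypothetical helper name, not part of the original notebook:

```python
import os
import urllib.request

def download_if_missing(url, dest):
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(dest):
        return False
    # Create the parent folder if needed ('.' when dest has no folder part)
    os.makedirs(os.path.dirname(dest) or '.', exist_ok=True)
    urllib.request.urlretrieve(url, dest)
    return True
```

It could be called with the same `url` and `'data/inscriptions_etudiants.csv'` used above, wrapped in the same `try`/`except`.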

## 1️⃣ Imports

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
# Silence library warnings to keep the notebook output readable
warnings.filterwarnings('ignore')
print("✅ Libraries OK!")

## 2️⃣ Loading

In [None]:
df = pd.read_csv('data/inscriptions_etudiants.csv')
print(f"✅ {len(df)} rows loaded")
df.head()

## 3️⃣ Exploration

In [None]:
df.info()

In [None]:
df.describe()

## 4️⃣ Cleaning

In [None]:
print("Missing values:")
print(df.isnull().sum())
print(f"\nDuplicates: {df.duplicated().sum()}")

In [None]:
for col in df.select_dtypes(include=['object']).columns:
    df[col] = df[col].str.strip()
print("✅ Cleaning done")
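
The cells above report missing values and duplicates but do not remove them. A minimal sketch of one way to act on those counts, shown on made-up toy data rather than the real dataset. Dropping rows is an assumption here; depending on the analysis, imputing or flagging them may be more appropriate.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up for illustration)
demo = pd.DataFrame({
    'sexe': [' F', 'M ', 'M ', None],
    'age': [20, 22, 22, 19],
})

# Strip whitespace first so rows that differ only by padding become true duplicates
demo['sexe'] = demo['sexe'].str.strip()

# Remove exact duplicate rows, then rows with no recorded value in 'sexe'
cleaned = demo.drop_duplicates().dropna(subset=['sexe']).reset_index(drop=True)
print(cleaned)
```

The order matters: stripping before `drop_duplicates()` lets `'M '` and `'M'` be recognized as the same value.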

## 5️⃣ Statistics

In [None]:
print(f"Total: {len(df)} students")
print(f"Mean age: {df['age'].mean():.1f} years")
print(f"Mean fee: {df['frais_scolarite'].mean():,.0f} FCFA")
# 'Payé' ("paid") is a value stored in the data, so it stays in French
payes = (df['statut_paiement'] == 'Payé').sum()
print(f"Payment rate: {payes/len(df)*100:.1f}%")
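
The payment rate depends on the exact label used in the comparison and on how missing statuses are treated. A small sketch on made-up values illustrates both points; `'Impayé'` is a hypothetical label, since the labels actually present in the dataset are not shown in this notebook.

```python
import pandas as pd

# Made-up statuses; 'Impayé' is a hypothetical label, None marks a missing value
statut = pd.Series(['Payé', 'Payé', 'Impayé', None, 'Payé'])

# Inspect the labels actually present before hard-coding one in a comparison
print(statut.value_counts(dropna=False))

# Missing values compare as False, so they count as unpaid here
taux_tous = (statut == 'Payé').mean() * 100

# Alternative: leave rows with no recorded status out of the denominator
connus = statut.dropna()
taux_connus = (connus == 'Payé').mean() * 100

print(f"All rows: {taux_tous:.1f}% | known statuses only: {taux_connus:.1f}%")
```

The cell above corresponds to the first variant: every row is in the denominator, so a missing status lowers the rate.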

## 6️⃣ Breakdowns

In [None]:
print("By sex:")
print(df['sexe'].value_counts())
print("\nBy field of study:")
print(df['filiere'].value_counts())
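
Raw counts are easier to compare as percentages. A minimal sketch using `value_counts(normalize=True)` on made-up values (not taken from the dataset):

```python
import pandas as pd

# Made-up values for illustration only
sexe = pd.Series(['F', 'M', 'F', 'F'])

# normalize=True returns proportions instead of counts; scale to percentages
parts = sexe.value_counts(normalize=True).mul(100).round(1)
print(parts)
```

The same call should work unchanged on `df['sexe']` or `df['filiere']`.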

## 7️⃣ Visualizations

In [None]:
plt.figure(figsize=(10, 6))
df['sexe'].value_counts().plot(kind='pie', autopct='%1.1f%%')
plt.title('Distribution by sex')
plt.ylabel('')
plt.savefig('visualizations/repartition_sexe.png', dpi=300, bbox_inches='tight')
plt.show()
print("✅ 1/6")

In [None]:
plt.figure(figsize=(12, 6))
df['filiere'].value_counts().plot(kind='bar')
plt.title('Students by field of study')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.savefig('visualizations/repartition_filieres.png', dpi=300, bbox_inches='tight')
plt.show()
print("✅ 2/6")

In [None]:
plt.figure(figsize=(10, 6))
df['statut_paiement'].value_counts().plot(kind='pie', autopct='%1.1f%%')
plt.title('Payment status')
plt.ylabel('')
plt.savefig('visualizations/statut_paiements.png', dpi=300, bbox_inches='tight')
plt.show()
print("✅ 3/6")

In [None]:
plt.figure(figsize=(12, 6))
inscriptions = df['annee_inscription'].value_counts().sort_index()
plt.plot(inscriptions.index, inscriptions.values, marker='o')
plt.title('Enrollment trend')
plt.xlabel('Year')
plt.ylabel('Enrollments')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('visualizations/evolution_inscriptions.png', dpi=300, bbox_inches='tight')
plt.show()
print("✅ 4/6")

In [None]:
plt.figure(figsize=(12, 6))
plt.hist(df['age'], bins=15, edgecolor='black')
plt.axvline(df['age'].mean(), color='red', linestyle='--', label='Mean')
plt.title('Age distribution')
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('visualizations/distribution_ages.png', dpi=300, bbox_inches='tight')
plt.show()
print("✅ 5/6")

In [None]:
plt.figure(figsize=(12, 6))
plt.hist(df['frais_scolarite'], bins=15, edgecolor='black')
plt.axvline(df['frais_scolarite'].mean(), color='red', linestyle='--', label='Mean')
plt.title('Fee distribution')
plt.xlabel('Fees (FCFA)')
plt.ylabel('Frequency')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('visualizations/distribution_frais.png', dpi=300, bbox_inches='tight')
plt.show()
print("✅ 6/6")
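
The two histogram cells above differ only in the column, labels, and file name. One possible way to factor out the shared steps is sketched below; `save_histogram` is a hypothetical helper that does not appear in the original notebook, and the demo values are made up. It closes the figure instead of calling `plt.show()`, so in a notebook the image would need to be displayed separately.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

def save_histogram(values, title, xlabel, path, bins=15):
    """Draw a histogram with a dashed mean line and save it to path."""
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.hist(values, bins=bins, edgecolor='black')
    ax.axvline(np.mean(values), color='red', linestyle='--', label='Mean')
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel('Frequency')
    ax.legend()
    ax.grid(True, alpha=0.3)
    fig.tight_layout()
    # Make sure the target folder exists before writing the image
    os.makedirs(os.path.dirname(path) or '.', exist_ok=True)
    fig.savefig(path, dpi=300, bbox_inches='tight')
    plt.close(fig)  # free the figure's memory once it is on disk
    return path

# Demo on made-up ages, written to a throwaway folder
with tempfile.TemporaryDirectory() as tmp:
    out = save_histogram([18, 19, 19, 20, 21, 23, 25],
                         'Age distribution', 'Age',
                         os.path.join(tmp, 'demo_ages.png'))
    print(os.path.exists(out))
```

With the real data, the call might look like `save_histogram(df['age'], 'Age distribution', 'Age', 'visualizations/distribution_ages.png')`.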

## 8️⃣ Cross-analyses

In [None]:
print("Mean fees by field of study:")
print(df.groupby('filiere')['frais_scolarite'].mean().sort_values(ascending=False))

In [None]:
print("Payment rate by field of study:")
# Share of rows marked 'Payé' ("paid") within each field, as a percentage
taux = df.groupby('filiere')['statut_paiement'].apply(lambda x: (x == 'Payé').mean() * 100)
print(taux.sort_values(ascending=False))
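
The `groupby` above yields a single rate per field. As a complementary sketch, `pd.crosstab` with `normalize='index'` shows the share of every status within each field in one table. The field names and the `'Impayé'` label below are made up for illustration and may not match the dataset.

```python
import pandas as pd

# Made-up rows; field names and the 'Impayé' label are illustrative assumptions
demo = pd.DataFrame({
    'filiere': ['Informatique', 'Informatique', 'Droit', 'Droit', 'Droit', 'Droit'],
    'statut_paiement': ['Payé', 'Impayé', 'Payé', 'Payé', 'Payé', 'Impayé'],
})

# normalize='index' makes each row (field) sum to 1; scale to percentages
table = pd.crosstab(demo['filiere'], demo['statut_paiement'],
                    normalize='index').mul(100).round(1)
print(table)
```

Each row of `table` sums to 100, so fields of different sizes can be compared directly.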

## 9️⃣ Download the results

In [None]:
import shutil
shutil.make_archive('visualizations_etudiants', 'zip', 'visualizations')
print("✅ ZIP created!")

In [None]:
# google.colab is only available when this notebook runs in Google Colab
from google.colab import files
files.download('visualizations_etudiants.zip')
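
Outside Google Colab, the import in the cell above fails. A minimal sketch of a guard that falls back to printing the archive location; `download_if_colab` is a hypothetical helper name, not part of the original notebook:

```python
def download_if_colab(path):
    """Trigger a browser download in Colab; elsewhere just report where the file is.

    Returns True if the Colab download was requested, False otherwise.
    """
    try:
        from google.colab import files  # only importable inside Google Colab
    except ImportError:
        print(f"Not running in Colab; the archive is at {path}")
        return False
    files.download(path)
    return True
```

It could be called as `download_if_colab('visualizations_etudiants.zip')` so the notebook still runs to the end in a local Jupyter session.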

## 📝 Conclusions

### Key insights:
- Balanced population
- Diverse fields of study
- Payment rate needs improvement

### Recommendations:
1. Set up a payment reminder system
2. Analyze student orientation
3. Track trends over time

---
**Diallo Naby Moussa**  
nabydiallo49@gmail.com