# 🧠 Spécialités Advisor – A Decision Tool for French High School Students

## 🎯 Goal

This project aims to help French high school students in **classe de 2nde** choose the best **enseignements de spécialité** for Première and Terminale, based on:
- Their academic **interests**
- Their **strengths in subjects studied in 2nde**
- Their **desired university fields** (if known)

The final version will be deployed as a **Streamlit web app**.

## 🧰 Inputs

Students will be asked to provide:
- ✅ Their **interests** (academic or personal)
- ✅ Their **strongest 2nde subjects**, such as:
  - Français
  - Histoire-Géographie
  - Langues Vivantes
  - SES
  - Mathématiques
  - Physique-Chimie
  - SVT
  - EPS
  - EMC
  - SNT

## 📌 Output

The tool will return:
- **Three suggested spécialités** for Première
- An optional suggestion of which one to drop in Terminale
- Examples of matching university degrees
- Relevant admission statistics (from Parcoursup)


In [36]:
# 🔍 Define the list of interests that students can choose from

interests_list = [
    # 🎓 Academic & Career-Oriented
    "Medicine & health", "Engineering & technology", "Law & politics", "Economics & business",
    "Environment & sustainability", "Psychology & human behavior", "Education & teaching",
    "Architecture & design", "Science & research", "Mathematics & logic", "Literature & philosophy",
    "History & geopolitics", "Computer science", "Space & astronomy",

    # 🎨 Creative & Artistic
    "Music", "Visual arts", "Theater & performance", "Writing & storytelling",
    "Film & media", "Fashion & aesthetics", "Design & architecture",

    # 💼 Practical & Social
    "Entrepreneurship", "Communication", "Leadership", "Helping others / volunteering",
    "Public speaking", "Management", "Travel & cultures",

    # 🧘 Personal & Lifestyle
    "Sports & fitness", "Gaming", "Nature & animals", "Food & cooking",
    "DIY / crafting", "Photography", "Digital tools & technology"
]

In [38]:
# 🔗 Map each interest to one or more relevant spécialités
interest_to_specialite = {
    "Medicine & health": ["Sciences de la Vie et de la Terre", "Physique-Chimie"],
    "Engineering & technology": ["Mathématiques", "Physique-Chimie", "Numérique et Sciences Informatiques", "Sciences de l’Ingénieur"],
    "Law & politics": ["Histoire-Géographie, Géopolitique et Sciences Politiques", "Sciences économiques et sociales", "Humanités, Littérature et Philosophie"],
    "Economics & business": ["Mathématiques", "Sciences économiques et sociales"],
    "Environment & sustainability": ["Sciences de la Vie et de la Terre", "Physique-Chimie"],
    "Psychology & human behavior": ["Sciences de la Vie et de la Terre", "Humanités, Littérature et Philosophie"],
    "Education & teaching": ["Humanités, Littérature et Philosophie", "Littérature, langues et cultures de l’Antiquité"],

    "Architecture & design": ["Arts", "Mathématiques", "Sciences de l’Ingénieur"],
    "Science & research": ["Mathématiques", "Physique-Chimie", "Sciences de la Vie et de la Terre"],
    "Mathematics & logic": ["Mathématiques", "Numérique et Sciences Informatiques"],
    "Literature & philosophy": ["Humanités, Littérature et Philosophie", "Littérature, langues et cultures de l’Antiquité", "Langues, littératures et cultures étrangères"],
    "History & geopolitics": ["Histoire-Géographie, Géopolitique et Sciences Politiques", "Humanités, Littérature et Philosophie"],
    "Computer science": ["Numérique et Sciences Informatiques", "Mathématiques"],
    "Space & astronomy": ["Mathématiques", "Physique-Chimie"],

    "Music": ["Arts"],
    "Visual arts": ["Arts"],
    "Theater & performance": ["Arts", "Humanités, Littérature et Philosophie"],
    "Writing & storytelling": ["Humanités, Littérature et Philosophie", "Littérature, langues et cultures de l’Antiquité"],
    "Film & media": ["Arts", "Langues, littératures et cultures étrangères"],
    "Fashion & aesthetics": ["Arts"],
    "Design & architecture": ["Arts", "Sciences de l’Ingénieur"],

    "Entrepreneurship": ["Sciences économiques et sociales", "Mathématiques"],
    "Communication": ["Langues, littératures et cultures étrangères", "Humanités, Littérature et Philosophie"],
    "Leadership": ["Sciences économiques et sociales", "Histoire-Géographie, Géopolitique et Sciences Politiques"],
    "Helping others / volunteering": ["Sciences de la Vie et de la Terre", "Humanités, Littérature et Philosophie"],
    "Public speaking": ["Histoire-Géographie, Géopolitique et Sciences Politiques", "Langues, littératures et cultures étrangères"],
    "Management": ["Sciences économiques et sociales", "Mathématiques"],
    "Travel & cultures": ["Histoire-Géographie, Géopolitique et Sciences Politiques", "Langues, littératures et cultures étrangères"],

    "Sports & fitness": ["Sciences de la Vie et de la Terre", "Mathématiques"],
    "Gaming": ["Numérique et Sciences Informatiques", "Mathématiques"],
    "Nature & animals": ["Sciences de la Vie et de la Terre"],
    "Food & cooking": ["Sciences de la Vie et de la Terre", "Physique-Chimie"],
    "DIY / crafting": ["Arts", "Sciences de l’Ingénieur"],
    "Photography": ["Arts"],
    "Digital tools & technology": ["Numérique et Sciences Informatiques", "Mathématiques"],
}
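
Because the interest list and the mapping are maintained by hand, an entry can easily end up in one but not the other. The following is a minimal sketch of a consistency check, not part of the original notebook; `find_unmapped` and the sample data (including the "Robotics" entry) are hypothetical names used only for illustration.

```python
def find_unmapped(items, mapping):
    """Return the items that have no entry (or an empty entry) in mapping."""
    return [item for item in items if not mapping.get(item)]


# Small illustrative sample, not the full lists from the notebook.
sample_interests = ["Music", "Gaming", "Robotics"]  # "Robotics" is hypothetical
sample_mapping = {
    "Music": ["Arts"],
    "Gaming": ["Numérique et Sciences Informatiques", "Mathématiques"],
}

unmapped = find_unmapped(sample_interests, sample_mapping)
print(unmapped)
```

In the notebook, the same helper could be called as `find_unmapped(interests_list, interest_to_specialite)`; an empty result would mean every interest contributes to at least one spécialité.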

In [40]:
# 📚 Define the list of 2nde subjects the user can select as their strengths

strengths_list = [
    "Français",
    "Histoire-Géographie",
    "Langues vivantes (A ou B)",
    "Sciences économiques et sociales (SES)",
    "Mathématiques",
    "Physique-Chimie",
    "Sciences de la vie et de la Terre (SVT)",
    "Éducation physique et sportive (EPS)",
    "Enseignement moral et civique (EMC)",
    "Sciences numériques et technologie (SNT)"
]

In [42]:
# 🎓 Map 2nde subjects to related spécialités
strength_to_specialite = {
    "Français": ["Humanités, Littérature et Philosophie", "Littérature, langues et cultures de l’Antiquité"],
    "Histoire-Géographie": ["Histoire-Géographie, Géopolitique et Sciences Politiques", "Humanités, Littérature et Philosophie"],
    "Langues vivantes (A ou B)": ["Langues, littératures et cultures étrangères", "Littérature, langues et cultures de l’Antiquité"],
    "Sciences économiques et sociales (SES)": ["Sciences économiques et sociales", "Histoire-Géographie, Géopolitique et Sciences Politiques"],
    "Mathématiques": ["Mathématiques", "Numérique et Sciences Informatiques", "Physique-Chimie"],
    "Physique-Chimie": ["Physique-Chimie", "Mathématiques", "Sciences de l’Ingénieur"],
    "Sciences de la vie et de la Terre (SVT)": ["Sciences de la Vie et de la Terre", "Physique-Chimie"],
    "Éducation physique et sportive (EPS)": ["Sciences de la Vie et de la Terre"],
    "Enseignement moral et civique (EMC)": ["Humanités, Littérature et Philosophie", "Histoire-Géographie, Géopolitique et Sciences Politiques"],
    "Sciences numériques et technologie (SNT)": ["Numérique et Sciences Informatiques", "Mathématiques"],
}


In [44]:
from collections import Counter

def recommend_specialites(interests_selected, strengths_selected, weight_interests=1, weight_strengths=2):
    """Return up to three spécialités ranked by weighted score."""
    total_scores = Counter()

    # Each selected interest adds weight_interests to its mapped spécialités
    for interest in interests_selected:
        for sp in interest_to_specialite.get(interest, []):
            total_scores[sp] += weight_interests

    # Each selected strength adds weight_strengths to its mapped spécialités
    for strength in strengths_selected:
        for sp in strength_to_specialite.get(strength, []):
            total_scores[sp] += weight_strengths

    # Return the top 3 spécialités (fewer if fewer were scored)
    return [sp for sp, _ in total_scores.most_common(3)]
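
Since `recommend_specialites` returns only names, ties and near-misses between spécialités are not visible to the caller. Below is a minimal sketch that exposes the full score table under the same weighting scheme; `score_specialites`, `interest_excerpt`, and `strength_excerpt` are illustrative names, and the two dictionaries are short excerpts of the mappings, repeated so the example runs on its own.

```python
from collections import Counter

# Excerpts of the two mappings, repeated here so the sketch is self-contained.
interest_excerpt = {
    "Computer science": ["Numérique et Sciences Informatiques", "Mathématiques"],
}
strength_excerpt = {
    "Mathématiques": ["Mathématiques", "Numérique et Sciences Informatiques", "Physique-Chimie"],
}


def score_specialites(interests, strengths, weight_interests=1, weight_strengths=2):
    """Hypothetical helper: return every score instead of only the top 3 names."""
    scores = Counter()
    for interest in interests:
        for sp in interest_excerpt.get(interest, []):
            scores[sp] += weight_interests
    for strength in strengths:
        for sp in strength_excerpt.get(strength, []):
            scores[sp] += weight_strengths
    return scores


scores = score_specialites(["Computer science"], ["Mathématiques"])
for sp, score in scores.most_common():
    print(f"{score}  {sp}")
```

In this example Numérique et Sciences Informatiques and Mathématiques both reach 3 (1 from the interest plus 2 from the strength), while Physique-Chimie reaches 2. The two leaders are tied, so their relative order in the recommendation comes from insertion order rather than from any preference of the student.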


In [50]:
def top_two_specialites(interests_selected, strengths_selected, weight_strengths=1):
    # Pass the weight by keyword: positionally it would be read as weight_interests
    recos = recommend_specialites(interests_selected, strengths_selected, weight_strengths=weight_strengths)

    # Keep the two best-ranked spécialités; the third, if any, is the one to drop
    return {
        "to_keep": recos[:2],
        "to_drop": recos[2] if len(recos) >= 3 else None
    }


In [64]:
import pandas as pd
df = pd.read_csv("cleaned_specialites.csv")

In [66]:
df

| Unnamed: 0 | specialites | formation | nb_candidats_voeu | nb_candidats_admis | nb_candidats_acceptes |
|---|---|---|---|---|---|
| 0 | ['Arts', 'Humanités, Littérature et Philosophie'] | CPGE L | 391 | 266.0 | 138 |
| 1 | ['Arts', 'Numérique et Sciences Informatiques'] | CPGE ECG | 2 | 1.0 | 0 |
| 2 | ['Arts', 'Numérique et Sciences Informatiques'] | Formations d'ART | 163 | 51.0 | 32 |
| 3 | ['Arts', 'Physique-Chimie'] | Licence Archéologie, Ethno, Préhistoire, Anthr... | 3 | 2.0 | 1 |
| 4 | ['Arts', 'Physique-Chimie'] | Ensemble des candidats bacheliers | 513 | 470.0 | 350 |
| ... | ... | ... | ... | ... | ... |
| 3180 | ['Sciences de la vie et de la terre', "Littéra... | CPGE ECG | 0 | 0.0 | 0 |
| 3181 | ['Sciences de la vie et de la terre', "Science... | Licence AES | 4 | 4.0 | 0 |
| 3182 | ['Sciences de la vie et de la terre', "Science... | BTS production | 35 | 20.0 | 4 |
| 3183 | ['Sciences de la vie et de la terre', "Science... | CPGE ECG | 0 | 0.0 | 0 |
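
The admission statistics promised in the Output section could be derived from these counts. The sketch below assumes that `nb_candidats_voeu` counts candidates who applied and `nb_candidats_admis` counts those who received an offer; the notebook does not define these columns, so that reading is an assumption, and `admission_rate` is a hypothetical helper.

```python
def admission_rate(nb_candidats_voeu, nb_candidats_admis):
    """Share of applicants who received an offer, or None when nobody applied."""
    if not nb_candidats_voeu:
        return None  # avoids dividing by zero for rows with 0 applicants
    return nb_candidats_admis / nb_candidats_voeu


# Counts copied from rows 0 and 3180 of the table.
rate_cpge_l = admission_rate(391, 266.0)
rate_empty = admission_rate(0, 0.0)

print(f"{rate_cpge_l:.1%}")
print(rate_empty)
```

For row 0 (Arts with Humanités, Littérature et Philosophie applying to CPGE L) this gives a rate of about 68%. Rows with zero applicants return `None` rather than raising an error, which keeps them easy to filter out before display.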


In [86]:
recommended = ['Mathématiques', 'Physique-Chimie']
# `specialites` is read from the CSV as a string, so `in` is a substring test here
filtered_df = df[df['specialites'].apply(lambda x: all(spec in x for spec in recommended))]
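
Because the `specialites` column holds the text of a list rather than a list, the filter above matches substrings, and the table spells "Sciences de la vie et de la terre" in lower case while the mappings capitalise it. A minimal sketch of a stricter match follows; it assumes every cell is a valid Python list literal, and `has_all_specialites` is a hypothetical helper, not part of the original notebook.

```python
import ast


def has_all_specialites(cell, required):
    """Return True if every required spécialité is an element of the list in cell.

    cell is the raw string read from the CSV, e.g. "['Arts', 'Physique-Chimie']".
    Names are compared case-insensitively (an assumption of this sketch).
    """
    names = {name.casefold() for name in ast.literal_eval(cell)}
    return all(spec.casefold() in names for spec in required)


recommended = ["Mathématiques", "Physique-Chimie"]
print(has_all_specialites("['Mathématiques', 'Physique-Chimie']", recommended))
print(has_all_specialites("['Arts', 'Physique-Chimie']", recommended))

# With the DataFrame from the notebook, this could be applied as:
# filtered_df = df[df["specialites"].apply(lambda cell: has_all_specialites(cell, recommended))]
```

Parsing with `ast.literal_eval` evaluates only literals, so it is safe to run on text loaded from a file, and comparing whole names avoids accidental matches when one spécialité name is contained in another string.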