# Freshman On-Track (FOT) Intervention Recommender
### A Proof-of-Concept

**Goal:** To show, in just a few steps, how we can turn a description of a struggling student into a set of clear, actionable, and evidence-based strategies.

This notebook demonstrates the core **Retrieval** engine that powers our recommender. It shows how the system finds the knowledge-base documents that are most semantically similar to a description of a student's needs.

## Step 1: Setting Up the Environment

First, we need to load the project's code and install its dependencies. This cell prepares the notebook to run our custom logic.

*(This notebook is designed to run in Google Colab, but the code below will also adapt to a local environment if the project files are present.)*

In [1]:
import sys, os, warnings, json
from pathlib import Path

# This prevents common, harmless warnings from cluttering the output.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
warnings.filterwarnings(
    "ignore", category=FutureWarning
)  # Suppresses all FutureWarnings, not only the torch one

# Clones the project from GitHub if not already present.
PROJECT_DIR = "fot-intervention-recommender"
if not Path(PROJECT_DIR).is_dir():
    print("🚀 Downloading project files...")
    !git clone -q https://github.com/chuckfinca/fot-intervention-recommender.git

# Installs packages and adds the project's code to our Python path.
print("📦 Setting up the environment...")
!{sys.executable} -m pip install -q -r {PROJECT_DIR}/requirements.txt
sys.path.insert(0, str(Path(PROJECT_DIR) / "src"))

# Define the project_path variable needed by the rest of the notebook
project_path = Path(PROJECT_DIR)

print("✅ Environment is ready!")

📦 Setting up the environment...
/Users/charlesfeinn/Developer/job_applications/fot-intervention-recommender/.venv/bin/python3: No module named pip
✅ Environment is ready!


## Step 2: Define the Student (The Input)

Everything starts with a student. Our system takes a short narrative summary, of the kind an educator might write, that describes the student's challenges in plain English.

Let's use the sample profile from the project description.

In [2]:
from IPython.display import display, Markdown

student_profile = {
    "narrative_summary": "This student is struggling to keep up with coursework, "
    "having failed one core class and earning only 2.5 credits out of 4 credits "
    "expected for the semester. Attendance is becoming a concern at 88% for an average "
    "annual target of 90%, and they have had one behavioral incident. "
    "The student needs targeted academic and attendance support to get back on track for graduation."
}

student_query = student_profile["narrative_summary"]

display(Markdown(f"**Student Query:**\n> {student_query}"))

**Student Query:**
> This student is struggling to keep up with coursework, having failed one core class and earning only 2.5 credits out of 4 credits expected for the semester. Attendance is becoming a concern at 88% for an average annual target of 90%, and they have had one behavioral incident. The student needs targeted academic and attendance support to get back on track for graduation.

## Step 3: Find Relevant Strategies (The "Retrieval" Step)

Now, we take the student's story and find the most relevant strategies from our **Knowledge Base**, a curated library of best practices and proven interventions.

This next cell will perform the core retrieval logic:
1.  Load the pre-processed knowledge base and citation data.
2.  Initialize the text embedding model.
3.  Create a searchable Facebook AI Similarity Search (FAISS) vector index.
4.  Use the student query to find up to 3 of the most similar interventions, keeping only those with a similarity score of at least 0.4.

The output will show the evidence-based strategies our system identified.
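
To make the idea behind step 4 concrete, here is a minimal sketch of top-k similarity search with a score cutoff. It is not the project's `search_interventions` implementation: it assumes cosine similarity as the score, replaces the FAISS index with a brute-force loop, and uses toy 3-dimensional vectors in place of real embeddings. The names `cosine_similarity` and `top_k_similar` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as similar to nothing.
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, chunk_vecs, k=3, min_similarity_score=0.4):
    """Return up to k (chunk_index, score) pairs, best first, at or above the cutoff."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    kept = [(i, s) for i, s in scored if s >= min_similarity_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


# Toy vectors standing in for real embeddings (assumption: illustration only).
toy_chunks = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.2, 0.0]

results = top_k_similar(toy_query, toy_chunks)
for idx, score in results:
    print(f"chunk {idx}: similarity {score:.2f}")
```

Only two of the four toy chunks clear the 0.4 cutoff, so the search returns fewer than `k` results. A vector index such as FAISS serves the same purpose but avoids comparing the query against every chunk one by one in Python.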

In [4]:
display(Markdown("🚀 **Starting the retrieval pipeline...**"))
print("This may take a moment as the system loads the embedding model, prepares the knowledge base, and performs the search.")

from fot_recommender.rag_pipeline import (
    load_knowledge_base,
    initialize_embedding_model,
    create_embeddings,
    create_vector_db,
    search_interventions,
)
from fot_recommender.utils import display_recommendations

# 1. Load data
kb_path = project_path / "data" / "processed" / "knowledge_base_final_chunks.json"
citations_path = project_path / "data" / "processed" / "citations.json"
knowledge_base_chunks = load_knowledge_base(str(kb_path))
with open(citations_path, "r", encoding="utf-8") as f:
    citations_map = {item["source_document"]: item for item in json.load(f)}

# 2. Initialize the embedding model, embed the chunks, and build the vector index
embedding_model = initialize_embedding_model()
embeddings = create_embeddings(knowledge_base_chunks, embedding_model)
vector_db = create_vector_db(embeddings)

# 3. Search for the interventions most similar to the student query
retrieved_interventions = search_interventions(
    query=student_query,
    model=embedding_model,
    index=vector_db,
    knowledge_base=knowledge_base_chunks,
    k=3,
    min_similarity_score=0.4,
)

# 4. Display a clean summary and the rich results
print(
    f"✅ Successfully loaded models and retrieved the top {len(retrieved_interventions)} most relevant interventions from the knowledge base."
)
display_recommendations(retrieved_interventions, citations_map)

🚀 **Starting the retrieval pipeline...**

This may take a moment as the system loads the embedding model, prepares the knowledge base, and performs the search.
Initializing embedding model: all-MiniLM-L6-v2...
Model initialized successfully.
Creating embeddings for 27 chunks...


Embeddings created successfully.
Creating FAISS index with dimension 384...
FAISS index created with 27 vectors.

Searching for top 3 interventions for query: 'This student is struggling to keep up with coursework, having failed one core cl...'
Found 3 relevant interventions.
✅ Successfully loaded models and retrieved the top 3 most relevant interventions from the knowledge base.


### Evidence Base


**Strategy: Differentiating Intervention Tiers**
- **Source:** *Freshman On‑Track Toolkit (2nd Edition)* (Network for College Success, 2017).
- **Page(s):** Pages: 46
- **Relevance Score:** 0.57
- **Content Snippet:**
> To what degree is attendance playing a role in student performance? To whom do you refer Tier 3 students who have serious attendance issues (inside and outside of the school) so that the Success Team can really concentrate on supporting Tier 2 students?

---



**Tool: Intervention Tracking**
- **Source:** *Freshman On‑Track Toolkit (2nd Edition)* (Network for College Success, 2017).
- **Page(s):** Pages: 49
- **Relevance Score:** 0.54
- **Content Snippet:**
> Features of Good Intervention Tracking Tools:
> • Name of the intervention and what key performance indicator it addresses (attendance, point-in-time On-Track rates, GPA, behavior metric, etc.)
> • Names of the targeted students
>   ° If tracking grades, include each core course's average expressed as a percentage
> • Intervention contacts/implementation evidence
>   ° Tutoring attendance
>   ° Mentorship contact dates
>   ° "Office hours" visits
> • Point-in-time progress on the key performance indicator impacted by the intervention
>   ° Should include at least 2 checkpoints within a 10-week period
>   ° If tracking grades, provide an average expressed as a percentage for each core course
>   ° If tracking attendance, provide number of cumulative absences and/or tardies

---



**Tool: BAG Report (Example)**
- **Source:** *Freshman On‑Track Toolkit (2nd Edition)* (Network for College Success, 2017).
- **Page(s):** Pages: 61
- **Relevance Score:** 0.53
- **Content Snippet:**
> Student: Keith
> Grade Level: 9
> 8th Period Teacher: Donson
> The numbers below reflect totals through Semester 1
> 
> BEHAVIOR - In what ways do I contribute to a Safe and Respectful school climate?
> • # of Infractions (# of Major Infractions): 5 (1)
> • # of Days of In-School-Suspension (ISS): 10
> • # of Days of Out-of-School-Suspension (OSS): 0
> If I have any questions regarding my misconducts, I should schedule an appointment with the Dean of Discipline.
> 
> ATTENDANCE - Do my actions reflect the real me?
> • Days Enrolled: 80
> • Days Present: 73
> • Days Absent: 7
> • My Year-to-Date Attendance Rate is 91%
> If I have any questions regarding my attendance, I should schedule an appointment with the Attendance Dean.
> 
> GRADES - How am I doing academically in my classes? Do my grades represent my true ability?
> Period | Courses | Teacher | Grade
> P1 | Algebra 1 | Flint | D
> P2 | English 1 | Lemon | B
> P3 | World Studies | Moeller | C
> P4 | PE I-Health | Spann | A
> P5 | Lunch | | 
> P6 | Science | Tyson | D
> P7 | Photography | McCain | B
> P8 | Intro to Comp | Penny | A
> 
> My Estimated GPA is 2.57
> (this estimate does NOT include any previous semesters)
> 
> If I have any questions regarding my grade in a course, I should schedule an appointment with my Teacher.

---


## Step 4: See the Full System in the Live Demo!

You've just seen the core **Retrieval** engine at work. The system successfully took a student's story and identified the most relevant, evidence-based strategies from our knowledge base.

The final step in our retrieval-augmented generation (RAG) pipeline is **Generation**, where a large language model synthesizes this evidence into a clear, actionable recommendation for an educator. This step requires a secure API key, so we've hosted it in an interactive web application.
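
As a rough illustration of what a Generation step needs as input, the sketch below combines a student narrative with retrieved evidence into a single prompt string. It is not the hosted application's code: the function `build_prompt`, the `title` and `content` keys, and the prompt wording are all hypothetical, and no language model is called.

```python
def build_prompt(student_query, retrieved_chunks):
    """Combine the student narrative and retrieved evidence into one prompt string."""
    evidence_lines = []
    for number, chunk in enumerate(retrieved_chunks, start=1):
        # "title" and "content" are assumed keys, used here for illustration only.
        evidence_lines.append(f"[{number}] {chunk['title']}: {chunk['content']}")
    evidence = "\n".join(evidence_lines)
    return (
        "You are assisting an educator who supports ninth-grade students.\n\n"
        f"Student narrative:\n{student_query}\n\n"
        f"Evidence:\n{evidence}\n\n"
        "Using only the evidence above, recommend next steps for this student "
        "and cite each piece of evidence by its number."
    )


# Toy inputs (assumption: shortened stand-ins for real retrieval results).
sample_chunks = [
    {
        "title": "Tool: Intervention Tracking",
        "content": "Include at least 2 checkpoints within a 10-week period.",
    },
]
prompt = build_prompt("Attendance is at 88%.", sample_chunks)
print(prompt)
```

Numbering each piece of evidence lets the model's recommendation point back to a specific source, which keeps the final output traceable to the knowledge base.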

Click the link below to see the full system in action. You can use the student narrative from this notebook or try your own!

### [👉 Click Here to Launch the Live FOT Recommender API](https://huggingface.co/spaces/chuckfinca/fot-recommender-api)