In [2]:
import minsearch
import json

with open('./documents.json', 'rt') as f:
    docs_raw = json.load(f)

In [3]:
documents = []
for course_dict in docs_raw:
    for doc in course_dict['documents']:
        doc['course'] = course_dict['course']
        documents.append(doc)

In [4]:
documents[0]

{'text': "The purpose of this document is to capture frequently asked technical questions\nThe exact day and hour of the course will be 15th Jan 2024 at 17h00. The course will start with the first  “Office Hours'' live.1\nSubscribe to course public Google Calendar (it works from Desktop only).\nRegister before the course starts using this link.\nJoin the course Telegram channel with announcements.\nDon’t forget to register in DataTalks.Club's Slack and join the channel.",
 'section': 'General course-related questions',
 'question': 'Course - When will the course start?',
 'course': 'data-engineering-zoomcamp'}

In [5]:
index = minsearch.Index(
    text_fields=['question', 'text', 'section'],
    keyword_fields=['course'],
)

In [6]:
q = "the course has already started, can i still enroll?"

In [7]:
index.fit(documents)

<minsearch.minsearch.Index at 0x79e436e955e0>

In [8]:
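# NOTE: minsearch looks up boost_dict by text-field name ('question', 'text',
# 'section'). The keys below are query words, not field names, so they appear
# to have no effect on the ranking. A field-level boost would look like
# {'question': 3.0}.
boost = {
    'enroll': 2.0,
    'started': 1.0,
    'course': 0.5
}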

results = index.search(
    query=q,
    filter_dict={'course': 'machine-learning-zoomcamp'},
    boost_dict=boost,
    num_results=5,
)
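
As far as the minsearch README shows, `boost_dict` is keyed by text-field name, and each field's similarity score is multiplied by its boost before the scores are summed. The sketch below illustrates that idea only. It swaps in a toy token-overlap score for the library's TF-IDF cosine similarity, and `overlap_score` and `boosted_score` are hypothetical helpers, not part of minsearch.

```python
def overlap_score(query, text):
    """Toy relevance: fraction of query tokens that also appear in the text."""
    q_tokens = set(query.lower().split())
    t_tokens = set(text.lower().split())
    if not q_tokens:
        return 0.0
    return len(q_tokens & t_tokens) / len(q_tokens)


def boosted_score(query, doc, text_fields, boost):
    """Sum per-field scores, each weighted by its boost (default 1.0).

    Keys in `boost` that are not in `text_fields` are never looked up,
    so they change nothing.
    """
    return sum(
        boost.get(field, 1.0) * overlap_score(query, doc.get(field, ""))
        for field in text_fields
    )


doc = {"question": "can i enroll", "text": "yes", "section": "general"}
fields = ["question", "text", "section"]

# Boost keyed by a field name: the question match is weighted 3x.
by_field = boosted_score("enroll now", doc, fields, {"question": 3.0})

# Boost keyed by a query word: no field is called 'enroll', so it is ignored.
by_word = boosted_score("enroll now", doc, fields, {"enroll": 2.0})
```

With this toy scorer, only the first boost changes the score, which is why boosts are usually written against field names such as `question` or `text`.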

In [9]:
results

[{'text': 'Yes, you can. You won’t be able to submit some of the homeworks, but you can still take part in the course.\nIn order to get a certificate, you need to submit 2 out of 3 course projects and review 3 peers’ Projects by the deadline. It means that if you join the course at the end of November and manage to work on two projects, you will still be eligible for a certificate.',
  'section': 'General course-related questions',
  'question': 'The course has already started. Can I still join it?',
  'course': 'machine-learning-zoomcamp'},
 {'text': 'Welcome to the course! Go to the course page (http://mlzoomcamp.com/), scroll down and start going through the course materials. Then read everything in the cohort folder for your cohort’s year.\nClick on the links and start watching the videos. Also watch office hours from previous cohorts. Go to DTC youtube channel and click on Playlists and search for {course yyyy}. ML Zoomcamp was first launched in 2021.\nOr you can just use this lin

In [10]:
q

'the course has already started, can i still enroll?'

In [11]:
import ollama

In [12]:
response = ollama.chat(
    model="gpt-oss:20b",
    messages=[
        {
            "role": "user",
            "content": q
        }
    ]
)

In [16]:
print(response['message']['content'])

Sure! It’s not uncommon for courses to still let you join even after the first session has already happened. Here’s a quick checklist to help you get in on the action:

| What to do | Why it matters | Tips |
|------------|----------------|------|
| **1. Check the enrollment button** | Some platforms automatically unlock the course for anyone who signs up, regardless of the start date. | On the course landing page, look for “Enroll now,” “Join course,” or a similar button. |
| **2. Review the course syllabus & calendar** | Even if you’re a late‑comer, you’ll want to see what you’ll miss and how much you’ll need to catch up. | Pay attention to lecture dates, assignment due dates, and any live‑event schedules. |
| **3. Look for a “Waitlist” or “Enroll late” option** | Many programs provide a waitlist or an “late enrollee” pathway that ensures you still get access. | If the button says “Join waitlist” or “Enroll later,” click it. |
| **4. Contact the instructor or support team** | A quick 

In [23]:
def get_prompt_template(context, question):
    """Return the RAG prompt with the retrieved context and the question filled in."""
    prompt_template = f"""
    You're a course teaching assistant. Your job is to help students with their 
    questions and provide explanations on various topics.
    Answer the question based on the CONTEXT. Use only the facts from the CONTEXT when answering the QUESTION.
    If the CONTEXT doesn't contain the answer, output NONE.

    CONTEXT:
    {context}
    
    QUESTION:
    {question}
    """
    return prompt_template

In [18]:
context = ""
for doc in results:
    context += f"section: {doc['section']}\nquestion: {doc['question']}\ntext: {doc['text']}\n\n" 
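
The loop above can also be wrapped in a small reusable helper. This is a minimal sketch: `build_context` is a hypothetical name, and it assumes every result has `section`, `question`, and `text` keys, as in the documents shown earlier. Unlike the loop, it does not leave a trailing blank line after the last entry.

```python
def build_context(results):
    """Format retrieved documents into a single context string.

    Entries are separated by one blank line.
    """
    entries = [
        f"section: {doc['section']}\nquestion: {doc['question']}\ntext: {doc['text']}"
        for doc in results
    ]
    return "\n\n".join(entries)


# Small illustrative input (placeholder values, not real search results).
sample = [
    {"section": "S1", "question": "Q1", "text": "T1"},
    {"section": "S2", "question": "Q2", "text": "T2"},
]
sample_context = build_context(sample)
```

Building the pieces in a list and joining them once avoids repeated string concatenation and keeps the separator in one place.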

In [20]:
print(context)

section: General course-related questions
question: The course has already started. Can I still join it?
text: Yes, you can. You won’t be able to submit some of the homeworks, but you can still take part in the course.
In order to get a certificate, you need to submit 2 out of 3 course projects and review 3 peers’ Projects by the deadline. It means that if you join the course at the end of November and manage to work on two projects, you will still be eligible for a certificate.

section: General course-related questions
question: I just joined. What should I do next? How can I access course materials?
text: Welcome to the course! Go to the course page (http://mlzoomcamp.com/), scroll down and start going through the course materials. Then read everything in the cohort folder for your cohort’s year.
Click on the links and start watching the videos. Also watch office hours from previous cohorts. Go to DTC youtube channel and click on Playlists and search for {course yyyy}. ML Zoomcamp w

In [24]:
aug_prompt = get_prompt_template(context, q)

In [25]:
aug_prompt

"\n    You're a course teaching assistant. Your job is to help students with their \n    questions and provide explanations on various topics.\n    Answer the question based on the CONTEXT. Use only the facts from the CONTEXT when answering the QUESTION.\n    If the CONTEXT doesn't contain the answer, output NONE.\n\n    CONTEXT:\n    section: General course-related questions\nquestion: The course has already started. Can I still join it?\ntext: Yes, you can. You won’t be able to submit some of the homeworks, but you can still take part in the course.\nIn order to get a certificate, you need to submit 2 out of 3 course projects and review 3 peers’ Projects by the deadline. It means that if you join the course at the end of November and manage to work on two projects, you will still be eligible for a certificate.\n\nsection: General course-related questions\nquestion: I just joined. What should I do next? How can I access course materials?\ntext: Welcome to the course! Go to the course 

In [26]:
aug_response = ollama.chat(
    model="gpt-oss:20b",
    messages=[
        {"role": "user", "content": aug_prompt}
    ]
)

In [27]:
print(aug_response['message']['content'])

Yes – you can still enroll. You may miss the opportunity to submit some of the earlier homeworks, but you can still participate in the course and, if you complete two projects and review three peers’ projects by the deadline, you’ll still be eligible for a certificate.
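
The cells above (search, build context, fill the prompt, call the model) can be combined into one function. The sketch below is one possible arrangement, not the notebook's own code: `rag`, `search_fn`, and `llm_fn` are hypothetical names, and the search and model calls are passed in as callables so the function can be tried without minsearch or Ollama. The template is dedented before formatting so the prompt has no leading indentation, and a reply of `NONE` is mapped to `None`.

```python
from textwrap import dedent

# Same instructions as the notebook prompt, dedented once up front.
PROMPT_TEMPLATE = dedent("""\
    You're a course teaching assistant. Your job is to help students with their
    questions and provide explanations on various topics.
    Answer the question based on the CONTEXT. Use only the facts from the CONTEXT when answering the QUESTION.
    If the CONTEXT doesn't contain the answer, output NONE.

    CONTEXT:
    {context}

    QUESTION:
    {question}
""")


def rag(question, search_fn, llm_fn):
    """Retrieve documents, build the augmented prompt, and ask the model.

    search_fn: callable taking the question and returning a list of dicts
               with 'section', 'question', and 'text' keys (assumption).
    llm_fn:    callable taking the prompt string and returning the reply text.
    Returns None when the model answers NONE, otherwise the stripped reply.
    """
    results = search_fn(question)
    context = "\n\n".join(
        f"section: {doc['section']}\nquestion: {doc['question']}\ntext: {doc['text']}"
        for doc in results
    )
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    answer = llm_fn(prompt).strip()
    return None if answer == "NONE" else answer
```

With the objects from this notebook, `search_fn` could be `lambda q: index.search(query=q, filter_dict={'course': 'machine-learning-zoomcamp'}, num_results=5)` and `llm_fn` could be `lambda p: ollama.chat(model="gpt-oss:20b", messages=[{"role": "user", "content": p}])['message']['content']`.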
