# Please make a copy in your team's workspace and have fun exploring!
## Hackathon demo notebook


This notebook provides skeleton code that works through an example problem statement. Hackathon participants can use it as a starting point if needed.

### Import packages

In [None]:
import requests
from pprint import pprint
import json
from pydantic import BaseModel
from openai import OpenAI

### Run a basic LLM call to establish the 'before' results for a specific problem *without* Content Store data

Example problem statement: As students progress, they build iteratively on skills and concepts learned at each stage of their education. The challenge is to develop a tool which can analyse STA guidance and National Curriculum materials, map how key skills and concepts are built upon across Key Stages 1 to 3 and generate resources and materials specifically focused on helping teachers with students transitioning between the Key Stages.

One approach here could be to ask an LLM if it can generate some broad lesson plans for an example subject across different age groups to test how well it captures the evolution of skills and concepts across educational stages.

This example is specific to OpenAI and passes a pydantic model as the `response_format`, but many other LLMs can produce the same output if you pass a JSON schema as the `response_format` instead.

In [None]:
#Set up a pydantic model for the response format
class Lesson(BaseModel):
    title: str
    content: str

class YearGroup(BaseModel):
    year: str
    lessons: list[Lesson]

class Subject(BaseModel):
    name: str
    year_groups: list[YearGroup]

class EducationalTransitions(BaseModel):
    subjects: list[Subject]
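For an LLM without native structured output, one option is to ask for plain JSON and check the shape by hand. The sketch below is a minimal illustration using only the standard library; `check_transitions` and `sample_reply` are hypothetical names, and the reply text is made up rather than real model output.

```python
import json


def check_transitions(raw: str) -> dict:
    """Parse a JSON reply and check the subjects -> year_groups -> lessons shape.

    Raises KeyError if a field is missing and ValueError if a field is not a string.
    """
    data = json.loads(raw)
    for subject in data["subjects"]:
        if not isinstance(subject["name"], str):
            raise ValueError("subject name must be a string")
        for group in subject["year_groups"]:
            if not isinstance(group["year"], str):
                raise ValueError("year must be a string")
            for lesson in group["lessons"]:
                for field in ("title", "content"):
                    if not isinstance(lesson[field], str):
                        raise ValueError(f"lesson {field} must be a string")
    return data


# Hypothetical reply, standing in for the text an LLM might return
sample_reply = json.dumps({
    "subjects": [{
        "name": "Science",
        "year_groups": [{
            "year": "Year 2",
            "lessons": [{"title": "Materials", "content": "Sort everyday objects."}],
        }],
    }]
})

parsed = check_transitions(sample_reply)
print(parsed["subjects"][0]["name"])
```

With the OpenAI client used in this notebook, the pydantic classes above do this checking for you, so the helper is only relevant when structured parsing is unavailable.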

In [None]:
#Run the GPT call and adapt the user prompt to suit your relevant problem statement
client_openai = OpenAI(api_key="your-OpenAI-api-key") #enter your own key here

prompt = """We want to investigate how educational content develops over time within a particular subject.
            Prepare a set of lessons for a couple of subjects and year groups that evolve across time, to
            demonstrate how children can improve their understanding as they grow up. For example the topic of atoms,
            how would you teach the simpler concepts to a year 2 child comparatively to the complex content taught
            in year 9? Pick the year group transition points as you think is most sensible and important."""

completion = client_openai.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are an educational expert in lesson planning in England."},
        {"role": "user", "content": prompt},
    ],
    response_format=EducationalTransitions
)

#Inspect the output
result = completion.choices[0].message.parsed
result.model_dump()
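The dumped dictionary can be long, so a compact outline may be easier to scan. The sketch below assumes the dictionary follows the `EducationalTransitions` shape defined earlier; `outline` and `sample` are hypothetical names and the sample content is invented for illustration.

```python
def outline(transitions: dict) -> list[str]:
    """Return indented lines: subject, then year group, then lesson titles."""
    lines = []
    for subject in transitions["subjects"]:
        lines.append(subject["name"])
        for group in subject["year_groups"]:
            lines.append(f"  {group['year']}")
            for lesson in group["lessons"]:
                lines.append(f"    - {lesson['title']}")
    return lines


# Invented sample in the same shape as result.model_dump()
sample = {
    "subjects": [{
        "name": "Science",
        "year_groups": [
            {"year": "Year 2", "lessons": [{"title": "Everyday materials", "content": "..."}]},
            {"year": "Year 9", "lessons": [{"title": "Atomic structure", "content": "..."}]},
        ],
    }]
}

print("\n".join(outline(sample)))
```

In the notebook you would call `outline(result.model_dump())` instead of passing the sample.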

### Example for extracting data from the Content Store

The examples here extract data from the 'collections' and 'elements' endpoints, either via the SDK or via the API.

Make sure to play around with different endpoints to extract the data most relevant for your problem statement.

#### Access through the SDK

Make sure you have followed the steps in the documentation to download and install the Content Store SDK.

In [None]:
#Instantiate the SDK client in synchronous mode
from sdk.settings import get_settings
from sdk.client import HTTPClient

# Initialise the settings required for the SDK
settings = get_settings()

# Initialise the SDK client
client_sdk = HTTPClient()

Example to extract collections data

In [None]:
from sdk.endpoints.collections import Collections

# Initialise the collections wrapper with the previously created client instance
collections_client = Collections(client=client_sdk)

# Extract collections data
collections = collections_client.list_all()
collections_results = collections.results

In [None]:
collections_results

Example to extract elements data

In [None]:
from sdk.endpoints.elements import Elements

elements_client = Elements(client=client_sdk)

# Extract the first 100 elements in the store (increase offset to page through the rest)
all_elements = elements_client.list_all(limit=100, offset=0)

In [None]:
all_elements.model_dump()
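Because the call above takes `limit` and `offset`, a single request only returns one page. The sketch below shows one generic way to loop over pages until a short page signals the end. `fetch_all`, `fetch_page`, and the fake store are hypothetical; in the notebook, `fetch_page` might wrap `elements_client.list_all(limit=..., offset=...)` and return its results, assuming the response exposes them as a list.

```python
from typing import Callable


def fetch_all(fetch_page: Callable[[int, int], list], limit: int = 100) -> list:
    """Call fetch_page(limit, offset) repeatedly and collect every item."""
    items = []
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items.extend(page)
        # A page shorter than the limit (including an empty one) means no more data
        if len(page) < limit:
            break
        offset += limit
    return items


# Stand-in for the Content Store: 250 fake elements served in slices
fake_store = [f"element-{i}" for i in range(250)]


def fake_fetch(limit: int, offset: int) -> list:
    return fake_store[offset:offset + limit]


all_items = fetch_all(fake_fetch, limit=100)
print(len(all_items))
```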

In [None]:
# Search for elements with a specific tag e.g. GCSE
GCSE_elements = elements_client.search(
    query={
        "type": "bool",
        "must": [{"type": "terms", "field_name": "tags", "values": ["GCSE"]}],
        "pagination": {
            "limit": 100,
            "offset": 0
        }
    }
)
GCSE_elements.model_dump()

In [None]:
# Search for elements of a specific subject e.g. Mathematics
maths_elements = elements_client.search(
    query={
        "type": "bool",
        "must": [
            {
                "type": "hierarchical_match",
                "field_name": "taxonomy",
                "value": "*{1}.mathematics.*"
            }
        ],
        "pagination": {
            "limit": 2,
            "offset": 0
        }
    }
)
maths_elements.model_dump()
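The two search cells share the same outer structure, so small helper functions can cut down on repeated dictionary literals. This is a sketch only: `tag_query` and `subject_query` are hypothetical helpers that rebuild the exact query dictionaries shown above, and they do not cover any other query types the Content Store may support.

```python
def tag_query(tags, limit: int = 100, offset: int = 0) -> dict:
    """Build a query matching elements that carry any of the given tags."""
    return {
        "type": "bool",
        "must": [{"type": "terms", "field_name": "tags", "values": list(tags)}],
        "pagination": {"limit": limit, "offset": offset},
    }


def subject_query(subject: str, limit: int = 100, offset: int = 0) -> dict:
    """Build a taxonomy query for one subject, e.g. 'mathematics'."""
    return {
        "type": "bool",
        "must": [{
            "type": "hierarchical_match",
            "field_name": "taxonomy",
            # Doubled braces produce the literal "{1}" used in the pattern above
            "value": f"*{{1}}.{subject}.*",
        }],
        "pagination": {"limit": limit, "offset": offset},
    }


gcse_query = tag_query(["GCSE"])
maths_query = subject_query("mathematics", limit=2)
print(maths_query["must"][0]["value"])
```

In the notebook these would be passed as `elements_client.search(query=gcse_query)`.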

#### (or) Access directly through the API

Make sure you have followed the steps in the documentation for setting up your Content Store API key.

##### Example for extracting collections data

In [None]:
reqUrl = "https://pp-api.education.gov.uk/dev/aics-public-tst/collections" #change to desired endpoint

headers = {
  "Content-Type": "application/json",
  "Ocp-Apim-Subscription-Key": "your-key-here", #enter your Content Store API key here
  "user-agent": "python"
}

response = requests.get(reqUrl, headers=headers)

pprint(response.text)
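Hard-coding the subscription key in a shared notebook makes it easy to leak. One alternative is to read it from an environment variable, as sketched below. The variable name `CONTENT_STORE_API_KEY` and the helper `build_headers` are assumptions for illustration, not part of the Content Store documentation.

```python
import os


def build_headers(env_var: str = "CONTENT_STORE_API_KEY") -> dict:
    """Build the request headers, reading the API key from the environment."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,
        "user-agent": "python",
    }


# Demo only: provide a placeholder so the example runs without a real key
os.environ.setdefault("CONTENT_STORE_API_KEY", "your-key-here")
demo_headers = build_headers()
print(sorted(demo_headers))
```

Calling `response.raise_for_status()` straight after `requests.get` is also worth considering, since it turns an authentication or endpoint error into an exception instead of an error body that is silently printed.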

In [None]:
# Convert to a dict and inspect
collections_dict = json.loads(response.text)

# Display all collection names and corresponding IDs
for collection in collections_dict['results']:
    print(f"{collection['name']}: {collection['id']}")
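Rather than copying an ID by hand from the printed list, the names can be turned into a lookup table. The sketch assumes only the `results`, `name`, and `id` keys used in the cell above; `collection_ids_by_name` and the sample entries are hypothetical.

```python
def collection_ids_by_name(collections_dict: dict) -> dict:
    """Map each collection name to its ID."""
    return {c["name"]: c["id"] for c in collections_dict["results"]}


# Invented sample in the same shape as the collections response
sample_collections = {
    "results": [
        {"name": "Example collection A", "id": "id-a"},
        {"name": "Example collection B", "id": "id-b"},
    ]
}

lookup = collection_ids_by_name(sample_collections)
print(lookup["Example collection B"])
```

If two collections share a name, the later one wins, so check for duplicates before relying on the lookup.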

##### Example for extracting all files corresponding to a collection

In [None]:
collection_id = 'c8e37652-055d-4564-b6a8-20cd91d2172d' # Here we use the id corresponding to the collection of the National Curriculum (Key stages 1-4)
reqUrl = f"https://pp-api.education.gov.uk/dev/aics-public-tst/collections/{collection_id}/files?limit=100&offset=0"

response = requests.get(reqUrl, headers=headers)

files_dict = json.loads(response.text)

for file in files_dict['results']:
    print(f"{file['name']}: {file['id']}")

In [None]:
file_id = '7f8b50bb-6358-41ae-aaf1-32ed6f985a76' # We select the file with the id corresponding to the National Curriculum for Key Stages 1&2 Science
reqUrl = f"https://pp-api.education.gov.uk/dev/aics-public-tst/files/{file_id}/parts"

response = requests.get(reqUrl, headers=headers)
text_dict = json.loads(response.text)
pprint(text_dict['results'])
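To feed a file into an LLM prompt, the parts usually need joining into one block of text. The structure of each part is not shown in this notebook, so the sketch below takes the field name as a parameter: `parts_to_text`, the `text_key` argument, and the `"text"` field in the sample are all assumptions. Inspect `text_dict['results']` to find the real field name first.

```python
def parts_to_text(parts: list, text_key: str, separator: str = "\n\n") -> str:
    """Join the chosen field of each part, skipping parts where it is missing or empty."""
    chunks = [part[text_key] for part in parts if part.get(text_key)]
    return separator.join(chunks)


# Invented sample; the real parts may use a different field name
sample_parts = [
    {"text": "Pupils should be taught to identify common materials."},
    {"text": ""},
    {"other": "no text field here"},
    {"text": "Pupils should be taught to compare their properties."},
]

document_text = parts_to_text(sample_parts, text_key="text")
print(document_text)
```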

### Run the relevant Content Store data through a similar LLM pipeline to establish how the problem solution is improved *with* access to the Content Store

This time, run a similar LLM call but include the data you extracted from the Content Store in the prompt, to see whether access to this data improves the outcome of your solution.

In [None]:
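#Adapt the prompt to include the relevant data from the Content Store - swap {all_elements} for the variable holding your data
#The f prefix is needed so that {all_elements} is replaced by the data rather than sent as literal text
prompt = f"""We want to investigate how educational content develops over time within a particular subject.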
            Prepare a set of lessons for a couple of subjects and year groups that evolve across time, to
            demonstrate how children can improve their understanding as they grow up. For example the topic of atoms,
            how would you teach the simpler concepts to a year 2 child comparatively to the complex content taught
            in year 9?

            You will be given a dictionary containing national curricula requirements for different subjects and year
            groups here:
            {all_elements}

            Make sure that the lesson plans follow the requirements set out for the same subject or topic across
            year groups. Pick the year group transition points that you think are most sensible and important."""
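Before extracted data goes into a prompt, it may help to serialise it to a JSON string and cap its length, since a large response could exceed the model's context window. The sketch below fills a shortened stand-in template with `str.format`; `to_prompt_data`, `max_chars`, the template text, and the sample fields are all hypothetical.

```python
import json


def to_prompt_data(data, max_chars: int = 20000) -> str:
    """Serialise data to a JSON string, truncating it if it is too long."""
    # default=str lets values that are not JSON-serialisable fall back to their string form
    text = json.dumps(data, ensure_ascii=False, default=str)
    if len(text) > max_chars:
        # The cut-off string is no longer valid JSON, which is fine for prompt text
        text = text[:max_chars] + " ...[truncated]"
    return text


# Shortened stand-in for the full prompt
template = "Curriculum requirements:\n{all_elements}\nWrite lesson plans that follow them."

# Invented sample data; real Content Store fields may differ
sample_data = {"results": [{"subject": "science", "requirement": "identify common materials"}]}

filled = template.format(all_elements=to_prompt_data(sample_data))
print(filled)
```

In the notebook you might pass `all_elements.model_dump()` or the text from a specific file instead of the sample data.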

In [None]:
#Run again with new prompt and data
completion = client_openai.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are an educational expert in lesson planning in England."},
        {"role": "user", "content": prompt},
    ],
    response_format=EducationalTransitions
)

#Inspect the output
result = completion.choices[0].message.parsed
result.model_dump()
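To judge whether the Content Store data changed the output, the two runs can be compared side by side. Note that `result` is overwritten by the second call, so this sketch assumes you saved the first run's `result.model_dump()` under a separate name beforehand. `lesson_titles`, `compare_runs`, and both sample dictionaries are hypothetical.

```python
def lesson_titles(transitions: dict) -> set:
    """Collect (subject, year, lesson title) triples from a dumped result."""
    return {
        (subject["name"], group["year"], lesson["title"])
        for subject in transitions["subjects"]
        for group in subject["year_groups"]
        for lesson in group["lessons"]
    }


def compare_runs(before: dict, after: dict) -> dict:
    """Show which lessons appear in only one run and which appear in both."""
    b, a = lesson_titles(before), lesson_titles(after)
    return {
        "only_before": sorted(b - a),
        "only_after": sorted(a - b),
        "shared": sorted(a & b),
    }


def _run(titles):
    # Build an invented single-subject result for the demo
    lessons = [{"title": t, "content": "..."} for t in titles]
    return {"subjects": [{"name": "Science", "year_groups": [{"year": "Year 2", "lessons": lessons}]}]}


before = _run(["Everyday materials", "Plants"])
after = _run(["Everyday materials", "Living things and their habitats"])

diff = compare_runs(before, after)
print(diff["only_after"])
```

Matching on exact titles is crude, because an LLM may reword the same lesson. Treat the output as a prompt for manual review of the two runs, not as a quality score.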