# 🧑‍🏫 Large Language Models for Education 🧑‍🏫

Ref from: 
- https://github.com/aws-samples/amazon-sagemaker-immersion-day-for-research/tree/main/10.%20Generative_AI/1.%20Education_QnA

---

### 🧑‍🎓 A note on Generative AI in Education

By harnessing generative AI, educators can unlock new and captivating products, enabling them to craft engaging and interactive learning experiences that promote student growth. Experts envision a future where generative AI empowers educators to revolutionize the way knowledge is imparted, paving the way for transformative educational practices.

---

In this notebook, we demonstrate how to use large language models (LLMs) for use cases in education.  LLMs can be used for tasks such as summarization, question-answering, or the generation of question & answer pairs.

Text summarization is the task of shortening a text into a summary that preserves its most important information. Here, we show how to use the pre-trained **FLAN T5** model for text summarization, as well as for the other tasks.

In the first part of the notebook, we select and deploy the **FLAN T5** model as a SageMaker Real-time endpoint, on a single `ml.p3.2xlarge` instance. SageMaker Real-time endpoints are ideal for inference workloads with real-time, interactive, low-latency requirements. These endpoints are fully managed, automatically serve your models through HTTP, and support auto-scaling.

Once the model is deployed and ready to use, we demonstrate how it can be queried, how to prompt the model for summarization, question-answering and the generation of question & answer pairs.
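
As a rough sketch of what querying a real-time endpoint involves, the snippet below builds a JSON request, sends it through a runtime client, and reads back the generated text. The function names are hypothetical, and the payload keys (`text_inputs`, `max_length`, `temperature`, `do_sample`, `seed`) and the `generated_texts` response key are assumptions about what the deployed container accepts; they are not taken from this notebook's helper module.

```python
import json


def build_payload(prompt, max_length=100, temperature=None, seed=None):
    """Serialize a text-generation request as UTF-8 JSON bytes.

    The key names used here are an assumption about the container's API.
    """
    payload = {"text_inputs": prompt, "max_length": int(max_length)}
    if temperature is not None:
        payload["temperature"] = float(temperature)
        payload["do_sample"] = True  # temperature only has an effect when sampling
    if seed is not None:
        payload["seed"] = int(seed)
    return json.dumps(payload).encode("utf-8")


def parse_response(body_bytes):
    """Return the first generated text (assumes a "generated_texts" list)."""
    return json.loads(body_bytes)["generated_texts"][0]


def query_endpoint(runtime_client, endpoint_name, prompt, **kwargs):
    """Send a prompt to a real-time endpoint and return the generated text.

    `runtime_client` is expected to behave like boto3.client("sagemaker-runtime").
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_payload(prompt, **kwargs),
    )
    return parse_response(response["Body"].read())
```

Passing the client in as an argument keeps the sketch easy to try with a stand-in object before an endpoint exists.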

The final section is split into four demos on querying the Wikipedia article on quantum computing, an ebook text from [Project Gutenberg](https://www.gutenberg.org/), a scientific pdf article from arxiv.org, and the Australian Budget 2023-24 Medicare Overview.

## 1. Setting up the SageMaker Endpoint

### 1.1 Install Python Dependencies and SageMaker setup

Before executing the notebook, some initial setup steps are required. This notebook requires the latest versions of `sagemaker` and several other libraries.
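
Once the packages are installed, it can be handy to confirm which versions are actually in the environment. The following is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the package list simply mirrors the `pip install` commands in this notebook.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(["sagemaker", "langchain", "PyPDF2", "bs4"]).items():
    print(f"{name}: {version or 'not installed'}")
```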

In [None]:
%%capture
!pip install --upgrade pip
!pip install -U sagemaker
!pip install -U langchain
!pip install -U PyPDF2

In [None]:
!pip install bs4

Next, we load the SDK and helper scripts. We import the required packages, configure logging, and define a helper that creates a SageMaker session, as shown below.

In [None]:
import sagemaker, boto3, json, logging
from sagemaker import image_uris, instance_types, model_uris, script_uris
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.session import Session
from sagemaker.utils import name_from_base
from IPython.display import display, HTML, IFrame

In [None]:
logger = logging.getLogger('sagemaker')
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())

In [None]:
logger.info(f'Using sagemaker=={sagemaker.__version__}')
logger.info(f'Using boto3=={boto3.__version__}')

In [None]:
# Create the folders for downloaded model artifacts and for the source documents
!mkdir -p download_dir
!mkdir -p source_documents_dir

In [None]:
def get_sagemaker_session(local_download_dir) -> sagemaker.Session:
    """Return the SageMaker session."""

    sagemaker_client = boto3.client(
        service_name="sagemaker", region_name=boto3.Session().region_name
    )

    session_settings = sagemaker.session_settings.SessionSettings(
        local_download_dir=local_download_dir
    )

    # the unit test will ensure you do not commit this change
    session = sagemaker.session.Session(
        sagemaker_client=sagemaker_client, settings=session_settings
    )

    return session

### 1.2 Deploying a SageMaker Endpoint

Using SageMaker, we can perform inference on the pre-trained model, even without fine-tuning it first on a new dataset. We start by retrieving the `deploy_image_uri`, `deploy_source_uri`, and `model_uri` for the pre-trained model. To host the pre-trained model, we create an instance of [`sagemaker.model.Model`](https://sagemaker.readthedocs.io/en/stable/api/inference/model.html) and deploy it. This may take a few minutes.

In [None]:
sagemaker_session = Session()
aws_role = sagemaker_session.get_caller_identity_arn()
aws_region = boto3.Session().region_name
sess = sagemaker.Session()

# We select the Flan-T5 XL model available in the Hugging Face container.
model_id, model_version = "huggingface-text2text-flan-t5-xl", "*"
_model_env_variable_map = {
    "huggingface-text2text-flan-t5-xl": {"MMS_DEFAULT_WORKERS_PER_MODEL": "1"},
}

endpoint_name = name_from_base(f"jumpstart-example-{model_id}")
instance_type = 'ml.p3.2xlarge'
logger.info(f'Using role {aws_role} in region {aws_region}')

In [None]:
# Retrieve the inference docker container uri. This is the base HuggingFace container image for the default model above.
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,  # automatically inferred from model_id
    image_scope="inference",
    model_id=model_id,
    model_version=model_version,
    instance_type=instance_type,
)

# Retrieve the inference script uri. This includes all dependencies and scripts for model loading, inference handling etc.
deploy_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="inference"
)

# Retrieve the model uri.
model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="inference"
)

# Create the SageMaker model instance
if model_id in _model_env_variable_map:
    # For those large models, we already repack the inference script and model
    # artifacts for you, so the `source_dir` argument to Model is not required.
    model = Model(
        image_uri=deploy_image_uri,
        model_data=model_uri,
        role=aws_role,
        predictor_cls=Predictor,
        name=endpoint_name,
        env=_model_env_variable_map[model_id],
    )
else:
    model = Model(
        image_uri=deploy_image_uri,
        source_dir=deploy_source_uri,
        model_data=model_uri,
        entry_point="inference.py",  # entry point file in source_dir and present in deploy_source_uri
        role=aws_role,
        predictor_cls=Predictor,
        name=endpoint_name,
        sagemaker_session=get_sagemaker_session("download_dir"),
    )

In [None]:
%%time
# deploy the Model. Note that we need to pass Predictor class when we deploy model through Model class,
# for being able to run inference through the sagemaker API.
print(f'Deploying endpoint {endpoint_name} on 1 x {instance_type} (this will take approximately 6-8 minutes)')
try:
    model_predictor = model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        predictor_cls=Predictor,
        endpoint_name=endpoint_name,
    )
except Exception as e:
    print(f'Error: {e}')
    print('Two common reasons for this error:')
    print('1. You are in an AWS region that does not have the ml.p3.2xlarge instance type')
    print('2. You have exceeded the service quota of this AWS account')
    raise  # re-raise so that later cells do not run against a missing endpoint

In [None]:
print(f'Successfully deployed endpoint {endpoint_name} on 1 x {instance_type}')

## 3. LLM Demos for Education

In [None]:
import nlp_helper
nlp_helper.endpoint_name = endpoint_name

In this notebook, we use the following texts to demonstrate summarization tasks and the generation of question & answer pairs.

1. Quantum Computing from Wikipedia: https://en.wikipedia.org/wiki/Quantum_computing
<!-- 1. Quantum Computing and Quantum Information (by Nielsen & Chuang): https://michaelnielsen.org/qcqi/QINFO-book-nielsen-and-chuang-toc-and-chapter1-nov00.pdf (this is a sample chapter from [this website](https://michaelnielsen.org/qcqi/)) -->
2. Winnie the Pooh (by Alan Alexander Milne): https://www.gutenberg.org/ebooks/67098.txt.utf-8
3. Attention is all you need (by Vaswani et al): https://arxiv.org/pdf/1706.03762.pdf
4. Australian Budget 2023-24 Overview: https://budget.gov.au/content/overview/download/budget_overview-20230511.pdf

Note that for this notebook, we are using the Flan T5 XL model for simplicity and ease of deployment; additional fine-tuning or a stronger model would be required to get better results.

In [None]:
# Download pdfs and texts with the `curl` command. Flags used here are `-L` (allow redirects),
# `-s` (for silent mode) and `-o` (to specify the output file name).

# Attention is all you need (by Vaswani et al)
!curl -Ls https://arxiv.org/pdf/1706.03762.pdf -o source_documents_dir/attention.pdf
# Australian Budget 2023-24 Overview
!curl -Ls https://budget.gov.au/content/overview/download/budget_overview-20230511.pdf -o source_documents_dir/aus_budget_overview-2023-24.pdf

### 3.1 Wikipedia Page on Quantum Computing

In this example, a Wikipedia page on Quantum Computing is used for context. The LLM is used for keyword generation, a point by point summary, and a set of question and answer pairs. You may also wish to replace the Wikipedia URL with a website, blog, or news article of your own preference.
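
To give a feel for the paragraph-extraction step, here is a minimal sketch that pulls the text of every `<p>` element out of an HTML string. It is not the implementation in `nlp_helper`; it uses the standard library's `html.parser` so that it runs without extra dependencies, and the names `ParagraphParser` and `paragraphs_from_html` are hypothetical.

```python
from html.parser import HTMLParser


class ParagraphParser(HTMLParser):
    """Collect the whitespace-normalized text of every <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_paragraph = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # a new <p> implicitly closes an open one
            self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_paragraph:
            self._buffer.append(data)

    def _flush(self):
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []


def paragraphs_from_html(html):
    """Return the non-empty paragraphs found in an HTML document."""
    parser = ParagraphParser()
    parser.feed(html)
    parser.close()
    parser._flush()  # keep a trailing paragraph that was never closed
    return parser.paragraphs
```

Slicing the returned list, for example `paragraphs_from_html(html)[1:11]`, would then drop the first paragraph and keep the next ten.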

In [None]:
NCHARS = 400     # We will show just the first and last 400 characters of each extracted text. Increase this number for more context.
NQUESTIONS = 10  # The number of Q&A pairs that we will generate.

In [None]:
wiki_paragraphs = nlp_helper.extract_paragraphs_from_html(
    nlp_helper.download_url_text('https://en.wikipedia.org/wiki/Quantum_computing')
)[1:11]  # Skip the first paragraph and keep the next 10
wiki_txt = '\n\n'.join(wiki_paragraphs)
# print(f'{wiki_txt[:NCHARS]}...\n\n...{wiki_txt[-NCHARS:]}')
IFrame('https://en.wikipedia.org/wiki/Quantum_computing', width=800, height=300)

#### Key word Generation

In [None]:
KEY_WORDS = nlp_helper.generate_text_from_prompt(
    f'FIND KEY WORDS\n\nContext:\n{wiki_txt}\nKey Words:',
    seed=12345
)
key_word_list = KEY_WORDS.split(', ')
print(KEY_WORDS)

#### Summary of key points

For each of the paragraphs, let's create a short summary.
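
The per-paragraph summarization can be sketched as a truncate-then-prompt loop. This is an illustration under assumptions, not the code behind `nlp_helper.summarize`: the prompt wording, the function names, and the `generate` callable (any function that maps a prompt string to generated text) are all hypothetical.

```python
def build_summary_prompt(text, max_chars=1500):
    """Truncate the text and wrap it in a summarization instruction."""
    snippet = text[:max_chars].strip()
    return f"Summarize the following text:\n\n{snippet}\n\nSummary:"


def summarize_paragraphs(paragraphs, generate, max_chars=1500):
    """Return one numbered summary line per paragraph.

    `generate` is any callable that maps a prompt string to generated text.
    """
    return [
        f"{i}. {generate(build_summary_prompt(paragraph, max_chars)).strip()}"
        for i, paragraph in enumerate(paragraphs, start=1)
    ]
```

Truncating each paragraph keeps the prompt within the model's input limit, at the cost of ignoring any text beyond the cut-off.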

In [None]:
summary = []
for i, x in enumerate(wiki_paragraphs):
    summary.append(f'{i+1}. {nlp_helper.summarize(x[:1500])}')

In [None]:
HTML(
    '<h4>Key Points</h4>' + 
    '\n'.join([ f'<li>{x}</li>' for x in summary ])
)

In [None]:
# The 10 points above can be used to create an even shorter summary.
nlp_helper.summarize('\n'.join(summary))

#### Checking for correct answers

In this example, we first generate a reference "correct answer" from the text. We then supply
two student answers: one incorrect, and one correct but phrased slightly differently from the
reference answer. The LLM is used to check whether each student answer is correct.
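
Because the model replies in free text, it can help to build the grading prompt in one place and map the reply to a boolean. The sketch below follows the prompt layout used in this section; `build_grading_prompt` and `parse_verdict` are hypothetical helpers, and the set of accepted reply words is an assumption.

```python
def build_grading_prompt(context, question, reference_answer, student_answer):
    """Assemble a grading prompt in the same layout as the cells in this section."""
    return (
        f"Context:{context}\n"
        f"Question: {question}\n"
        f"Answer: {reference_answer}\n"
        f"Student: {student_answer}\n"
        "Is this answer correct?"
    )


def parse_verdict(model_reply):
    """Map a free-text reply to True, False, or None when it is unclear."""
    words = model_reply.strip().lower().split()
    first = words[0].strip(".,!:;") if words else ""
    if first in {"yes", "correct", "true"}:
        return True
    if first in {"no", "incorrect", "false"}:
        return False
    return None
```

Returning `None` for anything unexpected makes it possible to flag those replies for human review instead of silently treating them as wrong.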

In [None]:
prompt=f"""Context:{wiki_txt}
What is quantum computing?"""
answer = nlp_helper.generate_text_from_prompt(prompt, temperature=0.01)
print(answer)

In [None]:
prompt=f"""Context:{wiki_txt}
Question: What is quantum computing?
Answer: {answer}
Student: Quantum computing is using computers with quantum dots
Is this answer correct?"""
print(nlp_helper.generate_text_from_prompt(prompt, temperature=0.01))

In [None]:
prompt=f"""Context:{wiki_txt}
Question: What is quantum computing?
Answer: {answer}
Student: Quantum computing involves using computers that make use of quantum mechanics
Is this answer correct?"""
print(nlp_helper.generate_text_from_prompt(prompt, temperature=0.01))

#### Generation of Question & Answer Pairs
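
One simple way to produce question and answer pairs is a two-step prompt: ask the model for a question about the context, then ask it to answer that question from the same context. The sketch below illustrates that idea only; it is not how `nlp_helper.create_qna_pairs` is implemented, and the prompt wording and the `generate` callable are assumptions.

```python
def create_qna_pairs_sketch(context, n_questions, generate):
    """Return up to n_questions unique (question, answer) pairs.

    `generate` is any callable that maps a prompt string to generated text.
    """
    pairs = []
    seen = set()
    for _ in range(n_questions):
        question = generate(
            f"Context:\n{context}\n\nGenerate a question about the context.\nQuestion:"
        ).strip()
        if not question or question in seen:
            continue  # skip empty or repeated questions
        seen.add(question)
        answer = generate(
            f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
        ).strip()
        pairs.append((question, answer))
    return pairs
```

With fully deterministic decoding every attempt would return the same question, so in practice the generator needs some sampling (for example a different seed per attempt) for the loop to yield more than one pair.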

In [None]:
nlp_helper.create_qna_pairs(wiki_txt, NQUESTIONS, output_style=nlp_helper.QNA_OUTPUT_STYLE, seed=1234)

### 3.2 Winnie the Pooh (by Alan Alexander Milne)

In [None]:
winnie_the_pooh = nlp_helper.download_url_text('https://www.gutenberg.org/ebooks/67098.txt.utf-8')

In [None]:
x = winnie_the_pooh.find('CHAPTER III')
pooh_txt = winnie_the_pooh[x:x+5000]  # Extract the first 5000 characters of chapter 3
print(pooh_txt)

In [None]:
nlp_helper.ask(pooh_txt, "What is the storyline here?")

In [None]:
nlp_helper.ask(pooh_txt, "Who is the main character?")

In [None]:
nlp_helper.ask(pooh_txt, "What happens at the end?")

In [None]:
nlp_helper.create_qna_pairs(pooh_txt, NQUESTIONS, output_style=nlp_helper.QNA_OUTPUT_STYLE, seed=12345)

### 3.3 Attention is all you need (by Vaswani et al)

In [None]:
attention = nlp_helper.extract_pages('source_documents_dir/attention.pdf')

In [None]:
attention_txt = '\n\n'.join(attention[1:3] + attention[9:10])  # We use list indices 1 and 2 (for the intro) and 9 (for the conclusion)
# print(f'{attention_txt[:NCHARS]}...\n\n\n...{attention_txt[-NCHARS:]}')
IFrame('source_documents_dir/attention.pdf', width=800, height=400)

#### Question Answering

In [None]:
nlp_helper.ask(attention_txt, "What is the main gist of the paper?")

In [None]:
nlp_helper.ask(attention_txt, "What is the problem being solved?")

In [None]:
nlp_helper.ask(attention_txt, "What is the conclusion of the paper?")

In [None]:
chunk_size = len(attention_txt)//8
print(f'The text will be split up into chunks of {chunk_size} characters and summarized')

In [None]:
display(HTML('<h4>Key Points</h4>'))
summary = []
for i in range(8):
    x0 = i*chunk_size
    x1 = (i+1)*chunk_size
    line_summary = f'{i+1}. {nlp_helper.summarize(attention_txt[x0:x1])}'
    display(HTML(line_summary))
    summary.append(line_summary)

In [None]:
nlp_helper.create_qna_pairs(attention_txt, NQUESTIONS, output_style=nlp_helper.QNA_OUTPUT_STYLE)

### 3.4 Australian Budget 2023-24 Overview (Medicare)

In this example, we look at the Australian Budget 2023-24 and we focus on the Medicare improvements.
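
When picking pages out of a document, keep in mind that Python slices are 0-based and exclude the end index, while page numbers are usually counted from 1. A small helper can make the intent explicit. This sketch assumes the extracted document is a list with one text string per page; `select_pages` is a hypothetical name.

```python
def select_pages(pages, first_page, last_page):
    """Return pages first_page..last_page inclusive, using 1-based page numbers."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return pages[first_page - 1:last_page]
```

For example, `select_pages(pages, 25, 27)` returns the same items as `pages[24:27]`. The page numbers printed in a PDF may also differ from the position of the page in the file.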

In [None]:
# Extract the pages from the Budget overview and keep the Medicare-related ones.
aus_budget_overview = nlp_helper.extract_pages('source_documents_dir/aus_budget_overview-2023-24.pdf')
txt_aus_budget_overview_medicare = '\n\n'.join(aus_budget_overview[24:27])  # The slice [24:27] selects list items 24, 25 and 26 (the end index is exclusive). Those pages cover the Medicare budget.
print(f'{txt_aus_budget_overview_medicare[:NCHARS]}...\n\n\n...{txt_aus_budget_overview_medicare[-NCHARS:]}')

#### Question Answering

In [None]:
nlp_helper.summarize(txt_aus_budget_overview_medicare)

In [None]:
nlp_helper.create_qna_pairs(txt_aus_budget_overview_medicare, 5, output_style=nlp_helper.QNA_OUTPUT_STYLE)

In [None]:
nlp_helper.ask(txt_aus_budget_overview_medicare,"What is a Level B consultation?")

In [None]:
nlp_helper.ask(txt_aus_budget_overview_medicare, "How much is the government investing?")

In [None]:
nlp_helper.ask(txt_aus_budget_overview_medicare, "Is the government helping homeless people?")

## 4. [Optional] LLM Demos for Education Part II

In this section, we deploy a Gradio app that takes a URL as input, and allows us to answer questions based on the content of the web page.

### 4.1 Gradio Demo App

In [None]:
%%capture
!pip install gradio

In [None]:
!pip install huggingface_hub ipywidgets

In [None]:
import gradio as gr

In [None]:
from nlp_helper import extract_paragraphs_from_html, generate_text_from_prompt, ask, download_url_text, summarize

In [None]:
def url2context(url):
    paragraph_list = extract_paragraphs_from_html(
        download_url_text(url)
    )[1:11]  # We will skip the first paragraph, and take only 10 paragraphs
    return '\n\n'.join(paragraph_list)

In [None]:
def chatbot(prompt, temperature, max_length, url):
    if url == "":
        return generate_text_from_prompt(prompt, max_length, temperature)
    else:
        return ask(url2context(url), prompt)

def summary(url):
    context = url2context(url)
    key_words = generate_text_from_prompt(
        f'FIND KEY WORDS\n\nContext:\n{context}\nKey Words:'
    )
    return f"""{summarize(context)}\n\nKey words: {key_words}"""

with gr.Blocks() as demo:
    gr.Markdown("## Flan T5 Chatbot Demo")
    with gr.Row():
        with gr.Column():
            url = gr.Textbox(label="URL", placeholder="Enter URL here", lines=1, show_label=True,
                             value="https://mmrjournal.biomedcentral.com/articles/10.1186/s40779-022-00416-w"
                             # value="https://k12.libretexts.org/Bookshelves/Science_and_Technology/Biology/03%3A_Genetics/3.14%3A_Human_Genome"
                            )
    with gr.Row():
        with gr.Column():
            prompt = gr.Textbox(
                label="Prompt", placeholder="Enter your prompt here", lines=3, show_label=True,
                value="How do mRNA vaccines work for pancreatic cancer treatment?")
            temperature = gr.Slider(label="Temperature", minimum=0.0, maximum=1.0, value=0.5)
            max_length = gr.Slider(label="Max Length", minimum=20, maximum=400, value=100)
        with gr.Column():
            output = gr.Textbox(label="Output", lines=10, show_label=True)
    with gr.Row():
        with gr.Column():
            submit_btn = gr.Button("Submit")
        with gr.Column():
            summary_btn = gr.Button("Summary")
    submit_btn.click(
        fn=chatbot,
        inputs=[prompt, temperature, max_length, url],
        outputs=output,
        api_name="chatbot",
        queue=False
    )
    summary_btn.click(
        fn=summary,
        inputs=[url],
        outputs=output,
        api_name="summary",
        queue=False
    )

demo.launch(share=True)

In [None]:
demo.close()

## 5. Cleanup

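Delete the SageMaker model and endpoint when you are finished so that the instance stops incurring charges. Uncomment the lines in the cell below to run the cleanup.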
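
If the cleanup is scripted, it can be useful to attempt both deletions even when one of them fails. The following is a minimal sketch; `cleanup` is a hypothetical helper, and it assumes the object passed in exposes `delete_model()` and `delete_endpoint()` in the way the SageMaker `Predictor` does.

```python
def cleanup(predictor):
    """Delete the model, then the endpoint; return the steps that failed."""
    failures = []
    for step in ("delete_model", "delete_endpoint"):
        try:
            getattr(predictor, step)()
        except Exception as error:  # keep going so one failure does not block the other
            failures.append((step, str(error)))
    return failures
```

An empty list means both resources were removed; otherwise each entry names the step that failed and the error message, which can be checked in the SageMaker console.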

In [None]:
#model_predictor.delete_model()
#model_predictor.delete_endpoint()

To completely shut down SageMaker, go to File > Shut Down > Shutdown All.