In [None]:
#install required packages
%pip install --upgrade --quiet langchain langchain-community langchain-openai wikipedia

In [None]:
#Prompt for the OpenAI API key without echoing it to the screen
import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()



In [None]:
#import libraries

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WikipediaLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI


In [None]:
#Loads the Wikipedia page for the chosen subject and splits it into overlapping sections
#that the AI will use as context when building questions. The model used is GPT-4.
subject = "baseball"
doc = WikipediaLoader(query=subject, load_max_docs=1, doc_content_chars_max=-1).load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(doc)
llm = ChatOpenAI(model_name="gpt-4", temperature=0)
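The chunk_size and chunk_overlap settings above control how much text each section holds and how much neighbouring sections share. The sketch below illustrates the overlap idea with a plain fixed-size sliding window. It is a simplification and not the library's algorithm: RecursiveCharacterTextSplitter splits on separators such as paragraphs and lines and treats chunk_size as an upper bound. The helper name sliding_chunks is hypothetical.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters.

    Simplified illustration only (assumption): the real splitter cuts on
    separators and does not produce exact fixed-size windows.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

With chunk_size=10 and chunk_overlap=4, each window starts 6 characters after the previous one, so the last 4 characters of one chunk reappear at the start of the next. The overlap reduces the chance that a sentence needed for a question is cut in half at a section boundary.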

In [None]:
#An example of one section produced from the wikipedia page
splits[0].page_content

'Baseball is a bat-and-ball sport played between two teams of nine players each, taking turns batting and fielding. The game occurs over the course of several plays, with each play generally beginning when a player on the fielding team, called the pitcher, throws a ball that a player on the batting team, called the batter, tries to hit with a bat. The objective of the offensive team (batting team) is to hit the ball into the field of play, away from the other team\'s players, allowing its players to run the bases, having them advance counter-clockwise around four bases to score what are called "runs". The objective of the defensive team (referred to as the fielding team) is to prevent batters from becoming runners, and to prevent runners\' advance around the bases. A run is scored when a runner legally advances around the bases in order and touches home plate (the place where the player started as a batter).'
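This notebook always works with the first section, splits[0], so that runs are comparable. A minimal sketch of picking a section at random instead, while still allowing a repeatable choice through a seed, could look like the following. The helper pick_section and the stand-in list are hypothetical. In the notebook the list would be built from the page_content of each element of splits.

```python
import random


def pick_section(sections, seed=None):
    """Return one section at random; passing a seed makes the choice repeatable."""
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    return rng.choice(sections)


# Stand-in data (assumption); in the notebook: [s.page_content for s in splits]
sections = ["intro", "rules", "history", "equipment"]

first = pick_section(sections, seed=42)
second = pick_section(sections, seed=42)
print(first == second)
```

Using a local random.Random instance rather than random.seed keeps the selection independent of any other code that draws random numbers.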

In [None]:
#Builds the prompt that creates the question to administer to the student.
#The standard used for evaluation is specified in the core_std variable.
#Injects a section of the Wikipedia page loaded earlier to use as context.
#Note that this notebook fixes the section (splits[0].page_content) so that the
#results are reproducible. In production, a section can be selected at random.

core_std = """draw evidence from the context to support analysis,
              reflection, and research. """

template = """Use the following piece of context to create a free response question at the end
regarding the topics in the context.
The question is aimed at a student and has to be formulated to understand the student's ability
to """ + core_std + """
Context:

{context}

Create an introduction about the context, repeat the context between quotes,
 and at the end of your message ask the question.
"""
custom_rag_prompt = PromptTemplate.from_template(template)

rag_chain = (custom_rag_prompt
             | llm
             | StrOutputParser()
            )

ai_question = rag_chain.invoke({"context": splits[0].page_content})
print(ai_question)

Introduction:

We are going to delve into the world of baseball, a popular bat-and-ball sport that is played between two teams. This sport is not just about hitting the ball and scoring runs, but it also involves strategic planning and teamwork. The game is played in several plays and involves two main roles - the pitcher from the fielding team and the batter from the batting team. The objective of the game for both teams is clearly defined. 

Context:

"Baseball is a bat-and-ball sport played between two teams of nine players each, taking turns batting and fielding. The game occurs over the course of several plays, with each play generally beginning when a player on the fielding team, called the pitcher, throws a ball that a player on the batting team, called the batter, tries to hit with a bat. The objective of the offensive team (batting team) is to hit the ball into the field of play, away from the other team's players, allowing its players to run the bases, having them advance cou

In [None]:
#Here we build the grading prompt. We give the rubric to the AI and use chain-of-thought
#prompting so that the AI explains its grading logic as feedback.
#We also ask the AI to produce a follow-up question.
#Note that we give the AI a very basic student answer, and it correctly grades it as progressing.

grading_prompt = """Grade the answer below based on the question below and the context below. Use these grades: needs improvement,
progressing, meets,exceeds. The rubric for the grades is below. Describe your grading step by step.
End your message with a follow-up question on the context that covers different aspects of the context
not asked in the question below.

Rubric:

Needs improvement: Unable to apply evidence gathered from the context to support written analysis, reflection, and research.
Progressing: Requires prompting and support to apply evidence gathered from
the context to support written analysis, reflection, and
research.
Meets: Independently able to apply evidence gathered from the context to support written analysis, reflection,
and research
Exceeds: Independently able to apply evidence gathered from the context and other sources to support written analysis, reflection,
and research

Question:

{question}

Answer:

{answer}

Context:

{context}
"""

student_answer = """ The objective of the batting team is to hit the ball into
 the field of play. The objective of the defensive team
 is to prevent batters from becoming runners. """

grade_prompt = PromptTemplate.from_template(grading_prompt)

grade_chain = (grade_prompt
               | llm
               | StrOutputParser()
              )

ai_grading = grade_chain.invoke({"context" : splits[0].page_content,
                                "question": ai_question,
                                "answer"  : student_answer})
print(ai_grading)

Grade: Progressing

The answer does correctly identify the basic objectives of the batting and fielding teams in a baseball game, which shows that the respondent has understood and applied some of the information from the context. However, the answer does not fully meet the requirements of the question. The respondent has not explained how a player scores a run, nor have they suggested any strategies that the fielding team might use to prevent this. This shows that the respondent has not fully applied all the evidence gathered from the context to support their written analysis. 

Follow-up question: Can you explain the role of the pitcher in the fielding team and how the batting team might strategize to hit the ball effectively?
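To present the follow-up question to the student separately from the feedback, the response can be split on its marker. The sketch below assumes the model keeps the "Follow-up question:" wording seen in the output above, which the prompt does not guarantee. The helper split_follow_up is hypothetical and returns None for the question when the marker is missing.

```python
def split_follow_up(response, marker="Follow-up question:"):
    """Split a grading response into (feedback, follow_up_question).

    Assumption: the model labels the question with the exact marker text.
    If the marker is absent, the whole response is returned as feedback.
    """
    head, sep, tail = response.rpartition(marker)
    if not sep:
        return response.strip(), None
    return head.strip(), tail.strip()


sample = (
    "Grade: Progressing\n\n"
    "The answer identifies the basic objectives of both teams.\n\n"
    "Follow-up question: Can you explain the role of the pitcher?"
)
feedback, follow_up = split_follow_up(sample)
print(follow_up)
```

In the notebook, the same call could be applied to ai_grading, and the extracted question could be passed back to the grading chain as the next question.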


# Test / quality control

In [None]:
#For practical reasons (OpenAI limits on personal use of the API),
#we use a shortened version of the Wikipedia page we want to test:
#only the text that precedes the '== History ==' heading.
#In production we can use the whole Wikipedia page.
short_page = doc[0].page_content.split('== History ==')[0]


In [None]:
#The test checks that the AI creates an open-ended question that asks the student
#to summarize the context given to them. Here we build the prompt fed to the AI,
#changing the core standard used to evaluate the student (the summary_std variable).

summary_std = """summarize a given text. """

summary_template = """Use the following piece of context to create a question at the end
regarding the topics in the context. The question asks the student to summarize the context.
The question is aimed at a student and has to be formulated to understand the student's ability
to """ + summary_std + """
Context:

{context}.

Question:
"""
summary_prompt = PromptTemplate.from_template(summary_template)

summary_chain = (summary_prompt
                 | llm
                 | StrOutputParser()
                )

ai_summary_question = summary_chain.invoke({"context": short_page})

In [None]:
#Here we test that the AI has produced a question that asks the student to summarize the context.
#The test checks, ignoring case, that the AI's question contains the word summary and/or summarize.
test_result = any(el in ai_summary_question.lower() for el in ('summary', 'summarize'))
print('Did AI generate a correct question? ', test_result)
print('Question generated: \n', ai_summary_question)

Did AI generate a correct question?  True
Question generated: 
 Can you summarize the context provided, detailing the rules, gameplay, personnel, strategy, and tactics of baseball?
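A substring check of this kind can miss spelling variants such as "summarise". One possible tightening, sketched below, uses a case-insensitive regular expression with word boundaries. The pattern and the helper asks_for_summary are hypothetical, and the list of accepted word forms is an assumption that may need extending.

```python
import re

# Accepted word forms are an assumption; extend the alternation as needed.
SUMMARY_PATTERN = re.compile(
    r"\bsummar(?:y|ies|ize[sd]?|ise[sd]?|izing|ising)\b",
    re.IGNORECASE,
)


def asks_for_summary(question):
    """Return True if the question contains a whole-word form of summary/summarize."""
    return SUMMARY_PATTERN.search(question) is not None


print(asks_for_summary("Can you summarize the context provided?"))
print(asks_for_summary("What is the role of the pitcher?"))
```

A keyword test only confirms that the question mentions summarizing. It does not verify that the question is well formed or answerable from the context.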


In [None]:
#Here we test the AI's ability to grade the student's answer. We use the Wikipedia summary
#of the page to simulate the student's answer, and we expect the AI to grade it
#as meets or exceeds.

qc_template = """Grade the summary below based on the question below and the context below.
Use these grades: needs improvement,progressing, meets, exceeds.
The rubric for the grades is below. Describe your grading step by step.

Rubric:

Needs improvement: Unable to apply summarization gathered from the context.
Progressing: Requires prompting and support to apply summarization gathered from
the context.
Meets: Independently able to apply summarization gathered from the context.
Exceeds: Independently able to apply summarization gathered from the context.
The summary has depth and coverage of the context. It captures the fundamental aspects of
the context.


Question:

{question}

Summary:

{summary}

Context:

{context}
"""



qc_prompt = PromptTemplate.from_template(qc_template)

qc_chain = (qc_prompt
            | llm
            | StrOutputParser()
           )

ai_qc = qc_chain.invoke({"context"  : short_page,
                          "question": ai_summary_question,
                          "summary" : doc[0].metadata['summary']})

In [None]:
#Here we test that the AI has graded the simulated answer correctly.
#The test checks, ignoring case, that the AI's grading mentions meets or exceeds.
grading_test_result = any(el in ai_qc.lower() for el in ('meets', 'exceeds'))
print('Did the AI grade the answer correctly? ', grading_test_result)
print('AI grading and explanation: \n', ai_qc)

Did the AI grade the answer correctly?  True
AI grading and explanation: 
 The summary exceeds expectations. It independently applies the summarization gathered from the context and captures the fundamental aspects of the context. The summary provides a comprehensive overview of the rules, gameplay, personnel, strategy, and tactics of baseball. It includes details about the roles of the players, the structure of the game, the equipment used, and the strategic decisions made during the game. It also provides historical context and information about the popularity and professional organization of the sport. The summary is well-structured and covers all the key points from the context.
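Checking only whether the words meets or exceeds appear anywhere in the response can pass a lower grade whose explanation happens to mention those words. A hedged alternative is to take the first rubric label mentioned and compare it against a minimum on an ordinal scale. This is a heuristic sketch: it assumes the model states its grade before discussing other rubric levels, and the names GRADE_ORDER, first_grade_mentioned and passes_threshold are hypothetical.

```python
import re

# Ordinal scale for the rubric labels used in the grading prompts.
GRADE_ORDER = {"needs improvement": 0, "progressing": 1, "meets": 2, "exceeds": 3}
GRADE_PATTERN = re.compile(
    r"\b(needs improvement|progressing|meets|exceeds)\b", re.IGNORECASE
)


def first_grade_mentioned(text):
    """Return the first rubric label found in the text (lowercased), or None."""
    match = GRADE_PATTERN.search(text)
    return match.group(1).lower() if match else None


def passes_threshold(text, minimum="meets"):
    """True if the first grade mentioned is at or above the minimum grade."""
    grade = first_grade_mentioned(text)
    return grade is not None and GRADE_ORDER[grade] >= GRADE_ORDER[minimum]


print(passes_threshold("The summary exceeds expectations. It captures the key points."))
print(passes_threshold("Grade: Progressing\n\nThe answer does not yet reach the meets level."))
```

The second example contains the word meets but is graded progressing, so the threshold check rejects it while a plain substring check would accept it.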
