## 1. API and Other Setup
Here I use the Meta Llama 3 "llama3-8b-8192" model, available through Groq.
The task has two steps. In the first step, the LLM generates a quiz of multiple-choice questions on a statistics topic. The user specifies the number of questions, the topic area, and the grade level (e.g., "middle school"), and supplies a source text for the questions to draw on. In the second step, the LLM evaluates whether the quiz is suitable for that grade level.

In [1]:
# import getpass
import os
import json
import pandas as pd

# os.environ["OPENAI_API_KEY"] = getpass.getpass()

from dotenv import load_dotenv

load_dotenv()  # take environment variables from .env.
KEY=os.getenv("GROQ_API_KEY")
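
`os.getenv` returns `None` when the variable is missing, so a misconfigured `.env` file would only show up later as an authentication error from the API. A minimal sketch of an earlier check, assuming a hypothetical helper `require_env` that is not part of the original notebook:

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`; raise if unset or empty.

    `require_env` is a hypothetical helper, shown for illustration only.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; check your .env file."
        )
    return value


# Usage, after load_dotenv() has run:
# KEY = require_env("GROQ_API_KEY")
```

Failing at load time keeps the error message close to its cause.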


In [2]:
from operator import itemgetter
from langchain_core.prompts import PromptTemplate
# from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
# !pip install -qU langchain-groq
from langchain_groq import ChatGroq


model = ChatGroq(groq_api_key= KEY, model="llama3-8b-8192")
# parser = StrOutputParser()
# model_parser = model | parser

In [3]:
RESPONSE_JSON = {
    "1": {
        "STATSQA": "multiple choice question",
        "options": {
            "a": "choice here",
            "b": "choice here",
            "c": "choice here",
            "d": "choice here"
        },
        "correct": "correct answer"
    },
    "2": {
        "STATSQA": "multiple choice question",
        "options": {
            "a": "choice here",
            "b": "choice here",
            "c": "choice here",
            "d": "choice here"
        },
        "correct": "correct answer"
    },
    "3": {
        "STATSQA": "multiple choice question",
        "options": {
            "a": "choice here",
            "b": "choice here",
            "c": "choice here",
            "d": "choice here"
        },
        "correct": "correct answer"
    }
}
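
The three entries in the template above are identical apart from their keys, so the same structure could also be generated. A small sketch, assuming a hypothetical helper `build_response_template` (the name and the `RESPONSE_JSON_SKETCH` variable are illustrative, not from the notebook):

```python
import json


def build_response_template(n_items: int = 3) -> dict:
    """Build a placeholder quiz template with entries keyed "1" .. str(n_items)."""
    return {
        str(i): {
            "STATSQA": "multiple choice question",
            "options": {letter: "choice here" for letter in "abcd"},
            "correct": "correct answer",
        }
        for i in range(1, n_items + 1)
    }


# Same shape as the hand-written template, built programmatically.
RESPONSE_JSON_SKETCH = build_response_template(3)
print(json.dumps(RESPONSE_JSON_SKETCH, indent=2))
```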

In [4]:
TEMPLATE="""
Text:{text}
You are an expert STATSQA maker. Given the above text, it is your job to \
create a quiz of {number} multiple choice questions related to {area} for students in {grade}. 
Make sure the questions are not repeated. Your response should be formatted like RESPONSE_JSON below with {number} items. \
Be sure to generate exactly {number} questions.
### RESPONSE_JSON
{response_json} \

"""

In [5]:
TEMPLATE2="""
You are an expert in English grammar. \
You are given a STATSQA quiz: {quiz} of multiple choice questions related to {area} in statistics. \
You need to evaluate the complexity of the quiz. Use at most 50 words for complexity analysis. \
If the quiz is too easy or too difficult for students in {grade}, 
update the quiz questions to make them more suitable for the students in {grade}.

Review of the above quiz by an expert English writer:
"""

## 2. Use LangChain Expression Language (LCEL)

In [6]:
quiz_generation_prompt = PromptTemplate(
    template=TEMPLATE, 
    input_variables=["text", "number", "area", "grade", "response_json"]
)
# quiz_generation_prompt = PromptTemplate.from_template(
#     template=TEMPLATE
# )
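
With its default format, `PromptTemplate` fills `{name}` placeholders the same way Python's `str.format` does. To preview a filled prompt without calling the model, something like the following could be used. `DEMO_TEMPLATE` is a shortened stand-in for `TEMPLATE`, and the values are made up for illustration:

```python
import json

# Shortened stand-in for TEMPLATE; the placeholder names match the notebook's.
DEMO_TEMPLATE = (
    "Text:{text}\n"
    "Create a quiz of {number} multiple choice questions related to {area} "
    "for students in {grade}.\n"
    "### RESPONSE_JSON\n"
    "{response_json}\n"
)

demo_prompt = DEMO_TEMPLATE.format(
    text="(source text here)",
    number=2,
    area="hypothesis testing",
    grade="middle school",
    # Braces inside a substituted value are inserted literally, not re-parsed,
    # which is why the JSON template can be passed in as a variable.
    response_json=json.dumps({"1": {"STATSQA": "multiple choice question"}}),
)
print(demo_prompt)
```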

In [7]:
quiz_evaluation_prompt = PromptTemplate.from_template(TEMPLATE2)

In [8]:
quiz_generation_chain= quiz_generation_prompt | model

In [9]:
quiz_evaluation_chain = quiz_evaluation_prompt | model

In [10]:
# Read the source text from data.txt, one level above the working directory.
file_path = os.path.join(os.getcwd(), "..", "data.txt")
with open(file_path, "r", encoding="utf-8") as file:
    TEXT = file.read()

In [11]:
NUMBER=5
AERA ="hypothesis testing"
GRADE="middle school"

In [12]:
# quiz_generation_chain.invoke(
#     {
#         "text": TEXT,
#         "number": NUMBER,
#         "area": AERA,
#         "grade": GRADE,
#         "response_json": json.dumps(RESPONSE_JSON),
#     }
# )

In [13]:
complete_chain = ({
    "text": itemgetter("text"),
    "number": itemgetter("number"),
    "area": itemgetter("area"),
    "grade": itemgetter("grade"),
    "response_json": itemgetter("response_json"),
    "quiz": quiz_generation_chain
    }
    | RunnablePassthrough.assign(eval=quiz_evaluation_chain)
)
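
In the chain above, the leading dict runs each of its values on the same input and collects the results by key, and `RunnablePassthrough.assign` then adds one more key computed from that combined dict. The data flow can be mimicked in plain Python. This is only an illustration of the semantics, not how LangChain implements it, and `run_parallel`, `assign`, `fake_generate`, and `fake_evaluate` are hypothetical names:

```python
from operator import itemgetter


def run_parallel(steps: dict, inputs: dict) -> dict:
    """Apply every callable in `steps` to the same `inputs`; collect results by key."""
    return {key: step(inputs) for key, step in steps.items()}


def assign(state: dict, **steps) -> dict:
    """Return `state` plus one new key per callable in `steps`."""
    return {**state, **{key: step(state) for key, step in steps.items()}}


# Stand-ins for the two LLM chains (hypothetical; no model is called).
def fake_generate(inputs: dict) -> str:
    return f"quiz with {inputs['number']} questions on {inputs['area']}"


def fake_evaluate(state: dict) -> str:
    return f"review for {state['grade']}: {state['quiz']}"


inputs = {"number": 5, "area": "hypothesis testing", "grade": "middle school"}
state = run_parallel(
    {
        "number": itemgetter("number"),
        "area": itemgetter("area"),
        "grade": itemgetter("grade"),
        "quiz": fake_generate,
    },
    inputs,
)
result = assign(state, eval=fake_evaluate)
print(result["eval"])
```

The evaluation step sees both the original inputs and the generated quiz, which is why the inputs are forwarded with `itemgetter` rather than dropped.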

In [14]:
all_result = complete_chain.invoke(
    {
        "text": TEXT,
        "number": NUMBER,
        "area": AERA,
        "grade": GRADE,
        "response_json": json.dumps(RESPONSE_JSON)
    }
)

In [15]:
# quiz_evaluation_chain.invoke(
#     {
#         "area": "point estimation",
#         "grade": "high school",
#         "quiz": quiz_generation_chain
#     }
# )

In [16]:
all_result

{'text': 'Estimation statistics, or simply estimation, is a data analysis framework that uses a combination of effect sizes, confidence intervals, precision planning, and meta-analysis to plan experiments, analyze data and interpret results.[1] It complements hypothesis testing approaches such as null hypothesis significance testing (NHST), by going beyond the question is an effect present or not, and provides information about how large an effect is.[2][3] Estimation statistics is sometimes referred to as the new statistics.[3][4][5]\n\nThe primary aim of estimation methods is to report an effect size (a point estimate) along with its confidence interval, the latter of which is related to the precision of the estimate.[6] The confidence interval summarizes a range of likely values of the underlying population effect. Proponents of estimation see reporting a P value as an unhelpful distraction from the important business of reporting an effect size with its confidence intervals,[7] and

In [17]:
all_result.get("eval")

AIMessage(content='After reviewing the quiz, I\'d rate its complexity as moderate. The language is clear and concise, making it accessible to middle school students. The vocabulary is not overly technical, and the options are well-crafted to guide students towards the correct answers.\n\nHowever, the concepts presented may still be challenging for some middle school students, especially if they have limited exposure to hypothesis testing and estimation statistics. To make the quiz more suitable for middle school students, I\'d suggest rephrasing some questions to make them more concrete and relatable.\n\nFor example, question 1 could be rephrased to "What do estimation methods aim to do?" instead of "What is the primary aim of estimation methods?" This would make the question more student-friendly and easier to understand.\n\nAdditionally, providing real-life examples or scenarios to illustrate the concepts could help students better grasp the ideas. This might involve adding a few con

Extract the JSON object from the raw string in "quiz" so that it can be parsed.

In [18]:
import json

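def extract_between_braces(s):
    """Return the substring from the opening '{' through the last '}}'.

    Assumes the quiz is a compact nested JSON object ending in '}}';
    returns an empty string if no such span is found.
    """
    start = s.find('{')
    # If several '{' appear in a row, start from the last one of that run
    while start != -1 and start + 1 < len(s) and s[start + 1] == '{':
        start = s.find('{', start + 1)

    end = s.rfind('}}')

    if start != -1 and end != -1 and end > start:
        return s[start:end + 2]
    return ""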

quiz_string = all_result.get("quiz").content
quiz_string = extract_between_braces(quiz_string)
print(quiz_string)


{"1": {"STATSQA": "What is the primary aim of estimation methods?", "options": {"a": "To test if an effect is present or not", "b": "To report an effect size along with its confidence interval", "c": "To determine the significance of the results", "d": "To compare the results with other studies"}, "correct": "b"}, "2": {"STATSQA": "What is the function of a confidence interval in estimation statistics?", "options": {"a": "To determine the significance of the results", "b": "To summarize a range of likely values of the underlying population effect", "c": "To test if an effect is present or not", "d": "To report an effect size along with its precision"}, "correct": "b"}, "3": {"STATSQA": "What do proponents of estimation statistics see as an unhelpful distraction?", "options": {"a": "Hypothesis testing", "b": "Null hypothesis significance testing", "c": "Reporting a P value", "d": "Meta-analysis"}, "correct": "c"}, "4": {"STATSQA": "What is the term sometimes used to refer to estimation 

In [19]:
quiz = json.loads(quiz_string)
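
The extraction above depends on the reply ending in `}}`, which would fail if the model pretty-printed the JSON or put whitespace between the closing braces. A possible alternative, sketched here with the standard library's `json.JSONDecoder.raw_decode`, lets the decoder find where the object ends. The helper name `extract_first_json_object` and the sample `reply` are hypothetical:

```python
import json


def extract_first_json_object(s: str) -> dict:
    """Return the first JSON object found in `s`, ignoring surrounding text.

    Tries each '{' in turn as a possible start and lets the JSON decoder
    locate the matching end of the object.
    """
    decoder = json.JSONDecoder()
    start = s.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(s, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = s.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


# Hypothetical model reply with commentary around pretty-printed JSON.
reply = 'Here is the quiz:\n{\n  "1": {"STATSQA": "Q?", "correct": "b"\n  }\n}\nGood luck!'
print(extract_first_json_object(reply))
```

This returns the parsed dict directly, so a separate `json.loads` call would not be needed.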

In [20]:
quiz

{'1': {'STATSQA': 'What is the primary aim of estimation methods?',
  'options': {'a': 'To test if an effect is present or not',
   'b': 'To report an effect size along with its confidence interval',
   'c': 'To determine the significance of the results',
   'd': 'To compare the results with other studies'},
  'correct': 'b'},
 '2': {'STATSQA': 'What is the function of a confidence interval in estimation statistics?',
  'options': {'a': 'To determine the significance of the results',
   'b': 'To summarize a range of likely values of the underlying population effect',
   'c': 'To test if an effect is present or not',
   'd': 'To report an effect size along with its precision'},
  'correct': 'b'},
 '3': {'STATSQA': 'What do proponents of estimation statistics see as an unhelpful distraction?',
  'options': {'a': 'Hypothesis testing',
   'b': 'Null hypothesis significance testing',
   'c': 'Reporting a P value',
   'd': 'Meta-analysis'},
  'correct': 'c'},
 '4': {'STATSQA': 'What is the t

In [21]:
quiz_table_data = []
for key, value in quiz.items():
    STATSQA = value["STATSQA"]
    options = " | ".join(
        [
            f"{option}: {option_value}"
            for option, option_value in value["options"].items()
            ]
        )
    correct = value["correct"]
    quiz_table_data.append({"STATSQA": STATSQA, "Choices": options, "Correct": correct})
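
The loop above raises a `KeyError` if the model omits a field, and it does not check that the requested number of questions came back. A minimal validation sketch that could run before flattening, assuming (as in the output shown earlier) that "correct" holds an option label such as "b"; `find_quiz_problems` and `sample` are hypothetical:

```python
def find_quiz_problems(quiz: dict, expected_count: int) -> list:
    """Return a list of human-readable problems found in a parsed quiz dict."""
    problems = []
    if len(quiz) != expected_count:
        problems.append(f"expected {expected_count} questions, got {len(quiz)}")
    for key, item in quiz.items():
        missing = {"STATSQA", "options", "correct"} - set(item)
        if missing:
            problems.append(f"question {key}: missing keys {sorted(missing)}")
            continue
        # Assumption: 'correct' is an option label, not the answer text.
        if item["correct"] not in item["options"]:
            problems.append(f"question {key}: 'correct' is not one of the option labels")
    return problems


sample = {
    "1": {"STATSQA": "Q1?", "options": {"a": "x", "b": "y"}, "correct": "b"},
    "2": {"STATSQA": "Q2?", "options": {"a": "x", "b": "y"}, "correct": "e"},
}
print(find_quiz_problems(sample, expected_count=2))
```

An empty list means the quiz has the expected shape and can be converted to a table safely.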

In [22]:
quiz=pd.DataFrame(quiz_table_data)
quiz

,STATSQA,Choices,Correct
0,What is the primary aim of estimation methods?,a: To test if an effect is present or not | b:...,b
1,What is the function of a confidence interval ...,a: To determine the significance of the result...,b
2,What do proponents of estimation statistics se...,a: Hypothesis testing | b: Null hypothesis sig...,c
3,What is the term sometimes used to refer to es...,a: New data analysis | b: New statistics | c: ...,b
4,What is an advantage of estimation statistics ...,a: It provides more precise results | b: It re...,b


In [23]:
quiz.to_csv(f"quiz_on_{AERA}.csv",index=False)

