# Getting Started

**Useful Documentation:**
* [LangChain](https://python.langchain.com/docs/)
* [Beautiful Soup](https://beautiful-soup-4.readthedocs.io/en/latest/#quick-start)
* [CCSB GitHub Repo](https://github.com/lgtanimoto/CulturalCSBytes/tree/main)

**Installations:**
```bash
$ pip install beautifulsoup4
$ pip install langchain
$ pip install openai
$ pip install python-dotenv
$ pip install requests
```

**Imports:**

In [None]:
# built-in python modules
import json
import os
from io import StringIO
from html.parser import HTMLParser
import random
from typing import Dict, List

# langchain
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage, AIMessage
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.prompts.example_selector.base import BaseExampleSelector

# python libraries
from dotenv import load_dotenv
from bs4 import BeautifulSoup
import requests

**Set Up the OpenAI API Key:**
* Generate a secret key in the [OpenAI API key manager](https://platform.openai.com/account/api-keys).
* Create a `.env` file containing the environment variable
```openai_api_key=insert_your_key_here```
* Load the API key by running the following code block.

In [None]:
load_dotenv()
openai_api_key = os.getenv("openai_api_key")
llm = OpenAI(model_name="gpt-4", openai_api_key=openai_api_key)
chat_turbo = ChatOpenAI(temperature=0, openai_api_key=openai_api_key, model_name="gpt-3.5-turbo")
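
If the `.env` file is missing or the variable name is misspelled, `os.getenv` returns `None` and the failure only shows up later, when the first request is sent. The sketch below is one way to fail early and to confirm the key was loaded without printing it in the notebook output. The helper names `require_env` and `mask` are hypothetical and are not part of the original notebook.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file and rerun load_dotenv()")
    return value


def mask(secret, visible=4):
    """Hide all but the last `visible` characters so the key never appears in full."""
    return "*" * max(len(secret) - visible, 0) + secret[-visible:]
```

After `load_dotenv()` has run, `print(mask(require_env("openai_api_key")))` shows only the last four characters of the key.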

**Load Example Questions from the [GitHub Repo](https://github.com/lgtanimoto/CulturalCSBytes/tree/main/content/A000/A000/)**

Note: we scrape the 50 default-culture questions to use as examples when generating new questions.

In [None]:
# scrape default questions
URL = "https://github.com/lgtanimoto/CulturalCSBytes/tree/main/content/A000/A000"
page = requests.get(URL)
page.raise_for_status()  # stop early if the listing page could not be fetched
soup = BeautifulSoup(page.content, "html.parser")

# get all links to CCSB question json files (href=True skips anchors that have no href attribute)
links = [
    "https://raw.githubusercontent.com" + link["href"].replace("blob/", "")
    for link in soup.find_all("a", href=True)
    if ".json" in link["href"]
]
# list of JSON objects for default questions
default_questions = [json.loads(requests.get(link).text) for link in links]
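
The link rewriting above can be checked without any network access. The sketch below isolates that step: it turns a GitHub `blob` path into the matching `raw.githubusercontent.com` URL and drops duplicate links. The helper names and the file name `example.json` are hypothetical, and the sketch assumes the hrefs are site-relative paths like the ones the scraper collects.

```python
def to_raw_url(href):
    """Convert a site-relative GitHub 'blob' path into a raw.githubusercontent.com URL."""
    # Remove only the first "/blob/" segment; the rest of the path is unchanged.
    return "https://raw.githubusercontent.com" + href.replace("/blob/", "/", 1)


def raw_json_urls(hrefs):
    """Keep only .json links, convert them to raw URLs, and drop duplicates in order."""
    unique = {}
    for href in hrefs:
        if href.endswith(".json"):
            unique[to_raw_url(href)] = None  # dict keys preserve insertion order
    return list(unique)


# Hypothetical hrefs, including a duplicate and a non-JSON link.
hrefs = [
    "/lgtanimoto/CulturalCSBytes/blob/main/content/A000/A000/example.json",
    "/lgtanimoto/CulturalCSBytes/blob/main/content/A000/A000/example.json",
    "/lgtanimoto/CulturalCSBytes/tree/main/content",
]
urls = raw_json_urls(hrefs)
```

Deduplicating matters if the listing page repeats an anchor for the same file, since each extra link costs one more HTTP request.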

In [None]:
# Strip HTML from string
class MLStripper(HTMLParser):
    """Collects only the text content of an HTML fragment, discarding the tags."""
    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp; into text
        self.text = StringIO()
    def handle_data(self, d):
        self.text.write(d)
    def get_data(self):
        return self.text.getvalue()

def strip_tags(html):
    s = MLStripper()
    s.feed(html)
    s.close()  # flush any text the parser is still buffering
    return s.get_data()

# remove unwanted fields from dictionary (modifies the question in place)
def remove_fields(question, fields=('FixedOrderFromBottom', 'AnswerImage', 'QuestionImage')):
    for field in fields:
        question.pop(field, None)  # ignore fields that are not present
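
As a rough illustration of how the two cleaning steps fit together, the sketch below strips HTML from every string field of a question and drops the unwanted fields in one pass, returning a new dictionary instead of modifying the input. The names `to_plain_text` and `clean_question` and the sample question are hypothetical; this is a self-contained variant for experimentation, not the notebook's own pipeline.

```python
from html.parser import HTMLParser
from io import StringIO


class _TextOnly(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.buffer = StringIO()

    def handle_data(self, data):
        self.buffer.write(data)


def to_plain_text(html):
    """Return the text content of an HTML fragment."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.buffer.getvalue()


def clean_question(question, drop=("FixedOrderFromBottom", "AnswerImage", "QuestionImage")):
    """Return a copy without the dropped fields and with HTML removed from string values."""
    return {
        key: to_plain_text(value) if isinstance(value, str) else value
        for key, value in question.items()
        if key not in drop
    }


# Hypothetical question in the same shape as the scraped JSON.
sample = {
    "Question": "<p>What is an <em>algorithm</em>?</p>",
    "CorrectAnswer": 3,
    "AnswerImage": "",
}
cleaned = clean_question(sample)
```

Returning a copy keeps the scraped `default_questions` untouched, which helps when the same cell is rerun.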

Create learning objective list and groupings.

In [None]:
# learning standards (objectives), keyed by their full code (e.g. "1A-CS-01")
raw_objectives = {
    "1A-CS-01": "Select and operate appropriate software to perform a variety of tasks, and recognize that users have different needs and preferences for the technology they use.",
    "1A-CS-02": "Use appropriate terminology in identifying and describing the function of common physical components of computing systems (hardware).",
    "1A-CS-03": "Describe basic hardware and software problems using accurate terminology.",
    "1A-NI-04": "Explain what passwords are and why we use them, and use strong passwords to protect devices and information from unauthorized access.",
    "1A-DA-05": "Store, copy, search, retrieve, modify, and delete information using a computing device and define the information stored as data.",
    "1A-DA-06": "Collect and present the same data in various visual formats.",
    "1A-DA-07": "Identify and describe patterns in data visualizations, such as charts or graphs, to make predictions.",
    "1A-AP-08": "Model daily processes by creating and following algorithms (sets of step-by-step instructions) to complete tasks.",
    "1A-AP-09": "Model the way programs store and manipulate data by using numbers or other symbols to represent information.",
    "1A-AP-10": "Develop programs with sequences and simple loops, to express ideas or address a problem.",
    "1A-AP-11": "Decompose (break down) the steps needed to solve a problem into a precise sequence of instructions.",
    "1A-AP-12": "Develop plans that describe a program’s sequence of events, goals, and expected outcomes.",
    "1A-AP-13": "Give attribution when using the ideas and creations of others while developing programs.",
    "1A-AP-14": "Debug (identify and fix) errors in an algorithm or program that includes sequences and simple loops.",
    "1A-AP-15": "Using correct terminology, describe steps taken and choices made during the iterative process of program development.",
    "1A-IC-16": "Compare how people live and work before and after the implementation or adoption of new computing technology.",
    "1A-IC-17": "Work respectfully and responsibly with others online.",
    "1A-IC-18": "Keep login information private, and log off of devices appropriately.",
    "1B-CS-01": "Describe how internal and external parts of computing devices function to form a system.",
    "1B-CS-02": "Model how computer hardware and software work together as a system to accomplish tasks.",
    "1B-CS-03": "Determine potential solutions to solve simple hardware and software problems using common troubleshooting strategies.",
    "1B-NI-04": "Model how information is broken down into smaller pieces, transmitted as packets through multiple devices over networks and the Internet, and reassembled at the destination.",
    "1B-NI-05": "Discuss real-world cybersecurity problems and how personal information can be protected.",
    "1B-DA-06": "Organize and present collected data visually to highlight relationships and support a claim.",
    "1B-DA-07": "Use data to highlight or propose cause-and-effect relationships, predict outcomes, or communicate an idea.",
    "1B-AP-08": "Compare and refine multiple algorithms for the same task and determine which is the most appropriate.",
    "1B-AP-09": "Create programs that use variables to store and modify data.",
    "1B-AP-10": "Create programs that include sequences, events, loops, and conditionals.",
    "1B-AP-11": "Decompose (break down) problems into smaller, manageable subproblems to facilitate the program development process.",
    "1B-AP-12": "Modify, remix, or incorporate portions of an existing program into one's own work, to develop something new or add more advanced features.",
    "1B-AP-13": "Use an iterative process to plan the development of a program by including others' perspectives and considering user preferences.",
    "1B-AP-14": "Observe intellectual property rights and give appropriate attribution when creating or remixing programs.",
    "1B-AP-15": "Test and debug (identify and fix errors) a program or algorithm to ensure it runs as intended.",
    "1B-AP-16": "Take on varying roles, with teacher guidance, when collaborating with peers during the design, implementation, and review stages of program development.",
    "1B-AP-17": "Describe choices made during program development using code comments, presentations, and demonstrations.",
    "1B-IC-18": "Discuss computing technologies that have changed the world, and express how those technologies influence, and are influenced by, cultural practices.",
    "1B-IC-19": "Brainstorm ways to improve the accessibility and usability of technology products for the diverse needs and wants of users.",
    "1B-IC-20": "Seek diverse perspectives for the purpose of improving computational artifacts.",
    "1B-IC-21": "Use public domain or creative commons media, and refrain from copying or using material created by others without permission.",
    "2-CS-01": "Recommend improvements to the design of computing devices, based on an analysis of how users interact with the devices.",
    "2-CS-02": "Design projects that combine hardware and software components to collect and exchange data.",
    "2-CS-03": "Systematically identify and fix problems with computing devices and their components.",
    "2-NI-04": "Model the role of protocols in transmitting data across networks and the Internet.",
    "2-NI-05": "Explain how physical and digital security measures protect electronic information.",
    "2-NI-06": "Apply multiple methods of encryption to model the secure transmission of information.",
    "2-DA-07": "Represent data using multiple encoding schemes.",
    "2-DA-08": "Collect data using computational tools and transform the data to make it more useful and reliable.",
    "2-DA-09": "Refine computational models based on the data they have generated.",
    "2-AP-10": "Use flowcharts and/or pseudocode to address complex problems as algorithms.",
    "2-AP-11": "Create clearly named variables that represent different data types and perform operations on their values.",
    "2-AP-12": "Design and iteratively develop programs that combine control structures, including nested loops and compound conditionals.",
    "2-AP-13": "Decompose problems and subproblems into parts to facilitate the design, implementation, and review of programs.",
    "2-AP-14": "Create procedures with parameters to organize code and make it easier to reuse.",
    "2-AP-15": "Seek and incorporate feedback from team members and users to refine a solution that meets user needs.",
    "2-AP-16": "Incorporate existing code, media, and libraries into original programs, and give attribution.",
    "2-AP-17": "Systematically test and refine programs using a range of test cases.",
    "2-AP-18": "Distribute tasks and maintain a project timeline when collaboratively developing computational artifacts.",
    "2-AP-19": "Document programs in order to make them easier to follow, test, and debug.",
    "2-IC-20": "Compare tradeoffs associated with computing technologies that affect people's everyday activities and career options.",
    "2-IC-21": "Discuss issues of bias and accessibility in the design of existing technologies.",
    "2-IC-22": "Collaborate with many contributors through strategies such as crowdsourcing or surveys when creating a computational artifact.",
    "2-IC-23": "Describe tradeoffs between allowing information to be public and keeping information private and secure.",
    "3A-CS-01": "Explain how abstractions hide the underlying implementation details of computing systems embedded in everyday objects.",
    "3A-CS-02": "Compare levels of abstraction and interactions between application software, system software, and hardware layers.",
    "3A-CS-03": "Develop guidelines that convey systematic troubleshooting strategies that others can use to identify and fix errors.",
    "3A-NI-04": "Evaluate the scalability and reliability of networks, by describing the relationship between routers, switches, servers, topology, and addressing.",
    "3A-NI-05": "Give examples to illustrate how sensitive data can be affected by malware and other attacks.",
    "3A-NI-06": "Recommend security measures to address various scenarios based on factors such as efficiency, feasibility, and ethical impacts.",
    "3A-NI-07": "Compare various security measures, considering tradeoffs between the usability and security of a computing system.",
    "3A-NI-08": "Explain tradeoffs when selecting and implementing cybersecurity recommendations.",
    "3A-DA-09": "Translate between different bit representations of real-world phenomena, such as characters, numbers, and images.",
    "3A-DA-10": "Evaluate the tradeoffs in how data elements are organized and where data is stored.",
    "3A-DA-11": "Create interactive data visualizations using software tools to help others better understand real-world phenomena.",
    "3A-DA-12": "Create computational models that represent the relationships among different elements of data collected from a phenomenon or process.",
    "3A-AP-13": "Create prototypes that use algorithms to solve computational problems by leveraging prior student knowledge and personal interests.",
    "3A-AP-14": "Use lists to simplify solutions, generalizing computational problems instead of repeatedly using simple variables.",
    "3A-AP-15": "Justify the selection of specific control structures when tradeoffs involve implementation, readability, and program performance, and explain the benefits and drawbacks of choices made.",
    "3A-AP-16": "Design and iteratively develop computational artifacts for practical intent, personal expression, or to address a societal issue by using events to initiate instructions.",
    "3A-AP-17": "Decompose problems into smaller components through systematic analysis, using constructs such as procedures, modules, and/or objects.",
    "3A-AP-18": "Create artifacts by using procedures within a program, combinations of data and procedures, or independent but interrelated programs.",
    "3A-AP-19": "Systematically design and develop programs for broad audiences by incorporating feedback from users.",
    "3A-AP-20": "Evaluate licenses that limit or restrict use of computational artifacts when using resources such as libraries.",
    "3A-AP-21": "Evaluate and refine computational artifacts to make them more usable and accessible.",
    "3A-AP-22": "Design and develop computational artifacts working in team roles using collaborative tools.",
    "3A-AP-23": "Document design decisions using text, graphics, presentations, and/or demonstrations in the development of complex programs.",
    "3A-IC-24": "Evaluate the ways computing impacts personal, ethical, social, economic, and cultural practices.",
    "3A-IC-25": "Test and refine computational artifacts to reduce bias and equity deficits.",
    "3A-IC-26": "Demonstrate ways a given algorithm applies to problems across disciplines.",
    "3A-IC-27": "Use tools and methods for collaboration on a project to increase connectivity of people in different cultures and career fields.",
    "3A-IC-28": "Explain the beneficial and harmful effects that intellectual property laws can have on innovation.",
    "3A-IC-29": "Explain the privacy concerns related to the collection and generation of data through automated processes that may not be evident to users.",
    "3A-IC-30": "Evaluate the social and economic implications of privacy in the context of safety, law, or ethics.",
    "3B-CS-01": "Categorize the roles of operating system software.",
    "3B-CS-02": "Illustrate ways computing systems implement logic, input, and output through hardware components.",
    "3B-NI-03": "Describe the issues that impact network functionality (e.g., bandwidth, load, delay, topology).",
    "3B-NI-04": "Compare ways software developers protect devices and information from unauthorized access.",
    "3B-DA-05": "Use data analysis tools and techniques to identify patterns in data representing complex systems.",
    "3B-DA-06": "Select data collection tools and techniques to generate data sets that support a claim or communicate information.",
    "3B-DA-07": "Evaluate the ability of models and simulations to test and support the refinement of hypotheses.",
    "3B-AP-08": "Describe how artificial intelligence drives many software and physical systems.",
    "3B-AP-09": "Implement an artificial intelligence algorithm to play a game against a human opponent or solve a problem.",
    "3B-AP-10": "Use and adapt classic algorithms to solve computational problems.",
    "3B-AP-11": "Evaluate algorithms in terms of their efficiency, correctness, and clarity.",
    "3B-AP-12": "Compare and contrast fundamental data structures and their uses.",
    "3B-AP-13": "Illustrate the flow of execution of a recursive algorithm.",
    "3B-AP-14": "Construct solutions to problems using student-created components, such as procedures, modules and/or objects.",
    "3B-AP-15": "Analyze a large-scale computational problem and identify generalizable patterns that can be applied to a solution.",
    "3B-AP-16": "Demonstrate code reuse by creating programming solutions using libraries and APIs.",
    "3B-AP-17": "Plan and develop programs for broad audiences using a software life cycle process.",
    "3B-AP-18": "Explain security issues that might lead to compromised computer programs.",
    "3B-AP-19": "Develop programs for multiple computing platforms.",
    "3B-AP-20": "Use version control systems, integrated development environments (IDEs), and collaborative tools and practices (code documentation) in a group software project.",
    "3B-AP-21": "Develop and use a series of test cases to verify that a program performs according to its design specifications.",
    "3B-AP-22": "Modify an existing program to add additional functionality and discuss intended and unintended implications (e.g., breaking other functionality).",
    "3B-AP-23": "Evaluate key qualities of a program through a process such as a code review.",
    "3B-AP-24": "Compare multiple programming languages and discuss how their features make them suitable for solving different types of problems.",
    "3B-IC-25": "Evaluate computational artifacts to maximize their beneficial effects and minimize harmful effects on society.",
    "3B-IC-26": "Evaluate the impact of equity, access, and influence on the distribution of computing resources in a global society.",
    "3B-IC-27": "Predict how computational innovations that have revolutionized aspects of our culture might evolve.",
    "3B-IC-28": "Debate laws and regulations that impact the development and use of software.",
}

In [None]:
# learning objective groups, keyed by a one-letter abbreviation of the group name
group_names = {
    'C': "Computing Systems",
    'N': "Networks and the Internet",
    'D': "Data and Analysis",
    'A': "Algorithms and Programming",
    'I': "Impacts of Computing",
}

lo_code_groups = {
    'C': [],
    'N': [],
    'D': [],
    'A': [],
    'I': [],
}

In [None]:
'''Schema conversion from full codes to abbreviated 4-character codes:
First character is the level: 1A -> 1, 1B -> 2, 2 -> 3, 3A -> 4, 3B -> 5
Second character is the first letter after the first dash (the group)
Last two characters are the number at the end
'''
lo_list = dict()

level_conversions = {
    "1A": "1",
    "1B": "2",
    "2": "3",
    "3A": "4",
    "3B": "5",
}

# Convert each full code (e.g. "1A-CS-01") to its abbreviated 4-character code (e.g. "1C01") and add it to the matching group in lo_code_groups
for k, v in raw_objectives.items():
    code_parts = k.split("-")
    level = code_parts[0]
    lo_group = code_parts[1][0]
    q_num = code_parts[2]

    abbrev_code = level_conversions[level] + lo_group + q_num
    lo_list[abbrev_code] = v
    lo_code_groups[lo_group].append(abbrev_code)
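
The conversion is easiest to follow on a few concrete codes. The sketch below wraps the same three steps (split on the dash, map the level, keep the first letter of the group) in a function. The name `abbreviate_code` is hypothetical, and the level table is copied here only so the block runs on its own.

```python
# Same mapping as level_conversions above, repeated so this block is self-contained.
LEVELS = {"1A": "1", "1B": "2", "2": "3", "3A": "4", "3B": "5"}


def abbreviate_code(full_code):
    """Turn a full code such as '1A-CS-01' into its 4-character form, e.g. '1C01'."""
    level, group, number = full_code.split("-")
    return LEVELS[level] + group[0] + number


for full in ("1A-CS-01", "2-AP-10", "3B-NI-03"):
    print(full, "->", abbreviate_code(full))
# 1A-CS-01 -> 1C01
# 2-AP-10 -> 3A10
# 3B-NI-03 -> 5N03
```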


In [None]:
# list of learning standards (objectives) with abbreviated 4-character codes
lo_list = {
    '1C01': 'Select and operate appropriate software to perform a variety of tasks, and recognize that users have different needs and preferences for the technology they use.',
    '1C02': 'Use appropriate terminology in identifying and describing the function of common physical components of computing systems (hardware).',
    '1C03': 'Describe basic hardware and software problems using accurate terminology.',
    '1N04': 'Explain what passwords are and why we use them, and use strong passwords to protect devices and information from unauthorized access.',
    '1D05': 'Store, copy, search, retrieve, modify, and delete information using a computing device and define the information stored as data.',
    '1D06': 'Collect and present the same data in various visual formats.',
    '1D07': 'Identify and describe patterns in data visualizations, such as charts or graphs, to make predictions.',
    '1A08': 'Model daily processes by creating and following algorithms (sets of step-by-step instructions) to complete tasks.',
    '1A09': 'Model the way programs store and manipulate data by using numbers or other symbols to represent information.',
    '1A10': 'Develop programs with sequences and simple loops, to express ideas or address a problem.',
    '1A11': 'Decompose (break down) the steps needed to solve a problem into a precise sequence of instructions.',
    '1A12': 'Develop plans that describe a program’s sequence of events, goals, and expected outcomes.',
    '1A13': 'Give attribution when using the ideas and creations of others while developing programs.',
    '1A14': 'Debug (identify and fix) errors in an algorithm or program that includes sequences and simple loops.',
    '1A15': 'Using correct terminology, describe steps taken and choices made during the iterative process of program development.',
    '1I16': 'Compare how people live and work before and after the implementation or adoption of new computing technology.',
    '1I17': 'Work respectfully and responsibly with others online.',
    '1I18': 'Keep login information private, and log off of devices appropriately.',
    '2C01': 'Describe how internal and external parts of computing devices function to form a system.',
    '2C02': 'Model how computer hardware and software work together as a system to accomplish tasks.',
    '2C03': 'Determine potential solutions to solve simple hardware and software problems using common troubleshooting strategies.',
    '2N04': 'Model how information is broken down into smaller pieces, transmitted as packets through multiple devices over networks and the Internet, and reassembled at the destination.',
    '2N05': 'Discuss real-world cybersecurity problems and how personal information can be protected.',
    '2D06': 'Organize and present collected data visually to highlight relationships and support a claim.',
    '2D07': 'Use data to highlight or propose cause-and-effect relationships, predict outcomes, or communicate an idea.',
    '2A08': 'Compare and refine multiple algorithms for the same task and determine which is the most appropriate.',
    '2A09': 'Create programs that use variables to store and modify data.',
    '2A10': 'Create programs that include sequences, events, loops, and conditionals.',
    '2A11': 'Decompose (break down) problems into smaller, manageable subproblems to facilitate the program development process.',
    '2A12': "Modify, remix, or incorporate portions of an existing program into one's own work, to develop something new or add more advanced features.",
    '2A13': "Use an iterative process to plan the development of a program by including others' perspectives and considering user preferences.",
    '2A14': 'Observe intellectual property rights and give appropriate attribution when creating or remixing programs.',
    '2A15': 'Test and debug (identify and fix errors) a program or algorithm to ensure it runs as intended.',
    '2A16': 'Take on varying roles, with teacher guidance, when collaborating with peers during the design, implementation, and review stages of program development.',
    '2A17': 'Describe choices made during program development using code comments, presentations, and demonstrations.',
    '2I18': 'Discuss computing technologies that have changed the world, and express how those technologies influence, and are influenced by, cultural practices.',
    '2I19': 'Brainstorm ways to improve the accessibility and usability of technology products for the diverse needs and wants of users.',
    '2I20': 'Seek diverse perspectives for the purpose of improving computational artifacts.',
    '2I21': 'Use public domain or creative commons media, and refrain from copying or using material created by others without permission.',
    '3C01': 'Recommend improvements to the design of computing devices, based on an analysis of how users interact with the devices.',
    '3C02': 'Design projects that combine hardware and software components to collect and exchange data.',
    '3C03': 'Systematically identify and fix problems with computing devices and their components.',
    '3N04': 'Model the role of protocols in transmitting data across networks and the Internet.',
    '3N05': 'Explain how physical and digital security measures protect electronic information.',
    '3N06': 'Apply multiple methods of encryption to model the secure transmission of information.',
    '3D07': 'Represent data using multiple encoding schemes.',
    '3D08': 'Collect data using computational tools and transform the data to make it more useful and reliable.',
    '3D09': 'Refine computational models based on the data they have generated.',
    '3A10': 'Use flowcharts and/or pseudocode to address complex problems as algorithms.',
    '3A11': 'Create clearly named variables that represent different data types and perform operations on their values.',
    '3A12': 'Design and iteratively develop programs that combine control structures, including nested loops and compound conditionals.',
    '3A13': 'Decompose problems and subproblems into parts to facilitate the design, implementation, and review of programs.',
    '3A14': 'Create procedures with parameters to organize code and make it easier to reuse.',
    '3A15': 'Seek and incorporate feedback from team members and users to refine a solution that meets user needs.',
    '3A16': 'Incorporate existing code, media, and libraries into original programs, and give attribution.',
    '3A17': 'Systematically test and refine programs using a range of test cases.',
    '3A18': 'Distribute tasks and maintain a project timeline when collaboratively developing computational artifacts.',
    '3A19': 'Document programs in order to make them easier to follow, test, and debug.',
    '3I20': "Compare tradeoffs associated with computing technologies that affect people's everyday activities and career options.",
    '3I21': 'Discuss issues of bias and accessibility in the design of existing technologies.',
    '3I22': 'Collaborate with many contributors through strategies such as crowdsourcing or surveys when creating a computational artifact.',
    '3I23': 'Describe tradeoffs between allowing information to be public and keeping information private and secure.',
    '4C01': 'Explain how abstractions hide the underlying implementation details of computing systems embedded in everyday objects.',
    '4C02': 'Compare levels of abstraction and interactions between application software, system software, and hardware layers.',
    '4C03': 'Develop guidelines that convey systematic troubleshooting strategies that others can use to identify and fix errors.',
    '4N04': 'Evaluate the scalability and reliability of networks, by describing the relationship between routers, switches, servers, topology, and addressing.',
    '4N05': 'Give examples to illustrate how sensitive data can be affected by malware and other attacks.',
    '4N06': 'Recommend security measures to address various scenarios based on factors such as efficiency, feasibility, and ethical impacts.',
    '4N07': 'Compare various security measures, considering tradeoffs between the usability and security of a computing system.',
    '4N08': 'Explain tradeoffs when selecting and implementing cybersecurity recommendations.',
    '4D09': 'Translate between different bit representations of real-world phenomena, such as characters, numbers, and images.',
    '4D10': 'Evaluate the tradeoffs in how data elements are organized and where data is stored.',
    '4D11': 'Create interactive data visualizations using software tools to help others better understand real-world phenomena.',
    '4D12': 'Create computational models that represent the relationships among different elements of data collected from a phenomenon or process.',
    '4A13': 'Create prototypes that use algorithms to solve computational problems by leveraging prior student knowledge and personal interests.',
    '4A14': 'Use lists to simplify solutions, generalizing computational problems instead of repeatedly using simple variables.',
    '4A15': 'Justify the selection of specific control structures when tradeoffs involve implementation, readability, and program performance, and explain the benefits and drawbacks of choices made.',
    '4A16': 'Design and iteratively develop computational artifacts for practical intent, personal expression, or to address a societal issue by using events to initiate instructions.',
    '4A17': 'Decompose problems into smaller components through systematic analysis, using constructs such as procedures, modules, and/or objects.',
    '4A18': 'Create artifacts by using procedures within a program, combinations of data and procedures, or independent but interrelated programs.',
    '4A19': 'Systematically design and develop programs for broad audiences by incorporating feedback from users.',
    '4A20': 'Evaluate licenses that limit or restrict use of computational artifacts when using resources such as libraries.',
    '4A21': 'Evaluate and refine computational artifacts to make them more usable and accessible.',
    '4A22': 'Design and develop computational artifacts working in team roles using collaborative tools.',
    '4A23': 'Document design decisions using text, graphics, presentations, and/or demonstrations in the development of complex programs.',
    '4I24': 'Evaluate the ways computing impacts personal, ethical, social, economic, and cultural practices.',
    '4I25': 'Test and refine computational artifacts to reduce bias and equity deficits.',
    '4I26': 'Demonstrate ways a given algorithm applies to problems across disciplines.',
    '4I27': 'Use tools and methods for collaboration on a project to increase connectivity of people in different cultures and career fields.',
    '4I28': 'Explain the beneficial and harmful effects that intellectual property laws can have on innovation.',
    '4I29': 'Explain the privacy concerns related to the collection and generation of data through automated processes that may not be evident to users.',
    '4I30': 'Evaluate the social and economic implications of privacy in the context of safety, law, or ethics.',
    '5C01': 'Categorize the roles of operating system software.',
    '5C02': 'Illustrate ways computing systems implement logic, input, and output through hardware components.',
    '5N03': 'Describe the issues that impact network functionality (e.g., bandwidth, load, delay, topology).',
    '5N04': 'Compare ways software developers protect devices and information from unauthorized access.',
    '5D05': 'Use data analysis tools and techniques to identify patterns in data representing complex systems.',
    '5D06': 'Select data collection tools and techniques to generate data sets that support a claim or communicate information.',
    '5D07': 'Evaluate the ability of models and simulations to test and support the refinement of hypotheses.',
    '5A08': 'Describe how artificial intelligence drives many software and physical systems.',
    '5A09': 'Implement an artificial intelligence algorithm to play a game against a human opponent or solve a problem.',
    '5A10': 'Use and adapt classic algorithms to solve computational problems.',
    '5A11': 'Evaluate algorithms in terms of their efficiency, correctness, and clarity.',
    '5A12': 'Compare and contrast fundamental data structures and their uses.',
    '5A13': 'Illustrate the flow of execution of a recursive algorithm.',
    '5A14': 'Construct solutions to problems using student-created components, such as procedures, modules and/or objects.',
    '5A15': 'Analyze a large-scale computational problem and identify generalizable patterns that can be applied to a solution.',
    '5A16': 'Demonstrate code reuse by creating programming solutions using libraries and APIs.',
    '5A17': 'Plan and develop programs for broad audiences using a software life cycle process.',
    '5A18': 'Explain security issues that might lead to compromised computer programs.',
    '5A19': 'Develop programs for multiple computing platforms.',
    '5A20': 'Use version control systems, integrated development environments (IDEs), and collaborative tools and practices (code documentation) in a group software project.',
    '5A21': 'Develop and use a series of test cases to verify that a program performs according to its design specifications.',
    '5A22': 'Modify an existing program to add additional functionality and discuss intended and unintended implications (e.g., breaking other functionality).',
    '5A23': 'Evaluate key qualities of a program through a process such as a code review.',
    '5A24': 'Compare multiple programming languages and discuss how their features make them suitable for solving different types of problems.',
    '5I25': 'Evaluate computational artifacts to maximize their beneficial effects and minimize harmful effects on society.',
    '5I26': 'Evaluate the impact of equity, access, and influence on the distribution of computing resources in a global society.',
    '5I27': 'Predict how computational innovations that have revolutionized aspects of our culture might evolve.',
    '5I28': 'Debate laws and regulations that impact the development and use of software.'
}
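
The keys of the dictionary above follow a three-part pattern (a leading character, a group letter, and a two-digit number), and later cells index into the second character to find the learning objective group. A minimal sketch of that decomposition follows; the helper `parse_mqcode` and its field names are hypothetical and not part of the notebook, and the meaning of the leading character is deliberately left uninterpreted.

```python
from typing import NamedTuple


class MQCodeParts(NamedTuple):
    prefix: str   # leading character, e.g. "5" (meaning not assumed here)
    group: str    # learning objective group letter, e.g. "A"
    number: int   # sequence number within the standard, e.g. 13


def parse_mqcode(code: str) -> MQCodeParts:
    """Split a code such as '5A13' into its three parts (hypothetical helper)."""
    if len(code) != 4 or not code[1].isalpha() or not code[2:].isdigit():
        raise ValueError(f"unexpected code format: {code!r}")
    return MQCodeParts(prefix=code[0], group=code[1], number=int(code[2:]))


print(parse_mqcode("5A13"))
```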

Create examples list.

In [None]:
# few-shot example string for topic extraction from default questions
topic_extraction_examples = """Your job is to extract the general topic of the following JSON-formatted question in three words or less without focusing on its technical aspects. The following are some examples:
Input: '''{
    'Question': '<p><span class="qname">Chris</span> is eating cold cereal for breakfast before school.   Which of the following best describes the <em>algorithm</em> <span class="qname">Chris</span> should use to make breakfast?</p>',
    'Answer1': 'Cereal, Milk, Bowl, Spoon',
    'Answer2': 'After waking up, ask parent to make cornflakes for breakfast',
    'Answer3': 'Get out bowl and spoon.  Pour cereal out of box into bowl. Pour milk into bowl. Eat with spoon until cereal and milk are gone.',
    'Answer4': 'Eat.  Put pot on stove, turn on stove, and boil for seven minutes. Turn off stove.  Puts two eggs in pot.  Put water in pot.',
    'CorrectAnswer': 3,
    'Explanation': '<p>An <em>algorithm</em> is a step-by-step series of instructions to accomplish a given task.</p>',
}'''
Output: "Making Breakfast"

Input: '''{
    'Question': '<p>&quot;\'Boolean\', that\'s a funny name.   Is it a kind of bean?&quot; asked <span class="qname">Wisdom</span> in their computer science class.</p><p>&quot;No,&quot; replied <span class="qname">Wisdom</span>\'s teacher.  &quot;A \'Boolean\' is a variable representing yes or no, true or false, on or off...&quot;</p><p>What is another representation of a <em>Boolean</em> that <span class="qname">Wisdom</span>\'s teacher could have added to the explanation?</p>',
    'Answer1': "'A', 'B', 'C', or 'D'",
    'Answer2': '0 or 1',
    'Answer3': 'The text strings &quot;true&quot; or &quot;false&quot;',
    'Answer4': 'The temperature of an object',
    'CorrectAnswer': 2,
    'Explanation': '<p><em>Boolean</em> variables are often stored as binary 0 or 1 internally within a computer program.   Boolean variables can only have two possible values.  String values which can store many possible values cannot be Boolean variables.</p>',
}'''
Output: "Teacher - student conversation"

Input: '''{
    'Question': 'Kai has the following metal image for an icon for their club to use at several school events.  Kai has a drawing program with a programming language that makes it easy to draw regular polygons in multiple colors.    How should Kai decompose or break down the generation of this image into a small number of steps?',
    'Answer1': 'Loop through each pixel in the image and set the color as appropriate',
    'Answer2': 'Draw the borders of each figure in the image and then paint each resulting shape with the appropriate color using the paint can function',
    'Answer3': 'Develop software functions that can draw polygons filled with any color',
    'Answer4': 'Draw a blue circle using the programming language in the drawing program.  Then draw a red hexagon and a green triangle using same programming language',
    'Explanation': 'Decomposition is breaking down the steps needed to solve a problem into a precise sequence of instructions.  Programming languages that support drawing have functions that can draw circles and other polygons and fill them with a color.   While the other solutions do break down the problem somewhat, each instruction in the sequence is unclear. '
}'''
Output: "School art project"
"""

In [None]:
### Create Examples List

examples = []

for i, dq in enumerate(default_questions):
    q = dq['QuestionJSON']

    # cast to str so every remaining field can be passed through strip_tags
    q['CorrectAnswer'] = str(q['CorrectAnswer'])

    # drop unneeded fields, then remove HTML markup from the remaining values
    remove_fields(q)
    for k in q.keys():
        q[k] = strip_tags(q[k])

    question_str = json.dumps(q, indent=4)

    # ask the chat model for a short topic label, guided by the few-shot string
    topic = chat_turbo([
        SystemMessage(content=topic_extraction_examples),
        HumanMessage(content=question_str)
    ]).content

    print(i, topic)

    examples.append({
        'MQCode': dq['MQCode'],
        'learning_objective': lo_list[dq['MQCode']],
        'topic': topic,
        'question_str': question_str
    })
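
The loop above removes HTML markup from every field before the question is sent for topic extraction. To illustrate what that cleanup does to a single field, here is a stand-in built on the standard library's `HTMLParser`. It is a sketch only: `strip_tags_sketch` is a hypothetical name, and the notebook's own `strip_tags` helper may behave differently.

```python
from html.parser import HTMLParser
from io import StringIO


class _TextOnly(HTMLParser):
    """Collects only the text content of an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &quot; into characters
        super().__init__(convert_charrefs=True)
        self._buf = StringIO()

    def handle_data(self, data):
        self._buf.write(data)

    def text(self) -> str:
        return self._buf.getvalue()


def strip_tags_sketch(html: str) -> str:
    parser = _TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


raw = "<p>An <em>algorithm</em> is a step-by-step series of instructions.</p>"
print(strip_tags_sketch(raw))
```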

Define interest area categories and subtopics.

In [None]:
# old interest list
'''interests = [
    'Music',
    'Social Media',
    'Fashion and style',
    'Movies and TV shows',
    'Books and literature',
    'Travel',
    'Photography',
    'Poetry',
    'Dance',
    'Environmentalism and sustainability',
    'Personal growth and self-improvement',
    'Community service and volunteering',
    'Psychology and human behavior',
    'Sports',
    'Video games',
    'Cars',
    'Fitness and exercise',
    'Comedy',
    'Space and astronomy',
    'Politics and current events',
    'Military history and strategy',
    'Social justice and activism',
    'Cooking and food',
    'Friendships and relationships',
    'Religion and spirituality',
    'Cultural identity and heritage',
    'Language learning',
    'Beauty and cosmetics',
    'Basketball',
    'Football',
    'Volleyball',
    'Tennis',
    'Swimming',
    'Running',
    'Gymnastics',
    'Lacrosse',
    'Golf',
    'Wrestling',
    'Baseball',
    'Softball',
    'Hockey',
    'Cheerleading',
    'Skateboarding',
    'Rock climbing and bouldering',
    'Surfing',
    'Ultimate Frisbee',
    'Cycling',
    'Racial justice and equality',
    'Police brutality and conduct',
    'Immigration rights and reform',
    'LGBTQ+ rights and equality',
    "Women's rights and gender equality",
    'Disability rights and accessibility',
    'Income inequality and economic justice',
    'Environmental justice and climate change',
    'Gun control and safety',
    'Voting rights and democracy',
    'Mass incarceration and criminal justice reform',
    'Domestic violence and sexual assault',
    'Educational equity and access',
    'Healthcare access and affordability',
    'Worker\'s rights and fair labor practices',
    'Fair and affordable housing',
    'Homelessness',
    'Religious freedom and tolerance',
    'Native American rights and sovereignty',
    'Animal rights',
    'Net neutrality and digital privacy',
    'Cybersecurity and internet safety',
    'Mental health',
    'Agriculture',
    'Art and design',
]
'''

In [None]:
# current interest areas
interest_areas = [
    "World History",
    "Business",
    "Sports and Athletics",
    "Science and engineering",
    "Everyday Life/School Life",
    "Social justice", 
    "Diversity and inclusion",
    "Agriculture",
    "Math",
    "Music",
    "Art and Design",
    "(Video) Gaming",
]

In [None]:
# dictionary of subtopics within each interest area
interests = {
    "World History": [
        "Ancient Civilizations",
        "Classical Empires (e.g., Roman, Greek, Persian)",
        "Middle Ages and Feudalism",
        "Renaissance and Enlightenment",
        "Colonialism and Imperialism",
        "World Wars (WWI and WWII)",
        "Cold War and the Space Race",
        "Decolonization and Independence Movements",
        "Globalization and the Modern World",
        "Revolutionary Movements and Social Change",
    ],
    "Business": [
        "Entrepreneurship and Startups",
        "Business Strategy and Planning",
        "Marketing and Advertising",
        "Finance and Accounting",
        "Supply Chain Management",
        "E-commerce and Online Business",
        "Human Resources and Talent Management",
        "Business Ethics and Corporate Social Responsibility",
        "International Business and Globalization",
        "Business Innovation and Technology",
    ],
    "Sports and Athletics": [
        "Football (Soccer)",
        "Basketball",
        "Weight lifting",
        "Tennis",
        "Running and Track Events",
        "Gymnastics",
        "Swimming and Aquatics",
        "Team Sports",
        "Extreme Sports (e.g., Surfing, Skateboarding)",
        "Sports Psychology and Performance",
    ],
    "Science and engineering": [
        "Physics",
        "Chemistry",
        "Biology",
        "Archeology",
        "Electrical Engineering",
        "Mechanical Engineering",
        "Civil Engineering",
        "Environmental Science",
        "Astronomy and Astrophysics",
        "Genetics and Biotechnology",
    ],
    "Everyday Life/School Life": [
        "Time Management and Organization",
        "Study Skills and Techniques",
        "Effective Communication",
        "Goal Setting and Motivation",
        "Mental/Physical Health and Wellness",
        "Relationships and Social Skills",
        "Personal Finance and Budgeting",
        "Technology and Digital Skills",
        "Creativity and Hobbies",
        "Stress Management and Self-Care",
        "Public Speaking and Presentation Skills",
        "Career Planning and Job Search",
        "Mindfulness and Meditation",
        "Conflict Resolution and Negotiation",
        "Self-Reflection and Personal Growth",
    ],
    "Social justice": [
        "Human Rights",
        "Racial Justice",
        "Gender Equality",
        "LGBTQ+ Rights",
        "Immigrant Rights",
        "Environmental Justice",
        "Disability Rights",
        "Criminal Justice Reform",
        "Social Inequality and Poverty",
        "Indigenous Rights and Sovereignty",
        "Income inequality and economic justice",
        "Educational equity and access",
        "Homelessness",
        "Healthcare access and affordability",
        "Religious freedom and tolerance",
        "Workers' rights and fair labor practices",
        "Fair and affordable housing",
        "Animal rights",
        "Net neutrality and digital privacy",
    ],
    "Diversity and inclusion": [
        "Cultural Competence",
        "Implicit Bias",
        "Intersectionality",
        "Ethnic and Racial Diversity",
        "Gender Diversity",
        "Religious Diversity",
        "Disability Inclusion",
        "Age Inclusivity",
        "LGBTQ+ Inclusion",
        "Inclusive Leadership",
    ],
    "Agriculture": [
        "Crop Production",
        "Livestock Farming",
        "Sustainable Agriculture",
        "Organic Farming",
        "Precision Agriculture",
        "Agribusiness and Farm Management",
        "Food Security and Hunger",
        "Crop Protection and Pest Management",
        "Soil Science and Fertility",
        "Agricultural Technology and Innovation",
    ],
    "Math": [
        "Algebra",
        "Geometry",
        "Trigonometry",
        "Probability and Statistics",
        "Functions and Graphs",
        "Exponents and Logarithms",
        "Quadratic Equations",
    ],
    "Music": [
        "Music Theory",
        "Music History",
        "Instrumental Techniques",
        "Vocal Techniques",
        "Music Composition",
        "Music Performance",
        "Music Notation",
        "Music Technology",
        "Music in Film and Media",
        "Music Therapy",
    ],
    "Art and Design": [
        "Drawing and Sketching",
        "Painting Techniques",
        "Sculpture and 3D Design",
        "Graphic Design",
        "Typography",
        "Photography and Photo Editing",
        "Fashion Design",
        "Illustration",
        "Art History",
        "Digital Art and Animation",
    ],
    "(Video) Gaming": [
        "Game Development",
        "Game Design",
        "Game Programming",
        "Game Art and Animation",
        "Game Sound Design",
        "Game Testing and Quality Assurance",
        "Game Marketing and Publishing",
        "Game Accessibility",
        "Esports and Competitive Gaming",
        "Gaming Communities and Online Platforms",
    ],
}

In [None]:
# categorize examples by learning objective group
ex_groups = {
    'C': [],
    'N': [],
    'D': [],
    'A': [],
    'I': [],
}

for ex in examples:
    ex_groups[ex['MQCode'][1]].append(ex)
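
The grouping step keys on the second character of each `MQCode`. The sketch below shows the resulting shape on three made-up records (hypothetical placeholders, not real CCSB questions). It uses `collections.defaultdict` instead of a pre-filled dictionary, which is a swapped-in choice so that an unexpected group letter creates a new bucket rather than raising a `KeyError`.

```python
from collections import defaultdict

# hypothetical placeholder records; only the MQCode matters for grouping
sample_examples = [
    {"MQCode": "5A13", "topic": "placeholder topic 1"},
    {"MQCode": "5A10", "topic": "placeholder topic 2"},
    {"MQCode": "5N03", "topic": "placeholder topic 3"},
]

groups = defaultdict(list)
for ex in sample_examples:
    group_letter = ex["MQCode"][1]  # second character names the group
    groups[group_letter].append(ex)

print({letter: len(items) for letter, items in groups.items()})
```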

Create Few-Shot Example Question Template

In [None]:
# Create custom example selector that chooses examples from the same 
# learning objective group as the target learning objective

class CustomExampleSelector(BaseExampleSelector):
    def __init__(self, examples: List[Dict[str, str]]):
        self.examples = examples

    def add_example(self, example: Dict[str, str]) -> None:
        # Add a new example to the stored list.
        self.examples.append(example)

    def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:
        # Find the code whose objective text matches the input, then sample
        # up to three examples from that code's learning objective group.
        for code, obj in lo_list.items():
            if obj == input_variables['learning_objective']:
                similar_exs = ex_groups[code[1]]
                return random.sample(similar_exs, min(len(similar_exs), 3))
        # No matching objective: return no examples rather than an implicit None.
        return []
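
The selection logic can be exercised without LangChain. The sketch below is one possible variant, not the notebook's implementation: it replaces the linear scan with a reverse index from objective text to code, which assumes objective texts are unique. `pick_examples`, `lo_subset`, and the placeholder records are hypothetical names introduced for illustration.

```python
import random
from typing import Dict, List, Optional

# two objectives copied from the list above; everything else is placeholder data
lo_subset = {
    "5A13": "Illustrate the flow of execution of a recursive algorithm.",
    "5N03": "Describe the issues that impact network functionality (e.g., bandwidth, load, delay, topology).",
}
# reverse index: objective text -> code (assumes the texts are unique)
objective_to_code = {text: code for code, text in lo_subset.items()}

groups = {
    "A": [{"MQCode": "5A13", "topic": f"placeholder {n}"} for n in range(5)],
    "N": [{"MQCode": "5N03", "topic": "placeholder"}],
}


def pick_examples(
    objective: str, k: int = 3, rng: Optional[random.Random] = None
) -> List[Dict[str, str]]:
    """Sample up to k examples from the group that the objective belongs to."""
    rng = rng or random.Random()
    code = objective_to_code.get(objective)
    if code is None:
        return []  # unknown objective: no examples
    pool = groups.get(code[1], [])
    return rng.sample(pool, min(len(pool), k))


print(len(pick_examples(lo_subset["5A13"], rng=random.Random(0))))
```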

In [None]:
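# Instruct GPT-4 Few-Shot Example Template

# The extra pair of braces around {question_str} survives the first format pass
# as "{{...}}"; FewShotPromptTemplate then formats the assembled prompt again,
# which reduces them to the JSON's own single braces.
question_template = """Learning objective: {learning_objective}\nTopic: {topic}\nQuestion:\n```json\n{{{question_str}}}\n```"""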

example_prompt = PromptTemplate(
    input_variables=["learning_objective", "topic", "question_str"],
    template=question_template,
)

example_selector = CustomExampleSelector(examples)

few_shot_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="""System Message: You are a high school computer science teacher who wants to make computer science more appealing to your students by connecting CS concepts to the diverse backgrounds of your students so that they can better see themselves in computer science. Questions from different cultures and different fields of study often have the same solution in code and teach the same underlying CS principles. Students can better learn the computer science concept of abstraction by seeing the same technical solution solve questions from various cultures/contexts. And by seeing questions from a wide variety of cultures and fields of discipline, students can better appreciate that CS is everywhere, for everyone, and that the essence of good computer science is to find the common patterns in problems from all walks of life and provide solutions for them. The learning objectives of your computer science class include (but are not limited to) abstraction, decomposition, pattern recognition, password strength, data encoding, data transmission, data storage, variables, variable types, using functions, conditionals, iteration, using lists, and understanding objects. Your goal is to teach these computer science concepts by connecting them to the experiences and interests of your students.\n\nTask: Prompted with a CS learning objective, a topic of interest to your students, and a question difficulty level (1-5), generate a question in JSON format of the specified difficulty level that applies the learning objective to a scenario related to the provided topic. Specify the content of any charts or graphics referenced in your question.""",    
    suffix="Learning objective: {learning_objective}\nTopic: {topic}\nQuestion JSON: ",
    input_variables=["learning_objective", "topic"]
)
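
The triple braces in `question_template` are easy to misread. The sketch below reproduces the effect with plain `str.format`, under the assumption that `FewShotPromptTemplate` formats each example first and then formats the assembled prompt a second time (as the LangChain releases targeted by this notebook appear to do). The sample question is made up, and the trick only works when the question text itself contains no braces.

```python
import json

# made-up question; the real examples come from the scraped CCSB files
question = {"Question": "What is an algorithm?", "CorrectAnswer": "3"}
question_str = json.dumps(question, indent=4)

# first pass: format one example; "{{" and "}}" become literal braces
example_template = "Question:\n{{{question_str}}}"
first_pass = example_template.format(question_str=question_str)
# first_pass now wraps the JSON in a second pair of braces: "{{ ... }}"

# second pass: the examples are joined with the suffix and formatted again,
# which collapses "{{ ... }}" back to the JSON's own single braces
combined = first_pass + "\n\nTopic: {topic}"
second_pass = combined.format(topic="Music Theory")

print(second_pass)
```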

In [None]:
# Not currently used
'''
def get_new_mqcode(old_mqcode):
    num = max([int(code[2:]) for code in lo_code_groups[old_mqcode[1]] if code[:2] == old_mqcode[:2]])
    new_code = old_mqcode[:2] + str(num + 1) if num >= 9 else old_mqcode[:2] + '0' + str(num + 1)
    return new_code
'''

Generate Questions.

In [None]:
# generate one question per interest area (first subtopic only),
# each paired with a randomly chosen learning objective
generated_qs = []
for area, topic_list in interests.items():
    for topic in topic_list[:1]:
        group_code, group_name = random.choice(list(group_names.items()))
        lo_code = random.choice(lo_code_groups[group_code])
        obj = lo_list[lo_code]
        print(f"Interest: {topic}\nGroup: '{group_name}'\nLearning Objective: {lo_code}. {obj}\n")
        prompt = few_shot_prompt.format(learning_objective=obj, topic=topic)
        print(prompt)
        res = llm(prompt)
        # keep only the text between the last "{" and the next "}" as the question JSON
        q_str = "{" + res.split("{")[-1].split("}")[0] + "}"
        print(q_str + "\n")
        q = {
            'lo_group': group_name,
            'learning_objective': obj,
            'topic': topic,
            'MQCode': lo_code,
            'question_str': q_str,
        }
        generated_qs.append(q)
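
The string splitting used above to pull the question out of the model response keeps whatever lies between the last `{` and the following `}`, without checking that the result is valid JSON. A more defensive alternative is sketched below; `extract_json_object` is a hypothetical helper, not part of the notebook, and the sample response is made up.

```python
import json
from typing import Optional


def extract_json_object(text: str) -> Optional[dict]:
    """Return the outermost {...} span of text parsed as a dict, or None."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


# made-up response in the fenced style the prompt's examples use
response = 'Here is the question:\n```json\n{"Question": "Q?", "Answer1": "A", "CorrectAnswer": 1}\n```'
print(extract_json_object(response))
```

Returning `None` on failure lets the calling loop skip or retry a response instead of storing a string that cannot be parsed later.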
