# Analyzing course evaluations
This notebook provides sample [EDSL](https://pypi.org/project/edsl/) code for using a language model to analyze course evaluations. The analysis is designed as a survey of questions about the evaluations that we prompt an AI agent to answer, using a language model to generate the responses as a dataset.

[EDSL is an open-source library](https://github.com/expectedparrot/edsl) for simulating surveys, experiments and other research with AI agents and large language models. 
Before running the code below, please ensure that you have [installed the EDSL library](https://docs.expectedparrot.com/en/latest/installation.html) and either [activated remote inference](https://docs.expectedparrot.com/en/latest/remote_inference.html) from your [Coop account](https://docs.expectedparrot.com/en/latest/coop.html) or [stored API keys](https://docs.expectedparrot.com/en/latest/api_keys.html) for the language models that you want to use with EDSL. Please also see our [documentation page](https://docs.expectedparrot.com/) for tips and tutorials on getting started using EDSL.

## Create questions
We start by creating questions about a set of course evaluations for an agent to answer. EDSL comes with a [variety of question types](https://docs.expectedparrot.com/en/latest/questions.html) that we can choose from based on the form of the response that we want to get back from a model (multiple choice, linear scale, checkbox, free text, etc.). We can use a `{{ placeholder }}` in the question texts to parameterize them with each evaluation. This allows us to create different "scenarios" of the questions that we can administer at once.

We start by importing some question types and composing questions in the relevant templates (see [examples of all types](https://docs.expectedparrot.com/en/latest/questions.html#question-type-classes) in the docs):

In [1]:
from edsl import QuestionList, QuestionMultipleChoice

In [2]:
q_sentiment = QuestionMultipleChoice(
    question_name="sentiment",
    question_text="What is the overall sentiment of this evaluation: {{ evaluation }}",
    question_options=["Positive", "Neutral", "Negative"],
)

q_themes = QuestionList(
    question_name="themes",
    question_text="Summarize the key points of this evaluation: {{ evaluation }}",
    max_list_items=3,  # Optional
)

q_improvements = QuestionList(
    question_name="improvements",
    question_text="Identify areas for improvement based on this evaluation: {{ evaluation }}",
    max_list_items=3,
)

## Construct a survey
Next we combine our questions into a survey. This allows us to administer the questions asynchronously (by default), or according to any desired [survey logic or rules](https://docs.expectedparrot.com/en/latest/surveys.html) that we want to add, such as skip/stop rules or giving an agent "memories" of other questions in the survey. Here we create a simple asynchronous survey by passing the list of questions to a `Survey` object:

In [3]:
from edsl import Survey

survey = Survey(questions=[q_sentiment, q_themes, q_improvements])

## Select data for review
Next we identify the data to be analyzed. Here we use some mock evaluations for an Econ 101 course stored as a list of texts:

In [4]:
evaluations = [
    "I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings.",
    "This class was a struggle for me. The material felt dry and difficult to connect with real-world applications, which I think could have made it more interesting. More examples from current events would definitely have helped spark my interest.",
    "Excellent introductory course! The professor was enthusiastic and always willing to offer extra help during office hours. The interactive lectures and the practical assignments made the theory much more digestible and engaging.",
    "As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times.",
    "I enjoyed the course, especially the group projects, which were both challenging and rewarding. It was great to apply economic concepts to solve real-life problems. I did feel, however, that the feedback on assignments could be more detailed to help us understand our mistakes.",
    "The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions.",
    "This was my favorite class this semester! The mix of theory and case studies was perfect, and the exams were fair. I also really appreciated the diversity of perspectives we explored in class, especially in terms of global economic policies.",
    "I found the textbook to be overly complex for an introductory course. It often used jargon that hadn't been explained in lectures, which was confusing. Simpler reading materials or more explanatory lectures would make a big difference for newcomers to economics.",
    "The professor was knowledgeable and clearly passionate about economics, but I felt the course relied too heavily on tests rather than more creative forms of assessment. More varied assignments would make the course more accessible to students with different learning styles.",
    "This class was a solid introduction to economics, though it leaned heavily on theoretical aspects. I would have liked more opportunities to discuss the real-world implications of economic theories, which I believe would enhance understanding and retention of the material.",
]

## Add data to the questions
Next we create a `ScenarioList` containing one `Scenario` per evaluation, each holding a key/value pair that is added to the questions when we run the survey. EDSL provides [methods for generating scenarios from many data sources](https://docs.expectedparrot.com/en/latest/scenarios.html) (PDFs, CSVs, images, tables, dicts, etc.); here we import a list and match the key to the placeholder in our question texts:

In [5]:
from edsl import ScenarioList

scenarios = ScenarioList.from_list("evaluation", evaluations)
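
Conceptually, each scenario supplies a value for the `{{ evaluation }}` placeholder, producing one concrete prompt per evaluation. The sketch below imitates that substitution in plain Python. Note that `render` is a hypothetical helper written only for illustration; it is not part of EDSL, which handles templating internally when the survey is run.

```python
import re


def render(template: str, values: dict) -> str:
    """Fill {{ name }} placeholders in `template` from `values` (illustrative only)."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        # Leave unknown placeholders untouched rather than raising an error
        return str(values.get(key, match.group(0)))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


question_text = "What is the overall sentiment of this evaluation: {{ evaluation }}"
sample_evaluations = ["Great course!", "The pace was too fast."]  # shortened stand-ins

# One rendered prompt per scenario, mirroring one scenario per evaluation
prompts = [render(question_text, {"evaluation": text}) for text in sample_evaluations]
for prompt in prompts:
    print(prompt)
```

The key in the dictionary has to match the placeholder name exactly, which is why the scenario key above is `"evaluation"`.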

## Design AI agents
Next we design agents with relevant traits and personas for a language model to use in answering the questions. This can be useful if we want to compare responses among different audiences. We do this by passing dictionaries of `traits` to `Agent` objects. We can also choose whether to give an agent additional instructions for answering the survey (independent of individual question texts). Please see documentation for more [details and example code for creating agents to use with surveys](https://docs.expectedparrot.com/en/latest/agents.html).

Here we create a persona for the professor of the course and pass it some special instructions:

In [6]:
from edsl import Agent

persona = "You are a professor reviewing student evaluations for your recent Econ 101 course."
instruction = "Be very specific and constructive in providing feedback and suggestions."

agent = Agent(traits={"persona": persona}, instruction=instruction)

## Select language models
[EDSL works with many popular language models](https://docs.expectedparrot.com/en/latest/language_models.html) that we can use to generate responses for our survey. We can see a current list of all available models:

In [7]:
from edsl import Model

In [8]:
# Model.available() # uncomment and run to see the list

We select models to use with a survey by creating `Model` objects for them. The default model is GPT 4 Preview, meaning that EDSL will use it to run our survey if we do not specify a different model. Here we'll specify that GPT 4o should be used:

In [9]:
model = Model("gpt-4o")

## Run the survey
Next we add the scenarios and agent to the survey, and then run it with the specified model. This will generate a dataset of `Results` that we can store and begin analyzing:

In [10]:
results = survey.by(scenarios).by(agent).by(model).run(raise_validation_errors=True)
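
As a rough mental model (an assumption about how the results are organized, not a description of EDSL's implementation), the run produces one result per combination of scenario, agent and model. With ten evaluations, one agent and one model, that would give ten results, each holding the answers to all three questions. The identifiers below are hypothetical stand-ins for the objects passed to `by()`:

```python
from itertools import product

# Hypothetical stand-ins for the scenarios, agent and model used above
scenario_ids = [f"evaluation_{i}" for i in range(10)]
agent_ids = ["professor"]
model_ids = ["gpt-4o"]

# One entry per (scenario, agent, model) combination
combinations = list(product(scenario_ids, agent_ids, model_ids))
print(len(combinations))
```

Adding a second agent or model to the lists doubles the number of combinations, which is worth keeping in mind when estimating the cost of a run.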

## Analyzing results
EDSL comes with [built-in methods for analyzing results](https://docs.expectedparrot.com/en/latest/results.html) in data tables, dataframes, SQL queries and other formats. We can print a list of all the components that can be accessed. Here we will just look at the first 5:

In [11]:
results.columns[:5]

['agent.agent_instruction',
 'agent.agent_name',
 'agent.persona',
 'answer.improvements',
 'answer.sentiment']

For example, we can transform the results into a dataframe:

In [12]:
df = results.to_pandas()
df.head()

Unnamed: 0,answer.improvements,answer.sentiment,answer.themes,scenario.evaluation,agent.agent_name,agent.persona,agent.agent_instruction,model.max_tokens,model.temperature,model.frequency_penalty,...,question_options.themes_question_options,question_options.improvements_question_options,question_options.sentiment_question_options,question_type.sentiment_question_type,question_type.improvements_question_type,question_type.themes_question_type,comment.k_comment,generated_tokens.improvements_generated_tokens,generated_tokens.sentiment_generated_tokens,generated_tokens.themes_generated_tokens
0,"['Incorporate more real-world examples', 'Rela...",Negative,"['Material felt dry', 'Difficult to connect wi...",This class was a struggle for me. The material...,Agent_0,You are a professor reviewing student evaluati...,Be very specific and constructive in providing...,1000,0.5,0,...,,,"['Positive', 'Neutral', 'Negative']",multiple_choice,list,list,These suggestions are aimed at making the mate...,"[""Incorporate more real-world examples"", ""Rela...",Negative\n\nThe student describes the class as...,"[""Material felt dry"", ""Difficult to connect wi..."
1,"['Incorporate more visual aids into lectures',...",Neutral,"['Well-organized course content', 'Monotonous ...","The course content was well-organized, but the...",Agent_0,You are a professor reviewing student evaluati...,Be very specific and constructive in providing...,1000,0.5,0,...,,,"['Positive', 'Neutral', 'Negative']",multiple_choice,list,list,These suggestions directly address the feedbac...,"[""Incorporate more visual aids into lectures"",...",Neutral\n\nThe evaluation acknowledges positiv...,"[""Well-organized course content"", ""Monotonous ..."
2,"['Provide more diverse examples in lectures', ...",Positive,"['Enthusiastic professor', 'Helpful during off...",Excellent introductory course! The professor w...,Agent_0,You are a professor reviewing student evaluati...,Be very specific and constructive in providing...,1000,0.5,0,...,,,"['Positive', 'Neutral', 'Negative']",multiple_choice,list,list,"The evaluation is very positive, but to furthe...","[""Provide more diverse examples in lectures"", ...",Positive\n\nThe evaluation highlights several ...,"[""Enthusiastic professor"", ""Helpful during off..."
3,['Incorporate more real-world economic example...,Neutral,"['Appreciated analytical rigor', 'Desire for m...","As someone with a strong background in math, I...",Agent_0,You are a professor reviewing student evaluati...,Be very specific and constructive in providing...,1000,0.5,0,...,,,"['Positive', 'Neutral', 'Negative']",multiple_choice,list,list,These suggestions aim to bridge the gap betwee...,"[""Incorporate more real-world economic example...",Neutral\n\nThe student appreciates the analyti...,"[""Appreciated analytical rigor"", ""Desire for m..."
4,['Increase opportunities for student participa...,Positive,"['Perfect mix of theory and case studies', 'Fa...",This was my favorite class this semester! The ...,Agent_0,You are a professor reviewing student evaluati...,Be very specific and constructive in providing...,1000,0.5,0,...,,,"['Positive', 'Neutral', 'Negative']",multiple_choice,list,list,These suggestions are aimed at enhancing the o...,"[""Increase opportunities for student participa...",Positive\n\nThe student expresses high satisfa...,"[""Perfect mix of theory and case studies"", ""Fa..."


Here we select just the responses to the questions and display them in a table:

In [13]:
results.select("sentiment", "themes", "themes_generated_tokens", "improvements").print(format="rich")

We can do a quick tally of the sentiments:

In [14]:
results.select("sentiment").tally().print(format = "rich")

We can also use pandas methods by first converting:

In [15]:
df_sentiment = results.to_pandas()["answer.sentiment"]
df_sentiment.value_counts()

answer.sentiment
Neutral     4
Positive    4
Negative    2
Name: count, dtype: int64
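
The tally is simply a count of each distinct answer. A minimal plain-Python equivalent is sketched here, using a hand-copied list that matches the counts shown above rather than a live `Results` object:

```python
from collections import Counter

# Hand-copied sentiment labels matching the counts above (order is illustrative)
sentiments = ["Neutral"] * 4 + ["Positive"] * 4 + ["Negative"] * 2

sentiment_counts = Counter(sentiments)

# Share of evaluations falling under each sentiment
shares = {label: n / len(sentiments) for label, n in sentiment_counts.items()}

print(sentiment_counts.most_common())
print(shares)
```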

## Use responses to construct new questions
We can use the responses to our initial questions to construct more questions about the texts. For example, we can prompt a model to condense the individual lists of themes and areas for improvement into short lists, and then use the new lists to quantify the topics across the set of evaluations.

Here we take the lists of themes in each evaluation, flatten them into a (duplicative) list, and then create a new question prompting a model to condense it for us:

In [16]:
results.select("themes", "themes_generated_tokens").print(format = "rich")

In [17]:
themes = results.select("themes").to_list(flatten = True)
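
Flattening here means turning one list of themes per evaluation into a single list of themes. The sketch below shows the same operation in plain Python on a small hand-written sample; the grouping of themes is illustrative and does not reproduce the actual results.

```python
from itertools import chain

# Illustrative sample: one list of themes per evaluation
themes_per_evaluation = [
    ["Material felt dry"],
    ["Well-organized course content"],
    ["Enthusiastic professor", "Material felt dry"],
]

# Flatten into a single list; duplicates are kept
flat_themes = list(chain.from_iterable(themes_per_evaluation))

# Optional: drop exact duplicates while preserving first-seen order
unique_themes = list(dict.fromkeys(flat_themes))

print(flat_themes)
print(unique_themes)
```

Exact-match deduplication only removes identical strings; near-duplicates with different wording still need the model-based condensing step.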

Next we construct a question to condense the list into a new list:

In [18]:
q_condensed_themes = QuestionList(
    question_name="condensed_themes",
    question_text="""Combine the following list of themes extracted from the evaluations 
    into a consolidated, non-redundant list: """
    + ", ".join(themes),
    max_list_items=10,
)

Now we run the question and select the new list. Note that the agent is optional: to run this question without it, we simply do not add the agent when we run the question:

In [19]:
condensed_themes = q_condensed_themes.run().select("condensed_themes").to_list()[0]


Now we can create a question to identify all the themes in the list that appear in each evaluation (our new list becomes the list of answer options):

In [20]:
from edsl.questions import QuestionCheckBox

q_themes_list = QuestionCheckBox(
    question_name="themes_list",
    question_text="Select all of the themes that are mentioned in this evaluation: {{ evaluation }}",
    question_options=condensed_themes,
)

Here we run the question and show a table listing all the themes for each evaluation in the results:

In [21]:
themes_lists = q_themes_list.by(scenarios).by(agent).run()
themes_lists.select("evaluation", "themes_list").print(format="rich")

In [22]:
wide_evaluation_themes = themes_lists.select("evaluation", "themes_list").to_scenario_list().expand("themes_list").rename({"themes_list": "theme"})
wide_evaluation_themes.print(max_rows = 10)

evaluation,theme
"The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions.",Material felt dry and lectures were monotonous
"The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions.",Well-organized course content
"The course content was well-organized, but the lectures were somewhat monotonous and hard to follow. I would suggest incorporating more visual aids and maybe some guest lectures from industry professionals to liven up the sessions.","Need for more visual aids, guest lectures, and varied assignments"
"I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings.",Helpful and enthusiastic professor
"I found the course very engaging and informative. The professor did an excellent job breaking down complex concepts, making them accessible to those of us new to economics. However, the pace was a bit fast, and I sometimes struggled to keep up with the weekly readings.",Textbook and readings overly complex for an introductory course
"As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times.",Difficult to connect with real-world applications
"As someone with a strong background in math, I appreciated the analytical rigor of this course. However, I wish there had been more discussions that connected the theories we learned to everyday economic issues. It felt a bit isolated from practical realities at times.",More current event examples and practical discussions needed
"This class was a solid introduction to economics, though it leaned heavily on theoretical aspects. I would have liked more opportunities to discuss the real-world implications of economic theories, which I believe would enhance understanding and retention of the material.",Difficult to connect with real-world applications
"This class was a solid introduction to economics, though it leaned heavily on theoretical aspects. I would have liked more opportunities to discuss the real-world implications of economic theories, which I believe would enhance understanding and retention of the material.",More current event examples and practical discussions needed
"The professor was knowledgeable and clearly passionate about economics, but I felt the course relied too heavily on tests rather than more creative forms of assessment. More varied assignments would make the course more accessible to students with different learning styles.","Need for more visual aids, guest lectures, and varied assignments"


In [23]:
wide_evaluation_themes.tally("theme").print(format="rich")
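
The `expand` step produces one row per (evaluation, theme) pair, which is what makes the tally by theme possible. A plain-Python sketch of the same reshaping follows, with hypothetical short labels standing in for the full evaluation texts:

```python
from collections import Counter

# Hypothetical short labels stand in for the full evaluation texts
themes_by_evaluation = {
    "eval_a": [
        "Well-organized course content",
        "Need for more visual aids, guest lectures, and varied assignments",
    ],
    "eval_b": ["Helpful and enthusiastic professor"],
    "eval_c": ["Need for more visual aids, guest lectures, and varied assignments"],
}

# Expand: one (evaluation, theme) row per selected theme
rows = [
    (evaluation, theme)
    for evaluation, selected in themes_by_evaluation.items()
    for theme in selected
]

# Tally: number of evaluations mentioning each theme
theme_counts = Counter(theme for _, theme in rows)

for theme, count in theme_counts.most_common():
    print(count, theme)
```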

We can do the same thing with the areas for improvement:

In [24]:
improvements = results.select("improvements").to_list(flatten=True)
improvements

['Incorporate more real-world examples',
 'Relate material to current events',
 'Increase engagement with interactive activities',
 'Incorporate more visual aids into lectures',
 'Invite guest lecturers from industry professionals',
 'Enhance lecture delivery to make it more engaging',
 'Provide more diverse examples in lectures',
 'Improve the pacing of the course material',
 'Incorporate more real-world case studies',
 'Incorporate more real-world economic examples into lectures',
 'Facilitate discussions on the application of theories to current economic issues',
 'Include case studies or guest speakers from industry',
 'Increase opportunities for student participation',
 'Provide additional resources for complex topics',
 'Offer more real-world application projects',
 'Simplify textbook selection',
 'Align lecture content with textbook',
 'Provide supplementary explanatory materials',
 'Incorporate diverse assessment methods',
 'Include more project-based assignments',
 'Offer alte

In [25]:
q_condensed_improvements = QuestionList(
    question_name="condensed_improvements",
    question_text="""Combine the following list of areas for improvement from the evaluations 
    into a consolidated, non-redundant list: """
    + ", ".join(improvements),
    max_list_items=10,
)

In [26]:
condensed_improvements = (
    q_condensed_improvements.run().select("condensed_improvements").to_list()[0]
)
condensed_improvements

['Incorporate more real-world examples and case studies',
 'Relate material to current events',
 'Increase engagement with interactive activities and discussions',
 'Incorporate more visual aids into lectures',
 'Invite guest lecturers from industry professionals',
 'Enhance lecture delivery to make it more engaging',
 'Improve the pacing of the course material',
 'Provide additional resources and supplementary materials',
 'Include diverse assessment methods and project-based assignments',
 'Offer additional office hours and detailed feedback on assignments']

In [27]:
q_improvements_list = QuestionCheckBox(
    question_name="improvements_list",
    question_text="Select all of the improvements that are mentioned in this evaluation: {{ evaluation }}",
    question_options=condensed_improvements,
)

In [28]:
improvements_lists = q_improvements_list.by(scenarios).by(agent).run()
improvements_lists.select("evaluation", "improvements_list").print(format="rich")

In [29]:
wide_themes = (improvements_lists
               .select("evaluation", "improvements_list")
               .to_scenario_list()
               .expand("improvements_list")
               .rename({"improvements_list": "theme"})
)

In [30]:
wide_themes.tally("theme").print(format="rich")

In [31]:
improvements_summary = wide_themes.tally("theme")

In [32]:
summary_string = improvements_summary.print(format = "markdown", return_string = True)

## Summarize the review
Here we create another question prompting the agent to summarize the analysis that was done, using the results of the prior steps:

In [33]:
from edsl.questions import QuestionFreeText

q_summary = QuestionFreeText(
    question_name="summary",
    # Separate each section with a space or newline so the text does not run together
    question_text="Consider the following analyses of the evaluations and draft a paragraph summarizing them. "
    + "Evaluation counts by theme:\n"
    + wide_evaluation_themes.tally("theme").print(format="markdown", return_string=True)
    + "\nEvaluation counts by area of improvement:\n"
    + summary_string,
)

summary = q_summary.by(agent).run()
summary.select("summary").print(format="rich")
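
When a prompt is assembled from several pieces, explicit separators keep the sections from running together. A small sketch of one way to do this is shown below; `build_summary_prompt` is a hypothetical helper, and the two table strings are placeholders for the markdown tallies rather than real output.

```python
def build_summary_prompt(theme_table: str, improvement_table: str) -> str:
    """Join the instruction and both tables with blank lines between sections."""
    sections = [
        "Consider the following analyses of the evaluations and draft a paragraph summarizing them.",
        "Evaluation counts by theme:\n" + theme_table.strip(),
        "Evaluation counts by area of improvement:\n" + improvement_table.strip(),
    ]
    return "\n\n".join(sections)


# Placeholder tables; in the notebook these would be the markdown tally strings
prompt = build_summary_prompt("| theme | count |", "| improvement | count |")
print(prompt)
```

The returned string could be passed as the `question_text` of a free text question in place of the inline concatenation.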

## Other examples
Please check out the [EDSL Docs](https://docs.expectedparrot.com/en/latest/index.html) for examples of other methods and templates for use cases, and [join our Discord channel](https://discord.com/invite/mxAYkjfy9m) to ask questions and connect with other users!

## Posting to the Coop
The [Coop](https://www.expectedparrot.com/explore) is a platform for creating, storing and sharing LLM-based research.
It is fully integrated with EDSL and accessible from your workspace or Coop account page.
Learn more about [creating an account](https://www.expectedparrot.com/login) and [using the Coop](https://docs.expectedparrot.com/en/latest/coop.html).

We can post any EDSL object to the Coop by calling the `push()` method on it, including this notebook:

In [34]:
from edsl import Notebook

In [35]:
n = Notebook(path = "analyze_evaluations.ipynb")

In [36]:
n.push(description = "Example code for analyzing course evaluations", visibility = "public")

{'description': 'Example code for analyzing course evaluations',
 'object_type': 'notebook',
 'url': 'https://www.expectedparrot.com/content/62786506-21c2-45cb-9a8e-6103002d314b',
 'uuid': '62786506-21c2-45cb-9a8e-6103002d314b',
 'version': '0.1.33.dev1',
 'visibility': 'public'}

To update an object at the Coop:

In [37]:
n = Notebook(path = "analyze_evaluations.ipynb")

In [38]:
n.patch(uuid = "62786506-21c2-45cb-9a8e-6103002d314b", value = n)

{'status': 'success'}