# Data labeling agents

This notebook shows how to use EDSL to perform complex data labeling tasks. This is accomplished with the following generalized steps: 
<br>
1. We identify data to be labeled. <br>
2. We construct the data labeling tasks as a question or series of questions about the data, e.g., <i>Select the most relevant category for the following text: {{ text }}.</i> The questions can be qualitative or quantitative, and can use typical survey question formats (multiple choice, free text, linear scale, etc.). <br>
3. We draft personas for AI agents to reference in responding to the questions, e.g., <i>You are an expert in ...</i> <br>
4. We administer the survey to the agents with the data as inputs to the questions. <br>
<br>
<img src="general_survey.png">
<br><br>

## Conducting agent-specific tasks
We can add a layer of complexity to this generalized flow by administering the survey to each agent with a specific subset of the data, such as <i>data that is relevant to the agent's persona</i>. This can be useful if our data is sorted (or sortable) in some way that is important to our task. We can also use the tools to sort the data as needed.

We can visualize this modified flow as follows:
<img src="agent_specific_survey.png">
<br><br>

## An example case: Evaluating job posts 
Using a dataset of job posts as an example, we show how to create AI agents with relevant backgrounds and prompt them to evaluate and categorize the posts in a variety of ways. This exercise consists of the following steps:
<br>
1. We use the tools to create a dataset of job categories and mock job posts.<br>
2. We construct questions that we will ask about each of the job posts and combine them into a survey. <br>
3. We create AI agents with category expertise for each of the job categories. <br>
4. We administer the survey to each agent with (only) the job posts for the relevant category. <br>
5. We show how to access the results using built-in methods for analysis. <br>

## Technical setup
Before running the code below, please see the instructions on:

* [Installing EDSL](https://docs.expectedparrot.com/en/latest/installation.html) and
* [Storing API Keys](https://docs.expectedparrot.com/en/latest/api_keys.html) for the language models that you want to use.

Our [Starter Tutorial](https://docs.expectedparrot.com/en/latest/starter_tutorial.html) provides examples of EDSL basic components.

A simpler [data labeling example notebook](https://docs.expectedparrot.com/en/latest/notebooks/data_labeling_example.html) may also be useful to you.

## Importing the tools
We start by selecting question types and survey components that we will use.
Please see the [EDSL Docs](https://docs.expectedparrot.com/en/latest/index.html) for examples of all question types and details on these basic components.

In [1]:
# ! pip install edsl

In [2]:
from edsl.questions import QuestionMultipleChoice, QuestionFreeText, QuestionLinearScale, QuestionList, QuestionNumerical
from edsl import Scenario, Survey, Agent, Model

## Identifying data for review
Next we identify a dataset for review. For purposes of demonstration, we use the tools to create a dataset of job categories and mock job posts. 

<b>Skip these steps to import your own data.</b>

In [3]:
# Example method for importing your data
# import csv
# data = []
# with open("data.csv", "r") as f: 
#     reader = csv.reader(f)
#     header = next(reader)
#     for row in reader: 
#         data.append(row)
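
As a variation on the commented-out reader above, here is a minimal sketch that loads each row as a dictionary keyed by column name. The column names `job_category` and `job_post` match the mock dataset created below. The inline CSV text is illustrative and stands in for a hypothetical `data.csv` file, and `load_rows` is a helper name introduced here, not part of EDSL.

```python
import csv
import io

# Inline text standing in for a hypothetical data.csv file
csv_text = """job_category,job_post
Graphic Design,Creative Graphic Designer
Web Development,Front-End Developer
"""

def load_rows(file_obj):
    """Read CSV rows into a list of dicts keyed by the header row."""
    return list(csv.DictReader(file_obj))

rows = load_rows(io.StringIO(csv_text))
print(rows[0]["job_category"])
```

With a real file, the same function can be called as `load_rows(open("data.csv", newline=""))`, adjusting the column names to your data.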

Our dataset will consist of a column of job categories and a column of job posts for those categories. We'll go into more detail on the methods that we use to do this in later steps.

In [4]:
# Skip this step and upload your real dataset, modifying columns as needed.

import pandas as pd 

def create_job_categories(num_categories, model):
    # Create a list of job categories
    q_job_categories = QuestionList(
        question_name = "job_categories",
        question_text = f"""{ num_categories } categories of jobs commonly posted at an 
        online labor marketplace (e.g., 'Graphic Design'). Return each category as an item of the list."""
    )
    job_categories_list = q_job_categories.by(model).run().select("job_categories").to_list()[0]
    return job_categories_list

def create_job_posts(num_posts, job_category, model):
    # Create job posts for a category
    q_job_posts = QuestionList(
        question_name = "job_posts",
        question_text = f"""Draft descriptions for { num_posts } job posts in the following 
        category of an online labor marketplace: { job_category }."""
    )
    job_posts_list = q_job_posts.by(model).run().select("job_posts").to_list()[0]
    return job_posts_list

def create_data(num_categories, num_posts, model):
    job_categories_list = create_job_categories(num_categories, model)
    rows = []

    for job_category in job_categories_list:
        # Because of how job posts are typically structured, we expect this to return a list with a
        # dict for each job post. We turn each job post dict into a string to add it to our dataset.
        job_posts_list = create_job_posts(num_posts, job_category, model)

        for job_post in job_posts_list:
            rows.append({"job_category": job_category, "job_post": str(job_post)})

    # Build the DataFrame once instead of concatenating it row by row
    return pd.DataFrame(rows, columns=["job_category", "job_post"])

In [5]:
df = create_data(num_categories=3, num_posts=3, model=Model('gpt-4-1106-preview'))
print(df)

      job_category                                           job_post
0   Graphic Design  {'title': 'Creative Graphic Designer', 'descri...
1   Graphic Design  {'title': 'Junior Graphic Designer', 'descript...
2   Graphic Design  {'title': 'Freelance Graphic Designer', 'descr...
3  Web Development  {'title': 'Front-End Developer', 'description'...
4  Web Development  {'title': 'Back-End Developer', 'description':...
5  Web Development  {'title': 'Full Stack Developer', 'description...
6  Content Writing  {'Title': 'Freelance Lifestyle Blog Writer', '...
7  Content Writing  {'Title': 'Tech Industry Content Creator', 'De...
8  Content Writing  {'Title': 'Social Media Copywriter', 'Descript...
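
Note that the generated posts above do not share a schema: the Content Writing posts use capitalized keys (`'Title'`) while the others use lowercase keys (`'title'`). If a consistent format matters for your task, a small normalization step may help. This is a minimal sketch that assumes each post is a flat dict. `normalize_post` is a hypothetical helper, not part of EDSL, and the sample posts are abbreviated placeholders.

```python
def normalize_post(post):
    """Return a copy of a job post dict with lowercased, stripped keys."""
    return {str(key).strip().lower(): value for key, value in post.items()}

# Abbreviated placeholder posts that mimic the two key styles seen above
posts = [
    {"title": "Front-End Developer", "description": "..."},
    {"Title": "Social Media Copywriter", "Description": "..."},
]

normalized = [normalize_post(p) for p in posts]
print([sorted(p.keys()) for p in normalized])
```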


## Constructing questions about the data
Next we construct questions to ask about the job posts, selecting question types based on the form of the response that we want (multiple choice, linear scale, free text, numerical, etc.--see [examples of all question types](https://docs.expectedparrot.com/en/latest/questions.html)). We design the questions with placeholders that we will use to parameterize each question with each job post and category when we run it:

In [6]:
q_skills = QuestionList(
    question_name = "skills",
    question_text = """
        Consider the following job category and job post at an online labor marketplace. 
        Job category: {{ job_category }}
        Job post: {{ job_post }}
        What are some key skills required for this job?"""
)

q_experience = QuestionMultipleChoice(
    question_name = "experience",
    question_text = """
        Consider the following job category and job post at an online labor marketplace. 
        Job category: {{ job_category }}
        Job post: {{ job_post }}
        What level of experience is required for this job?""",
    question_options = [
        "Entry-level",
        "Mid-level",
        "Senior-level"]
)

q_days = QuestionNumerical(
    question_name = "days",
    question_text = """
        Consider the following job category and job post at an online labor marketplace. 
        Job category: {{ job_category }}
        Job post: {{ job_post }}
        Estimate the number of days until this job post is fulfilled."""
)
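
To make the placeholder mechanics concrete, here is a rough illustration of how `{{ job_category }}` and `{{ job_post }}` are filled in from a dictionary of values. This is a simplified stand-in written with a regular expression, not EDSL's actual rendering code, and `render` is a hypothetical helper.

```python
import re

def render(template, values):
    """Replace each {{ name }} placeholder with values[name]; leave unknown names untouched."""
    def substitute(match):
        name = match.group(1)
        return str(values[name]) if name in values else match.group(0)
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

template = "Job category: {{ job_category }}\nJob post: {{ job_post }}"
scenario_values = {"job_category": "Graphic Design", "job_post": "Junior Graphic Designer"}

rendered = render(template, scenario_values)
print(rendered)
```

Each job post in the dataset supplies one such dictionary, so the same question text yields a different prompt for every post.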

## Combining questions into a Survey
Next we combine our questions into a survey that will be administered to the AI agents. By default, the questions will be administered asynchronously. If desired, we can also specify survey rules (skip/stop logic) and within-survey memories of prior questions and responses. See the EDSL Docs for details on methods for [applying survey rules](https://docs.expectedparrot.com/en/latest/surveys.html#applying-survey-rules).

In [7]:
from edsl import Survey

jobs_survey = Survey(questions = [q_skills, q_experience, q_days])

## Creating personas for Agents
Next we draft personas for AI agents that will answer the questions. For each job category we construct an AI agent that is an expert in the category. Agents are constructed by passing a dictionary of `traits` to an `Agent` object. We can use the `.example()` method to see an example:

In [8]:
Agent.example()

An agent can also take an optional `name` and parameterized `traits`. For example:

In [9]:
job_category = "Web design"
base_persona = "You are an experienced freelancer on online labor marketplaces."
expertise = f"You regularly perform jobs in the following category: { job_category }."

example_agent = Agent(name = "Example agent", traits = {"base_persona": base_persona, "expertise": expertise})
example_agent.print()

## Parameterizing questions with Scenarios
We will have each agent answer the survey for the set of job posts that is relevant to the agent's expertise. We do this by creating a "scenario" for each job post, which supplies the values for the question placeholders. We can call the `.example()` method again to see how a `Scenario` is constructed:

In [10]:
Scenario.example()

Here we create a scenario for each job category/job post pair in our dataset:

In [11]:
scenarios = [Scenario({"job_category": row["job_category"], "job_post": row["job_post"]}) for _, row in df.iterrows()]
scenarios[0].print()
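
Because each agent will later see only the posts for its own category, it can help to picture the scenarios grouped by category. This is a minimal sketch that uses plain dictionaries in place of `Scenario` objects. The `rows` list is illustrative, with titles taken from the dataset above.

```python
from collections import defaultdict

rows = [
    {"job_category": "Graphic Design", "job_post": "Creative Graphic Designer"},
    {"job_category": "Graphic Design", "job_post": "Junior Graphic Designer"},
    {"job_category": "Web Development", "job_post": "Front-End Developer"},
]

# Map each category to the list of scenario payloads for that category
scenarios_by_category = defaultdict(list)
for row in rows:
    payload = {"job_category": row["job_category"], "job_post": row["job_post"]}
    scenarios_by_category[row["job_category"]].append(payload)

for category, payloads in scenarios_by_category.items():
    print(category, len(payloads))
```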

## Selecting language models
EDSL lets us select individual language models to use when generating survey results. We can check a current list of available language models:

In [12]:
from edsl import Model

Model.available()

[['01-ai/Yi-34B-Chat', 'deep_infra', 0],
 ['Austism/chronos-hermes-13b-v2', 'deep_infra', 1],
 ['Gryphe/MythoMax-L2-13b', 'deep_infra', 2],
 ['HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1', 'deep_infra', 3],
 ['Phind/Phind-CodeLlama-34B-v2', 'deep_infra', 4],
 ['bigcode/starcoder2-15b', 'deep_infra', 5],
 ['claude-3-haiku-20240307', 'anthropic', 6],
 ['claude-3-opus-20240229', 'anthropic', 7],
 ['claude-3-sonnet-20240229', 'anthropic', 8],
 ['codellama/CodeLlama-34b-Instruct-hf', 'deep_infra', 9],
 ['codellama/CodeLlama-70b-Instruct-hf', 'deep_infra', 10],
 ['cognitivecomputations/dolphin-2.6-mixtral-8x7b', 'deep_infra', 11],
 ['databricks/dbrx-instruct', 'deep_infra', 12],
 ['deepinfra/airoboros-70b', 'deep_infra', 13],
 ['gemini-pro', 'google', 14],
 ['google/gemma-1.1-7b-it', 'deep_infra', 15],
 ['gpt-3.5-turbo', 'openai', 16],
 ['gpt-3.5-turbo-0125', 'openai', 17],
 ['gpt-3.5-turbo-0301', 'openai', 18],
 ['gpt-3.5-turbo-0613', 'openai', 19],
 ['gpt-3.5-turbo-1106', 'openai', 20],
 ...

We can also check the models for which we have properly stored API Keys (see [instructions](https://docs.expectedparrot.com/en/latest/api_keys.html)).

In [13]:
# Model.check_models()

And then select any that we want to use for our survey:

In [14]:
model = Model('gpt-4-1106-preview')

Note that if we do not specify a model when we run a survey, GPT-4 is used by default.

## Running the survey
We administer our survey by appending the components with the `.by()` method and then calling the `.run()` method. In the simplest case where we want a single agent or list of agents to answer all questions with the same scenarios, this takes the following form:

`results = survey.by(scenarios).by(agents).by(models).run()`
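
As far as we understand it, a job built this way produces one result for every combination of scenario, agent and model. The arithmetic can be sketched with `itertools.product`. The lists below are placeholder strings, not EDSL objects.

```python
from itertools import product

scenarios = ["post_1", "post_2", "post_3"]
agents = ["designer", "developer"]
models = ["model_a"]

# Every scenario is paired with every agent and every model
combinations = list(product(scenarios, agents, models))
print(len(combinations))  # 3 scenarios * 2 agents * 1 model = 6
```

Under that assumption, passing all agents and all scenarios in a single call would have every agent answer for every post, including posts outside its category.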

We can create a function that has each agent answer the questions only for the job posts in its own category:

In [15]:
def data_labeling(df, survey, model):
    """Run the survey once per job category, with a category-expert agent and only that category's posts."""
    results = {}
    job_categories = df["job_category"].unique()
    for job_category in job_categories:
        # print(job_category)
        
        # We create an agent with expertise in the job category
        base_persona = "You are an experienced freelancer on online labor marketplaces."
        expertise = f"You regularly perform jobs in the following category: { job_category }."
        agent = Agent(name = job_category, traits = {"base_persona":base_persona, "expertise":expertise})
        # agent.print()
    
        # We take the job posts in the job category as scenarios for the survey
        df_category = df[df["job_category"] == job_category]
        scenarios = [Scenario({"job_category": row["job_category"], "job_post": row["job_post"]}) for _, row in df_category.iterrows()]
        # print(scenarios)
        
        # We administer the survey to the agent with our selected LLM
        job_category_results = survey.by(scenarios).by(agent).by(model).run()
        # job_category_results.print()
        
        results[job_category] = job_category_results
        
    return results

In [16]:
# Pass the model explicitly so that the function does not depend on a global variable
results = data_labeling(df, jobs_survey, model)

## Accessing Results
In the previous step we created independent `Results` objects for our individual agents' survey results and stored them as a dictionary by job category for easy reference. (We also could have just created them separately, or as a list or some other convenient type.) In the next steps we show how to access results with built-in print and analytical methods.

We can identify the column names to select the fields that we want to inspect:

In [17]:
results["Graphic Design"].to_pandas().columns

Index(['agent.agent_name', 'agent.base_persona', 'agent.expertise',
       'answer.days', 'answer.days_comment', 'answer.experience',
       'answer.experience_comment', 'answer.skills', 'answer.skills_comment',
       'iteration.iteration', 'model.frequency_penalty', 'model.logprobs',
       'model.max_tokens', 'model.model', 'model.presence_penalty',
       'model.temperature', 'model.top_logprobs', 'model.top_p',
       'prompt.days_system_prompt', 'prompt.days_user_prompt',
       'prompt.experience_system_prompt', 'prompt.experience_user_prompt',
       'prompt.skills_system_prompt', 'prompt.skills_user_prompt',
       'raw_model_response.days_raw_model_response',
       'raw_model_response.experience_raw_model_response',
       'raw_model_response.skills_raw_model_response', 'scenario.job_category',
       'scenario.job_post'],
      dtype='object')
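
Each column name has the form `<prefix>.<field>`, where the prefix identifies the component that the field belongs to. This is a small sketch that groups a subset of the columns above by prefix using plain string operations. It is only meant to show the naming pattern, not an EDSL feature.

```python
from collections import defaultdict

# A subset of the column names shown above
columns = [
    "agent.agent_name", "agent.base_persona", "agent.expertise",
    "answer.days", "answer.days_comment", "answer.experience",
    "answer.experience_comment", "answer.skills", "answer.skills_comment",
    "scenario.job_category", "scenario.job_post",
]

# Split each name at the first dot and collect the fields under their prefix
fields_by_prefix = defaultdict(list)
for column in columns:
    prefix, _, field = column.partition(".")
    fields_by_prefix[prefix].append(field)

print(dict(fields_by_prefix))
```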

We can select individual fields in a variety of ways:

In [18]:
(results["Graphic Design"]
 .select("job_post", "skills", "experience", "days")
 .print()
)

scenario.job_post,answer.skills,answer.experience,answer.days
"{'title': 'Creative Graphic Designer', 'description': 'Seeking a passionate and innovative Graphic Designer to join our creative team. The ideal candidate will have a strong portfolio showcasing skills in Adobe Creative Suite, branding, and digital design. Responsibilities include creating visual concepts, developing graphics for product illustrations, logos, and websites, and selecting colors, images, text style, and layout. Minimum 3 years of professional design experience required.', 'qualifications': ['Proven graphic designing experience', 'A strong portfolio of illustrations or other graphics', 'Familiarity with design software and technologies (such as InDesign, Illustrator, Dreamweaver, Photoshop)', 'A keen eye for aesthetics and details', 'Excellent communication skills', 'Ability to work methodically and meet deadlines', 'Degree in Design, Fine Arts or related field is a plus'], 'type': 'Full-time'}","['Adobe Creative Suite', 'branding', 'digital design', 'creating visual concepts', 'developing graphics', 'product illustrations', 'logo design', 'website design', 'selecting colors', 'image selection', 'text styling', 'layout design', 'graphic design experience', 'strong portfolio', 'InDesign', 'Illustrator', 'Dreamweaver', 'Photoshop', 'aesthetics', 'attention to detail', 'excellent communication skills', 'ability to meet deadlines']",Mid-level,14
"{'title': 'Junior Graphic Designer', 'description': 'We are looking for a talented Junior Graphic Designer to create engaging and on-brand graphics for a variety of media. The candidate should have a basic understanding of design principles and be proficient in Adobe Photoshop and Illustrator. Tasks include photo editing, creating promotional materials, and assisting senior designers with larger projects. This is a perfect opportunity for recent graduates looking for industry experience.', 'qualifications': ['Basic knowledge of layouts, typography, line composition, color, and other graphic design fundamentals', 'Experience with graphic design software and tools', 'Strong analytical skills', 'Excellent eye for detail', 'Ability to absorb and apply constructive criticism', 'Relevant education or training'], 'type': 'Entry-level, Part-time'}","['Basic understanding of design principles', 'Proficiency in Adobe Photoshop', 'Proficiency in Adobe Illustrator', 'Photo editing', 'Creation of promotional materials', 'Collaboration with senior designers', 'Layouts', 'Typography', 'Line composition', 'Color theory', 'Graphic design fundamentals', 'Strong analytical skills', 'Excellent eye for detail', 'Receptiveness to constructive criticism']",Entry-level,14
"{'title': 'Freelance Graphic Designer', 'description': 'Looking for a freelance Graphic Designer to work on a project basis. Must be able to translate requirements into design, communicate with clients to ensure the final graphics and layouts are visually appealing and on-brand. Should have experience in both print and electronic media and be able to take direction from written or spoken ideas and convert them seamlessly into images, layouts, and other designs.', 'qualifications': ['Significant experience as a graphic designer or in related field', 'Demonstrable graphic design skills with a strong portfolio', 'Proficiency with required desktop publishing tools, including Photoshop, InDesign, and Illustrator', 'A strong eye for visual composition', 'Able to give and receive constructive criticism', 'Experience with computer-aided design'], 'type': 'Contract'}","['Graphic design experience', 'Strong portfolio', 'Proficiency in Photoshop', 'Proficiency in InDesign', 'Proficiency in Illustrator', 'Visual composition', 'Client communication', 'Ability to translate requirements into design', 'Print and electronic media design', 'Ability to convert ideas into visual designs', 'Constructive criticism', 'Computer-aided design']",Mid-level,14


We can apply some labels to our table for readability. Note that each question field also automatically includes a `<question>_comment` field for any commentary by the LLM on the question:

In [19]:
(results["Graphic Design"]
 .select("job_post", "experience", "experience_comment")
 .print(pretty_labels = {
     "scenario.job_post":"Job post description",
     "answer.experience":"Experience level",
     "answer.experience_comment":"Comment"})
)

Job post description,Experience level,Comment
"{'title': 'Creative Graphic Designer', 'description': 'Seeking a passionate and innovative Graphic Designer to join our creative team. The ideal candidate will have a strong portfolio showcasing skills in Adobe Creative Suite, branding, and digital design. Responsibilities include creating visual concepts, developing graphics for product illustrations, logos, and websites, and selecting colors, images, text style, and layout. Minimum 3 years of professional design experience required.', 'qualifications': ['Proven graphic designing experience', 'A strong portfolio of illustrations or other graphics', 'Familiarity with design software and technologies (such as InDesign, Illustrator, Dreamweaver, Photoshop)', 'A keen eye for aesthetics and details', 'Excellent communication skills', 'Ability to work methodically and meet deadlines', 'Degree in Design, Fine Arts or related field is a plus'], 'type': 'Full-time'}",Mid-level,"The job post specifies a minimum of 3 years of professional design experience, which typically aligns with mid-level expertise. Entry-level positions often require less than 2 years of experience, while senior-level positions usually demand significantly more than 3 years and often involve leadership responsibilities that are not mentioned in this job post."
"{'title': 'Junior Graphic Designer', 'description': 'We are looking for a talented Junior Graphic Designer to create engaging and on-brand graphics for a variety of media. The candidate should have a basic understanding of design principles and be proficient in Adobe Photoshop and Illustrator. Tasks include photo editing, creating promotional materials, and assisting senior designers with larger projects. This is a perfect opportunity for recent graduates looking for industry experience.', 'qualifications': ['Basic knowledge of layouts, typography, line composition, color, and other graphic design fundamentals', 'Experience with graphic design software and tools', 'Strong analytical skills', 'Excellent eye for detail', 'Ability to absorb and apply constructive criticism', 'Relevant education or training'], 'type': 'Entry-level, Part-time'}",Entry-level,"The job post is for a 'Junior Graphic Designer' and it is described as a perfect opportunity for recent graduates looking for industry experience. The qualifications required are basic and fundamental knowledge of design principles, which aligns with entry-level expectations."
"{'title': 'Freelance Graphic Designer', 'description': 'Looking for a freelance Graphic Designer to work on a project basis. Must be able to translate requirements into design, communicate with clients to ensure the final graphics and layouts are visually appealing and on-brand. Should have experience in both print and electronic media and be able to take direction from written or spoken ideas and convert them seamlessly into images, layouts, and other designs.', 'qualifications': ['Significant experience as a graphic designer or in related field', 'Demonstrable graphic design skills with a strong portfolio', 'Proficiency with required desktop publishing tools, including Photoshop, InDesign, and Illustrator', 'A strong eye for visual composition', 'Able to give and receive constructive criticism', 'Experience with computer-aided design'], 'type': 'Contract'}",Mid-level,"The job post requires significant experience as a graphic designer, a strong portfolio, proficiency with advanced design software, and the ability to work with both print and electronic media. These requirements suggest that the job is not for entry-level candidates but rather for those with a mid-level of experience who have established skills and a track record of professional work."
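
Once labels are extracted, simple tallies are often the first analysis step. This is a minimal sketch that counts the experience labels from the output above with `collections.Counter`. The pairs are copied by hand here. In practice they would come from the selected `Results` fields.

```python
from collections import Counter

# (job title, experience label) pairs copied from the output above
labels = [
    ("Creative Graphic Designer", "Mid-level"),
    ("Junior Graphic Designer", "Entry-level"),
    ("Freelance Graphic Designer", "Mid-level"),
]

experience_counts = Counter(level for _, level in labels)
print(experience_counts.most_common())
```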


We can also access results as a SQL table (called `self`) with the `.sql()` method, choosing between a "wide" horizontal view of all fields and a "long" vertical view, and optionally removing the column name prefixes 'agent', 'model', 'prompt', etc.:

In [20]:
results["Graphic Design"].sql("select * from self", shape="long")

Unnamed: 0,id,data_type,key,value
0,0,agent,base_persona,You are an experienced freelancer on online la...
1,0,agent,expertise,You regularly perform jobs in the following ca...
2,0,agent,agent_name,Graphic Design
3,0,scenario,job_category,Graphic Design
4,0,scenario,job_post,"{'title': 'Creative Graphic Designer', 'descri..."
...,...,...,...,...
91,2,raw_model_response,days_raw_model_response,{'id': 'chatcmpl-9K6aG92IB7RyObilBBdepJnm5aozh...
92,2,iteration,iteration,0
93,2,question_text,skills_question_text,\n Consider the following job category ...
94,2,question_text,experience_question_text,\n Consider the following job category ...
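
To relate the two shapes, here is a rough sketch of how "long" rows of `(id, data_type, key, value)` can be pivoted into one "wide" record per `id`, with column names of the form `data_type.key`. The sample rows are taken from the output above. `to_wide` is a hypothetical helper written in plain Python, not part of EDSL.

```python
# A few (id, data_type, key, value) rows taken from the long view above
long_rows = [
    (0, "agent", "agent_name", "Graphic Design"),
    (0, "scenario", "job_category", "Graphic Design"),
    (2, "iteration", "iteration", 0),
]

def to_wide(rows):
    """Pivot (id, data_type, key, value) rows into one dict per id."""
    wide = {}
    for row_id, data_type, key, value in rows:
        wide.setdefault(row_id, {})[f"{data_type}.{key}"] = value
    return wide

wide = to_wide(long_rows)
print(wide[0])
```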


## Other methods for data labeling
We are working on a set of data labeling templates for different use cases. Please check the Notebooks section of the [EDSL Docs](https://docs.expectedparrot.com/en/latest/index.html) for updates!