# ALLY Quickstart

In this notebook, we will run through some common tasks for creating data labeling agents with ALLY. In this example, we will create a data labeling agent for a text classification task: labeling text samples as either "Subjective" or "Objective" statements.

This agent will be LLM-based, so we will use [OpenAI's API](https://platform.openai.com/). You will need to generate an API key and set it as an environment variable.
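
One way to do this from within the notebook is sketched below. It assumes the variable is named `OPENAI_API_KEY`, which is the name the official OpenAI Python client reads by default; whether ALLY's own settings read the same name is an assumption, and `your-api-key` is a placeholder.

```python
import os

# Assumption: the key is read from OPENAI_API_KEY.
# "your-api-key" is a placeholder; setdefault keeps any value already exported in the shell.
os.environ.setdefault("OPENAI_API_KEY", "your-api-key")

# Fail early with a clear message if the variable ended up empty.
api_key = os.environ["OPENAI_API_KEY"]
if not api_key:
    raise RuntimeError("OPENAI_API_KEY is empty; set it before creating the agent.")
```

Exporting the variable in your shell before starting Jupyter works as well and keeps the key out of the notebook file.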


Now, let's begin. 

## Dataset Creation
First, let's use a dataset of product reviews stored in a pandas DataFrame. This will help us manage our data as we add more attributes over time, such as predictions and labels for subjectivity and objectivity.

In [1]:
import pandas as pd

df = pd.DataFrame([
  ["The mic is great.", "Subjective"],
  ["Will order from them again!", "Subjective"],
  ["Not loud enough and doesn't turn on like it should.", "Objective"],
  ["The phone doesn't seem to accept anything except CBR mp3s", "Objective"],
  ["All three broke within two months of use.", "Objective"]
], columns=["text", "ground_truth"])

df

|   | text | ground_truth |
|---|------|--------------|
| 0 | The mic is great. | Subjective |
| 1 | Will order from them again! | Subjective |
| 2 | Not loud enough and doesn't turn on like it sh... | Objective |
| 3 | The phone doesn't seem to accept anything exce... | Objective |
| 4 | All three broke within two months of use. | Objective |


We instantiate a `Dataset` that uses this pandas DataFrame as its data source. The `Dataset` object takes care of the input data schema and data streaming:

In [2]:
import sys
sys.path.append('../')

from ally.datasets.dataframe import DataFrameDataset

dataset = DataFrameDataset(df=df)
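
To make "data streaming" concrete, here is a minimal sketch of feeding a DataFrame to a consumer in fixed-size batches. `iter_batches` is a hypothetical helper written for illustration; it is not part of ALLY, and `DataFrameDataset` may stream records differently.

```python
import pandas as pd

def iter_batches(df: pd.DataFrame, batch_size: int):
    """Hypothetical helper: yield consecutive row slices of at most `batch_size` rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(df), batch_size):
        # iloc slices by position, so this works regardless of the index labels
        yield df.iloc[start:start + batch_size]

reviews = pd.DataFrame({"text": ["a", "b", "c", "d", "e"]})
batch_sizes = [len(batch) for batch in iter_batches(reviews, batch_size=2)]
print(batch_sizes)  # [2, 2, 1]
```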

## Create Agent

To create an Agent, we need to define two things:

**Skills** - An Agent's abilities are defined as _Skills_. Each agent can possess many different skills. In our case, the agent has only one labeling skill: producing a classification of Subjective or Objective for a given piece of text. To define this skill, we will leverage an LLM, passing it instructions and the set of labels we expect to receive back.

**Environment** - This is where the Agent receives the ground truth signal to improve its skill. Since we have already created a ground truth dataset, we can simply refer to the column from the DataFrame. In a real-world scenario, you may consider using a different environment where the ground truth signal is obtained asynchronously by gathering real human feedback during the agent's learning phase.

In [3]:
from rich import print

from ally.agents.base import Agent
from ally.environments.base import BasicEnvironment
from ally.runtimes.openai import OpenAIRuntime
from ally.skills.labeling.classification import ClassificationSkill
from app.core.settings import settings


agent = Agent(
    # define the agent's labeling skill, which should classify text into 2 categories
    skills=ClassificationSkill(
      name='subjectivity_detection',
      description='Understanding subjective and objective statements from text.',
      instruction_template='Classify a product review as either expressing "Subjective" or "Objective" statements.',
      input_template='Review: {text}',
    ),
    
    # basic environment extracts ground truth signal from the input records
    environment=BasicEnvironment(
      ground_truth_dataset=dataset,
      ground_truth_columns={'subjectivity_detection': 'ground_truth'}
    ),
    
    runtimes = {
      # You can specify your OPENAI API KEY here via `OpenAIRuntime(..., api_key='your-api-key')`
      'openai': OpenAIRuntime(
        verbose=True,
        api_key=settings.openai_api_key,
        gpt_model_name="gpt-3.5-turbo",
      )
    },
    default_runtime='openai',
    
    teacher_runtimes = {
      'openai-gpt3': OpenAIRuntime(
        verbose=True,
        api_key=settings.openai_api_key,
        gpt_model_name="gpt-3.5-turbo",
        ),
      'openai-gpt4': OpenAIRuntime(
          verbose=True,
          api_key=settings.openai_api_key,
          gpt_model_name="gpt-4",
        )
    },
    
    # NOTE! If you don't have access to GPT-4, replace it with "openai-gpt3"
    default_teacher_runtime='openai-gpt4'
)

print(agent)
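
To illustrate how the two templates in the skill definition might combine into a single prompt, here is a minimal sketch using plain `str.format`. `render_prompt` is a hypothetical helper; how ALLY actually assembles the text it sends to the LLM is an assumption here.

```python
instruction_template = (
    'Classify a product review as either expressing "Subjective" or "Objective" statements.'
)
input_template = "Review: {text}"

def render_prompt(record: dict) -> str:
    """Hypothetical helper: join the instruction with the filled-in input template."""
    # {text} is replaced by the value of the record's "text" field
    return instruction_template + "\n\n" + input_template.format(**record)

prompt = render_prompt({"text": "The mic is great."})
print(prompt)
```

The placeholder name in `input_template` has to match a column of the dataset, which is why the DataFrame above uses a `text` column.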

## Learning Agent

We will now let the Agent learn from the ground truth. After every action, the Agent returns its _Experience_, which stores observations such as predicted data, errors, and accuracy.

In [4]:
runtime = agent.get_runtime()
teacher_runtime = agent.get_teacher_runtime()

dataset = agent.environment.as_dataset()

In [5]:
ground_truth_signal = agent.learn(learning_iterations=3, accuracy_threshold=0.95)
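
The two arguments suggest a loop that stops either when accuracy is high enough or when the iteration budget is spent. The sketch below shows that stopping logic only; `learn_sketch`, `evaluate`, and `improve` are hypothetical names, and ALLY's actual `learn` implementation may differ.

```python
from typing import Callable, Tuple

def learn_sketch(
    evaluate: Callable[[str], float],
    improve: Callable[[str], str],
    instructions: str,
    learning_iterations: int = 3,
    accuracy_threshold: float = 0.95,
) -> Tuple[str, float]:
    """Refine `instructions` until accuracy reaches the threshold or iterations run out."""
    accuracy = evaluate(instructions)
    for _ in range(learning_iterations):
        if accuracy >= accuracy_threshold:
            break  # good enough: stop early
        instructions = improve(instructions)
        accuracy = evaluate(instructions)
    return instructions, accuracy

# Toy stand-ins: each revision appends "+", and accuracy grows with the number of revisions.
toy_evaluate = lambda text: min(1.0, 0.5 + 0.25 * text.count("+"))
toy_improve = lambda text: text + "+"

final_instructions, final_accuracy = learn_sketch(
    toy_evaluate, toy_improve, "Classify the review."
)
print(final_instructions, final_accuracy)
```

In the real agent, the role of `evaluate` would presumably be played by comparing predictions against the environment's ground truth, and `improve` by the teacher runtime rewriting the instructions.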

Let's see the final instructions:

In [7]:
print(agent.skills)

... and predictions created by the skill:

In [8]:
agent.run(dataset)


|   | text | ground_truth | subjectivity_detection | score |
|---|------|--------------|------------------------|-------|
| 0 | The mic is great. | Subjective | Objective | -0.000001 |
| 1 | Will order from them again! | Subjective | Subjective | -0.999 |
| 2 | Not loud enough and doesn't turn on like it sh... | Objective | Subjective, Objective | 0.9999, 0.9999 |
| 3 | The phone doesn't seem to accept anything exce... | Objective | Objective | 0.9999 |
| 4 | All three broke within two months of use. | Objective | Objective | -0.999 |
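
As a quick check of these predictions, accuracy can be computed by comparing the predicted column with the ground truth. This is a minimal sketch with the label values copied from the table above; it uses exact string matching, which is an assumption and may not be how ALLY scores rows that contain more than one label.

```python
import pandas as pd

# Label values copied from the prediction table above.
results = pd.DataFrame({
    "ground_truth": ["Subjective", "Subjective", "Objective", "Objective", "Objective"],
    "subjectivity_detection": [
        "Objective", "Subjective", "Subjective, Objective", "Objective", "Objective",
    ],
})

# Exact string match: the row with two predicted labels counts as a miss.
is_correct = results["subjectivity_detection"] == results["ground_truth"]
accuracy = is_correct.mean()
print(f"{accuracy:.2f}")  # 0.60
```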


## Applying learned skills to the real data

Now that our Agent has an evolved "subjectivity detection" skill, we can apply it to a real dataset without ground truth data:

In [9]:
test_df = pd.DataFrame([
    "Doesn't hold charge.",
    "Excellent bluetooth headset",
    "I love this thing!",
    "VERY DISAPPOINTED."
], columns=['text'])
test_df

|   | text |
|---|------|
| 0 | Doesn't hold charge. |
| 1 | Excellent bluetooth headset |
| 2 | I love this thing! |
| 3 | VERY DISAPPOINTED. |


In [10]:
predictions = agent.run(test_df)


In [11]:
predictions

|   | text | subjectivity_detection | score |
|---|------|------------------------|-------|
| 0 | Doesn't hold charge. | Objective | -0.999 |
| 1 | Excellent bluetooth headset | Objective | -0.003 |
| 2 | I love this thing! | Subjective | -0.999 |
| 3 | VERY DISAPPOINTED. | Subjective | -0.999 |
