## Exploring the LEDGAR dataset using Autolabel

#### Set up the API keys for the providers you want to use

In [1]:
import os

# provide your own OpenAI API key here
os.environ['OPENAI_API_KEY'] = 'sk-xxxxxxxxxxxxxxxxx'
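
If you would rather not paste a key into the notebook, one option is to read it from the environment and prompt for it only when it is missing. This is a minimal sketch: `ensure_api_key` is a hypothetical helper written for this walkthrough, not part of Autolabel.

```python
import os
from getpass import getpass


def ensure_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the environment, prompting once if it is missing.

    Hypothetical helper for illustration; it is not an Autolabel function.
    """
    key = os.environ.get(name)
    if not key:
        # getpass hides the typed value so it does not end up in the cell output
        key = getpass(f"Enter {name}: ")
        os.environ[name] = key
    return key
```

Calling `ensure_api_key()` before creating the agent leaves `OPENAI_API_KEY` set for the rest of the session.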

#### Install the autolabel library

In [2]:
!pip install 'refuel-autolabel[openai]'





#### Download the dataset

In [3]:
from autolabel import get_data

get_data('ledgar')

Downloading seed example dataset to "seed.csv"...
100% [........................................................] 139187 / 139187

Downloading test dataset to "test.csv"...
100% [......................................................] 1376364 / 1376364

This downloads two datasets:
* `test.csv`: This is the larger dataset we are trying to label using LLMs
* `seed.csv`: This is a small dataset where we already have human-provided labels
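
Before labeling, it can help to check how the human-provided labels in `seed.csv` are distributed. The sketch below uses only the standard library and assumes the label column is named `label`, as in the `dataset.label_column` setting of the config used later in this notebook; `label_counts` is a hypothetical helper, not an Autolabel function.

```python
import csv
import os
from collections import Counter


def label_counts(lines, label_column="label"):
    """Count how often each label appears.

    `lines` is any iterable of CSV lines (for example an open file).
    The column name "label" is an assumption taken from the config.
    """
    reader = csv.DictReader(lines)
    return Counter(row[label_column] for row in reader)


# Only runs if the seed file has been downloaded into the working directory
if os.path.exists("seed.csv"):
    with open("seed.csv", newline="", encoding="utf-8") as f:
        for label, count in label_counts(f).most_common(5):
            print(f"{label}: {count}")
```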

## Start the labeling process!

Labeling with Autolabel is a 3-step process:
* First, we specify a labeling configuration (see `config.json` below)
* Next, we do a dry-run on our dataset using the LLM specified in `config.json` by running `agent.plan`
* Finally, we run the labeling with `agent.run`

### First labeling run

In [4]:
import json

from autolabel import LabelingAgent

In [5]:
# load the config
with open('config_ledgar.json', 'r') as f:
    config = json.load(f)

Let's review the configuration file below. You'll notice the following useful keys:
* `task_type`: `classification` (since it's a classification task)
* `model`: `{'provider': 'openai', 'name': 'gpt-3.5-turbo'}` (use a specific OpenAI model)
* `prompt.task_guidelines`: `'You are an expert at understanding legal contracts...'` (how we describe the task to the LLM)
* `prompt.labels`: `['Agreements', 'Amendments', 'Anti-Corruption Laws', 'Applicable Laws', 'Approvals', 'Arbitration', 'Assignments', 'Assigns', ...]` (the full list of labels to choose from)
* `prompt.few_shot_num`: `4` (how many labeled examples to provide to the LLM)
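
A quick way to confirm these keys after loading the file is to pull them into a short summary. This is a sketch: `summarize_config` is a hypothetical helper, and `example_config` is an abbreviated stand-in for the real config (three labels instead of the full list), used here only so the snippet runs on its own.

```python
def summarize_config(config: dict) -> dict:
    """Collect the keys discussed above into one small dictionary."""
    prompt = config.get("prompt", {})
    model = config.get("model", {})
    return {
        "task_type": config.get("task_type"),
        "model": f"{model.get('provider')}/{model.get('name')}",
        "num_labels": len(prompt.get("labels", [])),
        "few_shot_num": prompt.get("few_shot_num"),
    }


# Abbreviated stand-in for the loaded config; the real label list is much longer
example_config = {
    "task_type": "classification",
    "model": {"provider": "openai", "name": "gpt-3.5-turbo"},
    "prompt": {
        "labels": ["Agreements", "Amendments", "Anti-Corruption Laws"],
        "few_shot_num": 4,
    },
}

print(summarize_config(example_config))
```

In the notebook you would pass the `config` dictionary loaded from `config_ledgar.json` instead of `example_config`.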

In [8]:
config

{'task_name': 'LegalProvisionsClassification',
 'task_type': 'classification',
 'dataset': {'label_column': 'label', 'delimiter': ','},
 'model': {'provider': 'openai', 'name': 'gpt-3.5-turbo'},
 'prompt': {'task_guidelines': 'You are an expert at understanding legal contracts. Your job is to correctly classify legal provisions in contracts into one of the following categories.\nCategories:{labels}\n',
  'labels': ['Agreements',
   'Amendments',
   'Anti-Corruption Laws',
   'Applicable Laws',
   'Approvals',
   'Arbitration',
   'Assignments',
   'Assigns',
   'Authority',
   'Authorizations',
   'Base Salary',
   'Benefits',
   'Binding Effects',
   'Books',
   'Brokers',
   'Capitalization',
   'Change In Control',
   'Closings',
   'Compliance With Laws',
   'Confidentiality',
   'Consent To Jurisdiction',
   'Consents',
   'Construction',
   'Cooperation',
   'Costs',
   'Counterparts',
   'Death',
   'Defined Terms',
   'Definitions',
   'Disability',
   'Disclosures',
   'Duties

In [10]:
# create an agent for labeling
agent = LabelingAgent(config=config)

In [11]:
from autolabel import AutolabelDataset
ds = AutolabelDataset("test.csv", config=config)
agent.plan(ds)

You are an expert at understanding legal contracts. Your job is to correctly classify legal provisions in contracts into one of the following categories.
Categories:Agreements
Amendments
Anti-Corruption Laws
Applicable Laws
Approvals
Arbitration
Assignments
Assigns
Authority
Authorizations
Base Salary
Benefits
Binding Effects
Books
Brokers
Capitalization
Change In Control
Closings
Compliance With Laws
Confidentiality
Consent To Jurisdiction
Consents
Construction
Cooperation
Costs
Counterparts
Death
Defined Terms
Definitions
Disability
Disclosures
Duties
Effective Dates
Effectiveness
Employment
Enforceability
Enforcements
Entire Agreements
Erisa
Existence
Expenses
Fees
Financial Statements
Forfeitures
Further Assurances
General
Governing Laws
Headings
Indemnifications
Indemnity
Insurances
Integration
Intellectual Property
Interests
Interpretations
Jurisdictions
Liens
Litigations
Miscellaneous
Modifications
No Conflicts
No Defaults
No Waivers
Non-Disparagement
Notices
Organizations
Parti

In [12]:
# now, do the actual labeling
ds = agent.run(ds, max_items=100)

2023-06-14 13:31:32 openai INFO: error_code=None error_message='That model is currently overloaded with other requests. You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID 9ff1d233ffc8b6842e6039eeb99aa39e in your message.)' error_param=None error_type=server_error message='OpenAI API error received' stream_error=False


2023-06-14 13:31:55 openai INFO: error_code=None error_message='Rate limit reached for default-gpt-3.5-turbo in organization org-etZVkYhAIYGmLcxLmarMmAPo on tokens per min. Limit: 90000 / min. Current: 88790 / min. Contact us through our help center at help.openai.com if you continue to have issues.' error_param=None error_type=tokens message='OpenAI API error received' stream_error=False
2023-06-14 13:31:57 openai INFO: error_code=None error_message='Rate limit reached for default-gpt-3.5-turbo in organization org-etZVkYhAIYGmLcxLmarMmAPo on tokens per min. Limit: 90000 / min. Current: 89075 / min. Contact us through our help center at help.openai.com if you continue to have issues.' error_param=None error_type=tokens message='OpenAI API error received' stream_error=False
2023-06-14 13:32:00 openai INFO: error_code=None error_message='Rate limit reached for default-gpt-3.5-turbo in organization org-etZVkYhAIYGmLcxLmarMmAPo on tokens per min. Limit: 90000 / min. Current: 88918 / min. C

2023-06-14 13:32:17 openai INFO: error_code=None error_message='Rate limit reached for default-gpt-3.5-turbo in organization org-etZVkYhAIYGmLcxLmarMmAPo on tokens per min. Limit: 90000 / min. Current: 89085 / min. Contact us through our help center at help.openai.com if you continue to have issues.' error_param=None error_type=tokens message='OpenAI API error received' stream_error=False
2023-06-14 13:32:19 openai INFO: error_code=None error_message='Rate limit reached for default-gpt-3.5-turbo in organization org-etZVkYhAIYGmLcxLmarMmAPo on tokens per min. Limit: 90000 / min. Current: 88803 / min. Contact us through our help center at help.openai.com if you continue to have issues.' error_param=None error_type=tokens message='OpenAI API error received' stream_error=False
2023-06-14 13:32:20 openai INFO: error_code=None error_message='Rate limit reached for default-gpt-3.5-turbo in organization org-etZVkYhAIYGmLcxLmarMmAPo on tokens per min. Limit: 90000 / min. Current: 88534 / min. C

Actual Cost: 0.2181


The first run reaches 71% accuracy on the first 100 examples. Let's see whether confidence scores can improve accuracy further by removing the less confident examples from our labeled set.

### Compute confidence scores


In [24]:
# Start computing confidence scores (using Refuel's LLMs)
os.environ['REFUEL_API_KEY'] = 'xxxxxxxxxxxxxxxxx'

In [25]:
# set `compute_confidence` -> True
config["model"]["compute_confidence"] = True

In [26]:
agent = LabelingAgent(config=config)

In [27]:
from autolabel import AutolabelDataset
ds = AutolabelDataset("test.csv", config=config)
agent.plan(ds)



You are an expert at understanding legal contracts. Your job is to correctly classify legal provisions in contracts into one of the following categories.
Categories:Agreements
Amendments
Anti-Corruption Laws
Applicable Laws
Approvals
Arbitration
Assignments
Assigns
Authority
Authorizations
Base Salary
Benefits
Binding Effects
Books
Brokers
Capitalization
Change In Control
Closings
Compliance With Laws
Confidentiality
Consent To Jurisdiction
Consents
Construction
Cooperation
Costs
Counterparts
Death
Defined Terms
Definitions
Disability
Disclosures
Duties
Effective Dates
Effectiveness
Employment
Enforceability
Enforcements
Entire Agreements
Erisa
Existence
Expenses
Fees
Financial Statements
Forfeitures
Further Assurances
General
Governing Laws
Headings
Indemnifications
Indemnity
Insurances
Integration
Intellectual Property
Interests
Interpretations
Jurisdictions
Liens
Litigations
Miscellaneous
Modifications
No Conflicts
No Defaults
No Waivers
Non-Disparagement
Notices
Organizations
Parti

In [28]:
ds = agent.run(ds, max_items=100)

2023-06-14 15:04:00 autolabel.labeler INFO: Task run already exists.


Metric: auroc: 0.5



2023-06-14 15:08:29 openai INFO: error_code=None error_message='The server had an error while processing your request. Sorry about that!' error_param=None error_type=server_error message='OpenAI API error received' stream_error=False
  "error": {
    "message": "The server had an error while processing your request. Sorry about that!",
    "type": "server_error",
    "param": null,
    "code": null
  }
}
 500 {'error': {'message': 'The server had an error while processing your request. Sorry about that!', 'type': 'server_error', 'param': None, 'code': None}} {'Date': 'Wed, 14 Jun 2023 22:08:29 GMT', 'Content-Type': 'application/json', 'Content-Length': '176', 'Connection': 'keep-alive', 'access-control-allow-origin': '*', 'openai-model': 'gpt-3.5-turbo-0301', 'openai-organization': 'refuel', 'openai-processing-ms': '331', 'openai-version': '2020-10-01', 'strict-transport-security': 'max-age=15724800; includeSubDomains', 'x-ratelimit-limit-requests': '3500', 'x-ratelimit-limit-tokens': 

Metric: auroc: 0.7217
Actual Cost: 0.01


Looking at the table above, we can see that if we set the confidence threshold at `0.755`, we can label at 80.82% accuracy with a completion rate of 73%. This means we would ignore all data points whose confidence score is below `0.755` (around 27% of all samples). In exchange, the labeled dataset we keep is of noticeably higher quality than the 71% accuracy of the unfiltered run.
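
The trade-off between accuracy and completion rate can be reproduced by hand for any threshold. The sketch below uses synthetic `(confidence, is_correct)` pairs rather than the notebook's actual results, and `accuracy_at_threshold` is a hypothetical helper, not an Autolabel function. As a sanity check on the reported numbers, 73 kept examples out of 100 with 59 of them correct would give 59 / 73 ≈ 80.82%, although the counts themselves are an inference and not shown in the output.

```python
def accuracy_at_threshold(records, threshold):
    """Return (accuracy among kept examples, completion rate).

    `records` is an iterable of (confidence, is_correct) pairs.
    Examples with confidence below `threshold` are dropped.
    If nothing is kept, both values are reported as 0.0.
    """
    records = list(records)
    kept = [correct for confidence, correct in records if confidence >= threshold]
    if not records or not kept:
        return 0.0, 0.0
    completion_rate = len(kept) / len(records)
    accuracy = sum(kept) / len(kept)
    return accuracy, completion_rate


# Synthetic data for illustration only
toy = [
    (0.95, True), (0.90, True), (0.80, False), (0.76, True),
    (0.60, False), (0.40, True), (0.30, False), (0.20, False),
]

acc, completion = accuracy_at_threshold(toy, 0.755)
print(f"accuracy={acc:.2%}, completion rate={completion:.2%}")
```

Raising the threshold usually increases accuracy on the kept examples while lowering the completion rate, so the right value depends on how much unlabeled data you can afford to set aside.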