# LLM-Based Automatic Policy Text Classifier
This notebook demonstrates a multi-step workflow for automatically annotating policy articles with an LLM (Anthropic Claude), guided by the coding scheme defined in the Appendix and Codebook.

## 0. Install Required Packages

In [None]:
# Install required packages into the active kernel's environment if not already installed
%pip install python-dotenv requests numpy

## 1. Import Required Modules

In [2]:
import os
import sys
import json
from pathlib import Path

# Add parent directory to path to import our modules
sys.path.append('..')

# Import our custom modules
from src.utils import (
    get_project_root, 
    load_coding_scheme, 
    filter_coding_scheme,
    load_raw_text,
    load_curated_annotations,
    create_extended_coding_scheme
)

from src.annotation import (
    annotate_article,
    load_few_shot_examples,
    prepare_annotation_prompt
)

from src.evaluation import (
    evaluate_article,
    batch_evaluate
)

# Load environment variables
from dotenv import load_dotenv
load_dotenv()

ANTHROPIC_API_KEY = os.getenv('ANTHROPIC_API_KEY')
assert ANTHROPIC_API_KEY, 'Please set your ANTHROPIC_API_KEY in the .env file.'

## 2. Load and Explore the Coding Scheme

In [3]:
# Load the full coding scheme
coding_scheme = load_coding_scheme()
print(f"Full coding scheme has {len(coding_scheme.get('layers', []))} layers")

# Display all layer and tagset names
for layer in coding_scheme.get('layers', []):
    print(f"\nLayer: {layer.get('layer')}")
    for tagset in layer.get('tagsets', []):
        tag_count = len(tagset.get('tags', []))
        print(f"  - Tagset: {tagset.get('tagset')} ({tag_count} tags)")

Full coding scheme has 3 layers

Layer: Policydesigncharacteristics
  - Tagset: Objective (4 tags)
  - Tagset: Reference (3 tags)
  - Tagset: Actor (8 tags)
  - Tagset: Resource (3 tags)
  - Tagset: Time (5 tags)
  - Tagset: Compliance (2 tags)
  - Tagset: Reversibility (1 tags)

Layer: Technologyandapplicationspecificity
  - Tagset: EnergySpecificity (2 tags)
  - Tagset: ApplicationSpecificity (2 tags)
  - Tagset: TechnologySpecificity (2 tags)

Layer: Instrumenttypes
  - Tagset: InstrumentType (10 tags)
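
The loop above implies a nested structure: layers contain tagsets, and tagsets contain tags. The following is a minimal sketch of a helper that flattens such a structure, assuming only the keys used in this notebook (`layers`, `layer`, `tagsets`, `tagset`, `tags`, `tag_name`). The name `iter_tags` and the toy scheme are hypothetical and are not part of `src.utils`.

```python
from typing import Dict, Iterator, Tuple


def iter_tags(scheme: Dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (layer, tagset, tag_name) triples from a nested coding scheme."""
    for layer in scheme.get("layers", []):
        for tagset in layer.get("tagsets", []):
            for tag in tagset.get("tags", []):
                yield layer.get("layer"), tagset.get("tagset"), tag.get("tag_name")


# Toy scheme for illustration only (hypothetical content, same key layout)
toy_scheme = {
    "layers": [
        {
            "layer": "Policydesigncharacteristics",
            "tagsets": [
                {
                    "tagset": "Actor",
                    "tags": [
                        {"tag_name": "Addressee_default"},
                        {"tag_name": "Authority_default"},
                    ],
                }
            ],
        },
        {
            "layer": "ToyLayer",
            "tagsets": [{"tagset": "ToyTagset", "tags": [{"tag_name": "toy_tag"}]}],
        },
    ]
}

flat_tags = list(iter_tags(toy_scheme))
for layer_name, tagset_name, tag_name in flat_tags:
    print(f"{layer_name}/{tagset_name}/{tag_name}")
```

A flat list like this can be convenient for counting tags or for checking that generated annotations only use tag names that exist in the scheme.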


## 3. Filter Coding Scheme for Specific Layers/Tagsets

In [9]:
# Filter to focus on Policydesigncharacteristics/Actor
target_layer = "Policydesigncharacteristics"
target_tagset = "Actor"

#target_layer = "Instrumenttypes"
#target_tagset = "InstrumentType"


filtered_scheme = filter_coding_scheme(
    coding_scheme, 
    layers=[target_layer],
    tagsets=[target_tagset]
)

# Display the filtered tags
for layer in filtered_scheme.get('layers', []):
    for tagset in layer.get('tagsets', []):
        print(f"Tags in {layer.get('layer')}/{tagset.get('tagset')}:")
        for tag in tagset.get('tags', []):
            print(f"  - {tag.get('tag_name')}: {tag.get('tag_description')[:100]}...")

Tags in Policydesigncharacteristics/Actor:
  - Addressee_default: Default addressee: 
The individual or entity that the rule applies to and needs to ensure its imple...
  - Addressee_monitored: Monitored addressee: 
An individual or entity monitored for the outcome of the policy, through repo...
  - Addressee_resource: Resources addressee [Addressee_resource]: The actor that receives a resource. Formerly resources_rec...
  - Addressee_sector: Sector addressee: 
Relevant sectors that are covered by the policy. Formerly scope....
  - Authority_default: Default authority: 
The individual or entity that is making the rule, ensuring its implementation, ...
  - Authority_established: Newly established authority: 
A newly established entity that is ensuring the policy’s implementati...
  - Authority_legislative: Legislative authority: 
The individual or entity that is drafting or voting on legislation....
  - Authority_monitoring: Monitoring authority: 
An individual or entity responsible for
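
For readers who want to see what such a filter could look like, here is a minimal sketch. It is an assumption about the behavior of `filter_coding_scheme`, not its actual implementation: `filter_scheme_sketch` is a hypothetical name, and the toy scheme only mirrors the key layout used in this notebook.

```python
import copy
from typing import Dict, List, Optional


def filter_scheme_sketch(
    scheme: Dict,
    layers: Optional[List[str]] = None,
    tagsets: Optional[List[str]] = None,
) -> Dict:
    """Return a copy of the scheme restricted to the given layers and tagsets.

    A value of None means "no restriction" for that level.
    """
    kept_layers = []
    for layer in scheme.get("layers", []):
        if layers is not None and layer.get("layer") not in layers:
            continue
        kept_tagsets = [
            copy.deepcopy(ts)
            for ts in layer.get("tagsets", [])
            if tagsets is None or ts.get("tagset") in tagsets
        ]
        # Drop layers that end up with no tagsets after filtering
        if kept_tagsets:
            kept_layers.append({**layer, "tagsets": kept_tagsets})
    return {**scheme, "layers": kept_layers}


# Toy scheme for illustration only (hypothetical content)
toy_scheme = {
    "layers": [
        {
            "layer": "Policydesigncharacteristics",
            "tagsets": [
                {"tagset": "Actor", "tags": [{"tag_name": "Addressee_default"}]},
                {"tagset": "Time", "tags": [{"tag_name": "toy_time_tag"}]},
            ],
        },
        {
            "layer": "Instrumenttypes",
            "tagsets": [{"tagset": "InstrumentType", "tags": [{"tag_name": "toy_tag"}]}],
        },
    ]
}

filtered_toy = filter_scheme_sketch(
    toy_scheme, layers=["Policydesigncharacteristics"], tagsets=["Actor"]
)
print([layer["layer"] for layer in filtered_toy["layers"]])
```

Copying the kept tagsets keeps the original scheme untouched, so the full scheme can still be reused for other layer/tagset combinations.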

## 4. Create Extended Coding Scheme with Examples

The extended coding scheme enriches the original scheme by adding, for each tag, real-world examples extracted from the annotated data. These examples help the LLM recognize the kind of text span each tag is meant to capture.

In [5]:
# Create the extended coding scheme
extended_scheme_path = create_extended_coding_scheme(
    output_name="Coding_Scheme_Extended",
    min_occurrences=2,  # Each example must appear at least twice
    max_examples=10     # Maximum of 10 examples per tag
)

Extended coding scheme created at /Users/johannesmuller/Documents/github/POLIANNA-AI-CC-Project/data/01_policy_info/Coding_Scheme_Extended.json
Added examples to 48 out of 42 tags
Minimum occurrences: 2, Maximum examples per tag: 10
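
To make the roles of `min_occurrences` and `max_examples` concrete, here is a minimal sketch of one way per-tag examples could be selected from annotated spans. It is an illustration under stated assumptions, not the implementation of `create_extended_coding_scheme`: the function name `collect_tag_examples`, the `(tag, text)` input format, and the choice to deduplicate and rank by frequency are all hypothetical. The example lists printed further below repeat some strings, so the project's own logic evidently differs in detail.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def collect_tag_examples(
    annotations: Iterable[Tuple[str, str]],
    min_occurrences: int = 2,
    max_examples: int = 10,
) -> Dict[str, List[str]]:
    """Pick the most frequent annotated texts for each tag.

    annotations: (tag_name, annotated_text) pairs.
    A text qualifies only if it occurs at least `min_occurrences` times
    for that tag; at most `max_examples` texts are kept per tag.
    """
    counts: Dict[str, Counter] = {}
    for tag, text in annotations:
        # Normalize case and whitespace so near-identical spans are counted together
        counts.setdefault(tag, Counter())[text.strip().lower()] += 1

    examples: Dict[str, List[str]] = {}
    for tag, counter in counts.items():
        frequent = [t for t, n in counter.most_common() if n >= min_occurrences]
        examples[tag] = frequent[:max_examples]
    return examples


# Toy input for illustration only
toy_pairs = [
    ("Addressee_default", "member states"),
    ("Addressee_default", "Member States"),
    ("Addressee_default", "commission"),
    ("Addressee_default", "commission"),
    ("Addressee_default", "commission"),
    ("Addressee_default", "union"),  # occurs once, so it is filtered out
]
toy_examples = collect_tag_examples(toy_pairs, min_occurrences=2, max_examples=10)
print(toy_examples)
```

Raising `min_occurrences` trades coverage for reliability: rare spans are more likely to be one-off annotation decisions, while frequent spans tend to be more representative of a tag.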


In [6]:
# Load the extended scheme to examine it
extended_scheme = load_coding_scheme(scheme_name="Coding_Scheme_Extended")

# Count tags with examples
tags_with_examples = 0
total_tags = 0
total_examples = 0

for layer in extended_scheme.get('layers', []):
    for tagset in layer.get('tagsets', []):
        for tag in tagset.get('tags', []):
            total_tags += 1
            examples = tag.get('tag_examples', [])
            if examples:
                tags_with_examples += 1
                total_examples += len(examples)

print(f"Extended scheme has examples for {tags_with_examples} out of {total_tags} tags")
print(f"Total examples: {total_examples}")
if tags_with_examples > 0:
    print(f"Average examples per tag with examples: {total_examples / tags_with_examples:.2f}")
else:
    print("No tags have examples yet")

# View examples for our target layer/tagset
print(f"\nExamples for {target_layer}/{target_tagset}:")
for layer in extended_scheme.get('layers', []):
    if layer.get('layer') == target_layer:
        for tagset in layer.get('tagsets', []):
            if tagset.get('tagset') == target_tagset:
                for tag in tagset.get('tags', []):
                    examples = tag.get('tag_examples', [])
                    print(f"\n  - {tag.get('tag_name')}: {len(examples)} examples")
                    for i, example in enumerate(examples):
                        print(f"    {i+1}. \"{example}\"")

Extended scheme has examples for 42 out of 42 tags
Total examples: 364
Average examples per tag with examples: 8.67

Examples for Policydesigncharacteristics/Actor:

  - Addressee_default: 10 examples
    1. "member states"
    2. "member states"
    3. "member state"
    4. "customers"
    5. "member state"
    6. "final customers"
    7. "commission"
    8. "vertically integrated undertaking"
    9. "regulatory authority"
    10. "market participants"

  - Addressee_monitored: 10 examples
    1. "member states"
    2. "member state"
    3. "member state"
    4. "member states"
    5. "transmission system operator"
    6. "union"
    7. "manufacturer"
    8. "transmission system operators"
    9. "manufacturers"
    10. "economic operators"

  - Addressee_resource: 10 examples
    1. "member state"
    2. "final customers"
    3. "member states"
    4. "developing countries"
    5. "end-users"
    6. "consumers"
    7. "public"
    8. "member state"
    9. "architects"
    10. "member

## 5. Load a Sample Article

In [7]:
# Article ID to work with
article_id = "EU_32018R1999_Title_0_Chapter_6_Section_3_Article_37"

# Load the raw text
raw_text = load_raw_text(article_id)
print(f"Article text ({len(raw_text)} characters):\n")
print(raw_text)

Article text (3411 characters):

article 37
union and national inventory systems
1.   by 1 january 2021, member states shall establish, operate and seek to continuously improve national inventory systems to estimate anthropogenic emissions by sources and removals by sinks of greenhouse gases listed in part 2 of annex v and to ensure the timeliness, transparency, accuracy, consistency, comparability and completeness of their greenhouse gas inventories.
2.   member states shall ensure that their competent inventory authorities have access to the information specified in annex xii to this regulation, make use of reporting systems established pursuant to article 20 of regulation (eu) no 517/2014 to improve the estimate of fluorinated gases in the national greenhouse gas inventories and are able to undertake the annual consistency checks referred to in points (i) and (j) of part 1 of annex v to this regulation.
3.   a union inventory system to ensure the timeliness, transparency, accuracy, 

## 6. Load Curated Annotations for the Article

In [8]:
# Load the curated annotations
curated_annotations = load_curated_annotations(article_id)

# Filter to our target layer/tagset
filtered_annotations = []
for ann in curated_annotations:
    if ann.get('layer') == target_layer and ann.get('feature') == target_tagset:
        # Create a clean version without metadata fields
        clean_ann = {k: v for k, v in ann.items() if k not in ['span_id', 'tokens']}
        filtered_annotations.append(clean_ann)

print(f"Found {len(filtered_annotations)} annotations in {target_layer}/{target_tagset}:")
for ann in filtered_annotations:
    print(f"- {ann['tag']}: '{ann['text']}'")

Found 19 annotations in Policydesigncharacteristics/Actor:
- Addressee_default: 'member states'
- Authority_monitoring: 'member states'
- Addressee_default: 'member states'
- Authority_monitoring: 'competent inventory authorities'
- Authority_monitoring: 'commission'
- Authority_monitoring: 'commission'
- Addressee_monitored: 'member states'
- Addressee_monitored: 'member states'
- Addressee_monitored: 'member states'
- Addressee_monitored: 'member state'
- Authority_monitoring: 'commission'
- Addressee_monitored: 'member state'
- Addressee_monitored: 'member state'
- Authority_default: 'commission'
- Authority_default: 'commission'
- Authority_default: 'climate change committee'
- Authority_default: 'commission'
- Authority_default: 'commission'
- Authority_default: 'commission'


## 7. Generate Few-Shot Examples

In [38]:
# Load few-shot examples
few_shot_examples = load_few_shot_examples(
    num_examples=2,
    layers=[target_layer],
    tagsets=[target_tagset],
    exclude_article_ids=[article_id]
)

print(f"Loaded {len(few_shot_examples)} few-shot examples")

# Display the first example
if few_shot_examples:
    example = few_shot_examples[0]
    print(f"\nExample Text:\n{example['text'][:200]}...")
    print(f"\nExample Annotations ({len(example['annotations'])}):\n")
    for ann in example['annotations']:
        print(f"- {ann['tag']}: '{ann['text']}'")

Loaded 1 few-shot examples

Example Text:
article 30
addressees
this directive is addressed to the member states....

Example Annotations (2):

- Addressee_default: 'addressees'
- Addressee_default: 'member states'
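
Passing `exclude_article_ids` keeps the article being annotated out of the few-shot pool, so its curated annotations cannot leak into the prompt. A minimal sketch of such a selection step follows. It assumes an in-memory pool keyed by article ID; `pick_few_shot`, the pool layout, and the seeded random sampling are hypothetical and do not describe how `load_few_shot_examples` works internally.

```python
import random
from typing import Dict, List, Sequence


def pick_few_shot(
    pool: Dict[str, dict],
    num_examples: int,
    exclude_article_ids: Sequence[str] = (),
    seed: int = 0,
) -> List[dict]:
    """Sample up to `num_examples` annotated articles, skipping excluded IDs.

    Articles without annotations are skipped because they give the model
    nothing to imitate. Fewer examples are returned if the pool is too small.
    """
    excluded = set(exclude_article_ids)
    candidates = [
        article_id
        for article_id in sorted(pool)
        if article_id not in excluded and pool[article_id]["annotations"]
    ]
    rng = random.Random(seed)  # fixed seed keeps the prompt reproducible
    chosen = rng.sample(candidates, k=min(num_examples, len(candidates)))
    return [pool[article_id] for article_id in chosen]


# Toy pool for illustration only (hypothetical IDs and content)
toy_pool = {
    "art_a": {"text": "target article", "annotations": [{"tag": "Addressee_default", "text": "member states"}]},
    "art_b": {"text": "other article", "annotations": [{"tag": "Authority_default", "text": "commission"}]},
    "art_c": {"text": "unannotated article", "annotations": []},
}
picked = pick_few_shot(toy_pool, num_examples=2, exclude_article_ids=["art_a"])
print(f"Picked {len(picked)} example(s)")
```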


## 8. Create LLM Prompt with Extended Scheme

In [50]:
# Create the annotation prompt with extended coding scheme
prompt = prepare_annotation_prompt(
    raw_text=raw_text,
    coding_scheme=extended_scheme,  # Use the extended scheme with examples
    layers=[target_layer],
    tagsets=[target_tagset],
    few_shot_examples=few_shot_examples,
    use_extended_scheme=True  # Enable special formatting for examples
)

# Display a shortened version of the prompt
print(f"Prompt length: {len(prompt)} characters")
print("Prompt preview (first 1000 characters):")
print(prompt[:1000] + "...\n[truncated]...")

# Uncomment to print the full prompt
# print(prompt)

Prompt length: 8939 characters
Prompt preview (first 1000 characters):
# Policy Text Annotation Task

You are an expert policy analyst helping to annotate policy text with a specific coding scheme.

## Annotation Focus

Focus on these layers only: Policydesigncharacteristics
Focus on these tagsets only: Actor

## Coding Scheme

The coding scheme includes descriptions and examples for each tag:

### Layer: Policydesigncharacteristics

#### Tagset: Actor

##### Tag: Addressee_default

Description: Default addressee: 
The individual or entity that the rule applies to and needs to ensure its implementation.

Examples:
- "member states"
- "member states"
- "member state"
- "customers"
- "member state"
- "final customers"
- "commission"
- "vertically integrated undertaking"
- "regulatory authority"
- "market participants"

##### Tag: Addressee_monitored

Description: Monitored addressee: 
An individual or entity monitored for the outcome of the policy, through report, review, or audit. Forme
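
The preview above shows the layout used for each tag: a heading, a description, and a bulleted list of quoted examples. As a rough illustration, a single tag section could be rendered like this. The sketch assumes the `tag_name`, `tag_description`, and `tag_examples` keys seen earlier in this notebook; `render_tag_section` is a hypothetical helper and not the code behind `prepare_annotation_prompt`.

```python
from typing import Dict


def render_tag_section(tag: Dict) -> str:
    """Render one tag as a Markdown section in the style of the prompt preview."""
    lines = [
        f"##### Tag: {tag['tag_name']}",
        "",
        f"Description: {tag['tag_description']}",
    ]
    examples = tag.get("tag_examples", [])
    if examples:
        lines += ["", "Examples:"]
        lines += [f'- "{example}"' for example in examples]
    return "\n".join(lines)


# Toy tag for illustration only
toy_tag = {
    "tag_name": "Addressee_default",
    "tag_description": "The individual or entity that the rule applies to.",
    "tag_examples": ["member states", "customers"],
}
section = render_tag_section(toy_tag)
print(section)
```

Omitting the `Examples:` block when a tag has no examples lets the same function serve both the standard and the extended scheme.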

## 9. Annotate the Article with Standard Scheme

In [40]:
# Annotate the article with the standard coding scheme
standard_annotations = annotate_article(
    article_id=article_id,
    layers=[target_layer],
    tagsets=[target_tagset],
    num_examples=2,
    save_result=True,
    use_extended_scheme=False,  # Use standard scheme
    scheme_name="Coding_Scheme"  # Explicitly use the standard scheme
)

print(f"Generated {len(standard_annotations)} annotations with standard scheme")
for ann in standard_annotations:
    print(f"- {ann['layer']}/{ann['feature']}/{ann['tag']}: '{ann['text']}'")

Generated 13 annotations with standard scheme
- Policydesigncharacteristics/Actor/Addressee_default: 'member states'
- Policydesigncharacteristics/Actor/Authority_default: 'member states'
- Policydesigncharacteristics/Actor/Addressee_monitored: 'member states'
- Policydesigncharacteristics/Actor/Addressee_default: 'competent inventory authorities'
- Policydesigncharacteristics/Actor/Authority_monitoring: 'competent inventory authorities'
- Policydesigncharacteristics/Actor/Authority_default: 'commission'
- Policydesigncharacteristics/Actor/Authority_monitoring: 'commission'
- Policydesigncharacteristics/Actor/Authority_legislative: 'commission'
- Policydesigncharacteristics/Actor/Authority_legislative: 'commission'
- Policydesigncharacteristics/Actor/Authority_legislative: 'commission'
- Policydesigncharacteristics/Actor/Authority_legislative: 'climate change committee'
- Policydesigncharacteristics/Actor/Authority_legislative: 'bodies of the unfccc or of the paris agreement'
- Policyd

## 10. Annotate the Article with Extended Scheme

In [41]:
# Annotate the article with the extended coding scheme
extended_annotations = annotate_article(
    article_id=article_id,
    layers=[target_layer],
    tagsets=[target_tagset],
    num_examples=2,
    save_result=True,
    use_extended_scheme=True,  # Use extended scheme with examples
    scheme_name="Coding_Scheme_Extended"  # Explicitly use the extended scheme
)

print(f"Generated {len(extended_annotations)} annotations with extended scheme")
for ann in extended_annotations:
    print(f"- {ann['layer']}/{ann['feature']}/{ann['tag']}: '{ann['text']}'")

Generated 10 annotations with extended scheme
- Policydesigncharacteristics/Actor/Addressee_default: 'member states'
- Policydesigncharacteristics/Actor/Authority_default: 'member states'
- Policydesigncharacteristics/Actor/Addressee_monitored: 'member states'
- Policydesigncharacteristics/Actor/Authority_monitoring: 'competent inventory authorities'
- Policydesigncharacteristics/Actor/Authority_default: 'commission'
- Policydesigncharacteristics/Actor/Authority_monitoring: 'commission'
- Policydesigncharacteristics/Actor/Authority_legislative: 'commission'
- Policydesigncharacteristics/Actor/Authority_established: 'climate change committee'
- Policydesigncharacteristics/Actor/Authority_monitoring: 'bodies of the unfccc'
- Policydesigncharacteristics/Actor/Authority_monitoring: 'bodies of the paris agreement'


## 11. Evaluate Both Annotation Methods

In [42]:
# Save both annotation sets to different files for comparison
root_dir = get_project_root()
article_dir = os.path.join(root_dir, 'data', '03b_processed_to_json', article_id)

# Save standard annotations to a separate file
standard_path = os.path.join(article_dir, 'Generated_Annotations_Standard.json')
with open(standard_path, 'w') as f:
    json.dump(standard_annotations, f, indent=2)

# Save extended annotations to a separate file
extended_path = os.path.join(article_dir, 'Generated_Annotations_Extended.json')
with open(extended_path, 'w') as f:
    json.dump(extended_annotations, f, indent=2)

# Evaluate standard annotations
standard_results = evaluate_article(
    article_id=article_id,
    generated_path=standard_path,
    layers=[target_layer],
    tagsets=[target_tagset],
    save_results=False
)

# Evaluate extended annotations
extended_results = evaluate_article(
    article_id=article_id,
    generated_path=extended_path,
    layers=[target_layer],
    tagsets=[target_tagset],
    save_results=False
)

# Compare the results
print("Comparison of Standard vs. Extended Scheme:\n")
print("Metric                  | Standard    | Extended    | Difference")
print("-----------------------|-------------|-------------|------------")
metrics = [
    ("Span F1", "summary", "span_f1"),
    ("Tag Accuracy", "tag_assignment", "tag_accuracy"),
    ("Full Match Accuracy", "tag_assignment", "full_match_accuracy"),
    ("Combined Score", "summary", "combined_score"),
]
for label, section, key in metrics:
    std = standard_results[section][key]
    ext = extended_results[section][key]
    print(f"{label:<24}| {std:.4f}      | {ext:.4f}      | {ext - std:.4f}")

Comparison of Standard vs. Extended Scheme:

Metric                  | Standard    | Extended    | Difference
-----------------------|-------------|-------------|------------
Span F1                 | 0.4375      | 0.5000      | 0.0625
Tag Accuracy            | 0.4286      | 0.5714      | 0.1429
Full Match Accuracy     | 0.4286      | 0.5714      | 0.1429
Combined Score          | 0.4330      | 0.5357      | 0.1027
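
The reported numbers are consistent with the combined score being the mean of Span F1 and Tag Accuracy, for example (0.4375 + 0.4286) / 2 ≈ 0.433 for the standard scheme. The sketch below shows one way such metrics could be computed. It rests on assumptions that may not match `src.evaluation`: spans match only on identical `(start, stop)` offsets, and a matched span counts as correctly tagged only when the generated tag set equals the curated tag set. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]


def tags_by_span(annotations: List[dict]) -> Dict[Span, Set[str]]:
    """Group tags by their (start, stop) character offsets."""
    grouped: Dict[Span, Set[str]] = defaultdict(set)
    for ann in annotations:
        grouped[(ann["start"], ann["stop"])].add(ann["tag"])
    return dict(grouped)


def score_annotations(curated: List[dict], generated: List[dict]) -> Dict[str, float]:
    """Compute exact-match span F1, tag accuracy on matched spans, and their mean."""
    cur = tags_by_span(curated)
    gen = tags_by_span(generated)
    matched = cur.keys() & gen.keys()

    precision = len(matched) / len(gen) if gen else 0.0
    recall = len(matched) / len(cur) if cur else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # A matched span is correct only if both sides assign the same set of tags
    correct = sum(1 for span in matched if cur[span] == gen[span])
    tag_accuracy = correct / len(matched) if matched else 0.0

    return {
        "span_f1": f1,
        "tag_accuracy": tag_accuracy,
        "combined_score": (f1 + tag_accuracy) / 2,
    }


# Toy data for illustration only
toy_curated = [
    {"start": 0, "stop": 5, "tag": "A"},
    {"start": 0, "stop": 5, "tag": "B"},
    {"start": 10, "stop": 15, "tag": "A"},
    {"start": 20, "stop": 25, "tag": "C"},
]
toy_generated = [
    {"start": 0, "stop": 5, "tag": "A"},
    {"start": 0, "stop": 5, "tag": "B"},
    {"start": 10, "stop": 15, "tag": "B"},
    {"start": 30, "stop": 35, "tag": "A"},
]
toy_scores = score_annotations(toy_curated, toy_generated)
print({name: round(value, 4) for name, value in toy_scores.items()})
```

Exact offset matching is strict: a generated span that is off by one character counts as a miss. An overlap-based criterion would be more lenient, at the cost of a less clear-cut definition of a match.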


## 12. Compare Against Curated Annotations

In [43]:
# Display curated, standard, and extended annotations side by side
print("Curated vs. Generated Annotations:\n")

print("Curated Annotations:")
for ann in filtered_annotations:
    print(f"- {ann['tag']}: '{ann['text']}' ({ann['start']}:{ann['stop']})")

print("\nStandard Scheme Annotations:")
for ann in standard_annotations:
    print(f"- {ann['tag']}: '{ann['text']}' ({ann['start']}:{ann['stop']})")
    
print("\nExtended Scheme Annotations:")
for ann in extended_annotations:
    print(f"- {ann['tag']}: '{ann['text']}' ({ann['start']}:{ann['stop']})")

Curated vs. Generated Annotations:

Curated Annotations:
- Addressee_default: 'member states' (74:87)
- Authority_monitoring: 'member states' (74:87)
- Addressee_default: 'member states' (431:444)
- Authority_monitoring: 'competent inventory authorities' (469:500)
- Authority_monitoring: 'commission' (1116:1126)
- Authority_monitoring: 'commission' (1547:1557)
- Addressee_monitored: 'member states' (1673:1686)
- Addressee_monitored: 'member states' (1757:1770)
- Addressee_monitored: 'member states' (1816:1829)
- Addressee_monitored: 'member state' (1988:2000)
- Authority_monitoring: 'commission' (2093:2103)
- Addressee_monitored: 'member state' (2164:2176)
- Addressee_monitored: 'member state' (2225:2237)
- Authority_default: 'commission' (2253:2263)
- Authority_default: 'commission' (2384:2394)
- Authority_default: 'climate change committee' (2412:2436)
- Authority_default: 'commission' (2780:2790)
- Authority_default: 'commission' (3028:3038)
- Authority_default: 'commission' (3301:3

## 13. Batch Evaluation (Optional)

In [None]:
# Get a list of all articles
from src.utils import get_articles_paths
articles = get_articles_paths()

# Use a small sample (first 3 articles) for demonstration
sample_article_ids = [article['id'] for article in articles[:3]]
print(f"Running batch evaluation on {len(sample_article_ids)} articles...")

# First annotate all sample articles with extended scheme (comment out if already done)
# Use a separate loop variable so the notebook-level article_id is not overwritten
for sample_id in sample_article_ids:
    print(f"Annotating article {sample_id}...")
    try:
        annotate_article(
            article_id=sample_id,
            layers=[target_layer],
            tagsets=[target_tagset],
            save_result=True,
            use_extended_scheme=True,
            scheme_name="Coding_Scheme_Extended"
        )
    except Exception as e:
        print(f"Error annotating {sample_id}: {e}")

# Run batch evaluation
batch_results = batch_evaluate(
    article_ids=sample_article_ids,
    layers=[target_layer],
    tagsets=[target_tagset],
    save_results=True
)

# Display the results
print("\nBatch Evaluation Results:")
print(f"Articles evaluated: {batch_results['metadata']['article_count']}")
print(f"Span Identification - F1: {batch_results['summary']['span_f1']:.4f}")
print(f"Tag Assignment - Full Match: {batch_results['summary']['full_tag_accuracy']:.4f}")
print(f"Combined Score: {batch_results['summary']['combined_score']:.4f}")