# Demo of the Pillar Assessment Tool

### 0. Setup

First, import the necessary modules from `scripts\climate_policy_pipelines\cp1`.

In [2]:
# Import necessary modules
import sys
import os
from pathlib import Path

# Get the absolute path of the project root directory
notebook_dir = Path(os.getcwd())  
project_root = notebook_dir.parent.parent  # The project root is two levels above the notebook directory

# Add project root to Python path
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))
    print(f"Added {project_root} to sys.path")

Added c:\Users\User\GitHub\group-6-final-project to sys.path


In [4]:
from scripts.climate_policy_pipelines.cp1.pipeline import run_cp1a_assessment
from scripts.climate_policy_pipelines.cp1.pipeline import run_cp1a_assessment_large_context
from scripts.climate_policy_pipelines.cp1.pipeline import run_cp1b_assessment
from scripts.climate_policy_pipelines.cp1.pipeline import get_chunks_for_cp1a

In [5]:
from scripts.climate_policy_pipelines.cp1.prompts import cp1a_criterion_1_prompt
from scripts.climate_policy_pipelines.cp1.prompts import cp1a_final_assessment_prompt
from scripts.climate_policy_pipelines.cp1.prompts import comprehensive_assessment_prompt

# CP1a Assessment

CP1a is defined as: **Does the country have a framework climate law or equivalent?**

Based on the ASCOR methodology, a country is assessed as ‘Yes’ if it has a framework climate law that fulfils either criteria 1, 2 and 3 together, or criterion 4 alone:

1. It sets a strategic direction for decarbonisation

2. It is enshrined in law

3. It sets out at least one of the following obligations

4. In exceptional cases, the combination of a broad environmental law and a clearly linked executive climate strategy may be sufficient to meet these criteria

Each criterion is handled by a separate LLM, which receives a criterion-specific prompt plus retrieved context relevant to that criterion.

We can see the prompt for criterion 1 here:

In [7]:
# Print the system message template
print("System Message Template:")
print(cp1a_criterion_1_prompt.messages[0].prompt.template)

print("\n" + "="*50 + "\n")

# Print the human message template  
print("Human Message Template:")
print(cp1a_criterion_1_prompt.messages[1].prompt.template)

System Message Template:
You are an expert legal analyst specializing in climate legislation. 
    Your task is to evaluate whether a climate law sets a strategic direction for decarbonisation.
    
    A law meets this criterion if it includes a clear statement to meet the goals of the Paris Agreement 
    OR a national long-term decarbonisation target.
    
    For any claims you make, you **MUST** include the page number and document citation in the format (page X, doc Y).

     
    Respond with only 'YES' or 'NO' followed by a brief explanation.


Human Message Template:
Context: {context}

Does this law set a strategic direction for decarbonisation?
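The `{context}` placeholder is filled with the retrieved chunks. Because the system message requires citations in the form (page X, doc Y), each chunk presumably has to carry its page and document identifiers into the context. The following is a minimal sketch of that idea, assuming each chunk is a dict with the hypothetical keys `text`, `page` and `doc`; the pipeline's real chunk structure is not shown in this notebook and may differ.

```python
# Hypothetical chunk structure: the keys "text", "page" and "doc" are
# assumptions for illustration, not the pipeline's actual schema.
HUMAN_TEMPLATE = (
    "Context: {context}\n\n"
    "Does this law set a strategic direction for decarbonisation?"
)


def build_context(chunks):
    """Join chunks into one string, tagging each with its page and document."""
    return "\n\n".join(
        f"(page {c['page']}, doc {c['doc']}) {c['text']}" for c in chunks
    )


# Made-up example text, used only to show the formatting
example_chunks = [
    {"text": "The Republic commits to net zero by 2050.", "page": 3, "doc": "climate_act"},
    {"text": "This Act implements the Paris Agreement.", "page": 1, "doc": "climate_act"},
]

human_message = HUMAN_TEMPLATE.format(context=build_context(example_chunks))
print(human_message)
```

Tagging every chunk this way gives the model something concrete to cite, rather than leaving it to guess page and document numbers.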


According to the ASCOR methodology, CP1a is answered as Yes if criteria 1, 2 and 3 are all satisfied, or if criterion 4 is satisfied.
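The rule itself is plain Boolean logic. In the tool it is applied by an LLM, but it can be written down as a short function for reference; the name `cp1a_verdict` is hypothetical and is not part of the pipeline.

```python
def cp1a_verdict(c1: bool, c2: bool, c3: bool, c4: bool) -> bool:
    """Return True ('Yes') if criteria 1-3 all hold, or criterion 4 holds."""
    return (c1 and c2 and c3) or c4


# Criteria 1-3 met, exceptional case not needed
print(cp1a_verdict(True, True, True, False))    # True
# Only two of the first three met, no exceptional case
print(cp1a_verdict(True, True, False, False))   # False
# Exceptional case alone is sufficient
print(cp1a_verdict(False, False, False, True))  # True
```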

We therefore use another LLM to produce the overall assessment from the individual criterion responses. Let's have a look at its prompt:

In [8]:
# Print the system message template
print("System Message Template:")
print(cp1a_final_assessment_prompt.messages[0].prompt.template)

print("\n" + "="*50 + "\n")

# Print the human message template  
print("Human Message Template:")
print(cp1a_final_assessment_prompt.messages[1].prompt.template)

System Message Template:
You are an expert legal analyst making a final assessment of climate legislation.
    
    A country is assessed as 'YES' for having framework climate law if:
    - Criteria 1, 2, AND 3 are all satisfied, OR
    - Criterion 4 is satisfied (exceptional case)
   
    For any claims you make, you **MUST** include the page number and document citation in the format (page X, doc Y).

    Based on the individual assessments, provide a final 'YES' or 'NO' answer with reasoning.


Human Message Template:
Individual criterion assessments:
    Criterion 1 (Strategic direction): {criterion_1_result}
    Criterion 2 (Enshrined in law): {criterion_2_result}
    Criterion 3 (Obligations): {criterion_3_result}
    Criterion 4 (Exceptional case): {criterion_4_result}
    
    What is the final assessment?



### What content does the assessment use?

Let's have a look at the chunks it retrieves to inform its assessment of each criterion.

If you look at `get_chunks_for_cp1a` in `scripts.climate_policy_pipelines.cp1.pipeline`, you can see the four search queries it embeds. It retrieves 100 chunks for each CP1a criterion:

```
cp1a_prompts = [
    "strategic direction for decarbonisation Paris Agreement national long-term target",
    "climate law enshrined in law legally binding framework",
    "obligations carbon budgets emissions targets monitoring requirements",
    "environmental law executive climate strategy broad framework"
]
```
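The embedding model and vector store used by the pipeline are not shown in this notebook. As a rough illustration of the embed, rank and keep-top-k pattern, here is a toy sketch that uses word-count vectors and cosine similarity as a stand-in for real embeddings. All function names and the example chunks are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_chunks(query, chunks, k=100):
    """Rank chunks by similarity to the query and keep the k best."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Made-up chunks for demonstration only
chunks = [
    "The Act sets a national long-term target of net zero emissions.",
    "Fishing licences are issued by the ministry each year.",
    "The strategy aligns with the Paris Agreement goals.",
]
query = "strategic direction for decarbonisation Paris Agreement national long-term target"
print(top_k_chunks(query, chunks, k=2))
```

With a real embedding model the ranking would reflect meaning rather than shared words, but the overall flow of one query per criterion and a fixed `k` stays the same.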

Let's have a look at just what `cp1a_criteria_1_search_prompt` retrieves:

In [None]:
cp1a_criteria_1_search_prompt = "strategic direction for decarbonisation Paris Agreement national long-term target"
get_chunks_for_cp1a(cp1a_criteria_1_search_prompt)

### 3. Demo assessment 

Let's run an assessment for Poland (country code `POL`):

In [9]:
run_cp1a_assessment(country="POL")

No documents found for country code: POL
No documents found for country code: POL
No documents found for country code: POL
No documents found for country code: POL


AIMessage(content="Based on the individual criterion assessments, the final assessment is:\n\nNO\n\nReasoning: Since Criteria 1, 2, and 3 are all 'NO', the country does not meet the first condition for having a framework climate law. Additionally, Criterion 4 is also 'NO', which means the exceptional case condition is not satisfied either. Therefore, the country does not have a framework climate law. (page X, doc Y)", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 89, 'prompt_tokens': 292, 'total_tokens': 381, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'meta-llama/Meta-Llama-3.1-70B-Instruct', 'system_fingerprint': None, 'id': 'chatcmpl-519a50bb2b784a2086b662720ce8179b', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--ee1a111d-aa12-46c3-b157-8105eca1d995-0', usage_metadata={'input_tokens': 292, 'output_tokens': 89, 'total_tokens': 381, 'input_token_details': {}, 'output_token_
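In the run above, no documents were found for `POL`, so the 'NO' comes from empty context and the citation is the literal placeholder "(page X, doc Y)". One way to keep "not assessed" apart from "assessed as No" is to check the retrieval result before calling the LLM. This is a minimal sketch only: `retrieve` and `assess` are hypothetical callables standing in for the pipeline's retrieval and assessment steps, whose real signatures are not shown here.

```python
def assess_with_guard(country, retrieve, assess):
    """Skip the LLM call when retrieval returns nothing for a country.

    `retrieve` and `assess` are hypothetical callables standing in for the
    pipeline's retrieval and assessment steps.
    """
    chunks = retrieve(country)
    if not chunks:
        return {"country": country, "verdict": None, "reason": "no documents found"}
    return {"country": country, "verdict": assess(chunks), "reason": "assessed"}


# Stub callables for demonstration only
result = assess_with_guard("POL", retrieve=lambda c: [], assess=lambda chunks: "NO")
print(result)
```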

# CP1a Large Context Window Assessment

Another tool uses a single, more powerful large-context-window LLM to assess all components of CP1a together instead of separately. Here is the prompt:

In [10]:
# Print the system message template
print("System Message Template:")
print(comprehensive_assessment_prompt.messages[0].prompt.template)

print("\n" + "="*50 + "\n")

# Print the human message template  
print("Human Message Template:")
print(comprehensive_assessment_prompt.messages[1].prompt.template)

System Message Template:
You are an expert legal analyst specializing in climate legislation assessment. 

Your task is to evaluate whether a country has a framework climate law based on specific criteria and provide a structured markdown assessment.

EVALUATION CRITERIA:
A country is assessed as 'YES' if it has a framework climate law that fulfils ALL of criteria 1, 2, AND 3, OR criterion 4:

1. STRATEGIC DIRECTION: Sets a strategic direction for decarbonisation (must include a clear statement to meet the goals of the Paris Agreement OR a national long-term decarbonisation target)

2. ENSHRINED IN LAW: Is enshrined in law (must be legislative rather than executive, except in particular political systems)

3. OBLIGATIONS: Sets out at least one of the following obligations:
   - Meeting a national target
   - Developing, revising, implementing or complying with domestic plans, strategies or policies
   - Developing policy instruments such as regulation, taxation or public spending in su

Let's see how it performs on the same country:

In [11]:
run_cp1a_assessment_large_context(country="POL")


No documents found for country code: POL
No documents found for country code: POL
No documents found for country code: POL
No documents found for country code: POL
Large Context Assessment:
Based on the provided context, I must inform that there is no information available to assess the country's framework climate law. The context is empty, and I couldn't find any relevant information to evaluate the criteria.

However, I will provide a structured markdown assessment as per your request:

```markdown
# Climate Legislation Assessment: CP 1.a Framework Climate Law

## Individual Criterion Evaluation

### Criterion 1: Strategic Direction for Decarbonisation
**Result:** NO
**Reasoning:** No information is available to determine if the law includes clear Paris Agreement goals or long-term decarbonisation targets.

### Criterion 2: Enshrined in Law
**Result:** NO
**Reasoning:** No information is available to determine if this is legislative rather than executive.

### Criterion 3: Sets Out O

AIMessage(content="Based on the provided context, I must inform that there is no information available to assess the country's framework climate law. The context is empty, and I couldn't find any relevant information to evaluate the criteria.\n\nHowever, I will provide a structured markdown assessment as per your request:\n\n```markdown\n# Climate Legislation Assessment: CP 1.a Framework Climate Law\n\n## Individual Criterion Evaluation\n\n### Criterion 1: Strategic Direction for Decarbonisation\n**Result:** NO\n**Reasoning:** No information is available to determine if the law includes clear Paris Agreement goals or long-term decarbonisation targets.\n\n### Criterion 2: Enshrined in Law\n**Result:** NO\n**Reasoning:** No information is available to determine if this is legislative rather than executive.\n\n### Criterion 3: Sets Out Obligations\n**Result:** NO\n**Reasoning:** No information is available to determine if any obligations are present.\n\n### Criterion 4: Exceptional Case\n

# CP1b Assessment

We have also built a tool that automatically evaluates CP1b.

CP1b is defined by ASCOR as: **Does the country’s framework climate law specify key accountability elements?**


A country is assessed as ‘Yes’ if its framework climate law contains all three of the following accountability elements: 
1. Specification of who is accountable to whom for at least one stated obligation (e.g. accountability of executive to parliament, or private parties to executive authorities) 
2. Specification of how compliance is assessed for at least one stated obligation (e.g. transparency mechanisms in the form of monitoring, reporting and verification, parliamentary oversight, expert assessments, court proceedings) 
3. Specification of what happens in the case of non-compliance for at least one stated obligation (e.g. parliamentary intervention, judicial orders, financial penalties). 

As with CP1a, these criteria are evaluated separately and then combined by an evaluator LLM. The methodology gives additional guidance on how to assess these criteria, which is included in the prompts (see `scripts.climate_policy_pipelines.cp1.prompts`).

Let's see how the tool performs:
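For comparison, the "all three elements" rule can also be applied deterministically once each criterion response has been reduced to a verdict. The sketch below is an illustration, not the tool's method (the tool uses an evaluator LLM). It assumes each response contains an upper-case 'YES' or 'NO'; the CP1b prompts are not shown here, so that format is an assumption. The function names and example responses are hypothetical.

```python
import re


def parse_verdict(response_text):
    """Return True for 'YES', False for 'NO', None if neither appears.

    Matches upper-case tokens only, so that ordinary words such as
    'no information' are not mistaken for a verdict.
    """
    match = re.search(r"\b(YES|NO)\b", response_text)
    if match is None:
        return None
    return match.group(1) == "YES"


def cp1b_verdict(criterion_responses):
    """'Yes' only if all three accountability elements are assessed as YES."""
    verdicts = [parse_verdict(r) for r in criterion_responses]
    return all(v is True for v in verdicts)


# Made-up responses for demonstration only
responses = [
    "YES. The minister reports to parliament.",
    "YES. Annual progress reports are required.",
    "NO. The text gives no consequence for non-compliance.",
]
print(cp1b_verdict(responses))
```

An unparseable response counts as not satisfied here, which errs on the side of a 'No'.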

In [12]:
run_cp1b_assessment(country="POL")

No documents found for country code: POL
No documents found for country code: POL
No documents found for country code: POL
Detailed Assessment:
Based on the individual criterion assessments, the final assessment is:

NO

Reasoning: Since all three criteria (Criterion 1, Criterion 2, and Criterion 3) are assessed as 'NO' due to a lack of information, the country's framework climate law does not meet the requirements for a 'YES' assessment. Without sufficient information to evaluate the law, it is impossible to determine whether the law specifies the necessary accountability elements.


AIMessage(content="Based on the individual criterion assessments, the final assessment is:\n\nNO\n\nReasoning: Since all three criteria (Criterion 1, Criterion 2, and Criterion 3) are assessed as 'NO' due to a lack of information, the country's framework climate law does not meet the requirements for a 'YES' assessment. Without sufficient information to evaluate the law, it is impossible to determine whether the law specifies the necessary accountability elements.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 89, 'prompt_tokens': 299, 'total_tokens': 388, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_name': 'meta-llama/Meta-Llama-3.1-70B-Instruct', 'system_fingerprint': None, 'id': 'chatcmpl-2448a8b2374044fe9e93faf1b1a1a56c', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--249e3f80-4d31-4c0b-bec4-f36c1af6a08a-0', usage_metadata={'input_tokens': 299, 'output_tokens': 89, 'total_tokens

If you want to automate assessments and only need the yes/no answer without the justification, set `detailed=False` in the function call:

In [None]:
run_cp1b_assessment(country='ALB', detailed=False)

Note that repeated runs should give the same assessment, because the model temperature is set to 0.
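Automating the yes/no output across several countries could look like the loop below. This is a sketch under assumptions: `assess` is any callable that takes a country code, and in this notebook it might be `lambda c: run_cp1b_assessment(country=c, detailed=False)`, although the return type of that call is not shown here. The stub `fake_assess` exists only so the example runs on its own.

```python
def assess_countries(country_codes, assess):
    """Run a yes/no assessment for each country code and collect the results.

    `assess` is any callable taking a country code (an assumption; the real
    pipeline function may need wrapping to fit this shape).
    """
    results = {}
    for code in country_codes:
        try:
            results[code] = assess(code)
        except Exception as exc:  # keep going if one country fails
            results[code] = f"error: {exc}"
    return results


# Stub standing in for the real assessment function
def fake_assess(code):
    return "NO"


print(assess_countries(["ALB", "POL"], fake_assess))
```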