# Compliance response evaluation pipeline

This process evaluates the RAG-generated responses used to validate compliance. It first answers the questions generated in the previous step and then compares each answer with the one that should have been obtained from the documentation. To do this we use an evaluation framework with an LLM-as-a-judge, which can evaluate both the retrieval step and the answer generation.

# RAGAS

RAGAS's vision is to facilitate the continuous improvement of LLM and RAG applications by embracing the ideology of [[Metrics Driven Development - MDD]].

They want to establish an open-source standard for applying MDD to LLM and RAG applications.
- RAG Evaluation: Enables you to assess LLM applications and conduct experiments in a metric-assisted manner, ensuring high dependability and reproducibility. 
- Monitoring: It allows you to gain valuable and actionable insights from production data points, facilitating the continuous improvement of the quality of your LLM application. 


## Metrics

### Faithfulness
Measures the factual consistency of the generated answer against the given context. It is calculated from the answer and the retrieved context. The score is scaled to the (0,1) range; higher is better.

An answer is regarded as faithful if all the claims made in it can be inferred from the given context. To calculate this:
1. A set of claims is identified from the generated answer.
2. Each of these claims is cross-checked against the given context to determine whether it can be inferred from it.

The faithfulness score is given by:


$$ Faithfulness\:Score = \frac{|Number\:of\:claims\:in\:the \:generated\: answer \:that \: can \: be \: inferred \: from \:given \:context|}{Total \:number \: of \:claims \: in \: the \: generated \: answer }$$

#### Example
Let’s examine how faithfulness is calculated for a low-faithfulness answer:
- **Step 1:** Break the generated answer into individual statements.
    - Statements:
        - Statement 1: “Einstein was born in Germany.”
        - Statement 2: “Einstein was born on 20th March 1879.”
- **Step 2:** For each of the generated statements, verify if it can be inferred from the given context.
    - Statement 1: Yes
    - Statement 2: No
- **Step 3:** Use the formula depicted above to calculate faithfulness.

$$ Faithfulness = \frac{1}{2} = 0.5$$
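
The ratio above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how RAGAS implements the metric: the helper name `faithfulness_score` is hypothetical, and the per-claim verdicts are assumed to have already been produced by the LLM judge.

```python
def faithfulness_score(verdicts):
    """Fraction of claims that can be inferred from the context.

    `verdicts` holds one boolean per claim extracted from the generated
    answer (True when the judge found the claim supported by the context).
    """
    if not verdicts:
        raise ValueError("at least one claim is required")
    # True counts as 1 and False as 0, so sum() is the number of supported claims.
    return sum(verdicts) / len(verdicts)


# Einstein example: statement 1 is supported, statement 2 is not.
print(faithfulness_score([True, False]))  # 0.5
```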

### Answer Relevancy
Assesses how pertinent a generated answer is to the given prompt. Lower scores are assigned to answers that are incomplete or contain redundant information; higher scores indicate better relevancy. It is calculated using the `question`, the `context` and the generated `answer`.

An answer is deemed relevant when it directly and appropriately addresses the original question. The assessment penalizes answers that lack completeness or contain redundant details. To calculate the score, an LLM is prompted several times to generate an appropriate question for the generated answer, and the mean cosine similarity between these generated questions and the original question is measured. The idea is that if the generated answer accurately addresses the initial question, the LLM should be able to generate questions from the answer that align with the original question.

The answer relevancy is defined as a mean cosine similarity of the original `question` to a number of artificial questions, which were generated (reverse engineered) based on the answer:

$$ Answer\: Relevancy = \frac{1}{N} \sum_{i=1}^{N} \cos(E_{g_i}, E_o)$$
$$ Answer\: Relevancy = \frac{1}{N} \sum_{i=1}^{N} \frac{E_{g_i} \cdot E_o}{\|E_{g_i}\| \, \|E_o\|}$$
Where:
- $E_{g_i}$ is the embedding of the generated question $i$.
- $E_o$ is the embedding of the original question. 
- $N$ is the number of generated questions, which is 3 by default. 

#### Example

>[!Hint]
>**Question**: Where is France and what is its capital?
>
>**Low relevance answer**: France is in western Europe.
>
>**High relevance answer**: France is in western Europe and Paris is its capital.

To calculate the relevance of the answer to the given question, we follow two steps:

- **Step 1:** Reverse-engineer ‘n’ variants of the question from the generated answer using a Large Language Model (LLM). For instance, for the first answer, the LLM might generate the following possible questions:
    - _Question 1:_ “In which part of Europe is France located?”
    - _Question 2:_ “What is the geographical location of France within Europe?”
    - _Question 3:_ “Can you identify the region of Europe where France is situated?”
- **Step 2:** Calculate the mean cosine similarity between the generated questions and the actual question.

The underlying concept is that if the answer correctly addresses the question, it is highly probable that the original question can be reconstructed solely from the answer.
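
Step 2 can be illustrated with a short, dependency-free sketch. The function names are hypothetical, the 2-D vectors are toy stand-ins for real embeddings (which would come from the embedding model), and all vectors are assumed to be non-zero.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def answer_relevancy(original_embedding, generated_embeddings):
    """Mean cosine similarity between each generated question and the original."""
    sims = [cosine_similarity(g, original_embedding) for g in generated_embeddings]
    return sum(sims) / len(sims)


# Toy vectors: one generated question matches the original exactly,
# one is orthogonal to it, and one lies in between.
original = [1.0, 0.0]
generated = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
score = answer_relevancy(original, generated)
print(round(score, 3))
```

The closer the generated questions are to the original one in embedding space, the closer the score gets to 1.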

### Context Recall

Measures the extent to which the retrieved context aligns with the annotated answer, which is treated as the ground truth.

It is computed using the `question`, the `ground truth` and the retrieved `context`, and takes values between 0 and 1, with higher values indicating better performance.

To estimate context recall from the ground truth (GT) answer, each claim in the ground truth answer is analyzed to determine whether it can be attributed to the retrieved context.

$$ context \: recall = \frac{| GT\: claims \: that \: can \: be \: attributed \: to \: context|}{|Number \: of \: claims \: in \: GT|}$$

#### Example

>[!Example]
>
>Question: Where is France and what is its capital?
>
>Ground truth: France is in Western Europe and its capital is Paris.
>
>High context recall: France, in Western Europe, encompasses medieval cities, alpine villages and Mediterranean beaches. Paris, its capital, is famed for its fashion houses, classical art museums including the Louvre and monuments like the Eiffel Tower.
>
>Low context recall: France, in Western Europe, encompasses medieval cities, alpine villages and Mediterranean beaches. The country is also renowned for its wines and sophisticated cuisine. Lascaux’s ancient cave drawings, Lyon’s Roman theater and the vast Palace of Versailles attest to its rich history.


#### Calculation

For the low context recall example:

- **Step 1**: Break the ground truth answer into individual statements.
	- Statements:
		- Statement 1: "France is in Western Europe"
		- Statement 2: "Its capital is Paris"
- **Step 2**: For each of the ground truth statements, verify if it can be attributed to the retrieved context. 
	- Statement 1: Yes
	- Statement 2: No
- **Step 3**: Use the formula depicted above to calculate context recall. 
$$ context \: recall = \frac{1}{2} = 0.5 $$ 


### Context Precision

Context precision evaluates whether the ground-truth relevant items present in the `contexts` are ranked near the top. Ideally, all relevant chunks appear at the top ranks.

It uses the `question`, the `ground_truth` and the `contexts`, and takes values between 0 and 1, where higher scores indicate better precision.


$$ Context \: Precision@K = \frac{\sum_{k=1}^{K} \left( Precision@k \cdot v_k \right)}{Total\: number \:of\: relevant\: items\: in\: top\: K\: results}$$
$$ Precision@k = \frac{true\: positives @k}{true\: positives @k + false\: positives @k}$$

Where $K$ is the total number of chunks in `contexts` and $v_k \in \{0,1\}$ is the relevance indicator at rank $k$.

#### Example calculation 


>[!Example]
>Question: Where is France and what is its capital?
>
>Ground truth: France is in Western Europe and its capital is Paris.
>
>High context precision: [“France, in Western Europe, encompasses medieval cities, alpine villages and Mediterranean beaches. Paris, its capital, is famed for its fashion houses, classical art museums including the Louvre and monuments like the Eiffel Tower”, “The country is also renowned for its wines and sophisticated cuisine. Lascaux’s ancient cave drawings, Lyon’s Roman theater and the vast Palace of Versailles attest to its rich history.”]
>
>Low context precision: [“The country is also renowned for its wines and sophisticated cuisine. Lascaux’s ancient cave drawings, Lyon’s Roman theater and”, “France, in Western Europe, encompasses medieval cities, alpine villages and Mediterranean beaches. Paris, its capital, is famed for its fashion houses, classical art museums including the Louvre and monuments like the Eiffel Tower”]

Calculation for the low context precision example:

1. For each chunk in the retrieved context, check whether it is relevant to arriving at the ground truth for the given question. Here the first chunk is not relevant ($v_1 = 0$) and the second one is ($v_2 = 1$).
2. Calculate $Precision@k$ for each chunk in the context:
	1. $$ Precision@1 = \frac{0}{1} = 0$$
	2. $$ Precision@2 = \frac{1}{2} = 0.5$$
3. Weight each $Precision@k$ by $v_k$, sum the results, and divide by the number of relevant chunks (1 in this example) to arrive at the final context precision score:

$$Context \: Precision = \frac{0 \cdot 0 + 0.5 \cdot 1}{1} = 0.5$$
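
The same calculation can be written as a small Python sketch. The function name `context_precision` and the choice to return 0.0 when no chunk is relevant are assumptions made for illustration; the relevance flags $v_k$ are assumed to come from the LLM judge.

```python
def context_precision(relevance):
    """Context Precision@K from per-rank relevance flags.

    `relevance` lists v_k (1 if the chunk at rank k is relevant, else 0),
    ordered as the retriever returned the chunks.
    """
    hits = 0            # true positives seen up to rank k
    weighted_sum = 0.0  # running sum of Precision@k * v_k
    for k, v_k in enumerate(relevance, start=1):
        hits += v_k
        weighted_sum += (hits / k) * v_k
    # Assumption: define the score as 0.0 when nothing relevant was retrieved.
    return weighted_sum / hits if hits else 0.0


print(context_precision([0, 1]))  # low context precision example: 0.5
print(context_precision([1, 0]))  # high context precision example: 1.0
```

Swapping the order of the two chunks is the only difference between the two calls, which shows that the metric rewards ranking the relevant chunk first.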



## RAGAS

This framework evaluates Retrieval Augmented Generation (RAG) pipelines.

In [10]:
import sys
import os

# Get the absolute path to the src directory
src_path = os.path.abspath("..")

# Add the src directory to the sys.path
if src_path not in sys.path:
    sys.path.insert(0, src_path)

# Set the __package__ attribute to simulate running as a package
__package__ = "src"

In [None]:
import requests
import json

# Query the http://localhost:6000/questions-from-standard endpoint to get the questions for a particular standard

# Define the endpoint
url = "http://localhost:6000/questions-from-standard"

# Send the request
response = requests.get(url)

In [4]:
questions = json.loads(response.text)
questions = questions["questions"]
questions[0:4]

['1. Has the borrower prepared a Pest Management Plan (PMP) for projects involving significant pest management issues or activities that may lead to such issues?',
 '2. Are there measures in place to improve the efficient consumption of resources such as energy, water, and raw materials?',
 '3. Has the borrower assessed the potential cumulative impacts of water use associated with the project?',
 '4. What controls are in place for the use of hazardous materials in the project?']

In [12]:
# Now you can import your modules
from src import config
from src.models import Models

2024-09-25 18:13:40.747 | INFO     | src.config:<module>:13 - Secrets loaded from .env


In [13]:
llm, embed_model = Models.initialize_models()

test_embedding = embed_model.get_text_embedding(
    "Open AI new Embeddings models is great."
)

print(test_embedding[0:4])
# print(test_embedding.shape)

[-0.014704804867506027, 0.010918455198407173, -0.0006071067182347178, -0.030538631603121758]


In [9]:
from ragas.metrics import (
    context_precision,
    answer_relevancy,
    faithfulness,
    context_recall,
)
from ragas.metrics.critique import harmfulness

# list of metrics we're going to use
metrics = [
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
    harmfulness,
]


from ragas.run_config import RunConfig

thread_timeout = 240
timeout = 240
max_retries = 10
max_wait = 120
run_config = RunConfig(
    timeout=timeout,
    max_retries=max_retries,
    # thread_timeout=thread_timeout
)
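
Before these metrics can be computed, the questions, generated answers, retrieved chunks and reference answers need to be arranged column-wise. The sketch below shows one possible layout. The helper `build_eval_records` is hypothetical, and the column names `question`, `answer`, `contexts` and `ground_truth` are an assumption based on the RAGAS 0.1.x dataset layout, so they should be checked against the installed version.

```python
def build_eval_records(questions, answers, contexts, ground_truths):
    """Arrange evaluation inputs as a dict of equal-length columns.

    `contexts` holds one list of retrieved chunks per question.
    """
    lengths = {len(questions), len(answers), len(contexts), len(ground_truths)}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length")
    return {
        "question": list(questions),
        "answer": list(answers),
        "contexts": [list(chunks) for chunks in contexts],
        "ground_truth": list(ground_truths),
    }


# Toy record reusing the France example from the metric descriptions.
records = build_eval_records(
    questions=["Where is France and what is its capital?"],
    answers=["France is in western Europe and Paris is its capital."],
    contexts=[["France, in Western Europe, encompasses medieval cities, alpine villages and Mediterranean beaches."]],
    ground_truths=["France is in Western Europe and its capital is Paris."],
)
print(sorted(records))
```

A dictionary of this shape could then be wrapped in a Hugging Face `Dataset` and passed to the RAGAS evaluation entry point together with `metrics` and `run_config`. The exact call signature depends on the RAGAS version in use.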



In [7]:
from .index import IndexManager

storage_dir = "../storage_vector_store"
index = IndexManager.load_index(storage_dir)

In [18]:
import requests
import json


async def get_answers_from_project(query):
    # Define the endpoint
    url = "http://localhost:8540/query-project"

    json_data = {
        "query": query,
    }

    # Send the request. Note that requests.post is a blocking call, so this
    # coroutine does not yield to the event loop while it waits.
    response = requests.post(url, json=json_data)
    return json.loads(response.text)


response_json = await get_answers_from_project("Jindal?")



In [20]:
# response_json = json.loads(response_json)
import pprint

print("Response: ")
pprint.pprint(response_json["response"]["response"])
print("Source nodes: ")
pprint.pprint(response_json["response"]["source_nodes"][0]["node"]["text"])
pprint.pprint(response_json["response"]["source_nodes"][1]["node"]["text"])

Response: 
('Jindal Iron Ore (Pty) Ltd is involved in a project known as the Jindal MIOP '
 '(Mining and Infrastructure Optimization Project). This project aims to '
 'contribute to economic development and inclusive growth through revenue and '
 'tax generation, as well as the creation of employment opportunities. The '
 'project also aligns with various strategic frameworks and policies, such as '
 'the National Development Plan (NDP) and the Provincial Growth Development '
 'Strategy (PGDS) of KwaZulu-Natal (KZN), by supporting local economic '
 'development, social investment, and the inclusion of vulnerable groups in '
 'economic activities.')
Source nodes: 
('Jindal Iron Ore (Pty) Ltd SLR Project No: 720.10023.00001 \n'
 'Jindal MIOP EIA & EMPr   July 2023 \n'
 ' \n'
 ' \n'
 ' \n'
 ' 74  \n'
 ' \n'
 'Jindal Iron Ore Mine ESIA and EMPr - 09072023 FINAL Importantly, the NDP '
 'notes that while minerals beneficiation is a good way to increase '
 'productivity and export \n'
 'reven

In [21]:
expert_questions = [
    "Energy Efficiency: What measures are in place to ensure the efficient use of energy in mining operations? Are there any energy-saving technologies or practices being implemented?",
    "Water Management: How is water usage monitored and managed to ensure efficiency? Are there any systems in place for recycling or reusing water within the mining operations?",
    "Raw Material Usage: What steps are being taken to optimize the use of raw materials and minimize waste? Are there any benchmarking data available to compare the mine's resource efficiency with industry standards?",
    "Air Quality Management: What measures are in place to monitor and control air emissions, including greenhouse gases and particulate matter, from mining activities?",
    "Water Pollution Control: How is wastewater from mining operations treated before being discharged? Are there any measures to prevent contamination of local water bodies?",
    "Soil and Land Management: What practices are in place to prevent soil erosion and land degradation due to mining activities? Are there any land reclamation plans post-mining?",
    "Hazardous Waste Management: How is hazardous waste generated by the mine identified, stored, and disposed of? Are there any protocols for handling spills or accidental releases?",
    "Non-Hazardous Waste Management: What systems are in place for the management of non-hazardous waste, including recycling and reduction initiatives?",
]

In [22]:
import pprint

import pandas as pd

# Collect one row per question, then build the DataFrame once at the end.
rows = []
for question in expert_questions:
    response_json = await get_answers_from_project(question)
    answer = response_json["response"]["response"]
    source_nodes = response_json["response"]["source_nodes"]
    first_source = source_nodes[0]["node"]["text"]
    second_source = source_nodes[1]["node"]["text"]

    print("Question: ", question)
    print("Response: ")
    pprint.pprint(answer)
    print("Source nodes: ")
    pprint.pprint(first_source)
    pprint.pprint(second_source)
    print("")

    rows.append(
        {
            "Question": question,
            "Response": answer,
            "Source node 1": first_source,
            "Source node 2": second_source,
        }
    )

df = pd.DataFrame(rows)

# Save the results to an Excel file
df.to_excel("project-compliance-validation.xlsx", index=False)

Question:  Energy Efficiency: What measures are in place to ensure the efficient use of energy in mining operations? Are there any energy-saving technologies or practices being implemented?
Response: 
('To ensure the efficient use of energy in mining operations, several measures '
 'are in place. These include the decarbonisation of the electricity supply, '
 'which can be achieved through the integration of renewable energy sources '
 'into the national grid or the installation of on-site renewable energy '
 'systems. Additionally, the electrification of the mine vehicle fleet is '
 'being considered to reduce fuel consumption in mobile machinery. Regular '
 'servicing of vehicles and machinery is also emphasized to maintain optimal '
 'fuel efficiency.')
Source nodes: 
('Engineer/ EO Visual inspection \n'
 'Grievance mechanism Bi-annually throughout \n'
 'operational phase \n'
 '12 \uf0b7 Use of construction \n'
 'equipment and other \n'
 'machinery during \n'
 'construction activiti

In [23]:
df

| | Question | Response | Source node 1 | Source node 2 |
|---|---|---|---|---|
| 0 | Energy Efficiency: What measures are in place ... | To ensure the efficient use of energy in minin... | Engineer/ EO Visual inspection \nGrievance mec... |  The planned mining activities should further... |
| 1 | Water Management: How is water usage monitored... | Water usage is monitored and managed through s... | Jindal Iron Ore (Pty) Ltd SLR Project No: 720.... | Jindal Iron Ore (Pty) Ltd \nJindal MIOP EIA & ... |
| 2 | Raw Material Usage: What steps are being taken... | To optimize the use of raw materials and minim... | Engineer/ RE/ EO/ \nEnvironmental Manager Sign... | Engineer/ EO Visual inspection \nGrievance mec... |
| 3 | Air Quality Management: What measures are in p... | Measures to monitor and control air emissions ... | Jindal Iron Ore (Pty) Ltd SLR Project No: 720.... | Jindal Iron Ore (Pty) Ltd SLR Project No: 720... |
| 4 | Water Pollution Control: How is wastewater fro... | Wastewater from mining operations is treated u... | Jindal Iron Ore (Pty) Ltd SLR Project No: 720.... | Jindal Iron Ore (Pty) Ltd SLR Project No: 720.... |
| 5 | Soil and Land Management: What practices are i... | To prevent soil erosion and land degradation d... | Jindal Iron Ore (Pty) Ltd SLR Project No: 720... | Jindal Iron Ore (Pty) Ltd \nJindal MIOP EIA & ... |
| 6 | Hazardous Waste Management: How is hazardous w... | Hazardous waste generated by the mine is ident... | Jindal Iron Ore (Pty) Ltd \nJindal MIOP EIA & ... | Jindal Iron Ore (Pty) Ltd \nJindal MIOP EIA & ... |
| 7 | Non-Hazardous Waste Management: What systems a... | The management of non-hazardous waste includes... | Jindal Iron Ore (Pty) Ltd SLR Project No: 720.... | Jindal Iron Ore (Pty) Ltd SLR Project No: 720.... |


# Comparison against expert responses

In [2]:
questions = [
    "Are there technically and financially feasible and cost-effective options implemented to avoid or minimise project-related air emissions during the design, construction, and operation phases?",
    "Is the project in compliance with existing requirements for the management of hazardous wastes, including national legislation and applicable international conventions?",
]

expert_responses = [
    """The EIA and EMP report provides some information about measures that should be implemented to avoid or minimise project-related air emissions during the design, construction, and operation phases of the Jindal MIOP. The design phase should include technical specifications for process emissions from the proposed milling plant.

    Design Phase

    Mitigation Hierarchy: Specific details about air emission mitigation measures during the design phase, such as utilising renewable energy sources or incorporating air quality considerations in infrastructure placement, are not mentioned.
    Construction Phase:

    Dust Management: The report emphasises dust control during construction, including minimising drop heights for materials, utilising water suppression techniques, and covering vehicles transporting dry materials.
    Timing of construction: Restricting construction activities to daytime hours as defined by the DMRE is mentioned as a way to mitigate noise pollution. While this primarily targets noise reduction, it indirectly helps minimise air emissions by avoiding activities during potential temperature inversions which can trap pollutants.
    Vehicle Maintenance: This indirectly contributes to air quality control because exhaust emissions from well-maintained vehicles should not exceed the design limits.
    Operational Phase:

    Air Quality Monitoring: Continuous air quality monitoring for dust and other pollutants is required, with a focus on areas near the project boundary and potential sensitive receptors like nearby farms.
    Dust Suppression: The plan includes ongoing dust suppression measures, including the use of surfactants and dust suppressants during watering, especially near project boundaries. Maintaining existing vegetation and planting additional native trees is also encouraged to act as windbreaks and reduce dust.
    Operational Practices: The plan mandates practices to minimise dust generation, such as covering trucks carrying dry materials, using chutes at material transfer points, minimising vehicle speeds on paved and unpaved roads, and maintaining haul routes with less erodible material.
    Blasting Management: The plan outlines measures to control dust during blasting, including limiting blasting during specific wind conditions, avoiding blasting in fog or low clouds, restricting blasting to daytime hours, and considering pre-watering or palliative application on blast areas.
    Energy Use Reduction: While specific details are limited, the plan encourages the exploration of energy reduction options, including decarbonising the electricity supply (potentially through on-site power generation through renewables) and electrifying the mining vehicle fleet.
    Conclusion:

    The EIA and EMP outline various measures to manage and mitigate air emissions, but the sources lack specific details about the technical specifications of some proposed solutions, particularly regarding energy use reduction strategies. There is also no mention about emissions from the proposed milling and magnetic separation processing plant.""",
    """The EIA and EMPr report does provide information regarding the Jindal MIOP's intention to comply with hazardous waste management requirements, referencing both national legislation and international conventions.

    National Legislation

    National Environmental Management: Waste Act, 2008 (NEM:WA): This Act serves as the primary legislation governing waste management in South Africa. The Jindal MIOP acknowledges that it must apply for a Waste Management Licence (WML) in accordance with NEM:WA and its associated regulations. No evidence was provided about whether the licence application has started.
    List of Waste Management Activities (GNR 921 of 2013): The project identifies specific waste management activities within the scope of GNR 921, categorising them into Categories A, B, and C. Activities in Categories A and B require impact assessment and licensing processes.
    Regulations Regarding Residue Stockpiles and Deposits (GNR 632 of 2015): The project recognises the applicability of these regulations, emphasising that environmental impact assessments must consider the potential impacts of residue stockpiles and deposits.
    Hazardous Substances Act, 1973 (HSA): The report states that the project will adhere to the HSA in planning, assessing, and managing hazardous substances associated with its operations.
    International Conventions

    Basel Convention on the Control of Transboundary Movements of Hazardous Wastes and their Disposal: The project identifies the Basel Convention as a relevant international framework guiding its hazardous waste management practices.
    Specific Measures

    Waste Management Plan (WMP): The project proposes to implement a comprehensive WMP adhering to waste minimisation, segregation, reuse, recycling, treatment, and disposal principles.
    Hazardous Waste Storage: The project mandates the storage of hazardous materials, such as fuel, oil, and chemicals, in bunded facilities designed to contain potential spills.
    Spill Response: The report outlines procedures for immediate spill cleanup and proper disposal of contaminated materials.
    Environmental Management Programme (EMPr): The EMPr includes provisions for managing hazardous waste throughout all project phases, from construction and operation to decommissioning. It emphasises good housekeeping, proper waste storage, and employee training on hazardous materials handling.
    Conclusion

    The EIA and EMPr report demonstrates compliance with existing hazardous waste management requirements by acknowledging relevant national legislation, identifying applicable international conventions, and outlining specific measures in its EMPr and associated plans. The project emphasises minimising waste generation, properly handling and storing hazardous materials, and implementing appropriate spill response procedures. However, the report often lacks detail on how to put these intentions into practice.
    """,
]

system_responses = [
    """Yes, there are several technically and financially feasible and cost-effective options implemented to avoid or minimize project-related air emissions during the design, construction, and operation phases. These include:

    1. **Reducing Vehicle Speeds**: Limiting vehicle speeds to 40 km/hr on paved site roads and 20 km/hr on unpaved haul routes.
    2. **Utilizing Chutes**: Using chutes at material handling transfer points to minimize dust.
    3. **Enclosing Processing Plant**: Ensuring the processing plant is enclosed and designing bag filters on extraction points for 30 mg/Nm3.
    4. **Covering Vehicles**: Covering vehicles carrying dry soil and other materials during travel.
    5. **Maintaining Haul Routes**: Using less erodible aggregate material for haul routes and maintaining them properly.
    6. **Air Quality Monitoring**: Implementing continuous ambient air quality monitoring during the test phase and long-term monitoring campaigns.
    7. **Blasting Restrictions**: Restricting blasting to low wind speeds, avoiding certain wind directions, and not blasting during fog, low overcast clouds, or in the dark.
    8. **Wind Speed Reduction**: Reducing wind speed around dusty areas through early planting of native tree species and strategic use of barriers.
    9. **Grievance Mechanism**: Implementing a grievance procedure for air quality issues to be raised and addressed transparently and promptly.

    These measures are designed to mitigate air emissions effectively throughout the various phases of the project.""",
    """Yes, the project is in compliance with existing requirements for the management of hazardous wastes. It adheres to national legislation such as the National Environmental Management: Waste Act, 2008, and follows guidelines and regulations for waste management activities, including the planning and management of residue stockpiles and deposits. Additionally, the project considers international conventions and guidelines relevant to health and environmental standards, ensuring comprehensive compliance.""",
]

source_nodes = [
    [
        """Engineer/ EO Visual inspection 
        Grievance mechanism Bi-annually throughout 
        operational phase 
        12  Use of construction 
        equipment and other 
        machinery during 
        construction activities. Impact of the project 
        on climate change 12.1 Some potential options for energy use reduction to be considered include: 
         Decarbonisation of the electricity supply: This could come in several forms, such as the 
        decarbonisation of the grid emission factor as new renewable energy comes online in 
        the national grid system. Alternatively, some decarbonisation could be achieved 
        through the installation of on-site renewable energy for own use. 
         Electrification of the fleet: This option could mitigate emissions by electrifying the 
        mine vehicle fleet and reducing the fuel consumption in mobile machinery. The 
        electrification could be combined with renewable energy for further mitigation. Jindal Iron Ore 
        (Pty) Ltd/  Research on potential alternative 
        sources of power  
        Cost benefit analysis Annually throughout 
        operational phase 
        12.2 Regularly service vehicles and machinery to ensure optimal fuel efficiency. Engineer Maintenance Schedule 
        Maintenance records Annually throughout 
        operational phase""",
        """Engineer/ RE/ EO/ 
        Environmental Manager Signed engineering designs 
        Approved Integrated Water Use 
        Licence (IWUL) Prior to construction and 
        active mining and waste 
        rock dumping taking place 
        Minimise flooding of 
        infrastructure 11.2 Design and construct a berm to divert clean water around the mine infrastructure 
        and back into natural flowpaths in the environment. Engineer/ RE/ EO/ 
        Environmental Manager Signed engineering designs 
        Approved Integrated Water 
        Use Licence (IWUL) Prior to construction of the 
        processing plant and other 
        infrastructure 
        12 Traffic impact Minimise negative effects 
        associated with the increase in 
        traffic. 
        12.1 Access to and from the development area should be either via existing roads or 
        within the construction servitude. Construction Contractor/ EO Road design plan Daily inspections 
        throughout construction 
        12.2 Develop and implement a Traffic Management Plan (TMP) including strict 
        controls over driver training and qualifications, vehicle maintenance, vehicle 
        certifications, speed restrictions, appropriate road safety signage, and vehicle 
        loading and maintenance measures. Ensure that this TMP includes measures to Jindal Iron Ore (Pty) Ltd 
        (Safety Manager)  TMP  Prior to construction""",
    ],
    [
        """Jindal Iron Ore (Pty) Ltd SLR Project No: 720.10023.00001 
        Jindal MIOP EIA & EMPr   July 2023 
        
        
        
        67  
        
        Jindal Iron Ore Mine ESIA and EMPr - 09072023 FINAL environmental health management. The Director General (DG) is tasked to promulgate, and promote adherence 
        to, norms and standards on health matters, including conditions that constitute a health hazard and facilitate the 
        provision of indoor and outdoor environmental pollution control services.  
        Section 88 of the Act provides legal effect to environmental health investigations.  Any activity that gives rise to 
        offensive/injurious conditions or is dangerous to health (e.g. accumulation of refuse) may have a negative impact 
        on health and thus warrants being assessed in an Health Impact Assessment. 
        4.11 INTERNATIONAL LAW AND GUIDANCE 
        South Africa is a signatory to international conventions that may be applicable to the Project and these may be 
        seen to provide additional direction in the absence or limitation of local legislation or policy. Various international 
        bodies also provide relevant guidelines for health assessment. Those of relevance include:  
         The United Nations Declaration on Rights of the Indigenous Peoples. 
         Stockholm Convention on Persistent Organic Pollutants. 
         Basel Convention on the Control of Trans-boundary Movements of Hazardous Wastes and their Disposal. 
         United Nations Development Program. Global and Inclusive Agreement. 
         United Nations Environmental Program.  
         International Health Regulations as promulgated by the World Health Organization.  
         International Finance Corporation’s (IFC) Performance Standards and Equator Principles. 
        4.12 GUIDELINES, POLICIES, PLANS AND FRAMEWORKS 
        The guidelines, policies and plans that have been considered during the S&EIA process are listed in Table 4-2. 
        Table 4-2: Guideline and Policy Framework  
        Applicable legislation and guidelines 
        used to compile the report How does this development comply with and respond to the 
        policy and legislative context Reference 
        where applied 
        National Norms and Standards for the 
        Storage of Waste, published in terms 
        of NEM:WA in Government Notice 926 
        of 2013 These regulations have informed project planning and have 
        been taken into account in the assessment and management 
        of waste for the project. Table 3-5 
        National Waste Information 
        Regulations published in terms of 
        NEM:WA in Government Notice 625 of 
        2012 
        National Norms and Standards for the 
        Assessment of Waste for Landfill 
        Disposal, published in terms of the 
        NEM:WA in Government Notice 635 of 
        August 2013 
        Guideline on the Need and Desirability, 
        Department of Environmental Affairs, 
        2017 This guideline has been taken into account as part of project 
        planning. Section 5""",
        """Jindal Iron Ore (Pty) Ltd SLR Project No: 720.10023.00001 
        Jindal MIOP EIA & EMPr   July 2023 
        
        
        
        60  
        
        Jindal Iron Ore Mine ESIA and EMPr - 09072023 FINAL Applicable legislation and guidelines 
        used to compile the report How does this development comply with and respond to the 
        policy and legislative context Reference 
        where applied 
        National Environmental Management: 
        Waste Act, 2008 (Act No. 59 of 2008) 
        (NEM:WA) The WRD requires a Waste Management Licence in terms of 
        the NEM:WA. An integrated application for Environmental 
        Authorisation and a Waste Management License has be 
        submitted to the DMRE and is part of this EIA process. Section 4.4 and 
        4.4.1 and 4.4.2 
        List of Waste Management Activities 
        published in terms of NEM:WA in 
        Government Notice 921 of 29 
        November 2013 (as amended) 
        Waste Classification and Management 
        Regulations published in terms of 
        NEM:WA in Government Notice 634 of 
        2013 As from 8 December 2014 Government implemented the 
        One Environmental System. As a result, residue stockpiles 
        and residue deposits are no longer excluded from the ambit 
        of the NEM:WA. Accordingly, the aforesaid Regulations find 
        application to all waste types to be generated at the Jindal 
        Iron Ore Mine Project including residue to be generated as 
        part of the processing plant and mine operations.    
        Regulations Regarding the Planning 
        and Management of Residue 
        Stockpiles and Residue Deposits from a 
        Prospecting, Mining, Exploration, or 
        Production Operation (GNR 632 of 
        2015) Waste rock is a defined waste in terms of NEM:WA.  Section 4.4.2 
        National Water Act, 1998 (Act No. 36 
        of 1998) (NWA) A new WULA will be submitted to DWS and will cover Section 
        21 (a) (b) (c) (f) (g) (i) and (j). water uses prior to the 
        commencement of construction and operation activities 
        within the project site. Section 4.5 and 
        4.5.1 
        The regulations in terms of section 26 
        read in conjunction with section 12a of 
        the water act, 1956 (Act No. 54 of 
        1956) 
        National Ambient Air Quality 
        Standards, published in terms of 
        NEM:AQA in Government Notice 1210 
        of 2009 National Ambient Air Quality Standards (NAAQS) are 
        available for inhalable particulate matter less than 2.5 μm in 
        diameter (PM2.5) as gazetted on 29 June 2012 (no. 35463), 
        inhalable particulate matter less than 10 μm in diameter 
        (PM10), sulphur dioxide (SO 2), nitrogen dioxide (NO 2), ozone 
        (O3), carbon monoxide (CO), lead (Pb) and benzene as 
        gazetted on 24 December 2009. Section 4.6 and 
        4.6.1 
        National Dust Control Regulations, 
        published in terms of NEM:AQA in 
        Government Notice 827 of 2013 South Africa’s Draft National Dust Control Regulations were 
        published on 27 May 2011 with the dust fallout standards 
        passed and subsequently published on the 1st of November 
        2013 (Government Gazette No. 36974). These are called the 
        National Dust Control Regulations (NDCR). The purpose of 
        the regulations is to prescribe general measures for the 
        control of dust in all areas including residential and light 
        commercial areas. 
        The regulation also specifies that the method to be used for 
        measuring dust fall and the guideline for locating sampling 
        points shall be American Society for Testing Materials (ASTM)""",
    ],
]


## Building the RAGAS dataset

In [3]:
data_samples = {
    "question": questions,
    "ground_truth": expert_responses,
    "answer": system_responses,
    "contexts": source_nodes,
}
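
Before the dictionary is passed to `Dataset.from_dict`, it can help to confirm that all four columns have the same number of rows and that each `contexts` entry is a list of strings, which is the shape `source_nodes` has above. A minimal sketch of such a check follows; the helper name `validate_samples` and the two-row example data are hypothetical and not part of the notebook.

```python
from typing import Dict


def validate_samples(samples: Dict[str, list]) -> int:
    """Check the column layout assumed here and return the number of rows."""
    required = ["question", "ground_truth", "answer", "contexts"]
    missing = [name for name in required if name not in samples]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    lengths = {name: len(samples[name]) for name in required}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"columns differ in length: {lengths}")

    for i, ctx in enumerate(samples["contexts"]):
        # Each row carries a list of retrieved passages, not a single string.
        if not isinstance(ctx, list) or not all(isinstance(c, str) for c in ctx):
            raise ValueError(f"contexts[{i}] must be a list of strings")

    return lengths["question"]


# Hypothetical two-row example; the notebook itself uses questions,
# expert_responses, system_responses and source_nodes.
example = {
    "question": ["q1", "q2"],
    "ground_truth": ["expected 1", "expected 2"],
    "answer": ["generated 1", "generated 2"],
    "contexts": [["passage a", "passage b"], ["passage c"]],
}

print(validate_samples(example))
```

Calling `validate_samples(data_samples)` right after the cell above would surface a mismatched column before the evaluation run starts.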


In [4]:
from datasets import Dataset

dataset = Dataset.from_dict(data_samples)




In [5]:
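from ragas.run_config import RunConfig

# Per-request timeout (seconds) and retry budget for the evaluation calls
timeout = 240
max_retries = 10

# Defined here but not passed to RunConfig below
thread_timeout = 240
max_wait = 120

run_config = RunConfig(
    timeout=timeout,
    max_retries=max_retries,
    # thread_timeout=thread_timeout
)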
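
The cell above sets a retry budget (`max_retries`) and defines a `max_wait` value. One common way such settings interact is capped exponential backoff, where the delay doubles after each failed attempt until it reaches the cap. The sketch below only illustrates that general pattern. It is an assumption made for illustration, not a description of how ragas schedules its retries, and `backoff_delays` and `base` are hypothetical names.

```python
from typing import List


def backoff_delays(max_retries: int, max_wait: float, base: float = 1.0) -> List[float]:
    """Return the wait (seconds) before each retry: base * 2**n, capped at max_wait."""
    return [min(float(max_wait), base * 2 ** attempt) for attempt in range(max_retries)]


# Values taken from the notebook cell: 10 retries, cap of 120 seconds.
delays = backoff_delays(max_retries=10, max_wait=120)
print(delays)
print(sum(delays))  # worst-case total time spent waiting between attempts
```

Under this pattern the cap is reached after a handful of attempts, so the total waiting time is dominated by `max_wait` rather than by the early, short delays.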


In [6]:
from ragas.metrics import (
    context_precision,
    answer_relevancy,
    faithfulness,
    context_recall,
)
from ragas.metrics.critique import harmfulness

# list of metrics we're going to use
metrics = [
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
    harmfulness,
]


In [7]:
import os

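azure_configs = {
    # Read from the environment; None if AZURE_OPENAI_ENDPOINT is not set when this cell runs
    "base_url": os.getenv("AZURE_OPENAI_ENDPOINT"),
    "model_deployment": "gpt-4o",
    "model_name": "gpt-4o",
    "embedding_deployment": "text-embedding-ada-002",
    "embedding_name": "text-embedding-ada-002",  # assumed to match the deployment name
}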

azure_configs


{'base_url': None,
 'model_deployment': 'gpt-4o',
 'model_name': 'gpt-4o',
 'embedding_deployment': 'text-embedding-ada-002',
 'embedding_name': 'text-embedding-ada-002'}

In [10]:
import os
import sys

from dotenv import find_dotenv, load_dotenv
from langchain_openai.chat_models import AzureChatOpenAI
from langchain_openai.embeddings import AzureOpenAIEmbeddings
from ragas import evaluate

# Load credentials from the nearest .env file into the environment
load_dotenv(find_dotenv())

# Re-export OPENAI_API_KEY; if it is unset, str() stores the literal string "None"
os.environ["OPENAI_API_KEY"] = str(os.getenv("OPENAI_API_KEY"))

# Add the parent directory to sys.path so project modules can be imported
src_path = os.path.abspath("..")
if src_path not in sys.path:
    sys.path.insert(0, src_path)

# Set the __package__ attribute to simulate running as a package
__package__ = "src"

# Chat model used as the judge LLM
azure_model = AzureChatOpenAI(
    openai_api_version="2023-05-15",
    azure_endpoint=azure_configs["base_url"],
    azure_deployment=azure_configs["model_deployment"],
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    model=azure_configs["model_name"],
    validate_base_url=False,
)

# Embeddings for the embedding-based metrics (answer_relevancy in this notebook)
azure_embeddings = AzureOpenAIEmbeddings(
    openai_api_version="2024-02-01",
    azure_endpoint=azure_configs["base_url"],
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    azure_deployment=azure_configs["embedding_deployment"],
    model=azure_configs["embedding_name"],
)


In [11]:
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = azure_model.invoke(messages)
ai_msg


AIMessage(content="J'aime la programmation.", response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 31, 'total_tokens': 36}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_80a1bad4c7', 'finish_reason': 'stop', 'logprobs': None, 'content_filter_results': {}}, id='run-4ca1933a-2c0f-4592-8b32-9302cc8e3665-0', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})

In [12]:
embeddings = await azure_embeddings.aembed_documents(["hello world", "goodbye world"])

embeddings[0][:4]


[-0.0148448646068573,
 0.001334490138106048,
 -0.018493514508008957,
 -0.031138652935624123]

In [13]:
from ragas import evaluate

result = evaluate(
    dataset,
    metrics=metrics,
    llm=azure_model,
    embeddings=azure_embeddings,
    run_config=run_config,
)


Evaluating: 100%|██████████| 10/10 [01:16<00:00,  7.65s/it]


In [51]:
from ragas.run_config import RunConfig

# Same settings as the earlier RunConfig cell
timeout = 240
max_retries = 10

# Defined here but not passed to RunConfig below
thread_timeout = 240
max_wait = 120

run_config = RunConfig(
    timeout=timeout,
    max_retries=max_retries,
    # thread_timeout=thread_timeout
)

result = evaluate(
    amnesty_qa["eval"],
    metrics=metrics,
    llm=azure_model,
    embeddings=azure_embeddings,
    run_config=run_config,
)

result


AttributeError: 'AzureOpenAI' object has no attribute 'set_run_config'
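
The error says the object passed as `llm` has no `set_run_config` method, and it names the class `AzureOpenAI` rather than the `AzureChatOpenAI` instance created earlier. This suggests that `azure_model` was rebound to a different client before this cell ran. A minimal pre-flight check, sketched here with hypothetical stand-in classes (`WrappedModel`, `RawClient`) and a hypothetical helper `require_method`, fails early with a clearer message:

```python
def require_method(obj: object, name: str) -> None:
    """Raise TypeError if obj lacks a callable attribute with the given name."""
    if not callable(getattr(obj, name, None)):
        raise TypeError(
            f"{type(obj).__name__} has no callable '{name}'; "
            "check which object the variable currently points to"
        )


class WrappedModel:
    """Hypothetical stand-in for a model object that accepts a run config."""

    def set_run_config(self, run_config: object) -> None:
        self.run_config = run_config


class RawClient:
    """Hypothetical stand-in for a client without set_run_config."""


require_method(WrappedModel(), "set_run_config")  # passes silently

try:
    require_method(RawClient(), "set_run_config")
except TypeError as err:
    print(err)
```

Printing `type(azure_model)` just before the `evaluate` call is an even simpler way to confirm which object is actually being passed.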