In [30]:
import os

import pandas as pd
from openai import OpenAI

def load_csvs(file_paths):
    """
    Load multiple CSV files into a dictionary of DataFrames.
    
    Parameters:
        file_paths (list): List of file paths to CSVs.
    
    Returns:
        dict: A dictionary where keys are the file paths as given and values are DataFrames.
    """
    data = {}
    for path in file_paths:
        try:
            df = pd.read_csv(path)
            data[path] = df
        except Exception as e:
            print(f"Error loading {path}: {e}")
    return data

def csvs_to_prompt(data):
    """
    Convert CSV data into a readable format for ChatGPT.
    
    Parameters:
        data (dict): A dictionary where keys are file paths and values are DataFrames.
    
    Returns:
        str: The first five rows of each DataFrame, rendered as text under one heading per file.
    """
    summaries = []
    for name, df in data.items():
        summaries.append(f"### {name}:\n{df.head().to_string(index=False)}\n")
    return "\n".join(summaries)

def chat_with_gpt(api_key, prompt, system_prompt=None, csv_data=None, model="gpt-4o", temperature=0.7, max_tokens=1000):
    """
    Interact with OpenAI's ChatGPT API, optionally including CSV data.

    Parameters:
        api_key (str): Your OpenAI API key.
        prompt (str): The user prompt.
        system_prompt (str): The system prompt to guide the assistant's behavior.
        csv_data (dict): Dictionary mapping file paths to DataFrames, as returned by load_csvs.
        model (str): The model to use.
        temperature (float): Controls randomness in the output.
        max_tokens (int): Maximum number of tokens for the response.

    Returns:
        str: The response text from the model.
    """
    # Build the message list: optional system prompt, optional data context, then the user prompt.
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    if csv_data:
        csv_summary = csvs_to_prompt(csv_data)
        messages.append({"role": "system", "content": f"The following datasets are available:\n{csv_summary}"})
    messages.append({"role": "user", "content": prompt})


    # If api_key is None, the client falls back to the OPENAI_API_KEY environment variable.
    client = OpenAI(api_key=api_key)

    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens
    )

    return response.choices[0].message.content

from IPython.display import Markdown, display

def show_markdown(md_text):
    """
    Display the given Markdown text in a Jupyter Notebook.
    
    Parameters:
        md_text (str): Markdown text to render.
    """
    display(Markdown(md_text))
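
`csvs_to_prompt` sends only `df.head()` for each file, so the model sees at most five rows per CSV. The sketch below shows one possible way to give it more context without sending every row: row and column counts, column types, and summary statistics for numeric columns. The helper name `summarize_frame`, its `max_rows` parameter, and the toy data are hypothetical and not part of the notebook above.

```python
import pandas as pd


def summarize_frame(name, df, max_rows=5):
    """Hypothetical helper: describe one DataFrame more fully than head() alone."""
    parts = [
        f"### {name}",
        f"rows: {len(df)}, columns: {len(df.columns)}",
        "column types: "
        + ", ".join(f"{col} ({dtype})" for col, dtype in df.dtypes.items()),
        f"first {min(max_rows, len(df))} rows:",
        df.head(max_rows).to_string(index=False),
    ]
    # Add count/mean/min/max etc. for numeric columns, if there are any.
    numeric = df.select_dtypes(include="number")
    if not numeric.empty:
        parts.append("numeric column summary:")
        parts.append(numeric.describe().to_string())
    return "\n".join(parts)


# Toy data for illustration only; these are not values from the real CSVs.
example = pd.DataFrame({"site": ["a", "b", "c"], "loss_ha": [1.0, 2.5, 0.0]})
summary = summarize_frame("example.csv", example)
print(summary)
```

A summary like this costs more prompt tokens than `head()` alone, so whether it is worth using depends on the size of the CSVs and the model's context limit.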


In [36]:
API_KEY = os.environ.get("OPENAI_API_KEY")

SYSTEM_PROMPT = """
You are a helpful data analysis assistant that produces insights against KBAs, supported by data analysis from the provided csvs.
Your responses should be scientific and in the form of a summary report, avoiding bullet points. Format place names in bold. Format species names in italics.

Do not comment on CO2e emissions or removals.

Begin by describing the portfolio as a whole using information from the siteDescription, habitatDescription, landUseRegimesAtSite and rationaleForSiteInformation columns.
Include species information (names, whether endangered) and dominant habitats in this summary.

Then consider the portfolio of KBAs as a whole and comment on threats to biodiversity, highlighting the most at-risk species, and the biggest threats they face using threatsDescription.
You should include data provided in annual_loss and totals to calculate and report on tree cover loss and loss intensity (loss/extent) in order to justify statements.

Finally, close by naming the KBA most at risk based on species importance and deforestation intensity."""

USER_PROMPT = "Describe the portfolio of KBAs. This should be a single summary report that describes the portfolio in general and includes the species and habitats present, threats to biodiversity backed by calculations, and recommended actions."

# Exported from https://docs.google.com/spreadsheets/d/1x4VTm2eCJOlj3_0Hf4hLqQgvZavfrHkFGzHRd9q25ew/edit?gid=1817969311#gid=1817969311
CSV_FILES = ["./portfolio.csv", "./annual_loss.csv", "./totals.csv"]

# Load and prepare data
csv_data = load_csvs(CSV_FILES)

# Interact with ChatGPT
response = chat_with_gpt(api_key=API_KEY, prompt=USER_PROMPT, system_prompt=SYSTEM_PROMPT, csv_data=csv_data)
if response:
    print("ChatGPT response:\n\n")
    show_markdown(response)

In [34]:
show_markdown(response)

The portfolio of Key Biodiversity Areas (KBAs) in **Peru** encompasses a diverse range of ecosystems and species, each with unique environmental characteristics and conservation needs. The KBAs under consideration include **Lomas de Ilo**, **Vista Alegre-Omia**, **Río Abiseo National Park**, **Inchatoshi Kametza - Pucuta**, and **Tres Quebradas y Shitariyacu**.

**Lomas de Ilo** is characterized by its arid and temperate climate with minimal precipitation, supporting ecosystems such as the Coastal desert and Loma. This KBA is home to the endangered cactus species *Corryocactus brachypetalus* and the bird *Sternula lorata*. **Vista Alegre-Omia** features montane Yungas forests and pluvial altitudinal montane forests, with a climate that is temperate and rainy year-round. It supports critically endangered species like the primate *Lagothrix flavicauda* and several bird species. **Río Abiseo National Park** is notable for its significant montane and altimontane Yunga forests and is home to vulnerable plant species such as *Laccopetalum giganteum* and *Puya medica*, as well as several range-restricted bird and amphibian species. **Inchatoshi Kametza - Pucuta** encompasses various Yunga forest types and supports critically endangered species like *Lagothrix flavicauda* and *Pristimantis boundites*. Lastly, **Tres Quebradas y Shitariyacu** is dominated by lower montane rainforest and high hill forest, providing habitat to the critically endangered primate *Plecturocebus oenanthe*.

The threats to biodiversity across these KBAs are varied and significant. Roads, agriculture, logging, and livestock farming are common threats that contribute to habitat alteration and biodiversity loss. **Lomas de Ilo** faces threats from road development, potentially impacting the delicate desert ecosystem and its unique species. **Vista Alegre-Omia** and **Río Abiseo National Park** are both threatened by logging, agriculture, and livestock activities, which could result in habitat degradation. **Inchatoshi Kametza - Pucuta** is particularly threatened by livestock farming and hunting, jeopardizing its diverse mammalian and avian populations. **Tres Quebradas y Shitariyacu** faces wood harvesting and livestock farming threats.

A quantitative analysis of deforestation provides further insight into these threats. For instance, **Lomas de Ilo** has reported no tree cover loss, indicating minimal deforestation pressure. In contrast, **Vista Alegre-Omia** has experienced significant deforestation, with a total loss of 56.95 hectares, and **Río Abiseo National Park** has faced a total tree cover loss of 20.15 hectares. **Inchatoshi Kametza - Pucuta** and **Tres Quebradas y Shitariyacu** have smaller losses of 10.53 and 4.42 hectares, respectively. The loss intensity, calculated as loss/extent, reveals that **Vista Alegre-Omia** has the highest deforestation pressure, suggesting a need for urgent conservation action.

Given the data, **Vista Alegre-Omia** stands out as the KBA most at risk due to its significant deforestation intensity and the presence of critically endangered species. Conservation efforts should prioritize implementing stricter protections against logging and promoting sustainable land use practices to preserve the biodiversity of this area. Additionally, enhancing management practices and engaging local communities in conservation efforts can help mitigate the threats faced by these vital ecosystems.
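
The report ranks sites by loss intensity (loss/extent) but does not show the arithmetic, and the model was given only the first five rows of each CSV. A minimal sketch of how the ranking could be checked locally with pandas follows. The column names `loss_ha` and `extent_ha`, the function name `loss_intensity`, and the toy values are all assumptions; the real `totals.csv` may use different headers.

```python
import pandas as pd


def loss_intensity(totals, loss_col="loss_ha", extent_col="extent_ha"):
    """Return a copy of `totals` with a loss / extent column, highest first.

    Column names are assumptions; rename them to match the real totals.csv.
    """
    out = totals.copy()
    # Treat a zero extent as missing so the division yields NaN, not an error or inf.
    extent = out[extent_col].where(out[extent_col] > 0)
    out["loss_intensity"] = out[loss_col] / extent
    return out.sort_values("loss_intensity", ascending=False, na_position="last")


# Placeholder values for illustration only; not taken from the notebook's data.
toy_totals = pd.DataFrame(
    {
        "site": ["site_a", "site_b", "site_c"],
        "loss_ha": [50.0, 20.0, 0.0],
        "extent_ha": [1000.0, 2000.0, 0.0],
    }
)
ranked = loss_intensity(toy_totals)
print(ranked[["site", "loss_intensity"]].to_string(index=False))
```

Computing the ratio in pandas and passing the resulting table to the model, rather than asking the model to do the division, keeps the numbers in the report traceable to the data.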