# Evaluation of Large Language Models (LLMs) for Foundational Nature AI

This notebook outlines a structured approach to evaluating large language models (LLMs) as prototypes for a Foundational Nature AI. This initiative aims to leverage LLMs to inform questions about biodiversity and conservation, crucial for making real-world decisions. The concept emerged from the OpenNature initiative, a key output of the Nature Data Unlocked workshop held in Cambridge in late May. Proposed by Drew Purves at Google DeepMind, with Vizzuality offering technical implementation support, this project seeks to explore the potential of LLMs in environmental science and policy.

The evaluation employs three distinct systems, each designed to test the capabilities and flexibility of LLMs under different conditions, specifically tailored to address issues related to biodiversity and conservation. The goal is to understand how these models perform in scenarios ranging from zero-shot learning to the utilization of customized information, with a focus on their application in environmental decision-making. Below is a detailed description of each system:

- **System 1 (S1) - Zero-Shot Learning:** This approach evaluates the LLM's ability to understand and respond to queries about biodiversity and conservation without any prior specific training or context. It tests the model's inherent knowledge and its capacity to infer answers based on that knowledge, crucial for addressing unexpected or novel environmental issues.

- **System 2 (S2) - Retrieval Augmented Generation:** This system introduces a default dossier or a set of background information related to environmental science into the prompt. The purpose is to assess how well the LLM can utilize additional context provided at the time of the query to enhance its responses. This tests the model's ability to integrate new, relevant information with its pre-existing knowledge base, a key factor in providing accurate and useful information for conservation efforts. The use of techniques like Retrieval Augmented Generation (RAG) could potentially enhance this process by dynamically retrieving and incorporating external knowledge sources.

- **System 3 (S3) - Custom Dossier:** Unlike S2, which uses a generic set of information, S3 involves creating a custom dossier tailored to the specific query or topic at hand, related to biodiversity and conservation. This method aims to evaluate the LLM's performance when given highly relevant and targeted information, testing its ability to process and apply specific details to generate accurate and context-aware responses, essential for informed decision-making in environmental conservation.

The evaluation of LLMs through these systems will provide insights into their adaptability, information processing capabilities, and overall performance in diverse scenarios related to environmental science and policy. This structured approach allows for a comprehensive assessment of LLMs as prototypes for a Foundational Nature AI, highlighting their strengths and areas for improvement in contributing to biodiversity and conservation efforts.

## Table of Contents

1. [System 1 (S1) - Zero-Shot Learning](#s1)
2. [System 2 (S2) - Retrieval Augmented Generation](#s2)
3. [System 3 (S3) - Custom Dossier](#s3)

## Setup

### Library import


In [1]:
import os
from typing import Type, Union

import requests
from dotenv import load_dotenv
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.document_loaders import PyPDFLoader
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.pydantic_v1 import BaseModel, Field
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.tools import BaseTool
from langchain_chroma import Chroma
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

**Add API Keys to environment variables**

In [10]:
load_dotenv()

os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
os.environ["TAVILY_API_KEY"] = os.getenv("TAVILY_API_KEY")
os.environ["INATURALIST_API_KEY"] = os.getenv("INATURALIST_API_KEY")

<a id='s1'></a>
## 1. System 1 (S1) - Zero-Shot Learning

### Description

System 1 evaluates the LLM's ability to understand and respond to queries about biodiversity and conservation without any prior specific training or context. This approach tests the model's inherent knowledge and its capacity to infer answers based on that knowledge, crucial for addressing unexpected or novel environmental issues.

**Create the prompt template:** 

In [11]:
system_template = """You are a Foundational Nature AI capable of informing questions about
biodiversity and conservation relevant for real-world decisions."""
system_message_prompt = SystemMessagePromptTemplate.from_template(system_template)

human_template = "{text}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

chat_prompt_template = ChatPromptTemplate.from_messages(
    [system_message_prompt, human_message_prompt]
)

**Initialize the chat model**

In [12]:
llm_open_ai = ChatOpenAI(model="gpt-4o", max_tokens=1024)

**Invoke the chat model with the prompt**

In [19]:
prompt = chat_prompt_template.format_prompt(
    text="""
    Can you tell me if the Fosa (Cryptoprocta ferox) can be found in:
    Parque Nacional de Gorongosa, Mozambique
    Vilankulo, Mozambique
    Betroka, Madagascar
    Ambalabongo, Madagascar
    Please respond with one of the following 5 classes of probability that the species will be found there: almost certainly not, unlikely, possibly ,likely, almost certainly.
    Please provide your response in a tabular format with justification text after the table.
    
    """
).to_messages()

response = llm_open_ai.invoke(prompt)
print(response.content)

| Location                      | Probability              |
|-------------------------------|--------------------------|
| Parque Nacional de Gorongosa  | Almost certainly not     |
| Vilankulo, Mozambique         | Almost certainly not     |
| Betroka, Madagascar           | Likely                   |
| Ambalabongo, Madagascar       | Likely                   |

### Justification:

1. **Parque Nacional de Gorongosa, Mozambique**:
   - The fosa (Cryptoprocta ferox) is endemic to Madagascar and does not naturally occur outside the island. Therefore, it is almost certainly not found in Parque Nacional de Gorongosa in Mozambique.

2. **Vilankulo, Mozambique**:
   - Similar to Parque Nacional de Gorongosa, Vilankulo is located in Mozambique, a country on the African continent. The fosa is endemic to Madagascar, making it almost certainly not present in Vilankulo.

3. **Betroka, Madagascar**:
   - Betroka is located in Madagascar, within the range where the fosa is known to inhabit. The fo

In [14]:
prompt = chat_prompt_template.format_prompt(
    text="""
    If I want to avoid disturbing the habitat of the Yangtze Giant Softshell Turtle (Rafetus swinhoei), where is it likely to be found and therefore where should I avoid.
    """
).to_messages()

response = llm_open_ai.invoke(prompt)
print(response.content)

The Yangtze Giant Softshell Turtle (Rafetus swinhoei), one of the world's most critically endangered turtles, historically inhabited various parts of China and Vietnam. As of recent reports, only a few individuals are known to exist, and their locations are highly restricted.

Key areas to avoid to prevent disturbing their habitat include:

1. **Dong Mo Lake, Vietnam**: This is one of the known habitats with a confirmed individual.
2. **Xuan Khanh Lake, Vietnam**: Another location where an individual was confirmed in recent years.
3. **Zoos or conservation facilities**: Some individuals may be in captivity for breeding and conservation efforts, primarily in China.

Given the extreme rarity of the species and the ongoing conservation efforts, it's crucial to stay updated with the latest information from conservation organizations and local authorities if you are near these areas. Avoiding these specific lakes and closely following guidelines from environmental agencies can help mitigate

<a id='s2'></a>
## 2. System 2 (S2) - [Retrieval Augmented Generation](https://python.langchain.com/v0.2/docs/tutorials/rag/)

### Description

System 2 introduces a default dossier or a set of background information related to environmental science into the prompt. The purpose is to assess how well the LLM can utilize additional context provided at the time of the query to enhance its responses. This tests the model's ability to integrate new, relevant information with its pre-existing knowledge base, a key factor in providing accurate and useful information for conservation efforts. The use of techniques like Retrieval Augmented Generation (RAG) could potentially enhance this process by dynamically retrieving and incorporating external knowledge sources.


### Key documents for the default dossier:

For System 2 (S2) we should provide a comprehensive and general set of background information related to biodiversity and conservation. Here are some key documents and sources that could serve as a robust dossier for this purpose:

1. **The Convention on Biological Diversity (CBD) - Global Biodiversity Outlook 5 (GBO-5)**:
   - This document provides a thorough overview of the state of biodiversity globally, including key indicators, trends, and policy recommendations. It offers a broad context and is a reliable source of up-to-date information.
   - [Global Biodiversity Outlook 5](https://www.cbd.int/gbo/gbo5/publication/gbo-5-en.pdf)

2. **The Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) - Global Assessment Report**:
   - This report assesses the state of biodiversity and ecosystem services worldwide, offering valuable insights into the drivers of biodiversity loss and potential policy responses.
   - [IPBES Global Assessment Report](https://ipbes.net/global-assessment)

3. **The International Union for Conservation of Nature (IUCN) - Red List of Threatened Species**:
   - The IUCN Red List provides comprehensive data on the global conservation status of animal, fungi, and plant species. It is an essential resource for understanding the risks faced by various species and the efforts needed to conserve them.
   - [IUCN Red List](https://www.iucnredlist.org/)

4. **The Millennium Ecosystem Assessment (MEA)**:
   - This assessment evaluates the consequences of ecosystem change for human well-being and establishes the scientific basis for actions needed to enhance the conservation and sustainable use of ecosystems.
   - [Millennium Ecosystem Assessment](https://www.millenniumassessment.org/en/index.html)

5. **The United Nations Environment Programme (UNEP) - Global Environment Outlook (GEO-6)**:
   - GEO-6 offers an extensive review of the global environmental situation, including biodiversity, and provides an analysis of the progress made in meeting internationally agreed environmental goals.
   - [UNEP GEO-6](https://www.unep.org/resources/global-environment-outlook-6)

6. **The World Wildlife Fund (WWF) - Living Planet Report**:
   - This report provides an annual snapshot of the state of the world's biodiversity based on the Living Planet Index, highlighting the pressures faced by natural ecosystems and the urgent need for conservation efforts.
   - [WWF Living Planet Report](https://livingplanet.panda.org/)

Using these documents, we can compile a dossier that includes key sections and summaries from each source, offering a well-rounded set of background information to support the LLM in generating informed and contextually relevant responses to biodiversity and conservation queries.

### Example queries

To test if the LLM has acquired knowledge from the Global Biodiversity Outlook 5 (GBO-5), we can frame the queries around the specific content and findings of the report. Here are some example queries:

1. **Key Findings and Trends:**
   - "What are the current global trends in biodiversity loss and what are the major factors contributing to these trends?"
   - "What has been the overall progress towards global biodiversity targets in recent years?"

2. **Biodiversity Targets:**
   - "What are the major international biodiversity targets and how successful have global efforts been in meeting these targets?"
   - "Can you describe the status and progress of the Aichi Biodiversity Targets?"

3. **Drivers of Biodiversity Loss:**
   - "What are the main drivers of biodiversity loss identified in recent global assessments?"
   - "How does climate change impact biodiversity on a global scale?"

4. **Conservation Efforts:**
   - "What strategies and actions are recommended to enhance biodiversity conservation worldwide?"
   - "How effective are protected areas in conserving biodiversity and what improvements are suggested?"

5. **Policy and Recommendations:**
   - "What policy measures are recommended to improve biodiversity conservation and integrate it into economic and development planning?"
   - "What are some key policy recommendations for achieving sustainable biodiversity management?"

6. **Regional Analysis:**
   - "What are the biodiversity trends and challenges in different world regions?"
   - "Can you discuss specific challenges and opportunities for biodiversity conservation in [specific region]?"

7. **Future Projections:**
   - "What are the projected scenarios for global biodiversity under different future conditions?"
   - "How might global biodiversity change by 2050 under various conservation and development scenarios?"

8. **Indicators and Data:**
   - "What indicators are commonly used to assess the state of global biodiversity?"
   - "What is the significance of the Living Planet Index in monitoring biodiversity trends?"

These queries are designed to elicit responses that reflect the content and insights provided in the GBO-5 report, without directly referencing the report itself. This will help assess whether the LLM has internalized and can apply the information from the dossier in a meaningful way.

### [Build a RAG system](https://python.langchain.com/v0.2/docs/tutorials/pdf_qa/)

A typical RAG application has two main components:

**Indexing**: a pipeline for ingesting data from a source and indexing it. This usually happens offline.

**Retrieval and generation**: the actual RAG chain, which takes the user query at run time and retrieves the relevant data from the index, then passes that to the model.

#### **Indexing**
1. **Load:** First we need to load our data. This is done with [Document Loaders](https://python.langchain.com/v0.2/docs/concepts/#document-loaders).

In [5]:
loader = PyPDFLoader(file_path="../data/retrieval_dossier/IPBES GLOBAL REPORT_FULL_DIGITAL_NOV 2021.pdf")
docs = loader.load()
print(len(docs))

1148


2. **Split:** [Text splitters](https://python.langchain.com/v0.2/docs/concepts/#text-splitters) break large `Documents` into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won't fit in a model's finite context window.

In [6]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

3. **Store:** We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a [VectorStore](https://python.langchain.com/v0.2/docs/concepts/#vector-stores) and [Embeddings](https://python.langchain.com/v0.2/docs/concepts/#embedding-models) model.

In [7]:
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)

#### **Retrieval and generation**
4. **Retrieve:** Given a user input, relevant splits are retrieved from storage using a 
[Retriever](https://python.langchain.com/v0.2/docs/concepts/#retrievers).

In [8]:
retriever = vectorstore.as_retriever()

5. **Generate:** A [ChatModel](https://python.langchain.com/v0.2/docs/concepts/#chat-models) / 
[LLM](https://python.langchain.com/v0.2/docs/concepts/#llms) produces an answer using a prompt that 
includes the question and the retrieved data

    We'll use some built-in helpers to construct the final rag_chain:

In [9]:
system_prompt = (
    "You are a Foundational Nature AI capable of informing questions about "
    "biodiversity and conservation relevant for real-world decisions. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. "
    "\n\n"
    "{context}"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

In [24]:
llm_open_ai = ChatOpenAI(model="gpt-4o", max_tokens=1024)

question_answer_chain = create_stuff_documents_chain(llm_open_ai, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)

results = rag_chain.invoke(
    {
        "input":
            "Can you tell me if the Fosa (Cryptoprocta ferox) can be found in:"
            "Parque Nacional de Gorongosa, Mozambique"
            "Vilankulo, Mozambique"
            "Betroka, Madagascar"
            "Ambalabongo, Madagascar"
            "Please respond with one of the following 5 classes of probability that the species will be found there: almost certainly not, unlikely, likely, almost certainly."
            "Please provide your response in a tabular format with justification text and sources of information after the table."
    }
)

results

{'input': 'Can you tell me if the Fosa (Cryptoprocta ferox) can be found in:Parque Nacional de Gorongosa, MozambiqueVilankulo, MozambiqueBetroka, MadagascarAmbalabongo, MadagascarPlease respond with one of the following 5 classes of probability that the species will be found there: almost certainly not, unlikely, possibly ,likely, almost certainly.Please provide your response in a tabular format with justification text after the table.',
 'context': [Document(metadata={'page': 355, 'source': '../data/raw/IPBES GLOBAL REPORT_FULL_DIGITAL_NOV 2021.pdf'}, page_content='30, no. 4 (August 1, 2016): 783–91.  https://\ndoi.org/10.1111/cobi.12664\nMoodley, Yoshan, Isa-Rita M. Russo, \nDesiré L. Dalton, Antoinette Kotzé, \nShadrack Muya, Patricia Haubensak, \nBoglárka Bálint, et al.  “Extinctions, \nGenetic Erosion and Conservation Options \nfor the Black Rhinoceros (Diceros Bicornis).” \nScientific Reports  7 (February 2017): \n41417.  https://doi.org/10.1038/srep41417\nMoore, Jonathan Harry, 

In [25]:
print(results.get("answer"))

| Location                     | Probability            |
|------------------------------|------------------------|
| Parque Nacional de Gorongosa, Mozambique | Almost certainly not |
| Vilankulo, Mozambique           | Almost certainly not |
| Betroka, Madagascar              | Almost certainly     |
| Ambalabongo, Madagascar          | Almost certainly     |

**Justification:**

- **Parque Nacional de Gorongosa, Mozambique:**
  The Fosa (Cryptoprocta ferox) is an endemic species to Madagascar and is not found outside the island. Therefore, it is almost certainly not present in Parque Nacional de Gorongosa in Mozambique.

- **Vilankulo, Mozambique:**
  Similar to Parque Nacional de Gorongosa, Vilankulo is located in Mozambique, and given that the Fosa is endemic to Madagascar, it is almost certainly not found in Vilankulo.

- **Betroka, Madagascar:**
  Betroka is a region in Madagascar. Since the Fosa is endemic to Madagascar and widely distributed across various habitats on the islan

<a id='s3'></a>
## 3. System 3 (S3) - Custom Dossier

### Description

System 3 involves creating a custom dossier tailored to the specific query or topic at hand, related to biodiversity and conservation. This method aims to evaluate the LLM's performance when given highly relevant and targeted information, testing its ability to process and apply specific details to generate accurate and context-aware responses, essential for informed decision-making in environmental conservation.

### [Tool calling agent](https://python.langchain.com/v0.1/docs/modules/agents/agent_types/tool_calling/)

One possibility to create a custom "dossier" is to use a tool calling agent. This agent can interact with external tools or APIs to gather relevant information based on the query. For example, the agent could access databases of species information, conservation projects, or environmental policies to provide detailed and up-to-date data for the LLM to generate responses.

### Example APIs

Here are several APIs available that provide information about the location and distribution of species. Here are a few notable ones:

1. **GBIF (Global Biodiversity Information Facility) API**:
   - The GBIF API provides access to a vast amount of biodiversity data, including species occurrences and distributions.
   - [GBIF API Documentation](https://www.gbif.org/developer/summary)

2. **iNaturalist API**:
   - iNaturalist is a platform where users can share observations of biodiversity. The API allows access to data about species observations.
   - [iNaturalist API Documentation](https://www.inaturalist.org/pages/api+reference)

3. **Map of Life API**:
   - Map of Life integrates diverse data sources to provide information on species distributions and biodiversity patterns.
   - [Map of Life API Documentation](https://mol.org/api/)

4. **eBird API**:
   - eBird is a platform for bird watchers to record sightings. The API provides data on bird species distributions and observations.
   - [eBird API Documentation](https://documenter.getpostman.com/view/664302/S1ENwy59)

5. **VertNet API**:
   - VertNet provides access to vertebrate biodiversity data, including specimen records and observation data.
   - [VertNet API Documentation](https://github.com/VertNet/webapp/wiki/Introduction-to-the-VertNet-API)

These APIs can provide detailed information about species' locations and distribution patterns. We can use them to query specific species and regions to gather the data we need.


### [Build Tool Calling Agent](https://blog.langchain.dev/tool-calling-with-langchain/)

LLMs can interact with external data sources via tool calling functionality. Tool calling is a powerful technique that allows developers to build sophisticated applications that can leverage LLMs to access, interact and manipulate external resources like databases, files and APIs.

A typical tool calling agent has three main components:

**Initialize Tools:** First, we need to set up the tools that the agent will interact with.

**Create Agent:** Next, we create the agent that will call these tools based on the user's query.

**Invoke Agent:** Finally, we invoke the agent with a query to retrieve the relevant information from the external tools.

Here's a step-by-step guide to building a tool calling agent using Langchain:

#### **Create Tools**

This demo uses [Tavily](https://app.tavily.com/) search engine and iNaturalist API, but you can also swap in any other [built-in tool](https://python.langchain.com/v0.1/docs/integrations/tools/) or add [custom tools](https://python.langchain.com/v0.1/docs/modules/tools/custom_tools/). 

**Create Tavily search tool**

You'll need to [sign up](https://app.tavily.com/sign-in) for an API key and add it to the environment variables.

In [6]:
search_tool = TavilySearchResults(max_results=1)

**Create iNaturalist tool**

You'll need to [sign up](https://www.inaturalist.org/users/sign_up) for an API key and add it to the environment variables.

In [7]:
class RequestInput(BaseModel):
    """
    Request input class for the iNaturalist tool.
    """

    taxon: Union[str, int] = Field(description="Taxon name or ID of a species to search for.")


class INaturalistTool(BaseTool):
    """
    Tool for querying the iNaturalist API to get information about places
    where a species can be found.
    """

    name = "inaturalist"
    description = "This tool is useful for when you need to get information about places where a"
    "species can be found."
    args_schema: Type[BaseModel] = RequestInput
    api_token = os.getenv("INATURALIST_API_KEY")

    def _run(self, taxon):
        headers = {"Authorization": f"Bearer {self.api_token}"}
        params = {
            "taxon": taxon,
        }
        response = requests.get(
            "https://www.inaturalist.org/places.json", headers=headers, params=params
        )
        if response.status_code == 200:
            data = response.json()
            filtered_data = [
                {
                    "name": item["name"],
                    "latitude": item["latitude"],
                    "longitude": item["longitude"],
                    "place_type_name": item["place_type_name"],
                }
                for item in data
            ]
            return filtered_data
        else:
            return {"error": response.status_code, "message": response.text}


# Initialize the tool with your iNaturalist API token
inat_tool = INaturalistTool()

#### **Create Agent**
Next, let's initialize our tool calling agent

In [12]:
llm_open_ai = ChatOpenAI(model="gpt-4-turbo", max_tokens=1024)

In [19]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a Foundational Nature AI capable of informing questions about"
            "and conservation relevant for real-world decisions. If you don't get any result from"
            "the inaturalist tool use the tavily_search_results_json tool.",
        ),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ]
)

tools = [search_tool, inat_tool]

# Construct the Tools agent
agent = create_tool_calling_agent(llm_open_ai, tools, prompt)

#### **Run Agent**
Now, let's initialize the executor that will run our agent and invoke it!

In [20]:
# Create an agent executor by passing in the agent and tools
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

In [21]:
response = agent_executor.invoke(
    {
        "input": """Can you tell me if the Fosa (Cryptoprocta ferox) can be found in:
Parque Nacional de Gorongosa, Mozambique
Vilankulo, Mozambique
Betroka, Madagascar
Ambalabongo, Madagascar
Please respond with a score for each location between 0 and 1, where 1 indicates that you think
the species is very likely to be present there.
"""
    }
)



[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3m
Invoking: `inaturalist` with `{'taxon': 'Cryptoprocta ferox'}`


[0m[33;1m[1;3m[{'name': 'Madagascar', 'latitude': '-19.2909168493', 'longitude': '46.7211052153', 'place_type_name': 'Country'}, {'name': 'Antsiranana', 'latitude': '-13.7810048764', 'longitude': '49.3663842667', 'place_type_name': 'State'}, {'name': 'Antananarivo', 'latitude': '-18.9357201709', 'longitude': '46.8045208501', 'place_type_name': 'State'}, {'name': 'Fianarantsoa', 'latitude': '-21.8663086766', 'longitude': '46.8886047965', 'place_type_name': 'State'}, {'name': 'Mahajanga', 'latitude': '-16.5077797698', 'longitude': '46.6446854389', 'place_type_name': 'State'}, {'name': 'Toamasina', 'latitude': '-17.8005654114', 'longitude': '48.7995072598', 'place_type_name': 'State'}, {'name': 'Toliary', 'latitude': '-22.6278273273', 'longitude': '44.9957564578', 'place_type_name': 'State'}, {'name': 'Itasy', 'latitude': '-19.0538111477', 'longitude': '46.8915

In [22]:
print(response.get("output"))

Based on the available information, here is the likelihood of finding the Fosa (Cryptoprocta ferox) in each of the specified locations:

1. **Parque Nacional de Gorongosa, Mozambique**: The information obtained does not specifically mention the presence of Fosa in Gorongosa National Park. Given that Fosa are endemic to Madagascar, the likelihood of their presence in Gorongosa is extremely low. **Score: 0.05**

2. **Vilankulo, Mozambique**: Similar to Gorongosa, there is no specific mention of Fosa in Vilankulo, and considering that Fosa are not native to Mozambique, the likelihood of finding them here is also very low. **Score: 0.05**

3. **Betroka, Madagascar**: The information indicates that the Fosa is the largest wild carnivore in Madagascar, with a geographical range covering the entire island, including all remaining forests. Given this broad range and Betroka's location within Madagascar, there is a high likelihood of Fosa presence. **Score: 0.9**

4. **Ambalabongo, Madagascar**