<img src="https://imagedelivery.net/Dr98IMl5gQ9tPkFM5JRcng/3e5f6fbd-9bc6-4aa1-368e-e8bb1d6ca100/Ultra" alt="Image description" width="160" />

<br/>

# Introduction to Contextual AI Platform using the API

Contextual AI lets you create and use generative AI agents. This notebook walks through an end-to-end workflow for creating a Retrieval-Augmented Generation (RAG) agent for a financial use case. The agent answers questions based on the documents provided but avoids forward-looking statements, e.g., "Tell me about sales in 2028." This notebook uses the API directly; a separate notebook covers the Python client.

This notebook covers the following steps:
- Creating a Datastore
- Ingesting Documents
- Creating a RAG Agent
- Querying a RAG Agent
- Evaluating a RAG Agent
- Improving the RAG Agent (updating the prompt and tuning)

With the exception of the tuning step, the notebook can be run in under 15 minutes.
The full documentation is available at [docs.contextual.ai](https://docs.contextual.ai/).

## Prerequisites:

- API key: contact Contextual AI's sales team to get your API key.

- Data files: this demo uses three files: a document to ingest, an evaluation dataset, and a training dataset. These are toy datasets that illustrate the functionality, built around a use case of training a RAG agent to avoid forward-looking statements.

      Ingestion: `Apple.pdf`

      Evaluation: `eval_short.csv`

      Training: `fin_train.jsonl`

## Setup of Libraries, API, Dataset

To begin, you will need an API key to securely access the API. To generate an API key, your admin can follow the process below:

1.   Log into your tenant at app.contextual.ai
2.   Click on "API Keys"
3.   Click on "Create API Key"
4.   Please keep your key in a secure place, and do not share it with anyone

In [2]:
import os
# Set up your API key here; this example reads it from an environment variable
API_KEY = os.environ["CONTEXTUAL_API_KEY"]
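
If the environment variable is missing, `os.environ[...]` raises a bare `KeyError`. A minimal sketch of a friendlier lookup is shown below; the helper name `load_api_key` is hypothetical and is not part of any Contextual AI library.

```python
import os

def load_api_key(env_var: str = "CONTEXTUAL_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise a descriptive error."""
    # Treat an unset variable and a whitespace-only value the same way
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "export it before starting the notebook."
        )
    return key
```

With this helper, the assignment above could be written as `API_KEY = load_api_key()`.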


In [3]:
REQUEST_URL = 'https://api.contextual.ai/v1'

In [4]:
# Import libraries
import requests
import json
from pathlib import Path
from typing import List, Optional, Dict
from IPython.display import display, JSON
import pandas as pd

In [5]:
# Helper function to define headers for API calls
def get_headers(content_type: str = "application/json") -> Dict[str, str]:
    """
    Generate headers for API requests
    """
    return {
        "accept": "application/json",
        "Content-Type": content_type,
        "Authorization": f"Bearer {API_KEY}"
    }

## Step 1: Create your Datastore


You first need to create a datastore for your agent using the `/datastores` endpoint. A datastore is secure storage for your data, and each agent has its own datastore.

In [6]:
def create_datastore(name: str) -> Dict:
    """
    Create a new datastore using the Contextual AI API.

    Args:
        name: Name of the datastore

    Returns:
        Dict: The JSON response from the API
    """
    url = f"{REQUEST_URL}/datastores"

    payload = {"name": name}

    try:
        response = requests.post(url, headers=get_headers(), json=payload)
        response.raise_for_status()
        return response.json()

    except requests.exceptions.RequestException as e:
        print(f"Error creating datastore: {e}")
        raise

In [None]:
try:
    result = create_datastore(name="Demo_fin_rag") #TODO: Set a name for your datastore
    datastore_id = result['id']
    print(f"Datastore ID: {datastore_id}")
except Exception as e:
    print(f"Failed to create datastore: {e}")

## Step 2: Ingest Documents into your Datastore

You can now ingest documents into your agent's datastore using the `/datastores/{datastore_id}/documents` endpoint. Documents must be PDF or HTML files.

This example uses a sample PDF. You can also use your own documents here.

In [8]:
def ingest_file_to_datastore(datastore_id: str, file_path: str):
   """
   Upload a local file to Contextual AI datastore

   Args:
       datastore_id: ID of the target datastore
       file_path: Path to local file to upload
   """
   try:
       url = f"{REQUEST_URL}/datastores/{datastore_id}/documents"

       with open(file_path, 'rb') as f:
           response = requests.post(
               url,
               headers={
                   'accept': 'application/json',
                   'authorization': f"Bearer {API_KEY}"
               },
               files={'file': f}
           )
           response.raise_for_status()
           print(f"Successfully uploaded {file_path} to datastore {datastore_id}")
           return response.json()

   except requests.exceptions.RequestException as e:
       print(f"Upload failed: {str(e)}")
       raise

In [None]:
result = ingest_file_to_datastore(datastore_id, 'data/Apple.pdf')
document_id = result['id']
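
To upload several documents at once, one option is to collect the PDF and HTML files in a folder first. The sketch below is illustrative only: the helper name `find_ingestible_files` and the exact extension list are assumptions, not part of the API.

```python
from pathlib import Path
from typing import List

# Extensions assumed to correspond to the supported PDF and HTML formats
SUPPORTED_SUFFIXES = {".pdf", ".html", ".htm"}

def find_ingestible_files(directory: str) -> List[Path]:
    """Return files in `directory` whose extension looks like PDF or HTML, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )

# Usage (uncomment to upload every supported file in data/):
# for path in find_ingestible_files("data"):
#     ingest_file_to_datastore(datastore_id, str(path))
```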

Once ingested, you can view the list of documents, see their metadata, and also delete documents.

In [176]:
def get_document_metadata(datastore_id, document_id):
    """
    Fetch metadata for a specific document from a datastore.

    Parameters:
        datastore_id (str): The unique ID of the datastore.
        document_id (str): The unique ID of the document.

    Returns:
        dict: The document metadata if the request succeeds.
        None: If the request fails; the error is printed rather than raised.

    Usage:
        metadata = get_document_metadata("datastore123", "document456")
        if metadata:
            print("Metadata retrieved:", metadata)
        else:
            print("Failed to fetch metadata.")
    """
    url = f"{REQUEST_URL}/datastores/{datastore_id}/documents/{document_id}/metadata"
    headers = {
        "accept": "application/json",
        'authorization': f"Bearer {API_KEY}"
    }

    response = requests.get(url, headers=headers)

    if response.status_code == 200:
        return response.json()  # Assuming the response is in JSON format
    else:
        print(f"Error: Failed to fetch metadata (HTTP {response.status_code}).")
        print("Response:", response.text)
        return None

In [None]:
metadata = get_document_metadata(datastore_id, document_id)
metadata

## Step 3: Create your Agent

Next, let's create the agent and tailor it to our needs.


In [178]:
def create_agent(
    name: str,
    description: str,
    system_prompt: Optional[str] = None,
    datastore_ids: Optional[List[str]] = None
) -> Dict:
    """
    Create a new agent in Contextual AI

    Args:
        name: Name of the agent
        description: Description of the agent
        system_prompt: Optional instructions the agent references when generating responses
        datastore_ids: Optional list of datastore IDs

    Returns:
        JSON response from the API
    """
    url = f"{REQUEST_URL}/agents"

    payload = {
        "name": name,
        "description": description,
        "system_prompt": system_prompt,
        "datastore_ids": datastore_ids or []
    }

    response = requests.post(
        url,
        headers=get_headers(),
        json=payload
    )
    response.raise_for_status()
    return response.json()

Some additional parameters include setting a system prompt or using a previously tuned model.

`system_prompt` is used for the instructions that your RAG system references when generating responses. Note that we do not guarantee that the system will follow these instructions exactly.

`llm_model_id` is the optional model ID of a tuned model to use for generation. Contextual AI will use a default model if none is specified.
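
The `create_agent` helper above does not expose `llm_model_id`. A minimal sketch of building a request body that includes it only when set is shown below. The helper name `build_agent_payload` is hypothetical, and how the endpoint treats omitted fields is an assumption to verify against the API documentation.

```python
from typing import Dict, List, Optional

def build_agent_payload(
    name: str,
    description: str,
    system_prompt: Optional[str] = None,
    datastore_ids: Optional[List[str]] = None,
    llm_model_id: Optional[str] = None,
) -> Dict:
    """Assemble the JSON body for agent creation, leaving out unset optional fields."""
    payload: Dict = {
        "name": name,
        "description": description,
        "datastore_ids": datastore_ids or [],
    }
    # Only send optional fields that were actually provided
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    if llm_model_id is not None:
        payload["llm_model_id"] = llm_model_id
    return payload
```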


In [179]:
# Sample prompt
system_prompt = '''
You are an AI assistant specialized in financial analysis and reporting. Your responses should be precise, accurate, and sourced exclusively from official financial documentation provided to you. Please follow these guidelines:

Data Analysis & Response Quality:
* Only use information explicitly stated in provided documentation (e.g., earnings releases, financial statements, investor presentations)
* Present comparative analyses using structured formats with tables and bullet points where appropriate
* Include specific period-over-period comparisons (quarter-over-quarter, year-over-year) when relevant
* Maintain consistency in numerical presentations (e.g., consistent units, decimal places)
* Flag any one-time items or special charges that impact comparability

Technical Accuracy:
* Use industry-standard financial terminology
* Define specialized acronyms on first use
* Never interchange distinct financial terms (e.g., revenue, profit, income, cash flow)
* Always include units with numerical values
* Pay attention to fiscal vs. calendar year distinctions
* Present monetary values with appropriate scale (millions/billions)

Response Format:
* Begin with a high-level summary of key findings when analyzing data
* Structure detailed analyses in clear, hierarchical formats
* Use markdown for lists, tables, and emphasized text
* Maintain a professional, analytical tone
* Present quantitative data in consistent formats (e.g., basis points for ratios)

Critical Guidelines:
* Do not make forward-looking projections unless directly quoted from source materials
* Avoid opinions, speculation, or assumptions
* If information is unavailable or irrelevant, clearly state this without additional commentary
* Answer questions directly, then stop
* Do not reference source document names or file types in responses
* Focus only on information that directly answers the query

For any analysis, provide comprehensive insights using all relevant available information while maintaining strict adherence to these guidelines and focusing on delivering clear, actionable information.
'''


In [None]:
# Now let's try creating an agent
try:
    app_response = create_agent(
        name="Demo-Finance Forward Looking",
        description="Research Agent using only Historical Information",
        system_prompt=system_prompt,
        datastore_ids=[datastore_id]
    )
    # Store agent ID for later use
    agent_id = app_response['id']
    print(f"Agent ID created: {agent_id}")
except requests.exceptions.RequestException as e:
    print(f"Error creating agent: {e}")

## Step 4: Query your Agent

Let's query our agent to see if it's working by passing in a list of messages and reading the response.

The required fields are `agent_id` and `messages`.

Optional fields include `stream`, `conversation_id`, and `model_id` (for using a different fine-tuned model).

In [181]:
def query_agent(
   agent_id: str,
   messages: List[Dict[str, str]],
   model_id: Optional[str] = None,
   conversation_id: Optional[str] = None,
   stream: bool = False
):
   """
   Query a Contextual AI agent

   Args:
       agent_id: Agent ID of the agent to query
       messages: List of message dictionaries with 'content' and 'role' keys
       model_id: Optional model ID for specific fine-tuned model
       conversation_id: Optional conversation ID for message history
       stream: Whether to stream the response
   """
   try:
       url = f"{REQUEST_URL}/agents/{agent_id}/query"

       payload = {
           "messages": messages,
           "stream": stream
       }

       if model_id:
           payload["model_id"] = model_id
       if conversation_id:
           payload["conversation_id"] = conversation_id

       response = requests.post(
           url,
           headers=get_headers(),
           json=payload
       )
       response.raise_for_status()
       return response.json()

   except requests.exceptions.RequestException as e:
       print(f"Query failed: {str(e)}")
       raise

**Note:** It may take a few minutes for the document to be ingested and processed. The agent will give a detailed answer once the documents are ingested.
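
Rather than waiting a fixed amount of time, one option is to poll until processing finishes. The sketch below is a generic polling helper; `wait_until` is a hypothetical name, and the `"status"` key and `"completed"` value in the usage comment are assumptions about the metadata response, not confirmed API fields.

```python
import time
from typing import Any, Callable

def wait_until(
    fetch: Callable[[], Any],
    is_done: Callable[[Any], bool],
    interval_s: float = 10.0,
    timeout_s: float = 600.0,
) -> Any:
    """Call `fetch` repeatedly until `is_done(result)` is true or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout_s} seconds")
        time.sleep(interval_s)

# Hypothetical usage; check the real field names in the API documentation:
# wait_until(
#     fetch=lambda: get_document_metadata(datastore_id, document_id),
#     is_done=lambda meta: meta is not None and meta.get("status") == "completed",
# )
```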

In [None]:
result = query_agent(
    agent_id=agent_id,
    messages=[{
        # Input your question here
        "content": "what is Apple revenue in 2022",
        "role": "user",
    }]
)

print(result["message"]["content"])

In [None]:
result['retrieval_contents']
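
The raw retrieval output can be long. A small sketch for printing a compact preview follows; it assumes only that `retrieval_contents` is a list of JSON objects and makes no assumption about their field names. The helper name `preview_retrievals` is hypothetical.

```python
import textwrap
from typing import Any, Dict, List

def preview_retrievals(retrieval_contents: List[Dict[str, Any]], width: int = 60) -> List[str]:
    """Return one shortened 'key=value' line per retrieved record."""
    lines = []
    for i, record in enumerate(retrieval_contents):
        fields = ", ".join(f"{k}={v!r}" for k, v in record.items())
        # textwrap.shorten collapses whitespace and truncates to `width` characters
        lines.append(f"[{i}] " + textwrap.shorten(fields, width=width, placeholder="..."))
    return lines

# Usage:
# for line in preview_retrievals(result["retrieval_contents"]):
#     print(line)
```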

## Step 5: Evaluate your Agent



Evaluation endpoints allow you to evaluate your agent using a set of prompts (questions) and reference (gold) answers. We support two metrics: equivalence and groundedness.

Equivalence evaluates whether the agent response is equivalent to the ground truth (model-driven binary classification).  
Groundedness decomposes the agent response into claims and then evaluates whether each claim is grounded in the retrieved documents.

For those using unit tests, we also offer our `lmunit` endpoint; see [this post](https://contextual.ai/blog/lmunit/) for more details.

Let's start with an evaluation dataset:

In [None]:
# Named eval_df to avoid shadowing Python's built-in eval()
eval_df = pd.read_csv('data/eval_short.csv')
eval_df.head()
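
Before submitting an evaluation job, it can help to confirm that the CSV has the expected columns. The sketch below assumes the columns are named `prompt` and `reference`, matching the description of prompts and reference answers above; the exact column names and the helper name `missing_eval_columns` are assumptions to check against your file and the API documentation.

```python
import csv
from typing import List, Sequence

def missing_eval_columns(
    csv_path: str, required: Sequence[str] = ("prompt", "reference")
) -> List[str]:
    """Return the required column names that are absent from the CSV header."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # Read only the first row; an empty file yields an empty header
        header = next(csv.reader(f), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]

# Usage:
# print(missing_eval_columns("data/eval_short.csv"))
```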

Start an evaluation job that measures equivalence

In [185]:
def evaluate_with_csv(agent_id: str, csv_path: str, metrics: str = "equivalence"):
    """
    Evaluate an agent using a CSV file.

    Args:
        agent_id: ID of the agent to evaluate.
        csv_path: Path to the evaluation CSV file.
        metrics: Metrics to evaluate (e.g., "equivalence", "groundedness").

    Returns:
        dict: API response containing evaluation results or status.
    """
    url = f"https://api.contextual.ai/v1/agents/{agent_id}/evaluate"
    headers = {
        "accept": "application/json",
        "authorization": f"Bearer {API_KEY}"
    }
    files = {
        "evalset_file": (csv_path, open(csv_path, "rb"), "text/csv")
    }
    payload = {
        "metrics": metrics
    }

    try:
        response = requests.post(url, data=payload, files=files, headers=headers)
        response.raise_for_status()
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"Error during evaluation: {e}")
        raise
    finally:
        # Ensure file handle is closed properly
        files["evalset_file"][1].close()

In [None]:
# Example usage
csv_path = "data/eval_short.csv"
eval_result = evaluate_with_csv(agent_id, csv_path, metrics="equivalence")
print(eval_result)

Evaluation jobs can take time, especially for larger datasets. Here is how to check their status. This dataset usually takes a few minutes to run.

In [None]:
def get_evaluation_metadata(agent_id: str, job_id: str):
    """
    Retrieve evaluation metadata for a specific evaluation job.

    Args:
        agent_id: ID of the agent.
        job_id: ID of the evaluation job.

    Returns:
        dict: API response containing the evaluation metadata.
    """
    url = f"https://api.contextual.ai/v1/agents/{agent_id}/evaluate/jobs/{job_id}/metadata"
    headers = {
        "accept": "application/json",
        "authorization": f"Bearer {API_KEY}"
    }

    try:
        response = requests.get(url, headers=headers)
        response.raise_for_status()  # Raise an error for HTTP codes >= 400
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f"Error fetching evaluation metadata: {e}")
        raise

metadata = get_evaluation_metadata(agent_id, eval_result['id'])
metadata

View our evaluation results

In [207]:
def get_evaluation_dataset(agent_id: str, dataset_name: str, batch_size: int = 64):
    """
    Retrieve evaluation dataset.

    Args:
        agent_id: ID of the agent.
        dataset_name: Name of the dataset.
        batch_size: Size of the batch (default: 64).

    Returns:
        bytes: API response containing the evaluation dataset in octet-stream format.
    """
    url = f"https://api.contextual.ai/v1/agents/{agent_id}/datasets/evaluate/{dataset_name}"
    headers = {
        "accept": "application/octet-stream",
        "authorization": f"Bearer {API_KEY}"
    }
    
    params = {
        "batch_size": batch_size
    }

    try:
        response = requests.get(url, headers=headers, params=params)
        response.raise_for_status()  # Raise an error for HTTP codes >= 400
        return response.content  # Using .content since we expect octet-stream
    except requests.exceptions.RequestException as e:
        print(f"Error fetching evaluation dataset: {e}")
        raise


eval_dataset = get_evaluation_dataset(agent_id=agent_id, dataset_name=metadata['dataset_name'])

Let's load this into a pandas DataFrame and write it to CSV to make it friendlier.

In [None]:
import pandas as pd
import json
import ast

def parse_jsonl_to_df(content):
    """
    Parse JSONL bytes content into a pandas DataFrame, with flattened results.
    
    Args:
        content (bytes): JSONL content in bytes format
    
    Returns:
        pd.DataFrame: Parsed DataFrame
    """
    # Decode bytes to string and split into lines
    lines = content.decode('utf-8').strip().split('\n')
    
    # Parse each line and flatten the results
    data = []
    for line in lines:
        entry = json.loads(line)
        
        # Parse the results string and flatten
        if 'results' in entry:
            results = ast.literal_eval(entry['results'])
            # Remove the original results field
            del entry['results']
            # Add flattened results
            if isinstance(results, dict):
                for key, value in results.items():
                    if isinstance(value, dict):
                        for subkey, subvalue in value.items():
                            entry[f'{key}_{subkey}'] = subvalue
                    else:
                        entry[key] = value
        
        data.append(entry)
    
    # Create DataFrame
    return pd.DataFrame(data)

# Example usage:
df = parse_jsonl_to_df(eval_dataset)
df.head()

In [193]:
df.to_csv('eval_results.csv', index=False)

## Step 6: Tune your Agent

Contextual AI allows you to tune your entire agent end to end for improved performance. To run a tune job, you need to specify a training file and an optional test file. (If no test file is provided, the tuning job will hold out a portion of the training file as the test set.)

A tuning job fine-tunes models, so expect it to take a couple of hours to run.

After the tune job completes, the metadata associated with the tune job will include evaluation results and a model ID.

### 6.1 Format for the training file:

The file should be in JSON array format, where each element of the array is a JSON object that represents a single training example. The four required fields are `guideline`, `prompt`, `response`, and `knowledge`.

- The `knowledge` field should be an array of strings, each string representing a piece of knowledge that the model should use to generate the response.

- The `response` field should be the gold-standard answer to the prompt.

- The `guideline` field should contain guidelines for the expected response.

- The `prompt` field should be a question or statement that the model should respond to.
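
A quick local check of each training example can catch formatting problems before a long tuning job starts. The sketch below uses the four required field names stated above; the helper name `training_example_errors` is hypothetical, and the type checks on the string fields are an assumption about what the API expects.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("guideline", "prompt", "response", "knowledge")

def training_example_errors(example: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one training example (empty if none)."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in example]
    # knowledge is described as an array of strings
    knowledge = example.get("knowledge")
    if knowledge is not None and not (
        isinstance(knowledge, list) and all(isinstance(k, str) for k in knowledge)
    ):
        errors.append("knowledge must be a list of strings")
    # The remaining fields are assumed to be plain strings
    for name in ("guideline", "prompt", "response"):
        if name in example and not isinstance(example[name], str):
            errors.append(f"{name} must be a string")
    return errors
```

An empty list means the example passed these checks; it does not guarantee that the API will accept the file.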

In [None]:
!head data/fin_train.jsonl

### 6.2 Starting a Tuning Model Job

In [None]:
def tune_agent(agent_id: str, training_file: str):
   """
   Tune agent with training data file

   Args:
       agent_id: ID of agent to tune
       training_file: Path to JSON training data file with format:
           [{"guideline": str, "prompt": str, "response": str, "knowledge": List[str]}]

   Returns:
       dict: API response
   """
   url = f"{REQUEST_URL}/agents/{agent_id}/tune"
   headers = {
       "accept": "application/json",
       "Authorization": f"Bearer {API_KEY}"
   }

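   with open(training_file, 'rb') as f:
       files = {'training_file': f}
       response = requests.post(url, headers=headers, files=files)
   # Surface HTTP errors instead of returning an error payload as if it were a job
   response.raise_for_status()
   return response.json()

# The response is a dict; the job ID is read from its 'id' key in the next step
job_id = tune_agent(agent_id, "data/fin_train.jsonl")
print(job_id)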

### 6.3 Checking the Status

You can check the status of the job using the API. For detailed information, refer to the API documentation. When the tuning job is complete, the status will change to `completed`, and the response payload will also contain `evaluation_results`.

In [None]:
def get_tune_job_metadata(agent_id: str, job_id: str):
   """Get metadata for a specific tuning job"""
   url = f"{REQUEST_URL}/agents/{agent_id}/tune/jobs/{job_id}/metadata"
   headers = {
       "accept": "application/json",
       "Authorization": f"Bearer {API_KEY}"
   }
   response = requests.get(url, headers=headers)
   response.raise_for_status()  # Raise an error for HTTP codes >= 400
   return response.json()


result = get_tune_job_metadata(agent_id, job_id['id'])
result

When the tuning job is complete, the metadata looks similar to the following:
```
{'job_status': 'completed',
 'evaluation_results': {'grounded_generation_train_test.json_equivalence': 1.0,
  'grounded_generation_train_test.json_helpfulness': 0.814156498263641,
  'grounded_generation_train_test.json_groundedness': 0.7781168677598632},
 'model_id': 'registry/model-ada3c484-3ce0f31f-llm-fd6c2'}
 ```
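
Because `model_id` is only meaningful once the job has finished, it is safer to check `job_status` before using it. The sketch below relies on the two keys shown in the sample metadata above; the helper name `tuned_model_id` is hypothetical.

```python
from typing import Any, Dict, Optional

def tuned_model_id(metadata: Dict[str, Any]) -> Optional[str]:
    """Return the tuned model ID once the job has completed, otherwise None."""
    if metadata.get("job_status") != "completed":
        return None
    return metadata.get("model_id")

# Usage:
# model_id = tuned_model_id(result)
# if model_id is None:
#     print("Tuning job has not completed yet.")
```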

### 6.4 Updating the agent
Once the tuning job is complete, you can deploy the tuned model by editing the agent through the API. Note that currently only a single fine-tuned model deployment is allowed per tenant. Please see the API documentation for more information.

In [None]:
def update_agent(agent_id: str, llm_model_id: str):
   """Update agent's LLM model"""
   url = f"{REQUEST_URL}/agents/{agent_id}"
   headers = {
       "accept": "application/json",
       "content-type": "application/json",
       "Authorization": f"Bearer {API_KEY}"
   }
   data = {"llm_model_id": llm_model_id}
   response = requests.put(url, headers=headers, json=data)
   response.raise_for_status()  # Raise an error for HTTP codes >= 400
   return response.json()

update_agent(agent_id, result['model_id'])

## Next Steps

In this notebook, we created a RAG agent in the finance domain, evaluated it, and tuned it for better performance. You can learn more at [docs.contextual.ai](https://docs.contextual.ai/). Reach out to your account team if you have further questions or issues.