<img src="https://imagedelivery.net/Dr98IMl5gQ9tPkFM5JRcng/3e5f6fbd-9bc6-4aa1-368e-e8bb1d6ca100/Ultra" alt="Image description" width="160" />

# Introduction to the Contextual AI Platform

The Contextual APIs provide a simple interface to our state-of-the-art Contextual Language Models (CLMs). This guide covers the basics of creating your first application programmatically. In this demo, we tune and deploy a CLM.

To run this notebook interactively, you can open it in Google Colab:

<a target="_blank" href="https://colab.research.google.com/github/ContextualAI/ContextualAI-Examples/blob/main/python/dataset-api-example.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

### Setup
To begin, you will need an API key to securely access the API. Please contact Contextual's sales team to get your API key.

In [3]:


import requests
import json
import string
from typing import Dict, Optional
from pathlib import Path
import random

# Configuration
API_TOKEN = 'YOUR_API_TOKEN'  # Replace with your actual API token; never commit a real key
BASE_URL = 'https://api.contextual.ai/v1'


def get_headers(content_type: str = "application/json") -> Dict[str, str]:
    """
    Generate headers for API requests

    Args:
        content_type: Content type for the request

    Returns:
        Dictionary containing request headers
    """
    return {
        "accept": "application/json",
        "Content-Type": content_type,
        "Authorization": f"Bearer {API_TOKEN}"
    }
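
Since the API key is a secret, one option is to read it from the environment rather than pasting it into the notebook. The following is a minimal sketch of that idea; the variable name `CONTEXTUAL_API_TOKEN` and the helper `load_api_token` are assumptions made for illustration, not part of the Contextual API.

```python
import os


def load_api_token(env_var: str = "CONTEXTUAL_API_TOKEN") -> str:
    """Return the API token stored in the given environment variable.

    The default variable name is a hypothetical choice for this sketch,
    not an official setting.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return token
```

With this in place, the configuration cell could use `API_TOKEN = load_api_token()` so that the key never appears in the saved notebook.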



### Create an Application

You will need to first create an application.

In [2]:
def create_application():
    """Create an application and return the parsed API response."""
    url = f"{BASE_URL}/applications"

    # Placeholder values; replace them with your application's details.
    payload = {
        "name": "string",
        "description": "string",
        "system_prompt": "string"
    }

    response = requests.post(url, json=payload, headers=get_headers())
    response.raise_for_status()  # surface HTTP errors instead of failing later on a missing key
    return response.json()


response = create_application()
APPLICATION_ID = response['application_id']
APPLICATION_ID

'3ce0f31f-b262-4de6-aad8-411c5016b4ac'

### Dataset

In [3]:
def download_file_from_gdrive(file_id, output_file):
    """Download a publicly shared Google Drive file to output_file."""
    # Direct download URL format for Google Drive
    url = f"https://drive.google.com/uc?export=download&id={file_id}"

    response = requests.get(url)
    response.raise_for_status()  # stop here rather than saving an error page
    with open(output_file, 'wb') as f:
        f.write(response.content)

# File ID taken from the Google Drive share URL
file_id = "1MlsscmU9gtbCrZk3C_CNbCDvEkQ7AhUM"

# Download the file
download_file_from_gdrive(file_id, "dataset.jsonl")
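
Before uploading, it can help to sanity-check the downloaded file. The sketch below checks each record against the fields listed in the `tune_application` docstring in this guide (`guideline`, `prompt`, `response`, `knowledge`). It assumes the file holds one JSON object per line, as the `.jsonl` extension suggests; the helper names are hypothetical and the checks are not an official schema.

```python
import json
from typing import Any, Dict, Iterable, List

# Field names and types taken from the tune_application docstring in this guide.
REQUIRED_FIELDS = {"guideline": str, "prompt": str, "response": str, "knowledge": list}


def find_record_problems(record: Dict[str, Any]) -> List[str]:
    """Return human-readable problems for one training record (empty if none)."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field '{field}'")
        elif not isinstance(record[field], expected_type):
            problems.append(f"field '{field}' should be {expected_type.__name__}")
    knowledge = record.get("knowledge")
    if isinstance(knowledge, list) and not all(isinstance(k, str) for k in knowledge):
        problems.append("field 'knowledge' should contain only strings")
    return problems


def check_jsonl_lines(lines: Iterable[str]) -> Dict[int, List[str]]:
    """Map 1-based line numbers to problems, assuming one JSON object per line."""
    report = {}
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            report[number] = [f"invalid JSON: {exc.msg}"]
            continue
        if not isinstance(record, dict):
            report[number] = ["record is not a JSON object"]
            continue
        problems = find_record_problems(record)
        if problems:
            report[number] = problems
    return report
```

Passing an open file works because file objects iterate line by line, for example `check_jsonl_lines(open("dataset.jsonl", encoding="utf-8"))`. An empty report means every record has the expected fields.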

### Tune a model

Create a tuning job for the specified Application. Tuning jobs are asynchronous tasks that specialize your Application to your specific domain or use case.

This API starts a tuning job using the provided `training_file` and an optional `test_file`. If no `test_file` is provided, the tuning job will hold out a portion of the `training_file` as the test set.

The call returns a tune job ID, which you can use to check the status of your tuning task through the GET `/applications/{application_id}/tune/jobs/{job_id}/metadata` endpoint.

After the tune job completes, the metadata associated with the tune job will include evaluation results and a model ID.
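
The optional `test_file` can be attached in the same request as the training file. The sketch below shows one way to do that. It assumes `test_file` is sent as a multipart form field in the same way as `training_file`, which this guide does not confirm. `submit_tune_job` is a hypothetical helper, and `post` is any callable that accepts the same keyword arguments as `requests.post`.

```python
from contextlib import ExitStack
from typing import Any, Callable, Optional


def submit_tune_job(
    post: Callable[..., Any],
    url: str,
    api_token: str,
    training_file: str,
    test_file: Optional[str] = None,
) -> Any:
    """Send a tuning request, attaching test_file only when one is given.

    Assumption: test_file is uploaded as a multipart field named 'test_file'.
    """
    headers = {"accept": "application/json", "Authorization": f"Bearer {api_token}"}
    with ExitStack() as stack:
        # ExitStack closes every opened file, however many are attached.
        files = {"training_file": stack.enter_context(open(training_file, "rb"))}
        if test_file is not None:
            files["test_file"] = stack.enter_context(open(test_file, "rb"))
        return post(url, headers=headers, files=files)
```

In the notebook this could be called as `submit_tune_job(requests.post, f"{BASE_URL}/applications/{APPLICATION_ID}/tune", API_TOKEN, "dataset.jsonl")`, adding a fifth argument when a separate test set is available.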

In [5]:
def tune_application(application_id: str, training_file: str):
    """
    Tune application with training data file

    Args:
        application_id: ID of application to tune
        training_file: Path to the training data file (this example uses a
            .jsonl file), where each record has the form:
            {"guideline": str, "prompt": str, "response": str, "knowledge": List[str]}

    Returns:
        dict: API response containing the tuning job ID under 'id'
    """
    url = f"{BASE_URL}/applications/{application_id}/tune"
    headers = {
        "accept": "application/json",
        "Authorization": f"Bearer {API_TOKEN}"
    }

    # Upload the file as multipart form data; requests sets the Content-Type header.
    with open(training_file, 'rb') as f:
        files = {'training_file': f}
        response = requests.post(url, headers=headers, files=files)
    response.raise_for_status()
    return response.json()

In [6]:
# The response is a dict that holds the job ID under the 'id' key
job_id = tune_application(APPLICATION_ID, "dataset.jsonl")
job_id

{'id': 'b7d6aa16-f335-4305-855f-d3c7f0bd931d'}

### Check the status

You can check the status of the job using the API. For more information, please see the API docs. When the job is complete, `job_status` changes to `completed`, and the response payload also contains `evaluation_results` and `model_id`.

In [6]:
def get_tune_job_metadata(application_id: str, job_id: str):
    """Get metadata for a specific tuning job"""
    url = f"{BASE_URL}/applications/{application_id}/tune/jobs/{job_id}/metadata"
    headers = {
        "accept": "application/json",
        "Authorization": f"Bearer {API_TOKEN}"
    }
    response = requests.get(url, headers=headers)
    response.raise_for_status()  # surface HTTP errors early
    return response.json()


result = get_tune_job_metadata(APPLICATION_ID, job_id['id'])

When the job is complete, the metadata looks like the following:
```
{'job_status': 'completed',
 'evaluation_results': {'grounded_generation_train_test.json_equivalence': 1.0,
  'grounded_generation_train_test.json_helpfulness': 0.814156498263641,
  'grounded_generation_train_test.json_groundedness': 0.7781168677598632},
 'model_id': 'registry/model-ada3c484-3ce0f31f-llm-fd6c2'}
 ```
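
Because tuning jobs are asynchronous, the metadata endpoint usually has to be called more than once. The sketch below polls until `job_status` is `completed`. The helper `wait_for_tune_job`, the polling interval, the timeout, and the `failed` status name are all assumptions for illustration; only `completed` appears in the output above.

```python
import time
from typing import Any, Callable, Dict, Tuple


def wait_for_tune_job(
    fetch_metadata: Callable[[], Dict[str, Any]],
    poll_seconds: float = 60.0,
    timeout_seconds: float = 6 * 60 * 60,
    failure_statuses: Tuple[str, ...] = ("failed",),  # assumed status name
) -> Dict[str, Any]:
    """Call fetch_metadata() until job_status is 'completed', then return the metadata."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        metadata = fetch_metadata()
        status = metadata.get("job_status")
        if status == "completed":
            return metadata
        if status in failure_statuses:
            raise RuntimeError(f"Tuning job ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Tuning job still {status!r} after {timeout_seconds} seconds")
        time.sleep(poll_seconds)
```

In this notebook the fetch function could be `lambda: get_tune_job_metadata(APPLICATION_ID, job_id['id'])`. Passing the fetch step in as a callable keeps the loop independent of the HTTP client and easy to test.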

In [7]:
model_id = result['model_id']
model_id

'registry/model-ada3c484-3ce0f31f-llm-fd6c2'
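
The keys in `evaluation_results` are long, so a short summary can be easier to read. The sketch below assumes every key ends in `_<metric>`, as the three keys in the sample metadata do. `summarize_evaluation` is a hypothetical helper, not part of the API.

```python
from typing import Dict


def summarize_evaluation(evaluation_results: Dict[str, float]) -> Dict[str, float]:
    """Map short metric names to scores rounded to three decimals.

    Assumes each key ends with '_<metric>', as in the sample metadata.
    """
    summary = {}
    for key, score in evaluation_results.items():
        metric = key.rsplit("_", 1)[-1]  # text after the last underscore
        summary[metric] = round(score, 3)
    return summary
```

Calling `summarize_evaluation(result['evaluation_results'])` on the metadata shown earlier would give one entry each for `equivalence`, `helpfulness`, and `groundedness`.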

### Update the Application

Once the tuning job is complete, you can deploy the tuned model by updating the application through the API. Note that currently only one fine-tuned model deployment is allowed per tenant. Please see the API docs for more information.

In [None]:
def update_application(application_id: str, llm_model_id: str):
    """Point the application at the given (tuned) LLM model ID"""
    url = f"{BASE_URL}/applications/{application_id}"
    headers = {
        "accept": "application/json",
        "content-type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}"
    }
    data = {"llm_model_id": llm_model_id}
    response = requests.put(url, headers=headers, json=data)
    response.raise_for_status()  # surface HTTP errors early
    return response.json()

In [None]:
# Use the application created earlier rather than a hardcoded ID
update_application(APPLICATION_ID, model_id)