# Inference SDK on Krutrim Cloud
This notebook shows how to use the Inference SDK on Krutrim Cloud to manage and run inference tasks. It walks through listing, inspecting, and deleting fine-tuning checkpoints, then creating, listing, retrieving, and canceling inference tasks. Following the cells in order takes you from a fine-tuned checkpoint to a running inference task and shows how to check that task's status.

## Install Krutrim Cloud SDK

In [1]:
%pip install krutrim-cloud

## Prerequisite
**Export the Required Environment Variables:** Create a .env file in the examples directory with the following details:

- KRUTRIM_CLOUD_API_KEY="Your Krutrim Cloud API Key"

## Import Libraries and Load Environment Variables
- **Purpose**: To prepare the environment for the script.
- **Key Actions**:
    - Import the necessary libraries (krutrim_cloud for API access, dotenv for environment variables).
    - Load environment variables from a .env file.

In [None]:
# Import necessary libraries
from krutrim_cloud import KrutrimCloud
from dotenv import load_dotenv
import os
# Load environment variables from .env file
load_dotenv()
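
Before creating the client, it can help to confirm that the API key was actually loaded. The following is a minimal sketch using only the standard library; `require_env` is a hypothetical helper name and not part of the Krutrim Cloud SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of the Krutrim Cloud SDK.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file or export it")
    return value


try:
    api_key = require_env("KRUTRIM_CLOUD_API_KEY")
    print("KRUTRIM_CLOUD_API_KEY is set")
except RuntimeError as err:
    # Report the problem early instead of failing later inside an API call
    print(err)
```

Failing here gives a clearer message than an authentication error surfacing from a later request.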

## Initialize KrutrimCloud Client
- **Purpose:** To set up the API client for making requests.
- **Key Actions:**
    - Create an instance of KrutrimCloud. The client is constructed without arguments, so it is expected to pick up KRUTRIM_CLOUD_API_KEY from the environment variables loaded above.

In [3]:
# Initialize KrutrimCloud client
client = KrutrimCloud()

## List Fine-Tuning Checkpoints
- **Purpose**: Retrieve and display all fine-tuning checkpoints to monitor available models for inference.
- **Key Actions**:
    - Fetch the fine-tuning checkpoints and report any error raised by the API call.

In [4]:
try:
    # List all fine-tuning checkpoints
    fine_tuning_checkpoints = client.inference.checkpoints.list()
    print(fine_tuning_checkpoints)  # Print or process the checkpoints list
except Exception as e:
    # Handle any exceptions that occur during the API call
    print(f"An error occurred while fetching the fine-tuning checkpoints: {e}")

[CheckpointListResponseItem(model='Meta-Llama-3-8B-instruct', name='ft-9ea84101-4cee-4b80-8b31-5e67d8628b98-ft_task_1_1_final', version='ft')]
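
The later cells take the first entry of this list with `fine_tuning_checkpoints[0]`, which fails if the list is empty and may pick the wrong checkpoint once several exist. A minimal sketch of selecting by base model instead is shown below. It assumes each item exposes the `model` and `name` attributes seen in the output above; `pick_checkpoint` is a hypothetical helper, and `SimpleNamespace` objects stand in for the SDK response items so the example runs without network access.

```python
from types import SimpleNamespace


def pick_checkpoint(checkpoints, model):
    """Return the name of the first checkpoint fine-tuned from `model`, or None.

    Hypothetical helper; relies only on the `model` and `name` attributes.
    """
    for item in checkpoints:
        if item.model == model:
            return item.name
    return None


# Stand-in for the list returned by client.inference.checkpoints.list()
sample_checkpoints = [
    SimpleNamespace(
        model="Meta-Llama-3-8B-instruct",
        name="ft-9ea84101-4cee-4b80-8b31-5e67d8628b98-ft_task_1_1_final",
        version="ft",
    )
]

print(pick_checkpoint(sample_checkpoints, "Meta-Llama-3-8B-instruct"))
print(pick_checkpoint([], "Meta-Llama-3-8B-instruct"))  # empty list yields None
```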


## Get Checkpoint Information
- **Purpose**: Retrieve and display the details of a specific fine-tuning checkpoint, identified by its name.
- **Key Actions**:
    - Retrieve the checkpoint details and report any error raised by the API call.

In [12]:
try:
    # Use the name of the first checkpoint returned by the list call above
    filename = fine_tuning_checkpoints[0].name
    # Retrieve the details of that checkpoint
    inference_task_info = client.inference.checkpoints.retrieve(filename=filename)
    print(inference_task_info)  # Print or process the retrieved checkpoint details
except Exception as e:
    # Handle any exceptions that occur during the API call
    print(f"An error occurred while fetching the checkpoint information: {e}")

CheckpointRetrieveResponse(ctime='25_09_2024_15_41_25_967212', dataset='ft-alpaca-tiny.json', epoch='1', mode='lora', model='Meta-Llama-3-8B-instruct', mtime='25_09_2024_15_48_33_549117', name='ft_task_1', status='succeed', steps='steps', test_dataset='test-dataset')
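
The `ctime` and `mtime` fields arrive as underscore-separated strings. Judging only from the sample output above, they appear to follow a day_month_year_hour_minute_second_microsecond layout; that format is an inference from this one response, not a documented guarantee. Under that assumption, a short sketch for turning them into `datetime` objects and measuring the time between them:

```python
from datetime import datetime

# Assumed layout, inferred from the sample output: DD_MM_YYYY_HH_MM_SS_microseconds
STAMP_FORMAT = "%d_%m_%Y_%H_%M_%S_%f"


def parse_stamp(stamp: str) -> datetime:
    """Parse a checkpoint timestamp string into a naive datetime."""
    return datetime.strptime(stamp, STAMP_FORMAT)


created = parse_stamp("25_09_2024_15_41_25_967212")
modified = parse_stamp("25_09_2024_15_48_33_549117")

# Time elapsed between creation and last modification of the checkpoint record
elapsed = modified - created
print(elapsed)
```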


## Delete a Checkpoint
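- **Purpose**: Delete a specified fine-tuning checkpoint. The cells that follow reuse the first checkpoint in the list, so run this cell only for a checkpoint you no longer need.
- **Key Actions**:
    - Delete the checkpoint and report any error raised during the operation.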

In [None]:
try:
    filename = fine_tuning_checkpoints[0].name
    # Delete the checkpoint
    delete_response = client.inference.checkpoints.delete(filename=filename)
    print(delete_response)  # Print the response after deletion, if needed
except Exception as e:
    # Handle any exceptions that occur during the API call
    print(f"An error occurred while deleting the checkpoint: {e}")

## Create Inference Task
- **Purpose**: Create a new inference task with specified parameters.
- **Key Actions**:
    - Create a new inference task from the checkpoint and report any error raised by the API call.

In [13]:
try:
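    # Checkpoint to serve: the first entry from the checkpoint list
    checkpoint = fine_tuning_checkpoints[0].name
    # Name field from the retrieved checkpoint details ('ft_task_1' in this run)
    model = inference_task_info.name
    # Create Inference Task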
    inference_task_response = client.inference.tasks.create(
        model=model,
        namespace="gpu-scheduler",
        priority=1,
        checkpoint=checkpoint,
        ngpu=1,
        min_replicas=1,
        max_replicas=1,
        max_batch_size=256
    )
    print(inference_task_response)
except Exception as e:
    # Handle any exceptions that occur during the API call
    print(f"An error occurred while creating the inference task: {e}")

TaskCreateResponse(id='0239299e-00bd-412e-8588-d89245299992', name='ft_task_1')


## List Inference Tasks
- **Purpose**: Retrieve and display all current inference tasks.
- **Key Actions**:
    - Fetch the list of inference tasks and report any error raised by the API call.

In [14]:
try:
    # List Inference Tasks
    inference_task_list = client.inference.tasks.list()
    print(inference_task_list)
except Exception as e:
    # Handle any exceptions that occur during the API call
    print(f"An error occurred while listing the inference tasks: {e}")

TaskListResponse(count=1, offset=0, task_list=[{'name': 'ft_task_1', 'basemodel': 'Meta-Llama-3-8B-instruct', 'id': '0239299e-00bd-412e-8588-d89245299992', 'status': 'init', 'mtime': '09/27/2024 04_49_21 UTC'}])
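
As the output shows, each entry in `task_list` is a plain dictionary. Rather than always taking the first entry, you can filter the list by name or status. The sketch below uses a copy of the sample output as input so it runs on its own; `find_tasks` is a hypothetical helper, not an SDK function.

```python
def find_tasks(task_list, name=None, status=None):
    """Return the task dicts matching the given name and/or status.

    Hypothetical helper; a filter left as None is not applied.
    """
    return [
        task
        for task in task_list
        if (name is None or task.get("name") == name)
        and (status is None or task.get("status") == status)
    ]


# Stand-in for inference_task_list.task_list, copied from the output above
sample_tasks = [
    {
        "name": "ft_task_1",
        "basemodel": "Meta-Llama-3-8B-instruct",
        "id": "0239299e-00bd-412e-8588-d89245299992",
        "status": "init",
        "mtime": "09/27/2024 04_49_21 UTC",
    }
]

matches = find_tasks(sample_tasks, name="ft_task_1", status="init")
print([task["id"] for task in matches])
```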


## Get Inference Task by ID
- **Purpose**: Retrieve and display information for a specific inference task by ID to check its status and details.
- **Key Actions**:
    - Retrieve the task information and report any error raised by the API call.

In [15]:
try:
    # Take the ID of the first task from the list above (entries are dicts)
    task_id = inference_task_list.task_list[0].get("id")
    # Get Inference Task Information
    task_info = client.inference.tasks.retrieve(id=task_id)
    print(task_info)
except Exception as e:
    # Handle any exceptions that occur during the API call
    print(f"An error occurred while retrieving the inference task information: {e}")

TaskRetrieveResponse(id='0239299e-00bd-412e-8588-d89245299992', basemodel='Meta-Llama-3-8B-instruct', checkpoint='ft-9ea84101-4cee-4b80-8b31-5e67d8628b98-ft_task_1_1_final', inference_svc_name=None, name='ft_task_1', namespace='gpu-scheduler', priority=1, status='init')
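
The task above is still in the `init` state. To monitor it, you can call the retrieve endpoint repeatedly until the status changes. The following is a minimal polling sketch, not an SDK feature: `wait_for_status` is a hypothetical helper, `init` is the only status value confirmed by this notebook, and the `"running"` value in the demo is a made-up placeholder. The status source is passed in as a callable so the example runs without network access.

```python
import time


def wait_for_status(fetch_status, pending=("init",), interval=5.0, timeout=600.0):
    """Call fetch_status() until it returns a value outside `pending`.

    Raises TimeoutError if the status is still pending once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status not in pending:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status still {status!r} after {timeout} seconds")
        time.sleep(interval)


# Demo with a fake status sequence; "running" is a placeholder, not a documented value
fake_statuses = iter(["init", "init", "running"])
final_status = wait_for_status(lambda: next(fake_statuses), interval=0, timeout=5)
print(final_status)

# With the SDK, the callable could wrap the retrieve call shown above, for example:
# wait_for_status(lambda: client.inference.tasks.retrieve(id=task_id).status)
```

Passing a callable keeps the loop independent of the client, and the timeout prevents a notebook cell from hanging indefinitely if the task never leaves `init`.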


## Cancel Inference Task
- **Purpose**: Cancel a specified inference task by ID to stop unnecessary processing or free up resources.
- **Key Actions**:
    - Cancel the inference task and report any error raised during the cancellation.

In [16]:
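try:
    # Take the ID of the first task from the list above (entries are dicts)
    task_id = inference_task_list.task_list[0].get("id")
    # Cancel Inference Task
    cancel_response = client.inference.tasks.cancel(id=task_id)
    print(cancel_response)  # Print the response after cancellation, if needed
except Exception as e:
    # Handle any exceptions that occur during the API call
    print(f"An error occurred while canceling the inference task: {e}")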