# Use Remote and Local Granite Code Models with LangChain

## Introduction and Setup

This notebook demonstrates inference calls against Granite Code models hosted remotely on [Replicate](https://replicate.com/) and [IBM WatsonX](https://www.ibm.com/watsonx), and running locally with [Ollama](https://ollama.com/).

### Install Granite `utils` package

This package is a thin shim that provides helper functions the notebooks require.

For the implementation of these functions, see the [utils repo](https://github.com/ibm-granite-community/utils/tree/main).

In [None]:
!pip install git+https://github.com/ibm-granite-community/utils

In [1]:
from ibm_granite_community.langchain_utils import find_langchain_model

### Define a Prompt

The sections below demonstrate two remote options and one local option for model inference.

Each performs a blocking call using the following prompt:

In [2]:
prompt = """
    Show me a SQL query that fetches all columns for the first 50 rows
    in a table named 'users'."""
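
Because the prompt is written as an indented triple-quoted string, it starts with a newline and each line carries four leading spaces. The models generally tolerate this, but if you prefer to send a tidy prompt, a minimal sketch using the standard library looks like the following. The names `raw_prompt` and `clean_prompt` are illustrative and are not used elsewhere in this notebook.

```python
import textwrap

# Same text as the notebook's prompt, kept under a separate name for illustration.
raw_prompt = """
    Show me a SQL query that fetches all columns for the first 50 rows
    in a table named 'users'."""

# dedent() removes the indentation shared by all lines; strip() drops the leading newline.
clean_prompt = textwrap.dedent(raw_prompt).strip()
print(clean_prompt)
```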

## Remote Model using Replicate

### Establish Replicate Account

To use this remote option, create an account at [Replicate](https://replicate.com).

### Add credit to your Replicate Account (optional)

To make it easier to try the Granite Code models on the Replicate platform,
use [this link](https://replicate.com/invites/a8717bfe-2f3d-4a52-88ed-1356231cdf03) to add a
small amount of credit to your Replicate account.

### Provide your API Token

Obtain your `REPLICATE_API_TOKEN` at [replicate.com/account/api-tokens](https://replicate.com/account/api-tokens).

There are three ways to provide this value to the cells below, in order of precedence:

1. As an environment variable
2. As a Google Colab secret
3. Supplied by the user using `getpass()`
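
The precedence order above can be sketched as follows. This is an illustration only, not the implementation used by the `utils` package: `resolve_secret` and its parameters are hypothetical names, and the Colab branch assumes the `google.colab.userdata` module that is available inside Colab runtimes.

```python
import os
from getpass import getpass


def resolve_secret(name, environ=None, prompt_fn=getpass):
    """Return a secret by trying each source in order of precedence (hypothetical helper)."""
    environ = os.environ if environ is None else environ

    # 1. Environment variable.
    value = environ.get(name)
    if value:
        return value

    # 2. Google Colab secret (only importable when running inside Colab).
    try:
        from google.colab import userdata

        value = userdata.get(name)
        if value:
            return value
    except Exception:
        # Not running in Colab, or the secret is not defined or not accessible.
        pass

    # 3. Interactive prompt; getpass does not echo the typed value.
    return prompt_fn(f"Enter {name}: ")
```

Calling `resolve_secret("REPLICATE_API_TOKEN")` returns the first value found, so a token already exported in the shell means no prompt appears.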

### Choose a Model

Two Granite Code models are available in the [`ibm-granite`](https://replicate.com/ibm-granite) org at Replicate.

The `find_langchain_model` function below imports the `replicate` package.

In [10]:
model_id = "ibm-granite/granite-8b-code-instruct-128k"
# model_id = "ibm-granite/granite-20b-code-instruct-8k"

model = find_langchain_model(platform="Replicate", model_id=model_id)

### Perform Inference

In [None]:
response = model.invoke(prompt)

print(f"Granite response from Replicate: {response}")
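
Since `invoke` blocks until the complete response is available, it can be helpful to measure how long a call takes when comparing platforms. The sketch below is one way to do that. `timed_invoke` is a hypothetical helper, and `EchoModel` is a stand-in used only so the example runs without a network connection; in the notebook you would pass the `model` object created above instead.

```python
import time


class EchoModel:
    """Hypothetical stand-in that exposes an `invoke` method like the notebook's model."""

    def invoke(self, prompt):
        return f"echo: {prompt.strip()}"


def timed_invoke(model, prompt):
    """Call model.invoke and return (response, elapsed_seconds)."""
    start = time.perf_counter()
    response = model.invoke(prompt)  # blocks until the full response is returned
    return response, time.perf_counter() - start


response, elapsed = timed_invoke(EchoModel(), "  SELECT 1;  ")
print(f"{elapsed:.3f}s: {response}")
```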

## Local Model using Ollama

### Install Dependencies

[Download and start Ollama](https://ollama.com/download).

Then pull a model:

- Granite Code 20b: `ollama pull granite-code:20b`
- Granite Code 8b: `ollama pull granite-code:8b`
- Granite Code 3b: `ollama pull granite-code:3b`
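
Before running the cells below, you may want to confirm that the Ollama server is running and that the model has been pulled. The following sketch assumes Ollama's default local address (`http://localhost:11434`) and its `/api/tags` endpoint, which lists local models as a JSON object with a `models` array. `has_model` and `list_local_models` are hypothetical helper names.

```python
import json
import urllib.request


def has_model(tags_payload, model_id):
    """Return True if model_id appears in an /api/tags-style payload."""
    return any(m.get("name") == model_id for m in tags_payload.get("models", []))


def list_local_models(base_url="http://localhost:11434", timeout=5):
    """Fetch the list of locally available models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

With the server running, `has_model(list_local_models(), "granite-code:8b")` should be true once the pull has finished. If the server is not running, `list_local_models()` raises `urllib.error.URLError`.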

### Choose a Model

In [7]:
# model_id = "granite-code:3b"
model_id = "granite-code:8b"
# model_id = "granite-code:20b"

model = find_langchain_model(platform="ollama", model_id=model_id)

### Perform Inference

In [None]:
response = model.invoke(prompt)

print(f"Granite response from Ollama: {response}")

## Remote Model using IBM WatsonX

### Establish a WatsonX Account

To use this remote option, create an account on [WatsonX](https://www.ibm.com/watsonx).

### Provide the Environment Variables

There are three ways to provide the environment variables required by `find_langchain_model()` below, in order of precedence:

1. Directly as an environment variable in the Python environment where the Jupyter notebook is running.
2. As a Google Colab secret, if you are running the notebook in Colab.
3. Supplied by the user in a prompt during execution of the notebook.

### Provide your API Key

Obtain your `WATSONX_APIKEY` by generating a [Platform API Key](https://www.ibm.com/docs/en/watsonx/watsonxdata/1.0.x?topic=started-generating-api-keys) on the watsonx.data web client.

### Provide your Project Id

Get your `WATSONX_PROJECT_ID` from the [WatsonX](https://www.ibm.com/watsonx) web client by following [these instructions](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-project-id.html?context=wx).

### Provide your Base WatsonX URL

Get your `WATSONX_URL` by viewing the details for the service instance from the Cloud Pak for Data web client, as described in [these watsonx.ai setup instructions](https://ibm.github.io/watsonx-ai-python-sdk/setup_cpd.html).

As an example, your `WATSONX_URL` may be `https://us-south.ml.cloud.ibm.com` for the Dallas zone.
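
If you supply these values as environment variables, a quick check like the one below can show which ones are still unset before you call `find_langchain_model()`. This is a sketch only: `missing_settings` is a hypothetical helper, and it inspects environment variables alone, so values stored as Colab secrets will be reported as missing.

```python
import os

# The three variable names described in the sections above.
REQUIRED = ("WATSONX_APIKEY", "WATSONX_PROJECT_ID", "WATSONX_URL")


def missing_settings(environ=None, required=REQUIRED):
    """Return the names in `required` that are unset or empty in `environ`."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


print("Missing WatsonX settings:", missing_settings())
```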

### Choose a Model

In [5]:
# model_id = "ibm/granite-3b-code-instruct"
model_id = "ibm/granite-8b-code-instruct"
# model_id = "ibm/granite-20b-code-instruct"
# model_id = "ibm/granite-34b-code-instruct"

model = find_langchain_model(platform="watsonx", model_id=model_id)

### Perform Inference

In [None]:
response = model.invoke(prompt)

print(f"Granite response from WatsonX: {response}")