# Use Granite Code Hosted on Replicate

## Introduction

This notebook demonstrates inference calls against a model hosted on [Replicate](https://replicate.com/). To host models locally with [Ollama](https://ollama.com/) instead, see the [Continue VSCode](Continue_VSCode/Continue_VSCode.ipynb) recipe.


## Replicate Credit

To make it easier to try the Granite Code models on the Replicate platform,
use [this link](https://replicate.com/invites/a8717bfe-2f3d-4a52-88ed-1356231cdf03) to add a
small amount of credit to your Replicate account.


## Install Granite `utils` package

This package is a thin shim that provides helper functions the notebooks need, which keeps the notebook cells nice and clean.

In [None]:
!pip install git+https://github.com/ibm-granite-community/utils

## Provide your API Token

This guide will demonstrate a basic inference call using the `replicate` package.

To establish an authenticated session, provide your [Replicate API Token](https://replicate.com/account/api-tokens)
to the cell when prompted below.


In [None]:
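import getpass, os
from ibm_granite_community.notebook_utils import is_colab

if is_colab():
    # In Colab, read the token from the notebook's Secrets panel.
    from google.colab import userdata
    os.environ['REPLICATE_API_TOKEN'] = userdata.get('REPLICATE_API_TOKEN')
elif not os.environ.get('REPLICATE_API_TOKEN'):
    # Elsewhere, prompt for the token unless it is already set in the environment.
    os.environ['REPLICATE_API_TOKEN'] = getpass.getpass('Replicate API Token: ')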
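
Before moving on, it can help to confirm that the token was picked up without echoing the secret into the notebook output. The following is a minimal sketch; `describe_token` is a hypothetical helper written for this illustration and is not part of the Granite `utils` package or the `replicate` package.

```python
import os
from typing import Mapping


def describe_token(
    name: str = "REPLICATE_API_TOKEN",
    env: Mapping[str, str] = os.environ,
    visible: int = 4,
) -> str:
    """Report whether a secret is set, showing at most its last few characters."""
    value = env.get(name, "")
    if not value:
        return f"{name} is not set"
    if len(value) <= visible:
        # Too short to reveal any part of it safely.
        return f"{name} is set (****)"
    return f"{name} is set (****{value[-visible:]})"
```

Calling `print(describe_token())` in a cell reports the status of `REPLICATE_API_TOKEN` from the current environment. The `env` parameter exists only so the helper can be exercised with a plain dictionary instead of a real token.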


## Choose a Model

Two Granite Code models are available in the [`ibm-granite`](https://replicate.com/ibm-granite) org at Replicate.


In [None]:
from ibm_granite_community.langchain_utils import find_langchain_model

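# Granite Code model hosted on Replicate (uses REPLICATE_API_TOKEN).
colab_model = find_langchain_model(platform="Replicate", model_id="ibm-granite/granite-8b-code-instruct-128k")
# Granite Code model served by a local Ollama instance.
ollama_model = find_langchain_model(platform="ollama", model_id="granite-code:3b")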

## Define a Prompt

In [None]:
prompt = """
    Show me a SQL query that fetches all columns for the first 50 rows
    in a table named 'users'."""
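
The triple-quoted prompt above carries its leading newline and indentation into the string sent to the model. One way to avoid that is sketched below, using `textwrap.dedent` from the standard library. `build_prompt` is a hypothetical helper for this illustration, and the table name and row count are made into parameters as an assumption about how the prompt might be reused.

```python
import textwrap


def build_prompt(table: str, limit: int) -> str:
    """Build the SQL request prompt without leading whitespace."""
    # The backslash after the opening quotes suppresses the leading newline;
    # dedent then strips the indentation common to every line.
    return textwrap.dedent(f"""\
        Show me a SQL query that fetches all columns for the first {limit} rows
        in a table named '{table}'.""")
```

For example, `prompt = build_prompt("users", 50)` produces the same request as the cell above, minus the extra whitespace.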

## Perform Inference

In [None]:
print(f"Replicate: {colab_model.invoke(prompt)}")
print(f"Ollama: {ollama_model.invoke(prompt)}")
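
If one backend is unavailable (for example, no Ollama server is running locally), the cell above stops at the first failing call. A minimal sketch of a more forgiving loop follows. `invoke_all` is a hypothetical helper, and it assumes only that each model object exposes an `invoke(prompt)` method, as the models above do.

```python
from typing import Any, Dict


def invoke_all(models: Dict[str, Any], prompt: str) -> Dict[str, str]:
    """Call invoke(prompt) on each model, recording failures instead of raising."""
    results: Dict[str, str] = {}
    for label, model in models.items():
        try:
            results[label] = str(model.invoke(prompt))
        except Exception as exc:
            # Keep going so one unreachable backend does not hide the others.
            results[label] = f"<failed: {type(exc).__name__}: {exc}>"
    return results
```

With the models defined earlier, this could be used as `invoke_all({"Replicate": colab_model, "Ollama": ollama_model}, prompt)`, printing each label and response from the returned dictionary.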
