## **Unleash Pre-trained LLMs with Replicate: A Step-by-Step Guide** ♋

Large language models (LLMs) are revolutionizing various fields, but accessing their power can be complex. Replicate offers a user-friendly API that simplifies working with pre-trained LLMs from within your notebook. This guide walks you through using Replicate's API to run a pre-trained LLM, specifically Meta's Llama 3 8B Instruct model (`meta/meta-llama-3-8b-instruct`), in your notebook.

### **_Prerequisites_** ⚡

- A computer with an internet connection

- A text editor or notebook environment (e.g., Jupyter Notebook, Google Colab)

---



### **_Steps_** 🌟

▶ Create a Replicate Account:

- Visit https://replicate.com/ and sign up for a free account. Replicate offers both free and paid plans. The free plan provides limited credits for running models, but it's sufficient for experimentation.


<figure>
<center>
<img src='https://drive.google.com/uc?id=1D1qDKUcgNx5ipr9fuDv6VT9ss70AH9vR'/>
<figcaption></figcaption></center>
</figure>


▶ Obtain Your API Token:

- Once logged in, open your profile settings by clicking your username in the top-right corner.
- Select "API" from the left-hand menu.
- Click "Generate New Token" and give it a descriptive name (e.g., "My-Notebook-Token").
- **_Copy the generated token. You'll need it to access models via the API._**


<figure>
<center>
<img src='https://drive.google.com/uc?id=1zC5yzMB_ftIqKfD8bWZZ2fSaAfxwzfPy'/>
<figcaption></figcaption></center>
</figure>


▶ Install replicate


In [2]:
!pip install replicate


Collecting replicate
  Downloading replicate-0.25.1-py3-none-any.whl (39 kB)
Collecting httpx<1,>=0.21.0 (from replicate)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.21.0->replicate)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.21.0->replicate)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: h11, httpcore, httpx, replicate
Successfully installed h11-0.14.0 httpcore-1.0.5 httpx-0.27.0 replicate-0.25.1


▶ Import necessary libraries


In [None]:
import replicate
from getpass import getpass
import os

▶ Authenticate with Replicate


In [None]:
"""
  |/| Summary |/|
  
The code snippet is responsible for getting a user's Replicate API token 
as input and storing it in the environment variable 'REPLICATE_API_TOKEN'.
"""

def access_replicate():
    try:
        if 'REPLICATE_API_TOKEN' not in os.environ:
            api_token = getpass(prompt='Enter your Replicate API token: ')
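            # Store the token itself; os.getenv(api_token) would look up a
            # variable named after the token and return None.
            os.environ['REPLICATE_API_TOKEN'] = api_token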
    except Exception as e:
        print(f'Error: {e}')

access_replicate()
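
Before moving on, it can help to confirm that the token actually landed in the environment without echoing the secret into the notebook output. The helper below is a minimal sketch: `token_status` is a hypothetical name introduced here for illustration, not part of the `replicate` package, and it only inspects `os.environ`.

```python
import os

def token_status(env_var="REPLICATE_API_TOKEN"):
    """Return a masked summary of the token in env_var, or None if it is unset or empty."""
    token = os.environ.get(env_var)
    if not token:
        return None
    # Reveal only the last four characters; the rest of the secret stays hidden.
    return f"********{token[-4:]} ({len(token)} chars)"

print(token_status() or "REPLICATE_API_TOKEN is not set")
```

If this prints the "not set" message, re-run the authentication cell before calling any model.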


▶ Prepare Your Prompt


In [None]:
# Prompts
prompt = "Write me three facts about llamas, the first in AABB format, the second in ABAB, the third in AABB format?"

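# Llama 3 chat template. Each trailing backslash joins the lines so no stray
# newlines end up in the template; {prompt} is replaced with the user prompt.
prompt_template = """\
<|begin_of_text|>\
<|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant<|eot_id|>\
<|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|>\
<|start_header_id|>assistant<|end_header_id|>\n\n\
"""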
model_name = "meta/meta-llama-3-8b-instruct"

# Input keys are lowercase, matching the model's input schema on Replicate.
inputs = {
    "top_p": 0.9,
    "prompt": prompt,
    "max_tokens": 1024,
    "min_tokens": 0,
    "temperature": 0.6,
    "prompt_template": prompt_template,
    "presence_penalty": 0,
    "frequency_penalty": 0
}
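
To see what the model receives once the `{prompt}` placeholder is filled in, the template can be rendered locally. This is a sketch under one assumption: that the substitution behaves like Python's `str.format`. `render_prompt` is a hypothetical helper used only for illustration, and the template is written here as concatenated strings so the block runs on its own.

```python
# Llama 3 chat template, written as adjacent string literals (no stray newlines).
prompt_template = (
    "<|begin_of_text|>"
    "<|start_header_id|>system<|end_header_id|>\n\nYou are a helpful assistant<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

def render_prompt(template, prompt):
    """Fill the {prompt} placeholder (assumption: str.format-style substitution)."""
    return template.format(prompt=prompt)

rendered = render_prompt(prompt_template, "Name one fact about llamas.")
print(rendered)
```

The rendered text ends with the assistant header followed by a blank line, which is where the model starts generating its reply.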


▶ Run Inference


In [None]:
def run_inference(model_name, **inputs):
    """
    Run inference using the specified model.

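    Args:
        model_name (str): The name of the model to use for inference.
        **inputs: Model input parameters (prompt, sampling settings, etc.),
            forwarded to Replicate as the `input` dictionary.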

    Returns:
        str: The result of the inference as a string.
    """
    if not isinstance(model_name, str) or model_name == "":
        raise ValueError("model_name must be a non-empty string")

    result = ""

    try:
        for event in replicate.stream(
            model_name,
            input=inputs,
        ):
            result += str(event)

    except Exception as e:
        print(f"An error occurred: {str(e)}")

    return result

# Pass the model name and unpack the inputs dictionary defined above.
print(run_inference(model_name, **inputs))
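
The streaming loop above builds the answer chunk by chunk. A small variation, sketched below, echoes each chunk as it arrives while still returning the full text. `collect_stream` is a hypothetical helper, and the list of strings stands in for the events that `replicate.stream` would yield, so the block runs without a token or network access.

```python
import sys

def collect_stream(events, echo=True, out=sys.stdout):
    """Join streamed chunks into one string, optionally echoing each chunk as it arrives."""
    chunks = []
    for event in events:
        text = str(event)  # same conversion used in run_inference
        if echo:
            out.write(text)
            out.flush()
        chunks.append(text)
    return "".join(chunks)

# Stand-in for the event stream: any iterable of chunks works.
fake_events = iter(["Llamas ", "are ", "camelids."])
full_text = collect_stream(fake_events, echo=False)
```

With a valid token, the same helper should work on a live stream, for example `collect_stream(replicate.stream(model_name, input=inputs))`.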


### **_THANKS_** and good luck !!