# **Llama 3 Chatbot using APIs**

We will see how to use Llama 3 through an API to build a simple ChatGPT-style chatbot for everyday use cases.
We will use a Replicate API token to talk to the model and Streamlit to turn the chatbot into a working web app.

Llama 3 (Large Language Model Meta AI) is the third generation of Meta's language models, designed to handle complex natural language processing tasks such as text generation, summarization, question answering, and chatbot development. Creating a chatbot with a Llama 3 API means using this model to provide meaningful, conversational responses. The sections below walk through the concept and the process.



## How we will build this app


 - Get a Replicate API token

 - Set up the coding environment
 
 - Build the app
 
 - Set the API token
 
 - Deploy the app (optional)

#### An Overview:
Here is a high-level overview of the Llama 3.1 chatbot app:

The user provides two inputs:
 - A Replicate API token (if requested)

 - A prompt (i.e. a question to ask)

An API call is then made to the Replicate server: the prompt is submitted, and the LLM-generated response is returned and displayed in the app.

## **Requirements/Installations**

To set up this basic Llama-based chatbot, the following libraries and tools are required:

1. **Python**: Ensure Python (version 3.8 or higher) is installed on your system.

2. **Streamlit**: Streamlit is an open-source Python framework that simplifies building and sharing web applications, especially for data science and machine learning tasks.

3. **Replicate**: Replicate is a platform that lets developers and researchers run machine learning models in the cloud and call them through an API, without managing infrastructure or dealing with complex deployment workflows.

In [None]:
# pip install streamlit

In [None]:
# pip install replicate
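Before moving on, it can help to confirm that both packages are actually importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of Streamlit or Replicate.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(["streamlit", "replicate"])
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All required packages are installed.")
```

If anything is reported as missing, run the corresponding `pip install` command from the cells above and check again.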

## **Procedure**

### 1. **Get a Replicate API token**


Getting your Replicate API token is a simple 3-step process:

1. Go to https://replicate.com/signin/.

2. Sign in with your GitHub account.

3. Proceed to the API tokens page and copy your API token.

Once this step is done, we move on to the actual coding.


### 2. **Import necessary libraries**

First, import the necessary libraries:

In [None]:
import streamlit as st
import replicate
import os

### 3. **Define the app title**

The title of the app shown in the browser tab can be specified using the page_title parameter, which is passed to the st.set_page_config() method:

In [None]:
st.set_page_config(page_title="AI Chatbot using Llama 3")

### 4. **Defining a sidebar to accept the API token and adjust model parameters**

When designing the chatbot app, divide the app elements: place the app title and the text input box for the Replicate API token in the sidebar, and the chat input in the main panel. To do this, place the sidebar statements under with st.sidebar:, and then:

1. Define the app title using the st.title() method.

2. Use if-else statements to conditionally display either:

 - A success message in a green box that reads API key already provided!, for the if branch (a token was found in the app's Secrets).

 - A warning message in a yellow box along with a text input box asking for the API token, for the else branch (no token was detected in the Secrets).

The cell below uses a simpler variant: it skips the Secrets check and always shows the text input box.

In [5]:
with st.sidebar:
    st.title('Llama 3.1 Chatbot')
    # Enter Replicate API token
    replicate_api = st.text_input('Enter Replicate API token:', type='password')
    # Model parameters
    temperature = st.slider('Temperature', min_value=0.01, max_value=1.0, value=0.5, step=0.01)
    top_p = st.slider('Top P', min_value=0.01, max_value=1.0, value=0.9, step=0.01)
    max_length = st.slider('Max Length', min_value=32, max_value=512, value=128, step=32)
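The if-else check described in the steps before this cell can be sketched as a plain function, independent of Streamlit. This is only an illustration under stated assumptions: the function name `resolve_token`, the key name `REPLICATE_API_TOKEN` inside the secrets mapping, and the returned labels are all hypothetical. In a real app the two arguments would presumably come from `st.secrets` and `st.text_input`.

```python
from typing import Mapping, Tuple


def resolve_token(secrets: Mapping[str, str], typed_value: str = "") -> Tuple[str, str]:
    """Pick the API token and report where it came from.

    Returns (token, source), where source is "secrets", "input", or "missing".
    """
    stored = secrets.get("REPLICATE_API_TOKEN", "")
    if stored:
        # if branch: a token is already configured, so show the success message
        return stored, "secrets"
    typed = typed_value.strip()
    if typed:
        # else branch: the user typed a token into the text input box
        return typed, "input"
    # else branch with nothing entered yet: show the warning and keep asking
    return "", "missing"
```

The `source` label tells the UI which message to display: a success box for `"secrets"`, and a warning plus the text input otherwise.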

The code above also adds sliders for the model parameters: temperature (randomness, often described as creativity), top_p (nucleus sampling), and max_length (the maximum length of the generated response, in tokens).
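To make the two sampling parameters more concrete, here is a small pure-Python sketch of what temperature and top_p typically do to a probability distribution over candidate tokens. The logits are made-up numbers and the function names are hypothetical; this illustrates the general technique and is not a description of how the hosted model is implemented.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose total probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalise the surviving tokens; every other token gets probability 0
    return [probs[i] / mass if i in kept else 0.0 for i in range(len(probs))]


logits = [2.0, 1.0, 0.1]                 # made-up scores for three candidate tokens
sharp = apply_temperature(logits, 0.5)   # low temperature: the top token dominates more
flat = apply_temperature(logits, 1.0)    # higher temperature: probabilities are more even
filtered = top_p_filter([0.5, 0.3, 0.2], 0.7)  # the least likely token is dropped
```

A low temperature combined with a low top_p gives more predictable answers, while higher values allow more varied ones.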


### 5. **Setting up Replicate API token**


The code contains a simple check: if replicate_api. This checks whether a replicate_api value was provided. If it is None or an empty string, the block of code does not run.

If a token was entered, the line

os.environ['REPLICATE_API_TOKEN'] = replicate_api 

sets the environment variable REPLICATE_API_TOKEN to the value of replicate_api. This is important because many libraries or services (like Replicate) use environment variables to securely store sensitive information like API keys. By storing the token in an environment variable, you avoid hardcoding it into the script, which is a best practice for security reasons.
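The same check can be wrapped in a small helper that also ignores whitespace-only input. This is a minimal sketch; the function name `set_replicate_token` is hypothetical, and only the standard library is used.

```python
import os


def set_replicate_token(token):
    """Export the token as REPLICATE_API_TOKEN if one was provided; return True on success."""
    if not token or not token.strip():
        # None, "" or whitespace only: leave the environment untouched
        return False
    os.environ["REPLICATE_API_TOKEN"] = token.strip()
    return True
```

A variable set this way is visible only to the running Python process and any child processes it starts; it does not persist after the app stops.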

In [6]:
if replicate_api:
    os.environ['REPLICATE_API_TOKEN'] = replicate_api

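# Replicate model identifier used for every request.
# Note: despite the variable name, this string points to Llama 2 70B Chat; swap in a Llama 3 identifier from Replicate to use Llama 3.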
llama_3_1_model = "meta/llama-2-70b-chat:02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3"

The above is the model identifier that we use for this chatbot. It was taken from Replicate; similar hosted models are also available from other platforms such as Hugging Face.
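The identifier string has the shape `owner/name:version`. A short sketch of splitting it into its parts makes it easy to see which model a given string actually refers to. The `ModelRef` type and `parse_model_ref` function are hypothetical helpers written for this illustration, not part of the Replicate client.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    owner: str
    name: str
    version: Optional[str]


def parse_model_ref(ref: str) -> ModelRef:
    """Split an 'owner/name[:version]' string into its parts."""
    path, _, version = ref.partition(":")
    owner, sep, name = path.partition("/")
    if not sep or not owner or not name:
        raise ValueError(f"Expected 'owner/name[:version]', got {ref!r}")
    return ModelRef(owner, name, version or None)


ref = parse_model_ref(
    "meta/llama-2-70b-chat:02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3"
)
print(ref.owner, ref.name)
```

Printing the parsed name is a quick way to confirm that the string in the cell above matches the model you intend to call.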

If we want to use the OpenAI API instead, usage is paid and the API key is linked to your OpenAI account.

### 6. **Chat History Functions**

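When building a generative AI chatbot, it is essential to have a good UI for talking to it. We also need to keep the chat history so that earlier messages stay on screen between Streamlit reruns. Note that this history is only stored and displayed; it is not retrieval-augmented generation (RAG), and the model does not learn your preferences from it.

We define three helper functions to manage the chat history: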

In [7]:
def initialize_chat_history():
    if "messages" not in st.session_state:
        st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]

In [8]:
def display_chat():
    for message in st.session_state.messages:
        with st.chat_message(message["role"]):
            st.write(message["content"])

In [9]:
def clear_chat_history():
    st.session_state.messages = [{"role": "assistant", "content": "How may I assist you today?"}]
st.sidebar.button('Clear Chat History', on_click=clear_chat_history)
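The behaviour of these helpers can be tried out without running Streamlit by letting a plain dictionary stand in for st.session_state. This is a sketch under that assumption: the functions below take the state as an argument, which the real functions above do not, and the `GREETING` constant is introduced here for illustration.

```python
GREETING = {"role": "assistant", "content": "How may I assist you today?"}


def initialize_chat_history(state):
    """Create the message list once; later calls leave existing history alone."""
    if "messages" not in state:
        state["messages"] = [dict(GREETING)]


def clear_chat_history(state):
    """Reset the conversation to the initial greeting."""
    state["messages"] = [dict(GREETING)]


state = {}                                  # stand-in for st.session_state
initialize_chat_history(state)
state["messages"].append({"role": "user", "content": "Hello"})
initialize_chat_history(state)              # simulated rerun: history is preserved
count_after_rerun = len(state["messages"])  # greeting plus the user message
clear_chat_history(state)                   # back to the greeting only
```

The key point is that initialization is guarded by a membership check, so a rerun of the script does not wipe the conversation, while the clear function overwrites it unconditionally.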

### 7. **Response generation**

Let us create a custom function in order to get our response from the Llama API.

This function takes the user prompt as input and sends it to the Replicate servers, where the model processes the prompt and generates a response that is returned to the app.
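The request and response flow can be sketched offline by passing the API call in as a parameter, so a stand-in function can replace the network request. Everything here is an illustrative assumption: `collect_text`, the `run` parameter, and `fake_run` are hypothetical names, and the idea that the result may arrive as a string, a list of chunks, or an iterator of chunks is a defensive guess rather than a documented guarantee.

```python
def collect_text(output):
    """Return the model output as one string, whatever shape it arrived in."""
    if output is None:
        return ""
    if isinstance(output, str):
        return output
    # Lists, tuples, and generators of chunks are joined in order
    return "".join(str(chunk) for chunk in output)


def generate_response(prompt, run, model, **params):
    """Send the prompt through `run` and return the reply text, or None on failure."""
    try:
        # Collect inside the try block so errors raised while streaming are caught too
        return collect_text(run(model, input={"prompt": prompt, **params}))
    except Exception as error:
        print(f"Error calling the model: {error}")
        return None


def fake_run(model, input):
    """Offline stand-in for the API call: yields the prompt back in two chunks."""
    yield "You said: "
    yield input["prompt"]


reply = generate_response("hello", fake_run, "owner/name", temperature=0.5)
```

Passing the call in as `run` keeps the error handling and output handling testable without a token or a network connection.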

In [10]:
def generate_response(prompt_input):
    if not replicate_api:
        st.warning("Please enter a valid Replicate API token.")
        return None
    
    try:
        # API request to Replicate
        output = replicate.run(
            llama_3_1_model,
            input={
                "prompt": prompt_input,
                "temperature": temperature,
                "top_p": top_p,
                "max_length": max_length,
                "repetition_penalty": 1.0
            }
        )
        # The result may be a string, a list of chunks, or an iterator of chunks; always return one string
        return output if isinstance(output, str) else ''.join(str(chunk) for chunk in output)
    except Exception as e:
        st.error(f"Error calling Replicate API: {e}")
        return None

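# Make sure the chat history exists, then render it before handling new input
initialize_chat_history()
display_chat()

# User input for chat
prompt = st.chat_input("Enter your message here...")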

# Generate and display response if a prompt is entered
if prompt and replicate_api:
    # Add user input to chat history
    st.session_state.messages.append({"role": "user", "content": prompt})
    
    with st.chat_message("user"):
        st.write(prompt)

    with st.chat_message("assistant"):
        with st.spinner("Thinking..."):
            response = generate_response(prompt)
            if response:
                st.write(response)
                st.session_state.messages.append({"role": "assistant", "content": response})
            else:
                st.warning("Failed to generate a response. Please try again.")
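As written, only the latest message is sent to the model, so replies cannot refer to earlier turns of the conversation. One possible way to give the model context is to flatten the stored history into a single prompt string. This is a minimal sketch: the `build_prompt` name, the "User:" and "Assistant:" labels, and the default system line are illustrative assumptions, not a format the model requires.

```python
def build_prompt(messages, system="You are a helpful assistant."):
    """Flatten chat history into one prompt string that ends with an open assistant turn."""
    lines = [system, ""]
    for message in messages:
        speaker = "User" if message["role"] == "user" else "Assistant"
        lines.append(f"{speaker}: {message['content']}")
    lines.append("Assistant:")  # leave the last turn open for the model to complete
    return "\n".join(lines)


history = [
    {"role": "assistant", "content": "How may I assist you today?"},
    {"role": "user", "content": "Hi"},
]
full_prompt = build_prompt(history)
```

The resulting string could then be passed to the response function in place of the bare prompt. Longer conversations would also need some form of truncation, which this sketch does not attempt.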