# **LangChain Pandas Agent with Llama 3.1 8B**

**references:**
- [run streamlit in kaggle with ngrok](https://www.kaggle.com/code/amlanmohanty1/build-web-app-for-heart-disease-with-streamlit#Write-a-file-for-creating-web-app)
- [langchain agent with gemma:2b](https://www.youtube.com/watch?v=u3SGDvOVyO4)
- [work with ollama](https://stackoverflow.com/questions/78394289/running-ollama-on-kaggle)
- [ollama official website](https://ollama.com/library/llama3.1:8b)

## **1. pip install requirements**

In [None]:
pip install langchain langchain-experimental langchain-ollama openpyxl tabulate streamlit pyngrok

In [None]:
from pyngrok import conf, ngrok

In [None]:
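# install ollama with the official install script
!curl -fsSL https://ollama.com/install.sh | sh
import subprocess
# start the ollama server as a separate background process so the notebook stays usable
process = subprocess.Popen("ollama serve", shell=True)
# python client for ollama
!pip install ollama
import ollama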
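
Because `ollama serve` is started in the background, the next cell can run before the server is ready to accept requests. The sketch below polls until a probe succeeds. It assumes Ollama's default address `http://127.0.0.1:11434`; the helper names `server_is_up` and `wait_until` are made up for this example and are not part of any library.

```python
import time
import urllib.error
import urllib.request


def server_is_up(url: str = "http://127.0.0.1:11434") -> bool:
    """Return True if an HTTP GET to `url` answers with status 200.

    The default URL is an assumption: Ollama's default host and port.
    """
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until(probe, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Call `probe()` until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False


# in the notebook, before `ollama pull`:
# if not wait_until(server_is_up):
#     raise RuntimeError("ollama server did not start in time")
```

Passing the probe as an argument keeps the waiting loop independent of the network check, so the same loop could later wait for the Streamlit port as well.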

In [None]:
# download model
!ollama pull llama3.1:8b

## **2. write a file for creating web app**
### **2-1. import packages**
### **2-2. define some functions we need**
### **2-3. build streamlit web app**

**Security notice for `create_pandas_dataframe_agent`:**
```
This agent relies on access to a python repl tool which can execute arbitrary code. This can be dangerous and requires a specially sandboxed environment to be safely used. Failure to run this code in a properly sandboxed environment can lead to arbitrary code execution vulnerabilities, which can lead to data breaches, data loss, or other security incidents.

Do not use this code with untrusted inputs, with elevated permissions, or without consulting your security team about proper sandboxing!

You must opt-in to use this functionality by setting allow_dangerous_code=True.
```

In [None]:
%%writefile pandas_agent_app.py

# import packages required
import pandas as pd
import streamlit as st
from langchain_experimental.agents import create_pandas_dataframe_agent
from langchain_ollama import ChatOllama

# streamlit web app configuration
st.set_page_config(
    page_title="Chat with CSV/XLSX using Llama3.1",
    page_icon="🦙",
    layout="centered"
)

# read csv, xlsx files uploaded to streamlit web app
def read_tabular_data(file):
    # compare in lower case so `.CSV` and `.XLSX` are accepted too
    filename = file.name.lower()
    if filename.endswith(".csv"):
        return pd.read_csv(file)
    elif filename.endswith(".xlsx"):
        return pd.read_excel(file)
    else:
        raise ValueError("unsupported file type: expected a `.csv` or `.xlsx` file.")

# streamlit page title
st.title("💻 Tabular Data Chat ft. Langchain Agent, Llama3.1, Ollama")

# initialize chat history
if "chat_history" not in st.session_state:
    st.session_state.chat_history = []

# initialize tabular_data in session state
if "tabular_data" not in st.session_state:
    st.session_state.tabular_data = None

# file upload widget
uploaded_file = st.file_uploader("please upload your `.csv` or `.xlsx` file", 
                                 type=["csv", "xlsx"])
if uploaded_file:
    st.session_state.tabular_data = read_tabular_data(uploaded_file)
    st.write("📽️ Tabular Data Preview:")
    st.dataframe(st.session_state.tabular_data.head())

# display chat history
for message in st.session_state.chat_history:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

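# the agent needs a DataFrame, so wait here until a file has been uploaded
if st.session_state.tabular_data is None:
    st.info("upload a `.csv` or `.xlsx` file to start chatting.")
    st.stop()

# load llm (Llama3.1) using Ollama
llm = ChatOllama(model="llama3.1:8b", temperature=0.1)

# create a llm agent
tabular_agent = create_pandas_dataframe_agent(
    llm,
    st.session_state.tabular_data,
    agent_type="tool-calling",
    verbose=True,
    allow_dangerous_code=True,
    agent_executor_kwargs={"handle_parsing_errors": True}
)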

# define system prompt
system_prompt = "you are a helpful assistant. the user you are helping speaks Traditional Chinese. please answer in Traditional Chinese as well."

# user question
user_query = st.chat_input("ask something in your tabular data...")
if user_query:
    # display user question
    st.chat_message("user").markdown(user_query)
    # save user question into chat history
    st.session_state.chat_history.append({
        "role":"user",
        "content": user_query
    })
    # give the system prompt and chat history including the latest user question to llm agent
    messages = [
        {
            "role":"system", 
            "content": system_prompt
        },
        *st.session_state.chat_history
    ]
    # call llm agent to answer the user question
    response = tabular_agent.invoke(messages)
    llm_reply = response["output"]
    # save llm response into chat history
    st.session_state.chat_history.append({
        "role":"assistant", 
        "content": llm_reply
    })
    # display llm response
    with st.chat_message("assistant"):
        st.markdown(llm_reply)
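
The app sends the system prompt followed by the whole chat history on every turn, so the input grows with each question. A minimal sketch of the same message layout with a cap on how many past messages are kept is shown below. The helper `build_messages` and its `max_messages` parameter are hypothetical additions for illustration, and the sample history is invented.

```python
def build_messages(system_prompt, chat_history, max_messages=10):
    """Return the system message followed by the most recent chat messages.

    `max_messages` is a hypothetical cap; the app above sends the full history.
    """
    recent = chat_history[-max_messages:] if max_messages > 0 else []
    return [{"role": "system", "content": system_prompt}, *recent]


# invented sample history, in the same shape as st.session_state.chat_history
history = [
    {"role": "user", "content": "how many rows are there?"},
    {"role": "assistant", "content": "There are 150 rows."},
    {"role": "user", "content": "and how many columns?"},
]

messages = build_messages("you are a helpful assistant.", history, max_messages=2)
for message in messages:
    print(message["role"], "->", message["content"])
```

With `max_messages=2` the oldest user question is dropped, while the system message always stays first.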

## **3. run streamlit web app with ngrok**
**to keep the app running after the terminal is closed, start it with `nohup`:**

**code:**<br>
```!nohup streamlit run pandas_agent_app.py```

**explained by gpt:**<br>
The command `!nohup streamlit run pandas_agent_app.py` breaks down as follows:

- `!`: In notebook environments such as Jupyter Notebook or Kaggle Notebook, `!` runs the rest of the line as a system command.
- `nohup`: Short for "no hang up". It makes the command ignore the hangup signal, so it keeps running after the terminal is closed. When its output would otherwise go to a terminal, `nohup` appends it to a file called `nohup.out` unless you redirect it elsewhere.
- `streamlit run pandas_agent_app.py`: Starts a Streamlit application from the `pandas_agent_app.py` file written in section 2. Streamlit is an open-source framework for building data applications.

Taken together, the command starts the Streamlit app from a notebook cell in a way that survives the terminal or notebook session being closed. Note that `nohup` alone does not put the command in the background: the cell keeps running until the app stops, unless the command ends with `&`.
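
A roughly comparable effect can be had from Python with `subprocess.Popen`, which is how the Ollama server is started earlier in this notebook. The sketch below starts a command in its own session and sends its output to a log file. The helper `run_in_background` and the log file name are made up for illustration, and a harmless stand-in command is used so the example runs without Streamlit.

```python
import os
import subprocess
import sys
import tempfile


def run_in_background(cmd, log_path):
    """Start `cmd` in its own session and append its output to `log_path`."""
    with open(log_path, "ab") as log_file:
        process = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX only: detach from the notebook's session
        )
    # the child keeps its own handle on the log file, so the parent can close its copy
    return process


# stand-in command so the sketch runs anywhere;
# for the app it would be ["streamlit", "run", "pandas_agent_app.py"]
log_path = os.path.join(tempfile.mkdtemp(), "background.log")
demo = run_in_background(
    [sys.executable, "-c", "print('hello from the background')"], log_path
)
demo.wait()  # only for the demo; a real server process is left running
with open(log_path) as f:
    log_text = f.read()
print(log_text)
```

Unlike a blocking `!streamlit run ...` cell, this returns immediately, so later cells can still be executed while the process runs.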

In [None]:
ngrok_token = input("copy and paste your ngrok token here:") # keep it secret

if ngrok_token:
    conf.get_default().auth_token = ngrok_token
    conf.get_default().monitor_thread = False
    # reuse an existing tunnel if there is one, otherwise open one to streamlit's default port 8501
    tunnels = ngrok.get_tunnels(conf.get_default())
    if len(tunnels) == 0:
        tunnel = ngrok.connect(8501)
        print("url:", tunnel.public_url)
    else:
        print("url:", tunnels[0].public_url)
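
Since the token should stay secret, note that `input()` echoes what is typed into the cell output. One possible alternative, sketched below, reads the token from an environment variable and falls back to `getpass`, which hides the typed text. The helper `read_ngrok_token` is hypothetical, and using `NGROK_AUTHTOKEN` as the variable name is an assumption made for this example.

```python
import os
from getpass import getpass


def read_ngrok_token(env_var="NGROK_AUTHTOKEN", prompt=getpass):
    """Return the token from `env_var`, or ask for it without echoing it."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        token = prompt("paste your ngrok token here: ").strip()
    if not token:
        raise ValueError("no ngrok token provided")
    return token


# in the notebook, instead of the input() call:
# conf.get_default().auth_token = read_ngrok_token()
```

The `prompt` parameter exists only so the function can be exercised without a keyboard, for example by passing `lambda _: "test-token"`.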

In [None]:
import os
# `!` commands call get_ipython().system; route them through os.system instead
get_ipython().system = os.system

In [None]:
!streamlit run ./pandas_agent_app.py