# 🛠️<span style="color:#88dad3;"><strong>LLM Workshop 1</strong></span>: Building a Local Chatbot using Streamlit + LangChain + Ollama

This walkthrough will help you understand and build a simple **Local Chatbot App** using:

- 🐍 **Python**
- 💻 **Streamlit** — for the UI
- 🦙 **LangChain + Ollama** — to run LLMs locally
- 💬 **LLaMA3.2** (or any local model via Ollama)

---

## 🧩 Step-by-Step

### 🖼️ <span style="color:#88dad3;"> UI Initialization </span>

In [None]:
import streamlit as st

st.title("My Local Chatbot 🤭")

### 💾 <span style="color:#88dad3;"> Initializing Chat History </span>

We use `st.session_state` to store the conversation history between the user and the assistant.

- `session_state` is like memory for the app.
- We check if `"messages"` exists, and if not, we create an empty list.
- This makes sure our chatbot remembers past messages as we interact with it.

> 💡 Without this, the chat would reset every time we type!

In [None]:
if "messages" not in st.session_state:
    st.session_state.messages = []
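Streamlit re-executes the whole script from top to bottom on every interaction, which is why ordinary variables do not survive between messages. The sketch below imitates that behavior in plain Python, without Streamlit. `run_script` is a hypothetical stand-in for one run of the app script, and the `session_state` dict stands in for `st.session_state`.

```python
def run_script(session_state, user_input):
    """Hypothetical stand-in for one top-to-bottom run of the app script."""
    # A plain local variable is recreated on every run, so it forgets everything.
    local_messages = []
    local_messages.append(user_input)

    # The guarded initialization only creates the list when the key is missing.
    if "messages" not in session_state:
        session_state["messages"] = []
    session_state["messages"].append(user_input)

    return len(local_messages), len(session_state["messages"])


# One dict shared across runs plays the role of st.session_state.
session_state = {}
first_run = run_script(session_state, "Hello")
second_run = run_script(session_state, "How are you?")

print(first_run)   # (1, 1)
print(second_run)  # (1, 2)
```

The local list never grows past one entry, while the list kept in the shared state accumulates across runs.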


### 💬 Displaying Chat Messages

We loop through all messages stored in `st.session_state.messages` and display them using:

- `st.chat_message("user")` or `st.chat_message("assistant")` — shows the message in the correct "bubble"
- `st.markdown(message["content"])` — displays the text in a markdown-friendly way

Because Streamlit reruns the script on every interaction, this loop redraws the whole conversation so far each time.




In [None]:
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])


### 🙋‍♀️ Receiving User Input

This creates a chat box where the user can type their message.

- `st.chat_input("Write something")` lets the user type a message.
- If the user enters something, it's saved into the chat history.

🔍 Notes

- The dictionary `{"role": "user", "content": prompt}` keeps track of **who** sent the message.

In [None]:
if prompt := st.chat_input("Write something"):
    st.session_state.messages.append({"role": "user", "content": prompt})
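The shape of a history entry can be tried out without Streamlit. In this minimal sketch, `make_message` and `ALLOWED_ROLES` are hypothetical names introduced for illustration; they are not part of Streamlit or LangChain.

```python
ALLOWED_ROLES = {"user", "assistant"}  # the two roles used in this walkthrough


def make_message(role, content):
    """Build one chat-history entry in the {"role": ..., "content": ...} shape."""
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return {"role": role, "content": content}


messages = []
prompt = "What is Ollama?"  # stand-in for the value returned by st.chat_input

# Mirrors the walrus check: nothing is stored until there is a prompt.
if prompt:
    messages.append(make_message("user", prompt))

print(messages)
```

Keeping every entry in the same shape is what lets the display loop and the model call treat user and assistant messages uniformly.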


### 👤 Displaying User's Message 


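This shows the user’s latest message in a chat bubble right after they send it. In the assembled script, this block should sit inside the `if prompt := ...` check, so it only runs when there is a new message.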

In [None]:
with st.chat_message("user"):
    st.markdown(prompt)


### 🧠 Initialize the Assistant's LLM with Ollama

- We import `ChatOllama` from `langchain_ollama` to connect with our local LLM.
- Inside the assistant’s chat message block, we:
  - Select the local model (`llama3.2`).
  - Create an instance of the language model with `temperature=0.7`, which controls how varied the responses are.

LangChain’s Ollama integration lets us run local LLMs such as `llama3.2`. A temperature of 0.7 allows more varied output while keeping it coherent.



<details>
<summary style="font-size: 1.1em; font-weight: 600; color:rgb(23, 114, 119); cursor: pointer;">
  ℹ️ <span>What are langchain_ollama and ChatOllama?</span>
</summary>

`langchain_ollama` is a LangChain integration that lets you easily connect to local LLMs powered by Ollama.

- `ChatOllama` is the class that wraps your local model for chat-based interactions.
- It handles sending messages, streaming responses, and configuring parameters like temperature.

This makes it simple to plug your local LLM into a chatbot or other LangChain-powered applications.
</details>

<details>
<summary style="font-size: 1.1em; font-weight: 600; color:rgb(24, 129, 156); cursor: pointer;">
  🌡️ <span >Understanding Temperature in Language Models</span>
</summary>

The **temperature** parameter controls how random or creative the model's responses are:

- <span style="color:green; font-weight:bold;">Low temperature (0.0–0.3):</span><br>  
  More deterministic, precise, and predictable.<br>  
  Great for factual or technical tasks.

- <span style="color:orange; font-weight:bold;">Medium temperature (0.5–0.7):</span><br>  
  Balanced creativity and accuracy.<br>  
  Ideal for general chatting, summarizing, and coding.

- <span style="color:red; font-weight:bold;">High temperature (0.8–1.0+):</span><br>  
  More creative and unpredictable.<br>  
  Perfect for brainstorming and storytelling.

Adjusting temperature helps you control how “safe” or “wild” the generated answers feel.
</details>
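To make the effect of temperature concrete, here is a small sketch of a temperature-scaled softmax, a common way temperature is applied to a model's raw token scores. The logits are made-up numbers for illustration, and the way Ollama applies temperature internally may differ in detail.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature.

    This sketch assumes temperature > 0; a value of 0 would divide by zero.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lowest temperature almost all of the probability lands on the top-scoring token, while higher temperatures spread it across the alternatives.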

In [None]:
from langchain_ollama import ChatOllama

with st.chat_message("assistant"):
    local_model = "llama3.2"
    llm = ChatOllama(model=local_model, temperature=0.7)

### 📤 Stream the Assistant's Response


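- We send the entire chat history (`st.session_state.messages`) to the language model as input.
- The `.stream()` method of `ChatOllama` yields the assistant’s reply **incrementally**, chunk by chunk, as the model generates it.
- `st.write_stream(stream)` displays each chunk as it arrives and returns the complete response text once the stream ends.
- In the assembled script, these lines stay inside the `with st.chat_message("assistant"):` block so the reply renders in the assistant’s bubble.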

> 💡 Streaming makes the bot feel more interactive and responsive compared to waiting for the full answer.


In [None]:
stream = llm.stream(
    input=[
        {"role": m["role"], "content": m["content"]}
        for m in st.session_state.messages
    ]
)
response = st.write_stream(stream)
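The streaming pattern can be tried without a model. In this sketch, `fake_stream` is a hypothetical generator standing in for `llm.stream(...)`, and `show_stream` imitates the one behavior the code above relies on: chunks are shown as they arrive, and the full text is returned at the end.

```python
import time


def fake_stream(text, delay=0.0):
    """Hypothetical stand-in for llm.stream(): yields the reply word by word."""
    for word in text.split(" "):
        time.sleep(delay)  # a real model takes time to produce each chunk
        yield word + " "


def show_stream(stream):
    """Print chunks as they arrive and return the concatenated text."""
    parts = []
    for chunk in stream:
        print(chunk, end="", flush=True)
        parts.append(chunk)
    print()
    return "".join(parts)


response = show_stream(fake_stream("Hello from a local model"))
```

Raising `delay` (for example to `0.2`) makes the incremental display visible in a terminal.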

### 🗃️ Saving the Assistant’s Response

- After receiving the assistant’s reply, we add it to the chat history.
- This keeps the conversation **persistent** for the rest of the browser session, so the full chat is redrawn on every rerun.
- Because the whole history is sent to the model on each turn, the saved reply also becomes context for the next answer.

> ✅ This step is essential to maintain the conversation flow and context throughout the chat session.

In [None]:

st.session_state.messages.append({"role": "assistant", "content": response})
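A stub model shows why saving the reply matters. `echo_model` below is a hypothetical placeholder that only reports how many messages it was given; a real model would use that history as context. `take_turn` is likewise a hypothetical helper, not part of the app.

```python
def echo_model(history):
    """Hypothetical stub: reports how much history the model can see."""
    return f"I can see {len(history)} message(s)."


def take_turn(history, prompt, save_reply=True):
    """Run one user/assistant exchange against the shared history list."""
    history.append({"role": "user", "content": prompt})
    reply = echo_model(history)
    if save_reply:
        history.append({"role": "assistant", "content": reply})
    return reply


history = []
take_turn(history, "Hi")
second_reply = take_turn(history, "What did I just say?")

print(second_reply)                   # I can see 3 message(s).
print([m["role"] for m in history])   # ['user', 'assistant', 'user', 'assistant']
```

With `save_reply=False`, the assistant's earlier answers would be missing from the history, so the model would only ever see the user's side of the conversation.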

In [None]:
! streamlit run ./03_chatbot.py



  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://192.0.0.2:8501

  For better performance, install the Watchdog module:

  $ xcode-select --install
  $ pip install watchdog
