**EL HARKAOUI Chaymae -- 01-03-2025**

**<u><h1 style=color:red>LangChain</h1></u>**

**<u><h3 style=color:magenta>What is LangChain?</h3></u>**

**LangChain** is a <font color=gray>-- Python - JavaScript -</font> **Framework** designed to help build **applications powered by LLMs** such as **OpenAI's GPT** - **Google's Gemini** - **Meta's LLaMA** - and other similar models. <br>


**LangChain** allows developers to :<br>

- **Integrate** LLMs with **external data** -- APIs - databases - documents ...
- Implement **memory** in chat applications - so conversations are stateful
- Use **chains of LLM calls** to enhance reasoning and problem-solving.
- **Combine** different models - tools - agents for complex tasks.

**LangChain** makes it easy to build **AI-powered applications** like chatbots - document summarizers - question-answering systems - autonomous agents ...

LangChain was created in **October 2022** by **Harrison Chase** as an open-source framework to simplify working with LLMs. Initially, it focused on providing **wrappers for LLMs and prompt templates**, but as AI adoption surged, especially after ChatGPT’s launch, it rapidly evolved into a **full ecosystem**.<br> 

By early **2023** - LangChain introduced **memory - chains - agent capabilities** - enabling chatbots - AI assistants - automation workflows. Its integration with **vector databases** like FAISS and Pinecone and its support for **multiple LLM providers** - OpenAI - Hugging Face - Cohere ... made it widely adopted for **retrieval-augmented generation (RAG) and enterprise AI applications**.<br> 

By **2024** - LangChain had become a **leading AI development framework**, allowing developers to build **scalable - intelligent AI-driven applications** with ease. Its future promises **more advanced AI agents - better on-premise model support - improved enterprise scalability** - solidifying its role in the **next generation of AI-powered software**.

![LangChain](Images/langchain.png)

**<u><h3 style=color:magenta>Key Concepts in LangChain</h3></u>**

LangChain is built on five main **pillars** :<br>

**<font color=blue>LLM Wrappers</font>** --- Easily interact with models like OpenAI’s GPT - Claude - LLaMA ...<br>
**<font color=blue>Prompt Management</font>** --- Helps with prompt engineering and formatting.<br>
**<font color=blue>Memory</font>** --- Maintains context across multiple interactions.<br>
**<font color=blue>Data Connectivity</font>** --- Retrieves and processes structured or unstructured data.<br>
**<font color=blue>Agents & Chains</font>** --- Enables AI systems to make decisions and take actions.

**<u><h3 style=color:magenta>Core Modules in LangChain</h3></u>**


**<font color=navy>1 - Language Model Wrappers</font>** <br>
**<font color=navy>2 - Prompt Templates</font>**<br>
**<font color=navy>3 - Chains</font>**<br>
**<font color=navy>4 - Memory</font>** <br>
**<font color=navy>5 - Agents</font>**


<hr style=color:olive></hr>

**<u><h3 style=color:olive>LLM Wrappers</h3></u>**

**LLM Wrappers** provide a **unified interface** for **interacting** with various **language models** - OpenAI GPT - Anthropic Claude - Cohere - Hugging Face models ...

**<h5 style=color:navy>Why use it?</h5>**

- Allow seamless **switching** between different models
- **Standardize** API calls and response handling
- **Reduce boilerplate code** when working with multiple LLM providers

**<h5 style=color:navy>Example without LangChain ❌</h5>**

To call OpenAI’s GPT model directly - we have to write custom API calls.<br><br>

**Problems**

- Requires **manual** API handling
- Switching between models requires **rewriting code**

In [None]:
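import os
import openai

# Read the OpenAI API key from the environment; never hardcode a real key in a notebook
openai.api_key = os.environ["OPENAI_API_KEY"]

# Call the Chat Completions API (gpt-3.5-turbo is a chat model, so it takes messages rather than a prompt)
response = openai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is LangChain?"}],
    max_tokens=100,
)

# Print the response
print(response.choices[0].message.content.strip())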

**<h5 style=color:navy>Example with LangChain ✅</h5>**

**Advantages** 
- Less boilerplate code
- Easily interchangeable models
- Integrated with other LangChain tools

In [None]:
from langchain_community.chat_models import ChatOpenAI

# Initialize ChatOpenAI; the API key is read from the OPENAI_API_KEY environment variable
llm = ChatOpenAI(model="gpt-3.5-turbo")

# Prepare the message 
messages = [ {"role": "user", "content": "What is LangChain?"} ]

# Invoke the model 
response = llm.invoke(messages)

# Print the response
print(response)
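
To show what a "unified interface" buys us, here is a minimal library-free sketch. It is not how LangChain is implemented: `ChatModel`, `EchoModel`, `ShoutModel` and `ask` are hypothetical names invented for this illustration, and the two fake models stand in for real provider clients.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Anything exposing invoke(prompt) -> str counts as a model here."""

    def invoke(self, prompt: str) -> str: ...


class EchoModel:
    """Hypothetical stand-in for one provider's client."""

    def invoke(self, prompt: str) -> str:
        return f"echo: {prompt}"


class ShoutModel:
    """Hypothetical stand-in for a second provider with different behavior."""

    def invoke(self, prompt: str) -> str:
        return prompt.upper()


def ask(model: ChatModel, question: str) -> str:
    # Application code depends only on the shared interface,
    # so swapping providers does not require rewriting it.
    return model.invoke(question)


answers = [ask(m, "What is LangChain?") for m in (EchoModel(), ShoutModel())]
print(answers)
```

Because `ask` only relies on `invoke`, changing the model is a one-line change at the call site, which is the property the wrapper classes aim to provide.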

<hr style=color:olive></hr>

**<u><h3 style=color:olive>Prompt Templates</h3></u>**

**Prompt Templates** allow us to dynamically **format** and manage **prompts** in a **structured way**.

**<h5 style=color:navy>Why use it?</h5>**

- **Prevent** redundant prompt writing
- Ensure **consistent structure** in requests
- Allow **parameterized inputs** for efficiency

**<h5 style=color:navy>Example without LangChain ❌</h5>**

**Problems**

- Hardcoded prompts
- Difficult to scale for multiple topics

In [None]:
user_input = "Quantum Computing"

prompt = f"Explain {user_input} in simple terms."

# openai>=1.0 interface; the older openai.ChatCompletion.create call was removed
response = openai.chat.completions.create( model="gpt-4", messages=[{"role": "user", "content": prompt}] )

print(response.choices[0].message.content)

**<h5 style=color:navy>Example with LangChain ✅</h5>**

**Advantages** 
- Reusability for different inputs
- Standardized prompt structures
- Easy modifications and scalability

In [None]:
from langchain.prompts import PromptTemplate

template = PromptTemplate( input_variables=["topic"], template="Explain {topic} in simple terms.")

formatted_prompt = template.format(topic="Quantum Computing")

print(formatted_prompt)

Explain Quantum Computing in simple terms.
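
As a rough picture of what "parameterized inputs" means, the sketch below builds a tiny template class in plain Python. It is an illustration under our own assumptions, not LangChain's implementation: `SimplePromptTemplate` is a hypothetical name, and it only handles simple `{name}` placeholders.

```python
from string import Formatter


class SimplePromptTemplate:
    """Toy template: a format string plus the variable names it expects."""

    def __init__(self, template: str):
        self.template = template
        # Collect the {placeholders} that appear in the template text.
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, **values: str) -> str:
        # Fail early with a clear message when a variable is missing.
        missing = [v for v in self.input_variables if v not in values]
        if missing:
            raise KeyError(f"missing template variables: {missing}")
        return self.template.format(**values)


simple = SimplePromptTemplate("Explain {topic} in simple terms.")
prompt_text = simple.format(topic="Quantum Computing")
print(prompt_text)
```

The same template object can then be reused for any topic, and a forgotten variable is reported before any model call is made.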


<hr style=color:olive></hr>

**<u><h3 style=color:olive>Chains</h3></u>**

In LangChain - a **chain** means a **sequence of steps** that process **input** and generate **output**.<br>

A **basic** chain **links** a **prompt template** to an **LLM** to form a **structured pipeline** for **generating responses**.<br>

A **complex** chain can **combine multiple steps** - like retrieving data - applying logic - using different models...<br>

Think of it like a conveyor belt : **User input query** → **Prompt Formatting** → **LLM Processing** → **Output Generation**<br>

**Chains** allow you to **<font color=purple>connect multiple components</font>** - LLMs - memory - tools ... into a **<font color=purple>single workflow</font>**.

**<h5 style=color:navy>Example without LangChain ❌</h5>**

To take a user input - format a prompt - get an LLM response - we have to code every step ourselves.<br><br>

**Problems**

- Each step - formatting → sending → retrieving - is manually coded
- Hard to extend for complex workflows

In [None]:
user_input = "Neural Networks"

prompt = f"Explain {user_input} in simple terms."

# openai>=1.0 interface; the older openai.ChatCompletion.create call was removed
response = openai.chat.completions.create( model="gpt-4", messages=[{"role": "user", "content": prompt}] )

print(response.choices[0].message.content)

**<h5 style=color:navy>Example with LangChain ✅</h5>**

**Advantages** 
-  Automates chaining of prompts and responses
- Easily extendable with memory - tools and multiple steps
- Cleaner - reusable code

In [None]:
from langchain.chains import LLMChain
from langchain_community.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

# gpt-4 is a chat model, so it needs the chat wrapper rather than the completion-style OpenAI class
# The API key is read from the OPENAI_API_KEY environment variable
llm = ChatOpenAI(model="gpt-4")

template = PromptTemplate( input_variables=["topic"], template="Explain {topic} in simple terms." )

# A chain that connects a language model with a structured prompt
chain = LLMChain(llm=llm, prompt=template)

response = chain.run("Neural Networks")

print(response)
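
The conveyor-belt picture of a chain can be sketched without any library as plain function composition. This is only an illustration: `make_chain`, `format_prompt`, `fake_llm` and `clean_output` are hypothetical names, and `fake_llm` stands in for a real model call.

```python
from functools import reduce
from typing import Callable

Step = Callable[[str], str]


def make_chain(*steps: Step) -> Step:
    """Compose steps left to right: each output becomes the next input."""

    def run(value: str) -> str:
        return reduce(lambda acc, step: step(acc), steps, value)

    return run


def format_prompt(topic: str) -> str:
    return f"Explain {topic} in simple terms."


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    return f"[model answer to: {prompt}]"


def clean_output(text: str) -> str:
    # Post-processing step: drop the surrounding brackets.
    return text.strip("[]")


pipeline = make_chain(format_prompt, fake_llm, clean_output)
result = pipeline("Neural Networks")
print(result)
```

Adding a step (retrieval, validation, a second model) means adding one more function to `make_chain`, which is the extensibility argument made for chains.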

<hr style=color:olive></hr>

**<u><h3 style=color:olive>Memory</h3></u>**

**Memory** allows LLMs to remember **previous conversations** and maintain context.

**<h5 style=color:navy>Why use it?</h5>**

- Enables stateful conversations
- Avoids repetition in chatbot applications
- Makes LLMs behave more like a human assistant

**<h5 style=color:navy>Example without LangChain ❌</h5>**

Each message must contain context **manually**.<br><br>

**Problems**

- We must manually track the conversation history
- Becomes inefficient for long interactions

In [None]:
# This list defines a structured conversation history for the chatbot
# Each dictionary in the list represents a message in the conversation

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help you?"},
    {"role": "user", "content": "What was my first message?"}
]

# openai>=1.0 interface; the older openai.ChatCompletion.create call was removed
response = openai.chat.completions.create( model="gpt-4", messages=messages )

print(response.choices[0].message.content)

**<h5 style=color:navy>Example with LangChain ✅</h5>**

**Advantages** 
-  Automatic conversation tracking
- Supports long-term memory
- Scalable for chatbots and personal assistants

In [None]:
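from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
from langchain_community.chat_models import ChatOpenAI

# gpt-4 is a chat model, so it needs the chat wrapper rather than the completion-style OpenAI class
# The API key is read from the OPENAI_API_KEY environment variable
llm = ChatOpenAI(model="gpt-4")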

memory = ConversationBufferMemory()

conversation = ConversationChain(llm=llm, memory=memory)

print(conversation.run("Hello!"))
print(conversation.run("What was my first message?"))
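
A buffer memory is conceptually simple: store every turn and replay the transcript in front of each new message. The sketch below shows that idea in plain Python. `BufferMemory`, `converse` and `fake_llm` are hypothetical names for this illustration, and the fake model only reports how much context it received.

```python
class BufferMemory:
    """Keeps every turn and replays the whole transcript on each call."""

    def __init__(self):
        self.turns = []  # list of (role, text) pairs

    def as_context(self) -> str:
        return "\n".join(f"{role}: {text}" for role, text in self.turns)

    def save(self, user_text: str, ai_text: str) -> None:
        self.turns.append(("Human", user_text))
        self.turns.append(("AI", ai_text))


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in: reports how many prompt lines it was given.
    return f"I received {len(prompt.splitlines())} line(s) of prompt."


def converse(memory: BufferMemory, user_text: str) -> str:
    history = memory.as_context()
    # Prepend the stored transcript so the model sees earlier turns.
    prompt = f"{history}\nHuman: {user_text}" if history else f"Human: {user_text}"
    reply = fake_llm(prompt)
    memory.save(user_text, reply)
    return reply


memory = BufferMemory()
first = converse(memory, "Hello!")
second = converse(memory, "What was my first message?")
print(first)
print(second)
```

The second call sees three prompt lines instead of one, which also shows the cost of this approach: the prompt grows with every turn.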


<hr style=color:olive></hr>

**<u><h3 style=color:olive>Agents</h3></u>**

**Agents** allow an **LLM** to **interact** with **external tools** and dynamically decide which tool to use.

**<h5 style=color:navy>Why use it?</h5>**

- Enables AI-driven decision-making
- Connects LLMs with APIs - databases - web scraping
- Reduces the need for hardcoded responses

**<h5 style=color:navy>Example without LangChain ❌</h5>**

**Problems**

- Manually coded logic for each tool
- Hard to scale with multiple tools

In [None]:
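def get_weather(city):
    return f"The weather in {city} is sunny."

query = "What's the weather in Paris?"

# Hand-written routing: every new tool needs another branch like this one
if "weather" in query:
    response = get_weather("Paris")
else:
    response = "Sorry, I can't help with that."

print(response)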

**<h5 style=color:navy>Example with LangChain ✅</h5>**

**Advantages** 
- Dynamically selects the correct tool
- Automates AI-driven decision-making
- Scalable for multiple tools - web search - APIs

In [None]:
from langchain.agents import initialize_agent, AgentType
from langchain.tools import Tool

def get_weather(city):
    return f"The weather in {city} is sunny."

weather_tool = Tool(
    name="WeatherAPI",
    func=get_weather,
    description="Fetches weather data for a given city."
)

agent = initialize_agent(
    tools=[weather_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

response = agent.run("What's the weather in Paris?")
print(response)
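
To make "dynamically decide which tool to use" more concrete, here is a toy routing loop in plain Python. It is a sketch under strong assumptions: in a real agent the LLM reads the tool descriptions and picks one, whereas here `choose_tool` imitates that decision with simple word overlap. `SimpleTool`, `get_time`, `choose_tool` and `run_agent` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Callable

STOPWORDS = {"a", "the", "for", "in", "is", "it", "what"}


@dataclass
class SimpleTool:
    name: str
    func: Callable[[str], str]
    description: str


def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny."


def get_time(city: str) -> str:
    # Hypothetical second tool, added only to make the choice non-trivial.
    return f"It is noon in {city}."


TOOLS = [
    SimpleTool("WeatherAPI", get_weather, "Fetches weather data for a given city."),
    SimpleTool("ClockAPI", get_time, "Fetches the local time for a given city."),
]


def tokens(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def choose_tool(query: str) -> SimpleTool:
    # Stand-in for the LLM's decision: pick the tool whose description
    # shares the most meaningful words with the query.
    return max(TOOLS, key=lambda t: len(tokens(query) & tokens(t.description)))


def run_agent(query: str, city: str) -> str:
    return choose_tool(query).func(city)


print(run_agent("What's the weather in Paris?", "Paris"))
print(run_agent("What time is it in Paris?", "Paris"))
```

Adding a tool only requires a new entry in `TOOLS` with a good description; no extra `if` branch is needed, which is the scaling benefit claimed for agents.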


<hr style=color:olive></hr>

**<u><h3 style=color:olive>Indexes</h3></u>**

In LangChain - an **index** is a **structured way** to **store - organize - retrieve information** efficiently when working with **large datasets** or **document collections**. <br>

Indexes are especially useful for retrieval-augmented generation (RAG) - where an LLM **fetches** relevant **information** before **generating a response**.

**<h5 style=color:navy>Why use it?</h5>**

- Efficiently **search large datasets** instead of scanning everything
- Improve response accuracy by **retrieving relevant documents** before calling an LLM
- Handle **knowledge retrieval** for applications like chatbots - question answering - document search ...

**<h5 style=color:navy>Types of Indexes in LangChain</h5>**

- **Vector Index** - Embedding-Based
- **Keyword Index** - Text-Based
- **Structured Index** - SQL-Based


**<h5 style=color:teal>Vector Index</h5>**

It converts **text** into **vector embeddings** and **stores** them in a **vector database** like FAISS - Pinecone - Weaviate - ChromaDB ...<br>
It uses **similarity search** to find the most relevant data points.<br>
It is ideal for semantic search - Q&A - knowledge retrieval.

In [None]:
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

# Create an embedding model
embedding_model = OpenAIEmbeddings()

# Sample documents
documents = ["LangChain helps build LLM applications.", "Vector databases store embeddings for search."]

# Convert documents into a FAISS vector index
vector_index = FAISS.from_texts(documents, embedding_model)

# Retrieve relevant information
query = "What does LangChain do?"
similar_docs = vector_index.similarity_search(query)

print(similar_docs[0].page_content)
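
The mechanics of similarity search can be seen in a few lines of plain Python. In this sketch the "embedding" is just a bag of word counts, which is a deliberate simplification (real embedding models return dense vectors that capture meaning); `embed`, `cosine` and `similarity_search` are hypothetical helper names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": word counts instead of a learned dense vector.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


documents = [
    "LangChain helps build LLM applications.",
    "Vector databases store embeddings for search.",
]

# The "index" is each document stored next to its vector.
index = [(doc, embed(doc)) for doc in documents]


def similarity_search(query: str, k: int = 1) -> list:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


top = similarity_search("What does LangChain do?")
print(top[0])
```

A vector database does the same ranking, but with learned embeddings and approximate nearest-neighbor structures so that it scales beyond a linear scan.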

**<h5 style=color:teal>Keyword Index</h5>**

It **stores documents** in a simple **text-based format**.<br>
It uses **traditional keyword matching** for **search**.<br>
It is best for smaller datasets where full-text search is enough.

In [None]:
from langchain.indexes import VectorstoreIndexCreator
from langchain.document_loaders import TextLoader

# Load documents from a text file
loader = TextLoader("documents.txt")

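# Note: VectorstoreIndexCreator builds an embedding-based vector store index, not a keyword index;
# it is used here as a quick way to index a file and query it
index = VectorstoreIndexCreator().from_loaders([loader])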

# Query the index
response = index.query("What is LangChain?")
print(response)
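
To make traditional keyword matching concrete, here is a minimal inverted-index sketch in plain Python. It is an illustration only and does not use any LangChain class: `KeywordIndex` and `tokenize` are hypothetical names, the sample documents are made up, and ranking is a simple count of matching query words.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class KeywordIndex:
    """Maps each word to the ids of the documents that contain it."""

    def __init__(self, documents):
        self.documents = list(documents)
        self.postings = defaultdict(set)
        for doc_id, text in enumerate(self.documents):
            for word in tokenize(text):
                self.postings[word].add(doc_id)

    def search(self, query: str) -> list:
        # Rank documents by how many query words they contain (exact matches only).
        hits = defaultdict(int)
        for word in tokenize(query):
            for doc_id in self.postings.get(word, ()):
                hits[doc_id] += 1
        ranked = sorted(hits, key=lambda d: (-hits[d], d))
        return [self.documents[d] for d in ranked]


kw_index = KeywordIndex([
    "LangChain helps build LLM applications.",
    "Keyword search matches exact words.",
    "LangChain supports keyword and vector search.",
])
matches = kw_index.search("keyword search")
print(matches)
```

Unlike the vector approach, a query such as "embeddings" returns nothing here because no document contains that exact word, which is the main limitation of keyword indexes.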

**<h5 style=color:teal>Structured Index</h5>**

It uses **structured databases** to store and retrieve information.<br>
It is useful for retrieving structured data like user profiles - transactions - logs ...

In [None]:
from langchain.sql_database import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain  # moved out of langchain.chains in newer releases
from langchain.llms import OpenAI

# Connect to an SQL database
db = SQLDatabase.from_uri("sqlite:///my_database.db")

# Create a query chain (from_llm replaces direct construction with llm=... and database=...)
chain = SQLDatabaseChain.from_llm(OpenAI(), db)

# Ask a question that requires structured retrieval
response = chain.run("How many users signed up in January?")
print(response)
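
Under the hood, structured retrieval comes down to running SQL against a database. The sketch below uses only the standard-library `sqlite3` module with an in-memory database; the `users` table, its columns and its rows are made up for this illustration, and `generated_sql` is just an example of the kind of query a model might produce for the question above.

```python
import sqlite3

# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, signup_date TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ana", "2025-01-05"), ("ben", "2025-01-20"), ("cleo", "2025-02-02")],
)

# The kind of SQL an LLM might generate for
# "How many users signed up in January?"
generated_sql = (
    "SELECT COUNT(*) FROM users WHERE strftime('%m', signup_date) = '01'"
)

(january_signups,) = conn.execute(generated_sql).fetchone()
print(january_signups)
```

A SQL chain adds two steps around this: turning the natural-language question into the query, and turning the numeric result back into a sentence.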

<hr style=color:olive></hr>

**<u><h3 style=color:magenta>Example -- Chat Application with Streamlit</h3></u>**

In [1]:
# Import necessary dependencies
import os  # Used to set environment variables
import streamlit as st  # Streamlit for creating the web app
from langchain.llms import OpenAI  # OpenAI's language model integration
from langchain.prompts import PromptTemplate  # For structuring prompts
from langchain.chains import LLMChain, SequentialChain  # Chains for managing model interactions
from langchain.memory import ConversationBufferMemory  # Memory to store conversation history
from langchain.utilities import WikipediaAPIWrapper  # Wikipedia API for fetching relevant data

# Read the OpenAI API key from the environment instead of hardcoding it in the source
if not os.environ.get('OPENAI_API_KEY'):
    raise RuntimeError('Set the OPENAI_API_KEY environment variable before running the app')


# Create the Streamlit app interface
st.title('🦜🔗 YouTube GPT Creator')  # App title
prompt = st.text_input('Plug in your prompt here')  # User input field


# Define Prompt Templates
# Template for generating a YouTube video title based on a given topic
title_template = PromptTemplate(
    input_variables=['topic'],  # The expected input variable
    template='write me a youtube video title about {topic}')


# Template for generating a YouTube video script using Wikipedia research
script_template = PromptTemplate(
    input_variables=['title', 'wikipedia_research'],  # Uses title and research data
    template='write me a youtube video script based on this title TITLE: {title} while leveraging this wikipedia research: {wikipedia_research}')


# Define memory buffers to store conversation history
title_memory = ConversationBufferMemory(input_key='topic', memory_key='chat_history')  # Memory for title generation
script_memory = ConversationBufferMemory(input_key='title', memory_key='chat_history')  # Memory for script generation


# Initialize OpenAI language model with a specific temperature setting
llm = OpenAI(temperature=0.9)  # Higher temperature makes responses more creative


# Create LLM chains for title and script generation
title_chain = LLMChain(llm=llm, prompt=title_template, verbose=True, output_key='title', memory=title_memory)
script_chain = LLMChain(llm=llm, prompt=script_template, verbose=True, output_key='script', memory=script_memory)


# Initialize Wikipedia API wrapper to fetch related content
wiki = WikipediaAPIWrapper()


# Check if the user has entered a prompt
if prompt:
    # Generate a video title using the input prompt
    title = title_chain.run(prompt)

    # Fetch related information from Wikipedia
    wiki_research = wiki.run(prompt)

    # Generate a video script using the title and Wikipedia research
    script = script_chain.run(title=title, wikipedia_research=wiki_research)

    # Display the generated title
    st.write(title)

    # Display the generated script
    st.write(script)

    # Expandable sections for viewing conversation history
    with st.expander('Title History'):
        st.info(title_memory.buffer)  # Show past generated titles

    with st.expander('Script History'):
        st.info(script_memory.buffer)  # Show past generated scripts

    with st.expander('Wikipedia Research'):
        st.info(wiki_research)  # Show Wikipedia research used for the script

Note: executing this cell inside the notebook only produces Streamlit warnings, including `Session state does not function when running a script without streamlit run`. Save the code as a Python script and launch it from a terminal with the `streamlit run` command.
