# Intro to LangChain
LangChain is a popular framework for quickly building applications and pipelines around Large Language Models (LLMs). You can use it to create chatbots, retrieval-augmented generation (RAG) systems, agents, and much more.

The main idea of the library is that we can combine different components into a _chain_ to build more complex applications. These _chains_ (you can think of them as pipelines) can be made up of various components such as:
- **Prompt templates**: Prompt templates are reusable patterns for generating different types of prompts, such as chat prompts and question-answering prompts.
- **LLMs**: Large Language Models are the core of LangChain. You can use any LLM that is compatible with the library, such as models from OpenAI, Hugging Face, or the Llama family.
- **Tools**: Tools are functions that can be used by the LLM to perform specific tasks. For example, you can use a tool to search the web, or to access a database.
- **Agents**: Agents are components that can use LLMs and tools to perform specific tasks. They can be used to create chatbots, **R**etrieval-**A**ugmented **G**eneration (RAG) systems, etc.
- **Retrievers**: Retrievers are components that retrieve information from a database or a knowledge base. They can be used to create RAG systems.
- **Memory**: Memory is a component that can be used to store information about the conversation. It can be used to create chatbots that can remember previous conversations, or to create RAGs that can remember previous queries.

## Using LLMs in LangChain

LangChain supports a wide range of providers for LLMs, including OpenAI, Hugging Face, Groq, Llama and many others.

Let's start our exploration of LangChain by using the Groq integration.

### Groq Integration
Groq is a provider of LLMs that offers high-performance inference capabilities. To use Groq with LangChain, you need to set up your API key in the `.env` file. Follow the steps in the README.md file to set up your environment.

In [1]:
from dotenv import load_dotenv
import warnings
from langchain_groq import ChatGroq
from langchain_core.prompts import PromptTemplate

#### Load Credentials from .env file

In [2]:
load_dotenv()

True
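The `True` above only tells us that `load_dotenv()` loaded something; it does not confirm that the Groq key itself is present. `ChatGroq` reads the key from the `GROQ_API_KEY` environment variable by default, so a quick check before creating the model can save a confusing error later. The helper below is a minimal sketch, and `missing_keys` is a hypothetical name introduced here for illustration, not part of LangChain or python-dotenv.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# GROQ_API_KEY is the variable that langchain_groq reads by default.
missing = missing_keys(["GROQ_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
else:
    print("All required environment variables are set.")
```

The optional `env` argument exists only so the helper can be tried against a plain dictionary without touching the real environment.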

#### Defining the LLM (Using Groq)

We can define the LLM using the [`ChatGroq`](https://python.langchain.com/docs/integrations/chat/groq/) class from the `langchain_groq` module. 
This class allows us to specify:
+ the model - below we use `llama-3.1-8b-instant`
+ the temperature - we set it to `0.1` for more deterministic responses
+ the maximum tokens - we set it to `512` to limit the response length

In [3]:
llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.1,
    max_tokens=512,
)
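To see why a low temperature such as `0.1` gives more deterministic responses, it helps to look at the textbook formulation: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The sketch below uses made-up scores for three candidate tokens and plain Python; it illustrates the general idea and is not Groq's actual sampling code, which may apply additional steps.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (1.0, 0.1):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

At temperature `1.0` the probability is spread across the three candidates, while at `0.1` almost all of it falls on the highest-scoring token, so repeated runs tend to pick the same continuation.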

#### Build prompt template
A prompt is a set of instructions or input provided by a user to an LLM to guide its response. It helps the model understand the context and generate relevant output. In LangChain, we can create a prompt template using the `PromptTemplate` class.

In [13]:
# Example: prompt template for a nutrition assistant

template_na = """You are a nutrition assistant.

Task: Estimate calories and macros for the meal described.
If details are missing, ask up to 3 clarifying questions.

Meal: {meal}

Return in this format:
- Estimate: calories, protein_g, carbs_g, fat_g
- Assumptions:
- Clarifying questions:
"""
prompt_na = PromptTemplate(template=template_na, input_variables=["meal"])

In [4]:
template = """Question: {question}

Answer: """
prompt = PromptTemplate(template=template, input_variables=["question"])

The `input_variables` correspond to the placeholders written in the template with curly braces, such as `{question}`. This allows us to dynamically insert values into the template when we use it.
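The substitution itself can be illustrated without LangChain. By default `PromptTemplate` uses Python's f-string-style placeholders, so the plain-Python sketch below behaves similarly: it lists the placeholder names found in a template and then fills them in. The names `demo_template` and `found_variables` are introduced here for illustration only.

```python
from string import Formatter

demo_template = """Question: {question}

Answer: """

# Collect the placeholder names that appear inside curly braces.
found_variables = [
    name for _, name, _, _ in Formatter().parse(demo_template) if name
]
print(found_variables)

# Fill the placeholder to produce the final prompt text.
print(demo_template.format(question="What is the backpropagation algorithm?"))
```

In the notebook, calling `prompt.format(question=...)` on the `PromptTemplate` object should produce the same kind of filled-in string, which is a handy way to inspect a prompt before sending it to the model.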

#### Define Chain
A chain is a sequence of components that are executed in order to produce a final output. In LangChain, we can use the pipe symbol `|` to define a chain of components. The output of one component is passed as input to the next component in the chain.

In [5]:
chain = prompt | llm

In [14]:
chain_na = prompt_na | llm
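The pipe syntax works because Python lets a class define what `|` means through the `__or__` method. The toy sketch below shows the idea with a hypothetical `Step` class and a fake model that makes no API call; it is an illustration of the composition pattern, not LangChain's actual implementation.

```python
class Step:
    """Hypothetical stand-in for a chain component; not a LangChain class."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# First step: fill a prompt from a dictionary of inputs.
fill_prompt = Step(lambda inputs: f"Question: {inputs['question']}\n\nAnswer: ")
# Second step: a fake model that only reports the size of the prompt it received.
fake_llm = Step(lambda text: f"(model reply to a prompt of {len(text)} characters)")

toy_chain = fill_prompt | fake_llm
print(toy_chain.invoke({"question": "What is a chain?"}))
```

As in `prompt | llm`, the dictionary goes into the first step, and the text it produces becomes the input of the second.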

#### Invoke the Chain

In [25]:
question_na = "A family portion of wheat noodles, mixed with lentil noodles and a full glass of arrabiata sauce"

In [26]:
answer_na = chain_na.invoke(input={"meal": question_na})

In [27]:
print(answer_na.content.strip())

- Estimate: 
  I'll need more information to provide an accurate estimate. However, I can make some assumptions based on average values.

Assuming a family portion is around 400-500g of noodles and 250-300g of arrabiata sauce, here's a rough estimate:

- Estimate: 1200-1500 calories, 20-25g protein, 150-200g carbs, 20-25g fat

- Assumptions:
  - Wheat noodles: 150-200g per 100g serving (approx. 350-400 calories, 7-8g protein, 70-80g carbs, 2-3g fat)
  - Lentil noodles: 100-150g per 100g serving (approx. 200-300 calories, 10-12g protein, 30-40g carbs, 2-3g fat)
  - Arrabiata sauce: 100-150 calories per 100g serving (approx. 10-15g fat, 2-3g protein, 10-15g carbs)

- Clarifying questions:
1. What is the exact weight of the noodles (wheat and lentil combined)?
2. Is the arrabiata sauce homemade or store-bought?
3. Are there any additional ingredients in the meal (e.g., vegetables, meat, cheese)?


In [6]:
question = "What is the backpropagation algorithm?"

In [7]:
answer = chain.invoke(input={"question": question})

In [8]:
print(answer.content.strip())

**Backpropagation Algorithm**

The backpropagation algorithm is a widely used method for training artificial neural networks (ANNs). It is an optimization technique used to minimize the error between the network's predictions and the actual output. The algorithm is based on the concept of gradient descent, which iteratively adjusts the network's weights and biases to reduce the error.

**How Backpropagation Works**

Here's a step-by-step explanation of the backpropagation algorithm:

1. **Forward Pass**: The network processes the input data, and the output is calculated using the current weights and biases.
2. **Error Calculation**: The difference between the predicted output and the actual output is calculated, which is known as the error.
3. **Backward Pass**: The error is propagated backwards through the network, and the gradients of the error with respect to each weight and bias are calculated.
4. **Weight Update**: The weights and biases are updated using the gradients and a learn

To ask several questions at once, we can pass a list of dictionaries. Each dictionary must contain the input variable defined in our prompt template (`"question"`), mapped to the question we'd like to ask.

In [28]:
qs = [ 
    {"question": "What is the backpropagation algorithm?"},
    {"question": "What is the purpose of the activation function in a neural network?"},
    {"question": "What is the difference between supervised and unsupervised learning?"},
    {"question": "Explain the concept of overfitting in machine learning."},
]

In [29]:
answers = chain.batch(qs)
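`batch` returns one result per input, in the same order as the inputs, which is why the questions and answers can be paired with `zip` in the next cell. The sketch below mimics that contract with a thread pool and a stand-in for the model call. The names `toy_invoke` and `toy_batch` are hypothetical, and the use of a thread pool here is an assumption about how such batching is commonly done rather than a description of LangChain's internals.

```python
from concurrent.futures import ThreadPoolExecutor


def toy_invoke(inputs):
    """Hypothetical stand-in for chain.invoke; it makes no model call."""
    return f"Answer to: {inputs['question']}"


def toy_batch(inputs_list, max_workers=4):
    # executor.map yields results in input order, even if calls finish out of order.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(toy_invoke, inputs_list))


toy_qs = [{"question": "A?"}, {"question": "B?"}]
toy_answers = toy_batch(toy_qs)
for q, a in zip(toy_qs, toy_answers):
    print(q["question"], "->", a)
```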

In [30]:
for question, answer in zip(qs, answers):
    print("=" * 100)
    print(f"Question: {question['question']}")
    print(f"Answer: {answer.content.strip()}")
    print("=" * 100)

Question: What is the backpropagation algorithm?
Answer: **Backpropagation Algorithm**

The backpropagation algorithm is a widely used method for training artificial neural networks. It is an optimization technique used to minimize the error between the network's predictions and the actual output.

**How Backpropagation Works**

The backpropagation algorithm works as follows:

1. **Forward Pass**: The network processes the input data and produces an output.
2. **Error Calculation**: The difference between the predicted output and the actual output is calculated, resulting in an error value.
3. **Backward Pass**: The error value is propagated backwards through the network, adjusting the weights and biases of each layer to minimize the error.
4. **Weight Update**: The weights and biases are updated based on the error gradient, using an optimization algorithm such as stochastic gradient descent (SGD).

**Key Components of Backpropagation**

1. **Activation Functions**: Used to introduce n