
---

# **LangChain Introduction**

----

## **LangChain**

LangChain is a framework for developing applications powered by language models.

- **GitHub**: https://github.com/hwchase17/langchain
- **Docs**: https://python.langchain.com/v0.2/docs/introduction/

## **Overview**

- **Installation**
- **LLMs**
- **Prompt Templates**
- **Chains**
- **Agents and Tools**
- **Memory**
- **Document Loaders**
- **Indexes**

------


### **Installation**

In [None]:
!pip install langchain langchain_community



#### **Set Up API Keys**

In [None]:
import os

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "YOUR_API_KEY"
os.environ["OPENAI_API_KEY"] = "YOUR_API_KEY"


### **Large Language Models**


- The basic building block of LangChain is a **Large Language Model (LLM)**, which takes text as input and generates more text.

- Suppose we want to generate a company name from a company description. We first initialize an OpenAI wrapper. Since we want the output to be more varied, we initialize the model with a high temperature.

- The temperature parameter adjusts the randomness of the output. Higher values such as 0.7 make the output more random, while lower values such as 0.2 make it more focused and deterministic.

- **temperature value** --> how creative we want the model to be

- **0** --> the model plays it safe and sticks to the most likely output; it takes no risks.

- **1** --> the model takes more risks; it is more creative but may generate wrong output.

- LangChain provides a generic interface for all LLMs. See all LLM providers: https://python.langchain.com/docs/integrations/llms/
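
The effect of temperature can be illustrated without calling any model. The snippet below is a minimal sketch, assuming the common formulation in which the model's raw scores are divided by the temperature before a softmax; the scores in `logits` are made up for illustration, and `softmax_with_temperature` is a hypothetical helper, not part of LangChain or OpenAI.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        # Dividing by zero is undefined, so this sketch only accepts positive values
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens
logits = [2.0, 1.0, 0.5]

for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability goes to the top-scoring candidate; as the temperature rises the probabilities spread out, which is why sampled outputs become more varied.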

#### **1. OpenAI**

- To use an OpenAI model, we first need to install the `langchain_openai` and `openai` libraries.

In [None]:
!pip install langchain_openai openai



In [None]:
# Import the OpenAI class from the langchain_openai module
from langchain_openai import OpenAI

# Initialize the OpenAI model with a temperature setting of 0.9
# The temperature controls the randomness of the output; higher values produce more diverse responses
llm = OpenAI(temperature=0.9)

# Define a prompt asking for a suggestion for a company name
text = "Suggest a name for a company that makes colorful socks."

# Invoke the OpenAI model with the provided prompt and print the response
print(llm.invoke(text))



Rainbow Threads LLC


#### **2. HuggingFace**

In [None]:
!pip install huggingface-hub langchain-huggingface

Collecting langchain-huggingface
  Downloading langchain_huggingface-0.1.0-py3-none-any.whl.metadata (1.3 kB)
Collecting sentence-transformers>=2.6.0 (from langchain-huggingface)
  Downloading sentence_transformers-3.2.0-py3-none-any.whl.metadata (10 kB)
Downloading langchain_huggingface-0.1.0-py3-none-any.whl (20 kB)
Downloading sentence_transformers-3.2.0-py3-none-any.whl (255 kB)
Installing collected packages: sentence-transformers, langchain-huggingface
Successfully installed langchain-huggingface-0.1.0 sentence-transformers-3.2.0


In [None]:
# Import the HuggingFaceHub class from the langchain_community.llms module
from langchain_community.llms import HuggingFaceHub

# Initialize the HuggingFaceHub model with specific parameters
# - repo_id specifies the model to be used (in this case, "google/flan-t5-large").
# - model_kwargs is a dictionary that contains additional model parameters:
#   - temperature: Controls the randomness of the output; set to 0 for deterministic results.
#   - max_length: Limits the maximum length of the generated output; set to 64 tokens here.
llm = HuggingFaceHub(repo_id="google/flan-t5-large", model_kwargs={"temperature": 0, "max_length": 64})

# Invoke the model with a translation task, translating from English to German
# The input prompt is asking how to translate "How old are you?" into German.
llm.invoke("translate English to German: How old are you?")

'Wie alte sind Sie?'

### **Prompt Templates**

- In the examples above we wrote the entire prompt by hand. For a user-facing application this is not ideal.

- LangChain facilitates prompt management and optimization.

- Normally, when you use an LLM in an application, you do not send the user input directly to the LLM. Instead, you take the user input, insert it into a larger piece of text called a **prompt template**, and only then send the result to the LLM.
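
To make the idea concrete before using LangChain's own class, here is a minimal library-free sketch built on Python's standard string formatting. `template_variables` and `build_prompt` are hypothetical helpers written for this illustration; they are not LangChain functions.

```python
from string import Formatter

def template_variables(template):
    """Return the placeholder names that appear in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]

def build_prompt(template, **values):
    """Fill the template, failing loudly if a placeholder has no value."""
    missing = [v for v in template_variables(template) if v not in values]
    if missing:
        raise KeyError(f"missing template variables: {missing}")
    return template.format(**values)

template = "I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."
user_input = "Italian"  # in a real application this would come from the user
print(build_prompt(template, cuisine=user_input))
```

The application owns the fixed wording and the user only supplies the variable part, which is the same separation a prompt template class gives you.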

In [None]:
# Import the PromptTemplate class from the langchain_core.prompts module
from langchain_core.prompts import PromptTemplate

# Create an instance of PromptTemplate
# - input_variables: A list of variables that will be used in the template; here, it includes 'cuisine'.
# - template: The string template where '{cuisine}' will be replaced with the actual cuisine type.
prompt_template_name = PromptTemplate(
    input_variables=['cuisine'],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."
)

# Format the prompt template by replacing the 'cuisine' variable with the string "indian"
p = prompt_template_name.format(cuisine="indian")

# Print the formatted prompt
print(p)

I want to open a restaurant for indian food. Suggest a fancy name for this.


In [None]:
prompt = PromptTemplate.from_template("What is a good name for a company that makes {product}")

prompt.format(product="colorful socks")

'What is a good name for a company that makes colorful socks'

### **Chains**

- Chains combine LLMs and prompts in multi-step workflows.

- So far we have a model and a prompt template:

```python
llm = OpenAI(temperature=0.9)

prompt = PromptTemplate.from_template("What is a good name for a company that makes {product}")

prompt.format(product="colorful socks")
```
- Using chains, we can link the model and the PromptTemplate together, and link chains to other chains.

- The simplest and most common type of chain is `LLMChain`, which passes the input first to the PromptTemplate and then to the Large Language Model.

- `LLMChain` is responsible for executing the PromptTemplate; each PromptTemplate gets its own `LLMChain`.
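
The two steps described above (fill the template, then call the model) can be sketched in a few lines of plain Python. This is only an illustration of the data flow: `SimpleChain` and `fake_llm` are hypothetical stand-ins, and `fake_llm` simply echoes its prompt rather than calling a real model.

```python
def fake_llm(prompt):
    """Stand-in for a real model call; it only echoes the prompt it received."""
    return f"[model reply to: {prompt}]"

class SimpleChain:
    """Hypothetical mini-chain: format the prompt, then call the model."""

    def __init__(self, template, llm):
        self.template = template
        self.llm = llm

    def invoke(self, inputs):
        prompt = self.template.format(**inputs)      # step 1: fill the template
        return {**inputs, "text": self.llm(prompt)}  # step 2: call the model

chain = SimpleChain(
    template="What is a good name for a company that makes {product}?",
    llm=fake_llm,
)
result = chain.invoke({"product": "colorful socks"})
print(result["text"])
```

The sketch returns the inputs together with a `text` key, mirroring the dictionary shape that the `LLMChain` examples in this notebook read from.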


In [None]:

# Example 1

from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

# Initialize the LLM with the desired temperature
llm = OpenAI(temperature=0.9)

# Create the prompt template
prompt = PromptTemplate.from_template("Give me a short, creative name for a company that makes {product}")

# Create the LLMChain
chain = LLMChain(prompt=prompt, llm=llm)

# Run the chain and get the response
response = chain.run({"product": "colorful socks"})

# Extract just the first line or a cleaned version of the response
company_name = response.strip().split("\n")[0]  # In case it's a multiline response

# Print the response
print(company_name)


"Rainbow Threads Co."


#### **To Visualize How Chains Work**

In [None]:
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

# Initialize the LLM with its default settings
llm = OpenAI()

# Create the prompt template
prompt_template_name = PromptTemplate(
    input_variables=['cuisine'],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."
)

# Create the LLMChain
chain = LLMChain(llm=llm, prompt=prompt_template_name, verbose=True)

# Run the chain and get the response
response = chain.invoke({"cuisine": "Mexican"})

# Extract the 'text' field from the response
fancy_name = response['text'].strip()

# Print the fancy name
print(fancy_name)




> Entering new LLMChain chain...
Prompt after formatting:
I want to open a restaurant for Mexican food. Suggest a fancy name for this.

> Finished chain.
"El Sabor Delicioso" (The Delicious Flavor)


#### **Simple Sequential Chain**

In [None]:
# Import necessary classes from Langchain modules
from langchain.chains import SimpleSequentialChain
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

# Initialize the OpenAI model with a temperature setting of 0.6
llm = OpenAI(temperature=0.6)

# Create a prompt template for generating a restaurant name
prompt_template_name = PromptTemplate(
    input_variables=['cuisine'],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."
)

# Create an LLMChain that combines the OpenAI model with the restaurant name prompt
name_chain = LLMChain(llm=llm, prompt=prompt_template_name, verbose=True)

# Create another prompt template for suggesting menu items based on the restaurant name
prompt_template_items = PromptTemplate(
    input_variables=['restaurant_name'],
    template="Suggest some menu items for {restaurant_name}."
)

# Create an LLMChain for generating menu items based on the restaurant name
food_items_chain = LLMChain(llm=llm, prompt=prompt_template_items, verbose=True)

# Combine the two chains into a sequential chain
chain = SimpleSequentialChain(chains=[name_chain, food_items_chain])

# Invoke the chain with the input "indian"
response = chain.invoke("indian")

# Extract the 'output' part and clean it
final_output = response['output'].strip()

# Print the cleaned output
print(final_output)




> Entering new LLMChain chain...
Prompt after formatting:
I want to open a restaurant for indian food. Suggest a fancy name for this.

> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Suggest some menu items for 

"Spice Symphony".

> Finished chain.
1. Spicy Chicken Tikka Masala
2. Lamb Vindaloo
3. Paneer Butter Masala
4. Tandoori Shrimp
5. Vegetable Samosas
6. Garlic Naan
7. Aloo Gobi (potato and cauliflower curry)
8. Chana Masala (chickpea curry)
9. Palak Paneer (spinach and cheese curry)
10. Mango Lassi (yogurt drink)
11. Chicken Biryani
12. Lamb Rogan Josh
13. Prawn Curry
14. Vegetable Korma
15. Dal Makhani (creamy lentil curry)
16. Tandoori Chicken
17. Mixed Vegetable Pakoras
18. Mango Chicken
19. Malai Kofta (vegetable and cheese balls in creamy sauce)
20. Rasmalai (Indian dessert made with cheese and saffron milk)


#### **Sequential Chain**

In [None]:
# Import necessary classes from Langchain modules
from langchain.chains import SequentialChain
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

# Initialize OpenAI model
llm = OpenAI(temperature=0.7)

# Create prompt template for generating a restaurant name
prompt_template_name = PromptTemplate(
    input_variables=['cuisine'],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."
)

# Create LLMChain for restaurant name generation
name_chain = LLMChain(llm=llm, prompt=prompt_template_name, output_key="restaurant_name", verbose=True)

# Create prompt template for generating menu items
prompt_template_items = PromptTemplate(
    input_variables=['restaurant_name'],
    template="Suggest some menu items for {restaurant_name}."
)

# Create LLMChain for menu items generation
food_items_chain = LLMChain(llm=llm, prompt=prompt_template_items, output_key="menu_items", verbose=True)

# Combine the chains in a sequential chain
chain = SequentialChain(
    chains=[name_chain, food_items_chain],
    input_variables=['cuisine'],
    output_variables=['restaurant_name', 'menu_items']
)

# Invoke the chain
response = chain.invoke({"cuisine": "Pakistani"})

# Clean the output by stripping unnecessary newlines or spaces
restaurant_name = response['restaurant_name'].strip().replace('\n', '')  # Removing extra newlines
menu_items = response['menu_items'].strip()  # Clean any surrounding whitespace

# Reformat the output in the desired human-readable format
print(f"cuisine : {response['cuisine']}")
print(f"restaurant_name : {restaurant_name}")
print(f"menu_items : {menu_items}")




> Entering new LLMChain chain...
Prompt after formatting:
I want to open a restaurant for Pakistani food. Suggest a fancy name for this.

> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Suggest some menu items for 

"Chandni Chowk Bistro" .

> Finished chain.
cuisine : Pakistani
restaurant_name : "Chandni Chowk Bistro"
menu_items : 1. Butter Chicken: A classic North Indian dish of marinated chicken cooked in a creamy tomato-based sauce. 

2. Samosas: Crispy pastry pockets filled with spiced potatoes and peas, served with chutney for dipping. 

3. Tandoori Platter: A selection of tandoori grilled meats such as chicken tikka, seekh kebab, and tandoori shrimp. 

4. Palak Paneer: A vegetarian dish of creamy spinach and cottage cheese cubes, served with naan bread. 

5. Dal Makhani: A slow-cooked lentil dish simmered in a rich tomato and butter sauce. 

6. Aloo Paratha: Grilled flatbread stuffe
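
The data flow in the `SequentialChain` run above can be sketched without LangChain: each step reads named keys from a shared dictionary and writes its result under its own output key, so later steps can use anything produced earlier. `run_sequential`, `name_llm`, and `menu_llm` are hypothetical helpers for illustration, and the stub functions return fixed text rather than real model output.

```python
def run_sequential(steps, inputs):
    """Run steps in order; each step reads named keys and writes one output key."""
    state = dict(inputs)
    for template, output_key, llm in steps:
        prompt = template.format(**state)  # may use any key produced so far
        state[output_key] = llm(prompt)
    return state

# Stub model calls (assumptions for illustration, not real LLM output)
def name_llm(prompt):
    return "Example Bistro"

def menu_llm(prompt):
    return f"menu for: {prompt}"

steps = [
    ("Suggest a fancy name for a {cuisine} restaurant.", "restaurant_name", name_llm),
    ("Suggest some menu items for {restaurant_name}.", "menu_items", menu_llm),
]
result = run_sequential(steps, {"cuisine": "Pakistani"})
print(result)
```

Because every intermediate value stays in the dictionary, the final result contains the original input as well as both outputs, which is what `output_variables` exposes in the LangChain version.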

### **Agents and Tools**

- Agents use an LLM to decide which action to take, take that action, see an observation, and repeat until done.

- When used correctly, agents can be extremely powerful. To load agents, you should understand the following concepts:

- **Tool**: A function that **performs a specific duty**, such as Google Search, a database lookup, a Python REPL, or another chain.

- **LLM**: The language model powering the agent.

- **Agent**: The agent to use.

- **Agent** is a very powerful concept in LangChain.

For example, suppose I have to travel from Dubai to Canada and I type this into ChatGPT:

---> Give me two **flight options** from Dubai to Canada on September 1, 2024 | ChatGPT on its own cannot answer this, because its knowledge only extends to September 2021.

- **ChatGPT Plus** has an **Expedia plugin**. If we enable it, ChatGPT calls the plugin to pull **flight information** and shows it to us.

- **SerpApi** is a **real-time API** for accessing **Google search results**.
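
The action/observation loop described above can be sketched in plain Python. This is a simplified illustration, not LangChain's agent implementation: `scripted_policy` is a hypothetical stand-in for the LLM's decision step, and `calculator` is a toy tool that only handles `a+b` or `a*b` with non-negative integers.

```python
def calculator(expression):
    """Toy tool: evaluate 'a+b' or 'a*b' for two integers."""
    for symbol, op in (("+", lambda a, b: a + b), ("*", lambda a, b: a * b)):
        if symbol in expression:
            left, right = expression.split(symbol)
            return str(op(int(left), int(right)))
    raise ValueError(f"unsupported expression: {expression}")

TOOLS = {"calculator": calculator}

def scripted_policy(question, observations):
    """Stand-in for the LLM: choose the next action from what has been seen so far."""
    if not observations:
        return ("calculator", question)  # act: call a tool
    return ("finish", f"The answer is {observations[-1]}")  # stop with an answer

def run_agent(question, policy, tools, max_steps=5):
    observations = []
    for _ in range(max_steps):
        action, action_input = policy(question, observations)
        if action == "finish":
            return action_input
        observations.append(tools[action](action_input))  # observe the tool result
    return "Stopped: step limit reached"

print(run_agent("12*7", scripted_policy, TOOLS))  # prints "The answer is 84"
```

In a real agent the policy is an LLM call whose prompt lists the available tools and the observations gathered so far; the step limit guards against a loop that never decides to finish.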

##### **Wikipedia and llm-math Tools**

In [None]:
!pip install wikipedia

Collecting wikipedia
  Downloading wikipedia-1.4.0.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: wikipedia
  Building wheel for wikipedia (setup.py) ... done
  Created wheel for wikipedia: filename=wikipedia-1.4.0-py3-none-any.whl size=11679 sha256=1c3ba02149730bcbc908b193f67b094d442ea6257f8216b55267b69079d4bbea
  Stored in directory: /root/.cache/pip/wheels/5e/b6/c5/93f3dec388ae76edc830cb42901bb0232504dfc0df02fc50de
Successfully built wikipedia
Installing collected packages: wikipedia
Successfully installed wikipedia-1.4.0


In [None]:
# Import libraries
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain_openai import OpenAI

# Initialize OpenAI with temperature 0 for factual responses
llm = OpenAI(temperature=0)

# Load the tools we want: Wikipedia for information retrieval and Calculator for math (if needed)
tools = load_tools(["wikipedia", "llm-math"], llm=llm)

# Initialize the agent with the loaded tools, the language model, and the zero-shot agent type
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Query the agent to get the estimated GDP of the US in 2024 using the Wikipedia tool
response = agent.invoke("What is the estimated GDP of the US in 2024 from Wikipedia?")

# Print the result
print(response)




> Entering new AgentExecutor chain...
 I should use Wikipedia to find the answer.
Action: wikipedia
Action Input: "GDP of the US in 2024"
Observation: Page: List of U.S. states and territories by GDP
Summary: This is a list of U.S. states and territories by gross domestic product (GDP). This article presents the 50 U.S. states and the District of Columbia and their nominal GDP at current prices.
The data source for the list is the Bureau of Economic Analysis (BEA) in 2024. The BEA defined GDP by state as "the sum of value added from all industries in the state."
Nominal GDP does not take into account differences in the cost of living in different countries, and the results can vary greatly from one year to another based on fluctuations in the exchange rates of the country's currency. Such fluctuations may change a country's ranking from one year to the next, even though they often make little or no difference in the standard of living of its popu

### **Memory**

- If you use a chatbot application like ChatGPT, you will notice that it **remembers past information**. By default, the chains built so far do not, as the cells below show.

In [None]:
# Import necessary classes from Langchain modules
from langchain.chains import SimpleSequentialChain
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
from langchain_openai import OpenAI

# Initialize the OpenAI model with a temperature setting of 0.6
llm = OpenAI(temperature=0.6)

# Create a prompt template for generating a restaurant name
prompt_template_name = PromptTemplate(
    input_variables=['cuisine'],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."
)

# Create an LLMChain that combines the OpenAI model with the restaurant name prompt
name_chain = LLMChain(llm=llm, prompt=prompt_template_name, verbose=True)

# Create another prompt template for suggesting menu items based on the restaurant name
prompt_template_items = PromptTemplate(
    input_variables=['restaurant_name'],
    template="Suggest some menu items for {restaurant_name}."
)

# Create an LLMChain for generating menu items based on the restaurant name
food_items_chain = LLMChain(llm=llm, prompt=prompt_template_items, verbose=True)

# Combine the two chains into a sequential chain
chain = SimpleSequentialChain(chains=[name_chain, food_items_chain])

# Invoke the chain with the input "indian"
response = chain.invoke("indian")

# Extract the 'output' part and clean it
final_output = response['output'].strip()

# Print the cleaned output
print(final_output)




> Entering new LLMChain chain...
Prompt after formatting:
I want to open a restaurant for indian food. Suggest a fancy name for this.

> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Suggest some menu items for 

"Maharaja's Kitchen".

> Finished chain.
1. Tandoori Chicken: Grilled chicken marinated in traditional Indian spices
2. Butter Chicken: Tender chicken cooked in a creamy tomato-based sauce
3. Rogan Josh: Lamb cooked in a rich, aromatic gravy
4. Palak Paneer: Spinach and cottage cheese curry
5. Dal Makhani: Slow-cooked black lentils in a creamy sauce
6. Vegetable Biryani: Fragrant basmati rice layered with mixed vegetables and spices
7. Chicken Tikka Masala: Grilled chicken in a spicy tomato and onion-based sauce
8. Aloo Gobi: Cauliflower and potatoes cooked in a blend of spices
9. Naan Bread: Freshly baked flatbread, perfect for dipping in curries
10. Mango Lassi: A refreshing yog

In [None]:
# Invoke the chain again, this time with the input "Pakistani"
response = chain.invoke("Pakistani")

# Extract the 'output' part and clean it
final_output = response['output'].strip()

# Print the cleaned output
print(final_output)



> Entering new LLMChain chain...
Prompt after formatting:
I want to open a restaurant for Pakistani food. Suggest a fancy name for this.

> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
Suggest some menu items for 

"Spice of Pakistan" or "Pakistani Delights".

> Finished chain.
1. Chicken Biryani: A fragrant rice dish cooked with tender chicken, spices, and herbs.

2. Tandoori Chicken: Marinated chicken cooked in a clay oven, served with a side of mint chutney.

3. Seekh Kebab: Skewers of minced beef or lamb, seasoned with spices and grilled to perfection.

4. Palak Paneer: Creamy spinach and cheese curry, a popular vegetarian dish in Pakistan.

5. Haleem: A slow-cooked stew made with lentils, wheat, and meat, served with crispy fried onions and lemon.

6. Naan Bread: Traditional flatbread, perfect for scooping up curries or kebabs.

7. Aloo Gosht: Tender lamb or beef cooked with potatoes

In [None]:
# Let's Check Memory
chain.memory

In [None]:
type(chain.memory)

NoneType

- We can clearly see that no memory is attached to the chain.

### **ConversationBufferMemory**

- We can attach memory to a chain so that it remembers the whole previous conversation.

In [None]:
# Importing the ConversationBufferMemory class from the langchain.memory module
from langchain.memory import ConversationBufferMemory

# Initializing an instance of ConversationBufferMemory to store conversation history
memory = ConversationBufferMemory()

# Creating an LLMChain instance with a language model (llm), a prompt template (prompt_template_name),
# and the memory instance to keep track of the conversation context
chain = LLMChain(llm=llm, prompt=prompt_template_name, memory=memory)

# Running the LLMChain with the input "Mexican" and storing the result in the variable 'name'
name = chain.invoke("Mexican")

# Extracting and printing the output stored in the 'text' key of the 'name' dictionary
print(name['text'])




"El Jardín de Sabores" (The Garden of Flavors)


In [None]:
name = chain.invoke("Arabic")
print(name['text'])



"Al-Amirah's Palace of Flavors"


In [None]:
print(chain.memory.buffer)

Human: Mexican
AI: 

"El Jardín de Sabores" (The Garden of Flavors)
Human: Arabic
AI: 

"Al-Amirah's Palace of Flavors"


- As we can see, the conversation history is being saved in memory.


### **ConversationChain**

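- `ConversationBufferMemory` keeps growing without bound as the conversation gets longer.

- **ConversationChain** is a ready-made chain for chat-style use: it comes with a default prompt and a buffer memory, and it also accepts other memory types.

- To remember only the most recent exchanges (for example the last 5, or the last 10-20), pair it with a windowed memory such as **ConversationBufferWindowMemory**.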
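
The idea of remembering only the last few exchanges can be sketched with a bounded queue from the standard library. `WindowMemory` is a hypothetical class written for this illustration, not LangChain's implementation; it only mimics the `Human:` / `AI:` buffer format seen in the outputs of this notebook.

```python
from collections import deque

class WindowMemory:
    """Hypothetical memory that keeps only the last k human/AI exchanges."""

    def __init__(self, k):
        self.turns = deque(maxlen=k)  # older exchanges are dropped automatically

    def save(self, human, ai):
        self.turns.append((human, ai))

    @property
    def buffer(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)

memory = WindowMemory(k=2)
memory.save("Hi", "Hello!")
memory.save("How much is 5+5?", "10")
memory.save("And 6+6?", "12")
print(memory.buffer)  # the first exchange is no longer present
```

A bounded window keeps the prompt size roughly constant, at the cost of forgetting anything older than the last `k` exchanges.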



In [None]:
from langchain.chains import ConversationChain

convo = ConversationChain(llm=OpenAI(temperature=0.7))
print(convo.prompt.template)

The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
{history}
Human: {input}
AI:


In [None]:
convo.run("Who won the first cricket world cup?")

' The first cricket world cup was held in 1975 and was won by the West Indies team. The tournament was held in England and was a 60-over match. The West Indies team beat the Australia team in the final by 17 runs. Do you have any other questions about the cricket world cup?'

In [None]:
convo.run("How much is 5+5?")

'  5+5 is equal to 10. Is there anything else you would like to know?'

In [None]:
print(convo.memory.buffer)

Human: Who won the first cricket world cup?
AI:  The first cricket world cup was held in 1975 and was won by the West Indies team. The tournament was held in England and was a 60-over match. The West Indies team beat the Australia team in the final by 17 runs. Do you have any other questions about the cricket world cup?
Human: How much is 5+5?
AI:   5+5 is equal to 10. Is there anything else you would like to know?


### **ConversationBufferWindowMemory**

In [None]:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=3)

convo = ConversationChain(
    llm=OpenAI(temperature=0.7),
    memory=memory
)
convo.invoke("Who won the first cricket world cup?")

{'input': 'Who won the first cricket world cup?',
 'history': '',
 'response': ' The first cricket world cup was held in 1975 and was won by the West Indies team.'}

In [None]:
convo.invoke("Who was the captain of the winning team?")

{'input': 'Who was the captain of the winning team?',
 'history': 'Human: Who won the first cricket world cup?\nAI:  The first cricket world cup was held in 1975 and was won by the West Indies team.',
 'response': ' The captain of the West Indies team during the 1975 cricket world cup was Clive Lloyd. He also captained the team during their second world cup win in 1979.'}

In [None]:
print(convo.memory.buffer)

Human: Who won the first cricket world cup?
AI:  The first cricket world cup was held in 1975 and was won by the West Indies team.
Human: Who was the captain of the winning team?
AI:  The captain of the West Indies team during the 1975 cricket world cup was Clive Lloyd. He also captained the team during their second world cup win in 1979.


### **Document Loaders**

In [None]:
!pip install pypdf

Collecting pypdf
  Downloading pypdf-5.0.1-py3-none-any.whl.metadata (7.4 kB)
Downloading pypdf-5.0.1-py3-none-any.whl (294 kB)
Installing collected packages: pypdf
Successfully installed pypdf-5.0.1


In [None]:
# Document loaders live in the langchain_community package (installed earlier)
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("/content/Transformer_Models_and_BERT_Models.pdf")
pages = loader.load()

In [None]:
pages

 Document(metadata={'source': '/content/Transformer_Models_and_BERT_Models.pdf', 'page': 1}, page_content='import tensorflow_hub as hub\nimport tensorflow_text as text\nfrom google.cloud import aiplatform\nfrom official.nlp import optimization  # to create AdamW optmizer\ntf.get_logger().setLevel("ERROR")\nTo check if you have a GPU attached. Run the following.\nprint("Num GPUs Available: ", len(tf.config.list_physical_devices("GPU")))\nSentiment Analysis\nThis notebook trains a sentiment analysis model to classify movie reviews as positive or negative, \nbased on the text of the review.\nYou\'ll use the Large Movie Review Dataset that contains the text of 50,000 movie reviews from the \nInternet Movie Database.\nDownload the IMDB dataset\nLet\'s download and extract the dataset, then explore the directory structure.\nurl = "https://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz"\n# Set a path to a folder outside the git repo. This is important so data won\'t get indexed by gi