<a href="https://colab.research.google.com/github/rabbitmetrics/langchain-13-min/blob/main/notebooks/langchain-13-min.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Import Libraries
import os
from secret_key import openapi_key
from langchain_openai import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.chains import SequentialChain

In [2]:
# Load environment variables

os.environ['OPENAI_API_KEY'] = openapi_key

# Run basic query with OpenAI wrapper

llm = OpenAI(temperature=0.7)

output = llm.invoke("explain large language models in one sentence")
print(output)

# Chain 1: Restaurant Name

prompt_template_name = PromptTemplate(
    input_variables=['cuisine'],
    template="I want to open a restaurant for {cuisine} food. Suggest a fancy name for this."
)

name_chain = LLMChain(llm=llm, prompt=prompt_template_name, output_key="restaurant_name")

print(name_chain.invoke({'cuisine': 'Indian'}))

# Chain 2: Menu Items
prompt_template_items = PromptTemplate(
    input_variables=['restaurant_name'],
    template="Suggest some menu items for {restaurant_name}. Return it as a comma separated string"
)

food_items_chain = LLMChain(llm=llm, prompt=prompt_template_items, output_key="menu_items")


chain = SequentialChain(
    chains=[name_chain, food_items_chain],
    input_variables=['cuisine'],
    output_variables=['restaurant_name', "menu_items"]
)

response = chain.invoke({'cuisine': 'Indian'})

print(response)



Large language models are computer programs that use vast amounts of data to generate human-like text and complete tasks such as translation and summarization.


{'cuisine': 'Indian', 'restaurant_name': '\n\n"Spice Kingdom: A Taste of India"'}
{'cuisine': 'Indian', 'restaurant_name': '\n\n"Maharaja\'s Palace"', 'menu_items': '\n\n1. Tandoori Chicken\n2. Butter Chicken\n3. Chicken Tikka Masala\n4. Lamb Rogan Josh\n5. Vegetable Biryani\n6. Palak Paneer\n7. Dal Makhani\n8. Naan Bread\n9. Samosas\n10. Mango Lassi\n\n"Tandoori Chicken, Butter Chicken, Chicken Tikka Masala, Lamb Rogan Josh, Vegetable Biryani, Palak Paneer, Dal Makhani, Naan Bread, Samosas, Mango Lassi"'}
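The `menu_items` value above mixes a numbered list with a quoted comma-separated line, so downstream code usually has to normalize it before use. A minimal sketch of one way to do that, assuming the comma-separated string is the last non-empty line of the response; the helper name `parse_menu_items` is hypothetical and not part of LangChain.

```python
def parse_menu_items(raw: str) -> list[str]:
    """Return menu items from the last non-empty line of a model response."""
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    if not lines:
        return []
    # Assumption: the comma-separated string is the final line, possibly quoted.
    last = lines[-1].strip('"')
    return [item.strip() for item in last.split(",") if item.strip()]


sample = '\n\n1. Tandoori Chicken\n2. Naan Bread\n\n"Tandoori Chicken, Naan Bread"'
items = parse_menu_items(sample)
print(items)
```

In the notebook this could be applied as `parse_menu_items(response["menu_items"])`. Model output is not guaranteed to follow the same layout on every run, so the assumption about the last line should be checked against real responses.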


In [3]:
# Import the chat message schema and ChatOpenAI to query chat models such as GPT-3.5-turbo or GPT-4

from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)
from langchain_openai import ChatOpenAI

In [4]:
chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.3)
messages = [
    SystemMessage(content="You are an expert data scientist"),
    HumanMessage(content="Write a Python script that trains a neural network on simulated data")
]
response = chat.invoke(messages)

print(response.content)

Sure! Here is an example Python script that trains a simple neural network on simulated data using the TensorFlow library:

```python
import numpy as np
import tensorflow as tf

# Generate simulated data
np.random.seed(0)
X = np.random.rand(1000, 2)
y = np.random.randint(0, 2, (1000, 1))

# Define the neural network architecture
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(2,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=10, batch_size=32)

# Evaluate the model
loss, accuracy = model.evaluate(X, y)
print(f'Loss: {loss}, Accuracy: {accuracy}')
```

In this script:
- We first generate simulated data with 2 features and binary labels.
- We define a simple neural network with 2 hidden layers of 64 units each and a final output layer w

In [5]:
# Import prompt and define PromptTemplate

from langchain.prompts import PromptTemplate

template = """
You are an expert data scientist with an expertise in building deep learning models. 
Explain the concept of {concept} in a couple of lines
"""

prompt = PromptTemplate(
    input_variables=["concept"],
    template=template,
)

In [6]:
# Run LLM with PromptTemplate

llm.invoke(prompt.format(concept="autoencoder"))

'\nAn autoencoder is a type of neural network that is trained to reconstruct its input data. It consists of an encoder and decoder, with the encoder learning to compress the input data into a lower-dimensional representation, and the decoder learning to reconstruct the original data from the compressed representation. This unsupervised learning technique is often used for dimensionality reduction, data denoising, and anomaly detection.'
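For a template that only uses simple `{name}` placeholders, `prompt.format(...)` behaves much like Python's built-in `str.format`: it returns the template text with the placeholder filled in, and that string is what gets sent to the model. A plain-Python sketch of that substitution step, with no LangChain dependency:

```python
template = """
You are an expert data scientist with an expertise in building deep learning models.
Explain the concept of {concept} in a couple of lines
"""

# Plain-Python stand-in for prompt.format(concept="autoencoder").
filled = template.format(concept="autoencoder")
print(filled)
```

Printing the filled prompt before calling the model is a cheap way to catch a misspelled or missing input variable.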

In [7]:
# Import LLMChain and define chain with language model and prompt as arguments.

from langchain.chains import LLMChain
chain = LLMChain(llm=llm, prompt=prompt)

# Run the chain only specifying the input variable.
print(chain.invoke("autoencoder"))

{'concept': 'autoencoder', 'text': '\nAutoencoders are neural networks that are trained to reconstruct their input data. They consist of an encoder that compresses the input data into a lower dimensional representation, and a decoder that reconstructs the original data from the compressed representation. This allows for unsupervised learning of feature representations and can be used for tasks such as data compression, anomaly detection, and data denoising. '}


In [8]:
# Define a second prompt 

second_prompt = PromptTemplate(
    input_variables=["ml_concept"],
    template="Turn the concept description of {ml_concept} and explain it to me like I'm five in 500 words",
)
chain_two = LLMChain(llm=llm, prompt=second_prompt)

In [9]:
# Define a sequential chain using the two chains above: the second chain takes the output of the first chain as input

from langchain.chains import SimpleSequentialChain
overall_chain = SimpleSequentialChain(chains=[chain, chain_two], verbose=False)

# Run the chain specifying only the input variable for the first chain.
explanation = overall_chain.invoke("autoencoder")
print(explanation)

{'input': 'autoencoder', 'output': "\n\nImagine you have a big box of toys, but you can't fit all of them into your small toy chest. So you decide to pick out your favorite toys and put them into the chest, leaving out the ones you don't play with as much.\n\nNow, let's say your friend comes over and wants to play with all of your toys. But you don't have enough space in your chest to fit everything back in. So you have to decide which toys to take out to make room for the other ones. This is kind of like what an autoencoder does.\n\nAn autoencoder is like a big toy box for data. It takes in a lot of information, like pictures or numbers, and tries to fit it into a smaller space. Just like you had to pick your favorite toys to fit in your chest, the autoencoder chooses the most important parts of the data to keep and throws out the less important parts.\n\nBut the cool thing is, the autoencoder doesn't just throw out the extra information. It actually learns how to compress the data in
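Conceptually, a simple sequential chain is function composition: each step receives the previous step's output as its only input. The sketch below illustrates that idea with ordinary functions; `run_sequentially`, `describe`, and `simplify` are hypothetical stand-ins for the two LLM-backed chains and make no API calls.

```python
from typing import Callable


def run_sequentially(steps: list[Callable[[str], str]], first_input: str) -> str:
    """Feed each step's output into the next step, like a simple sequential chain."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Hypothetical stand-ins for the two LLM-backed chains.
def describe(concept: str) -> str:
    return f"A short description of {concept}."


def simplify(description: str) -> str:
    return f"Explained simply: {description}"


result = run_sequentially([describe, simplify], "autoencoder")
print(result)
```

Because only a single string flows between steps, this pattern fits chains with exactly one input and one output; the `SequentialChain` used earlier in the notebook handles named inputs and outputs instead.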

In [10]:
# Import utilities for loading a text file and splitting it into document chunks

from langchain_text_splitters import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader

# path to an example text file
loader = TextLoader("./state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

print(docs)


Created a chunk of size 1163, which is longer than the specified 1000
Created a chunk of size 1015, which is longer than the specified 1000


[Document(metadata={'source': './state_of_the_union.txt'}, page_content='Madame Speaker, Vice President Biden, members of Congress, distinguished guests, and fellow Americans:\n\nOur Constitution declares that from time to time, the president shall give to Congress information about the state of our union. For 220 years, our leaders have fulfilled this duty. They have done so during periods of prosperity and tranquility. And they have done so in the midst of war and depression; at moments of great strife and great struggle.'), Document(metadata={'source': './state_of_the_union.txt'}, page_content="It's tempting to look back on these moments and assume that our progress was inevitable, that America was always destined to succeed. But when the Union was turned back at Bull Run and the Allies first landed at Omaha Beach, victory was very much in doubt. When the market crashed on Black Tuesday and civil rights marchers were beaten on Bloody Sunday, the future was anything but certain. Thes
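The "Created a chunk of size 1163" warnings appear because a character splitter cuts only at its separator (a blank line by default), so a single paragraph longer than `chunk_size` is kept whole. The following is a simplified sketch of that greedy merge behavior, not LangChain's actual implementation; the function name `split_on_separator` is hypothetical.

```python
def split_on_separator(text: str, chunk_size: int, separator: str = "\n\n") -> list[str]:
    """Greedy sketch: split on `separator`, then merge pieces up to `chunk_size`."""
    chunks: list[str] = []
    current = ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single piece longer than chunk_size is kept whole, not cut.
        current = piece
    if current:
        chunks.append(current)
    return chunks


text = "a" * 30 + "\n\n" + "b" * 30 + "\n\n" + "c" * 120
chunks = split_on_separator(text, chunk_size=100)
print([len(chunk) for chunk in chunks])
```

The two short paragraphs are merged into one chunk, while the 120-character paragraph exceeds the limit and is emitted as is. If hard size limits matter, a splitter that falls back to finer separators is the usual alternative.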

In [11]:
# Individual text chunks can be accessed with "page_content"

docs[0].page_content

'Madame Speaker, Vice President Biden, members of Congress, distinguished guests, and fellow Americans:\n\nOur Constitution declares that from time to time, the president shall give to Congress information about the state of our union. For 220 years, our leaders have fulfilled this duty. They have done so during periods of prosperity and tranquility. And they have done so in the midst of war and depression; at moments of great strife and great struggle.'

In [12]:
# Import and instantiate OpenAI embeddings

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=1024)

In [13]:
# Turn the first text chunk into a vector with the embedding

query_result = embeddings.embed_query(docs[0].page_content)
print(query_result)

[0.014747709035873413, -0.0023592801298946142, -0.01125738862901926, -0.026084624230861664, 0.019033292308449745, -0.06089947000145912, -0.03145706653594971, 0.0612882636487484, 0.01721302419900894, 0.04492352157831192, 0.03849072754383087, 0.06892278790473938, 0.026261350139975548, 0.011425277218222618, -0.07493144273757935, -0.021719515323638916, -0.03569846972823143, 0.007749395444989204, -0.0032274420373141766, -0.03792520612478256, -0.0680038183927536, 0.05598651245236397, -0.02256779558956623, 0.009083669632673264, 0.024299701675772667, 0.01864449866116047, 0.027728168293833733, 0.03387819975614548, 0.06690812110900879, 0.007082258351147175, 0.00034047194640152156, 0.05043734237551689, -0.005315007176250219, -0.013510633260011673, 0.03884417563676834, 0.005955635569989681, 0.015887586399912834, 0.009260395541787148, -5.129169949213974e-05, 0.03343638777732849, -0.021472100168466568, -0.009472465142607689, -0.022638484835624695, 0.011222044005990028, 0.051745109260082245, 0.013899
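Embedding vectors like the one above are typically compared with cosine similarity, which measures the angle between two vectors and ignores their length. A minimal pure-Python sketch, using toy 3-dimensional vectors as stand-ins for the 1024-dimensional embeddings:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real 1024-dimensional embeddings.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
```

Identical directions score 1 and orthogonal directions score 0; texts with related meaning are expected to land closer to 1.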

In [14]:
# Import and initialize Pinecone client

import os
from pinecone import Pinecone, ServerlessSpec
from secret_key import pineconeapi_key
from langchain_pinecone import PineconeVectorStore
os.environ['PINECONE_API_KEY'] = pineconeapi_key

pc = Pinecone(api_key=pineconeapi_key)





In [15]:
# Upload vectors to Pinecone
index_name = "langchain-quickstart"

vectorstore_from_docs = PineconeVectorStore.from_documents(
        docs,
        index_name=index_name,
        embedding=embeddings
    )

In [16]:
# Do a simple vector similarity search

query = "How to deal with our economy?"
result = vectorstore_from_docs.similarity_search(query)

print(result)

[Document(id='c3df6fb6-d83d-43a0-96a1-81a9f7b00efd', metadata={'source': './state_of_the_union.txt'}, page_content="But the truth is, these steps still won't make up for the 7 million jobs we've lost over the last two years. The only way to move to full employment is to lay a new foundation for long-term economic growth and finally address the problems that America's families have confronted for years.\n\nWe cannot afford another so-called economic expansion like the one from last decade -- what some call the lost decade -- where jobs grew more slowly than during any prior expansion, where the income of the average American household declined while the cost of health care and tuition reached record highs, where prosperity was built on a housing bubble and financial speculation.\n\nFrom the day I took office, I have been told that addressing our larger challenges is too ambitious -- that such efforts would be too contentious, that our political system is too gridlocked and that we shoul
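A vector store answers a query by embedding it and returning the stored chunks whose vectors rank highest against the query vector. The sketch below shows that ranking step as a brute-force scan over a tiny in-memory corpus; the ids and 2-dimensional vectors are made up for illustration, and a real index such as Pinecone uses approximate search rather than scanning every vector.

```python
import math


def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k corpus vectors most similar to the query."""
    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))

    ranked = sorted(corpus, key=lambda doc_id: cosine(query, corpus[doc_id]), reverse=True)
    return ranked[:k]


# Hypothetical 2-dimensional "embeddings" keyed by chunk id.
corpus = {
    "jobs": [0.9, 0.1],
    "health": [0.2, 0.8],
    "economy": [0.8, 0.3],
}
print(top_k([1.0, 0.2], corpus, k=2))
```

The query vector points mostly along the first axis, so the two chunks with the same orientation are returned ahead of the third.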

In [17]:
#!pip install langchain openai google-search-results
from langchain.agents import Tool
from langchain.agents import AgentType
from langchain.memory import ConversationBufferMemory
from langchain_openai import OpenAI
from langchain_community.utilities import SerpAPIWrapper
from langchain.agents import initialize_agent

import os
from secret_key import serpapi_key

search = SerpAPIWrapper(serpapi_api_key=serpapi_key)
tools = [
    Tool(
        name="Current Search",
        func=search.run,
        description="useful for when you need to answer questions about current events or the current state of the world"
    ),
]


In [18]:
# Run the search tool directly

search.run("Who will be the US President in 2024?")

'["Presidential elections were held in the United States on November 5, 2024. The Republican Party\'s ticket—Donald Trump, who served as the 45th president of ...", \'View maps and real-time results for the 2024 US presidential election matchup between former President Donald Trump and Vice President Kamala Harris.\', \'The APP excludes "over" and "under" votes in the total votes cast, which also impacts the vote percentage for candidates. Harris-Walz 2024. Trump-Vance 2024.\', \'Former President Donald Trump (R) won the November 5, 2024, presidential election. Twenty-four candidates appeared on presidential election ballots across the ...\', \'45th & 47th President of the United States. President Donald J. Trump Official Presidential Portrait. After a landslide election victory in 2024, President ...\', \'Donald Trump passed the critical threshold of 270 electoral college votes with a projected win in the state of Wisconsin making him the next US president.\', "Donald Trump\'s second 

In [19]:
from langchain.agents import AgentType, initialize_agent, load_tools

os.environ["SERPAPI_API_KEY"] = serpapi_key

tools = load_tools(["serpapi"], llm=llm)

agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
                   
agent.run("What is GDP of US in 2023?")





> Entering new AgentExecutor chain...
I should always think about what to do
Action: Search
Action Input: "GDP of US in 2023"
Observation: ['Current-dollar GDP increased 6.3 percent, or $1.61 trillion, in 2023 to a level of $27.36 trillion, compared with an increase of 9.1 percent, or ...', 'U.S. GDP for 2023 was 27.721 trillion US dollars, a 6.59% increase from 2022. U.S. GDP for 2022 was 26.007 trillion US dollars, a 9.82% increase from 2021.', 'Gross domestic product of the United States from 1990 to 2024 (in billion current U.S. dollars ) ; 2023, 27,720.7 ; 2022, 26,006.9 ; 2021, 23,681.2.', 'Economy of the United States · Increase $30.507 trillion (nominal; 2025) · Increase $30.507 trillion (PPP; 2025).', 'GDP (current US$) - United States. Country official statistics, National ... 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010 ...', 'The nominal GDP per capita in the United States of America was $85,809.

'The GDP of the US in 2023 is projected to be around $27.36 trillion.'
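The trace above follows the ReAct pattern: the model writes a thought, names a tool under `Action:`, and supplies the tool's argument under `Action Input:`; the agent executor then parses those two fields, runs the tool, and feeds the result back as an `Observation`. A minimal sketch of the parsing step, assuming that exact two-line layout; `parse_action` is a hypothetical helper and not LangChain's own output parser.

```python
import re


def parse_action(llm_output: str) -> tuple[str, str]:
    """Extract the tool name and tool input from a ReAct-style model response."""
    match = re.search(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", llm_output)
    if match is None:
        raise ValueError("no Action / Action Input pair found")
    tool, tool_input = match.group(1), match.group(2)
    # Drop surrounding whitespace and the quotes the model often adds.
    return tool.strip(), tool_input.strip().strip('"')


step = ' I should always think about what to do\nAction: Search\nAction Input: "GDP of US in 2023"'
print(parse_action(step))
```

If the model deviates from the expected layout, the helper raises instead of guessing, which mirrors why agents of this kind can fail with output parsing errors.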

In [20]:
output = llm.invoke("What is GDP of US in 2024?")
print(output)



It is not possible to accurately predict the GDP of the US in 2024. Economic factors, policies, and events can greatly impact the country's GDP over the next few years. The GDP for 2024 will only be known once the year has passed.


In [21]:
agent.run("What is GDP of US in 2025?")



> Entering new AgentExecutor chain...
I should think about using a reliable source for current GDP information
Action: Search
Action Input: "GDP of US 2025"
Observation: ['Real gross domestic product (GDP) increased at an annual rate of 3.0 percent in the second quarter of 2025 (April, May, and June), according to ...', 'From Q1 2025 to Q2 2025: Real GDP increased by 3.29 percent annualized. Current-dollar nominal GDP increased by 5.33 percent annualized, or $391.855 billion ...', 'Real gross domestic product (GDP) increased at an annual rate of 3.3 percent in the second quarter of 2025 (April, May, and June), according to ...', 'Economy of the United States · Increase $30.507 trillion (nominal; 2025) · Increase $30.507 trillion (PPP; 2025).', 'We expect real GDP growth of 1.5% and 1.3% in 2025 and 2026, respectively, with Q4 2025 growth slowing to a mere 0.8% y/y. While the probability of a recession ...', "World Economics estimates United Stat

'Based on the results from my search, the GDP of the US in 2025 is estimated to be around $30.5 trillion in nominal terms and $27.6 trillion in PPP terms. However, there is also a possibility of a recession and slower growth in the fourth quarter of 2025.'

In [22]:
output = llm.invoke("What is GDP of US in 2025?")
print(output)



It is impossible to accurately predict the GDP of the US in 2025 as it is influenced by many factors such as economic policies, global events, and technological advancements. However, according to the Congressional Budget Office, the US GDP is projected to reach $29.2 trillion in 2025, assuming no major changes in economic policies. 
