
In [22]:
# Load environment variables

from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())

True
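
The `True` above only signals that python-dotenv loaded something; it does not confirm that the specific keys used by later cells are set. A minimal sketch of an explicit check, assuming the notebook relies on `OPENAI_API_KEY` (the conventional OpenAI variable name, an assumption here) in addition to the `PINECONE_API_KEY` and `PINECONE_ENV` read further down. `missing_env_vars` is a hypothetical helper, not part of any library:

```python
import os


def missing_env_vars(names, environ=None):
    """Return the names from `names` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


# Assumed set of keys; adjust to whatever your .env file actually defines.
REQUIRED = ["OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENV"]

missing = missing_env_vars(REQUIRED)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```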

In [3]:
from langchain.llms import OpenAI
from langchain import PromptTemplate, LLMChain


In [2]:
# Import the chat message schema and ChatOpenAI to query chat models such as GPT-3.5-turbo or GPT-4

from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)
from langchain.chat_models import ChatOpenAI
     

chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.3)
messages = [
    SystemMessage(content="You are an expert data scientist"),
    HumanMessage(content="Write a Python script that trains a neural network on simulated data"),
]
response = chat(messages)

print(response.content)
     

Sure! Here's an example Python script that trains a simple neural network on simulated data using the Keras library:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Generate simulated data
np.random.seed(0)
X = np.random.rand(1000, 10)
y = np.random.randint(2, size=(1000, 1))

# Define the neural network architecture
model = Sequential()
model.add(Dense(32, input_dim=10, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=10, batch_size=32)

# Evaluate the model
loss, accuracy = model.evaluate(X, y)
print(f'Loss: {loss}, Accuracy: {accuracy}')
```

In this script, we first generate simulated data using `numpy.random.rand()` and `numpy.random.randint()`. The input data `X` is a 1000x10 matrix of random numbers between 0 and 1, and the target labels `y` are a 1000x1 matrix of random b
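
For orientation, the message classes correspond to the roles used by OpenAI's chat format: `SystemMessage` to `system`, `HumanMessage` to `user`, and `AIMessage` to `assistant`. A rough sketch of that shape in plain Python follows; the `Message` class and `to_payload` helper are hypothetical stand-ins for illustration, not LangChain code:

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for a chat message: a role plus its text."""
    role: str
    content: str


def to_payload(messages):
    """Convert messages into the list-of-dicts shape used by chat APIs."""
    return [{"role": m.role, "content": m.content} for m in messages]


conversation = [
    Message("system", "You are an expert data scientist"),
    Message("user", "Write a Python script that trains a neural network on simulated data"),
]
payload = to_payload(conversation)
print(payload)
```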

In [5]:
# Import prompt and define PromptTemplate

from langchain import PromptTemplate
from langchain.llms import OpenAI
template = """
You are an expert data scientist with an expertise in building deep learning models. 
Explain the concept of {concept} in a couple of lines
"""
llm = OpenAI()

prompt = PromptTemplate(
    input_variables=["concept"],
    template=template,
)
     

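# Run the LLM on the formatted prompt. Only the value of the last expression
# in a cell is displayed, so only the "regularization" answer appears below.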

llm(prompt.format(concept="autoencoder"))
llm(prompt.format(concept="regularization"))

'\nRegularization is a technique used to reduce overfitting of a model by adding a penalty term to the loss function. This penalty term is usually added as a regularization parameter that is multiplied by the weights in the model. This helps to reduce the complexity of the model and helps to prevent overfitting.'
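
`prompt.format(...)` fills the `{concept}` placeholder and returns a plain string, which is what gets sent to the LLM. For a single-variable template like this one, the result should look the same as what Python's built-in `str.format` produces. A minimal sketch without LangChain (`fill` is a hypothetical helper):

```python
template = """
You are an expert data scientist with an expertise in building deep learning models.
Explain the concept of {concept} in a couple of lines
"""


def fill(template, **variables):
    """Substitute named placeholders in the template with the given values."""
    return template.format(**variables)


filled = fill(template, concept="autoencoder")
print(filled)
```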

In [8]:
from langchain.chains import LLMChain
chain = LLMChain(llm=llm, prompt=prompt)

# Run the chain only specifying the input variable.
print(chain.run("autoencoder"))


Autoencoders are a type of artificial neural network that are used to learn the underlying structure of data by compressing it into a lower-dimensional representation (encoding) and then reconstructing it from that representation (decoding). Autoencoders can be used for data compression, feature extraction, and anomaly detection.


In [9]:
# Define a second prompt 

second_prompt = PromptTemplate(
    input_variables=["ml_concept"],
    template="Take the concept description of {ml_concept} and explain it to me like I'm five in 500 words",
) 
chain_two = LLMChain(llm=llm, prompt=second_prompt)

# Define a sequential chain using the two chains above: the second chain takes the output of the first chain as input

from langchain.chains import SimpleSequentialChain
overall_chain = SimpleSequentialChain(chains=[chain, chain_two], verbose=True)

# Run the chain specifying only the input variable for the first chain.
explanation = overall_chain.run("autoencoder")
print(explanation)



> Entering new SimpleSequentialChain chain...

Autoencoders are a type of neural network that can be used for unsupervised learning. They learn to encode input data into a compressed representation (the "encoding"), then decode that representation back into a reconstruction of the original input data. This process can be used to identify patterns in data and reduce its dimensionality.

An autoencoder is a type of computer program that can help teach a computer how to learn without a teacher. It's kind of like a game where the computer figures out the answer on its own! 

Let's say you have a big box of apples and oranges. An autoencoder would look at all of the apples and oranges and figure out what the most important parts are. It would then save the important parts in a smaller box, like a mini version of the big box. 

The autoencoder is kind of like a detective. It looks for patterns and figures out which parts are most important. It then takes the most important parts and stores them in the smal
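
Conceptually, `SimpleSequentialChain` pipes the single output of each chain into the next chain as its single input. A minimal sketch of that data flow with plain functions follows; `fake_llm` is a hypothetical stand-in that just echoes its prompt, so no API call is made:

```python
def fake_llm(prompt):
    # Hypothetical stand-in for an LLM call: echoes the prompt it receives.
    return f"<answer to: {prompt}>"


def make_chain(template, variable):
    """Build a one-input step that fills `template` and calls the fake LLM."""
    def step(value):
        return fake_llm(template.format(**{variable: value}))
    return step


def run_sequential(steps, first_input):
    """Feed each step's output into the next step, in order."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


first = make_chain("Explain the concept of {concept}", "concept")
second = make_chain("Explain to a five-year-old: {ml_concept}", "ml_concept")

result = run_sequential([first, second], "autoencoder")
print(result)
```

The nesting in the printed string shows the first step's answer embedded in the second step's prompt, which is the behavior the verbose trace above displays in two colors.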

In [10]:
# Import utility for splitting up texts and split up the explanation given above into document chunks

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=0,
)

texts = text_splitter.create_documents([explanation])


In [14]:
# Individual text chunks can be accessed with "page_content"

texts[0].page_content

'An autoencoder is a type of computer program that can help teach a computer how to learn without a'
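
The first chunk ends at a word boundary just under the 100-character limit. The effect can be approximated with the standard library's `textwrap.wrap`, which also packs whole words into lines of at most `width` characters. This is a simplified illustration, not the algorithm `RecursiveCharacterTextSplitter` uses (that class tries a list of separators in turn and does not normalize whitespace); `sample` is an excerpt of the explanation printed earlier:

```python
import textwrap

# Excerpt of the explanation text shown in the chain output above.
sample = (
    "An autoencoder is a type of computer program that can help teach a computer "
    "how to learn without a teacher. It's kind of like a game where the computer "
    "figures out the answer on its own!"
)

# Pack whole words into chunks of at most 100 characters each.
chunks = textwrap.wrap(sample, width=100)
for chunk in chunks:
    print(len(chunk), repr(chunk))
```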

In [19]:
# Import and instantiate OpenAI embeddings

from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

In [20]:
# Turn the first text chunk into a vector with the embedding

query_result = embeddings.embed_query(texts[0].page_content)
print(query_result)

[-0.03306716795226674, -0.009183744287162132, -0.005405756832413864, 0.010818621518963742, -0.008373414456719947, 0.020428855680738923, 0.012837339637917097, -0.00867195736894908, -0.02274611671192141, -0.018296408170318898, 0.028517943864824372, 0.035853565982878366, -0.021139672759309072, -0.012147847984400207, -0.006585711376074457, -0.01761402437093802, -0.0027473043672022407, 0.006002842191210827, 0.008458712431642557, 0.007239662361927363, -0.036166326466024745, 0.013583696918489933, 0.007182797200532724, -0.025845275225679417, -0.0034439043405164508, -0.03301030139388819, 0.026541874733332322, -0.05805946342872006, -0.00761284100221378, -0.004837103821483568, 0.02041463997246689, -0.012204713611456149, -0.0014242978562114196, -0.04825020003791039, -0.0211112394801198, -0.015936498151675103, 0.0017637125595372023, -0.003296409847935888, 0.02156616263392212, 0.004748251919492951, 0.02621490040455913, 0.014301620919873491, -0.019760689452275288, -0.008401846804586615, -0.0027899533

In [23]:
# Import and initialize Pinecone client

import os
import pinecone
from langchain.vectorstores import Pinecone


pinecone.init(
    api_key=os.getenv('PINECONE_API_KEY'),  
    environment=os.getenv('PINECONE_ENV')  
)



In [None]:
index_name = "langchain-demo"

# Create the index only if it does not already exist
if index_name not in pinecone.list_indexes():
    pinecone.create_index(
        name=index_name,
        metric='cosine',
        dimension=1536,  # must match the length of the embedding vectors
    )


In [34]:
index_description = pinecone.describe_index(index_name)
print(index_description)

IndexDescription(name='langchain-demo', metric='cosine', replicas=1, dimension=1536.0, shards=1, pods=1, pod_type='p1', status={'ready': True, 'state': 'Ready'}, metadata_config=None, source_collection='')


In [35]:
# Upload vectors to Pinecone

search = Pinecone.from_documents(texts, embeddings, index_name=index_name)

In [36]:
# Do a simple vector similarity search

query = "What is magical about an autoencoder?"
result = search.similarity_search(query)

print(result)

[Document(page_content='The autoencoder also has a special superpower. It can take the important parts from the smaller box', metadata={}), Document(page_content="Let's say you have a big box of apples and oranges. An autoencoder would look at all of the apples", metadata={}), Document(page_content='Autoencoders can be used for unsupervised learning, which means that the computer', metadata={}), Document(page_content='The autoencoder is kind of like a detective. It looks for patterns and figures out which parts are', metadata={})]
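
A similarity search ranks stored vectors by how close they are to the query vector; the index above was created with `metric='cosine'`. Below is a brute-force sketch of that ranking on tiny made-up 3-dimensional vectors. Real embeddings here have 1536 dimensions, and Pinecone uses its own indexing rather than a linear scan; the vectors, labels, and `top_k` helper are illustrative assumptions:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, stored, k=2):
    """Return the k labels whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), label) for label, vec in stored.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [label for _, label in scored[:k]]


# Made-up vectors standing in for embedded text chunks.
stored = {
    "superpower": [0.9, 0.1, 0.0],
    "apples and oranges": [0.1, 0.9, 0.0],
    "detective": [0.0, 0.2, 0.9],
}
query_vector = [1.0, 0.2, 0.0]

print(top_k(query_vector, stored, k=2))
```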


In [39]:
# Import Python REPL tool and instantiate Python agent

from langchain.agents.agent_toolkits import create_python_agent
from langchain.tools.python.tool import PythonREPLTool
from langchain.python import PythonREPL
from langchain.llms.openai import OpenAI

agent_executor = create_python_agent(
    llm=OpenAI(temperature=0, max_tokens=100),
    tool=PythonREPLTool(),
    verbose=True
)

In [40]:
# Execute the Python agent

agent_executor.run("Find the roots (zeros) of the quadratic function 3 * x**2 + 2*x -1")



> Entering new AgentExecutor chain...

 I need to solve a quadratic equation
Action: Python_REPL
Action Input: from scipy.optimize import root
Observation: ModuleNotFoundError("No module named 'scipy'")
Thought:

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-text-davinci-003 in organization org-OWN0d8SxGQLsBXj66ycuLneL on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..

[32;1m[1;3m I need to install scipy
Action: Python_REPL
Action Input: pip install scipy[0m
Observation: [36;1m[1;3mSyntaxError('invalid syntax', ('<string>', 1, 5, 'pip install scipy\n', 1, 12))[0m
Thought:

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-text-davinci-003 in organization org-OWN0d8SxGQLsBXj66ycuLneL on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..

KeyboardInterrupt: 
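
The run above was interrupted before the agent produced an answer. For reference, the roots it was asked for follow directly from the quadratic formula and need no third-party package. A minimal standard-library sketch follows; `quadratic_roots` is a hypothetical helper and handles real roots only:

```python
import math


def quadratic_roots(a, b, c):
    """Return the real roots of a*x**2 + b*x + c, smallest first."""
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []  # no real roots
    sqrt_d = math.sqrt(discriminant)
    # A set collapses the repeated root when the discriminant is zero.
    return sorted({(-b - sqrt_d) / (2 * a), (-b + sqrt_d) / (2 * a)})


roots = quadratic_roots(3, 2, -1)
print(roots)
```

For 3 * x**2 + 2*x - 1 the discriminant is 16, so the roots are x = -1 and x = 1/3.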