<a href="https://colab.research.google.com/github/rabbitmetrics/langchain-13-min/blob/main/notebooks/langchain-13-min.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [3]:
# Load environment variables

from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())

True
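
The `True` above only says that a `.env` file was found and loaded, not that it contains the keys the later cells need. A minimal sketch of a pre-flight check follows. The helper `missing_variables` is hypothetical (not part of dotenv or LangChain), and the list of names assumes the OpenAI wrapper reads `OPENAI_API_KEY` while the Pinecone cell reads `PINECONE_API_KEY` and `PINECONE_ENV`.

```python
import os
from typing import Iterable, List, Mapping


def missing_variables(names: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or empty in the given environment mapping."""
    return [name for name in names if not env.get(name)]


# Assumed set of variables used by this notebook; adjust to your own setup.
required = ["OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENV"]
missing = missing_variables(required)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running this right after `load_dotenv` can surface a missing or empty key before an API call fails with a less specific error.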

In [4]:
# Run basic query with OpenAI wrapper

from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003")
llm("explain large language models in one sentence")

'\n\nLarge language models are artificial neural networks that are trained on large text corpora to predict the next word or character in a sequence.'

In [5]:
# Import the chat message schema and ChatOpenAI to query chat models such as gpt-3.5-turbo or GPT-4

from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)
from langchain.chat_models import ChatOpenAI

In [6]:
chat = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.3)
messages = [
    SystemMessage(content="You are an expert data scientist"),
    HumanMessage(content="Write a Python script that trains a neural network on simulated data")
]
response = chat(messages)

print(response.content)

Certainly! Here's an example script that trains a simple neural network on simulated data using the Keras library:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Generate simulated data
np.random.seed(0)
X = np.random.rand(100, 2)
y = np.random.randint(0, 2, size=(100,))

# Define the model architecture
model = Sequential()
model.add(Dense(10, input_dim=2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=10, batch_size=10)

# Evaluate the model
loss, accuracy = model.evaluate(X, y)
print(f"Loss: {loss}")
print(f"Accuracy: {accuracy}")
```

In this script, we first generate a simulated dataset with 100 samples and 2 features. We then define a simple neural network model with one hidden layer of 10 neurons and an output layer with one neuron. The model is compiled with binary cros

In [7]:
# Import prompt and define PromptTemplate

from langchain import PromptTemplate

template = """
You are an expert data scientist with an expertise in building deep learning models. 
Explain the concept of {concept} in a couple of lines
"""

prompt = PromptTemplate(
    input_variables=["concept"],
    template=template,
)

In [8]:
# Run LLM with PromptTemplate

llm(prompt.format(concept="autoencoder"))

'\nAn autoencoder is a type of neural network that is used to learn efficient data codings in an unsupervised manner. It consists of an encoder which compresses the input into a lower dimensional representation and a decoder which reconstructs the input from this lower dimensional representation.'
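
For a simple template like this one, `prompt.format(concept=...)` amounts to substituting the named placeholder into the string before it is sent to the model. The sketch below reproduces that step with plain `str.format`. The `render` helper is a hypothetical stand-in, not LangChain's implementation, and it skips the input-variable validation that `PromptTemplate` performs.

```python
# Same template text as in the cell above.
template = """
You are an expert data scientist with an expertise in building deep learning models.
Explain the concept of {concept} in a couple of lines
"""


def render(template_text: str, **variables: str) -> str:
    """Fill the {placeholders} in template_text; raises KeyError if one is missing."""
    return template_text.format(**variables)


rendered = render(template, concept="autoencoder")
print(rendered)
```

Printing the rendered prompt is a cheap way to confirm what text the LLM actually receives.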

In [9]:
# Import LLMChain and define chain with language model and prompt as arguments.

from langchain.chains import LLMChain
chain = LLMChain(llm=llm, prompt=prompt)

# Run the chain only specifying the input variable.
print(chain.run("autoencoder"))


Autoencoders are a type of neural network used for learning efficient representations of data. They are trained to reconstruct their input, which forces the network to learn a compressed representation of the input data that captures its most important characteristics. Autoencoders can be used for a variety of tasks, such as dimensionality reduction, denoising, and feature extraction.


In [10]:
# Define a second prompt 

second_prompt = PromptTemplate(
    input_variables=["ml_concept"],
    template="Turn the concept description of {ml_concept} and explain it to me like I'm five in 500 words",
)
chain_two = LLMChain(llm=llm, prompt=second_prompt)

In [11]:
# Define a sequential chain using the two chains above: the second chain takes the output of the first chain as input

from langchain.chains import SimpleSequentialChain
overall_chain = SimpleSequentialChain(chains=[chain, chain_two], verbose=True)

# Run the chain specifying only the input variable for the first chain.
explanation = overall_chain.run("autoencoder")
print(explanation)



> Entering new SimpleSequentialChain chain...

An autoencoder is a type of artificial neural network used to learn efficient data encodings in an unsupervised manner. It is comprised of an encoder and a decoder, which learn to compress and decompress data, respectively. Autoencoders are used for feature learning, dimensionality reduction, and reconstruction of inputs.

An Autoencoder is a special type of neural network that helps computers learn how to better understand data. It’s made up of two different parts: an encoder and a decoder. 

The encoder’s job is to take data (like an image or a sound) and make it smaller and simpler. It does this by breaking the data down into smaller pieces. The encoder takes the data and stores it in a format that takes up less space. 

The decoder’s job is to take the simplified data from the encoder and turn it back into something useable. The decoder takes the smaller pieces of data and builds it back up into
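
The data flow of the sequential chain above can be sketched without any API calls: each step takes one string and returns one string, and the output of a step becomes the input of the next. The functions `describe` and `simplify` are hypothetical stand-ins for the two LLM chains. They only wrap their input in a label, so the example shows the plumbing rather than real model output.

```python
from typing import Callable, List


def run_sequential(steps: List[Callable[[str], str]], first_input: str) -> str:
    """Feed first_input to the first step, then pass each output to the next step."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


def describe(concept: str) -> str:
    # Stand-in for the first chain (a real chain would call the LLM here).
    return f"[description of {concept}]"


def simplify(description: str) -> str:
    # Stand-in for the second chain, which consumes the first chain's output.
    return f"[ELI5 version of {description}]"


result = run_sequential([describe, simplify], "autoencoder")
print(result)  # [ELI5 version of [description of autoencoder]]
```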

In [12]:
# Import utility for splitting up texts and split up the explanation given above into document chunks

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=0,
)

texts = text_splitter.create_documents([explanation])


In [14]:
# Individual text chunks can be accessed with "page_content"

texts

[Document(page_content='An Autoencoder is a special type of neural network that helps computers learn how to better', metadata={}),
 Document(page_content='understand data. It’s made up of two different parts: an encoder and a decoder.', metadata={}),
 Document(page_content='The encoder’s job is to take data (like an image or a sound) and make it smaller and simpler. It', metadata={}),
 Document(page_content='does this by breaking the data down into smaller pieces. The encoder takes the data and stores it in', metadata={}),
 Document(page_content='a format that takes up less space.', metadata={}),
 Document(page_content='The decoder’s job is to take the simplified data from the encoder and turn it back into something', metadata={}),
 Document(page_content='useable. The decoder takes the smaller pieces of data and builds it back up into something that’s', metadata={}),
 Document(page_content='easy to use.', metadata={}),
 Document(page_content='An Autoencoder is used to help computers l
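
To make the effect of `chunk_size=100` concrete, here is a simplified sketch that greedily packs whole words into chunks of at most 100 characters. It is not the `RecursiveCharacterTextSplitter` algorithm, which works through a list of separators and also handles overlap. The function name `split_into_chunks` is hypothetical, and the sample text is the opening of the explanation above, with a plain ASCII apostrophe.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 100) -> List[str]:
    """Greedily pack whitespace-separated words into chunks of at most chunk_size characters.

    A single word longer than chunk_size is kept whole in its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= chunk_size or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


sample = (
    "An Autoencoder is a special type of neural network that helps computers learn "
    "how to better understand data. It's made up of two different parts: an encoder "
    "and a decoder."
)
chunks = split_into_chunks(sample, chunk_size=100)
for chunk in chunks:
    print(len(chunk), repr(chunk))
```

On this sample the greedy packing breaks at the same word boundary as the first `Document` shown above, which is a useful sanity check on how the size limit behaves.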

In [15]:
# Import and instantiate OpenAI embeddings

from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model_name="ada")

In [16]:
# Turn the first text chunk into a vector with the embedding

query_result = embeddings.embed_query(texts[0].page_content)
print(query_result)

[-0.030426677826509596, 0.034438308905724725, 0.008450140386919611, 0.01471959917336777, 0.035631511957744715, 0.011746878001326882, -0.023905205009790795, 0.0012266333420575298, 0.030262098223654783, -0.00018547363599168737, -0.038223643496675785, -0.02050046284591732, 0.013310385159770195, 0.01953355628216141, 0.011078273131898558, -0.011726306016631325, 0.03624868453712767, 0.006254029565095844, 0.017414592498421807, 0.010023934003371997, 0.033121672082886223, -0.006938064121537779, 0.07262080284507386, -0.011067986673889484, 0.012662353206360017, 0.02590073502271188, 0.015645360836409977, 0.05916641379243477, -0.010502243590584121, -0.023411466201226357, -0.02606531462556669, 0.017167723094139588, 0.03777104679485728, 0.024419516734373384, 0.043572482452102386, 0.022938300308680066, 0.0255921487330204, -0.011500008597044663, -0.027032221189322605, -0.016385969049256637, 0.015851084408655902, -0.01736316113969903, 0.0005056326711565846, 0.008280418020721556, -0.016643125842870518, -

In [27]:
# Import and initialize Pinecone client

import os
import pinecone
from langchain.vectorstores import Pinecone
from tqdm.autonotebook import tqdm

pinecone.init(
    api_key=os.getenv('PINECONE_API_KEY'),  
    environment=os.getenv('PINECONE_ENV')  
)

In [32]:
# Upload vectors to Pinecone
index_name = "langchain-quickstart"
index = pinecone.Index(index_name)
search = Pinecone.from_documents(texts, embeddings, index_name=index_name)

ForbiddenException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'content-length': '0', 'date': 'Sat, 29 Jul 2023 20:15:37 GMT', 'server': 'envoy', 'connection': 'close'})


In [33]:
# Do a simple vector similarity search (needs `search` from the upload cell, which must succeed first)

query = "What is magical about an autoencoder?"
result = search.similarity_search(query)

print(result)

NameError: name 'search' is not defined
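
The `NameError` follows from the failed upload: `search` was never assigned because the previous cell raised the 403. As a rough illustration of what a vector similarity search does, the sketch below ranks a few toy vectors against a query vector by cosine similarity, entirely in memory. The vectors are made up for the example, the helpers `cosine_similarity` and `top_k` are hypothetical, and cosine as the metric is an assumption since the metric actually used depends on how the Pinecone index is configured.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], docs: Sequence[Sequence[float]], k: int = 2) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k document vectors most similar to the query."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(docs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy 3-dimensional "embeddings"; real embeddings have far more dimensions.
doc_vectors = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
query_vector = [0.9, 0.1, 0.0]

ranked = top_k(query_vector, doc_vectors, k=2)
for index, score in ranked:
    print(index, round(score, 3))
```

In the notebook, the query string would first be embedded with `embeddings.embed_query`, and the document vectors would come from the text chunks.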

In [None]:
# Import Python REPL tool and instantiate Python agent

from langchain.agents.agent_toolkits import create_python_agent
from langchain.tools.python.tool import PythonREPLTool
from langchain.python import PythonREPL
from langchain.llms.openai import OpenAI

agent_executor = create_python_agent(
    llm=OpenAI(temperature=0, max_tokens=1000),
    tool=PythonREPLTool(),
    verbose=True
)

In [None]:
# Execute the Python agent

agent_executor.run("Find the roots (zeros) of the quadratic function 3 * x**2 + 2*x - 1")
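
Since the agent cells were not executed, there is no output to compare against. The answer the agent should arrive at can be checked independently with the quadratic formula. The sketch below is a plain-Python reference calculation, not what the agent generates, and `quadratic_roots` is a hypothetical helper name.

```python
import math
from typing import Tuple


def quadratic_roots(a: float, b: float, c: float) -> Tuple[float, float]:
    """Real roots of a*x**2 + b*x + c, smaller root first; raises if there are none."""
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        raise ValueError("no real roots")
    sqrt_disc = math.sqrt(discriminant)
    return ((-b - sqrt_disc) / (2 * a), (-b + sqrt_disc) / (2 * a))


# 3*x**2 + 2*x - 1 factors as (3*x - 1) * (x + 1), so the roots are -1 and 1/3.
roots = quadratic_roots(3, 2, -1)
print(roots)
```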