In [1]:
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())

Python-dotenv could not parse statement starting at line 3


True

In [2]:
from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003")
llm("explain large language models in one sentence")

'\n\nLarge language models are machine learning algorithms that can learn from large datasets of natural language, allowing them to generate human-like text.'

In [3]:
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)
from langchain.chat_models import ChatOpenAI

In [4]:
chat = ChatOpenAI(model_name="gpt-3.5-turbo",temperature=0.3)
messages = [
    SystemMessage(content="You are an expert data scientist"),
    HumanMessage(content="Write a Python script that trains a neural network on simulated data ")
]
response=chat(messages)

In [5]:
print(response.content,end='\n')

Sure, here's an example Python script that trains a neural network on simulated data using the Keras library:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Generate simulated data
X = np.random.rand(1000, 10)
y = np.random.randint(2, size=(1000, 1))

# Define the neural network architecture
model = Sequential()
model.add(Dense(32, input_dim=10, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=10, batch_size=32)

# Evaluate the model on new data
X_test = np.random.rand(100, 10)
y_test = np.random.randint(2, size=(100, 1))
loss, accuracy = model.evaluate(X_test, y_test)
print('Test loss:', loss)
print('Test accuracy:', accuracy)
```

This script generates a simulated dataset with 1000 samples, each with 10 features and a binary label. It th

In [6]:
from langchain import PromptTemplate

template = """
You are an expert data scientist with an expertise on building deep learning models.
Explain the concept of {concept} in a couple of lines
"""

prompt = PromptTemplate(
    input_variables=["concept"],
    template=template,
)

In [7]:
prompt

PromptTemplate(input_variables=['concept'], output_parser=None, partial_variables={}, template='\nYou are an expert data scientist with an expertise on building deep learning models.\nExplain the concept of {concept} in a couple of lines\n', template_format='f-string', validate_template=True)

In [8]:
llm(prompt.format(concept="regularization"))

'\nRegularization is a technique used in machine learning models to prevent overfitting. It adds a penalty to the cost function of a model, which forces the model to learn the underlying structure of the data and not just the noise. By adding a regularization term to the cost function, the model will be forced to learn the underlying structure of the data instead of just memorizing it. This means that the model can generalize better, leading to better and more reliable predictions.'

In [9]:
llm(prompt.format(concept="autoencoder"))

'\nAn autoencoder is a type of artificial neural network that is used for unsupervised learning. It is designed to reduce the dimensionality of a given dataset by learning a compressed representation of its input data. The autoencoder consists of two parts: an encoder that learns a compressed representation of the input data and a decoder that reconstructs the original data from the compressed representation.'
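
The repr in cell 7 reports `template_format='f-string'`, which suggests that `prompt.format(...)` performs named placeholder substitution much like Python's built-in `str.format`. The sketch below reproduces that step without LangChain or an API key; `format_prompt` is a hypothetical helper written for this illustration, not a LangChain function.

```python
# Plain-Python stand-in for the template defined in cell 6.
template = """
You are an expert data scientist with an expertise on building deep learning models.
Explain the concept of {concept} in a couple of lines
"""


def format_prompt(template: str, **variables: str) -> str:
    """Fill each {placeholder} in the template with the matching keyword argument."""
    return template.format(**variables)


filled = format_prompt(template, concept="regularization")
print(filled)
```

The filled string is what would be sent to the model; a missing variable raises `KeyError` in this sketch.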

In [10]:
from langchain.chains import LLMChain
chain = LLMChain(llm=llm, prompt=prompt)

# Run the chain only specifying the input variable.
print(chain.run("autoencoder"))


An autoencoder is a type of artificial neural network used to learn a compressed representation of the input data (encoding) by training the network to reconstruct the input data from the encoded version (decoding). This allows the network to learn useful features from the data that can be used for classification or other tasks.


In [11]:
second_prompt = PromptTemplate(
    input_variables=["ml_concept"],
    template="Turn the concept description of {ml_concept} and explain it to me like I'm five in 500 words",
)
chain_two = LLMChain(llm=llm, prompt=second_prompt)

In [12]:
from langchain.chains import SimpleSequentialChain
overall_chain = SimpleSequentialChain(chains=[chain, chain_two], verbose=True)

# Run the chain specifying only the input variable for the first chain.
explanation = overall_chain.run("autoencoder")
print(explanation)



> Entering new SimpleSequentialChain chain...

An autoencoder is a type of artificial neural network used to learn data representations in an unsupervised manner by reconstructing its own inputs. It consists of an encoder that compresses the data into a lower-dimensional representation, and a decoder that reconstructs the original data from the compressed representation.

An autoencoder is like a machine that can learn how to copy things. It can look at a picture or a document and it can learn how to copy it. It does this by taking the thing it wants to copy and breaking it down into smaller pieces. This process of breaking down into smaller pieces is called “encoding”. 

Once the autoencoder has broken the thing down into small pieces, it can put those pieces back together in the same order it found them in. This process of putting the pieces back together is called “decoding”. 

This machine is called an autoencoder because it can learn how to
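
The verbose log above shows the first chain's output becoming the second chain's input. A minimal sketch of that data flow, assuming each chain can be treated as a function from one string to one string; `run_sequential`, `describe`, and `simplify` are hypothetical stand-ins and no model is called.

```python
from typing import Callable, List


def run_sequential(steps: List[Callable[[str], str]], first_input: str) -> str:
    """Feed the output of each step into the next one and return the last output."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Hypothetical stand-ins for the two LLM-backed chains.
def describe(concept: str) -> str:
    return f"A short technical description of {concept}."


def simplify(description: str) -> str:
    return f"Explained simply: {description}"


result = run_sequential([describe, simplify], "autoencoder")
print(result)
```

Only the first step sees the original input, which matches the way the notebook passes a single string to `overall_chain.run`.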

In [13]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 100,
    chunk_overlap = 0,
)

texts = text_splitter.create_documents([explanation])

In [14]:
texts

[Document(page_content='An autoencoder is like a machine that can learn how to copy things. It can look at a picture or a', metadata={}),
 Document(page_content='document and it can learn how to copy it. It does this by taking the thing it wants to copy and', metadata={}),
 Document(page_content='breaking it down into smaller pieces. This process of breaking down into smaller pieces is called', metadata={}),
 Document(page_content='“encoding”.', metadata={}),
 Document(page_content='Once the autoencoder has broken the thing down into small pieces, it can put those pieces back', metadata={}),
 Document(page_content='together in the same order it found them in. This process of putting the pieces back together is', metadata={}),
 Document(page_content='called “decoding”.', metadata={}),
 Document(page_content='This machine is called an autoencoder because it can learn how to copy things without being told how', metadata={}),
 Document(page_content='to do it. It can learn how to copy thing

In [15]:
texts[0].page_content

'An autoencoder is like a machine that can learn how to copy things. It can look at a picture or a'
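
The chunks above end on word boundaries and stay under 100 characters. The sketch below is a deliberately simplified greedy splitter that shows the same effect; it is not the algorithm `RecursiveCharacterTextSplitter` uses, and `split_into_chunks` is a hypothetical name.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 100) -> List[str]:
    """Split on blank lines, then pack whole words into chunks.

    A chunk exceeds chunk_size only when a single word is longer than the limit.
    """
    chunks: List[str] = []
    for paragraph in text.split("\n\n"):
        current = ""
        for word in paragraph.split():
            candidate = word if not current else current + " " + word
            if len(candidate) <= chunk_size or not current:
                current = candidate
            else:
                # The next word would overflow the limit: close this chunk.
                chunks.append(current)
                current = word
        if current:
            chunks.append(current)
    return chunks


sample = (
    "An autoencoder is like a machine that can learn how to copy things. "
    "It can look at a picture or a document and it can learn how to copy it.\n\n"
    "Once the autoencoder has broken the thing down into small pieces, it can "
    "put those pieces back together in the same order it found them in."
)

for chunk in split_into_chunks(sample, chunk_size=100):
    print(len(chunk), repr(chunk))
```

With `chunk_overlap = 0`, as in cell 13, no text is repeated between neighbouring chunks; this sketch does not implement overlap at all.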

In [16]:
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

In [17]:
query_result = embeddings.embed_query(texts[0].page_content)
query_result

[-0.03644082695245743,
 0.003638610942289233,
 0.00861092284321785,
 -0.018357202410697937,
 -0.006825815420597792,
 0.026318373158574104,
 -0.01692090928554535,
 -0.013555877842009068,
 -0.0012507725041359663,
 -0.03135224059224129,
 0.006907889153808355,
 0.05788948014378548,
 -0.004062659572809935,
 -0.018302487209439278,
 0.02247457765042782,
 0.013022396713495255,
 -0.005618644412606955,
 0.015990737825632095,
 -0.021900061517953873,
 -0.02164015918970108,
 -0.03217298164963722,
 0.009034971706569195,
 -0.006323112640529871,
 -0.02949189953505993,
 0.013172865845263004,
 0.015005850233137608,
 0.025401880964636803,
 -0.04018886759877205,
 -0.008207392878830433,
 0.02345946617424488,
 0.024389635771512985,
 -0.027782024815678596,
 -0.027275903150439262,
 -0.047712311148643494,
 -0.022501936182379723,
 -0.00843309611082077,
 -0.0009789025643840432,
 -0.017331277951598167,
 0.009103367105126381,
 -0.0007224215660244226,
 0.01879492960870266,
 0.01849399320781231,
 -0.0127214593812823
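
An embedding is a plain list of floats, so two pieces of text can be compared by comparing their vectors. Cosine similarity is one common choice; the notebook does not state which metric the Pinecone index uses, so treat the sketch below as an illustration only. The three toy vectors are made up and far shorter than a real embedding.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example.
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 1.0]
v3 = [0.0, 1.0, 0.0]

print(cosine_similarity(v1, v2))  # identical direction
print(cosine_similarity(v1, v3))  # orthogonal vectors
```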

In [31]:
import os
import pinecone
from langchain.vectorstores import Pinecone

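# initialize pinecone
# os.getenv() takes the *name* of an environment variable, not its value.
# PINECONE_API_KEY and PINECONE_ENVIRONMENT are assumed names; define them in the
# .env file loaded in the first cell and keep the real key out of the notebook.
pinecone.init(
    api_key=os.getenv('PINECONE_API_KEY'),
    environment=os.getenv('PINECONE_ENVIRONMENT')
)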

In [34]:
index_name = "langchain-quickstart"
search = Pinecone.from_documents(texts, embeddings, index_name=index_name)

TypeError: expected string or bytes-like object, got 'NoneType'
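
A `NoneType` error like this one is consistent with `os.getenv` having returned `None`, which happens when the variable is not set or when the secret value is passed where the variable name belongs. The sketch below shows the difference and a small guard that fails early; `DEMO_API_KEY`, its placeholder value, and `require_env` are hypothetical names used only for this demonstration.

```python
import os

# Hypothetical variable name and placeholder value.
os.environ["DEMO_API_KEY"] = "not-a-real-key"

by_name = os.getenv("DEMO_API_KEY")     # looks up the variable called DEMO_API_KEY
by_value = os.getenv("not-a-real-key")  # looks up a variable *named* after the value

print(by_name)
print(by_value)


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if value is None:
        raise RuntimeError(f"environment variable {name!r} is not set")
    return value
```

Calling `require_env` before initializing a client turns a confusing downstream `TypeError` into an immediate, readable failure.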