In [1]:
from langchain.document_loaders import DirectoryLoader
from langchain.text_splitter import CharacterTextSplitter

loader = DirectoryLoader(
    './data/', # my local directory
    glob='**/*.epub',     # match EPUB files only
    show_progress=True
)
docs = loader.load()

text_splitter = CharacterTextSplitter(
    chunk_size=1000, 
    chunk_overlap=0
)

docs_split = text_splitter.split_documents(docs)
print(len(docs_split))
print(docs_split[200])

100%|██████████| 1/1 [00:05<00:00,  5.69s/it]
Created a chunk of size 1044, which is longer than the specified 1000
Created a chunk of size 1051, which is longer than the specified 1000
Created a chunk of size 1124, which is longer than the specified 1000
Created a chunk of size 1290, which is longer than the specified 1000
Created a chunk of size 1164, which is longer than the specified 1000
Created a chunk of size 1068, which is longer than the specified 1000
Created a chunk of size 1076, which is longer than the specified 1000
Created a chunk of size 1115, which is longer than the specified 1000
Created a chunk of size 1034, which is longer than the specified 1000
Created a chunk of size 1318, which is longer than the specified 1000
Created a chunk of size 1098, which is longer than the specified 1000
Created a chunk of size 1261, which is longer than the specified 1000
Created a chunk of size 1141, which is longer than the specified 1000
Created a chunk of size 1176, which is longe

619
page_content='At larger companies, fixing the bottlenecks might be out of your\ncircle of influence, and the best you can do is work around them. At\nsmaller startups, you often can directly address the bottlenecks\nthemselves. When I started working on the user growth team at Quora, for\nexample, we had to get design approval for most of our live traffic\nexperiments. Approval meetings were a bottleneck. But over time, we\neliminated that bottleneck by building up mutual trust; the founders\nknew that our team would use our best judgment and solicit feedback on\nthe experiments that might be controversial. Not having to explicitly\nsecure approval for every single experiment meant that our team could\niterate at a much faster pace and try out many more ideas.' metadata={'source': 'data/Effective-Engineer.epub'}
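
The "Created a chunk of size ..." warnings above appear because `CharacterTextSplitter` only cuts at its separator (by default a blank line), so a single paragraph longer than `chunk_size` is kept whole. The sketch below is a simplified, pure-Python illustration of that behaviour, not LangChain's actual implementation; the function name `split_on_separator` and the sample text are hypothetical.

```python
def split_on_separator(text, chunk_size, separator="\n\n"):
    """Greedy separator-based splitter (illustrative only, not LangChain's code)."""
    chunks, current = [], ""
    for piece in text.split(separator):
        # Try to append the next piece to the chunk being built.
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single piece is never cut, so it may exceed chunk_size.
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Hypothetical sample: three paragraphs of 30, 120 and 40 characters.
sample = "a" * 30 + "\n\n" + "b" * 120 + "\n\n" + "c" * 40
chunks = split_on_separator(sample, chunk_size=100)
oversized = [len(c) for c in chunks if len(c) > 100]
print([len(c) for c in chunks], oversized)
```

If a hard upper bound on chunk length matters, `RecursiveCharacterTextSplitter` from the same `langchain.text_splitter` module is worth trying, since it falls back to finer separators when a piece is still too long.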


In [3]:
import os
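# Read credentials from the environment instead of hardcoding them in the notebook.
# Keys that were ever committed in plain text should be revoked and regenerated.
PINECONE_API_KEY = os.environ['PINECONE_API_KEY']
PINECONE_ENV = 'northamerica-northeast1-gcp'
# OpenAIEmbeddings and ChatOpenAI pick up OPENAI_API_KEY from the environment.
OPENAI_API_KEY = os.environ['OPENAI_API_KEY']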

In [5]:
import pinecone 
from langchain.vectorstores import Pinecone
from langchain.embeddings.openai import OpenAIEmbeddings

# Use the OpenAI embedding model
embeddings = OpenAIEmbeddings()
pinecone.init(
    api_key=PINECONE_API_KEY,
    environment=PINECONE_ENV
)

doc_db = Pinecone.from_documents(
    docs_split, 
    embeddings, 
    index_name='effective-engineer'
)

In [6]:
query = "What's the most important mindset of an effective engineer"
search_docs = doc_db.similarity_search(query)

In [8]:
search_docs

[Document(page_content='This story might not seem like it has much to do with engineering,\nbut it illustrates the power of the right mindset about any of our\nskills. How we view our own intelligence, character, and abilities\nprofoundly affects how we lead our lives; it largely determines whether\nwe remain stuck in our current situations or achieve what we value.\nThat’s what Stanford psychologist Carol Dweck concludes in her book\nMindset, written after 20 years of researching people’s\nself-perceptions and beliefs.\xa04 Dweck’s\nfindings are relevant to us engineers, because how we view our own\neffectiveness impacts how much effort we invest in improving it.', metadata={'source': 'data/Effective-Engineer.epub'}),
 Document(page_content='This research reminds us that the mindset we adopt about our\neffectiveness as engineers drastically shapes whether we learn and grow\nor let our skills plateau and stagnate. Do we treat our abilities as\nfixed quantities outside of our control? O
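
Conceptually, `similarity_search` embeds the query and returns the stored chunks whose vectors are closest to it. The following is a minimal sketch of that ranking step, assuming cosine similarity as the metric (the metric actually used depends on how the Pinecone index was created). The three-dimensional vectors and the names `cosine`, `top_k` and `toy_vectors` are made up for illustration; real OpenAI embeddings have far more dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=2):
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical toy embeddings, one per chunk.
toy_vectors = {
    "mindset": [0.9, 0.1, 0.0],
    "bottlenecks": [0.1, 0.9, 0.0],
    "testing": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.2, 0.0]
nearest = top_k(query_vec, toy_vectors, k=2)
print(nearest)
```

A hosted vector store performs the same kind of ranking with an approximate nearest-neighbour index rather than a full sort, which is what keeps it fast on large collections.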

In [9]:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI()

In [10]:
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type='stuff',
    retriever=doc_db.as_retriever(),
)

query = "What's the most important mindset of an effective engineer"
result = qa.run(query)

result

'According to the context provided, the most important mindset of an effective engineer is having a growth mindset. This means viewing failures as opportunities to learn and improve, and directing efforts towards improving oneself.'
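
The `chain_type='stuff'` setting used above means the retrieved chunks are all placed ("stuffed") into a single prompt together with the question. The sketch below illustrates the idea only; the prompt wording and the helper `build_stuff_prompt` are hypothetical and are not LangChain's actual template.

```python
def build_stuff_prompt(question, documents):
    """Join every retrieved chunk into one context block ahead of the question."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical retrieved chunks, shortened for readability.
retrieved = [
    "How we view our own effectiveness impacts how much effort we invest in improving it.",
    "The mindset we adopt shapes whether we learn and grow or let our skills plateau.",
]
prompt = build_stuff_prompt(
    "What's the most important mindset of an effective engineer", retrieved
)
print(prompt)
```

Because every chunk lands in one prompt, this approach works only while the combined text fits in the model's context window; with many or long chunks, a chain type that processes documents in several passes may be a better fit.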

In [11]:
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.agents import AgentType

In [12]:
tools = load_tools(['wikipedia'], llm=llm)
agent = initialize_agent(
    tools, 
    llm, 
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, 
    verbose=True
)

In [13]:
agent.run('When was google created?')



> Entering new  chain...
I need to find the date when Google was created.
Action: Wikipedia
Action Input: "Google creation date"
Observation: Page: Google Workspace
Summary: Google Workspace (formerly known as Google Apps and later G Suite) is a collection of cloud computing, productivity and collaboration tools, software and products developed and marketed by Google. It consists of Gmail, Contacts, Calendar, Meet and Chat for communication; Currents for employee engagement; Drive for storage; and the Google Docs Editors suite for content creation. An Admin Panel is provided for managing users and services. Depending on edition Google Workspace may also include the digital interactive whiteboard Jamboard and an option to purchase add-ons such as the telephony service Voice. The education edition adds a learning platform Google Classroom and today has the name Workspace for Education.While most of these services are individually available at no co

'Google was officially launched in 1998 by Larry Page and Sergey Brin.'

In [15]:
agent.run('what is first math crisis and why?')



> Entering new  chain...
I need to use a reliable source to answer this question.
Action: Wikipedia
Action Input: "first math crisis"
Observation: Page: New Math
Summary: New Mathematics or New Math was a dramatic but temporary change in the way mathematics was taught in American grade schools, and to a lesser extent in European countries and elsewhere, during the 1950s–1970s. Curriculum topics and teaching practices were changed in the U.S. shortly after the Sputnik crisis.  The goal was to boost students' science education and mathematical skills to compete with Soviet engineers, reputedly highly skilled mathematicians.



Page: Foundations of mathematics
Summary: Foundations of mathematics is the study of the philosophical and logical and/or algorithmic basis of mathematics, or, in a broader sense, the mathematical investigation of what underlies the philosophical theories concerning the nature of mathematics. In this latter sense, the distinc

'The first math crisis refers to the educational shift in mathematics teaching in the United States during the 1950s-1970s, known as New Mathematics, in response to the Sputnik crisis. The foundations of mathematics is a philosophical and logical study of the basis of mathematics, including the search for its fundamental concepts and structures. Mathematics as a whole encompasses the study of numbers, formulas, shapes, and quantities, and there is no general consensus among mathematicians about a common definition for the discipline.'