In [3]:
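PDF_FILE = "paul.pdf"  # PDF export of Paul Graham's essay "Do Things that Don't Scale"
MODEL = "llama3.1"     # Ollama model name, used below for both embeddings and chat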

In [6]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader(PDF_FILE)

pages = loader.load()

print(f"Number of pages: {len(pages)}")
# The list is 0-indexed, so pages[1] is the second page of the PDF
print(f"Length of page: {len(pages[1].page_content)}")
print("Content of a page:", pages[1].page_content)

Number of pages: 9
Length of page: 3272
Content of a page: 10% a week. And while 110 may not seem much better than 100,
if you keep growing at 10% a week you'll be surprised how big
the numbers get. After a year you'll have 14,000 users, and after
2 years you'll have 2 million.
You'll be doing different things when you're acquiring users a
thousand at a time, and growth has to slow down eventually. But
if the market exists you can usually start by recruiting users
manually and then gradually switch to less manual methods. [3]
Airbnb is a classic example of this technique. Marketplaces are so
hard to get rolling that you should expect to take heroic measures
at first. In Airbnb's case, these consisted of going door to door in
New York, recruiting new users and helping existing ones improve
their listings. When I remember the Airbnbs during YC, I picture
them with rolly bags, because when they showed up for tuesday
dinners they'd always just flown back from somewhere.
Fragile
Airbnb now 

In [7]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# chunk_size and chunk_overlap are measured in characters, not tokens
splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=100)

chunks = splitter.split_documents(pages)

print("Number of chunks", len(chunks))
print("Length of chunks", len(chunks[1].page_content))
print("Content of a chunk", chunks[1].page_content)

Number of chunks 23
Length of chunks 1236
Content of a chunk took better advantage of it than Stripe. At YC we use the term
"Collison installation" for the technique they invented. More
diffident founders ask "Will you try our beta?" and if the answer is
yes, they say "Great, we'll send you a link." But the Collison
brothers weren't going to wait. When anyone agreed to try Stripe
they'd say "Right then, give me your laptop" and set them up on
the spot.
There are two reasons founders resist going out and recruiting
users individually. One is a combination of shyness and laziness.
They'd rather sit at home writing code than go out and talk to a
bunch of strangers and probably be rejected by most of them.
But for a startup to succeed, at least one founder (usually the
CEO) will have to spend a lot of time on sales and marketing. [2]
The other reason founders ignore this path is that the absolute
numbers seem so small at first. This can't be how the big, famous
startups got started, they t
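
The two splitter parameters are easier to reason about with a toy version. The sketch below is not how `RecursiveCharacterTextSplitter` works internally: that class splits recursively on separators such as paragraph and line breaks, which is presumably why the chunk printed above is 1236 characters rather than exactly 1500. This is a simplified fixed-width window, and `split_with_overlap` is a hypothetical helper name. It only illustrates that `chunk_size` caps the length of a chunk and `chunk_overlap` is the number of characters neighbouring chunks share.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Simplified fixed-width splitter (an illustration, not LangChain's algorithm)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


# Synthetic text with the same length as the page printed earlier (3272 characters)
sample_text = "".join(chr(97 + i % 26) for i in range(3272))
toy_chunks = split_with_overlap(sample_text, chunk_size=1500, chunk_overlap=100)

print([len(c) for c in toy_chunks])            # [1500, 1500, 472]
print(toy_chunks[0][-100:] == toy_chunks[1][:100])  # True: the shared overlap
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk.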

In [8]:
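from langchain_community.vectorstores import FAISS
# The langchain_community version of OllamaEmbeddings is deprecated;
# langchain_ollama provides the maintained class.
from langchain_ollama import OllamaEmbeddings

# Embed every chunk with the local Ollama model and index the vectors in FAISS
embeddings = OllamaEmbeddings(model=MODEL)
vector_store = FAISS.from_documents(chunks, embeddings)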



In [9]:
# Wrap the vector store as a retriever and fetch the chunks closest to the question
retriever = vector_store.as_retriever()
retriever.invoke("What can you get away with when you only have a small number of users?")

[Document(id='541c13db-7755-471a-96fa-01e3b4b919f0', metadata={'source': 'paul.pdf', 'page': 1, 'page_label': '2'}, page_content='were taking "professional" photos of their first hosts\' apartments.\nThey were just trying to survive. But in retrospect that too was\nthe optimal path to dominating a big market.\nHow do you find users to recruit manually? If you build something\nto solve your own problems, then you only have to find your\npeers, which is usually straightforward. Otherwise you\'ll have to\n8/6/24, 11:04 AM Do Things that Don\'t Scale\nhttps://paulgraham.com/ds.html 2/9'),
 Document(id='a514d039-1495-4db9-adce-18caa0f5cab8', metadata={'source': 'paul.pdf', 'page': 4, 'page_label': '5'}, page_content='You can tweak the design faster when you\'re the factory, and you\nlearn things you\'d never have known otherwise. Eric Migicovsky\nof Pebble said one of the things he learned was "how valuable it\nwas to source good screws." Who knew?\nConsult\n8/6/24, 11:04 AM Do Things that 
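
What the retriever does can be pictured as a nearest-neighbour search over embedding vectors. The sketch below uses made-up two-dimensional vectors in place of real embeddings, and `l2_distance`, `top_k`, and `toy_index` are hypothetical names for illustration. As far as I know, LangChain's FAISS wrapper defaults to Euclidean (L2) distance and `as_retriever()` returns the 4 closest chunks unless `search_kwargs` says otherwise; treat both as assumptions to check against the library version in use.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 4) -> list[str]:
    """Return the texts of the k indexed vectors closest to the query vector."""
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Made-up vectors standing in for real chunk embeddings
toy_index = [
    ("chunk about recruiting users", [0.9, 0.1]),
    ("chunk about hardware", [0.1, 0.9]),
    ("chunk about delighting users", [0.8, 0.3]),
]

nearest = top_k([1.0, 0.0], toy_index, k=2)
print(nearest)  # ['chunk about recruiting users', 'chunk about delighting users']
```

A real index avoids the full sort, but the ranking idea is the same: the question is embedded with the same model as the chunks, and the closest vectors win.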

In [10]:
from langchain_ollama import ChatOllama

model = ChatOllama(model=MODEL, temperature=0)

model.invoke("Who is the Prime Minister of Pakistan?")

AIMessage(content="The current Prime Minister of Pakistan is Shehbaz Sharif. He took office on April 11, 2022, after the resignation of Imran Khan following a no-confidence vote in the National Assembly.\n\nHowever, please note that the situation can change rapidly in politics, and I may not always have the most up-to-date information. If you're looking for the latest news or updates, I recommend checking reputable news sources or official government websites for the most current information.", additional_kwargs={}, response_metadata={'model': 'llama3.1', 'created_at': '2025-02-04T13:24:18.924562486Z', 'done': True, 'done_reason': 'stop', 'total_duration': 33377046365, 'load_duration': 25930949, 'prompt_eval_count': 18, 'prompt_eval_duration': 2379000000, 'eval_count': 96, 'eval_duration': 30969000000, 'message': Message(role='assistant', content='', images=None, tool_calls=None)}, id='run-0b3a4e2d-fe75-4282-8428-ce892f2d6b40-0', usage_metadata={'input_tokens': 18, 'output_tokens': 96,

In [11]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

# Pipe the model's AIMessage into the parser, which returns only its text content
chain = model | parser

print(chain.invoke("Who is the Prime Minister of Pakistan?"))

The current Prime Minister of Pakistan is Shehbaz Sharif. He took office on April 11, 2022, after the resignation of Imran Khan following a no-confidence vote in the National Assembly.

However, please note that the situation can change rapidly in politics, and I may not always have the most up-to-date information. If you're looking for the latest news or updates, I recommend checking reputable news sources or official government websites for the most current information.
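
The `model | parser` syntax is ordinary Python operator overloading. The sketch below is a stand-in under stated assumptions, not LangChain's implementation: `Step`, `fake_model`, and `fake_parser` are hypothetical names, and the fake model returns a plain dict where the real one returns an `AIMessage`. It shows how `|` can build a new step that feeds the left side's output into the right side.

```python
class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # Compose: run this step first, then hand its result to the next step
        return Step(lambda value: other.invoke(self.invoke(value)))


def fake_model(question: str) -> dict:
    """Pretend chat model: returns a message-like dict with content and metadata."""
    return {"content": f"stub answer to: {question}", "response_metadata": {"model": "stub"}}


def fake_parser(message: dict) -> str:
    """Pretend output parser: keeps the text and drops the metadata."""
    return message["content"]


toy_chain = Step(fake_model) | Step(fake_parser)
print(toy_chain.invoke("Who is the Prime Minister of Pakistan?"))
# stub answer to: Who is the Prime Minister of Pakistan?
```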


In [12]:
from langchain_core.prompts import PromptTemplate

template = """
Answer the question based on the context below. If you can't answer the question, reply "I don't know."

Context: {context}

Question: {question}

"""

prompt = PromptTemplate.from_template(template)
prompt.format(context="Here is some context", question="What is your name?")


'\nAnswer the question based on the context below. If you can\'t answer the question, reply "I don\'t know."\n\nContext: Here is some context\n\nQuestion: What is your name?\n\n'

In [14]:
from operator import itemgetter

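chain = (
    {
        # Pull the question out of the input dict and use it to retrieve chunks
        "context": itemgetter("question") | retriever,
        # Pass the question through unchanged
        "question": itemgetter("question"),
    }
    | prompt  # fill {context} and {question} in the template
    | model   # send the formatted prompt to the chat model
    | parser  # reduce the AIMessage to a plain string
)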
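
The dict at the head of the chain fans the same input out to two branches and collects the results under the keys the prompt expects. A minimal sketch of that data flow in plain Python follows; `fan_out`, `fake_retriever`, and `TEMPLATE` are hypothetical stand-ins, not LangChain names, and the template is shortened for readability.

```python
from operator import itemgetter

# Shortened stand-in for the prompt template defined in the notebook
TEMPLATE = "Context: {context}\n\nQuestion: {question}\n"


def fake_retriever(question: str) -> list[str]:
    """Pretend retriever: returns one placeholder chunk for the question."""
    return [f"(chunk retrieved for: {question})"]


def fan_out(branches: dict, value: dict) -> dict:
    """Apply every branch to the same input and collect results by key."""
    return {key: branch(value) for key, branch in branches.items()}


branches = {
    "context": lambda inputs: fake_retriever(itemgetter("question")(inputs)),
    "question": itemgetter("question"),
}

prompt_inputs = fan_out(branches, {"question": "What is your name?"})
prompt_text = TEMPLATE.format(**prompt_inputs)
print(prompt_text)
```

Both branches receive the full input dict, which is why each one starts with `itemgetter("question")`.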

In [15]:
chain.invoke({"question": "What can you get away with when you only have a small number of users?"})

'When you only have a small number of users, you can provide a level of service that no big company can. According to the text, "Tim Cook doesn\'t send you a hand-written note after you buy a laptop. He can\'t. But you can." This implies that with a small user base, you have more flexibility and can offer personalized attention and services that larger companies cannot match.'
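
One detail of the chain above: the retriever returns a list of `Document` objects, and that list is interpolated into `{context}` as is, so the prompt carries ids and metadata along with the text. A common pattern is to join only the `page_content` fields first. The sketch below uses a hypothetical `Doc` dataclass in place of LangChain's `Document`, and `format_docs` is an illustrative helper name.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join the text of each retrieved chunk, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


docs = [Doc("first chunk", {"page": 1}), Doc("second chunk", {"page": 4})]
context = format_docs(docs)
print(context)

# Assumption: in the notebook this could be wired in as
#   "context": itemgetter("question") | retriever | format_docs
# relying on LCEL accepting a plain function after a runnable.
```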