### Section 1

In [1]:
from langchain_community.llms import Ollama

In [2]:
model = Ollama(model='phi3:medium')
print(model.invoke('tell me a joke'))

 Why don't scientists trust atoms?

Because they make up everything! 

Remember, humor is subjective and what one person finds funny might not resonate with another. Keep smiling!


In [3]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from operator import itemgetter

from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import DocArrayInMemorySearch

### Section 2

In [4]:
input_docs = [
    "Comcast Corporation is a global media and technology company with three primary businesses: Comcast, NBCUniversal, and Sky",
    "Comcast is one of the United States’ largest video, high-speed Internet, and phone providers",
    "Comcast's culture is built on integrity and respect and is guided by four values: AN ENTREPRENEURIAL SPIRIT, DOING THE RIGHT THING AND ACTING WITH INTEGRITY, RESPECT FOR EACH OTHER, GIVING BACK",
    "Comcast uses a number of different employee classifications. Full-Time Employees are employees regularly scheduled to work at least 30 hours per week. Part-Time Employees are employees regularly scheduled to work at least 100 hours per week",
    "Comcast employees are generally paid on a bi-monthly basis, every other month",
    "Vacation days are based on employee's completed years of service: 36 weeks of vacation for employees who completed 1 year of service, 52 weeks of vacation for employees who completed 2 years of service",
    "World war 3 happened during April 2024. World war 3 was fought between Sanganal dynasty and Shcherban dynasty to get control over rare earth minerals. There is no clear winner in the world war 3. After the world war 3, Sanganal dynasty got control over spice that enables interstellar travel, and Shcherban dynasty got control over the rings of power.",
    "Business customers of Comcast can call 866-935-6701 for any questions or queries",
    "Residential customers can contact Comcast via various channels listed at https://www.xfinity.com/support/. However, they cannot call 866-935-6701",
    "With NOW Internet, customers can choose between 100 Mbps for $30 a month, or 200 Mbps for $45 a month. Unlimited data and an Xfinity gateway included",
    "NOW TV is a $20 streaming offering that includes 40+ live channels, more than two dozen integrated FAST channels, and Peacock Premium. Easily accessible via the Xfinity Stream app",
    "NOW WiFi Pass gives unlimited access to 23+ million fast and reliable Xfinity WiFi hotspots for less than a dollar a day at $20 for 30 days. No cancelation fees, no equipment, and unlimited data",
    "For as low as $55 a month, bundle your NOW Mobile service with either Xfinity or NOW internet and you will be adding one of the most affordable unlimited plans on the market",
    "NOW Mobile uses WiFi to the customer's advantage by tapping into more than 23 million hotspots across the country",
    "Comcast Corporation is formerly known as American Cable Systems and Comcast Holdings",
    "Comcast corporation is the second-largest broadcasting and cable television company in the world by revenue (behind AT&T)"
]

In [5]:
import chromadb
import ollama

client = chromadb.Client()

In [6]:

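# Create an in-memory collection and store one embedding per document
collection = client.create_collection(name="docs")

for i, d in enumerate(input_docs):
    # Embed each document with the nomic-embed-text model served by Ollama
    response = ollama.embeddings(model="nomic-embed-text", prompt=d)
    embedding = response["embedding"]
    collection.add(
        ids=[str(i)],  # ids are strings; the list index is used as the id
        embeddings=[embedding],
        documents=[d]
    )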

In [16]:
# an example prompt
example_prompt = "What is Comcast Corporation formerly known as?"

# generate an embedding for the prompt and retrieve the two most relevant docs
example_response = ollama.embeddings(
  prompt=example_prompt,
  model="nomic-embed-text"
)
example_results = collection.query(
  query_embeddings=[example_response["embedding"]],
  n_results=2
)

example_results

{'ids': [['13', '14']],
 'distances': [[315.4320068359375, 360.5158386230469]],
 'metadatas': [[None, None]],
 'embeddings': None,
 'documents': [['Comcast Corporation is formerly known as American Cable Systems and Comcast Holdings',
   'Comcast corporation is the second-largest broadcasting and cable television company in the world by revenue (behind AT&T)']],
 'uris': None,
 'data': None}

In [17]:
# an example prompt
example_prompt = "What is NOW Mobile?"

# generate an embedding for the prompt and retrieve the two most relevant docs
example_response = ollama.embeddings(
  prompt=example_prompt,
  model="nomic-embed-text"
)
example_results = collection.query(
  query_embeddings=[example_response["embedding"]],
  n_results=2
)

example_results

{'ids': [['12', '11']],
 'distances': [[405.1858825683594, 434.89410400390625]],
 'metadatas': [[None, None]],
 'embeddings': None,
 'documents': [["NOW Mobile uses WiFi to the customer's advantage by tapping into more than 23 million hotspots across the country",
   'For as low as $55 a month, bundle your NOW Mobile service with either Xfinity or NOW internet and you will be adding one of the most affordable unlimited plans on the market']],
 'uris': None,
 'data': None}
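
The `distances` in the two results above are in the hundreds, not between 0 and 2. That is what one would expect if the collection uses squared L2 distance on unnormalized embeddings. This is an assumption: the collection was created without naming a metric, and squared L2 is, to my knowledge, Chroma's default. The sketch below uses toy two-dimensional vectors (hypothetical, not real embeddings) to show how squared L2 and cosine distance can disagree. In both metrics a smaller value means a closer match.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean distance; grows with vector magnitude."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; depends only on the angle between vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy vectors, chosen only to make the arithmetic easy to follow
query = [3.0, 4.0]
doc_same_direction = [30.0, 40.0]  # same direction as the query, 10x longer
doc_nearby = [4.0, 3.0]            # slightly different direction, same length

for name, doc in [("same direction", doc_same_direction), ("nearby", doc_nearby)]:
    print(name, squared_l2(query, doc), round(cosine_distance(query, doc), 4))
```

Under squared L2 the `doc_nearby` vector wins, while under cosine distance `doc_same_direction` wins. So rankings from this collection and from a cosine-based store need not agree unless the embeddings are normalized.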

### Section 3

In [28]:
embeddings = OllamaEmbeddings(model='nomic-embed-text')

In [29]:
vectorstore = DocArrayInMemorySearch.from_texts(
    input_docs,
    embedding=embeddings
)

vectorstore.similarity_search_with_score(query = "What is NOW Mobile?", k = 2)

[(Document(page_content='For as low as $55 a month, bundle your NOW Mobile service with either Xfinity or NOW internet and you will be adding one of the most affordable unlimited plans on the market'),
  0.5582680313626371),
 (Document(page_content="NOW Mobile uses WiFi to the customer's advantage by tapping into more than 23 million hotspots across the country"),
  0.5354663574702098)]

In [30]:
vectorstore.similarity_search(query = "What is NOW Mobile?", k = 2)

[Document(page_content='For as low as $55 a month, bundle your NOW Mobile service with either Xfinity or NOW internet and you will be adding one of the most affordable unlimited plans on the market'),
 Document(page_content="NOW Mobile uses WiFi to the customer's advantage by tapping into more than 23 million hotspots across the country")]

In [31]:
retriever = vectorstore.as_retriever()
retriever.invoke('What is NOW Mobile?')[:2]

[Document(page_content='For as low as $55 a month, bundle your NOW Mobile service with either Xfinity or NOW internet and you will be adding one of the most affordable unlimited plans on the market'),
 Document(page_content="NOW Mobile uses WiFi to the customer's advantage by tapping into more than 23 million hotspots across the country")]

In [11]:
model = Ollama(model='phi3:medium')
retriever = vectorstore.as_retriever()
parser = StrOutputParser()

template = """
Using this context: {context}. Respond to this question: {question}.

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)

chain = (
    {
        'context': itemgetter('question') | retriever,
        'question': itemgetter('question')
    }
    | prompt
    | model
    | parser
)

chain.invoke({'question':'What is NOW Mobile?'})

' NOW Mobile is a mobile service offering by Comcast that allows users to bundle their mobile plans with either Xfinity or NOW internet. The plan starts at as low as $55 per month for one of the most affordable unlimited plans on the market. NOW Mobile utilizes WiFi and taps into over 23 million hotspots across the country, offering a reliable service to its customers. Comcast Corporation, formerly known as American Cable Systems and Comcast Holdings, is the parent company behind this service and is one of the largest video, high-speed Internet, and phone providers in the United States.'
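
The answer above pulls in facts unrelated to the question (former company names, provider ranking). A likely cause is that the retriever returns several documents and the template inserts the whole list of `Document` objects as one string. One common remedy is to join only the text of each retrieved document before it reaches the prompt. The sketch below is a minimal illustration: `Doc` is a hypothetical stand-in for the retrieved document type, and `format_docs` is a helper name chosen for this example, not something defined in the notebook.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document; only page_content is used."""
    page_content: str


def format_docs(docs):
    """Join the text of the retrieved documents into one plain-text context block."""
    return "\n\n".join(doc.page_content for doc in docs)


retrieved = [
    Doc("NOW Mobile uses WiFi to the customer's advantage by tapping into "
        "more than 23 million hotspots across the country"),
    Doc("Comcast Corporation is formerly known as American Cable Systems "
        "and Comcast Holdings"),
]

context = format_docs(retrieved)
print(context)
```

In the chain, such a helper would typically be piped after the retriever, for example `itemgetter('question') | retriever | format_docs`, on the assumption that the installed LangChain version coerces plain functions into runnables when they are piped.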

In [12]:
model = Ollama(model='phi3:medium')
retriever = vectorstore.as_retriever(search_kwargs={'k': 1})
parser = StrOutputParser()

template = """
Using this context: {context}. Respond to this question: {question}.

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)

chain = (
    {
        'context': itemgetter('question') | retriever,
        'question': itemgetter('question')
    }
    | prompt
    | model
    | parser
)

chain.invoke({'question':'What is NOW Mobile?'})

' NOW Mobile appears to be an affordable cellular service provider offering competitive pricing for their mobile services. When bundled with either Xfinity or NOW internet, you can get access to one of the most cost-effective unlimited plans on the market at just $55 a month. This suggests that NOW Mobile provides wireless connectivity and cellular data options in conjunction with home broadband services from their partners.'

In [13]:
model = Ollama(model='phi3:medium')
retriever = vectorstore.as_retriever(search_kwargs={'k': 1})
parser = StrOutputParser()

template = """
Using this data: {data}. Respond to this question: {question}. Don't mention provided data in your response.
"""

prompt = PromptTemplate.from_template(template)

chain = (
    {
        'data': itemgetter('question') | retriever,
        'question': itemgetter('question')
    }
    | prompt
    | model
    | parser
)

chain.invoke({'question':'What is NOW Mobile?'})

' NOW Mobile is a wireless network provider that offers mobile phone services. They provide various plans including affordable options like an unlimited plan, which can be bundled with either Xfinity or NOW internet service for added convenience and savings on monthly costs. With their range of offerings, they aim to cater to diverse customer needs in the telecommunications market.'

In [14]:
chain.invoke({'question':'What is a Dinosaur?'})

' A dinosaur refers to a group of reptiles that first appeared during the Mesozoic Era, approximately 230 million years ago. They are distinguished by their upright stance, with legs positioned directly beneath their bodies, and they dominated terrestrial ecosystems until their extinction about 65 million years ago at the end of the Cretaceaster Period. Dinosaurs were incredibly diverse in size, shape, diet, and habitat, ranging from small bird-like creatures to massive herbivorous giants. They laid eggs and some species are believed to have cared for their young. Interestingly, birds are considered the living descendants of theropod dinosaurs.'

In [15]:
model = Ollama(model='phi3:medium')
retriever = vectorstore.as_retriever(search_kwargs={'k': 1})
parser = StrOutputParser()

template = """
Using only this data: {data}. Respond to this question: {question}. Don't mention provided data in your response. If you cannot find the answer in {data}, say that you don't know.
"""

prompt = PromptTemplate.from_template(template)

chain = (
    {
        'data': itemgetter('question') | retriever,
        'question': itemgetter('question')
    }
    | prompt
    | model
    | parser
)

chain.invoke({'question':'What is a dinosaur?'})

' I do not know what a dinosaur is because the information provided does not cover it. The document only provides data about Comcast Corporation, its former names, and unrelated to any details on dinosaurs. For an accurate answer, please refer to sources specific to paleontology or natural history.'
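
The refusal above relies on the model obeying the prompt, and the reply still describes the retrieved document. An alternative is to check the retrieval score before calling the model and return a fixed fallback when nothing is similar enough. The sketch below assumes scores where higher means more similar, as in the `similarity_search_with_score` output earlier in this section. The threshold, the scores, and the names `select_context`, `answer`, and `fake_generate` are all hypothetical and chosen for illustration.

```python
FALLBACK = "I don't know."


def select_context(scored_docs, min_score=0.5):
    """Keep only texts whose similarity score reaches min_score."""
    return [text for text, score in scored_docs if score >= min_score]


def answer(question, scored_docs, generate, min_score=0.5):
    """Call generate() only when at least one document is similar enough."""
    context = select_context(scored_docs, min_score)
    if not context:
        return FALLBACK
    return generate(question, context)


def fake_generate(question, context):
    """Placeholder for the LLM call, so the sketch runs without a model."""
    return f"Answer based on {len(context)} passage(s)."


# Hypothetical scores, for illustration only
off_topic = [("Comcast Corporation is formerly known as American Cable Systems "
              "and Comcast Holdings", 0.21)]
on_topic = [("NOW Mobile uses WiFi to the customer's advantage by tapping into "
             "more than 23 million hotspots across the country", 0.54)]

print(answer("What is a dinosaur?", off_topic, fake_generate))
print(answer("What is NOW Mobile?", on_topic, fake_generate))
```

A suitable threshold depends on the embedding model and the metric, so it would need to be tuned on real queries, not copied from this sketch.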

In [16]:
model = Ollama(model='phi3:medium')
retriever = vectorstore.as_retriever(search_kwargs={'k': 1})
parser = StrOutputParser()

template = """
Try to use only this data: {data}. Respond to this question: {question}. If you can't find an answer in the provided data, then answer based on your prior knowledge.
"""

prompt = PromptTemplate.from_template(template)

chain = (
    {
        'data': itemgetter('question') | retriever,
        'question': itemgetter('question')
    }
    | prompt
    | model
    | parser
)

chain.invoke({'question':'What is a dinosaur?'})

' Based on the given document, there is no information available about dinosaurs. However, using my general knowledge, I can tell you that dinosaurs were a group of reptiles that dominated the terrestrial ecosystems for over 160 million years during the Mesozoic Era until their sudden extinction at the end of the Cretaceaster Period. They came in various shapes and sizes ranging from small bird-like creatures to enormous long-necked herbivores, like Brachiosaurus and Diplodocus.'

In [20]:
model = Ollama(model='phi3:medium')
retriever = vectorstore.as_retriever(search_kwargs={'k': 1})
parser = StrOutputParser()

template = """
Using only this data: {data}. Respond to this question: {question}. Don't mention {data} in your response.
"""

prompt = PromptTemplate.from_template(template)

chain = (
    {
        'data': itemgetter('question') | retriever,
        'question': itemgetter('question')
    }
    | prompt
    | model
    | parser
)

chain.invoke({'question':'When did the world war 3 happen? Just give the month and year'})

' World War 3 happened during April 2024.'

In [21]:
chain.invoke({'question':'What happened during world war 3?'})

' During World War 3, which took place in April 2024, two major factions - the Sanganal Dynasty and the Scherban Dynasty - engaged in a conflict to gain dominion over valuable rare earth minerals. Despite the intensity of this global confrontation, there was no definitive victor declared at its conclusion. In the aftermath, however, both parties managed to secure unique assets; the Sanganal Dynasty acquired control over a special spice that allows for interstellar travel, while the Scherban Dynasty gained possession of the rings of power.'

In [32]:
model = Ollama(model='phi3:medium')
retriever = vectorstore.as_retriever(search_kwargs={'k': 1})
parser = StrOutputParser()

template = """
Using only this data: {data}. Respond to this question: {question}. Don't mention {data} in your response.
"""

prompt = PromptTemplate.from_template(template)

chain = (
    {
        'data': itemgetter('question') | retriever,
        'question': itemgetter('question')
    }
    | prompt
    | model
    | parser
)

chain.invoke({'question':'Who do you think got the upperhand after the world war 3?'})

" The upperhand after World War 3 isn't definitively clear as both the Sanganal Dynasty and Shcherban Dynasty acquired valuable assets. The Sanganal Dynasty gained control over a spice enabling interstellar travel, which could be crucial for space exploration, expansion or trade across planets, while the Shcherban Dynasty secured control of 'the rings of power', suggesting they may have obtained significant influence or abilities, depending on what 'rings of power' refers to. The determination of who had the upper hand would depend largely on how these assets were used and their long-term impacts."

In [39]:
chain.invoke({'question':'how many vacation days do employees get?'})

' Employees receive varying amounts of vacation days depending on their completed years of service. For those with one year of service, they get 36 weeks of vacation. If an employee has completed two years of service, the allotted vacation time increases to \n52 weeks.'

### Section 4

Test PDF input

In [22]:
loader = PyPDFLoader('comcast_wikipedia.pdf')
pages = loader.load_and_split()
len(pages)

18
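
The count above is the number of chunks returned by `load_and_split`, which need not equal the number of pages in the PDF. As far as I know, when no splitter is passed the loader falls back to a default recursive character splitter. The sketch below is a much simpler fixed-size splitter with overlap, written only to illustrate the idea of chunking. It is not the algorithm LangChain uses, and `split_with_overlap` with its parameters is a hypothetical helper.

```python
def split_with_overlap(text, chunk_size, overlap):
    """Split text into chunks of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the final chunk is not just a repeat of the overlap
    last_start = max(len(text) - overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


chunks = split_with_overlap("abcdefghij", chunk_size=4, overlap=2)
print(chunks)
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.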

In [23]:
vectorstore = DocArrayInMemorySearch.from_documents(pages, embedding=embeddings)
vectorstore.similarity_search_with_score(query = "What is NOW Mobile?", k = 2)

[(Document(page_content="HeadquartersComcast Center, Philadelphia, Pennsylvania, U.S.\nArea served Worldwide\nKey people Brian L. Roberts (chairman & CEO)\nMichael J. Cavanagh (president)\nProducts Cable television\nBroadband\nInternet service\nBroadcasting\nRevenue\n  US$121.6 billion\xa0(2023)\nOperating\nincome\n US$23.31 billion\xa0(2023)\nNet income\n  US$15.11 billion\xa0(2023)\nTotal assets\n  US$264.8 billion\xa0(2023)\nTotal equity\n  US$83.23 billion\xa0(2023)\nOwner Brian L. Roberts (1% equity interest, 33% voting power)\nNumber of\nemployees186,000\xa0(2023)\nDivisions Xﬁnity\nComcast Spectacor\nSubsidiariesMidco (49%)\nNBCUniversal\nSky Group\nASN 7922 (https://bgp.tools/as/7922)\nWebsite corporate.comcast.com (https://corporate.comcast.com/)\nFootnotes\xa0/ references\n[ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 6 ] [ 7 ]Comcast\nCorporation  (simply known as Comcast , and form erly known as American Cable  Systems  and\nComcast Holdings ),[note 1] incorpo rated and headquartered in 

In [25]:
model = Ollama(model='phi3:medium')
retriever = vectorstore.as_retriever(search_kwargs={'k': 1})
parser = StrOutputParser()

template = """
Using only this context: {context}. Respond to this question: {question}.

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)

chain = (
    {
        'context': itemgetter('question') | retriever,
        'question': itemgetter('question')
    }
    | prompt
    | model
    | parser
)

chain.invoke({'question':'What is Xumo?'})

' Xumo is a free ad-supported streaming television (FAST) service that Comcast acquired on February 25, e., in addition to its premium streaming service Peacock and the streaming platforms Pluto TV and Tubi. The service operates as part of the Comcast Cable division and features content from both first-party and third-party sources. It is available globally through distribution partnerships with smart TV manufacturers and can be accessed on YouTube in some regions.'

In [26]:
chain.invoke({"question":"Is Comcast headquartered in Philadelphia?"})

' Yes, Comcast is headquartered in Philadelphia. According to the provided Wikipedia document, it mentions that "Comcast attempts to acquire Time Warner Cable for $45.n billion" and provides other information related to its financial performance. Additionally, there\'s a note stating "[32][33] The Boston Globe found Comcast to be that city\\\'s top place to work in 2009." This suggests that the company is headquartered near or within Boston but does not confirm it as such. However, further research confirms that Comcast\'s headquarters are indeed located in Philadelphia.'

In [27]:
chain.invoke({"question":"Is Comcast involved in any Philanthropy?"})

" Yes, Comcast is involved in philanthropic efforts as demonstrated by their initiative announced in August 2015 to increase Internet access for low-income customers and senior citizens. The company increased Internet speeds from 5 Mbit/s to 10 Mbit/s for these individuals, provided free wireless routers, and piloted a program specifically targeting low-income senior citizens' Internet access. This initiative shows Comcast's commitment to philanthropy by addressing the digital divide and supporting underprivileged communities with better connectivity options."