Complete the setup steps in the README before running this notebook!

Set up the LLM and test it out.

In [2]:
import ollama

model = "llama3.2:1b"
message_content = 'why do leaves change in the fall?'
message = {'role': 'user', 'content': message_content}

# Call Ollama directly. This assumes Ollama is installed and the target model has already been pulled.
ollama.chat(model=model, messages=[message])

{'model': 'llama3.2:1b',
 'created_at': '2024-10-14T15:00:54.045715Z',
 'message': {'role': 'assistant',
  'content': "Leaves changing colors in the fall is a natural process that's driven by the changing seasons and environmental factors. Here's why it happens:\n\n1. **Daylight shortens**: As the days get shorter, trees prepare for winter by slowing down their food-making process. This means they produce less chlorophyll, the green pigment that helps them absorb sunlight.\n2. **Chlorophyll breaks down**: As daylight hours shorten, the amount of chlorophyll in leaves starts to break down. This allows other pigments in the leaf to become visible, which is why we see a range of colors during fall.\n3. **Carotenoids and anthocyanins shine through**: The breakdown of chlorophyll exposes the underlying carotenoids (yellow, orange, and brown pigments) and anthocyanins (red, purple, and blue pigments). These pigments are always present in leaves, but become more visible during fall.\n4. **Tem

In [3]:
from langchain_ollama.llms import OllamaLLM
llm = OllamaLLM(model=model)

In [4]:
from langchain_core.prompts import PromptTemplate

RAG_PROMPT_TEMPLATE = """\
<|start_header_id|>system<|end_header_id|>
You are a helpful potty training assistant. You answer user questions based on context. If you can't answer the question with the context, say you don't know.
Context:
{context}
<|eot_id|>

<|start_header_id|>user<|end_header_id|>
User Question:
{query}
<|eot_id|>

<|start_header_id|>assistant<|end_header_id|>
"""

rag_prompt = PromptTemplate.from_template(RAG_PROMPT_TEMPLATE)

In [5]:
rag_chain = rag_prompt | llm

In [6]:
rag_chain.invoke({"query" : "How old is Carl?", "context" : "Carl is a sweet dude, he's 40."})

'Carl is 40 years old.'
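
To see exactly what text the model receives for a call like the one above, it can help to render the prompt without the LLM. The sketch below uses plain `str.format` on a shortened stand-in template (`TEMPLATE` and `render_prompt` are hypothetical names, not part of the notebook); this should mirror the f-string-style substitution that `PromptTemplate.from_template` uses by default.

```python
# Shortened stand-in for RAG_PROMPT_TEMPLATE (hypothetical, for illustration only).
TEMPLATE = """\
<|start_header_id|>system<|end_header_id|>
Context:
{context}
<|eot_id|>

<|start_header_id|>user<|end_header_id|>
User Question:
{query}
<|eot_id|>

<|start_header_id|>assistant<|end_header_id|>
"""


def render_prompt(template: str, **variables: str) -> str:
    """Fill the {placeholders} in the template with the given values."""
    return template.format(**variables)


rendered = render_prompt(
    TEMPLATE,
    query="How old is Carl?",
    context="Carl is a sweet dude, he's 40.",
)
print(rendered)
```

In the notebook itself, `rag_prompt.format(query=..., context=...)` should give the same kind of view for the real template.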

Set up the embeddings, pull down some test data, and test out the retriever.

In [9]:
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(
    model="mxbai-embed-large",
)

In [10]:
from langchain_community.document_loaders import RecursiveUrlLoader

# This example uses `beautifulsoup4` and `lxml`
import re
from bs4 import BeautifulSoup

def bs4_extractor(html: str) -> str:
    soup = BeautifulSoup(html, "lxml")
    
    # Remove unwanted tags
    for tag in soup(['nav', 'footer', 'header', 'aside', 'script', 'style']):
        tag.decompose()
    
    # Extract text
    text = soup.get_text(separator=' ', strip=True)
    
    # Clean up whitespace
    clean_text = re.sub(r'\s+', ' ', text).strip()
    
    return clean_text

loader = RecursiveUrlLoader("https://pottygenius.com/blogs/blog",
                            extractor=bs4_extractor)
docs = loader.load()


In [11]:
# A bit of cleanup
unwanted_terms = "arrow-right cart chevron-down chevron-left chevron-right chevron-up close menu minus play plus search share user email pinterest facebook instagram snapchat tumblr twitter vimeo youtube subscribe dogecoin dwolla forbrugsforeningen litecoin amazon_payments american_express bitcoin cirrus discover fancy interac jcb master paypal stripe visa diners_club dankort maestro trash".split()
unwanted_terms.extend(["tell your story", "free undies", "shop free undies!"])
for doc in docs:
    content = doc.page_content.lower()
    for term in unwanted_terms:
        content = content.replace(term,"")
    doc.page_content=content
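
One caveat with the cleanup above: `str.replace` matches substrings, so a term like "play" is also removed from inside "playing". And because "free undies" is removed before "shop free undies!" is tried, the longer phrase never matches, which is why "shop !" still shows up in the output. A minimal sketch of a stricter variant, assuming whole-word matching is what is wanted (`strip_terms` is a hypothetical helper, not part of the notebook):

```python
import re


def strip_terms(text: str, terms: list[str]) -> str:
    """Remove whole-word occurrences of each term, trying longer terms first."""
    ordered = sorted(terms, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) for term in ordered)
    # (?<!\w) and (?!\w) keep a term from matching inside a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")
    return pattern.sub("", text)


# Made-up sample text and a short subset of the term list.
sample = "let them play with the menu while playing. shop free undies! visa"
terms = ["play", "menu", "visa", "free undies", "shop free undies!"]
cleaned = strip_terms(sample, terms)
print(cleaned)
```

The removed terms leave extra spaces behind, just as in the original loop; collapsing them with `re.sub(r'\s+', ' ', ...)` as in `bs4_extractor` would tidy that up.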

In [12]:
print(len(docs))
print(docs[0])


30
page_content='potty genius blog                                             potty genius blog  games  shop ! potty genius blog — potty training boys — potty training girls — potty training methods  —  stories games  shop get ready to train! get ready to train it all starts with changing the mindset. we can’t just tell children it's time to potty... by marshall mizrahi potty genius blog pull ups®? potty genius blog recognize your child's toilet training readiness potty genius blog eva shockey on potty training potty genius blog shannen michaela on elimination communication potty genius blog potty training a child with down syndrome potty genius blog potty training boys and girls potty training is challenging regardless of your toddler’s gender. that said, potty training boys is a bit different than potty training girls. while it is obvious that males and females use the bathroom differently, there are some other distinct potty training differences parents may run into when potty trai

In [13]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=2000,       # Adjust based on your needs
    chunk_overlap=200,     # Overlap to maintain context
)
split_docs = text_splitter.split_documents(docs)

print(f"len(docs): {len(docs)}, len(split_docs):{len(split_docs)}")

len(docs): 30, len(split_docs):109
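
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of fixed-width chunking with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter first tries to break on separators such as paragraphs, lines, and spaces); `chunk_with_overlap` is a hypothetical helper that only illustrates how each chunk repeats the tail of the previous one.

```python
def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Slide a window of chunk_size characters, stepping by chunk_size - chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Toy input: 25 digits, so the overlap is easy to see by eye.
text = "".join(str(i % 10) for i in range(25))
chunks = chunk_with_overlap(text, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk after the first starts with the last three characters of the one before it, which is the same role the 200-character overlap plays in the cell above.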


In [14]:
for i in range(4):
    print(split_docs[i])

page_content='potty genius blog                                             potty genius blog  games  shop ! potty genius blog — potty training boys — potty training girls — potty training methods  —  stories games  shop get ready to train! get ready to train it all starts with changing the mindset. we can’t just tell children it's time to potty... by marshall mizrahi potty genius blog pull ups®? potty genius blog recognize your child's toilet training readiness potty genius blog eva shockey on potty training potty genius blog shannen michaela on elimination communication potty genius blog potty training a child with down syndrome potty genius blog potty training boys and girls potty training is challenging regardless of your toddler’s gender. that said, potty training boys is a bit different than potty training girls. while it is obvious that males and females use the bathroom differently, there are some other distinct potty training differences parents may run into when potty trainin

In [15]:
# create the vector store
from langchain_qdrant import QdrantVectorStore
url="http://localhost:6333"

# Index the chunks produced by the splitter rather than the whole pages
qdrant = QdrantVectorStore.from_documents(
    split_docs,
    embeddings,
    url=url,
    prefer_grpc=True,
    collection_name="PottyTraining",
)

In [17]:
# make sure we can load it
qdrant_vector_store = QdrantVectorStore.from_existing_collection(
    embedding=embeddings,
    collection_name="PottyTraining",
    url=url
)

In [19]:
# set up retriever and see what we get with a simple query
retriever = qdrant_vector_store.as_retriever(
    search_type="mmr",  # Options: 'similarity', 'mmr', etc.
    search_kwargs={"k": 5}     # Number of documents to retrieve
)
retriever.invoke("How is potty training boys different from potty training girls")

[Document(metadata={'title': '\n  Potty Training Boys and Girls – Potty Genius\n  ', 'language': None, 'content_type': 'text/html; charset=utf-8', 'source': 'https://pottygenius.com/blogs/blog/potty-training-differences-in-boys-and-girls', '_id': '88a3095a-aaa3-4c82-a333-7883f091f1c2', '_collection_name': 'PottyTraining'}, page_content='potty training boys and girls – potty genius                                             potty genius blog  games  shop ! potty genius blog — potty training boys — potty training girls — potty training methods  —  stories games  shop potty genius blog potty training boys and girls potty training is challenging regardless of your toddler’s gender. that said, potty training boys is a bit different than potty training girls. while it is obvious that males and females use the bathroom differently, there are some other distinct potty training differences parents may run into when potty training boys versus girls. by brittany tacket, ma brittany tackett is a 
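
The retriever above uses `search_type="mmr"` (maximal marginal relevance), which trades off similarity to the query against redundancy among the results already picked. The following is a minimal pure-Python sketch of that idea on made-up 2-D vectors; it is not the vector store's actual implementation, and `mmr`, `cosine`, and the toy data are hypothetical. The parameter name `lambda_mult` is borrowed from the one LangChain retrievers usually accept in `search_kwargs`.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k, lambda_mult=0.5):
    """Greedily pick k indices, balancing relevance against redundancy."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest already-selected document (0 if none yet).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: vectors 0 and 1 are near-duplicates, vector 2 points elsewhere.
query_vec = [1.0, 0.0]
toy_vectors = [[0.9, 0.1], [0.89, 0.11], [0.6, -0.8]]

diverse = mmr(query_vec, toy_vectors, k=2, lambda_mult=0.5)
similar_only = mmr(query_vec, toy_vectors, k=2, lambda_mult=1.0)
print(diverse, similar_only)
```

With `lambda_mult=1.0` the selection reduces to plain similarity ranking and returns both near-duplicates; with `0.5` the second pick is the less similar but more distinct vector.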

Now let's put it all together!

In [20]:
from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough

rag_chain = (
    {"context": itemgetter("query") | retriever, "query": itemgetter("query")} 
    | RunnablePassthrough.assign(context=itemgetter("context"))
    | {"response": rag_prompt | llm}
)
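
In the chain above, `context` is the raw list of `Document` objects returned by the retriever, so the prompt ends up containing their string form, metadata included. A common variation is to join only the page text before it reaches the prompt. The sketch below shows that step in isolation; `Doc` is a hypothetical stand-in for LangChain's `Document`, and `format_docs` is an assumed helper name, not something defined in the notebook.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs: list[Doc], separator: str = "\n\n") -> str:
    """Join the text of the retrieved documents, dropping their metadata."""
    return separator.join(doc.page_content for doc in docs)


# Made-up retrieval results, for illustration only.
retrieved = [
    Doc("first retrieved chunk", {"source": "https://example.com/a"}),
    Doc("second retrieved chunk", {"source": "https://example.com/b"}),
]
context = format_docs(retrieved)
print(context)
```

If this fits your needs, the helper could probably be wired into the chain as `itemgetter("query") | retriever | format_docs`, since LCEL generally coerces plain functions into runnables when they are piped.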

In [22]:
# I used jelly beans as rewards to help potty train my child. Was that bad?
from pprint import pprint
answer = rag_chain.invoke(input={'query':"Should I use jelly beans to help potty train my child?"})
pprint(answer)

# Notice that the LLM thinks jelly beans are "non-food items." 1B-parameter models aren't that bright.

{'response': 'There is no conclusive evidence to suggest that using jelly '
             'beans can be an effective or helpful tool for potty training. In '
             'fact, introducing non-food items like jelly beans during potty '
             'training can potentially create more problems than solutions.\n'
             '\n'
             'Some potential issues with using jelly beans as a potty training '
             'aid include:\n'
             '\n'
             '1. Allergies: Jelly beans are made from a mixture of sugar, corn '
             'syrup, and food coloring. Some children may be allergic to these '
             'ingredients, which could lead to skin irritation or other '
             'adverse reactions.\n'
             '2. Nutritional value: Jelly beans are high in added sugars and '
             'low in nutritional value, which may not provide your child with '
             'the energy they need to engage in physical activity or maintain '
             'focus during 