### **Install Necessary Packages**

In [None]:
# We begin by installing the required packages:
# - `langchain_community`: Provides community-contributed modules for LangChain.
# - `unstructured`: A library for processing unstructured data.
# - `chromadb`: An open-source vector database for efficient storage and retrieval of embeddings.

In [1]:
!pip install langchain_community



In [2]:
!pip install unstructured



In [3]:
!pip install chromadb



### **Import UnstructuredURLLoader**

In [2]:
# The `UnstructuredURLLoader` from `langchain_community.document_loaders` is used to load and process data from URLs.
from langchain_community.document_loaders import UnstructuredURLLoader

### **Define URLs to Load**

In [3]:
urls = [
    "https://www.victoriaonmove.com.au/local-removalists.html",
    "https://victoriaonmove.com.au/index.html",
    "https://victoriaonmove.com.au/contact.html",
]

### **Initialize the Loader**

In [5]:
# We create an instance of `UnstructuredURLLoader` with the specified URLs.
loader = UnstructuredURLLoader(urls=urls)

### **Load Data from URLs**

In [6]:
data = loader.load()

In [7]:
data

[Document(metadata={'source': 'https://www.victoriaonmove.com.au/local-removalists.html'}, page_content='Loading...\n\nLOCAL REMOVALS\n\nYour trusted partner in seamless moving and packing solutions!\n\nGoogle Rating\n\n5 stars, 111 reviews\n\nRequst A call for You:\n\nLocal removal services via "Victoria on move"\n\nVictoria on Move is your trusted local moving company in Melbourne, specializing in seamless relocation services. As experienced furniture movers and relocation experts, we provide top-notch packing and moving services tailored to your needs. Whether you\'re moving across town or relocating interstate, our professional movers ensure a stress-free experience. Count on Victoria on Move for reliable removal services, making us the preferred choice among local movers in Melbourne. Discover why we\'re recognized for our commitment to quality and customer satisfaction.\n\nApartment Moving\n\nEfficient and careful relocation services tailored for apartments of all sizes, ensuring

### **Initialize Text Splitter**

In [8]:
# In recent LangChain releases the splitter lives in the `langchain_text_splitters` package.
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [9]:
# We set up the text splitter with a chunk size of 1000 characters.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)

### **Split Data into Chunks**

In [10]:
# The `split_documents` method divides the loaded data into manageable chunks.
docs = text_splitter.split_documents(data)
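
To make the splitting step less of a black box, here is a minimal pure-Python sketch of the idea behind recursive character splitting: try a coarse separator first, and fall back to finer ones only for pieces that are still too long. The function `recursive_split` and its separator order are hypothetical illustrations, not the library's implementation, and chunk overlap is left out for brevity.

```python
def recursive_split(text, chunk_size=1000, separators=("\n\n", "\n", " ", "")):
    """Simplified, hypothetical sketch of recursive character splitting."""
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, rest = separators[0], separators[1:]
    # The empty separator is the last resort: cut at fixed character offsets.
    if sep == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with the next, finer separator.
            chunks.extend(recursive_split(piece, chunk_size, rest))
    if current:
        chunks.append(current)
    return chunks


# Toy input (not taken from the scraped pages) with a deliberately small chunk size.
sample = "alpha beta\n\ngamma delta epsilon\n\nzeta"
chunks = recursive_split(sample, chunk_size=12)
print(chunks)
```

Paragraph breaks are preferred as cut points, so a chunk only ends mid-paragraph when a single paragraph exceeds `chunk_size`.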

In [11]:
print("Total number of documents: ", len(docs))

Total number of documents:  11


In [12]:
# We import `Chroma` for vector storage and `HuggingFaceEmbeddings` for generating text embeddings.
# Both are imported from `langchain_community`, which was installed above.
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings

### **Initialize Embedding Model**

In [14]:
# We use the `sentence-transformers/all-MiniLM-L6-v2` model from Hugging Face for generating embeddings.
embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")


### **Create Vector Store**

In [16]:
# We create a Chroma vector store from the document chunks using the specified embedding model.
vectorstore = Chroma.from_documents(documents=docs, embedding=embedding_model)

### **Set Up Retriever**

In [17]:
# We configure the retriever to fetch the top 3 most similar documents based on a query.
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 3})
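
As a rough illustration of what "top 3 most similar" means, the sketch below ranks toy vectors by cosine similarity and keeps the best `k`. The helper names `cosine_similarity` and `top_k` and the two-dimensional vectors are assumptions made for the example. The vector store's actual distance metric and index are configurable and may differ from plain cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


# Toy 2-dimensional "embeddings"; real embeddings have far more dimensions.
toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.7, 0.7]]
toy_query = [1.0, 0.0]
nearest = top_k(toy_query, toy_docs, k=3)
print(nearest)
```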

In [18]:
# We use the retriever to find documents related to the query: "What kind of services do they provide?"
retrieved_docs = retriever.invoke("What kind of services do they provide?")

### **Display Number of Retrieved Documents**

In [20]:
# We print the number of documents retrieved for the query.
print("Number of retrieved docs:", len(retrieved_docs))

Number of retrieved docs: 3


In [21]:
# We print the content of the first retrieved document to inspect its relevance.
print(retrieved_docs[0].page_content)

Contact Us

Our Clients Say!

Discover firsthand experiences from our valued clients through their heartfelt testimonials. From seamless moves to exceptional service, our customers share how we've made their relocation journey stress-free and rewarding. Explore their stories and see why they trust us with their moves time and again.

Get In Touch

Wollert Victoria

0404922328

victoriaonmove07@gmail.com

Quick Links

About Us Contact Us Our Services Terms & Condition

Photo Gallery

Check us out on Google!

© Victoria On Move 2024, All Right Reserved. Designed By HTML Codex

Home


In [26]:
!pip install -U langchain-huggingface

In [23]:
# We import `HuggingFacePipeline` from `langchain_huggingface`; the `transformers` components for text generation are imported in the next cell.
from langchain_huggingface import HuggingFacePipeline

In [24]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

### **Initialize Text Generation Model**

In [25]:
# Use a small Hugging Face model for text generation
model_name = "distilgpt2"

In [26]:
tokenizer = AutoTokenizer.from_pretrained(model_name)


In [27]:
model = AutoModelForCausalLM.from_pretrained(model_name)

### **Define Text Generation Pipeline**

In [28]:
# Define the text-generation pipeline; `max_new_tokens=150` caps the length of each generated continuation.
hf_pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=150)

Device set to use cpu


In [29]:
# Wrap the pipeline for LangChain; `max_new_tokens` is already set on the pipeline, so it is not passed again here.
llm = HuggingFacePipeline(pipeline=hf_pipeline)

In [30]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

### **Define System Prompt**

In [31]:
# We set up a system prompt to guide the assistant's responses, emphasizing conciseness and clarity.
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

### **Create Chat Prompt Template**

In [32]:
# We define a chat prompt template that includes both system instructions and user input.
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

### **Create Question-Answering Chain**

In [33]:
# We set up a chain that combines retrieved documents and generates answers using the language model.
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)
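
Conceptually, a "stuff" chain pastes all retrieved chunks into the `{context}` slot of the system prompt and appends the question. The sketch below mimics that with plain string formatting. The function `stuff_prompt`, the blank-line separator between chunks, the `System:`/`Human:` rendering, and the toy strings are assumptions for illustration rather than the library's code.

```python
def stuff_prompt(system_template, context_chunks, question):
    """Fill `{context}` with all chunks joined together, then add the question."""
    context = "\n\n".join(context_chunks)  # assumed separator between chunks
    system_text = system_template.format(context=context)
    return f"System: {system_text}\nHuman: {question}"


# Toy template and chunks, shortened so the result is easy to read.
toy_template = "Answer using only the context below.\n\n{context}"
toy_chunks = ["Apartment moving.", "Office relocation."]
filled = stuff_prompt(toy_template, toy_chunks, "What kind of services do they provide?")
print(filled)
```

Because every retrieved chunk is inserted verbatim, the prompt grows with `k` and with `chunk_size`, which matters for models with a small context window.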

### **Generate Response to Query**

In [34]:
# We invoke the retrieval-augmented generation chain with the input question.
response = rag_chain.invoke({"input": "What kind of services do they provide?"})

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


In [35]:
print(response["answer"])

System: You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, say that you don't know. Use three sentences maximum and keep the answer concise.

Contact Us

Our Clients Say!

Discover firsthand experiences from our valued clients through their heartfelt testimonials. From seamless moves to exceptional service, our customers share how we've made their relocation journey stress-free and rewarding. Explore their stories and see why they trust us with their moves time and again.

Get In Touch

Wollert Victoria

0404922328

victoriaonmove07@gmail.com

Quick Links

About Us Contact Us Our Services Terms & Condition

Photo Gallery

Check us out on Google!

© Victoria On Move 2024, All Right Reserved. Designed By HTML Codex

Home

Loading...

LOCAL REMOVALS

Your trusted partner in seamless moving and packing solutions!

Google Rating

5 stars, 111 reviews

Requst A call for You:

Local removal ser
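
The printed answer above begins with the system prompt and the retrieved context, which suggests the model's output includes an echo of the prompt. One way to inspect only the newly generated text is to strip that prefix after the fact. The helper `strip_prompt_echo` and the toy strings below are hypothetical. The `transformers` text-generation pipeline also documents a `return_full_text` option, but whether it changes the result through the LangChain wrapper is not verified here.

```python
def strip_prompt_echo(generated_text, prompt_text):
    """Drop the prompt from the front of the output if the model echoed it."""
    if generated_text.startswith(prompt_text):
        return generated_text[len(prompt_text):].lstrip()
    return generated_text  # nothing to strip


# Toy strings standing in for the real prompt and model output.
toy_prompt = "System: Answer briefly.\nHuman: What services are offered?"
toy_output = toy_prompt + " Local removals and packing."
answer_only = strip_prompt_echo(toy_output, toy_prompt)
print(answer_only)
```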