[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Sciform/fhnw-mini-rag-system/blob/main/rag.ipynb)


In [1]:
#%pip install --upgrade pip setuptools wheel
%pip install langchain langchain-huggingface langchain-community faiss-cpu sentence-transformers transformers huggingface_hub


In [3]:
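# Loaders and vector stores live in langchain-community; the splitters have their own package
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.vectorstores import FAISS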
from langchain_huggingface import HuggingFaceEmbeddings

import os

# Set your Hugging Face Hub API token (free signup)
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "<Hugging_Face_Access>"
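
A token written directly into a notebook cell is easy to leak when the notebook is shared. The following is a minimal sketch of one alternative, assuming the token may already be present in the environment; the helper name `ensure_hf_token` is hypothetical and not part of any library.

```python
import os
from getpass import getpass


def ensure_hf_token(var_name: str = "HUGGINGFACEHUB_API_TOKEN") -> str:
    """Return the token from the environment, prompting for it only if it is missing."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        # getpass hides the typed value so the token does not end up in the cell output
        token = getpass(f"Enter {var_name}: ").strip()
        os.environ[var_name] = token
    return token
```

Calling `ensure_hf_token()` once before creating the LLM endpoint would leave the environment variable set for the rest of the session.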

In [4]:
# 1. Load documents

# Download the sample text files
!wget https://raw.githubusercontent.com/sciform/fhnw-mini-rag-system/main/docs/sample1.txt -P docs/
!wget https://raw.githubusercontent.com/sciform/fhnw-mini-rag-system/main/docs/sample2.txt -P docs/
!wget https://raw.githubusercontent.com/sciform/fhnw-mini-rag-system/main/docs/sample3.txt -P docs/


loader1 = TextLoader("docs/sample1.txt")
loader2 = TextLoader("docs/sample2.txt")
loader3 = TextLoader("docs/sample3.txt")
documents = loader1.load() + loader2.load() + loader3.load()

# 2. Split into chunks
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=10)
docs = text_splitter.split_documents(documents)

--2025-04-28 12:23:48--  https://raw.githubusercontent.com/sciform/fhnw-mini-rag-system/main/docs/sample1.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 363 [text/plain]
Saving to: ‘docs/sample1.txt’


2025-04-28 12:23:48 (5.22 MB/s) - ‘docs/sample1.txt’ saved [363/363]

--2025-04-28 12:23:48--  https://raw.githubusercontent.com/sciform/fhnw-mini-rag-system/main/docs/sample2.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.111.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 199 [text/plain]
Saving to: ‘docs/sample2.txt’


2025-04-28 12:23:48 (2.67 MB/s) - ‘docs/sample2.txt’
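
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a plain-Python sliding-window sketch. It is an illustration of the two parameters only, not the algorithm `CharacterTextSplitter` uses (that class first splits on a separator and then merges the pieces); the function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 10) -> list[str]:
    """Cut text into windows of at most chunk_size characters; neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_chunks(sample, chunk_size=10, chunk_overlap=2):
    print(chunk)
```

A text shorter than `chunk_size` comes back as a single chunk, which is the situation one would expect for very small input files.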

In [5]:
# 3. Embed and store in FAISS
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vector_store = FAISS.from_documents(docs, embeddings)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`



In [6]:
# Print the number of vectors in the FAISS index (one per chunk, not one per source file)
print(f"Number of documents in vector store: {vector_store.index.ntotal}")

Number of documents in vector store: 3


In [7]:
# Limit retrieval to the single best match; k has to be passed through search_kwargs
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={"k": 1})
query = "What is a blue whale?"
retrieved_docs = retriever.invoke(query)

# Print the retrieved documents
print("Documents retrieved:")
for doc in retrieved_docs:
    print(f"Document: {doc.page_content}")

Documents retrieved:
Document: The blue whale is the largest animal known to have ever existed. Blue whales are marine mammals and can reach lengths of up to 30 meters. They primarily feed on tiny shrimp-like animals called krill.
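
Conceptually, the similarity search compares the embedded query with every stored vector and keeps the closest ones. The sketch below shows that idea with Euclidean distance, which is what LangChain's FAISS wrapper uses unless it is configured otherwise. The three-dimensional vectors are made up for illustration, and `top_k` is a hypothetical helper, not a FAISS or LangChain function.

```python
import math


def top_k(query_vec, doc_vecs, k=1):
    """Return the indices of the k stored vectors closest to query_vec (Euclidean distance)."""
    distances = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    return [idx for _, idx in sorted(distances)[:k]]


# Made-up 3-dimensional "embeddings"; real all-MiniLM-L6-v2 vectors have 384 dimensions
texts = ["honey bee", "blue whale", "Mount Everest"]
doc_vecs = [(0.9, 0.1, 0.0), (0.1, 0.9, 0.1), (0.0, 0.1, 0.9)]
query_vec = (0.2, 0.8, 0.0)  # stands in for the embedded whale question

best = top_k(query_vec, doc_vecs, k=1)
print([texts[i] for i in best])
```

A brute-force scan like this is fine for three chunks; an index library such as FAISS becomes useful once the number of vectors grows.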


In [8]:
# 4. Use a free LLM (small one)
from langchain_huggingface import HuggingFaceEndpoint

model_falcon_base = "tiiuae/falcon-rw-1b-instruct"
model_falcon_instruct = "ericzzz/falcon-rw-1b-instruct-openorca"
model_mistral = "mistralai/Mistral-7B-v0.1"

# Wrap it in LangChain's new HuggingFaceEndpoint class
llm = HuggingFaceEndpoint(
    repo_id=model_falcon_instruct,
    task="text-generation",
    max_new_tokens=75,
    temperature=0.1
)

In [9]:
# 5. Create the retrieval chain (a retriever feeding a "stuff documents" chain)
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    """Answer using only this context:
    {context}

    Question: {input}"""
)

document_chain = create_stuff_documents_chain(
    llm,
    prompt,
    document_separator="\n\n")

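# A bare k=1 keyword is not applied by as_retriever, so the default k is used here and
# up to four chunks are retrieved. Pass search_kwargs={"k": 1} to keep only the top hit.
retriever = vector_store.as_retriever(search_type="similarity")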

qa_chain = create_retrieval_chain(
    retriever,
    document_chain)
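
The "stuff" strategy is simple enough to sketch without LangChain: the texts of all retrieved chunks are joined with the separator and placed into the `{context}` slot, and the question goes into `{input}`. The code below is a plain-Python approximation under that assumption, not the library's implementation; `stuff_prompt` is a hypothetical name and the template is a whitespace-simplified copy of the one above.

```python
PROMPT_TEMPLATE = "Answer using only this context:\n{context}\n\nQuestion: {input}"


def stuff_prompt(page_contents, question, separator="\n\n"):
    """Join all retrieved chunk texts and insert them, with the question, into the template."""
    context = separator.join(page_contents)
    return PROMPT_TEMPLATE.format(context=context, input=question)


chunks = [
    "Mount Everest is Earth's highest mountain above sea level.",
    "The blue whale is the largest animal known to have ever existed.",
]
print(stuff_prompt(chunks, "Who climbs Mount Everest?"))
```

Because every retrieved chunk is pasted into one prompt, the number of chunks the retriever returns directly controls how much unrelated text the model sees.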


In [37]:
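# 6. Ask something
query = "Who climbs Mount Everest?"
# The chain fills {context} with the retrieved documents on its own. A second positional
# argument to invoke() is read as a run config, not as prompt input, so these
# instructions only take effect once they are written into the prompt template.
instructions = "Use only the most relevant document. If unsure, say 'I don't know'. Answer as briefly as possible. Do not confuse animals."
result = qa_chain.invoke({"input": query})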
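
The prompt template used by the chain has exactly two slots: `{context}`, which the chain fills from the retriever, and `{input}`. Extra guidance for the model therefore has to be part of the template text itself. The sketch below checks the slots of a template and prepends guidance to it; it uses only the standard library, and `placeholders` is a hypothetical helper.

```python
from string import Formatter


def placeholders(template: str) -> list[str]:
    """List the named slots that appear in a format-style template."""
    return [field for _, field, _, _ in Formatter().parse(template) if field]


base_template = "Answer using only this context:\n{context}\n\nQuestion: {input}"
guidance = "If unsure, say 'I don't know'. Answer as briefly as possible."

# Guidance becomes fixed template text; the two slots stay unchanged
template_with_guidance = guidance + "\n\n" + base_template
print(placeholders(template_with_guidance))
```

The combined string could then be handed to `ChatPromptTemplate.from_template` in place of the original template.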




In [38]:
print("\nQuestion:", query)
print(result["answer"])


Question: Who climbs Mount Everest?


Answer: Many climbers attempt to reach the top of Mount Everest each year despite the extreme conditions.


In [39]:
result

{'input': 'Who climbs Mount Everest?',
 'context': [Document(id='aa26c7c2-7a97-4b85-8563-668311cf0265', metadata={'source': 'docs/sample3.txt'}, page_content="Mount Everest is Earth's highest mountain above sea level, located in the Himalayas. Its summit is 8,848 meters above sea level. Many climbers attempt to reach the top each year, despite the extreme conditions."),
  Document(id='cc77e5da-c143-4d3b-9e05-d9b550a0cdd9', metadata={'source': 'docs/sample2.txt'}, page_content='The blue whale is the largest animal known to have ever existed. Blue whales are marine mammals and can reach lengths of up to 30 meters. They primarily feed on tiny shrimp-like animals called krill.'),
  Document(id='3469015b-5858-419a-8a06-1e838754af0d', metadata={'source': 'docs/sample1.txt'}, page_content='The honey bee (Apis mellifera) is one of the most important and fascinating creatures on the planet, playing a critical role in the health of ecosystems and the production of food. Known for its distinctive
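
The result dictionary carries the retrieved documents under `context`, so the files behind an answer can be listed from their `metadata`. The sketch below assumes only what the output above shows (a `source` entry per document); `Doc` is a hypothetical stand-in for LangChain's `Document`, the page contents are abbreviated, and `list_sources` is not a library function.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document, holding only the fields used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def list_sources(result: dict) -> list[str]:
    """Collect the distinct 'source' entries of the retrieved documents, in retrieval order."""
    seen = []
    for doc in result.get("context", []):
        source = doc.metadata.get("source", "unknown")
        if source not in seen:
            seen.append(source)
    return seen


result_stub = {
    "input": "Who climbs Mount Everest?",
    "context": [
        Doc("Mount Everest is Earth's highest mountain ...", {"source": "docs/sample3.txt"}),
        Doc("The blue whale is the largest animal ...", {"source": "docs/sample2.txt"}),
    ],
    "answer": "...",
}
print(list_sources(result_stub))
```

Since the helper only reads `metadata`, the same function should also work on the real `result` returned by the chain.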

In [40]:
from google.colab import output
output.clear()