<a href="https://colab.research.google.com/github/ranaalisaeed/college-student-qa-bot/blob/main/acap_ss_rag_step3_inspect_similarity_search.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Install the packages this notebook needs:
# langchain-openai: provides OpenAIEmbeddings
# langchain-chroma: Chroma vector store integration; replaces the Chroma class
#   in langchain-community, which is deprecated since version 0.2.9
# tiktoken: required by OpenAIEmbeddings for tokenization

%pip install langchain-openai langchain-chroma tiktoken --quiet


In [None]:
import os
from pprint import pprint
import textwrap
import json
from google.colab import drive, userdata
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

In [None]:
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

In [None]:
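# The query must be embedded with the same model that produced the stored vectors.
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
drive.mount('/content/drive')
# Reopen the existing Chroma database persisted on Google Drive.
persist_dir = '/content/drive/MyDrive/acap-db'
vectorstore = Chroma(persist_directory=persist_dir, embedding_function=embeddings)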

Mounted at /content/drive


In [None]:
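# Sanity check that the persisted store loaded. Note that _collection is a
# private attribute of the LangChain wrapper and may change between versions.
collection = vectorstore._collection
print("Number of items in the collection: ", collection.count())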

Number of items in the collection:  240


In [None]:
question = "What is SSAF?"

## Similarity Search

In [None]:
retrieved_docs = vectorstore.similarity_search(question, k=3)
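
Conceptually, `similarity_search` embeds the question and returns the `k` stored chunks whose vectors lie closest to it. The sketch below illustrates that ranking with cosine similarity over hand-made three-dimensional vectors. The vectors, the chunk labels, and the `top_k` helper are hypothetical, and the distance metric a real Chroma collection uses depends on how that collection was configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return (label, score) pairs for the k vectors most similar to the query."""
    scored = [
        (label, cosine_similarity(query_vec, vec))
        for label, vec in doc_vecs.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings (hypothetical values).
toy_docs = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.9, 0.1, 0.0],
    "chunk_c": [0.0, 1.0, 0.0],
    "chunk_d": [0.5, 0.5, 0.0],
}
toy_query = [1.0, 0.0, 0.0]

for label, score in top_k(toy_query, toy_docs, k=3):
    print(f"{label}: {score:.3f}")
```

To see the scores behind a real query, the vector store also offers `similarity_search_with_score`, which returns each document paired with a distance value.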

In [None]:
pprint(retrieved_docs)

[Document(id='2f12ac9b-74dd-4ce9-aa12-26f95d69fc8a', metadata={'seq_num': 8, 'source': '/content/drive/Shareddrives/cnigenai/fees-payments-acap-claude.json'}, page_content='{"page_title": "Fees and Payments", "page_url": "https://www.acap.edu.au/pages/fees-payments/", "section_title": "SSAF Frequently asked questions", "section_content": "Why do we need a SSAF? The SSAF model allows funding to areas which would not normally be allocated through operational costs or cannot be funded due to legislation or education guidelines. It allows ACAP to allocate funding to services, facilities and student-focused initiatives that would otherwise be unsupported or unable to go ahead. How can I benefit directly from the SSAF? Services provided through the SSAF funding are made available to all ACAP students, and it is up to each individual student to choose their level of engagement with these services and activities. What is SA-HELP? SA-HELP is a Commonwealth loan available to eligible students to

In [None]:
for doc in retrieved_docs:
    # Each chunk's page_content is a JSON string; parse it to reach the fields.
    data = json.loads(doc.page_content)
    metadata = doc.metadata

    section_title = data.get('section_title')
    section_content = data.get('section_content')
    metadata_str = ' '.join(f'{key}: {value}' for key, value in metadata.items())

    # Wrap the content at 120 characters so long sections stay readable.
    print(
        f"Section Content: {textwrap.fill(section_content, width=120)}\n"
        f"--------------\nSection Title: {section_title}\n"
        f"--------------\nMetadata: {metadata_str}\n"
    )

Section Content: Why do we need a SSAF? The SSAF model allows funding to areas which would not normally be allocated through operational
costs or cannot be funded due to legislation or education guidelines. It allows ACAP to allocate funding to services,
facilities and student-focused initiatives that would otherwise be unsupported or unable to go ahead. How can I benefit
directly from the SSAF? Services provided through the SSAF funding are made available to all ACAP students, and it is up
to each individual student to choose their level of engagement with these services and activities. What is SA-HELP? SA-
HELP is a Commonwealth loan available to eligible students to pay their Student Services and Amenities Fee (SSAF). I’m
using FEE-HELP, do I still need to pay the SSAF? Your Student Services and Amenities Fee (SSAF) is not covered under
FEE-HELP. However, eligible students may apply for SA-HELP to pay for all or part of the SSAF.
--------------
Section Title: SSAF Frequently asked q

## Maximal Marginal Relevance (MMR) Search

In [None]:
retrieved_docs_mmr = vectorstore.max_marginal_relevance_search(question, k=3, lambda_mult=0.7)

In [None]:
for doc in retrieved_docs_mmr:
    print("Metadata:", doc.metadata)


Metadata: {'seq_num': 8, 'source': '/content/drive/Shareddrives/cnigenai/fees-payments-acap-claude.json'}
Metadata: {'seq_num': 71, 'source': '/content/drive/Shareddrives/cnigenai/fees-payments-acap-claude.json'}
Metadata: {'seq_num': 5, 'source': '/content/drive/Shareddrives/cnigenai/fees-payments-acap-claude.json'}
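
The `lambda_mult` argument controls the trade-off MMR makes between relevance to the query and novelty relative to chunks already picked: LangChain documents 1 as minimum diversity and 0 as maximum diversity. The following is a minimal sketch of the greedy selection rule, not the library's implementation. The `mmr_select` helper and the toy vectors are hypothetical, and a real call first fetches a larger candidate pool (the `fetch_k` parameter) before reranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_select(query_vec, doc_vecs, k=3, lambda_mult=0.7):
    """Greedily pick k labels, balancing relevance against redundancy."""
    relevance = {
        label: cosine_similarity(query_vec, vec)
        for label, vec in doc_vecs.items()
    }
    selected = []
    candidates = list(doc_vecs)
    while candidates and len(selected) < k:

        def mmr_score(label):
            # Redundancy: similarity to the closest chunk already selected.
            redundancy = max(
                (cosine_similarity(doc_vecs[label], doc_vecs[s]) for s in selected),
                default=0.0,
            )
            return lambda_mult * relevance[label] - (1 - lambda_mult) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy vectors (hypothetical): chunk_b is a near-duplicate of chunk_a.
toy_docs = {
    "chunk_a": [0.9, 0.3, 0.0],
    "chunk_b": [0.9, 0.31, 0.0],
    "chunk_c": [0.8, -0.5, 0.0],
    "chunk_d": [0.0, 0.0, 1.0],
}
toy_query = [1.0, 0.0, 0.0]

print("lambda_mult=1.0:", mmr_select(toy_query, toy_docs, k=3, lambda_mult=1.0))
print("lambda_mult=0.7:", mmr_select(toy_query, toy_docs, k=3, lambda_mult=0.7))
```

With `lambda_mult=1.0` the redundancy penalty vanishes and the ranking reduces to plain similarity order. At 0.7 the near-duplicate `chunk_b` is pushed behind the less similar but more distinct `chunk_c`.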


In [None]:
for doc in retrieved_docs_mmr:
    # Each chunk's page_content is a JSON string; parse it to reach the fields.
    data = json.loads(doc.page_content)
    metadata = doc.metadata

    section_title = data.get('section_title')
    section_content = data.get('section_content')
    metadata_str = ' '.join(f'{key}: {value}' for key, value in metadata.items())

    # Wrap the content at 120 characters so long sections stay readable.
    print(
        f"Section Content: {textwrap.fill(section_content, width=120)}\n"
        f"--------------\nSection Title: {section_title}\n"
        f"--------------\nMetadata: {metadata_str}\n"
    )

Section Content: Why do we need a SSAF? The SSAF model allows funding to areas which would not normally be allocated through operational
costs or cannot be funded due to legislation or education guidelines. It allows ACAP to allocate funding to services,
facilities and student-focused initiatives that would otherwise be unsupported or unable to go ahead. How can I benefit
directly from the SSAF? Services provided through the SSAF funding are made available to all ACAP students, and it is up
to each individual student to choose their level of engagement with these services and activities. What is SA-HELP? SA-
HELP is a Commonwealth loan available to eligible students to pay their Student Services and Amenities Fee (SSAF). I’m
using FEE-HELP, do I still need to pay the SSAF? Your Student Services and Amenities Fee (SSAF) is not covered under
FEE-HELP. However, eligible students may apply for SA-HELP to pay for all or part of the SSAF.
--------------
Section Title: SSAF Frequently asked q
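
The similarity and MMR sections print their results with the same loop body, so the formatting could be factored into one function. The sketch below is one way to do that. The name `format_chunk` and the sample inputs are hypothetical, and the fallback for content that is not a JSON object is an added assumption, not something the stored data is known to need.

```python
import json
import textwrap


def format_chunk(page_content, metadata, width=120):
    """Render one retrieved chunk in the layout used by the display loops."""
    try:
        data = json.loads(page_content)
    except json.JSONDecodeError:
        data = None
    if not isinstance(data, dict):
        # Assumed fallback: treat the raw text as the section content.
        data = {"section_content": page_content}

    section_title = data.get("section_title", "(no title)")
    section_content = data.get("section_content") or ""
    metadata_str = " ".join(f"{key}: {value}" for key, value in metadata.items())

    return (
        f"Section Content: {textwrap.fill(section_content, width=width)}\n"
        f"--------------\nSection Title: {section_title}\n"
        f"--------------\nMetadata: {metadata_str}\n"
    )


# Hypothetical sample mimicking the shape of the stored chunks.
sample_content = json.dumps(
    {"section_title": "Example title", "section_content": "Example body text."}
)
sample_metadata = {"seq_num": 1, "source": "example.json"}
print(format_chunk(sample_content, sample_metadata))

# In the notebook, either result list could then be displayed with:
#   for doc in retrieved_docs:
#       print(format_chunk(doc.page_content, doc.metadata))
```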