<a href="https://colab.research.google.com/github/Tar-ive/txst-ai/blob/master/txst_agentic_rag_system_updated.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

 This tutorial will guide you through building a Retrieval-Augmented Generation (RAG) system with an AI agent capable of answering complex queries about the TXST course catalogue using external information.

Author: Saksham Adhikari (ChatGPT, Claude, XAi)

#Table of Contents
1. Overview of Agentic RAG
2. Prerequisites
3. Steps

Step 1: Set Up Your Environment

Step 2: Install and Import Relevant Libraries

Step 3: Set Up OpenAI API Credentials

Step 4: Initialize a Basic Agent with No Tools

Step 5: Establish the Knowledge Base and Retriever

Step 6: Define the Agent's RAG Tool

Step 7: Establish the Prompt Template

Step 8: Set Up the Agent's Memory and
Chain
Step 9: Generate Responses with the Agentic RAG System

4. Summary
5. Further Resources

#Overview of Agentic RAG
## What is RAG?
Retrieval-Augmented Generation (RAG) is a technique in natural language processing (NLP) that combines information retrieval with generative models to produce more accurate, relevant, and contextually aware responses. Traditional language models generate responses based solely on the input prompt and their pre-trained knowledge. However, RAG systems enhance this by fetching external information from a knowledge base or documents, ensuring that responses are up-to-date and grounded in specific data sources.

## What are AI Agents?
AI Agents are systems or programs capable of autonomously performing tasks on behalf of a user or another system. They design their workflows, utilize available tools, and can interact with external data sources, APIs, and other agents to solve complex tasks. In the context of RAG, AI agents can decide when and how to retrieve information, perform calculations, analyze data, and generate responses based on multiple inputs and tools.


## Agentic RAG vs. Traditional RAG
While traditional RAG systems primarily focus on augmenting generative models with information retrieval from a vector database, Agentic RAG systems are more versatile. They can incorporate multiple tools beyond data retrieval, such as performing mathematical calculations, sending emails, conducting data analysis, and more. Additionally, agentic RAG systems can operate collaboratively in multi-agent environments, enhancing scalability and adaptability.

# Agentic AI Advisor made for TXST- Part 1: Single Agent System

### Step 2: Install and Import Relevant Libraries

In [None]:
!pip install langchain openai chromadb tiktoken python-dotenv bs4


Collecting chromadb
  Downloading chromadb-0.5.18-py3-none-any.whl.metadata (6.8 kB)
Collecting tiktoken
  Downloading tiktoken-0.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)
Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB)
Collecting bs4
  Downloading bs4-0.0.2-py2.py3-none-any.whl.metadata (411 bytes)
Collecting build>=1.0.3 (from chromadb)
  Downloading build-1.2.2.post1-py3-none-any.whl.metadata (6.5 kB)
Collecting chroma-hnswlib==0.7.6 (from chromadb)
  Downloading chroma_hnswlib-0.7.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (252 bytes)
Collecting fastapi>=0.95.2 (from chromadb)
  Downloading fastapi-0.115.4-py3-none-any.whl.metadata (27 kB)
Collecting uvicorn>=0.18.3 (from uvicorn[standard]>=0.18.3->chromadb)
  Downloading uvicorn-0.32.0-py3-none-any.whl.metadata (6.6 kB)
Collecting posthog>=2.4.0 (from chromadb)
  Downloading posthog-3.7.0-py2.py3-none-any.whl.metadata (2.

In [None]:
!pip install langchain_community

Collecting langchain_community
  Downloading langchain_community-0.3.5-py3-none-any.whl.metadata (2.9 kB)
Collecting SQLAlchemy<2.0.36,>=1.4 (from langchain_community)
  Downloading SQLAlchemy-2.0.35-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.6 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain_community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting httpx-sse<0.5.0,>=0.4.0 (from langchain_community)
  Downloading httpx_sse-0.4.0-py3-none-any.whl.metadata (9.0 kB)
Collecting langchain<0.4.0,>=0.3.6 (from langchain_community)
  Downloading langchain-0.3.7-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain-core<0.4.0,>=0.3.15 (from langchain_community)
  Downloading langchain_core-0.3.15-py3-none-any.whl.metadata (6.3 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain_community)
  Downloading pydantic_settings-2.6.1-py3-none-any.whl.metadata (3.5 kB)
Collecting marshmallow<4.0.0,>=3.18.0 (from datac

In [None]:
import os
from dotenv import load_dotenv
import openai

from langchain import OpenAI, LLMChain, PromptTemplate
from langchain.agents import AgentExecutor, Tool, initialize_agent
from langchain.agents import AgentType
from langchain.tools import tool
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.memory import ConversationBufferMemory

from bs4 import BeautifulSoup




### Step 3: Set Up OpenAI API Credentials

In [77]:
from google.colab import userdata
userdata.get('OPENAI_API_KEY')

OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')

In [78]:
openai_api_key = userdata.get('OPENAI_API_KEY')
if not openai_api_key:
    raise ValueError("OpenAI API key not found in environment variables.")

In [None]:
%pip install -qU langchain-openai



[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/50.4 kB[0m [31m?[0m eta [36m-:--:--[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m50.4/50.4 kB[0m [31m2.8 MB/s[0m eta [36m0:00:00[0m
[?25h[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/389.6 kB[0m [31m?[0m eta [36m-:--:--[0m[2K   [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m [32m389.1/389.6 kB[0m [31m25.1 MB/s[0m eta [36m0:00:01[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m389.6/389.6 kB[0m [31m11.5 MB/s[0m eta [36m0:00:00[0m
[?25h

### Initialize OpenAI in LangChain:

In [68]:
from langchain_openai import OpenAI
from langchain.chat_models import ChatOpenAI # Import the ChatOpenAI class from langchain.chat_models


openai_api_key = userdata.get('OPENAI_API_KEY')

# Check if the API key is valid
if not openai_api_key:
    raise ValueError("OpenAI API key not found in userdata.")

# Initialize the OpenAI LLM with the API key
llm = ChatOpenAI(
    # Pass the api_key directly to the OpenAI class
    openai_api_key=openai_api_key,
    model_name="gpt-4o-mini", #Changed to the correct name for gpt-4-turbo
    temperature=0,  # Set to 0 for deterministic outputs
)

In [None]:
openai_api_key

### Step 4: Initialize a Basic Agent with No Tools

In [None]:
template = "Answer the following question as accurately as possible. If you do not know the answer, simply say you do not know.\n\nQuestion: {query}\nAnswer:"
prompt = PromptTemplate(template=template, input_variables=["query"])


In [None]:
chain = LLMChain(llm=llm, prompt=prompt)


  chain = LLMChain(llm=llm, prompt=prompt)


In [None]:
response = chain.run({"query": "What sport is played at the US Open?"})
print(response)


  response = chain.run({"query": "What sport is played at the US Open?"})


The US Open is primarily known for tennis. It is one of the four Grand Slam tennis tournaments. Additionally, there is also a US Open in golf, which is a major championship in that sport.


In [None]:
response = chain.run({"query": "Where was the 2024 US Open Tennis Championship held?"})
print(response)


I do not know.


### Step 5: Establish the Knowledge Base and Retriever

In [None]:
urls = [
    "http://mycatalog.txstate.edu/undergraduate/majors/#A",
    "http://mycatalog.txstate.edu/undergraduate/majors/#B",
    "http://mycatalog.txstate.edu/undergraduate/majors/#C",
    "http://mycatalog.txstate.edu/undergraduate/majors/#D",
    "http://mycatalog.txstate.edu/undergraduate/majors/#E",
    "http://mycatalog.txstate.edu/undergraduate/majors/#F",
    "http://mycatalog.txstate.edu/undergraduate/majors/#G",
    "http://mycatalog.txstate.edu/undergraduate/majors/#H",
    "http://mycatalog.txstate.edu/undergraduate/majors/#I",
    "http://mycatalog.txstate.edu/undergraduate/majors/#J",
    "http://mycatalog.txstate.edu/undergraduate/majors/#K",
    "http://mycatalog.txstate.edu/undergraduate/majors/#L",
    "http://mycatalog.txstate.edu/undergraduate/majors/#M",
    "http://mycatalog.txstate.edu/undergraduate/majors/#N",
    "http://mycatalog.txstate.edu/undergraduate/majors/#O",
    "http://mycatalog.txstate.edu/undergraduate/majors/#P",
    "http://mycatalog.txstate.edu/undergraduate/majors/#Q",
    "http://mycatalog.txstate.edu/undergraduate/majors/#R",
    "http://mycatalog.txstate.edu/undergraduate/majors/#S",
    "http://mycatalog.txstate.edu/undergraduate/majors/#T",
    "http://mycatalog.txstate.edu/undergraduate/majors/#U",
    "http://mycatalog.txstate.edu/undergraduate/majors/#V",
    "http://mycatalog.txstate.edu/undergraduate/majors/#W",
    "http://mycatalog.txstate.edu/undergraduate/majors/#X",
    "http://mycatalog.txstate.edu/undergraduate/majors/#Y",
    "http://mycatalog.txstate.edu/undergraduate/majors/#Z",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/accounting/accounting-bba/",
    "https://mycatalog.txstate.edu/graduate/mccoy-business-administration/accounting/accounting-bba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/advertising-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/advertising-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-business-management-agribusiness-specialization-bsag/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-agricultural-mechanics-concentration/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-agricultural-horticulture-concentration/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-teacher-certification-science-technology-grades-612-bsag/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-bsag/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-animal-science-preveterinary-concentration/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-animal-science/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/anthropology/anthropology-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/anthropology/anthropology-bs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/organization-workforce-leadership-studies/applied-sciences-baas/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/mathematics/applied-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/sociology/applied-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/aquatic-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/art-history-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/art-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/biochemistryacs-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/biochemistry-early-entry-combined-program-bs-ms/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/biochemistry-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/biology-teacher-certification-life-sciences-grades-7-12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/biology-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/chemistry-early-entry-combined-program-bs-ms/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/chemistry-teacher-certification-grades-7-12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/chemistry-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/civil-engineering-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/communication-design-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/communication-disorders/bscd/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/persuasive-communication-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/organizational-communication-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/interpersonal-communication-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/communication-studies-teacher-certification-speech-grades-712-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/communication-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/computer-information-systems-bba/cis-business-analytics/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/cis-info-security-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/cis-software-dev-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/computer-information-systems-bba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/computer/computer-science-concentration-engineering-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/computer/computer-science-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/computer/computer-science-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/concrete-industry-management-minor-business-administration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/construction-science-management-residential-construction-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/construction-science-management-minor-business-administration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/consumer-affairs-family-sciences-option-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/consumer-affairs-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/consumer-affairs-teacher-certification-family-sciences-grades-612-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/criminal-justice/criminal-justice-bscj/",
    "http://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-studies-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-performance-choreography-emphasis-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-single-field-teaching-certification-grades-812-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-two-field-teaching-certification-grades-812-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/digital-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/digital-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/finance-economics/economics-ba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/finance-economics/economics-bba/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-bilingual-biliteracy-teacher-certficiation-ec6-bilingual-spanish-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-elementary-education-teacher-certficiation-ec6-esl-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-eng-langarts-reading-ss-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-math-science-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-math-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-science-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-ba/",
        "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-secondary-education-teacher-certification-double-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-special-education-teacher-certification-special-education-ec12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/electrical-engineering-computer-specialization-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/electrical-engineering-micro-nano-devices-systems-specialization-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/electrical-engineering-networks-communication-systems-specialization-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/electronic-media-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/electronic-media-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/engineering-technology-civil-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/electrical-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/engineering-technology-environmental-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/engineering-technology-manufacturing-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/engineering-technology-mechanical-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/creative-writing-emphasis-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/film-emphasis-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/english-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/writing-rhetoric-emphasis-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/english-ba/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/exercise-sports-science-health-wellness-promotion-clinical-populations-concentration-bess/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/exercise-sports-science-prerehab-sciences/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/exercise-sports-science-teacher-certification-physical-education-grades-ec12-bess/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/exercise-sports-sci-athletic-trainig-bessms/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/fashion-merchandising-plan-presentation-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/fashion-merchandising-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/finance-economics/finance-bba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/french-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/french-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/geographic-information-science-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/resource-environmental-studies-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/urban-regional-planning-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/water-resources-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/geography-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/german-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/german-ba/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/health-fitness-management-business-administration-minor-bess/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/information-management/bshim/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-fine-motor-therapy-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-gross-motor-therapy-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-prechiropractic-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-precomm-disorders-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-preclinical-lab-science-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-prenursing-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-preradiation-therapy-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-prerespiratory-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/administration/healthcare-administration-bha/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/history/history-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/history/history-secondary-tcert-ss-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/history/history-two-fields-teacher-certification-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/history/history-ba/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/human-dev-family-sciences-teacher-certification-human-studies-grades-812-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/human-dev-family-sciences-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/geography-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/industrial-engineering-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/counseling-leadership-adult-school-psychology/integrated-studies-bgs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/interior-design-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/relations-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/asian-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/european-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/business-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/interamerican-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/middle-east-african-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/russian-east-european-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/travel-tourism-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/journalism-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/journalism-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/management/management-entrepreneurial-studies-concentration-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/management/management-human-resource-concentration-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/management/management-bba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/manufacturing-engineering-mechanical-systems-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/manufacturing-engineering-smart-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/marketing/marketing-professional-sales-concentration-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/marketing/marketing-services-concentration-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/marketing/marketing-bba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/mass-communication-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/mathematics/mathematics-teacher-certification-grades-7-12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/mathematics/mathematics-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/mathematics/mathematics-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/mechanical-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/medical-laboratory-science-program/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/microbiology-molecular-genetics-bs/",
    "https://next.mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/music-studies-band-concentration-teacher-certification-grades-ec12-BM/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/music-studies-choral-concentration-teacher-certification-grades-ec12-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/music-studies-mariachi-concentration-teacher-certification-grades-ec12-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/music-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/musical-theatre-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/st-davids-nursing/rn-to-bsn/",
    "https://mycatalog.txstate.edu/graduate/health-professions/st-davids-nursing/leadershipandadminnursing-rn-bsn-msn/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/st-davids-nursing/bsn/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/nutrition-foods-dietetics-track-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/nutrition-foods-teacher-certification-hospitality-nutrition-food-sciences-grades-812-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/nutrition-foods-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-composition-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-guitar-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-instrumental-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-jazz-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-keyboard-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-vocal-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/philosophy/philosophy-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/photography-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/physical-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/physics/physics-astronomy-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/physics/physics-teacher-certification-physics-math-grades-7-12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/physics/physics-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/physics/physics-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/political-science/political-science-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/psychology/psychology-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/psychology/psychology-science-minor-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/political-science/public-administration-bpa/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/public-health-applied-epidemiology-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/public-health-health-wellness-coaching-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/public-health-health-equity-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/public-health-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/public-relations-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/public-relations-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/radiation-therapy-program/bsrt/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/recreation-studies-community-recreation-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/recreation-studies-outdoor-recreation-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/recreation-studies-therapeutic-recreation-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/philosophy/religious-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/respiratory-care/rrt-to-bsrc/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/respiratory-care/bsrc/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/social-work/social-work-bsw/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/sociology/sociology-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/sound-recording-technology-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-hispanic-litandculture-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-profesions-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-linguistics-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/studio-art-teacher-certification-art-grades-ec12-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/studio-art-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-acting-preprofessional-option-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-film-production-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-performance-production-preprofessional-option-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-teacher-certification-grades-ec12-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-technical-production-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/wildlife-certification-biologist-bs/"
]




In [None]:
loaders = [WebBaseLoader(url) for url in urls]
docs = []
for loader in loaders:
    loaded_docs = loader.load()
    docs.extend(loaded_docs)


In [None]:
print(docs[0].page_content[:500])  # Print the first 500 characters






Undergraduate Degree Programs | Texas State University


























Skip to Content
AZ Index
Catalog Home
Institution Home








Search Catalog

Search



HomeUndergraduateUndergraduate Degree Programs


Catalogs 2024-2025

Catalog Home
Undergraduate Degree Programs
Graduate Degree Programs

Previous Catalogs

Undergraduate
Graduate





BobcatMail
CatsWeb
Canvas



More Tools 



SAP Portal
Pay Tuition
Online Toolkit
ePortfolio
Catalogs
Shuttle Tracker






About
Athletics
G


In [None]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=250,
    chunk_overlap=0,
    separators=["\n\n", "\n", " ", ""]
)
doc_splits = text_splitter.split_documents(docs)


In [None]:
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)
vectorstore = Chroma.from_documents(
    documents=doc_splits,
    embedding=embeddings,
    collection_name="agentic-rag-chroma"
)


  embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)


In [None]:
retriever = vectorstore.as_retriever()


### Step 6: Define the Agent's RAG Tool


In [None]:
@tool
def get_mccoy_business_administration_context(question: str) -> str:
    """Retrieve context about Texas State University's McCoy College of Business Administration."""
    docs = retriever.get_relevant_documents(f"McCoy Business Administration {question}")
    if not docs:
        return "I could not find any relevant information about McCoy Business Administration."
    context = "\n".join([doc.page_content for doc in docs])
    return context


@tool
def get_fine_arts_context(question: str) -> str:
    """Retrieve context about Texas State University's Fine Arts and Communication programs."""
    docs = retriever.get_relevant_documents(f"Fine Arts and Communication {question}")
    if not docs:
        return "I could not find any relevant information about Fine Arts and Communication."
    context = "\n".join([doc.page_content for doc in docs])
    return context


@tool
def get_liberal_arts_context(question: str) -> str:
    """Retrieve context about Texas State University's Liberal Arts programs."""
    docs = retriever.get_relevant_documents(f"Liberal Arts {question}")
    if not docs:
        return "I could not find any relevant information about Liberal Arts."
    context = "\n".join([doc.page_content for doc in docs])
    return context


@tool
def get_science_and_engineering_context(question: str) -> str:
    """Retrieve context about Texas State University's Science and Engineering programs."""
    docs = retriever.get_relevant_documents(f"Science and Engineering {question}")
    if not docs:
        return "I could not find any relevant information about Science and Engineering."
    context = "\n".join([doc.page_content for doc in docs])
    return context


@tool
def get_health_professions_context(question: str) -> str:
    """Retrieve context about Texas State University's Health Professions programs."""
    docs = retriever.get_relevant_documents(f"Health Professions {question}")
    if not docs:
        return "I could not find any relevant information about Health Professions."
    context = "\n".join([doc.page_content for doc in docs])
    return context


@tool
def get_applied_arts_context(question: str) -> str:
    """Retrieve context about Texas State University's Applied Arts programs."""
    docs = retriever.get_relevant_documents(f"Applied Arts {question}")
    if not docs:
        return "I could not find any relevant information about Applied Arts."
    context = "\n".join([doc.page_content for doc in docs])
    return context


@tool
def get_education_context(question: str) -> str:
    """Retrieve context about Texas State University's Education programs."""
    docs = retriever.get_relevant_documents(f"Education {question}")
    if not docs:
        return "I could not find any relevant information about Education."
    context = "\n".join([doc.page_content for doc in docs])
    return context


In [None]:
tools = [
    get_mccoy_business_administration_context,
    get_fine_arts_context,
    get_liberal_arts_context,
    get_science_and_engineering_context,
    get_health_professions_context,
    get_applied_arts_context,
    get_education_context
]


### Step 7: Establish the Prompt Template

In [None]:
system_prompt = """You are an AI academic advisor for Texas State University, specializing in providing detailed information about various undergraduate programs. You have access to the following tools: {tools}.

Use these tools to retrieve department-specific information as needed for Texas State’s McCoy Business Administration, Fine Arts and Communication, Liberal Arts, Science and Engineering, Health Professions, Applied Arts, Education, or IBM's involvement in the 2024 US Open.

When answering questions, follow this response format:

1. Analyze the question to determine the relevant department or topic.
2. Use the corresponding tool to retrieve context if necessary.
3. Provide a clear and detailed response based on the retrieved information or your existing knowledge.

Format:
{{ "action": "Tool Name", "action_input": "Tool Input" }}
Provide only ONE action per JSON block.

Example:
User: "Tell me about the Computer Information Systems major."

Response:
- Identify that this major is part of McCoy Business Administration.
- Use `get_mccoy_business_administration_context` to retrieve specific information.
- Respond with: "The Computer Information Systems major at Texas State University offers... [details from retrieved context]."

If no specific information is available, notify the user with a message such as:
"I couldn't find specific information on that topic, but I can help with general questions."

Your objective is to deliver comprehensive, department-focused answers that aid students in exploring and understanding Texas State University's programs and offerings.
"""


In [None]:
human_prompt = """Question: {input}
{agent_scratchpad}
"""


In [None]:
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI
from langchain import SerpAPIWrapper
from langchain.agents import Tool
from langchain.tools import BaseTool
from typing import Optional, Type
from pydantic import BaseModel, Field

In [None]:
from langchain.vectorstores import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.prompts import PromptTemplate
from langchain.tools import tool
from langchain.tools.render import render_text_description_and_args
from langchain.agents.output_parsers import JSONAgentOutputParser
from langchain.agents.format_scratchpad import format_log_to_str
from langchain.agents import AgentExecutor
from langchain.memory import ConversationBufferMemory
from langchain_core.runnables import RunnablePassthrough

In [None]:
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    MessagesPlaceholder(variable_name="chat_history", optional=True),
    ("human", human_prompt),
])

# Finalize the prompt with tool names and descriptions
# from langchain.agents import render_tools_text

prompt = prompt.partial(
    tools=render_text_description_and_args(tools),
    tool_names=", ".join([tool.name for tool in tools]),
)


### Step 8: Set Up the Agent's Memory and Chain


In [None]:
memory = ConversationBufferMemory(memory_key="chat_history")


  memory = ConversationBufferMemory(memory_key="chat_history")


In [None]:
agent_executor = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.OPENAI_FUNCTIONS,  # Using OpenAI functions for tool usage
    verbose=True,
    memory=memory,
    prompt=prompt,
)


  agent_executor = initialize_agent(


### Step 9: Generate Responses with the Agentic RAG System

In [None]:
response = agent_executor.run("Can you provide details about the Computer Information Systems program at Texas State University, including available concentrations?")
print(response)




[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3m
Invoking: `get_mccoy_business_administration_context` with `{'question': 'What are the details of the Computer Information Systems program at Texas State University, including available concentrations?'}`


[0m

  docs = retriever.get_relevant_documents(f"McCoy Business Administration {question}")


[36;1m[1;3mInformation Systems at Texas State University. More information about the Field of Study is available in the Academic Policies section of this catalog. If transferring additional business courses, please contact the McCoy College of Business
Information Systems at Texas State University. More information about the Field of Study is available in the Academic Policies section of this catalog. If transferring additional business courses, please contact the McCoy College of Business
Information Systems at Texas State University. More information about the Field of Study is available in the Academic Policies section of this catalog. If transferring additional business courses, please contact the McCoy College of Business
Information Systems at Texas State University. More information about the Field of Study is available in the Academic Policies section of this catalog. If transferring additional business courses, please contact the McCoy College of Business[0m[32;1m[1;3mIt 

In [None]:
response = agent_executor.run(
    "What are the core courses, concentrations, and career opportunities available for students in the Computer Information Systems program at Texas State University?"
)
print(response)




[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3m
Invoking: `get_mccoy_business_administration_context` with `{'question': 'What are the core courses, concentrations, and career opportunities available for students in the Computer Information Systems program?'}`


[0m[36;1m[1;3mBachelor of Business Administration (B.B.A.) Major in Computer Information Systems
HomeUndergraduateEmmett and Miriam McCoy College of BusinessDepartment of Information Systems and AnalyticsB.B.A. Major in Computer Information Systems (Software Development Concentration)






Catalog Menu



Undergraduate
Bachelor of Business Administration (B.B.A.) Major in Computer Information Systems (Business Analytics Concentration)
HomeUndergraduateEmmett and Miriam McCoy College of BusinessDepartment of Information Systems and AnalyticsB.B.A. Major in Computer Information Systems (Business Analytics Concentration)






Catalog Menu



Undergraduate[0m[32;1m[1;3mThe Computer Information Systems program 

In [None]:
response = agent_executor.run("What is the capital of France?")
print(response)




[1m> Entering new AgentExecutor chain...[0m
[32;1m[1;3mThe capital of France is Paris.[0m

[1m> Finished chain.[0m
The capital of France is Paris.


In [None]:
print(memory.buffer)


Human: Can you provide details about the Computer Information Systems program at Texas State University, including available concentrations?
AI: It seems that I wasn't able to retrieve specific details about the Computer Information Systems program at Texas State University, including its concentrations. However, I can provide a general overview based on common offerings in such programs.

Typically, a Computer Information Systems (CIS) program may include concentrations such as:

1. **Data Analytics**: Focuses on data management, analysis, and visualization techniques.
2. **Cybersecurity**: Concentrates on protecting information systems from cyber threats and attacks.
3. **Software Development**: Emphasizes programming, software engineering, and application development.
4. **Network Administration**: Covers the management and maintenance of computer networks.
5. **Information Technology Management**: Focuses on the strategic use of technology in business settings.

For the most accura

### Summary
In this tutorial, you've successfully built a Retrieval-Augmented Generation (RAG) system using LangChain and the OpenAI API. The system employs an AI agent capable of:

Answering Basic Queries: Responding directly when the information is within the model's training data.
Retrieving External Information: Utilizing a retrieval tool to access up-to-date and specific information from a curated knowledge base.
Maintaining Context: Leveraging memory to keep track of past interactions, enhancing the relevance and coherence of responses.
Deciding When to Use Tools: Determining whether to fetch external data or answer directly based on the query's nature.
This agentic RAG system showcases the power of combining generative models with information retrieval, enabling more accurate and contextually rich responses.



#Part 2- Multi Agent System


### testing the initial system


In [None]:
# Install necessary packages
# !pip install langchain chromadb openai

import os
from typing import List, Dict, Any
from langchain.agents import AgentExecutor, Tool, initialize_agent, AgentType
from langchain.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI
from langchain.prompts import (
    ChatPromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import LLMChain
from langchain.schema import SystemMessage, HumanMessage, AIMessage
from langchain.docstore.document import Document

# Set your OpenAI API key
openai_api_key

class ClassifierAgent:
    """Central classifier agent that routes queries to appropriate department agents"""

    def __init__(self, llm):
        self.llm = llm
        self.classifier_prompt = """You are a query classifier for a university academic advising system.
Your job is to determine which department should handle the query.

Departments:
- Business (McCoy College of Business Administration)
- Fine Arts
- Liberal Arts
- Science and Engineering
- Health Professions
- Applied Arts
- Education

Query: {query}

Return ONLY the department name that should handle this query. If uncertain, return "uncertain".
"""
        self.chain = LLMChain(
            llm=self.llm,
            prompt=ChatPromptTemplate.from_template(self.classifier_prompt)
        )

    def classify_query(self, query: str) -> str:
        """Classify which department should handle the query"""
        result = self.chain.run(query=query)
        return result.strip().lower()

class BusinessAgent:
    """Agent specialized in handling business department queries"""

    def __init__(self, llm, retriever):
        self.llm = llm
        self.retriever = retriever

        # Define business-specific tool
        def get_business_info(question: str) -> str:
            """Retrieve information about McCoy College of Business Administration programs"""
            docs = self.retriever.get_relevant_documents(f"McCoy Business Administration {question}")
            return "\n".join([doc.page_content for doc in docs])

        self.tools = [
            Tool(
                name="get_business_info",
                func=get_business_info,
                description="Retrieve information about McCoy College of Business Administration programs"
            )
        ]

        # Business-specific prompt
        self.prompt = ChatPromptTemplate.from_messages([
            SystemMessagePromptTemplate.from_template("""You are a specialized academic advisor for the McCoy College of Business Administration
at Texas State University. Use the available tools to provide detailed information about business programs,
courses, and requirements. Always aim to be specific and accurate.
"""),
            MessagesPlaceholder(variable_name="chat_history"),
            HumanMessagePromptTemplate.from_template("{input}")
        ])

        # Initialize the agent
        self.memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
        self.agent = initialize_agent(
            tools=self.tools,
            llm=self.llm,
            agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
            verbose=False,
            memory=self.memory,
            agent_kwargs={'prompt': self.prompt}
        )

    def process_query(self, query: str) -> str:
        """Process business-related queries"""
        return self.agent.run(query)

class AdvisorSystem:
    """Main system that coordinates between agents"""

    def __init__(self, openai_api_key: str):
        # Initialize LLM
        self.llm = ChatOpenAI(
            openai_api_key=openai_api_key,
            temperature=0,
            model_name="gpt-4o-mini"
        )

        # Initialize embeddings and retriever
        self.embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

        # Sample documents
        sample_documents = [
            Document(page_content="The McCoy College of Business Administration offers a Bachelor of Business Administration in Computer Information Systems."),
            Document(page_content="The McCoy College of Business Administration offers a Master of Business Administration (MBA) program. The prerequisites include completion of a bachelor's degree and submission of GMAT scores."),
            Document(page_content="The School of Music offers courses in piano, violin, guitar, and voice.")
        ]

        # Create the vectorstore and add documents
        self.vectorstore = Chroma.from_documents(sample_documents, self.embeddings)
        self.retriever = self.vectorstore.as_retriever()

        # Initialize agents
        self.classifier = ClassifierAgent(self.llm)
        self.business_agent = BusinessAgent(self.llm, self.retriever)

    def process_query(self, query: str) -> str:
        """Process incoming queries through the multi-agent system"""

        # First, classify the query
        department = self.classifier.classify_query(query)

        # Route to appropriate agent based on classification
        if "business" in department:
            return self.business_agent.process_query(query)
        elif "uncertain" in department:
            return "I apologize, but I'm not sure which department would best handle your query. Could you please provide more specific information about your question?"
        else:
            return f"I apologize, but the {department} department agent is currently not active. I can only assist with business-related queries at the moment."


# Example usage
def main():
    # Initialize the system
    advisor = AdvisorSystem(openai_api_key)

    # Example queries
    queries = [
        "Tell me about the Computer Information Systems program",
        "What are the prerequisites for the MBA program?",
        "What musical instruments can I learn?",  # This should go to Fine Arts
    ]

    # Process each query
    for query in queries:
        print(f"\nQuery: {query}")
        response = advisor.process_query(query)
        print(f"Response: {response}")

if __name__ == "__main__":
    main()



Query: Tell me about the Computer Information Systems program
Response: I apologize, but the science and engineering department agent is currently not active. I can only assist with business-related queries at the moment.

Query: What are the prerequisites for the MBA program?




Response: The prerequisites for the MBA program at McCoy College of Business Administration include the completion of a bachelor's degree and the submission of GMAT scores.

Query: What musical instruments can I learn?
Response: I apologize, but the fine arts department agent is currently not active. I can only assist with business-related queries at the moment.


In [None]:
# prompt: How can I check what is in the "agentic-rag-chroma" vector db?

# Assuming 'vectorstore' is defined as in your provided code
# You can access the contents of the vector database using the following methods:

# 1. Get all the documents in the database
all_documents = vectorstore.get()

# 2. Get documents that are similar to a specific query:
query = "Computer Information Systems"
similar_documents = vectorstore.similarity_search(query)

# 3. Access specific information from the documents
for doc in similar_documents:
  print(f"Document ID: {doc.metadata.get('id', 'N/A')}")
  print(f"Document content: {doc.page_content[:200]}...")  # Print the first 200 characters of the document

# 4. You can also use the 'persist' and 'from_persist_directory' functions
#    to save and load the vector database to a directory on your local machine.



Document ID: N/A
Document content: Bachelor of Business Administration (B.B.A.) Major in Computer Information Systems...
Document ID: N/A
Document content: Bachelor of Business Administration (B.B.A.) Major in Computer Information Systems (Information Security Concentration)...
Document ID: N/A
Document content: Bachelor of Business Administration (B.B.A.) Major in Computer Information Systems (Software Development Concentration)...
Document ID: N/A
Document content: Bachelor of Business Administration (B.B.A.) Major in Computer Information Systems (Business Analytics Concentration)...


In [55]:
# improving functionality by adding different vector dbs for each department.

import re
from collections import defaultdict

# Define departments
departments = [
    "Business (McCoy College of Business Administration)",
    "Fine Arts",
    "Liberal Arts",
    "Science and Engineering",
    "Health Professions",
    "Applied Arts",
    "Education"
]

# Initialize a dictionary to hold department-wise URLs
dept_urls = defaultdict(list)

# Define patterns to match URLs to departments
department_patterns = {
    "Business (McCoy College of Business Administration)": re.compile(r'/mccoy-business-administration/'),
    "Fine Arts": re.compile(r'/fine-arts-communication/|/art-design/|/theatre/|/music/'),
    "Liberal Arts": re.compile(r'/liberal-arts/|/english/|/history/|/philosophy/|/world-languages-literatures/|/international-studies/|/geography/|/political-science/|/sociology/|/psychology/'),
    "Science and Engineering": re.compile(r'/science-engineering/|/computer/|/mathematics/|/biology/|/chemistry-biochemistry/|/physics/|/ingram-school/|/technology/'),
    "Health Professions": re.compile(r'/health-professions/|/st-davids-nursing/|/radiation-therapy-program/|/respiratory-care/'),
    "Applied Arts": re.compile(r'/applied-arts/|/family-consumer-sciences/|/social-work/'),
    "Education": re.compile(r'/education/|/curriculum-instruction/|/health-human-performance/|/counseling-leadership-adult-school-psychology/')
}

# Assign URLs to departments
for url in urls:
    assigned = False
    for dept, pattern in department_patterns.items():
        if pattern.search(url):
            dept_urls[dept].append(url)
            assigned = True
            break
    if not assigned:
        print(f"URL did not match any department: {url}")


URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#A
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#B
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#C
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#D
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#E
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#F
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#G
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#H
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#I
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#J
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#K
URL did not match any department: http://my

In [57]:
import re

def sanitize_collection_name(department_name):
    """
    Sanitizes the department name to conform to Chroma's collection naming rules.
    """
    # Convert to lowercase
    sanitized = department_name.lower()

    # Replace spaces and hyphens with underscores
    sanitized = re.sub(r'[\s\-]+', '_', sanitized)

    # Remove all characters except alphanumerics and underscores
    sanitized = re.sub(r'[^a-z0-9_]', '', sanitized)

    # Ensure the name starts and ends with an alphanumeric character
    sanitized = re.sub(r'^[^a-z0-9]+', '', sanitized)
    sanitized = re.sub(r'[^a-z0-9]+$', '', sanitized)

    # Ensure the length is between 3 and 63 characters
    if len(sanitized) < 3:
        sanitized = sanitized.ljust(3, '_')
    elif len(sanitized) > 63:
        sanitized = sanitized[:63]

    return sanitized


In [58]:
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Initialize embeddings once
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

# Function to process and store documents for a department
def process_department(department_name, department_urls):
    print(f"Processing department: {department_name}")

    # Load documents
    loaders = [WebBaseLoader(url) for url in department_urls]
    docs = []
    for loader in loaders:
        try:
            loaded_docs = loader.load()
            docs.extend(loaded_docs)
        except Exception as e:
            print(f"Error loading {loader.url}: {e}")

    if not docs:
        print(f"No documents loaded for {department_name}. Skipping...")
        return

    # Split documents
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=250,
        chunk_overlap=0,
        separators=["\n\n", "\n", " ", ""]
    )
    doc_splits = text_splitter.split_documents(docs)

    # Create vector store|
    vectorstore = Chroma.from_documents(
        documents=doc_splits,
        embedding=embeddings,
        collection_name=department_name.lower().replace(" ", "_")  # e.g., business_mccoy_college
    )

    print(f"Vector store created for {department_name} with {len(doc_splits)} documents.")

# Iterate over each department and process
for dept, dept_specific_urls in dept_urls.items():
    process_department(dept, dept_specific_urls)


Processing department: Business (McCoy College of Business Administration)


ValueError: Expected collection name that (1) contains 3-63 characters, (2) starts and ends with an alphanumeric character, (3) otherwise contains only alphanumeric characters, underscores or hyphens (-), (4) contains no two consecutive periods (..) and (5) is not a valid IPv4 address, got business_(mccoy_college_of_business_administration)

In [76]:
openai_api_key

### seperated document loading

In [84]:
import re
import os
from collections import defaultdict
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# ============================
# Step 1: Define Your Data
# ============================

# Replace this list with your actual URLs
urls = [
    "http://mycatalog.txstate.edu/undergraduate/majors/#A",
    "http://mycatalog.txstate.edu/undergraduate/majors/#B",
    "http://mycatalog.txstate.edu/undergraduate/majors/#C",
    "http://mycatalog.txstate.edu/undergraduate/majors/#D",
    "http://mycatalog.txstate.edu/undergraduate/majors/#E",
    "http://mycatalog.txstate.edu/undergraduate/majors/#F",
    "http://mycatalog.txstate.edu/undergraduate/majors/#G",
    "http://mycatalog.txstate.edu/undergraduate/majors/#H",
    "http://mycatalog.txstate.edu/undergraduate/majors/#I",
    "http://mycatalog.txstate.edu/undergraduate/majors/#J",
    "http://mycatalog.txstate.edu/undergraduate/majors/#K",
    "http://mycatalog.txstate.edu/undergraduate/majors/#L",
    "http://mycatalog.txstate.edu/undergraduate/majors/#M",
    "http://mycatalog.txstate.edu/undergraduate/majors/#N",
    "http://mycatalog.txstate.edu/undergraduate/majors/#O",
    "http://mycatalog.txstate.edu/undergraduate/majors/#P",
    "http://mycatalog.txstate.edu/undergraduate/majors/#Q",
    "http://mycatalog.txstate.edu/undergraduate/majors/#R",
    "http://mycatalog.txstate.edu/undergraduate/majors/#S",
    "http://mycatalog.txstate.edu/undergraduate/majors/#T",
    "http://mycatalog.txstate.edu/undergraduate/majors/#U",
    "http://mycatalog.txstate.edu/undergraduate/majors/#V",
    "http://mycatalog.txstate.edu/undergraduate/majors/#W",
    "http://mycatalog.txstate.edu/undergraduate/majors/#X",
    "http://mycatalog.txstate.edu/undergraduate/majors/#Y",
    "http://mycatalog.txstate.edu/undergraduate/majors/#Z",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/accounting/accounting-bba/",
    "https://mycatalog.txstate.edu/graduate/mccoy-business-administration/accounting/accounting-bba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/advertising-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/advertising-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-business-management-agribusiness-specialization-bsag/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-agricultural-mechanics-concentration/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-agricultural-horticulture-concentration/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-teacher-certification-science-technology-grades-612-bsag/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-bsag/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-animal-science-preveterinary-concentration/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/agriculturalsciences/agriculture-animal-science/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/anthropology/anthropology-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/anthropology/anthropology-bs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/organization-workforce-leadership-studies/applied-sciences-baas/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/mathematics/applied-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/sociology/applied-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/aquatic-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/art-history-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/art-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/biochemistryacs-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/biochemistry-early-entry-combined-program-bs-ms/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/biochemistry-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/biology-teacher-certification-life-sciences-grades-7-12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/biology-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/chemistry-early-entry-combined-program-bs-ms/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/chemistry-teacher-certification-grades-7-12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/chemistry-biochemistry/chemistry-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/civil-engineering-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/communication-design-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/communication-disorders/bscd/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/persuasive-communication-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/organizational-communication-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/interpersonal-communication-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/communication-studies-teacher-certification-speech-grades-712-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/communication-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/computer-information-systems-bba/cis-business-analytics/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/cis-info-security-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/cis-software-dev-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/computer-information-systems-bba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/computer/computer-science-concentration-engineering-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/computer/computer-science-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/computer/computer-science-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/concrete-industry-management-minor-business-administration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/construction-science-management-residential-construction-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/construction-science-management-minor-business-administration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/consumer-affairs-family-sciences-option-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/consumer-affairs-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/consumer-affairs-teacher-certification-family-sciences-grades-612-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/criminal-justice/criminal-justice-bscj/",
    "http://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-studies-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-performance-choreography-emphasis-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-single-field-teaching-certification-grades-812-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-two-field-teaching-certification-grades-812-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/dance-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/digital-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/digital-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/finance-economics/economics-ba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/finance-economics/economics-bba/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-bilingual-biliteracy-teacher-certficiation-ec6-bilingual-spanish-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-elementary-education-teacher-certficiation-ec6-esl-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-eng-langarts-reading-ss-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-math-science-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-math-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-middle-teacher-certification-science-48-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-ba/",
        "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-secondary-education-teacher-certification-double-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/curriculum-instruction/education-special-education-teacher-certification-special-education-ec12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/electrical-engineering-computer-specialization-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/electrical-engineering-micro-nano-devices-systems-specialization-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/electrical-engineering-networks-communication-systems-specialization-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/electronic-media-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/electronic-media-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/engineering-technology-civil-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/electrical-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/engineering-technology-environmental-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/engineering-technology-manufacturing-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/technology/engineering-technology-mechanical-specialization-bst/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/creative-writing-emphasis-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/film-emphasis-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/english-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/writing-rhetoric-emphasis-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/english/english-ba/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/exercise-sports-science-health-wellness-promotion-clinical-populations-concentration-bess/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/exercise-sports-science-prerehab-sciences/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/exercise-sports-science-teacher-certification-physical-education-grades-ec12-bess/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/exercise-sports-sci-athletic-trainig-bessms/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/fashion-merchandising-plan-presentation-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/fashion-merchandising-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/finance-economics/finance-bba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/french-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/french-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/geographic-information-science-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/resource-environmental-studies-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/urban-regional-planning-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/water-resources-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/geography-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/german-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/german-ba/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/health-fitness-management-business-administration-minor-bess/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/information-management/bshim/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-fine-motor-therapy-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-gross-motor-therapy-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-prechiropractic-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-precomm-disorders-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-preclinical-lab-science-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-prenursing-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-preradiation-therapy-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-prerespiratory-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/health-sciences-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/administration/healthcare-administration-bha/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/history/history-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/history/history-secondary-tcert-ss-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/history/history-two-fields-teacher-certification-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/history/history-ba/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/human-dev-family-sciences-teacher-certification-human-studies-grades-812-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/human-dev-family-sciences-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/geography-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/industrial-engineering-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/counseling-leadership-adult-school-psychology/integrated-studies-bgs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/interior-design-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/relations-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/asian-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/european-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/business-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/interamerican-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/middle-east-african-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/russian-east-european-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/international-studies/travel-tourism-focus-bais/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/journalism-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/journalism-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/management/management-entrepreneurial-studies-concentration-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/management/management-human-resource-concentration-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/management/management-bba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/manufacturing-engineering-mechanical-systems-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/manufacturing-engineering-smart-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/marketing/marketing-professional-sales-concentration-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/marketing/marketing-services-concentration-bba/",
    "https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/marketing/marketing-bba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/mass-communication-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/mathematics/mathematics-teacher-certification-grades-7-12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/mathematics/mathematics-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/mathematics/mathematics-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/ingram-school/mechanical-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/medical-laboratory-science-program/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/microbiology-molecular-genetics-bs/",
    "https://next.mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/music-studies-band-concentration-teacher-certification-grades-ec12-BM/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/music-studies-choral-concentration-teacher-certification-grades-ec12-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/music-studies-mariachi-concentration-teacher-certification-grades-ec12-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/music-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/musical-theatre-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/st-davids-nursing/rn-to-bsn/",
    "https://mycatalog.txstate.edu/graduate/health-professions/st-davids-nursing/leadershipandadminnursing-rn-bsn-msn/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/st-davids-nursing/bsn/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/nutrition-foods-dietetics-track-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/nutrition-foods-teacher-certification-hospitality-nutrition-food-sciences-grades-812-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/family-consumer-sciences/nutrition-foods-bsfcs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-composition-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-guitar-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-instrumental-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-jazz-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-keyboard-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/performance-vocal-concentration-bm/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/philosophy/philosophy-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/photography-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/geography/physical-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/physics/physics-astronomy-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/physics/physics-teacher-certification-physics-math-grades-7-12-bs/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/physics/physics-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/physics/physics-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/political-science/political-science-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/psychology/psychology-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/psychology/psychology-science-minor-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/political-science/public-administration-bpa/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/public-health-applied-epidemiology-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/public-health-health-wellness-coaching-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/public-health-health-equity-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/public-health-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/public-relations-mass-communication-sports-media-concentration-bs/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/public-relations-mass-communication-bs/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/radiation-therapy-program/bsrt/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/recreation-studies-community-recreation-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/recreation-studies-outdoor-recreation-bs/",
    "https://mycatalog.txstate.edu/undergraduate/education/health-human-performance/recreation-studies-therapeutic-recreation-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/philosophy/religious-studies-ba/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/respiratory-care/rrt-to-bsrc/",
    "https://mycatalog.txstate.edu/undergraduate/health-professions/respiratory-care/bsrc/",
    "https://mycatalog.txstate.edu/undergraduate/applied-arts/social-work/social-work-bsw/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/sociology/sociology-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/music/sound-recording-technology-bs/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-hispanic-litandculture-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-secondary-tcert-double-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-profesions-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-linguistics-ba/",
    "https://mycatalog.txstate.edu/undergraduate/liberal-arts/world-languages-literatures/spanish-ba/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/studio-art-teacher-certification-art-grades-ec12-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/studio-art-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-acting-preprofessional-option-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-film-production-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-performance-production-preprofessional-option-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-teacher-certification-grades-ec12-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-technical-production-bfa/",
    "https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/theatre/theatre-ba/",
    "https://mycatalog.txstate.edu/undergraduate/science-engineering/biology/wildlife-certification-biologist-bs/"
]




# Define departments
departments = [
    "Business (McCoy College of Business Administration)",
    "Fine Arts",
    "Liberal Arts",
    "Science and Engineering",
    "Health Professions",
    "Applied Arts",
    "Education"
]

# ============================
# Step 2: Map URLs to Departments
# ============================

# Initialize a dictionary to hold department-wise URLs
dept_urls = defaultdict(list)

# Define patterns to match URLs to departments
department_patterns = {
    "Business (McCoy College of Business Administration)": re.compile(r'/mccoy-business-administration/'),
    "Fine Arts": re.compile(r'/fine-arts-communication/|/art-design/|/theatre/|/music/'),
    "Liberal Arts": re.compile(r'/liberal-arts/|/english/|/history/|/philosophy/|/world-languages-literatures/|/international-studies/|/geography/|/political-science/|/sociology/|/psychology/'),
    "Science and Engineering": re.compile(r'/science-engineering/|/computer/|/mathematics/|/biology/|/chemistry-biochemistry/|/physics/|/ingram-school/|/technology/'),
    "Health Professions": re.compile(r'/health-professions/|/st-davids-nursing/|/radiation-therapy-program/|/respiratory-care/'),
    "Applied Arts": re.compile(r'/applied-arts/|/family-consumer-sciences/|/social-work/'),
    "Education": re.compile(r'/education/|/curriculum-instruction/|/health-human-performance/|/counseling-leadership-adult-school-psychology/')
}

# Assign URLs to departments
for url in urls:
    assigned = False
    for dept, pattern in department_patterns.items():
        if pattern.search(url):
            dept_urls[dept].append(url)
            assigned = True
            break
    if not assigned:
        print(f"URL did not match any department: {url}")

# ============================
# Step 3: Sanitize Collection Names
# ============================

def sanitize_collection_name(department_name):
    """
    Sanitizes the department name to conform to Chroma's collection naming rules.
    """
    # Convert to lowercase
    sanitized = department_name.lower()

    # Replace spaces and hyphens with underscores
    sanitized = re.sub(r'[\s\-]+', '_', sanitized)

    # Remove all characters except alphanumerics and underscores
    sanitized = re.sub(r'[^a-z0-9_]', '', sanitized)

    # Ensure the name starts and ends with an alphanumeric character
    sanitized = re.sub(r'^[^a-z0-9]+', '', sanitized)
    sanitized = re.sub(r'[^a-z0-9]+$', '', sanitized)

    # Ensure the length is between 3 and 63 characters
    if len(sanitized) < 3:
        sanitized = sanitized.ljust(3, '_')
    elif len(sanitized) > 63:
        sanitized = sanitized[:63]

    return sanitized

# ============================
# Step 4: Initialize Embeddings
# ============================

# Replace with your actual OpenAI API key securely
# It's recommended to use environment variables to store sensitive information
# openai_api_key = os.getenv("OPENAI_API_KEY")  # Ensure you set this in your environment

if not openai_api_key:
    raise ValueError("OpenAI API key not found. Please set the OPENAI_API_KEY environment variable.")

embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

# ============================
# Step 5: Define the Processing Function
# ============================

def load_documents(urls: List[str]) -> List:
    """Load documents from a list of URLs."""
    docs = []
    for url in urls:
        loader = WebBaseLoader(url)
        try:
            loaded_docs = loader.load()
            docs.extend(loaded_docs)
            print(f"Loaded {len(loaded_docs)} documents from: {url}")
        except Exception as e:
            print(f"Error loading {url}: {e}", file=sys.stderr)
    return docs

def process_department(department_name, department_urls, embeddings, persist_directory):
    print(f"\nProcessing department: {department_name}")

    # Load documents
    docs = load_documents(department_urls)

    if not docs:
        print(f"No documents loaded for {department_name}. Skipping...")
        return

    # Split documents
    print("Splitting documents...")
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=250,
        chunk_overlap=0,
        separators=["\n\n", "\n", " ", ""]
    )
    doc_splits = text_splitter.split_documents(docs)
    print(f"Total document chunks after splitting: {len(doc_splits)}")

    # Sanitize collection name
    collection_name = sanitize_collection_name(department_name)
    print(f"Sanitized collection name: {collection_name}")

    # Create vector store
    try:
        print(f"Creating vector store for collection: {collection_name}")
        vectorstore = Chroma.from_documents(
            documents=doc_splits,
            embedding=embeddings,
            collection_name=collection_name,
            persist_directory=persist_directory  # Specify your directory
        )
        vectorstore.persist()
        print(f"Vector store created and persisted for {department_name}.")
    except Exception as e:
        print(f"Error creating vector store for {department_name}: {e}", file=sys.stderr)

# ============================
# Step 6: Set Persist Directory
# ============================

# Define your persist directory
# Ensure this directory exists or the script can create it
persist_directory = "chroma_db"  # You can change this to an absolute path if desired

# Create the directory if it doesn't exist
if not os.path.exists(persist_directory):
    try:
        os.makedirs(persist_directory)
        print(f"Created persist directory at: {persist_directory}")
    except Exception as e:
        raise OSError(f"Failed to create persist directory {persist_directory}: {e}")

# ============================
# Step 7: Process All Departments
# ============================

for dept, dept_specific_urls in dept_urls.items():
    process_department(dept, dept_specific_urls, embeddings, persist_directory)


URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#A
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#B
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#C
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#D
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#E
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#F
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#G
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#H
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#I
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#J
URL did not match any department: http://mycatalog.txstate.edu/undergraduate/majors/#K
URL did not match any department: http://my

  vectorstore.persist()


Vector store created and persisted for Business (McCoy College of Business Administration).

Processing department: Fine Arts
Loaded 1 documents from: https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/advertising-mass-communication-sports-media-concentration-bs/
Loaded 1 documents from: https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/journalism-mass/advertising-mass-communication-bs/
Loaded 1 documents from: https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/art-history-ba/
Loaded 1 documents from: https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/art-ba/
Loaded 1 documents from: https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/art-design/communication-design-bfa/
Loaded 1 documents from: https://mycatalog.txstate.edu/undergraduate/fine-arts-communication/studies/persuasive-communication-studies-ba/
Loaded 1 documents from: https://mycatalog.txstate.edu/unde

### querying the different dbs.

In [85]:
def query_department(department_name: str, query: str, persist_directory: str = "chroma_db", top_k: int = 5):
    """
    Queries a specific department's vector store with a given query.

    Parameters:
        department_name (str): The name of the department to query.
        query (str): The search query.
        persist_directory (str): The directory where Chroma stores the vector databases.
        top_k (int): The number of top similar documents to retrieve.

    Returns:
        List of retrieved documents.
    """
    # Sanitize the department name to get the correct collection name
    collection_name = sanitize_collection_name(department_name)

    print(f"Sanitized Collection Name: {collection_name}")

    try:
        # Initialize the Chroma vector store for the specified collection
        vectorstore = Chroma(
            embedding_function=embeddings,
            collection_name=collection_name,
            persist_directory=persist_directory
        )

        # Perform a similarity search
        results = vectorstore.similarity_search(query, k=top_k)

        print(f"\nTop {top_k} results for query in {department_name}:\n")
        for idx, doc in enumerate(results, 1):
            print(f"Result {idx}:")
            print(doc.page_content)
            print("-" * 80)

        return results

    except Exception as e:
        print(f"Error querying {department_name}: {e}")
        return []


In [89]:
# Example Query
department = "Business (McCoy College of Business Administration)"
user_query = "What are the courses for the ISAN degree?"

# Perform the query
query_department(department, user_query)


Sanitized Collection Name: business_mccoy_college_of_business_administration

Top 5 results for query in Business (McCoy College of Business Administration):

Result 1:
ISAN 3325Business Programming II3 ISAN 3360Web Design and Development3 ISAN 3389Programming for Data Processing3 ISAN 3390Agile Project Management 3 ISAN 4318Object Oriented Development3 ISAN 4321Mobile Application Development for Android3
--------------------------------------------------------------------------------
Result 2:
ISAN 3350Information Systems Security3 ISAN 3360Web Design and Development3 ISAN 3389Programming for Data Processing3 ISAN 3390Agile Project Management 3 ISAN 4318Object Oriented Development3 ISAN 4321Mobile Application Development for Android3
--------------------------------------------------------------------------------
Result 3:
test. ISAN Advanced Electives  Course List        Code Title Hours    ISAN 3348Data Communications and Network Architecture3 ISAN 3350Information Systems Security3 

[Document(metadata={'language': 'en', 'source': 'https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/cis-info-security-bba/', 'title': 'Bachelor of Business Administration (B.B.A.) Major in Computer Information Systems (Information Security Concentration) | Texas State University'}, page_content='ISAN\xa03325Business Programming II3 ISAN\xa03360Web Design and Development3 ISAN\xa03389Programming for Data Processing3 ISAN\xa03390Agile Project Management 3 ISAN\xa04318Object Oriented Development3 ISAN\xa04321Mobile Application Development for Android3'),
 Document(metadata={'language': 'en', 'source': 'https://mycatalog.txstate.edu/undergraduate/mccoy-business-administration/computer-information-systems-quantitative-methods/computer-information-systems-bba/', 'title': 'Bachelor of Business Administration (B.B.A.) Major in Computer Information Systems | Texas State University'}, page_content='ISAN\xa03350Information S

In [87]:
def query_all_departments(query: str, persist_directory: str = "chroma_db", top_k: int = 3):
    """
    Queries all departments' vector stores with the given query.

    Parameters:
        query (str): The search query.
        persist_directory (str): The directory where Chroma stores the vector databases.
        top_k (int): The number of top similar documents to retrieve per department.
    """
    for department in departments:
        print(f"\n=== Querying Department: {department} ===")
        query_department(department, query, persist_directory, top_k)


In [88]:
user_query = "What are the courses for the ISAN degree?"
query_all_departments(user_query)



=== Querying Department: Business (McCoy College of Business Administration) ===
Sanitized Collection Name: business_mccoy_college_of_business_administration

Top 3 results for query in Business (McCoy College of Business Administration):

Result 1:
ISAN 3325Business Programming II3 ISAN 3360Web Design and Development3 ISAN 3389Programming for Data Processing3 ISAN 3390Agile Project Management 3 ISAN 4318Object Oriented Development3 ISAN 4321Mobile Application Development for Android3
--------------------------------------------------------------------------------
Result 2:
ISAN 3350Information Systems Security3 ISAN 3360Web Design and Development3 ISAN 3389Programming for Data Processing3 ISAN 3390Agile Project Management 3 ISAN 4318Object Oriented Development3 ISAN 4321Mobile Application Development for Android3
--------------------------------------------------------------------------------
Result 3:
test. ISAN Advanced Electives  Course List        Code Title Hours    ISAN 3348Dat

In [91]:
import chromadb


In [94]:
import chromadb
from chromadb.config import Settings

# Initialize the Chroma client with the specified persist directory
# Create a Settings object with the desired persist_directory
client_settings = Settings(persist_directory="/content/chroma_db")

# Pass the Settings object to the Client constructor
client = chromadb.Client(settings=client_settings)

ValueError: An instance of Chroma already exists for ephemeral with different settings

### Importing the necessary dependecies and initialzing environemnt variables.

In [None]:
import os
from typing import List, Dict, Any

from langchain.agents import AgentExecutor, Tool, initialize_agent, AgentType
from langchain.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI
from langchain.prompts import (
    ChatPromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import LLMChain
from langchain.docstore.document import Document

from dotenv import load_dotenv

# Load environment variables from a .env file if available
load_dotenv()

# Set your OpenAI API key from environment variables
OPENAI_API_KEY = openai_api_key


### Defining the ClassifierAgent Class

In [None]:
# Initialize the vector store and retriever using the existing database
from langchain.vectorstores import Chroma

# Initialize embeddings (assuming the same embeddings were used)
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

# Load the existing vector store
vectorstore = Chroma(
    embedding_function=embeddings,
    collection_name="agentic-rag-chroma"
)

# Initialize retriever
retriever = vectorstore.as_retriever()


  vectorstore = Chroma(


In [None]:
import logging


In [None]:
class ClassifierAgent:
    """Brain: Central classifier agent that routes queries to appropriate department agents"""

    def __init__(self, llm):
        self.llm = llm
        self.classifier_prompt = """You are the brain of the system, responsible for interpreting queries and deciding which body system should handle them.

Departments (Body Systems):
- Business (McCoy College of Business Administration)
- Fine Arts
- Liberal Arts
- Science and Engineering
- Health Professions
- Applied Arts
- Education

If the query requires gathering more information about the user's motivations and goals, initiate sensory input to ask clarifying questions.

Query: {query}

Return ONLY the department name that should handle this query. If uncertain, return "clarify" to ask for more information.
"""
        self.chain = LLMChain(
            llm=self.llm,
            prompt=ChatPromptTemplate.from_template(self.classifier_prompt)
        )

    def classify_query(self, query: str) -> str:
        """Classify which department should handle the query"""
        result = self.chain.run(query=query)
        return result.strip().lower()


In [None]:
class ChatbotInterface:
    """Sensory Organs: Interface that interacts with the user"""

    def __init__(self, advisor_system):
        self.advisor_system = advisor_system
        self.memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

    def start_conversation(self):
        """Start the chatbot interaction"""
        print("Hello! I'm your academic advisor assistant. How can I help you today?")
        while True:
            user_input = input("You: ")
            if user_input.lower() in ["exit", "quit", "stop"]:
                print("Assistant: Goodbye! Feel free to reach out anytime.")
                break
            response = self.advisor_system.process_query(user_input, memory=self.memory)
            print(f"Assistant: {response}")


In [None]:
class DepartmentAgent:
    """Muscles: Agent specialized in handling specific department queries and generating plans"""

    def __init__(self, llm, retriever, tool):
        self.llm = llm
        self.retriever = retriever
        self.tool = tool  # Function to retrieve context

        # Define department-specific tool
        self.tools = [self.tool]

        # Department-specific prompt
        self.prompt = ChatPromptTemplate.from_messages([
            SystemMessagePromptTemplate.from_template(f"""You are an academic advisor specialized in {self.tool.__doc__}.
Use the available tools to provide detailed academic plans, course recommendations, and guidance based on the student's goals and motivations.
Always aim to be specific, accurate, and empathetic.
"""),
            MessagesPlaceholder(variable_name="chat_history"),
            HumanMessagePromptTemplate.from_template("{input}")
        ])

        # Initialize the agent
        self.memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
        self.agent = initialize_agent(
            tools=self.tools,
            llm=self.llm,
            agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
            verbose=False,
            memory=self.memory,
            prompt=self.prompt
        )

    def process_query(self, query: str, memory):
        """Process department-related queries and generate plans"""
        self.agent.memory = memory  # Use the shared memory
        return self.agent.run(query)


In [None]:
class FallbackLayer:
    """Immune System: Handles errors and ensures system robustness"""

    def __init__(self):
        pass  # Placeholder for actual implementation

    def handle_failure(self, query):
        """Handle system failures or unclassified queries"""
        return "I'm sorry, but I need more information to assist you properly. Could you please provide more details about your academic interests or goals?"


In [43]:
class ClassifiedDepartment:
    def __init__(self):
        self.classified_department = None
        self.clarifying_questions_asked = False
        self.initial_query = ""


In [47]:
from langchain.schema import SystemMessage, HumanMessage
import logging

class AdvisorSystem:
    """Main system that coordinates between agents, acting as the human body"""

    def __init__(self, openai_api_key: str):
        # Initialize LLM
        self.llm = ChatOpenAI(
            openai_api_key=openai_api_key,
            temperature=0,
            model_name="gpt-4o-mini",  # Use "gpt-4" or your desired model
        )

        # Initialize the Classifier Agent (Brain)
        self.classifier = ClassifierAgent(self.llm)

        # Initialize Department Agents (Muscles)
        self.business_agent = DepartmentAgent(self.llm, retriever, get_mccoy_business_administration_context)
        self.fine_arts_agent = DepartmentAgent(self.llm, retriever, get_fine_arts_context)
        self.liberal_arts_agent = DepartmentAgent(self.llm, retriever, get_liberal_arts_context)
        self.science_engineering_agent = DepartmentAgent(self.llm, retriever, get_science_and_engineering_context)
        self.health_professions_agent = DepartmentAgent(self.llm, retriever, get_health_professions_context)
        self.applied_arts_agent = DepartmentAgent(self.llm, retriever, get_applied_arts_context)
        self.education_agent = DepartmentAgent(self.llm, retriever, get_education_context)

        # Mapping of departments to their agents
        self.department_agents = {
            "business": self.business_agent,
            "fine arts": self.fine_arts_agent,
            "liberal arts": self.liberal_arts_agent,
            "science and engineering": self.science_engineering_agent,
            "health professions": self.health_professions_agent,
            "applied arts": self.applied_arts_agent,
            "education": self.education_agent,
        }

        # Initialize Fallback Layer (Immune System)
        self.fallback_layer = FallbackLayer()

    def process_query(self, query: str, memory, classified_department):
        """Process incoming queries through the multi-agent system"""

        logging.debug(f"Processing query: {query}")

        # Step 1: Check if department is already classified
        if classified_department.classified_department:
            department = classified_department.classified_department
            logging.debug(f"Using stored classified department: {department}")
        else:
            # Classify the query
            department = self.classifier.classify_query(query)
            logging.debug(f"Classified Department: {department}")

            # If classification is uncertain, ask clarifying questions
            if department == "clarify":
                logging.debug("Department classification unclear, generating clarifying questions.")
                # Check if clarifying questions have already been asked
                if not classified_department.clarifying_questions_asked:
                    classified_department.clarifying_questions_asked = True
                    classified_department.initial_query = query  # Store the initial query
                    clarifying_questions = self.generate_clarifying_questions(query)
                    return clarifying_questions
                else:
                    # Reclassify using the additional information
                    combined_query = classified_department.initial_query + " " + query
                    department = self.classifier.classify_query(combined_query)
                    logging.debug(f"Reclassified Department after clarification: {department}")
                    if department == "clarify":
                        # Use the Fallback Layer if still unclear
                        return self.fallback_layer.handle_failure(query)
                    else:
                        # Store the classified department
                        classified_department.classified_department = department
            else:
                # Store the classified department
                classified_department.classified_department = department

        # Step 2: Route to appropriate agent based on classification
        if department in self.department_agents:
            logging.debug(f"Routing query to {department} agent.")
            agent = self.department_agents[department]
            return agent.process_query(query, memory)
        else:
            # Use the Fallback Layer to handle unclassified queries
            logging.debug("Department not recognized, using fallback layer.")
            return self.fallback_layer.handle_failure(query)

    def generate_clarifying_questions(self, query):
        """Generate two clarifying questions to understand the student's goals"""
        messages = [
            SystemMessage(content="As an academic advisor assistant, your goal is to understand the student's motivations and goals for this semester."),
            HumanMessage(content=f'Previous query: "{query}"'),
            SystemMessage(content="Ask two open-ended, empathetic questions to clarify what they want to achieve. Your response should be the two questions only.")
        ]
        response = self.llm(messages).content.strip()
        return response


In [48]:
class ChatbotInterface:
    """Sensory Organs: Interface that interacts with the user"""

    def __init__(self, advisor_system):
        self.advisor_system = advisor_system
        self.memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
        self.classified_department = ClassifiedDepartment()  # Stores the classified department

    def start_conversation(self):
      """Start the chatbot interaction"""
      print("Hello! I'm your academic advisor assistant. How can I help you today?")
      while True:
          user_input = input("You: ")
          if user_input.lower() in ["exit", "quit", "stop"]:
              print("Assistant: Goodbye! Feel free to reach out anytime.")
              break
          response = self.advisor_system.process_query(
              user_input,
              memory=self.memory,
              classified_department=self.classified_department
          )
          print(f"Assistant: {response}")




In [54]:
if __name__ == "__main__":
    # Initialize the Advisor System
    advisor_system = AdvisorSystem(openai_api_key=OPENAI_API_KEY)

    # Start the Chatbot Interface
    chatbot = ChatbotInterface(advisor_system)
    chatbot.start_conversation()


Hello! I'm your academic advisor assistant. How can I help you today?
You: I am a sophomore CS student. What courses should I take? 
Assistant: As a sophomore Computer Science student, you should consider taking the following courses: CS 1428 (Foundations of Computer Science I), PHYS 2426 or 2326/2126 (Life and Physical Sciences), and EE 2400 (Circuits I). Additionally, you may want to explore electives and courses that fulfill other component requirements such as Language, Philosophy, and Culture, as well as American History.
You: No I want to take my major classes. 
Assistant: As a sophomore CS student, you should focus on taking core major classes that build on your foundational knowledge. Common courses to consider include:

1. **Data Structures and Algorithms** - This course is essential for understanding how to organize and manipulate data efficiently.
2. **Computer Organization** - Learn about the hardware components of computers and how they interact with software.
3. **Discret

KeyboardInterrupt: Interrupted by user

### Testing systems


In [52]:
query = "Provide the ISAN Advanced Electives"
results = retriever.get_relevant_documents(query)
for doc in results:
    print(doc.page_content)


or ISAN Advanced Elective3 ANLY Advanced Electives6Free Electives3 15 12Total Hours: 120   1   Credit can be earned by successfully passing a test. Students must pay a fee to take the test.  Course List        Code Title Hours    ISAN Advanced
test. ISAN Advanced Electives  Course List        Code Title Hours    ISAN 3348Data Communications and Network Architecture3 ISAN 3350Information Systems Security3 ISAN 3389Programming for Data Processing3 ISAN 4318Object Oriented Development3
4318, ISAN 4321, ISAN 4322, ISAN 4349, ISAN 4373B, ISAN 4373C6ISAN Advanced Electives3 Restricted Business Elective3Free Electives3 15 12Total Hours: 120   1  Credit can be earned by successfully passing a test. Students must pay a fee to take the
can be earned by successfully passing a test. Students must pay a fee to take the test. ISAN Advanced Electives  Course List        Code Title Hours    ISAN 3325Business Programming II3 ISAN 3348Data Communications and Network Architecture3
