In [20]:
import glob
import os
from langchain_community.document_loaders import UnstructuredMarkdownLoader
from langchain_ollama import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import PGVector
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_ollama.chat_models import ChatOllama
from langchain_core.runnables import RunnablePassthrough
from langchain.retrievers.multi_query import MultiQueryRetriever
from IPython.display import display, Markdown

from dotenv import load_dotenv
load_dotenv(dotenv_path='.env')

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')

In [None]:
md_path = "/Users/donaldparker/Projects/lab/rag/ollama-pgvector/wikis/itrvl.wiki/*.md"
print(f"Loading files from: {md_path}")
paths = glob.glob(md_path)

docs = [UnstructuredMarkdownLoader(file_path=path).load() for path in paths]
docs_list = [item for sublist in docs for item in sublist]
print(f"Loaded {len(docs_list)} documents")

In [None]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(docs_list)
print(f"Text split into {len(chunks)} chunks")
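
To illustrate what `chunk_size=1000` and `chunk_overlap=200` mean, here is a minimal sketch of a plain fixed-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on separators such as paragraphs and lines); the function name `sliding_window_chunks` is hypothetical and exists only for this illustration.

```python
def sliding_window_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters.

    Simplified illustration only: no separator awareness, unlike the
    recursive splitter used in the notebook.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(2500))
pieces = sliding_window_chunks(sample)
print([len(p) for p in pieces])
```

With these settings each window starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next. That overlap is what keeps a sentence that straddles a boundary retrievable from at least one chunk.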

In [None]:
collection_name="rag-itrvl-wiki"
pgvector_db = PGVector.from_documents(
    collection_name=collection_name,
    documents=chunks,
    embedding=OllamaEmbeddings(model="nomic-embed-text"),
    use_jsonb=True
)

print("Vector database created successfully")
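
The cell above passes no connection details, so the store presumably picks them up from the environment loaded by `load_dotenv` (the `langchain_community` `PGVector` class falls back to the `PGVECTOR_CONNECTION_STRING` variable when no `connection_string` argument is given). The sketch below shows one way such a string could be assembled. The variable names `PGUSER`, `PGPASSWORD`, `PGHOST`, `PGPORT`, and `PGDATABASE`, their defaults, and the helper name `build_pg_connection_string` are assumptions for illustration, not values taken from this notebook's `.env` file.

```python
import os
from urllib.parse import quote_plus


def build_pg_connection_string(env=os.environ) -> str:
    """Assemble a SQLAlchemy-style PostgreSQL URL from environment values.

    The variable names and defaults below are hypothetical; adjust them to
    whatever your own .env file defines.
    """
    user = env.get("PGUSER", "postgres")
    password = env.get("PGPASSWORD", "")
    host = env.get("PGHOST", "localhost")
    port = env.get("PGPORT", "5432")
    database = env.get("PGDATABASE", "postgres")

    # Escape characters such as "@" or ":" so they cannot break the URL.
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# The result could be passed explicitly, for example:
#   PGVector.from_documents(..., connection_string=build_pg_connection_string())
print(build_pg_connection_string({"PGUSER": "rag", "PGPASSWORD": "secret"}))
```

Passing the string explicitly makes the cell's dependency on the database visible, at the cost of a little more setup code.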

In [None]:
local_model = "llama3.2:latest"
llm = ChatOllama(model=local_model)

query_prompt = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Your task is to generate 2
    different versions of the given user question to retrieve relevant documents from
    a vector database. By generating multiple perspectives on the user question, your
    goal is to help users overcome some of the limitations of distance-based
    similarity search. Provide these alternative questions separated by newlines.
    Original question: {question}"""
)

retriever = MultiQueryRetriever.from_llm(
    pgvector_db.as_retriever(),
    llm,
    prompt=query_prompt
)
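
Conceptually, multi-query retrieval rewrites the question several ways, runs each rewrite against the vector store, and returns the de-duplicated union of the hits. The sketch below shows that idea with stand-in functions; it is not the `MultiQueryRetriever` implementation, and `multi_query_retrieve`, `fake_variants`, and `fake_retrieve` are hypothetical names used only here.

```python
from typing import Callable


def multi_query_retrieve(
    question: str,
    generate_variants: Callable[[str], list[str]],
    retrieve: Callable[[str], list[str]],
) -> list[str]:
    """Retrieve documents for every query variant and merge the unique results."""
    merged: list[str] = []
    seen: set[str] = set()
    for query in generate_variants(question):
        for doc in retrieve(query):
            if doc not in seen:  # keep the first occurrence, preserve order
                seen.add(doc)
                merged.append(doc)
    return merged


# Stand-ins for the LLM rewrite step and the vector store lookup.
def fake_variants(question: str) -> list[str]:
    return [f"{question} (rephrased)", f"{question} (broader)"]


def fake_retrieve(query: str) -> list[str]:
    index = {
        "services (rephrased)": ["client-portal.md", "media.md"],
        "services (broader)": ["media.md", "onboarding.md"],
    }
    return index.get(query, [])


print(multi_query_retrieve("services", fake_variants, fake_retrieve))
```

Because each variant can surface different neighbours, the merged list is usually broader than a single similarity search, which is the limitation the prompt above asks the model to work around.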

In [None]:
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

In [None]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
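
In the chain above the retriever's output (a list of document objects) is placed into `{context}` as-is, so the prompt receives the list's string representation, metadata included. A common alternative is to join only the page text first. The sketch below assumes documents expose a `page_content` attribute, as LangChain `Document` objects do; the `Doc` dataclass is a hypothetical stand-in so the example runs without LangChain installed.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Minimal stand-in for a LangChain Document (illustration only)."""
    page_content: str


def format_docs(docs) -> str:
    """Join the text of each retrieved document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


retrieved = [Doc("Itinerary building overview."), Doc("Client Portal features.")]
print(format_docs(retrieved))

# In the notebook's chain this helper could be wired in as, for example:
#   {"context": retriever | format_docs, "question": RunnablePassthrough()}
```

Feeding the model clean text rather than a stringified list tends to shorten the prompt and removes metadata noise, though whether it changes answer quality here is untested.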

In [None]:
def chat_with_documents(question):
    """
    Ask a question about the loaded Markdown wiki documents using our chain
    and render the answer as Markdown in the notebook.
    """
    display(Markdown(chain.invoke(question)))

In [32]:
chat_with_documents("What are services that are provided by itrvl?")

Based on the context, it appears that itrvl provides a range of services related to trip planning and management. The documents mention various features and functionalities, such as:

1. Itinerary building: Allowing users to create customized itineraries with links, photos, videos, and other media.
2. Saved Itineraries: Enabling users to save their itineraries for future reference or to share with others.
3. Client Portal: Providing a platform for clients to view and manage their itineraries, including access to photos, videos, and accommodation stays.
4. Media components: Allowing agents to upload and manage media content, such as photos and videos, for activities.
5. Agency onboarding: Facilitating the setup of new agency accounts, inviting lead agents, and allowing them to log in and agree to terms.
6. User stories: Defining the features and functionalities of itrvl, including user experience, functionality, and acceptance criteria.

Some services that can be inferred from these documents include:

1. Trip planning and management
2. Itinerary building and customization
3. Media content management (photos, videos, etc.)
4. Client portal access for clients to view and manage their itineraries
5. Agency onboarding and management
6. User experience and functionality enhancements

However, the documents do not explicitly mention all services provided by itrvl. Further information would be needed to determine a comprehensive list of services offered by itrvl.

In [33]:
chat_with_documents("Can you describe services iTrvl provides?  Please provide 2 paragraphs of each.")

**General Info Services**

iTrvl appears to provide various services for travel agents, tour operators, and other industry stakeholders. One primary service is the General Info tab, which allows agents to access and manage essential information about destinations, activities, and accommodations. This includes features such as uploading photos and videos, deleting uploaded content, and reordering or rearranging media assets. The General Info tab provides a centralized platform for managing this type of information, making it easier for agents to stay organized and up-to-date.

Another key service provided by iTrvl is the Media component, which enables agents to upload and manage multimedia content related to their destinations and activities. This includes photos, videos, and other digital assets that can be used to showcase tour offerings, promote products, or enhance customer engagement. The Media tab provides a convenient interface for uploading, editing, and managing this content, ensuring that it is easily accessible and easily shared with clients.

**Client Portal Services**

iTrvl also appears to provide a Client Portal service, which allows clients to access and manage their itinerary information online. This portal enables clients to view, edit, and save their trip details, including activities, accommodations, and other travel arrangements. The Client Portal reflects changes to the client's itinerary in real-time, ensuring that all stakeholders are aware of any updates or changes. Additionally, the portal provides a user-friendly interface for clients to communicate with agents, submit requests, or request changes to their itinerary.

The Client Portal service also includes features such as sorting and filtering options for photos and videos, making it easier for clients to find specific content related to their trip. Overall, the Client Portal is designed to enhance client satisfaction by providing a convenient, user-friendly interface for managing travel arrangements and staying connected with agents. By leveraging this portal, iTrvl aims to build stronger relationships with its clients and provide a more personalized experience.

**Additional Services**

In addition to General Info and Client Portal services, iTrvl may offer other features and functionalities to support the business operations of its customers. These could include advanced reporting tools, customizable dashboards, or integration with other third-party systems. The specific services offered by iTrvl would depend on the needs and requirements of its target market, as well as the company's overall goals and objectives.

Some potential additional services that iTrvl may provide include:

* Integration with existing CRM systems to streamline sales and marketing efforts
* Customizable reporting templates to help agents analyze sales data and track key performance indicators (KPIs)
* Advanced search functionality to enable clients to quickly find specific destinations, activities, or accommodations
* Automated workflows and notifications to ensure seamless communication between agents and clients

By offering a range of services and functionalities, iTrvl can help its customers streamline their operations, improve client satisfaction, and stay competitive in the market.

In [34]:
chat_with_documents("What are the steps of the code review process?")

Based on the provided documents, the steps of the code review process can be summarized as follows:

1. Tasks to be done before requesting for a Code Reviewer:
 * PR Branch name is ready (e.g. dev-1234)
 * Prefix title of the Pull Request using semantic-release's format
 * PR has latest master updates
 * All automated tests are passing
 * Code coverage meets or exceeds threshold
 * Acceptance criteria has been met
 * PR has reference to ZenHub issue number
 * No red/yellow box errors
 * Documentation is updated

2. Code Reviewer Checklist:
 * Product Code conforms to standards (wiki link - tbd)
 * Test code conforms to standards (wiki link - tbd)
 * Code coverage meets threshold
 * Documentation has been updated
 * Commits are squashed if necessary
 * Branch name conforms to standards

3. Additional steps mentioned in the documents:
 * The PR Owner should annotate the PR to clarify intentions and highlight specific areas for feedback.
 * The PR Owner should clearly describe the change intention in commit messages.
 * After updating a PR, the PR Owner should re-request review from the code reviewer.

It's worth noting that these steps are outlined in the Code Reviews.md document, which provides guidelines for both PR Owners and Code Reviewers.

In [23]:
pgvector_db.delete_collection()
print("Vector database deleted successfully")

Vector database created successfully


The documents provided appear to be from a developer's guild meeting, specifically from April 17, 2024, and August 20, 2024. The first document discusses proposed agendas for the meeting, including proposals from Don Parker and Eddi regarding API testing and models. However, the content of this document seems incomplete as it only provides snippets of text.

The second document, "Quoting-Tests.md", outlines three types of test windows: transfers between Botswana and Zimbabwe, transfers between Johannesburg and Kruger, and transfers between Cape Town and Kruger. The document appears to be a list of test cases or scenarios that need to be implemented.

The third and fourth documents, both titled "User-Stories.md", provide information on how to write user stories in the context of software development. They outline five key elements: title/name, description, acceptance criteria and tests, attachments, and implementation notes. The documents seem to be providing guidance on how to structure and implement user stories in a way that is clear, concise, and comprehensive.