In [1]:
pip install ollama

Note: you may need to restart the kernel to use updated packages.


In [2]:
pip install langchain langchain_community chromadb unstructured qdrant_client

Note: you may need to restart the kernel to use updated packages.


OK, now let's build our RAG pipeline. Here we use the cloud resources. If you want to use a local Ollama server instead, you can create a client pointing at localhost:11434, or alternatively just set client = ollama.

In [51]:
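ollamaBase = 'http://ollama-alibo-gpu-testing.apps.private.okd4.teh-2.snappcloud.io/'

import ollama
from ollama import Client

# Client for the remote Ollama server; for a local server use host='http://localhost:11434'
client = Client(host=ollamaBase)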


In [52]:
from langchain_community.document_loaders import DirectoryLoader, UnstructuredMarkdownLoader

The docs directory from the SnappCloud GitLab documentation is expected to have been copied to the path used below (./docs).

In [54]:
result = client.list()
# Print the names of the models available on the server
print([model['name'] for model in result['models']])

['llama3:70b', 'llama3:latest', 'llama3:instruct']
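
The listing above only shows llama3 variants, so before running the pipeline it can be useful to check that every model you plan to use is present on the server you are querying. The following is a minimal sketch, assuming the `{'models': [{'name': ...}]}` response shape that this notebook indexes into. `missing_models` is a hypothetical helper, not part of the ollama package, and treating a bare name as its `:latest` tag is an assumption.

```python
def missing_models(listing, required):
    """Return the names in `required` that are absent from an Ollama model listing."""
    available = {m['name'] for m in listing['models']}
    missing = []
    for name in required:
        # Assumption: a bare name such as 'llama3' refers to the 'llama3:latest' tag.
        tagged = name if ':' in name else f"{name}:latest"
        if tagged not in available:
            missing.append(name)
    return missing


# Sample listing copied from the output above.
sample = {'models': [{'name': 'llama3:70b'},
                     {'name': 'llama3:latest'},
                     {'name': 'llama3:instruct'}]}
print(missing_models(sample, ['llama3', 'mxbai-embed-large']))
```

With a live server, `missing_models(client.list(), [llmModel, embeddingModel])` would perform the same check, provided the response has the shape assumed here.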


In [64]:
loader = DirectoryLoader('./docs', glob="**/*.md", loader_cls=UnstructuredMarkdownLoader)
embeddingModel = 'mxbai-embed-large'
llmModel = 'llama3'
systemPrompt = "You are a helpful assistant that answers questions using the context provided. Cite the relevant documents every time you provide an answer."

In [82]:
docs = loader.load()
print(list(map(lambda x: x.metadata['source'], docs)))

['docs/vpn-access.md', 'docs/servicedesk.md', 'docs/overview.md', 'docs/support.md', 'docs/terms.md', 'docs/reference/api-documentation.md', 'docs/reference/cli-documentation.md', 'docs/storage/storage-volumes.md', 'docs/storage/volume-snapshots.md', 'docs/storage/object-store/aws-s3-sdk.md', 'docs/storage/object-store/backup-and-restore.md', 'docs/storage/object-store/overview.md', 'docs/storage/object-store/aws-s3-cli.md', 'docs/storage/object-store/s3-cli.md', 'docs/storage/object-store/s3-operator.md', 'docs/storage/object-store/advanced-features/bucket-lifecycle.md', 'docs/storage/object-store/advanced-features/multipart-upload.md', 'docs/storage/object-store/advanced-features/object-versioning.md', 'docs/storage/object-store/advanced-features/bucket-policy.md', 'docs/chaos/chaos.md', 'docs/openstack/IaC.md', 'docs/openstack/overview.md', 'docs/openstack/migrate-1g-to-10g.md', 'docs/openstack/images.md', 'docs/openstack/networking.md', 'docs/networking/loadbalancer-as-a-service.md

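We can skip the splitting step, as the documents are already split into separate files. If you want to split them further, you can use the following code:

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split each document into chunks of up to 2000 characters, with 200 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
print(len(splits))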
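
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a simplified sketch that cuts fixed-width character windows. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on separators such as paragraph and line breaks), and `split_with_overlap` is a hypothetical name used only for illustration.

```python
def split_with_overlap(text, chunk_size=2000, chunk_overlap=200):
    """Cut `text` into windows of `chunk_size` characters that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, which is the role `chunk_overlap` plays: context at a chunk boundary is not lost entirely.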

Now we use our embedding model to create the embeddings, which we will then store in Chroma.

In [66]:
%%time
from langchain_community.embeddings import OllamaEmbeddings
# No base_url is passed, so the class default (a local Ollama server) is used
embeddings = OllamaEmbeddings(model=embeddingModel)

CPU times: user 25 µs, sys: 1 µs, total: 26 µs
Wall time: 27.7 µs


We should save the embeddings in a vector database such as Chroma or Qdrant. Examples for both are given below: keep the one you want to use as a code cell and set the other cell to Markdown.

In [67]:
from langchain_community.vectorstores import Chroma
# Embed the documents and persist the resulting vectors under ./chromadb
vectorstore = Chroma.from_documents(documents=docs, embedding=embeddings, persist_directory='./chromadb')
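
Conceptually, a vector store keeps one vector per document and answers a query by returning the documents whose vectors are closest to the query vector. The toy sketch below illustrates that idea with cosine similarity. It is not Chroma's implementation (Chroma uses its own index and distance settings), the three-dimensional vectors are made up, and `top_k` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """stored: list of (source, vector) pairs; return the k most similar sources."""
    ranked = sorted(stored,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [source for source, _ in ranked[:k]]


# Made-up vectors; real embeddings have hundreds or thousands of dimensions.
stored = [
    ('docs/vpn-access.md', [1.0, 0.0, 0.0]),
    ('docs/support.md', [0.0, 1.0, 0.0]),
    ('docs/terms.md', [0.7, 0.7, 0.0]),
]
print(top_k([0.9, 0.1, 0.0], stored, k=2))  # ['docs/vpn-access.md', 'docs/terms.md']
```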

In [68]:
retriever = vectorstore.as_retriever()

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

def ollama_llm(question, context):
    formatted_prompt = f"Question: {question}\n\nContext: {context}"
    messages = [
        {'role': 'system', 'content': systemPrompt},
        {'role': 'user', 'content': formatted_prompt},
    ]
    response = client.chat(model=llmModel, options={'temperature': 0}, messages=messages)
    return response['message']['content']
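
The system prompt asks the model to cite the relevant documents, but `format_docs` passes only `page_content`, so the file names never reach the model. One possible variation, sketched here under assumptions, is to prefix each chunk with its source path. `Doc` is a stand-in for a LangChain document, `format_docs_with_sources` is a hypothetical name, and the sample texts are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document: text plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs_with_sources(docs):
    """Join documents into one context string, labelling each with its source path."""
    parts = []
    for doc in docs:
        source = doc.metadata.get('source', 'unknown')
        parts.append(f"[source: {source}]\n{doc.page_content}")
    return "\n\n".join(parts)


# Invented sample texts; only the source paths come from the loaded docs list.
sample_docs = [
    Doc("Use the VPN to reach private services.", {'source': 'docs/vpn-access.md'}),
    Doc("Open a ticket to ask for help.", {'source': 'docs/servicedesk.md'}),
]
print(format_docs_with_sources(sample_docs))
```

Swapping this in for `format_docs` inside the RAG chain would let the model see which file each passage came from; whether it then cites reliably still depends on the model.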


In [84]:
# Define the RAG chain
def rag_chain(question):
    retrieved_docs = retriever.invoke(question)
    formatted_context = format_docs(retrieved_docs)
    print("using from context",list(map(lambda x: x.metadata['source'],retrieved_docs)))
    return ollama_llm(question, formatted_context)

In [85]:
%%time
result = rag_chain("What address should I use for jaeger agent on snappcloud?")
print(result)

using from context ['../mlscratchpad/README.md', '../mlscratchpad/README.md', '../mlscratchpad/README.md', '../anaconda3/pkgs/distributed-2023.11.0-py311h06a4308_0/lib/python3.11/site-packages/distributed-2023.11.0.dist-info/AUTHORS.md']
To set up the Jaeger Agent on SnapCloud, you can follow these steps:

1. First, make sure you have the Jaeger Agent installed and running on your machine.
2. Next, you need to configure the Jaeger Agent to send spans to SnapCloud. You can do this by setting the `agent.host` environment variable to the address of your SnapCloud instance.

According to the Jaeger documentation [1], the default host for SnapCloud is `localhost:14250`. However, if you are running your Jaeger Agent on a different machine or in a containerized environment, you may need to use a different address.

For example, if you are running your Jaeger Agent in a Kubernetes deployment, you can set the `agent.host` environment variable to the hostname of your SnapCloud instance. You can 

In [86]:
%%time
result = rag_chain("What are the external IPs for snappcloud that i need to whitelist?")
print(result)

using from context ['docs/cache-proxy/cache-proxy.md', '../mlscratchpad/docs/cache-proxy/cache-proxy.md', 'docs/cache-proxy/cache-proxy.md', 'docs/cache-proxy/cache-proxy.md']
According to the SnapCloud documentation, the external IPs that you need to whitelist are:

* `34.216.235.139`
* `34.216.235.140`
* `34.216.235.141`
* `34.216.235.142`

These IP addresses are used by SnapCloud's caching proxy services, including Go Cache Proxy, Container Cache Proxy, Alpine Cache Proxy, NPM Cache Proxy, and HTTP Proxy.

Please note that these IPs may be subject to change, so it's always a good idea to check the SnapCloud documentation for any updates or changes.
CPU times: user 16.7 ms, sys: 2.96 ms, total: 19.6 ms
Wall time: 11.9 s
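
The `using from context` lines in these outputs include paths outside `docs/`, such as `../mlscratchpad/README.md` and files under `../anaconda3/pkgs/`, which suggests the vector store holds more than the documents loaded in this notebook. One possible cause, offered only as a guess, is a `./chromadb` directory persisted from an earlier run. A minimal sketch of dropping such hits after retrieval follows; `Doc` is a stand-in for a LangChain document and `keep_sources_under` is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document: text plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def keep_sources_under(docs, prefix='docs/'):
    """Keep only documents whose 'source' metadata starts with `prefix`."""
    return [d for d in docs if d.metadata.get('source', '').startswith(prefix)]


# Source paths taken from the outputs in this notebook; the text is a placeholder.
retrieved = [
    Doc("placeholder text", {'source': 'docs/cache-proxy/cache-proxy.md'}),
    Doc("placeholder text", {'source': '../mlscratchpad/docs/cache-proxy/cache-proxy.md'}),
    Doc("placeholder text", {'source': '../mlscratchpad/README.md'}),
]
kept = keep_sources_under(retrieved)
print([d.metadata['source'] for d in kept])  # ['docs/cache-proxy/cache-proxy.md']
```

Filtering after retrieval only hides the symptom. If stale entries are the cause, rebuilding the store from a clean directory would be the more thorough fix.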


In [87]:
result = rag_chain("I need to increase the quota for my project, how can I do this?")
print(result)

using from context ['../anaconda3/pkgs/mistune-2.0.4-py311h06a4308_0/info/test/tests/fixtures/include/hello.md', '../anaconda3/pkgs/mistune-2.0.4-py311h06a4308_0/info/test/tests/fixtures/include/hello.md', '../anaconda3/pkgs/mistune-2.0.4-py311h06a4308_0/info/test/tests/fixtures/include/hello.md', '../jan/extensions/@janhq/tensorrt-llm-extension/node_modules/core-util-is/README.md']
To increase the quota for your project, you'll need to follow the guidelines and procedures outlined by your organization or institution. The specific steps may vary depending on where you are and what type of project it is.

However, I can provide some general guidance based on common practices:

1. **Check if you have the necessary permissions**: Before increasing the quota, ensure that you have the required permissions to do so. This might involve checking with your supervisor, manager, or project lead.
2. **Review the current quota and usage**: Understand how much of the quota is currently being used an