### Trying out RAG with ollama and chromadb to informally access the text material

(ollama is recommended over hugging face for local experimentation)

#### Preliminaries

In [3]:
import gradio as gr
import ollama
import chromadb
import html

chromaclient = chromadb.PersistentClient(path="/home/imagery/crc5rag")
collection = chromaclient.get_collection(name="crc5rag")

Choose a distilled deepseek LLM appropriate to your local computer, or the cloud version if you have a deepseek account. In the latter case, be sure that you are signed in to ollama.

In [2]:
#deepseek = 'deepseek-r1:7b'
#deepseek = 'deepseek-r1:14b'
deepseek = 'deepseek-v3.1:671b-cloud'

Ollama uses a docker-like syntax to pull LLMs.

Open a terminal window in the Launcher and enter

    ollama serve &
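
If the pull cell below fails, it can help to confirm first that the server started above is reachable. The following is a minimal sketch that assumes the server listens on ollama's default local address `http://127.0.0.1:11434` and answers a plain GET with status 200. The helper name `ollama_is_up` is made up for this illustration and is not part of the ollama package.

```python
import http.client
import urllib.error
import urllib.request


def ollama_is_up(base_url="http://127.0.0.1:11434", timeout=2.0):
    """Return True if a GET on base_url succeeds with HTTP status 200.

    Assumption: the ollama server uses its default address; adjust base_url
    if OLLAMA_HOST was set to something else.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a malformed reply: treat as "not up"
        return False


print("ollama reachable:", ollama_is_up())
```

A `False` result usually means that `ollama serve` is not running yet or is bound to a different address.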
    
Then run the following cell (will take a few minutes the first time):

In [2]:
%%capture
!ollama pull nomic-embed-text
!ollama pull $deepseek

If you chose the cloud version of deepseek then open another terminal window and run

    ollama signin

and follow the instructions.   

#### Execute a RAG query with deepseek-r1 on the textbook contents

If you want to re-start the cell, bump the server port by one (e.g. `... .launch(server_name="0.0.0.0", server_port=7861)`).
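
Instead of bumping the port by hand, the next free port can be looked up before launching. This is only a sketch: `find_free_port` is a hypothetical helper, not something provided by gradio, and the search range of 20 ports starting at 7860 is an arbitrary choice.

```python
import socket


def find_free_port(start=7860, tries=20, host="0.0.0.0"):
    """Return the first port in [start, start + tries) that can be bound on host."""
    for port in range(start, start + tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is already in use, try the next one
            return port
    raise RuntimeError(f"no free port found in {start}..{start + tries - 1}")


port = find_free_port()
print(port)
```

The result could then be passed as `server_port=port` to `.launch(...)`. Another process may still grab the port between the check and the launch, so this reduces manual retries rather than ruling them out.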

In [4]:
def ragask(query):
    # Embed the query
    queryembed = ollama.embed(model="nomic-embed-text", input=query)['embeddings']
    # Retrieve related documents (four 512-token chunks, matching n_results=4)
    relateddocs = '\n\n'.join(collection.query(query_embeddings=queryembed, n_results=4)['documents'][0])
    # Generate a response
    prompt = f"Answer the question: {query}, referring to the following text as a resource: {relateddocs}"
    response = ollama.generate(model=deepseek, prompt=prompt, stream=False)['response']   
    # Escape HTML special characters (&, <, >, quotes) so the response is displayed literally in the Markdown output
    return html.escape(response)

# Launch Gradio Interface
gr.Interface(fn=ragask,inputs=gr.Textbox(lines=2, placeholder="Enter your question here..."),
             outputs="markdown",
             description="Ask questions about the book contents",
             title="Image Analysis, Classification and Change Detection in Remote Sensing").launch(server_name="0.0.0.0", server_port=7860)

* Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
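
The prompt assembly and the escaping step in `ragask` can be tried out without a running ollama server or a chromadb collection. The sketch below reuses the same prompt template. `build_prompt`, the two sample chunks, and the sample response string are made up for illustration, and the idea that a model response may contain tag-like text such as `<think>` is an assumption.

```python
import html


def build_prompt(query, docs):
    """Join retrieved chunks and wrap them in the prompt template used by ragask."""
    relateddocs = '\n\n'.join(docs)
    return f"Answer the question: {query}, referring to the following text as a resource: {relateddocs}"


# Hypothetical stand-ins for the chunks returned by collection.query(...)
docs = ["Chunk one.", "Chunk two."]
prompt = build_prompt("What is PCA?", docs)
print(prompt)

# html.escape replaces &, < and > (and quotes) by HTML entities,
# so tag-like text is shown literally instead of being interpreted as markup
escaped = html.escape("<think>a & b</think>")
print(escaped)  # &lt;think&gt;a &amp; b&lt;/think&gt;
```

Because the whole response is escaped, any `<` or `>` in the answer shows up literally in the rendered output.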


