In [2]:
import ollama
import chromadb

In [3]:
emb_model = "mxbai-embed-large"
llm = "llama3"

In [4]:
%load_ext autoreload
%autoreload 2

In [5]:
#load the processed chunks
with open('processed_chunks.txt', 'r') as f:
    data = f.read().splitlines()

In [6]:
#remove any empty or whitespace-only lines
data = [line for line in data if line.strip()]

In [7]:
#ChromaDB saving data to the 'rag_db'
client = chromadb.PersistentClient(path="rag_db")
collection = client.get_or_create_collection(name="evs_docs")

#convert each chunk to an embedding and store it in the collection
#upsert (rather than add) so that re-running this cell does not collide with IDs already saved in 'rag_db'
for i, chunk in enumerate(data):
    embedding = ollama.embed(model=emb_model, input=chunk)['embeddings'][0]

    collection.upsert(
        ids=[str(i)],
        embeddings=[embedding],
        documents=[chunk]
    )

print(f"Successfully added {len(data)} chunks to the ChromaDB collection.")

Successfully added 96 chunks to the ChromaDB collection.
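
The cell above sends one embedding request per chunk. A minimal sketch of grouping chunks into batches is shown below. The helper `embed_in_batches`, its `embed_fn` parameter, and the default `batch_size` are illustrative names and values, not part of the original notebook.

```python
from typing import Callable, List, Sequence


def embed_in_batches(
    chunks: Sequence[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int = 16,  # hypothetical default, tune for your machine
) -> List[List[float]]:
    """Embed chunks in groups of batch_size and return one vector per chunk, in order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    vectors: List[List[float]] = []
    for start in range(0, len(chunks), batch_size):
        batch = list(chunks[start:start + batch_size])
        batch_vectors = embed_fn(batch)
        # Guard against an embedding backend that drops or merges inputs.
        if len(batch_vectors) != len(batch):
            raise ValueError("embed_fn returned a different number of vectors than inputs")
        vectors.extend(batch_vectors)
    return vectors
```

In this notebook, `embed_fn` could be `lambda batch: ollama.embed(model=emb_model, input=batch)['embeddings']`, assuming the installed ollama client accepts a list of strings for `input`. The returned vectors could then go to the collection in a single call with matching `ids`, `embeddings`, and `documents` lists.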


In [8]:
def retrieve(query, top_n=3):
  """Retrieves top_n relevant documents from ChromaDB based on the query."""
  #embedding the query
  query_embedding = ollama.embed(model=emb_model, input=query)['embeddings'][0]

  #queries the collection
  results = collection.query(
      query_embeddings=[query_embedding],
      n_results=top_n
  )

  return results['documents'][0]
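
`retrieve` always returns the `top_n` nearest chunks, even when none of them is close to the query. A hedged sketch of dropping distant matches follows. It assumes the query result carries a `'distances'` entry parallel to `'documents'` (understood to be included by Chroma's `query` by default). The helper name `filter_by_distance` and the threshold value are hypothetical, and a useful threshold depends on the embedding model and the collection's distance metric.

```python
from typing import Dict, List


def filter_by_distance(results: Dict[str, list], max_distance: float) -> List[str]:
    """Keep only documents whose distance to the query is at most max_distance.

    `results` is expected to have the shape of a single-query result:
    results['documents'][0] and results['distances'][0] are parallel lists.
    """
    documents = results["documents"][0]
    distances = results["distances"][0]
    return [doc for doc, dist in zip(documents, distances) if dist <= max_distance]


# Made-up data in the same shape as a query result (not real notebook output).
sample = {
    "documents": [["chunk about plastics", "chunk about wetlands"]],
    "distances": [[0.42, 1.37]],
}
kept = filter_by_distance(sample, max_distance=1.0)  # threshold is an assumption
```

With a filter like this, `retrieve` can return an empty list, which is the case the `if not retrieved_knowledge` check in `ask` is written for.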

In [9]:
def ask(query):
    retrieved_knowledge = retrieve(query)

    if not retrieved_knowledge:
        print("Could not find relevant information in the document.")
        return

    #the retrieved_knowledge is already a list of strings, so joining is simple
    context_text = "\n".join([f"- {chunk}" for chunk in retrieved_knowledge])

    instruction_prompt = f'''You are a helpful chatbot.
    Use only the following pieces of context to answer the question. Don't make up any new information:
    {context_text}
    '''

    stream = ollama.chat(
      model=llm,
      messages=[
        {'role': 'system', 'content': instruction_prompt},
        {'role': 'user', 'content': query},
      ],
      stream=True,
    )

    for chunk in stream:
      print(chunk['message']['content'], end='', flush=True)

In [17]:
ask("what is python?")

I'm not seeing any context about Python in the provided information, so I'll provide a brief answer.

Python is a high-level programming language and one of the most popular programming languages used for various purposes such as web development, data analysis, artificial intelligence, scientific computing, and more. It's often referred to as a "scripting" language because it's easy to learn and use.
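
The answer above falls back on general knowledge even though the system prompt asks for context-only answers. One possible adjustment, sketched under assumptions, is to number the context and name an explicit fallback reply. The function `build_system_prompt` and the `REFUSAL` wording are hypothetical, and there is no guarantee that the model follows the instruction.

```python
from typing import List

REFUSAL = "I could not find that in the provided documents."  # hypothetical wording


def build_system_prompt(chunks: List[str], refusal: str = REFUSAL) -> str:
    """Build a system prompt that numbers the context and names an explicit fallback reply."""
    context_text = "\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "You are a helpful chatbot.\n"
        "Answer using only the numbered context below.\n"
        f'If the context does not contain the answer, reply exactly: "{refusal}"\n'
        "Do not use outside knowledge.\n\n"
        f"Context:\n{context_text}"
    )
```

Inside `ask`, the result of `build_system_prompt(retrieved_knowledge)` could take the place of `instruction_prompt`. Building the string with explicit `\n` also avoids carrying the code's indentation into the prompt.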

In [11]:
ask("what is plastic pollution?")

Based on the given context, plastic pollution refers to the problem caused by plastics not being disposed off properly, resulting in a significant amount of plastic waste entering the oceans. Specifically:

* "About three metric tones of plastic is estimated to be entering the oceans every 15 seconds" (point 13)
* Plastic debris can entangle marine animals, and tiny pieces can enter the marine food chain causing harm.
* Fishing nets made of plastics can become ghost nets, trapping fish, dolphins, sea turtles, sharks, and other creatures.

Overall, plastic pollution is a significant environmental issue caused by the improper disposal of plastics, resulting in harm to marine life and ecosystems.

In [12]:
ask("What is laterite soil?")

The provided context does not mention "laterite soil". However, based on the general knowledge, I can provide information about laterite soil.

Laterite soil is a type of reddish-brown soil that is common in tropical regions. It is formed through the weathering and oxidation of iron-rich rocks and minerals, such as ferric oxides and hydrated iron oxide. The high concentration of iron compounds gives laterite soil its characteristic red or yellow color.

In the context you provided, there is no mention of laterite soil. If you have any specific questions related to sedimentary rocks, wetlands, or other geological concepts mentioned in the text, I'd be happy to help!

In [13]:
ask("what are different types of medicinal plants?")

Based on the provided context, here are some medicinal plants mentioned:

1. Drumstick tree (Moringa oleifer) - used in soups and sambar, good for health.
2. Aloe vera (Aloe barbadensis Mill) - used in shampoos, face wash, and eaten to flush out waste from the body.
3. Long pepper (Piper longum) - helps cure cold, cough, and fever; oil used in skin aromatherapy.
4. Harsingar (Nyctanthes arbor-tristis) - fresh juice of leaves can solve joint pain and arthritis problems.
5. Arbi (Wild Colocasia ) - helps cure problems related to ulcers.
6. Curry leaves (Bergera koenigii) - source of calcium, helps with hair-related issues, diabetes, and blood pressure.

These are the medicinal plants mentioned in the provided context.

In [14]:
ask("what is biodiversity?")

According to the provided context, biodiversity is defined as:

"Biodiversity is the diversity of life in all its forms and at all its levels of organisation."

In other words, biodiversity refers to the variety of different living organisms that exist in a particular geographic area or ecosystem. It encompasses the richness of life at all levels, from individual species to communities and ecosystems.

Source: Biodiversity Module 1, Point 1.

In [15]:
ask("who are the builders of african savannah?")

According to the context, the builders of African savannah are elephants. Elephants help define the ecosystem by preserving grasslands by eating small trees, allowing grassy trees to receive sunlight and preventing the savanna from inverting into a forest. They also create corridors that help prevent the spread of wildfires, saving lives.

In [16]:
while True:
    prompt = input("> ")
    #exit only on an exact "quit" or "exit"; a substring test would also match "" or "q"
    if prompt.strip().lower() in ("quit", "exit"):
        break
    ask(prompt)
    print("\n\n")


Based on the provided context, there is no mention of Mt. Fuji or its fauna. The context only mentions other species, reasons for their poor adaptation to different environmental conditions, and geographical barriers like sea and mountains. It also talks about various ecosystems, bio geographic zones, and India's biodiversity. If you have any specific questions related to the provided context, I'll be happy to help!


Based on the provided context, there is no information about fauna found in Uttar Pradesh. However, I can suggest that you may want to look up online resources or consult field guides related to the wildlife of Uttar Pradesh for more information.

If you're interested in learning about the fauna found in other regions of India, such as the Aravali Range mentioned in the context, some examples include:

* Birds: Red-tailed hawks, Gila woodpeckers, Purple martins, Elf owls
* Other animals: Porcupines (which depend on calcium from feathers for quill formation)

Please note t