In [1]:
import os
from dotenv import load_dotenv
import google.generativeai as genai

In [2]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.node_parser import HTMLNodeParser
from llama_index.embeddings.gemini import GeminiEmbedding
from llama_index.llms.gemini import Gemini
from typing import List

In [11]:
load_dotenv()

True

In [12]:
# Confirm the key was loaded without printing the secret itself.
print(os.getenv('GOOGLE_API_KEY') is not None)

True


In [13]:
genai.configure(api_key=os.getenv('GOOGLE_API_KEY'))

In [14]:
model = genai.GenerativeModel('gemini-pro')

In [15]:
embedding = GeminiEmbedding()

reader = SimpleDirectoryReader(input_dir="../data/clean_html/Articles",
                                  recursive=True)

documents = reader.load_data(show_progress=True)
node_parser = HTMLNodeParser(tags=["p", "li", "b", "i", "u", "section", "text"])
nodes = node_parser.get_nodes_from_documents(documents, show_progress=True)
# Remove nodes with no content.
nodes = [node for node in nodes if len(node.get_content()) > 0]
for node in nodes:
    # Replace every \n and \t with a space.
    node.text = node.text.replace("\n", " ").replace("\t", " ")

Loading files: 100%|██████████| 428/428 [00:00<00:00, 7387.98file/s]



In [16]:
nodes[69].text

'AVP The Ayurvedic general guidelines indicate that contact with oil is a major component. To protect the sense organs to be efficient as the phase growth takes place, skin, eyes, ears, nose, and tongue need some specific methods of protection. The integrity of the skin provides protection to all the tissues inside. The skin however is exposed to the wrath of heat, cold, damp, or dry variations in the climate. Due to these, the integrity of the skin is at risk. This may lead to a minor breach of the covering layer, which allows the fluids, blood cells, or even thicker materials to be emitted. This applies even to the inside skin of all sense organs. The adverse factors stated above come in contact, and disturb the integrity. There is a very simple example of the season of severe cold. The usual oily sweat does not cover the skin, and when one smears the skin with any suitable oil / fat, the feared disintegration of the skin is avoided. Nature provides an abundant fatty layer below the 
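The clean-up step above swaps each \n and \t for a single space, so runs of whitespace in the source HTML can survive as runs of spaces. A minimal sketch of a stricter clean-up, assuming that collapsing all whitespace is acceptable for these articles; `normalize_whitespace` is a hypothetical helper, not part of llama_index:

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse any run of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


print(normalize_whitespace("AVP\n\tThe Ayurvedic   general guidelines "))
```

It could be applied in the same loop, for example `node.text = normalize_whitespace(node.text)`.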

In [28]:
import textwrap


def make_prompt(query, relevant_passage):
  # Strip quotes and newlines so the passage stays inside its quoted PASSAGE field.
  escaped = relevant_passage.replace("'", "").replace('"', "").replace("\n", " ")
  prompt = textwrap.dedent("""\
  You are an informative bot well versed in Ayurvedic medicine and everything related to it. \
  You answer questions using text from the reference passage included below. \
  The reference passage uses a lot of complicated Ayurveda terminology. \
  Be sure to respond in complete sentences, being comprehensive and including all relevant background information. \
  Make sure to respond in plain, simple English without using any complicated words. \
  I have no prior knowledge of Ayurveda, medicine, or anything related to it, so answer my queries precisely. \
  You are talking to a non-technical audience, so be sure to break down complicated concepts and \
  strike a friendly and conversational tone while staying to the point. \
  If the passage is irrelevant to the answer, you may ignore it. You may also ignore any CSV data provided. \
  The answer to the query is mostly present in the reference passage provided. Go through the passage thoroughly and fetch relevant information. \
  Strictly limit your responses to 300 words.
  QUESTION: '{query}'
  PASSAGE: '{relevant_passage}'

  ANSWER:
  """).format(query=query, relevant_passage=escaped)

  return prompt
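A backslash continuation inside a triple-quoted string carries the next line's indentation into the prompt text. One alternative, sketched here under the assumption that a shorter template is acceptable, is to build the template from adjacent string literals. `build_prompt` and its shortened wording are hypothetical and are not the prompt used in this notebook:

```python
def build_prompt(query: str, passage: str) -> str:
    """Hypothetical variant of make_prompt built from adjacent string literals."""
    # Remove quotes and newlines so the passage stays inside its quoted field.
    escaped = passage.replace("'", "").replace('"', "").replace("\n", " ")
    # Adjacent literals are joined at compile time, so no indentation leaks in.
    template = (
        "You answer questions using text from the reference passage below. "
        "Respond in plain, simple English and limit the answer to 300 words.\n"
        "QUESTION: '{query}'\n"
        "PASSAGE: '{passage}'\n"
        "\n"
        "ANSWER:\n"
    )
    return template.format(query=query, passage=escaped)


print(build_prompt("What is oil used for?", "Oil protects\nthe skin."))
```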

In [29]:
prompt = make_prompt("What is oil used for?", nodes[69].text)
answer = model.generate_content(prompt)

In [25]:
print(answer.text)

Oil is a versatile substance with numerous health benefits in Ayurveda. It serves as a protective layer for the skin, eyes, ears, nose, and tongue, shielding them from harsh environmental factors and potential damage. The oily layer acts as a barrier, preventing the loss of essential fluids and protecting against the entry of harmful substances, including microbes.

Sesame oil, warmed and instilled into the ears, can help protect against dryness and excessive wax buildup. It aids in softening and removing accumulated dirt, preventing potential infections and maintaining ear health.

Regular massage with warm and medicated oils on the skin is highly recommended in Ayurveda. It strengthens the skin's integrity, preventing cracking and dryness. The oil's properties help separate and remove excretory waste materials stuck to the skin, promoting overall skin health.

Applying oil to the eyes can provide relief and protection. A mild secretion-promoting substance applied to the lower eyelid 

In [31]:
def make_prompt(relevant_passage):
  # Summarisation variant: takes only the passage and replaces the question-answering make_prompt above.
  escaped = relevant_passage.replace("'", "").replace('"', "").replace("\n", " ")
  prompt = textwrap.dedent("""\
  You are an informative bot well versed in Ayurvedic medicine and the content provided in the passage. \
  You answer questions using text from the reference passage included below. \
  The reference passage uses a lot of complicated Ayurveda terminology. \
  Be sure to respond in complete sentences, being comprehensive and including all relevant information. \
  Make sure to respond in plain, simple English without using any complicated words. \
  I have no prior knowledge of Ayurveda, medicine, or anything related to it. \
  You are talking to a non-technical audience, so be sure to break down complicated concepts and \
  strike a friendly and conversational tone while staying to the point. \
  Use technical jargon and make sure to provide its meaning in parentheses. \
  If the passage is irrelevant to the answer, you may ignore it. You may also ignore any CSV data provided. \
  The answer to the query is mostly present in the reference passage provided. Go through the passage thoroughly and fetch relevant information. \
  Strictly limit your responses to 300 words. \
  Summarise the information provided in the reference passage comprehensively. \
  The summary should include all the important points provided in the passage. No information loss should happen at any cost.
  PASSAGE: '{relevant_passage}'

  ANSWER:
  """).format(relevant_passage=escaped)

  return prompt


prompt = make_prompt(nodes[69].text)
answer = model.generate_content(prompt)
print(answer.text)

In Ayurveda, maintaining contact with oil is emphasized to protect the sense organs. The skin, eyes, ears, nose, and tongue require specific methods of protection due to their exposure to various environmental factors.

Skin protection involves applying oil to increase its resistance and support tissue building. Massaging the skin with warm oil forms a protective layer that acts as a barrier against external stimuli. It also helps remove dirt and microbes.

For ear protection, sesame oil is instilled into the ear to prevent dryness and formation of obstructive layers. Warm oil helps soften dirt, which is then removed using a cotton bud.

The eyes are protected by eyelids and a thin film of lachrymal secretion. Applying a mild secretion-promoting substance inside the lower eyelid can provide continuous stimulus. A black admix, prepared from the soot of a clarified butter and camphor flame, is also used to protect the eyes.

The tongue is protected by frequent gargles of lukewarm water a

In [None]:
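for node in nodes:
    prompt = make_prompt(node.text)
    answer = model.generate_content(prompt)
    try:
        # answer.text raises ValueError when the response has no text part (e.g. it was blocked).
        node.text = answer.text
    except ValueError:
        pass  # Keep the original text for this node.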
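
The loop above makes one API call per node, so a single transient failure can stop the whole run. A minimal sketch of a retry wrapper with exponential backoff, assuming that retrying on any exception is acceptable here; `call_with_retries` and its parameters are hypothetical and not part of google.generativeai:

```python
import time


def call_with_retries(fn, *args, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(*args), retrying on any exception with exponential backoff."""
    for attempt in range(retries):
        try:
            return fn(*args)
        except Exception:
            # Re-raise once the last allowed attempt has failed.
            if attempt == retries - 1:
                raise
            # Wait base_delay, then 2 * base_delay, then 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

Inside the loop it could be used as `answer = call_with_retries(model.generate_content, prompt)`.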