**Step 1:** Create LLM Model  
**Step 2:** Load Text File  
**Step 3:** Perform Embeddings  
**Step 4:** Create Index on Embeddings  
**Step 5:** Store Index on Disk  
**Step 6:** Load Index from Disk  
**Step 7:** Query Data Using Stored Index


**Data Preprocessing**

In [7]:
! pip install -qU pandas openpyxl

In [8]:
import pandas as pd

df=pd.read_csv('C:/Users/admin/OneDrive/Desktop/Symptoms_based_disease_prediction/Datasets/preprocessed_data.csv')
print(df.shape)
df.head(2)

(304, 2)


Unnamed: 0,symptoms,prognosis
0,"itching, skin_rash, nodal_skin_eruptions, disc...",Fungal Infection
1,"skin_rash, nodal_skin_eruptions, dischromic _p...",Fungal Infection


In [9]:
df['symptoms']='symptoms : '+df['symptoms']
df['prognosis']='prognosis : '+df['prognosis']
df['symptoms based disease prediction']=df['symptoms']+" -> "+df['prognosis']
df.head()

Unnamed: 0,symptoms,prognosis,symptoms based disease prediction
0,"symptoms : itching, skin_rash, nodal_skin_erup...",prognosis : Fungal Infection,"symptoms : itching, skin_rash, nodal_skin_erup..."
1,"symptoms : skin_rash, nodal_skin_eruptions, di...",prognosis : Fungal Infection,"symptoms : skin_rash, nodal_skin_eruptions, di..."
2,"symptoms : itching, nodal_skin_eruptions, disc...",prognosis : Fungal Infection,"symptoms : itching, nodal_skin_eruptions, disc..."
3,"symptoms : itching, skin_rash, dischromic _pat...",prognosis : Fungal Infection,"symptoms : itching, skin_rash, dischromic _pat..."
4,"symptoms : itching, skin_rash, nodal_skin_erup...",prognosis : Fungal Infection,"symptoms : itching, skin_rash, nodal_skin_erup..."


In [10]:
# Assuming 'df' is your DataFrame and 'symptoms based disease prediction' is the column you want to convert
df['symptoms based disease prediction'].to_csv('C:/Users/admin/OneDrive/Desktop/Symptoms_based_disease_prediction/Datasets/symptoms_disease_prediction.txt', index=False, header=False)
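One thing worth checking here: the combined strings contain commas, so `to_csv` will most likely wrap those lines in double quotes (its default is minimal CSV quoting), and the quotes then become part of the text that is embedded later. The sketch below reproduces the effect with the standard-library `csv` module, which applies the same quoting rule, and compares it with a plain line-by-line write. The sample rows and the helper names `write_as_csv` and `write_as_plain_text` are illustrative assumptions, not part of the notebook.

```python
import csv
import io

# Hypothetical sample rows in the same "symptoms -> prognosis" shape.
rows = [
    "symptoms : itching, skin_rash -> prognosis : Fungal Infection",
    "symptoms : chills -> prognosis : Malaria",
]


def write_as_csv(lines):
    """Write one field per row using default (minimal) CSV quoting."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for line in lines:
        writer.writerow([line])
    return buf.getvalue()


def write_as_plain_text(lines):
    """Write each string as-is, one per line, with no CSV quoting."""
    return "".join(f"{line}\n" for line in lines)


csv_text = write_as_csv(rows)
plain_text = write_as_plain_text(rows)

print(csv_text)    # the row containing a comma is wrapped in double quotes
print(plain_text)  # rows are written unchanged
```

If the quotes turn out to be unwanted, writing the column with an ordinary file handle (one string per line) is one way to avoid them.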

**Step 1:**

In [3]:
# cmd:
# >>ollama run nomic-embed-text
! ollama pull nomic-embed-text
! pip install -qU langchain-ollama llama-index llama-index-embeddings-langchain llama-index-llms-langchain

pulling manifest
pulling 970aa74c0a90... 100% ▕████████████████▏ 274 MB
pulling c71d239df917... 100% ▕████████████████▏  11 KB
pulling ce4a164fc046... 100% ▕████████████████▏   17 B
pulling 31df23ea7daa... 100% ▕████████████████▏  420 B
verifying sha256 digest
writing manifest
success


**Steps 2 and 3:**

In [4]:
from langchain_ollama import OllamaEmbeddings
from llama_index.core import Document, VectorStoreIndex

# Initialize embedding model and list to store embedded documents
embed = OllamaEmbeddings(model="nomic-embed-text")
embedded_documents = []

# Read the file and embed each line
file_path = "C:/Users/admin/OneDrive/Desktop/Symptoms_based_disease_prediction/Datasets/symptoms_disease_prediction.txt"
with open(file_path, encoding='utf-8') as file:
    for line in file:
        line = line.strip()  # Remove extra whitespace
        if line:
            embedding = embed.embed_query(line)  # Get embedding
            embedded_documents.append(Document(text=line, embedding=embedding))  # Add to list

**Step 4:**

In [5]:
# Create the vector store index from the embedded documents
index = VectorStoreIndex.from_documents(embedded_documents,embed_model=embed)
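Conceptually, a vector index keeps each text next to its embedding and, at query time, returns the texts whose embeddings lie closest to the query embedding. The sketch below illustrates that idea with cosine similarity over hand-made three-dimensional vectors. It is a toy stand-in, not how `VectorStoreIndex` is implemented; the vectors, the sample texts, and the helpers `cosine_similarity` and `top_k` are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "index": (text, embedding) pairs with made-up 3-d embeddings.
toy_store = [
    ("symptoms : itching, skin_rash -> prognosis : Fungal Infection", [0.9, 0.1, 0.0]),
    ("symptoms : burning_micturition -> prognosis : Urinary Tract Infection", [0.1, 0.9, 0.1]),
    ("symptoms : chills, high_fever -> prognosis : Malaria", [0.0, 0.2, 0.9]),
]

# Made-up embedding for a query about urinary symptoms.
toy_query = [0.2, 0.8, 0.0]

best_matches = top_k(toy_query, toy_store, k=2)
print(best_matches)
```

In the real pipeline the embeddings come from `nomic-embed-text` and have far more dimensions, but the retrieval step follows the same ranking idea.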

**Step 5:**

In [6]:
# Save the index to a local directory (optional)
path = "C:/Users/admin/OneDrive/Desktop/Symptoms_based_disease_prediction/Datasets/vector db"
index.storage_context.persist(persist_dir=path)  # pass the variable, not the literal string "path"

**Step 6:**

In [7]:
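# Load the index from disk
from llama_index.core import StorageContext, load_index_from_storage

# Use a lowercase name so the StorageContext class is not shadowed,
# and pass the path variable rather than the literal string "path".
storage_context = StorageContext.from_defaults(persist_dir=path)
load_index = load_index_from_storage(storage_context, embed_model=embed)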
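Loading can fail with a fairly opaque error when the persist directory is missing or empty, for example when the folder name passed at save time differs from the one passed at load time. A small pre-check like the sketch below can make that easier to spot. It assumes the persisted index consists of JSON files in a single folder; the helper name `persisted_index_ready` and the file name used in the demo are illustrative assumptions.

```python
import tempfile
from pathlib import Path


def persisted_index_ready(persist_dir):
    """Return True if persist_dir exists and contains at least one JSON file."""
    directory = Path(persist_dir)
    return directory.is_dir() and any(directory.glob("*.json"))


# Demo on a temporary folder so nothing on disk is touched permanently.
with tempfile.TemporaryDirectory() as tmp:
    empty_ok = persisted_index_ready(tmp)            # no files yet
    (Path(tmp) / "docstore.json").write_text("{}")   # stand-in for a persisted file
    filled_ok = persisted_index_ready(tmp)

missing_ok = persisted_index_ready("this/folder/does/not/exist")
print(empty_ok, filled_ok, missing_ok)
```

Calling the helper with the same directory used for persisting, just before `load_index_from_storage`, gives a quick yes/no answer on whether there is anything to load.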

**Step 7:**

In [9]:
import os
from langchain_groq import ChatGroq

os.environ["GROQ_API_KEY"] = "<YOUR_GROQ_API_KEY>"  # placeholder; keep real keys out of the notebook
llm = ChatGroq(model="llama-3.3-70b-versatile")

query_engine = load_index.as_query_engine(llm=llm)
Query="What diseases are associated with these symptoms: burning_micturition, bladder_discomfort, continuous_feel_of_urine?"
vector_db_response= query_engine.query(Query)
print(vector_db_response)

Urinary Tract Infection.
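Rather than writing the Groq API key into a cell, one option is to read it from the environment and only prompt for it when it is missing. The sketch below shows that pattern using only the standard library. The helper name `resolve_api_key` and its parameters are assumptions for illustration; only the `GROQ_API_KEY` variable name comes from the notebook.

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, name="GROQ_API_KEY", prompt=getpass):
    """Return the key from env, falling back to an interactive prompt."""
    key = env.get(name, "").strip()
    if not key:
        # getpass hides the typed value, so it does not land in cell output.
        key = prompt(f"Enter {name}: ").strip()
    if not key:
        raise ValueError(f"{name} is not set")
    return key


# Usage in a notebook (commented out because it may prompt for input):
# os.environ["GROQ_API_KEY"] = resolve_api_key()
```

The `env` and `prompt` parameters exist so the function can be exercised without touching the real environment or blocking on input.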


### Agentic AI

In [17]:
# Import necessary modules
from langchain.agents import initialize_agent, Tool, AgentType
from langchain_community.tools import DuckDuckGoSearchRun

# Initialize DuckDuckGo search tool
ddg_search = DuckDuckGoSearchRun()

# Define the tools the agent can use
tools = [
    Tool(
        name="DuckDuckGo Search",
        func=ddg_search.run,
        description="Search for diseases based on symptoms using DuckDuckGo"
    )
]

# Initialize the agent with the Zero-Shot-React-Description agent type
# (the keyword is `agent`, not `agent_type`)
agent = initialize_agent(
    llm=llm,
    tools=tools,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)


# Query the agent to recommend diseases based on the symptoms
Agent_response = agent.invoke("What are the top diseases associated with itching, skin_rash, nodal_skin_eruptions?")

print(Agent_response['output'])

> Entering new AgentExecutor chain...

To find the top diseases associated with itching, skin_rash, and nodal_skin_eruptions, I should use the DuckDuckGo Search tool to search for diseases based on these symptoms.

Action: DuckDuckGo Search
Action Input: {'symptoms': ['itching', 'skin_rash', 'nodal_skin_eruptions']}

Observation: Skin discoloration is a common rash symptom, which can present differently on different skin tones. Rashes may appear red or pink on lighter skin tones, while on darker skin tones they may be ... From eczema to allergic reactions to bug bites, here's what common skin rashes look like in photos, and the symptoms that can help you I.D. the condition. Learn how in some cases, persistent itchiness accompanied by other symptoms can be a sign of underlying conditions like diabetes, kidney disease or chronic liver disease. In most adults, a rash can be mild and resolve on its own. But if you have other symptoms, such as fever, pain, or a rash that spreads, you may need urgent

### Vector DB + Agentic Response

In [19]:
import os
import re
from langchain_groq import ChatGroq

# Set your Groq API key (placeholder; keep real keys out of the notebook)
os.environ["GROQ_API_KEY"] = "<YOUR_GROQ_API_KEY>"

# Initialize the ChatGroq model
# (no `stream` argument here; streaming is requested below via llm.stream)
llm = ChatGroq(
    model="llama-3.3-70b-versatile",
    temperature=1,
    max_tokens=1024
)

# Define the conversation
messages = [
    ("system", "You are a highly knowledgeable and detail-oriented medical assistant with the ability to critically analyze responses for errors and inconsistencies."),
    ("human", f"I have two sets of responses related to symptoms:\n"
              f"1. Vector DB Response: {vector_db_response}\n"
              f"2. Web Search Response: {Agent_response['output']}\n"
              f"\n"
              f"Please follow these steps to refine and merge the responses:\n"
              f"1. Review both responses for mistakes, inconsistencies, and redundant diseases. Ensure no disease is listed more than once and prioritize based on severity and commonality.\n"
              f"2. Suggest a healthy diet to manage or alleviate the symptoms. Include specific foods and drinks to consume and avoid.\n"
              f"3. Recommend safe and easy natural techniques or exercises that can complement medical treatment.\n"
              f"4. Specify the type of doctor or specialist to consult and provide a brief reason for each recommendation.\n"
              f"\n"
              f"After reviewing and enhancing the responses, follow these steps:\n"
              f"    1. List all (high, medium, low) associated diseases, eliminating duplicates and prioritizing based on severity.\n"
              f"    2. Suggest a healthy diet, including specific foods to consume and avoid.\n"
              f"    3. Recommend natural techniques or exercises that are easy and safe.\n"
              f"    4. Specify the type of doctor or specialist to consult with a brief reason.\n"
              f"Ensure the response is clear, concise, and easy to understand."),
]


# Stream and clean markdown on the fly
for chunk in llm.stream(messages):
    cleaned_chunk = re.sub(r'(\*\*|\*|__|_|\[.*?\]\(.*?\))', '', chunk.content)
    print(cleaned_chunk, end='', flush=True)
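The regex in this cell removes every `*` and `_`, which also strips the underscores inside symptom identifiers such as `skin_rash`, and it drops markdown links together with their labels. The sketch below compares that pattern with a more conservative cleaner that keeps link labels and intraword underscores. The function names and the sample string are illustrative assumptions, and the conservative version is meant for whole lines or the full accumulated text, since emphasis markers can be split across streamed chunks.

```python
import re

# The pattern used in the streaming loop, unchanged.
ORIGINAL_PATTERN = re.compile(r'(\*\*|\*|__|_|\[.*?\]\(.*?\))')


def clean_original(text):
    """Apply the notebook's pattern: drops all *, _ and whole links."""
    return ORIGINAL_PATTERN.sub('', text)


def clean_conservative(text):
    """Strip markdown emphasis and links while keeping identifiers intact."""
    # Replace [label](url) with just the label.
    text = re.sub(r'\[([^\]]+)\]\([^)]*\)', r'\1', text)
    # Unwrap *italic* and **bold**.
    text = re.sub(r'\*{1,2}([^*\n]+)\*{1,2}', r'\1', text)
    # Unwrap _italic_ and __bold__ only when the underscores are not inside a word.
    text = re.sub(r'(?<!\w)_{1,2}([^_\n]+)_{1,2}(?!\w)', r'\1', text)
    return text


sample = "**High**: skin_rash, see [eczema](https://example.org) and _itching_"

print(clean_original(sample))
print(clean_conservative(sample))
```

One way to use the conservative cleaner with streaming is to buffer chunks until a newline arrives and clean each completed line before printing it.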



Refined and Merged Response:

After reviewing both responses, I have refined and merged the information to provide a comprehensive answer.

1. Associated Diseases:
Based on the symptoms of itching, skin rash, and nodal skin eruptions, the following diseases are associated, listed in order of severity and commonality:
- High: Eczema (atopic dermatitis), Psoriasis, Fungal Infection
- Medium: Scabies, Parasitic infections, Drug rashes
- Low: Dry skin (xerosis), Anemia, Diabetes, Thyroid problems, Liver disease, Kidney disease

2. Healthy Diet:
To manage or alleviate the symptoms, a healthy diet is essential. The following foods and drinks are recommended:
- Consume: Fatty fish (rich in omega-3 fatty acids), Leafy greens (rich in antioxidants), Berries (rich in antioxidants), Whole grains, Probiotic-rich foods (yogurt, kefir)
- Avoid: Processed foods, Sugary drinks, Dairy products (if lactose intolerant), Spicy or acidic foods

3. Natural Techniques or Exercises:
Safe and easy natural tech