# MediSelect: AI-Based Hospital Recommendation Engine

In healthcare, both patients and practitioners need to make informed decisions. Choosing the right hospital or organization for an appointment is often cumbersome because of the large number of available options. To support this decision, we can use AI models that compare data across hospitals and organizations.

This project seeks to develop an AI system that analyzes hospital or organization records and offers customized recommendations based on in-depth data analysis. Through the integration of Retrieval-Augmented Generation (RAG) and LangChain-based AI agents, the system retrieves useful insights from a hospital dataset (in CSV format) and utilizes a generative model to offer a recommendation to the user.

The system is intended to:
1. Upload and process hospital data – The data, including satisfaction scores and performance indicators, is uploaded into the system.
2. Query handling – The user submits a query about hospital choice, e.g., asking about the performance of a given hospital.
3. Data retrieval – Using RAG, the system retrieves the relevant data from the hospital dataset to answer the query.
4. Analysis and suggestion – The LangChain-based agent analyzes the retrieved data and suggests whether or not the user should make an appointment with the hospital.
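
The first three steps above can be sketched offline with the standard library. This is a minimal illustration, not the notebook's method: the notebook hands the whole CSV to Gemini, whereas this sketch filters rows locally by keyword. The sample rows, the column names, and the helper names `load_records`, `retrieve`, and `build_context` are all assumptions made for the example.

```python
import csv
import io

# Toy stand-in for the hospital CSV; the rows and column names are assumptions.
SAMPLE_CSV = """Condition,Age,Outcome,Satisfaction
Appendicitis,45,Recovered,3
Fractured Leg,60,Recovered,4
Appendicitis,52,Recovered,3
"""


def load_records(csv_text: str) -> list[dict[str, str]]:
    """Step 1: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def retrieve(
    records: list[dict[str, str]], query: str, column: str = "Condition"
) -> list[dict[str, str]]:
    """Step 3: keep rows whose `column` value is mentioned in the query."""
    query_lower = query.lower()
    return [row for row in records if row[column].lower() in query_lower]


def build_context(rows: list[dict[str, str]]) -> str:
    """Format the retrieved rows as plain text for a generative model."""
    return "\n".join(
        ", ".join(f"{key}={value}" for key, value in row.items()) for row in rows
    )


records = load_records(SAMPLE_CSV)
# Step 2: the user's query.
query = "I am age 55, suffering from Appendicitis will I get cured?"
matches = retrieve(records, query)
context = build_context(matches)
print(context)
```

The resulting `context` string is what would be placed in the prompt for step 4. Keyword matching is only the simplest possible retriever; it is used here to keep the sketch self-contained.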

This system can be useful for patients and for institutions that want to offer tailored, data-backed recommendations to their users. By automating data processing and decision support, it offers an efficient way to assess hospitals and institutions when booking health appointments.


## Technologies Used

1. Google Gemini AI: Processes natural language and powers generative responses to user queries.
2. RAG (Retrieval-Augmented Generation): Combines data retrieval with AI generation to provide accurate, context-aware answers.
3. CSV File Handling: Stores and processes hospital data for query responses.
4. LangChain: Runs the AI agent that processes user queries and generates responses based on the data.
5. Few-Shot Prompting: The model is given a few input-output examples in the prompt itself to guide its responses.
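
As a rough illustration of few-shot prompting, the sketch below assembles a prompt from an instruction, a handful of example pairs, and a new input. The function name `build_few_shot_prompt`, the `Input:`/`Output:` layout, and the example pairs are hypothetical and chosen only for this example; they are not taken from the notebook's prompt.

```python
from typing import Sequence

# Hypothetical example pairs; the figures and wording are illustrative only.
EXAMPLES: Sequence[tuple[str, str]] = (
    ("Recovery rate 95%, satisfaction 4.5/5",
     "Yes, the data supports booking an appointment."),
    ("Recovery rate 40%, satisfaction 1.8/5",
     "No, the data does not support booking an appointment."),
)


def build_few_shot_prompt(
    instruction: str,
    examples: Sequence[tuple[str, str]],
    new_input: str,
) -> str:
    """Place worked input/output pairs before the new input to guide the model."""
    lines = [instruction, ""]
    for example_input, example_output in examples:
        lines.append(f"Input: {example_input}")
        lines.append(f"Output: {example_output}")
        lines.append("")
    # The prompt ends with an open "Output:" for the model to complete.
    lines.append(f"Input: {new_input}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Decide whether the user should book an appointment, and justify the answer.",
    EXAMPLES,
    "Recovery rate 70%, satisfaction 3/5",
)
print(prompt)
```

Keeping the examples in a separate sequence makes it easy to add or swap them without touching the prompt layout.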

## Dataset Used

Link: https://www.kaggle.com/datasets/blueblushed/hospital-dataset-for-practice

This dataset contains records of 1000 patients treated at a hospital, including demographic information, medical conditions, treatments administered, medical expenses incurred, and current health status. It can be used to explore patterns and trends in patient care, recovery rates, and treatment efficacy.
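
A quick per-condition summary is one way to get a feel for data of this kind. The sketch below runs on a few made-up rows, and the column names `Condition`, `Length_of_Stay`, and `Satisfaction` are assumptions; the real column names should be confirmed with `df.columns` after loading the file.

```python
import pandas as pd

# Toy rows standing in for the real file; column names and values are assumptions.
df = pd.DataFrame(
    {
        "Condition": ["Appendicitis", "Fractured Leg", "Appendicitis", "Fractured Leg"],
        "Length_of_Stay": [3, 10, 5, 6],
        "Satisfaction": [3, 4, 3, 2],
    }
)

# On Kaggle, the real data would instead be loaded with:
# df = pd.read_csv("/kaggle/input/hospital-dataset-for-practice/hospital data analysis.csv")
# print(df.columns.tolist())  # confirm the actual column names first

# Count patients and average the numeric columns for each condition.
summary = (
    df.groupby("Condition")
    .agg(
        patients=("Length_of_Stay", "size"),
        mean_stay=("Length_of_Stay", "mean"),
        mean_satisfaction=("Satisfaction", "mean"),
    )
    .reset_index()
)
print(summary)
```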

## Code Demonstration

In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

/kaggle/input/hospital-dataset-for-practice/hospital data analysis.csv


### Installing libraries

In [2]:
# The `google.genai` module imported below is provided by the `google-genai` package.
!pip install -U -q "google-genai"

In [3]:
!pip install langchain langchain-google-genai google-generativeai




### Import Libraries

In [4]:
import pandas as pd
from kaggle_secrets import UserSecretsClient
from langchain.agents import Tool, AgentExecutor, initialize_agent
from langchain.agents.agent_types import AgentType
from langchain_google_genai import ChatGoogleGenerativeAI
from google import genai
from google.genai import types


import warnings
warnings.filterwarnings("ignore")


### Setting API Key

In [5]:
user_secrets = UserSecretsClient()
secret_value_0 = user_secrets.get_secret("GOOGLE_API_KEY")

In [6]:
import os

### RAG Function

In [7]:
# ----------------- 1. RAG Function ------------------
def generate_rag_response(user_text: str) -> str:
    """Send the hospital CSV and the user's question to Gemini and return its answer."""
    client = genai.Client(api_key=secret_value_0)

    # The CSV is uploaded on every call; adjust the path if needed.
    uploaded_file = client.files.upload(file="/kaggle/input/hospital-dataset-for-practice/hospital data analysis.csv")

    # A single user turn: the uploaded file followed by the instruction and question.
    contents = [
        types.Content(
            role="user",
            parts=[
                types.Part.from_uri(
                    file_uri=uploaded_file.uri,
                    mime_type=uploaded_file.mime_type,
                ),
                types.Part.from_text(text=f"""You are a helpful assistant. Use the data in the file to answer this question directly and also note that the data which you present should be detailed. The user input is: {user_text}"""),
            ],
        ),
    ]

    config = types.GenerateContentConfig(response_mime_type="text/plain")

    try:
        rag_response = ""
        # Stream the reply and concatenate the chunks into one string.
        for chunk in client.models.generate_content_stream(
            model="gemini-2.0-flash-lite",
            contents=contents,
            config=config,
        ):
            # chunk.text can be None when a chunk carries no text.
            rag_response += chunk.text or ""
        return rag_response
    except genai.errors.ClientError:
        return "RAG failed to retrieve data."
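
Because `generate_rag_response` uploads the CSV on every call, repeated queries re-send the same file. One possible way to avoid that is to cache the upload per path, sketched below. The helper `make_cached_uploader` is hypothetical, and `fake_upload` is a stand-in that records calls so the example runs without network access or an API key.

```python
from functools import lru_cache
from typing import Callable


def make_cached_uploader(upload: Callable[[str], object]) -> Callable[[str], object]:
    """Wrap an upload function so each path is uploaded at most once per session."""

    @lru_cache(maxsize=None)
    def cached_upload(path: str) -> object:
        return upload(path)

    return cached_upload


# Stand-in for the real upload call; it records calls instead of using the network.
calls: list[str] = []


def fake_upload(path: str) -> str:
    calls.append(path)
    return f"uploaded:{path}"


get_file = make_cached_uploader(fake_upload)
first = get_file("hospital data analysis.csv")
second = get_file("hospital data analysis.csv")  # served from the cache
print(len(calls))  # → 1
```

In the notebook, the wrapped function would presumably be something like `lambda path: client.files.upload(file=path)`, with the client created once outside `generate_rag_response`. Uploaded files may expire on the service side, so a long-running session might still need to re-upload.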

In [8]:
# ----------------- 2. Set up LangChain LLM ------------------
llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash-latest",
    temperature=0.7,
    verbose=True,
    api_key=secret_value_0
)

### Custom Tool or Function Calling

In [9]:
# ----------------- 3. Define Tool for RAG ------------------
def rag_tool_func(query: str) -> str:
    return generate_rag_response(query)

rag_tool = Tool(
    name="RAGDataSearch",
    func=rag_tool_func,
    description="Use this tool to retrieve detailed hospital or organization data from a CSV based on user query."
)

# ----------------- 4. Agent Setup ------------------
tools = [rag_tool]

# `initialize_agent` takes the agent type through its `agent` parameter.
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
)

# Function to process the prompt with the agent
def process_prompt_with_agent(prompt):
    return agent.run(prompt)

### RAG & Agent Invocation with query 

In [10]:
# ----------------- 5. Run Agent ------------------
if __name__ == "__main__":
    user_input = "I am age 55, suffering from Appendicitis will I get cured?"  # Sample User Input
    # user_input = input("💬 Please enter your question: ")
    print("\n🧠 Agent reasoning and decision in progress...\n")
    # Call the rag_tool function to get the response before using it in the prompt
    response = rag_tool_func(user_input) 
    text = user_input  
    
    # Now define the prompt with 'text' and 'response' available
    # Used few shot prompting technique.
    prompt = f"""You are a helpful assistant to recommend a hospital or not for a user based on their records.
      You don't have any tools.
      You are given an input from the user: {text} and a response from RAG: {response}.
      The RAG response includes satisfaction scores and various columns based on a hospital or organization.
      Based on the RAG data, your job is to analyze it and say whether the user should take an appointment or not in that hospital.
      The output must include a justification and follow one of these examples:
      Example 1: Yes, according to the data given you can reach out to the particular hospital or organisation.
      Example 2: No, according to the data given you cannot reach out to the particular hospital or organisation.
      Example 3: Not sure, according to the data it is difficult to understand whether to go to the particular hospital or organisation or not.
    """
    # Run the agent with the updated prompt that includes text and response
    output = agent.run(prompt) 

    print("\n✅ Final Recommendation:\n", output)

💬 Please enter your question:  I am age 55, suffering from Appendicitis will I get cured?



🧠 Agent reasoning and decision in progress...



> Entering new AgentExecutor chain...
Thought:The provided data shows a 100% recovery rate for appendicitis patients.  However, the data only includes female patients and a limited age range.  The satisfaction score is consistently 3, which needs further context to interpret.  While the recovery rate is encouraging, the lack of diversity in the data makes it difficult to definitively recommend the hospital.

Thought:I need to consider the limitations of the data before making a recommendation.

Thought:I can answer the question based on the available information.

Final Answer:Yes, according to the data given you can reach out to the particular hospital or organisation.  However, it's crucial to understand that the data only represents a subset of patients (all female, limited age range) and a satisfaction score of 3 needs further context for interpretation.  This recommendation is based solely on the 100% recovery 

#### Example 2

In [11]:
# ----------------- 5. Run Agent ------------------
if __name__ == "__main__":
    user_input = "How many days should a person admit for if he is suffering from Fractured Leg?" # Sample User Input
    # user_input = input("💬 Please enter your question: ")
    print("\n🧠 Agent reasoning and decision in progress...\n")
    # Call the rag_tool function to get the response before using it in the prompt
    response = rag_tool_func(user_input) 
    # Assign the user input to the 'text' variable
    text = user_input  
    
    # Now define the prompt with 'text' and 'response' available
    # Used few shot prompting technique.
    prompt = f"""You are a helpful assistant to recommend a hospital or not for a user based on their records.
      You don't have any tools.
      You are given an input from the user: {text} and a response from RAG: {response}.
      The RAG response includes satisfaction scores and various columns based on a hospital or organization.
      Based on the RAG data, your job is to analyze it and say whether the user should take an appointment or not in that hospital.
      The output must include a justification and follow one of these examples:
      Example 1: Yes, according to the data given you can reach out to the particular hospital or organisation.
      Example 2: No, according to the data given you cannot reach out to the particular hospital or organisation.
      Example 3: Not sure, according to the data it is difficult to understand whether to go to the particular hospital or organisation or not.
    """
    # Run the agent with the updated prompt that includes text and response
    output = agent.run(prompt) 

    print("\n✅ Final Recommendation:\n", output)

💬 Please enter your question:  How many days should a person admit for if he is suffering from Fractured Leg? 



🧠 Agent reasoning and decision in progress...



> Entering new AgentExecutor chain...
Thought:The RAG response provides a range of hospital stays (6-71 days) for fractured legs, but doesn't offer information about hospital quality, patient satisfaction, or other factors relevant to recommending a specific hospital.  The question asks for a recommendation based on the provided data, and the data is insufficient.

Thought:I now know the final answer

Final Answer: Not sure, according to the data it is difficult to understand whether to go to the particular hospital or organisation or not.

> Finished chain.

✅ Final Recommendation:
 Not sure, according to the data it is difficult to understand whether to go to the particular hospital or organisation or not.
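
Since the prompt steers the agent toward answers that open with "Yes", "No", or "Not sure", downstream code could map the final answer to a fixed label. The sketch below is one way to do that; the function name `parse_verdict` and the `"unknown"` fallback are assumptions for illustration and are not part of the notebook.

```python
import re

# "not sure" is listed first so it is not mistaken for "no".
_VERDICT_PATTERN = re.compile(r"(not sure|yes|no)\b", re.IGNORECASE)


def parse_verdict(final_answer: str) -> str:
    """Map the agent's final answer to 'yes', 'no', 'not sure', or 'unknown'."""
    match = _VERDICT_PATTERN.match(final_answer.strip())
    return match.group(1).lower() if match else "unknown"


answer = (
    "Not sure, according to the data it is difficult to understand whether "
    "to go to the particular hospital or organisation or not."
)
print(parse_verdict(answer))  # → not sure
```

The word boundary in the pattern keeps answers such as "Nothing conclusive" from being read as "no".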
