<a href="https://colab.research.google.com/github/whygit-dot/machine-learning/blob/main/MedicalQuestionAnsweringUsingAI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [16]:
# Install necessary libraries
!pip install spacy gradio faiss-cpu

# Import necessary libraries
import spacy
import gradio as gr
import faiss
import numpy as np

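# Load spaCy's small English pipeline for embedding generation.
# Note: en_core_web_sm ships without static word vectors, so .vector falls back to
# averaged context tensors; a pipeline with vectors (e.g. en_core_web_md) generally
# gives more meaningful similarity results.
nlp = spacy.load('en_core_web_sm')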

# Sample medical knowledge base (this would be much larger in a real-world system)
medical_knowledge = [
    "Diabetes is a chronic medical condition that occurs when the body is unable to properly process glucose.",
    "Type 1 diabetes is usually diagnosed in children and young adults, and involves the immune system attacking insulin-producing cells.",
    "Type 2 diabetes is more common and typically affects adults, often due to lifestyle factors like poor diet and lack of exercise.",
    "Hypertension, or high blood pressure, is a condition that can increase the risk of heart disease and stroke.",
    "The best way to manage high blood pressure is through a combination of medication, a healthy diet, and regular physical activity.",
    "Cancer is a group of diseases involving abnormal cell growth with the potential to spread to other parts of the body.",
    "A healthy diet is essential for managing many chronic conditions, including diabetes and hypertension.",
    "Symptoms of a heart attack can include chest pain, shortness of breath, and nausea."
]

# Function to convert text into vectors using spaCy embeddings
def get_text_vectors(text_list):
    # FAISS expects a 2-D float32 array of shape (n_texts, dim)
    return np.array([nlp(text).vector for text in text_list], dtype=np.float32)

# Build a FAISS index for the medical knowledge base
def build_faiss_index(text_list):
    text_vectors = get_text_vectors(text_list)
    index = faiss.IndexFlatL2(text_vectors.shape[1])  # Using L2 distance
    index.add(text_vectors)  # Add text vectors to the index
    return index

# Function to retrieve the most relevant information based on the user's query
def retrieve_relevant_information(query, index, knowledge_base, k=3):
    # FAISS expects a 2-D float32 array of shape (1, dim) for a single query
    query_vector = np.expand_dims(nlp(query).vector, axis=0).astype(np.float32)
    k = min(k, index.ntotal)  # Never request more neighbours than there are indexed documents
    _, indices = index.search(query_vector, k)  # Retrieve the top-k closest documents
    relevant_texts = [knowledge_base[i] for i in indices[0]]
    return relevant_texts

# Generate a response by concatenating the retrieved information
def generate_answer(relevant_texts):
    return " ".join(relevant_texts)

# Build the FAISS index once at startup rather than re-embedding the knowledge base on every query
medical_index = build_faiss_index(medical_knowledge)

# Function to answer a medical question based on the query
def medical_question_answering(query):
    # An empty query has no tokens to embed, so return early instead of searching
    if not query or not query.strip():
        return "Please enter a question."

    # Retrieve the most relevant documents for the given query
    relevant_texts = retrieve_relevant_information(query, medical_index, medical_knowledge)

    # Generate the answer by concatenating the relevant information
    return generate_answer(relevant_texts)

# Create a Gradio interface
interface = gr.Interface(fn=medical_question_answering, inputs="text", outputs="text",
                         title="Medical Question Answering",
                         description="Ask any medical-related question, and the system will answer based on the knowledge base.")
interface.launch()


It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://12d1f53c57a2c7deaf.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
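
To make the retrieval step in the cell above more concrete, here is a minimal NumPy sketch of what an exact L2 nearest-neighbour search computes, which is the kind of lookup `faiss.IndexFlatL2` performs (FAISS reports squared L2 distances). The function name `brute_force_l2_search` and the toy 2-D vectors are hypothetical and used only for illustration; they are not part of the notebook or of FAISS.

```python
import numpy as np

def brute_force_l2_search(doc_vectors, query_vector, k=3):
    """Return (squared distances, indices) of the k closest rows, nearest first."""
    doc_vectors = np.asarray(doc_vectors, dtype=np.float32)
    query_vector = np.asarray(query_vector, dtype=np.float32)
    k = min(k, len(doc_vectors))  # cannot return more hits than documents
    # Squared Euclidean distance from the query to every document vector
    sq_dists = np.sum((doc_vectors - query_vector) ** 2, axis=1)
    # Stable sort so that tied documents keep their insertion order
    order = np.argsort(sq_dists, kind="stable")[:k]
    return sq_dists[order], order

# Toy 2-D "embeddings" (hypothetical data, standing in for spaCy vectors)
toy_docs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]], dtype=np.float32)
toy_query = np.array([0.9, 0.1], dtype=np.float32)

distances, indices = brute_force_l2_search(toy_docs, toy_query, k=2)
print(indices)    # positions of the two nearest documents
print(distances)  # their squared L2 distances to the query
```

Because the search always returns the `k` nearest entries, it will return results even when nothing in the knowledge base is actually related to the question; comparing the returned distances against a threshold is one possible way to filter out weak matches.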


