In [10]:
!pip install PyPDF2 pdf2image pytesseract pillow pydub tts

Collecting pydub
  Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting tts
  Downloading TTS-0.22.0-cp311-cp311-manylinux1_x86_64.whl.metadata (21 kB)
Collecting anyascii>=0.3.0 (from tts)
  Downloading anyascii-0.3.2-py3-none-any.whl.metadata (1.5 kB)
Collecting pysbd>=0.3.4 (from tts)
  Downloading pysbd-0.3.4-py3-none-any.whl.metadata (6.1 kB)
Collecting umap-learn>=0.5.1 (from tts)
  Downloading umap_learn-0.5.7-py3-none-any.whl.metadata (21 kB)
Collecting pandas<2.0,>=1.4 (from tts)
  Downloading pandas-1.5.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Collecting trainer>=0.0.32 (from tts)
  Downloading trainer-0.0.36-py3-none-any.whl.metadata (8.1 kB)
Collecting coqpit>=0.0.16 (from tts)
  Downloading coqpit-0.0.17-py3-none-any.whl.metadata (11 kB)
Collecting pypinyin (from tts)
  Downloading pypinyin-0.53.0-py2.py3-none-any.whl.metadata (12 kB)
Collecting hangul-romanize (from tts)
  Downloading hangul_romanize-0.1.0-py3-non

In [1]:
from PyPDF2 import PdfReader
from pdf2image import convert_from_path
import pytesseract
import re

def extract_text_from_pdf(pdf_path):
    text = ""
    try:
        reader = PdfReader(pdf_path)
        for page in reader.pages:
            text += page.extract_text()
    except:
        images = convert_from_path(pdf_path)
        for img in images:
            text += pytesseract.image_to_string(img)
    return text

def preprocess_text(text):
    text = re.sub(r'Page \d+ of \d+', '', text)
    text = re.sub(r'\n\s*\n', '\n', text)
    return text

In [2]:
paper_text = extract_text_from_pdf("/content/Addressing_the_Productivity_Paradox_in_Healthcare_with_Retrieval_Augmented_Generative_AI_Chatbots.pdf")
clean_text = preprocess_text(paper_text)

In [4]:
!pip install gTTs
import google.generativeai as genai
import transformers
from pydub import AudioSegment
import os
from gtts import gTTS  # Using Google Text-to-Speech instead of OpenAI

# Configure Gemini
genai.configure(api_key='AIzaSyA16O7NopjXv9St4KwYB_IWwuQZBJnTzcY')

def summarize_with_gemini(text):
    model = genai.GenerativeModel('gemini-pro')
    prompt = """Summarize this research paper for a non-expert audience in a well-structured paragraph format.
    Do not use bullet points, numbering, asterisks, or bold text. Write naturally and cohesively while covering the key contributions, methodology, analysis, implications, limitations, and conclusion in a fluid and engaging manner.

    Paper: """ + text[:30000]

    response = model.generate_content(prompt)
    return response.text

def extract_key_concepts(text):
    # Using same BERT model as before
    nlp = transformers.pipeline("ner", model="dslim/bert-base-NER")
    entities = nlp(text)
    keywords = [entity["word"] for entity in entities if entity["entity"] in ["B-ORG", "B-MISC"]]
    return list(set(keywords))



Collecting gTTs
  Downloading gTTS-2.5.4-py3-none-any.whl.metadata (4.1 kB)
Downloading gTTS-2.5.4-py3-none-any.whl (29 kB)
Installing collected packages: gTTs
Successfully installed gTTs-2.5.4


In [5]:
summary = summarize_with_gemini(clean_text)
print(summary)

This paper introduces a framework for developing Retrieval Augmented Generative AI chatbots that address the productivity paradox in healthcare, a phenomenon where advancements in technology fail to improve productivity due to inherent limitations. The framework integrates advanced AI techniques such as Retrieval Augmented Generation (RAG) and Generative AI models to provide comprehensive patient summaries, diagnostic insights, and emotional assessments. Through demonstrations and examples, the authors showcase how healthcare professionals can leverage this framework to enhance their decision-making, improve patient engagement, and streamline workflow efficiencies. The paper concludes by highlighting the potential for future research and development to further enhance the framework's capabilities and address the challenges faced in healthcare.


In [6]:
def simplify_jargon(text):
    model = genai.GenerativeModel('gemini-pro')
    prompt = """Replace technical terms with simple analogies. Example:
    "Summary: This research paper is about optimizing resources in hospital departments through network slicing. The authors propose a method to assign individual network slices to each smart device in various hospital departments using a model named federated learning.

In the methodology, the network is divided into three types: Enhanced Mobile Broadband (eMBB), Ultra-Reliable Low Latency Communication (URLLC), and Massive Machine Type Communication (mMTC). All this to consider aspects such as bandwidth, data transmission speed, the number of devices supported at a time, and reliability. The method uses federated learning to ensure the privacy of patient data, keeping it within the respective department and not shared with others.

The results show that the model learns new patterns accurately (98% accuracy) with this real-time scenario-based approach leading to more efficient resource allocation. The results also show a consistent improvement in the accuracy rate and a decrease in the loss value with each round of learning, indicating the effectiveness of the model.

However, the study's limitation is that it doesn't include encryption techniques, which could further improve the safety of the model. Also, it's based on the assumption that data from all devices is periodically transmitted to a local model, which may not always be the case in actual scenarios.

In conclusion, the study proves the efficiency of federated learning and network slicing in smart healthcare facilities for optimizing resources and maintaining data privacy. The authors suggest that future research could enhance this model by integrating encryption techniques to improve patient data privacy even further." -> "Simplified text: This research paper is like a recipe for a better-run hospital using individual lanes for each smart gadget in different hospital departments, similar to having different checkout lines for different types of groceries. The recipe uses a model called federated learning, which is like a private tutor for each department, ensuring that patient data stays within that department and isn't shared with others.

The method splits the hospital's network into three types, like a highway with lanes for motorbikes, cars, and trucks, considering things like how wide the lanes are, how fast the vehicles can go, how many vehicles can fit in at a time, and how reliable the lanes are.

The outcomes show that this recipe works really well, with the model learning new routines accurately (98% accuracy) with this real-world test, leading to a smoother run hospital department. The outcomes also show a continuous increase in success rate and a decrease in errors with each learning session, showing the recipe works well.

However, the recipe's limitation is that it doesn't include a security guard (encryption techniques), which could make the model even safer. Also, it assumes that data from all gadgets is regularly sent to a local model, kind of like a department head, which may not always happen in real life.

To sum up, the study shows that the recipe using federated learning and network slicing works well in smart hospitals for making things run more smoothly and keeping patient data private. The authors suggest that future research could add a security guard to the recipe to keep patient data even safer."
    Text to simplify: """ + text

    response = model.generate_content(prompt)
    return response.text

def generate_analogy(concept):
    model = genai.GenerativeModel('gemini-pro')
    response = model.generate_content(f"Create a relatable analogy for this concept: {concept}")
    return response.text

In [7]:
def generate_video_script(summary: str, analogy: str, video_type) -> str:
    model = genai.GenerativeModel('gemini-pro')

    if video_type == "reel":
        structure = f"""
        - Hook (15 seconds): Grab attention with a surprising fact/question
        - Problem (30 seconds): Explain the research gap
        - Analogy (35 seconds): Simplify using {analogy}
        - Impact (30 seconds): Why this matters
        - Call-to-action (15 seconds)
        """
    else:
        structure = f"""
        - Intro (30s): Context + thesis
        - Methodology (90s): Non-technical explanation
        - Key Findings (90s): Visualized results
        - Real-World Example (60s): {analogy}
        - Conclusion (30s)
        """

    response = model.generate_content(
        f"Create a {video_type} script using this structure: {structure}\n"
        f"Content: {summary}"
    )
    return response.text

In [8]:
def generate_narration(script: str, output_path: str = "narration.mp3"):
    tts = gTTS(text=script, lang='en', slow=False)
    tts.save(output_path)
    return output_path

def generate_podcast_script(summary: str, analogy: str) -> str:
    model = genai.GenerativeModel('gemini-pro')
    prompt = f"""Create a podcast dialogue:
    - Host asks 5 questions
    - Expert answers using analogy: {analogy}
    - Keep answers under 120 seconds
    Content: {summary}"""

    response = model.generate_content(prompt)
    return response.text

In [9]:
def generate_podcast_audio(script: str, output: str = "podcast.mp3") -> str:
    host_lines = [line.replace("Host: ", "")
                 for line in script.split("\n") if line.startswith("Host:")]
    expert_lines = [line.replace("Expert: ", "")
                   for line in script.split("\n") if line.startswith("Expert:")]

    # Generate voices with different parameters
    generate_narration("\n".join(host_lines), "host_temp.mp3")

    # Generate expert voice with different speed
    tts = gTTS(text="\n".join(expert_lines), lang='en', slow=False)
    tts.save("expert_temp.mp3")

    host_audio = AudioSegment.from_file("host_temp.mp3")
    expert_audio = AudioSegment.from_file("expert_temp.mp3")
    combined = host_audio + AudioSegment.silent(500) + expert_audio
    combined.export(output, format="mp3")

    os.remove("host_temp.mp3")
    os.remove("expert_temp.mp3")

    return output

In [10]:
summary = summarize_with_gemini(clean_text)
print(summary)
keywords = extract_key_concepts(clean_text)
simplified_text = simplify_jargon(summary)
print(simplified_text)
analogy = generate_analogy("Generative Adversarial Networks")

# Generate YouTube script
youtube_script = generate_video_script(simplified_text, analogy, video_type="youtube")
generate_narration(youtube_script, "youtube_narration.mp3")

# Generate Reel script
reel_script = generate_video_script(simplified_text, analogy, video_type="reel")
generate_narration(reel_script, "reel_narration.mp3")

# Generate podcast
podcast_script = generate_podcast_script(summary, analogy)
generate_podcast_audio(podcast_script)

Researchers have developed a Retrieval Augmented Generative AI Chatbot framework that addresses the productivity paradox in healthcare, where the inherent limitations of Generative AI models hinder their widespread adoption and effectiveness in complex healthcare systems. This framework combines the capabilities of retrieval-based and generative models to enhance the accuracy, relevance, and contextual appropriateness of generated responses, overcoming the challenge of "AI hallucinations" and ensuring reliable and informative results. By integrating external data sources and leveraging advanced analysis modules for patient telehealth conversation summarization, disease diagnosis, and emotional assessment, this framework empowers healthcare professionals with comprehensive patient profiles and valuable insights. This framework has the potential to revolutionize healthcare service delivery by improving efficiency, enhancing patient engagement, and providing clinicians with the tools they

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


config.json:   0%|          | 0.00/829 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/433M [00:00<?, ?B/s]

Some weights of the model checkpoint at dslim/bert-base-NER were not used when initializing BertForTokenClassification: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


tokenizer_config.json:   0%|          | 0.00/59.0 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/213k [00:00<?, ?B/s]

added_tokens.json:   0%|          | 0.00/2.00 [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

Device set to use cpu


Simplified text: Imagine a supermarket where you have a self-checkout line for quick purchases and a regular checkout line for bigger shopping trips. Now, imagine a chatbot for healthcare that's like a smart assistant, helping you with your questions. Instead of just making things up (like some chatbots do), this new chatbot combines two ways of understanding:

1. It remembers and retrieves information from a big database, like a well-stocked supermarket.
2. It can also create new stuff, like a chef cooking a new dish.

This hybrid approach helps the chatbot give you more accurate, helpful, and relevant answers, kind of like having a personal shopper and a chef helping you at the same time. It uses extra information and tools to summarize your health chats, figure out what's wrong, and understand your feelings.

Just like a good healthcare professional, it gives you a complete picture and helps you understand what's going on. This means better, faster healthcare, and happier, healthier

AssertionError: No text to speak

In [11]:
print(summary)

Researchers have developed a Retrieval Augmented Generative AI Chatbot framework that addresses the productivity paradox in healthcare, where the inherent limitations of Generative AI models hinder their widespread adoption and effectiveness in complex healthcare systems. This framework combines the capabilities of retrieval-based and generative models to enhance the accuracy, relevance, and contextual appropriateness of generated responses, overcoming the challenge of "AI hallucinations" and ensuring reliable and informative results. By integrating external data sources and leveraging advanced analysis modules for patient telehealth conversation summarization, disease diagnosis, and emotional assessment, this framework empowers healthcare professionals with comprehensive patient profiles and valuable insights. This framework has the potential to revolutionize healthcare service delivery by improving efficiency, enhancing patient engagement, and providing clinicians with the tools they

In [None]:
print(simplified_text)

**Key Contributions:**

* **AI Doctor:** Created a talking robot that uses two different ways to understand patients' questions and provide helpful advice.
* **Fixing AI's Problem:** Improved the robot's responses by combining the two ways it learns.
* **New Tool for Healthcare Workers:** Gave doctors and nurses a way to see a patient's information and make better decisions.
* **Talking to Technology:** Patients can now ask questions and get answers from the robot, making it easier to get healthcare advice.
* **Saving Time:** The robot can help doctors and nurses spend less time on routine tasks, like writing summaries or checking for diseases.

**How It Works:**

* **Building Blocks:** Put together a system with different parts, like a puzzle.
* **Where It Learns:** Used patient conversations, medical records, and other information to make the robot smart.
* **Understanding Patients:** Used special techniques to figure out what patients are talking about and how they're feeling.
* **C

In [14]:
input_text = summary
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content(
    f"Create a conversation for a podcast episode between a female host and a male expert discussing the following paper content: {input_text}"
)
script = response.text.strip()

In [15]:
script

'**Host:** Welcome to the podcast, Dr. Williams. Today, we\'re discussing your exciting new research on Retrieval Augmented Generative AI Chatbots in healthcare. Can you tell us a bit about what led you and your team to pursue this project?\n\n**Dr. Williams:** Thank you for inviting me. We were motivated by the growing need for efficient and effective AI tools in healthcare. Generative AI models have shown great promise in tasks like text generation and translation. However, in healthcare, their limitations have hindered their widespread adoption. We wanted to address these challenges and create a framework that could overcome them.\n\n**Host:** Can you elaborate on the specific limitations of Generative AI models in healthcare?\n\n**Dr. Williams:** These models are prone to "AI hallucinations", where they generate false or misleading information. Additionally, their responses may lack accuracy, relevance, and contextual appropriateness. This is particularly concerning in healthcare, 

In [17]:
cleaned_script = script.replace("**", "").replace("*", "")
cleaned_script = cleaned_script.replace("[", "").replace("]", "")

# Step 2: Split the script into lines and remove extra spaces
cleaned_script = cleaned_script.strip().split("\n")
cleaned_script = [s.strip() for s in cleaned_script if s]  # Remove empty lines

# Step 3: Filter the lines to only include those starting with "Host" or "Expert"
filtered_lines = [line for line in cleaned_script if line.startswith("Host") or line.startswith("Dr. Williams")]

# Print the final filtered script
filtered_lines

["Host: Welcome to the podcast, Dr. Williams. Today, we're discussing your exciting new research on Retrieval Augmented Generative AI Chatbots in healthcare. Can you tell us a bit about what led you and your team to pursue this project?",
 'Dr. Williams: Thank you for inviting me. We were motivated by the growing need for efficient and effective AI tools in healthcare. Generative AI models have shown great promise in tasks like text generation and translation. However, in healthcare, their limitations have hindered their widespread adoption. We wanted to address these challenges and create a framework that could overcome them.',
 'Host: Can you elaborate on the specific limitations of Generative AI models in healthcare?',
 'Dr. Williams: These models are prone to "AI hallucinations", where they generate false or misleading information. Additionally, their responses may lack accuracy, relevance, and contextual appropriateness. This is particularly concerning in healthcare, where reliabl

In [18]:
generate_podcast_audio(filtered_lines)

AttributeError: 'list' object has no attribute 'split'