# Foundation models (Optional Content)

## Introduction to Foundation Models

#### Introduction
- New module on foundation models and generative AI in healthcare.
- Focus on the impact of foundation models like ChatGPT and DALL-E.

#### Key Concepts
1. **Foundation Models**:
   - Foundation models, a term popularized by Stanford researchers in 2021, are AI models that:
     - Learn from massive amounts of unlabeled data.
     - Use self-supervised learning, where the training signal comes from the data itself rather than from human labels.
     - Can be adapted to many tasks, with sample efficiency improving as data size and parameter count increase.

2. **Model Size**:
   - "Large" in large language models refers to the number of parameters:
     - GPT: 117 million
     - GPT-2: 1.5 billion
     - GPT-3: 175 billion (trained on text drawn from a corpus of roughly 500 billion tokens).
   - Similar architectures but with increasing size and data.

3. **Learning Techniques**:
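   - **Few-shot Learning**: The model adapts to a new task from a small number of examples and generalizes well.
   - **Zero-shot Learning**: The model performs a task it was never explicitly trained on, using only its general knowledge and a description of the task.
   - Both build on **Transfer Learning**: knowledge gained during pretraining is reused for new tasks.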
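The parameter counts listed under "Model Size" can be roughly reproduced from each model's reported layer count and hidden width. The sketch below assumes the common rule of thumb of about 12 × layers × width² parameters for the transformer blocks, plus a token-embedding table. The function name `approx_params` is hypothetical, and the configurations are the ones reported for GPT and for the largest GPT-2 and GPT-3 variants.

```python
def approx_params(n_layers, d_model, vocab_size):
    """Rough decoder-only transformer parameter count.

    Each block has about 4*d^2 attention weights and 8*d^2 feed-forward
    weights (assuming a 4x hidden expansion). Biases, layer norms and
    position embeddings are ignored, so this is only an estimate.
    """
    block_params = 12 * n_layers * d_model ** 2
    embedding_params = vocab_size * d_model
    return block_params + embedding_params


# (layers, hidden width, vocabulary size) as reported for each model
configs = {
    "GPT": (12, 768, 40478),
    "GPT-2": (48, 1600, 50257),
    "GPT-3": (96, 12288, 50257),
}

for name, cfg in configs.items():
    print(f"{name}: ~{approx_params(*cfg) / 1e9:.2f} billion parameters")
```

The estimates land within a few percent of the published figures, which illustrates that the three models differ mainly in depth and width rather than in basic architecture.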

#### Applications in Healthcare
- Foundation models can handle multiple data types (multimodal) and tasks, such as:
  - Chatbots
  - Language-based search engines
  - Image creation
  - Automated content generation.
- Benefits include:
  - Efficient data processing and understanding.
  - Improved patient care and outcomes.
  - Cost-effective model development and maintenance.

#### Analogy
- Chef analogy for understanding:
  - Chef trained broadly can quickly adapt to new recipes (few-shot learning).
  - Chef can recreate unfamiliar dishes based on general cooking knowledge (zero-shot learning).

#### Potential Impact
- Foundation models can transform healthcare with applications in:
  - Automated screening
  - Diagnosis
  - Drug design.
- Enhancements include better communication between healthcare professionals and AI systems, leading to:
  - Efficient diagnoses and treatments.
  - Improved workflow and collaboration.

#### Conclusion
- Foundation models can help close gaps in healthcare AI, with the aim of improving clinical outcomes and patients' lives.
- Healthcare stakeholders need to understand and use foundation models, which is why the course is designed to be accessible to all backgrounds.

#### Key Takeaway
- Foundation models are reshaping AI applications, particularly in healthcare, with their adaptability, efficiency, and multimodal capabilities.

## Adapting to Technology

#### Exponential Growth in Technology
- The rapid advancements in AI, particularly foundation models, represent a shift from academic concepts to transformative forces in industries.
- Exponential phenomena are often counterintuitive in a linear world, making it difficult for humans to adapt and plan.

#### The Chessboard Analogy
- The story of the chessboard illustrates exponential growth:
  - An inventor requests rice grains doubling on each square of a chessboard, highlighting how quickly quantities can escalate.
  - By the 64th square, the running total reaches 2^64 - 1 grains (about 1.8 × 10^19), far more rice than exists in the world, which shows how easily the impact of exponential growth is underestimated.
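The arithmetic behind the chessboard story is easy to check. This minimal sketch assumes the usual telling, in which the first square holds one grain and each later square doubles the previous one. The function names are illustrative.

```python
def grains_on_square(square):
    """Grains on a given square (1-indexed), starting from a single grain."""
    return 2 ** (square - 1)


def total_grains(squares=64):
    """Running total of grains over the first `squares` squares."""
    return sum(grains_on_square(s) for s in range(1, squares + 1))


first_half = total_grains(32)
whole_board = total_grains(64)

print(first_half)                 # 4294967295
print(whole_board)                # 18446744073709551615
print(whole_board // first_half)  # 4294967297
```

The first half of the board holds about 4.3 billion grains. The whole board holds over four billion times that amount, which is why the second half is where intuition fails.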

#### Data and Computing Power
- The growth of data and computational power fuels AI development:
  - Healthcare data is growing exponentially and the doubling time of healthcare knowledge is shrinking, which strains healthcare systems and professionals.
  
#### Moore's Law
- Gordon Moore’s observation states that the number of transistors on microchips doubles roughly every two years, driving rapid technological advancement.
- Example:
  - In 2000: ~2 million transistors per chip.
  - In 2020: ~100 billion transistors, a 50,000-fold increase.
  - By 2040: projected ~1 trillion transistors, a 500,000-fold increase over 40 years.
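A doubling rule translates directly into a growth factor. The sketch below computes what a strict two-year doubling yields over 20 and 40 years. It also takes the 2000 and 2020 example figures above at face value and works out the doubling time they would imply. Both helper names are hypothetical.

```python
import math


def growth_factor(years, doubling_time_years=2.0):
    """Multiplicative growth after `years` with a fixed doubling time."""
    return 2 ** (years / doubling_time_years)


def implied_doubling_time(start_count, end_count, years):
    """Doubling time (in years) implied by growth from start_count to end_count."""
    return years / math.log2(end_count / start_count)


print(growth_factor(20))  # 1024.0
print(growth_factor(40))  # 1048576.0

# Using the example figures above: ~2 million (2000) to ~100 billion (2020)
print(round(implied_doubling_time(2e6, 100e9, 20), 2))  # 1.28
```

A strict two-year doubling gives roughly a thousand-fold increase over 20 years, so the example figures correspond to a faster pace of about one doubling every 1.3 years. In both cases the growth is exponential and compounds quickly.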

#### Challenges and Implications
- The exponential growth presents challenges for planning, governance, and ethical considerations:
  - Society struggles to anticipate the implications of rapid advancements.
  - The need for responsible innovation and ethical frameworks is paramount.

#### Foundation Models and Their Tipping Point
- Foundation models may represent a tipping point in AI, paralleling the chessboard's second half.
- Critical questions arise about their impact on healthcare and society.

#### Best Practices for Adaptation
1. **Invest in R&D**: To prepare for the impact and leverage opportunities from exponential technologies.
2. **Create Regulatory Frameworks**: Governments should ensure responsible development and usage of technologies, holding parties accountable.
3. **Encourage Ethical Considerations**: Stakeholders must consider the social implications of their work.
4. **Foster Cross-disciplinary Collaboration**: Diverse expertise is essential for addressing the challenges and opportunities of these technologies.
5. **Prioritize Education**: Educating all stakeholders about AI and its implications is crucial for informed decision-making.

#### Conclusion
- Understanding and preparing for exponential technologies like AI is vital as they continue to reshape industries, especially healthcare.

## General AI and Emergent Behavior


#### Transformative Potential of Foundation Models
- Foundation models have sparked discussions about their societal impact, similar to past sentiments about computers.
- These models have demonstrated impressive capabilities, such as passing rigorous tests in law, business, and medicine, leading to debates about general AI.

#### Understanding General AI
- General AI is interpreted in various ways, often surrounded by hype and confusion.
- Foundation models exemplify a significant advancement in knowledge-based tasks and can be viewed as a form of general AI.

#### Learning Mechanisms
- Foundation models learn from large unlabeled datasets through self-supervision, which makes transfer learning efficient, including few-shot and zero-shot use on new tasks.

#### Emergent Behaviors
- Emergent behaviors arise as foundation models, with increasing parameters and data, exhibit unexpected capabilities not explicitly programmed:
  - These behaviors can be beneficial (e.g., discovering new patterns) or problematic (e.g., biased outputs).
- Understanding how these behaviors emerge is crucial for effectively working with foundation models.

#### Analogy of Flocking Birds
- Similar to how individual birds coordinate in a flock, foundation models find patterns through complex interrelationships within massive datasets, leading to new capabilities.

#### Healthcare Applications
- In healthcare, emergent behaviors can uncover connections between symptoms, conditions, and treatments, potentially improving diagnostics and patient outcomes.
- They also hold promise for drug discovery by analyzing diverse data sources to identify new drug targets and interactions.

#### Challenges of Emergent Behaviors
- Understanding the complex relationships revealed by foundation models can be difficult, raising issues of transparency and reliability.
- The "black box" phenomenon complicates understanding how models arrive at conclusions.

#### Hallucinations
- Hallucinations refer to false or unrealistic outputs generated by the model, often when the input falls outside the training data.
- In healthcare, this can result in incorrect recommendations or treatment suggestions, posing risks to patient safety.

#### Addressing Challenges
- To mitigate hallucinations, it's essential to:
  - Train models on diverse, representative datasets.
  - Validate predictions through rigorous testing.
  - Enhance model transparency and interpretability.
  
- An innovative approach involves using AI to monitor AI, comparing outputs to validated sources to identify inaccuracies.
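One simple form of comparing outputs to validated sources is to check every reference a model cites against a curated index. The sketch below is only an illustration of that idea: the function name, the `REF-...` identifiers, and the index contents are all hypothetical, and a real monitor would query a trusted literature or knowledge base.

```python
def flag_unverified(cited_refs, validated_index):
    """Return the cited references that are absent from the validated index."""
    normalized_index = {ref.strip().lower() for ref in validated_index}
    return [ref for ref in cited_refs if ref.strip().lower() not in normalized_index]


validated_index = {"REF-001", "REF-002", "REF-003"}  # hypothetical curated source list
model_citations = ["REF-002", "REF-999"]             # hypothetical model output

print(flag_unverified(model_citations, validated_index))  # ['REF-999']
```

Anything flagged this way would go to a human reviewer rather than being silently dropped, since an unmatched reference may be a hallucination or simply missing from the index.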

#### Conclusion
- The challenges posed by foundation models and emergent behaviors require ongoing research and focus on specific domains like healthcare.
- Although these problems are complex, strategies such as AI monitoring can help address false outputs, paving the way for responsible and effective use of foundation models in real-world applications.

## How Foundation Models Work

### Key Steps in Training a Foundation Model

1. **Data Collection**:
   - Large volumes of text data are gathered from various sources (books, articles, websites) to create a diverse training set.

2. **Preprocessing**:
   - The text is tokenized, breaking it down into words or sub-words. This step is crucial as it transforms raw text into a format that the model can process.
   - Additional preprocessing steps may include cleaning the data, removing irrelevant content, and normalizing text.

3. **Model Architecture Design**:
   - The backbone of most LLMs is the **transformer architecture**, known for its self-attention mechanism that allows the model to weigh the importance of different words in a sentence.
   - There are two primary types of transformer architectures:
     - **Transformer Encoder**: Processes input sequences and captures semantic meaning.
     - **Transformer Decoder**: Generates output sequences one token at a time, conditioned on the tokens produced so far and on any encoded input.

4. **Training**:
   - LLMs typically use **self-supervised learning**. For instance, BERT employs masked language modeling where random tokens are masked, and the model learns to predict them using surrounding context.
   - The GPT series predicts the next word in a sequence based on previous words, training the model through exposure to vast text corpora.

5. **Evaluation**:
   - After training, the model is evaluated on specific tasks to assess its performance and fine-tune it if necessary.
   - Evaluation may involve metrics like perplexity for language models or specific benchmarks in the case of downstream tasks.
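The steps above can be sketched end to end in a few lines of plain Python. This is a toy illustration, not how production models are built: the whitespace tokenizer stands in for sub-word tokenization, the attention function works on tiny hand-written vectors, and all function names are assumptions made for this example.

```python
import math
import random


def tokenize(text):
    """Whitespace tokenizer; real models use sub-word schemes such as BPE."""
    return text.lower().split()


def next_token_pairs(tokens):
    """GPT-style objective: predict token i from the tokens before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def mask_tokens(tokens, rate, rng):
    """BERT-style objective: hide some tokens and keep them as targets."""
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < rate:
            masked.append("[MASK]")
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def perplexity(token_probs):
    """exp of the average negative log-probability given to the true tokens."""
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))


tokens = tokenize("The patient reports chest pain")
pairs = next_token_pairs(tokens)
masked, targets = mask_tokens(tokens, rate=0.3, rng=random.Random(0))
attended = self_attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])

print(pairs[0])                # (['the'], 'patient')
print(masked, targets)         # which tokens are hidden depends on the random seed
print(perplexity([0.25] * 4))  # about 4: the model is as unsure as a 4-way guess
```

Lower perplexity means the model assigns higher probability to the text it actually sees, which is why it is a natural evaluation metric for next-token predictors.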

### Prompt Engineering

- **Prompts**: The input given to the model to generate output. It can be a question, instruction, or statement.
- **Techniques**:
  - **Instruction Prompt**: A direct instruction or question (e.g., "List the symptoms of flu.").
  - **Role Assignment**: Assigning a role (e.g., "You are a doctor.").
  - **Few-Shot Prompting**: Providing examples to guide the model’s output.
  - **Chain of Thought Prompting**: Encouraging the model to explain its reasoning process step-by-step.
  - **Self-Consistency**: Generating multiple responses and choosing the most frequent answer.
  - **Generated Knowledge**: Having the model generate relevant facts first, then integrating them into the final response.
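Several of these techniques amount to string assembly plus a voting step, so they can be sketched without calling any model. In the example below, `build_prompt` and `self_consistent_answer` are hypothetical helpers, the "Q:/A:" layout is just one possible convention, and the sampled answers are hard-coded stand-ins for repeated model calls.

```python
from collections import Counter


def build_prompt(question, role=None, examples=(), chain_of_thought=False):
    """Combine role assignment, few-shot examples and a chain-of-thought cue."""
    parts = []
    if role:
        parts.append(f"You are {role}.")
    for example_question, example_answer in examples:
        parts.append(f"Q: {example_question}\nA: {example_answer}")
    cue = " Let's think step by step." if chain_of_thought else ""
    parts.append(f"Q: {question}\nA:{cue}")
    return "\n\n".join(parts)


def self_consistent_answer(sampled_answers):
    """Self-consistency: return the most frequent answer after normalization."""
    normalized = [answer.strip().lower() for answer in sampled_answers]
    return Counter(normalized).most_common(1)[0][0]


prompt = build_prompt(
    "What are the symptoms of flu?",
    role="a doctor",
    examples=[("What are the symptoms of a cold?", "Runny nose, sore throat, cough.")],
    chain_of_thought=True,
)
print(prompt)

# Hypothetical answers from sampling the same prompt three times
print(self_consistent_answer(["Influenza", "influenza ", "Common cold"]))  # influenza
```

With zero examples the same helper produces a zero-shot prompt, which makes the difference between zero-shot and few-shot prompting concrete.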

### Multimodal Foundation Models

- These models process different data types, such as text and images, using similar transformer architectures.
- **Examples**:
  - **DALL·E**: Generates images from textual descriptions by mapping text to visual features and decoding them into images.
  - **Whisper**: A model for speech recognition that encodes audio into vectors for processing.

### Applications in Healthcare

Understanding how these models work, particularly in generating coherent and contextually relevant information, can be transformative in healthcare applications, such as patient management and diagnostic support. However, it's crucial to address limitations like potential biases in the data and the need for further alignment with human reasoning.


## Healthcare Use Cases for Text Data

1. **Performance of ChatGPT**:
   - ChatGPT performed at or near the passing threshold on USMLE questions without specialized medical training.
   - Highlights potential and limitations of LLMs in healthcare.
   - Outperforms smaller domain-specific models; useful for patient education and clinical question answering.

2. **Integration into Medical Training**:
   - Licensing exams may need to account for LLM use so that they reflect real-world clinical practice.
   - Importance for healthcare professionals to understand LLMs' benefits and limitations.

3. **Caution in Clinical Use**:
   - Vigilance required in reviewing LLM outputs due to potential inaccuracies and fabrications.
   - Importance of verifying medical literature references generated by LLMs.

4. **Clerical Task Automation**:
   - LLMs can automate scheduling, triaging patient requests, and improving inbox management.
   - Helps reduce clinician burnout and enhances job satisfaction.

5. **Data Processing and NLP Tasks**:
   - **Tokenization**: Breaking down clinical text into manageable pieces for analysis.
   - **Named Entity Recognition**: Identifying entities like drugs and diseases.
   - **Negation Detection**: Determining whether a mentioned finding is affirmed or negated (e.g., "no renal cell carcinoma present").
   - **Relation Extraction**: Identifying connections between entities (e.g., tests and conditions).
   - **De-identification**: Masking patient information for privacy.

6. **Foundation Models in Data Science**:
   - Capable of performing NLP tasks with minimal training data.
   - Enhance generalization across various clinical notes and health systems.

7. **Advanced Applications**:
   - **Clinical Decision Support**: Suggest treatment options, drug interactions, and best practices.
   - **Clinical Trial Recruitment**: Assess patient eligibility and improve communication about trials.
   - **Patient Communication**: Answering queries, generating reminders, and translating medical jargon.
   - **Billing and Coding**: Assisting in accurate medical coding for billing purposes.
   - **Public Health**: Monitoring outbreaks through data integration and analysis.

8. **Genomic Applications**:
   - Processing genomic data in text formats (e.g., FASTA).
   - Identifying disease-related patterns and relationships through multimodal data analysis.
   - Pharmacogenomics: Predicting how genetic variation affects therapy response and side effects.

9. **Drug Discovery Enhancements**:
   - Foundation models aid in virtual screening, lead optimization, toxicity prediction, and mechanism of action prediction.
   - Streamlines drug development processes and improves accuracy of predictions.

10. **Future Implications**:
    - The role of foundation models in healthcare is rapidly expanding.
    - Potential to revolutionize patient care, clinical processes, and drug discovery.
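To make the NLP tasks in item 5 concrete, here is a rule-based baseline for named entity recognition, negation detection, and de-identification. It is a deliberately simple sketch of what each task produces, not a description of how foundation models do it. The lexicons, negation cues, and regular expressions are hypothetical and far from complete.

```python
import re

# Hypothetical mini-lexicons; clinical systems use large curated vocabularies.
DRUGS = {"metformin", "aspirin"}
DISEASES = {"renal cell carcinoma", "diabetes"}
NEGATION_CUES = re.compile(r"\b(no|not|denies|without|negative for)\b")


def find_entities(text):
    """Named entity recognition by dictionary lookup."""
    lowered = text.lower()
    found = []
    for label, lexicon in (("DRUG", DRUGS), ("DISEASE", DISEASES)):
        for term in sorted(lexicon):
            if term in lowered:
                found.append((term, label))
    return found


def is_negated(text, term):
    """Negation detection: look for a cue earlier in the same sentence."""
    lowered = text.lower()
    position = lowered.find(term)
    if position == -1:
        return False
    sentence_start = lowered.rfind(".", 0, position) + 1
    return bool(NEGATION_CUES.search(lowered[sentence_start:position]))


def deidentify(text):
    """De-identification: mask record numbers and dates written as NN/NN/NNNN."""
    text = re.sub(r"\bMRN[:\s]*\d+\b", "[MRN]", text, flags=re.IGNORECASE)
    return re.sub(r"\b\d{2}/\d{2}/\d{4}\b", "[DATE]", text)


note = "No renal cell carcinoma present. Continue metformin."
for term, label in find_entities(note):
    print(term, label, "negated" if is_negated(note, term) else "affirmed")

print(deidentify("MRN: 123456 seen on 01/02/2023"))  # [MRN] seen on [DATE]
```

Rules like these break on phrasing they were not written for, which is the generalization gap that item 6 says foundation models help close.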


## Healthcare Use Cases for Non-textual Unstructured Data

1. **Broad Capabilities of Foundation Models**:
   - Large language models (LLMs) have demonstrated effectiveness in healthcare tasks without modification, particularly in low-risk, repetitive clinical roles such as data management and administrative tasks.

2. **Multi-Modal Learning**:
   - Foundation models can process various data types (e.g., text, images, sound) and learn relationships across modalities, enhancing the extraction of valuable insights from unstructured healthcare data.
   - For instance, image-text pairing during training helps models understand relationships between medical images (e.g., chest CTs) and accompanying textual descriptions.

3. **Advancements in Medical Imaging**:
   - Foundation models have the potential to reduce the need for numerous narrow models in medical imaging, improving efficiency in tasks like image quantification, detection, and risk prediction.
   - Radiology, particularly neuro and chest radiology, has seen significant AI adoption, with radiologists expressing satisfaction with AI's added value to patient care.

4. **Comprehensive Cognitive Tasks**:
   - Radiologists' work involves comparing current exams with previous ones, synthesizing patient context, and making treatment recommendations—tasks that require understanding beyond just image interpretation.
   - Foundation models could bridge this gap by integrating multiple data sources, allowing for a more holistic approach to diagnosis.

5. **Enhanced Interpretations**:
   - By combining image data with relevant clinical text, foundation models provide nuanced interpretations of medical images that consider the patient's medical history and treatment context.
   - This broader understanding could lead to new AI models capable of extracting insights that human experts might miss.

6. **Emerging Applications**:
   - Early studies suggest AI-derived measures from imaging can predict adverse events, facilitating screenings for various conditions like osteoporosis and cardiovascular disease.
   - Multimodal foundation models may incorporate genomic and digital pathology data to discover new clinical patterns.

7. **Streamlined Model Development**:
   - Transfer learning allows for the rapid development of specific models with less labeled data, accelerating the advancement of medical imaging AI tools.
   - These models can assist in tasks such as data preprocessing, augmentation, and generating synthetic imaging data.

8. **Voice-Text Applications**:
   - Foundation models can analyze voice data for disease prediction (e.g., detecting Parkinson's symptoms) and assist individuals with speech impairments.
   - They also support virtual medical assistants and chatbots in mental health applications, enhancing access to care and information.

9. **Augmenting Human Decision-Making**:
   - By handling recall of established knowledge (comprehension knowledge), foundation models can free people to focus on fluid reasoning, augmenting real-time problem-solving and decision-making.
   - They are tools to support informed decision-making, not replacements for human judgment.

10. **Caveats and Challenges**:
    - Despite their capabilities, foundation models are subject to the "no free lunch" theorem: no single model is best for every task, so trade-offs and challenges remain in their implementation.
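The image-text pairing described in item 2 is often framed as placing images and their descriptions close together in a shared embedding space. The sketch below assumes such embeddings already exist and shows only the matching step. The three-dimensional vectors, the captions, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_vec, caption_vecs):
    """Pick the caption whose embedding is most similar to the image embedding."""
    return max(caption_vecs, key=lambda name: cosine_similarity(image_vec, caption_vecs[name]))


# Hypothetical 3-dimensional embeddings; real encoders output far more dimensions.
image_embedding = [0.9, 0.1, 0.2]
caption_embeddings = {
    "chest CT with pulmonary nodule": [0.8, 0.2, 0.1],
    "normal brain MRI": [0.1, 0.9, 0.3],
}

print(best_caption(image_embedding, caption_embeddings))  # chest CT with pulmonary nodule
```

Training on paired data pushes matching image and text vectors together and mismatched ones apart, so a lookup like this can retrieve the description that fits a new image.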


## Challenges and Pitfalls

This section covers the intersection of technology and healthcare and the need for technical literacy among medical professionals. As automation and technology progress rapidly, clinicians need to understand not only medicine but also how to use advanced technologies such as AI and machine learning effectively.

Key points include:

1. **Technical Literacy**: Essential for navigating the growing complexity of healthcare technologies, especially as the doubling time of medical information keeps shrinking.

2. **Challenges of Machine Learning**: Issues such as bias, automation bias, and the risks of incorrect outputs pose significant challenges. Misleading outputs can lead to dangerous situations in patient care.

3. **Foundation Models**: While powerful, they require careful governance to prevent issues stemming from training biases and incorrect human feedback. The potential for misinformation highlights the need for medical expertise in model development.

4. **Resource Imbalances**: The high costs of developing foundation models raise concerns about disparities in access and quality of healthcare technologies across different populations.

5. **Deployment Issues**: Integrating machine learning models into healthcare is complicated by fragmented IT systems, data silos, and strict privacy regulations.

6. **Post-Deployment Monitoring**: Ongoing monitoring is critical to address model drift and ensure that models continue to perform effectively over time. 

7. **Opportunities for Improvement**: Despite challenges, the integration of technology in healthcare offers substantial potential for enhancing patient outcomes.
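The post-deployment monitoring in point 6 can be as simple as tracking accuracy on recent cases and comparing it with the level measured at validation time. The sketch below is one minimal way to do that. The `DriftMonitor` class, its window size, and its tolerance are assumptions for illustration, and real deployments would also watch input distributions and subgroup performance.

```python
from collections import deque


class DriftMonitor:
    """Flag drift when rolling accuracy falls well below a baseline."""

    def __init__(self, baseline_accuracy, window=100, tolerance=0.05):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # keeps only the most recent cases

    def record(self, prediction, actual):
        """Store whether the latest prediction matched the confirmed outcome."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def drift_detected(self):
        accuracy = self.rolling_accuracy()
        return accuracy is not None and accuracy < self.baseline - self.tolerance


monitor = DriftMonitor(baseline_accuracy=0.90, window=10)
for prediction, actual in [(1, 1)] * 7 + [(1, 0)] * 3:
    monitor.record(prediction, actual)

print(monitor.rolling_accuracy())  # 0.7
print(monitor.drift_detected())    # True
```

An alert like this is a trigger for human review and possible retraining, not an automatic fix.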

By fostering technical skills and understanding the complexities of AI in healthcare, professionals can actively contribute to a patient-centered future while navigating the associated risks.