
import pandas as pd

Exercise 1: Comparative Analysis of Generative AI and Traditional AI


1. Automated Medical Diagnosis (Lung Cancer Detection from X-rays)

**Recommendation:** **Traditional AI (Classification Model) + Optional Generative AI for Data Augmentation**

**Justification:**

* **Accuracy & Reliability:** Traditional AI models (CNN-based classifiers) are well validated in medical imaging tasks and are preferred for high-stakes, precision-critical applications like cancer detection.
* **Generative AI Use:** Generative models (GANs, diffusion models) could assist in *data augmentation*—synthesizing additional realistic X-rays to balance datasets, especially for rare cases.
* **Ethical Concerns:** Generating synthetic medical images carries risks of introducing artifacts that could mislead the diagnostic model.
* **Computational Costs:** Generative models are heavier computationally and require rigorous validation to avoid introducing bias or noise.
* **Conclusion:** Use **traditional AI** for the actual diagnosis task, possibly supported by generative AI for enriching training datasets in a controlled manner.

---

2. Legal Document Generation (Drafting Contracts)

**Recommendation:** **Generative AI (Language Models) with Human Oversight**

**Justification:**

* **Creativity & Flexibility:** Generative AI excels in producing natural language text and can draft varied, context-sensitive documents.
* **Efficiency:** Faster than rule-based systems for creating complex, adaptive documents.
* **Risks:**

  * **Accuracy & Legal Validity:** Language models can hallucinate or generate incorrect legal phrasing without understanding binding clauses.
  * **Ethical Concerns:** Risk of liability if incorrect contracts are generated without proper human review.
* **Conclusion:** **Generative AI models** (e.g., fine-tuned LLMs) are more suitable, but strict **human-in-the-loop validation** is essential to ensure compliance and accuracy.

---

3. AI-Generated Scientific Research (Automated Literature Reviews & Hypothesis Generation)

**Recommendation:** **Generative AI**

**Justification:**

* **Scalability & Efficiency:** Generative AI models like GPT can summarize vast bodies of text quickly and suggest novel hypotheses based on pattern extraction.
* **Creativity:** Generative AI can assist in proposing unconventional research ideas.
* **Risks:**

  * **Factual Accuracy:** Risk of summarizing papers inaccurately or suggesting implausible hypotheses.
  * **Ethical Concerns:** Generated hypotheses must be rigorously vetted by human researchers to prevent propagation of misinformation.
* **Conclusion:** Generative AI is **highly useful** here, especially for summarization and ideation tasks, but outputs must be **critically reviewed by domain experts**.

---

4. Financial Market Predictions (Stock Trend Forecasting)

**Recommendation:** **Traditional AI (Statistical & Machine Learning Models)**

**Justification:**

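* **Accuracy & Predictability:** Forecasting models such as ARIMA, gradient-boosted trees (XGBoost), and LSTMs are designed for structured, sequential numerical data. ARIMA and tree-based models are also comparatively transparent and interpretable; LSTMs are less so.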
* **Generative AI Limitations:** Generative models are designed for synthesis rather than precise numerical prediction. Using them for direct forecasting lacks explainability and reliability.
* **Computational Costs:** Generative models add unnecessary complexity without clear performance gains in this domain.
* **Conclusion:** Stick with **traditional AI models** for financial forecasting due to their **proven reliability, interpretability, and efficiency**.

---

5. Autonomous Vehicle Decision-Making (Real-Time Driving Decisions)

**Recommendation:** **Traditional AI (Rule-Based + Reinforcement Learning Models)**

**Justification:**

* **Safety & Reliability:** Autonomous driving demands highly reliable, predictable, and verifiable systems. Reinforcement learning combined with rule-based safety constraints and traditional control algorithms gives more structured, auditable decision-making than free-form generation.
* **Generative AI Limitations:** Generative models are unsuitable for real-time safety-critical decisions due to unpredictability and lack of explicit constraint adherence.
* **Possible Generative Use:** Generative AI might help in simulating rare driving scenarios for training purposes, but not in direct decision-making.



Exercise 2: Ethical and Security Risks of Generative AI

1. Fake Political Videos (Deepfakes)

Dangers:

- Fake News: People could believe fake videos and vote based on false information, affecting elections.

- No Trust in Real Videos: After seeing many fakes, people might stop trusting real news.

What Could Happen:

- Elections could be unfair, people might lose trust in media, and arguments between political groups could get worse.

How to Stop This:

- Digital Watermarks: Make it a rule that all videos must show if they’re real or AI-made.

- Clear Labels: Force companies to label AI-made videos clearly online.




2. Fake People Used for Crime (Synthetic Identity Fraud)

Dangers:

- Tricking Face & Voice Scans: Criminals can create fake faces and voices that pass security checks.

- Helping Criminal Groups: Fake people are hard to track, helping gangs or terrorists.

What Could Happen:

- Banks could lose money, security systems might fail, and crime could get harder to stop.

How to Stop This:

- Better Security Checks: Use extra steps like asking for video proof that a person is real (not a picture).

- Limit AI Access: Make strict laws so only trusted companies can use powerful AI tools to create faces and voices.

3. Fake War Intelligence (Cyber Warfare)

Dangers:

- Fake Government Reports: AI could create fake reports to trick other countries.

- Nobody Trusts Reports: Countries might stop trusting real intelligence, hurting national safety.

What Could Happen:

- Wars might start by mistake, governments could get tricked, and countries might stop trusting each other.

How to Stop This:

- Double-Check Important Info: Governments should always verify information using multiple sources, including humans.

- Secret Markers: Force countries to mark official documents with hidden digital codes showing they’re real.



4. AI Making Dangerous Viruses (AI-Generated Malware)

Dangers:

- New Viruses: AI can create computer viruses that are harder to catch.

- Finding New Ways to Attack: AI can find security weaknesses faster than humans.

What Could Happen:

- Banks, hospitals, or electricity grids could be attacked without warning.

How to Stop This:

- Smart Defenses: Use AI to detect weird behaviors on systems instead of relying only on virus lists.

- Control AI Tools: Allow only trusted people or companies to use AI tools that generate computer code.

5. AI Copying Art, Music, or Books (Copyright Problems)

Dangers:

- Copying Without Permission: AI can create songs, art, or books that look like someone else’s work.

- Artists Lose Money: People might buy AI-made copies instead of real work from creators.

- Who Owns AI Work?: It’s unclear who owns AI-generated content: the AI, the company, or the user?

What Could Happen:

- Artists and writers lose money, and creativity is discouraged.

How to Stop This:

- Opt-Out Lists: Let artists choose if their work can be used to train AI models.

- Pay Creators: Make AI companies pay artists a fee when AI copies their style or work.

Exercise 3: Optimization and Fine-Tuning of Generative AI Models

1. Better Prompts (Prompt Engineering)

AI gives better answers when your question (prompt) is clear and detailed. Let’s improve these examples:

- Original: “Generate an image of a futuristic city.”

Better Prompt: “Generate a high-resolution image of a futuristic city at night, with flying cars, tall neon skyscrapers, and glowing roads.”

- Original: “Write a poem about the future.”

Better Prompt: “Write a 4-stanza poem about the future of space travel, using hopeful and inspiring language.”

- Original: “Create a song in the style of classical music.”

Better Prompt: “Write lyrics and a melody outline for a calm, instrumental song in the style of 18th-century classical music, using violins and piano.”

Why? Clear details help the AI know what style, tone, and structure you want.
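
The same idea can be sketched in code. This is a minimal illustration only: `build_prompt` is a hypothetical helper (not part of any library) that joins a short request with explicit details, mirroring the first example above.

```python
def build_prompt(request, details):
    """Join a short base request with explicit details into one prompt.

    `request` is the base instruction; `details` is a list of short phrases.
    Both names are illustrative, not taken from any library.
    """
    request = request.strip().rstrip(".")
    cleaned = [d.strip() for d in details if d.strip()]
    if not cleaned:
        # No extra details: return the request unchanged, with a final period.
        return request + "."
    return request + ", with " + ", ".join(cleaned) + "."


prompt = build_prompt(
    "Generate a high-resolution image of a futuristic city at night",
    ["flying cars", "tall neon skyscrapers", "glowing roads"],
)
print(prompt)
```

Keeping the details in a list makes it easy to add or remove one detail at a time and compare the results.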

2. Removing Bias in Training Data
If your AI learns from biased news, it will repeat those biases. To fix this:

- Use Many Sources: Gather training texts from different types of news channels, cultures, and viewpoints.

- Check Data Carefully: Remove or reduce content that shows unfair opinions or stereotypes.

- Balance Topics: Make sure each side of a topic is represented equally.

Goal: Teach the AI from fair, diverse information so it speaks more neutrally.
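
One simple way to approach the "Use Many Sources" and "Balance Topics" steps is to downsample so that each source contributes the same number of texts. The sketch below assumes the training data is a list of `(source, text)` pairs; that layout and the function name `balance_by_source` are assumptions made for illustration. Equal counts alone do not guarantee neutral content, so this would complement, not replace, a manual check.

```python
import random
from collections import defaultdict


def balance_by_source(articles, seed=0):
    """Downsample so that every source contributes the same number of texts.

    `articles` is a list of (source, text) tuples (an assumed layout).
    """
    if not articles:
        return []
    groups = defaultdict(list)
    for source, text in articles:
        groups[source].append(text)
    # The smallest group sets the number of texts kept per source.
    smallest = min(len(texts) for texts in groups.values())
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    balanced = []
    for source, texts in groups.items():
        for text in rng.sample(texts, smallest):
            balanced.append((source, text))
    return balanced


articles = [
    ("channel_a", "a1"), ("channel_a", "a2"), ("channel_a", "a3"),
    ("channel_b", "b1"),
    ("channel_c", "c1"), ("channel_c", "c2"),
]
balanced = balance_by_source(articles)
print(balanced)
```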

3. Fine-Tuning for Specific Tasks (Medical Papers Example)
A company wants the AI to write correct medical research. Steps:

- Collect Data: Gather real medical research papers, drug studies, and clinical reports.

- Clean Data: Remove mistakes, duplicates, and unrelated text. Keep only trustworthy sources.

- Transfer Learning: Start with a general AI model, then train it further using the medical papers (this is called fine-tuning).

- Check Output with Metrics: Use special tools to measure how factual and correct the AI’s writing is, like:

  - Precision and Recall: Precision checks what share of the AI’s stated facts are correct; recall checks what share of the important facts it included.

  - Human Experts: Let doctors or scientists read and approve the output.

Goal: Teach the AI to write only scientifically correct papers.
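
As a rough sketch of the precision and recall check, the example below compares a set of facts found in the AI's text with a reference set approved by experts. It assumes the facts have already been extracted and normalized into comparable strings, which is the hard part in practice and is not shown; the example facts are invented for illustration.

```python
def precision_recall(generated_facts, reference_facts):
    """Return (precision, recall) for two collections of fact strings."""
    generated = set(generated_facts)
    reference = set(reference_facts)
    correct = generated & reference  # facts the AI stated that experts confirm
    precision = len(correct) / len(generated) if generated else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Invented example facts, for illustration only.
generated = {
    "drug A lowers blood pressure",
    "drug A causes drowsiness",
    "drug A cures diabetes",  # not in the reference set
}
reference = {
    "drug A lowers blood pressure",
    "drug A causes drowsiness",
    "drug A interacts with alcohol",
    "drug A is taken daily",
}
precision, recall = precision_recall(generated, reference)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here two of the three generated facts are confirmed (precision of about 0.67), and two of the four reference facts are covered (recall of 0.5).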



4. Measuring AI Quality (Using Numbers)
AI content is often judged by people, but computers can measure it too:

- BLEU Score: Measures how much word-sequence (n-gram) overlap the AI’s text has with reference human text (mostly used for translation). Higher is better.

- FID (Fréchet Inception Distance): Compares the statistics of AI-generated images with those of real images. Lower is better (more realistic).

- Perplexity: Measures how well an AI model predicts the next word (lower perplexity = better prediction).

- BERTScore: Compares AI’s text to real text based on meaning, not just words.

Why? These scores help check AI quality without always needing human reviewers.
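
To make one of these scores concrete, perplexity can be computed as the exponential of the average negative log-probability that the model assigned to each actual next word. The sketch below assumes those per-token probabilities are already available as a list; the numbers are made up for illustration.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probabilities a model gave to the true tokens.

    Each probability must be greater than 0 (math.log fails on 0).
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


confident = perplexity([0.9, 0.8, 0.95])  # model predicted the text well
unsure = perplexity([0.2, 0.1, 0.3])      # model was often surprised
print(confident, unsure)
```

A model that gives every token a probability of 0.25 has a perplexity of 4, as if it were choosing uniformly among four options at each step.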

5. Controlling AI’s Creativity
Sometimes AI becomes too random or too boring. To control this:

- Temperature Scaling: This adjusts how “creative” the AI is.

  - Low temperature = more serious, focused answers.

  - High temperature = more creative, random results.

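- Reinforcement Learning from Human Feedback (RLHF): Reward the AI when it gives answers that people rate as good, to guide its behavior over time.

- Attention Mechanisms: Help the AI focus on important words or parts of a task, improving coherence and meaning in its outputs. These are built into the model’s architecture rather than adjusted at generation time.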

Goal: Balance creativity and logic so the AI is interesting but still makes sense.
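
Temperature scaling can be illustrated with a small sketch: the model's raw scores (logits) are divided by the temperature before being turned into probabilities with a softmax. The logits below are invented for illustration.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores into probabilities, scaled by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate words
focused = softmax_with_temperature(logits, temperature=0.5)
creative = softmax_with_temperature(logits, temperature=2.0)
print(focused)
print(creative)
```

At low temperature most of the probability goes to the top-scoring word; at high temperature the probabilities are closer together, so sampling gives more varied results.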

Exercise 4: Evaluating the Trade-offs Between GANs and VAEs

1. Making Fake MRI Scans (Medical Images)

Best Choice: GANs

Why?

- GANs are better at creating sharp and realistic images, which is important for medical training.

- Doctors and researchers need images that look real, not blurry.

- GANs help protect patient privacy by generating fake but realistic MRI scans.

Downside:

- GANs are harder to train and sometimes unstable.

2. AI Writing Short Stories (Creative Text)

Best Choice: Neither GANs nor VAEs directly

Note: GANs and VAEs work better for images. For text, we usually use models like GPT (language models).

But if choosing between VAEs and GANs:

- VAEs would be slightly better because:

  - They are good at learning structured representations (meaning they understand patterns in data).

  - This could help in generating text with some structure.

Why not GANs?

- GANs don’t handle sequences (like sentences) as naturally as images.

3. Finding Fraud in Bank Transactions (Anomaly Detection)

Best Choice: VAEs

Why?

- VAEs are better at learning what “normal” looks like (normal spending behavior).

- They can spot when something looks different or strange (which could be fraud).

- VAEs are easier to interpret and more stable during training.

Why not GANs?

- GANs focus on creating realistic examples, not spotting unusual ones.
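
A common way to turn "learning what normal looks like" into a fraud flag is to threshold the reconstruction error: transactions the model reconstructs poorly are treated as suspicious. The sketch below assumes a trained VAE has already produced one error value per transaction (the VAE itself is not shown). The mean-plus-three-standard-deviations rule and the example numbers are assumptions for illustration, not a recommended setting.

```python
import statistics


def fit_threshold(normal_errors, k=3.0):
    """Set the cutoff at mean + k standard deviations of normal errors."""
    mean = statistics.mean(normal_errors)
    std = statistics.pstdev(normal_errors)
    return mean + k * std


def flag_anomalies(errors, threshold):
    """Return True for each transaction whose error exceeds the cutoff."""
    return [e > threshold for e in errors]


# Made-up reconstruction errors from known-normal transactions.
normal_errors = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13]
threshold = fit_threshold(normal_errors)

new_errors = [0.11, 0.95, 0.12]  # the second one reconstructs badly
flags = flag_anomalies(new_errors, threshold)
print(threshold, flags)
```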

4. Making Fashion Designs

Best Choice:  GANs

Why?

- GANs make high-quality, detailed, and realistic images, perfect for fashion designs.

- They’re good at learning from pictures and creating new ones that look like professional designs.

Why not VAEs?

- VAEs often create blurry images, which wouldn’t look good for fashion products.

5. Fake Driving Images for Training Cars (Data for Self-Driving)

Best Choice: GANs

Why?

- Self-driving cars need realistic road images to learn well.

- GANs create more lifelike scenes (roads, cars, weather conditions).

- Real-looking images help the car’s AI learn faster and better.

Why not VAEs?

- VAEs might create simple or blurry scenes that aren’t useful for training.



Exercise 5: Advanced Latent Space Exploration in VAEs

1. Seeing Latent Space (Visualizing Distributions)

Imagine your VAE learned to generate digits like ‘3’ and ‘8’. How do you check if it's organizing things properly inside?

What to Do:

Look at how different digits are placed in the hidden (latent) space — like mapping them on a 2D or 3D graph.

How to Do It:

Use t-SNE or UMAP: These are techniques that help reduce the hidden space to 2D for easy viewing.

Then, color-code each point by the digit (e.g., all ‘3’s in red, ‘8’s in blue).

If the model worked well, each digit should form its own separate cluster.

2. Mixing Two Digits (Interpolation Between Samples)
Goal: Turn a ‘3’ slowly into an ‘8’ in steps.

How?

Take digit ‘3’ and digit ‘8’.

Send both through the encoder to get their positions in the hidden space (latent vectors).

Move step-by-step between these two points (like drawing a straight line between them).

Decode each step to see the images slowly change from ‘3’ to ‘8’.

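Why this is easier with VAEs than with GANs:

VAEs come with an encoder and are trained to keep their hidden space smooth, so nearby points decode to similar images.

A standard GAN has no encoder, so there is no direct way to find the latent vectors of a real ‘3’ and a real ‘8’. GANs can interpolate between randomly sampled latent vectors, but starting from real images needs an extra inversion step.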
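
The "move step-by-step between these two points" part can be sketched as plain linear interpolation. The example assumes the encoder has already produced the two latent vectors; `z_three` and `z_eight` are made-up two-dimensional values, and the encoder and decoder are not shown.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` evenly spaced latent vectors from z_start to z_end."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(z_start) != len(z_end):
        raise ValueError("latent vectors must have the same length")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # goes from 0.0 to 1.0
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


z_three = [0.0, 2.0]   # hypothetical latent vector for a '3'
z_eight = [4.0, -2.0]  # hypothetical latent vector for an '8'
path = interpolate(z_three, z_eight, steps=5)
for z in path:
    print(z)  # each vector would be passed to the decoder
```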

3. Controlling Variety (KL Divergence Term)

What is KL Divergence?

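It measures how far the encoder’s latent distribution for each input is from a standard normal prior. Its weight in the loss controls how strongly the latent space is pulled toward that prior.

If you increase KL divergence weight:

Latent space stays more organized.

Less variety: images look more similar and more consistent, but fine details tend to blur.

If you decrease KL divergence weight:

Latent space spreads out.

More variety: different kinds of images and sharper reconstructions, but more risk of messy or unrealistic results when sampling new images.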

Real-World Example:
In fashion design, you might want high variety to create many different styles. But in medical images, you want less variety to generate consistent, reliable scans.
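
For reference, here is a minimal sketch of how the KL term and its weight enter the loss. It assumes the usual setup where the encoder outputs a mean and a log-variance per latent dimension and the prior is a standard normal; the weight `kl_weight` is what the β-VAE formulation calls β. The input numbers are made up.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, diag(exp(log_var))) and N(0, I)."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def vae_loss(reconstruction_loss, mu, log_var, kl_weight=1.0):
    """Total loss: reconstruction term plus the weighted KL term."""
    return reconstruction_loss + kl_weight * kl_to_standard_normal(mu, log_var)


mu, log_var = [1.0], [0.0]  # made-up encoder outputs for one input
print(kl_to_standard_normal(mu, log_var))
print(vae_loss(2.0, mu, log_var, kl_weight=1.0))
print(vae_loss(2.0, mu, log_var, kl_weight=4.0))
```

With a larger `kl_weight`, the same distance from the prior costs more, so training pushes the encodings closer to the standard normal.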

4. Changing Just One Feature (Disentangled Latent Space)

In a well-disentangled VAE, each dimension of the hidden vector ideally controls a different feature (e.g., hair color, smile, age).

How to change just one feature:

Find which dimension in the latent vector controls hair color.

Change only that number (for example, add +1 or -1).

Keep other dimensions the same.

Decode the vector — the person’s hair color changes, but their face stays the same.

This works when the latent space is disentangled; stepping through one dimension like this is often called a latent traversal.
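
The steps for changing one feature can be sketched as follows. The latent vector and the choice of dimension 1 as the "hair color" dimension are made up for illustration; in practice the relevant dimension has to be found by experiment, and each resulting vector would be passed to the decoder (not shown).

```python
def traverse_dimension(z, dim, offsets):
    """Return copies of z where only dimension `dim` is shifted."""
    if not 0 <= dim < len(z):
        raise IndexError("dim is outside the latent vector")
    variants = []
    for offset in offsets:
        changed = list(z)       # copy so the original vector is untouched
        changed[dim] += offset  # change only the chosen dimension
        variants.append(changed)
    return variants


z = [0.2, -1.0, 0.5]  # hypothetical latent vector for one face
variants = traverse_dimension(z, dim=1, offsets=[-1.0, 0.0, 1.0])
for v in variants:
    print(v)
```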