## Part 1: Theoretical Analysis (30%)
## **Group Members**
1. Gathigi Moses Muiruri - gathimoses@gmail.com
2. Odongo Isaiah - odongoreagan19@gmail.com
3. Keren Hapuch Ntinyari - karenhapuch68@gmail.com
4. Jebichii Joyce - jebichiijoyce@gmail.com
5. Palpable Smart - palpable237@gmail.com
## Task 1
### Q1: How AI-Driven Code Generation Tools Reduce Development Time — and Their Limitations

**Time-Saving Benefits:**
- **Contextual Suggestions:** Tools like GitHub Copilot provide relevant code based on current context, reducing manual effort.
- **Faster Prototyping:** Developers can quickly scaffold and iterate on ideas.
- **Learning Support:** New developers benefit from seeing idiomatic code examples in real time.

**Limitations:**
- **Security & Quality Risks:** AI-generated code may contain vulnerabilities or inefficient logic.
- **Context Blindness:** Lacks full understanding of project-specific goals or constraints.
- **Inherited Bias:** May replicate outdated or non-compliant code from public datasets.

---

### Q2: Supervised vs. Unsupervised Learning for Automated Bug Detection

| Feature                  | Supervised Learning                                         | Unsupervised Learning                                       |
|--------------------------|--------------------------------------------------------------|--------------------------------------------------------------|
| **Data Input**           | Labeled (buggy vs. clean code)                              | Unlabeled code samples                                       |
| **Common Techniques**    | Classification using decision trees, SVMs, or neural nets   | Clustering, anomaly detection (e.g., K-Means, Isolation Forests) |
| **Detects**              | Known bug patterns                                           | Anomalous or novel issues                                    |
| **Usefulness**           | High precision with known data                              | Good for discovering unknown or rare bugs                    |
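
As a rough illustration of the unsupervised column, the sketch below flags functions whose metric deviates strongly from the rest, without using any bug labels. It substitutes a simple z-score rule for the clustering and Isolation Forest techniques named in the table. The metric (cyclomatic complexity), the function names, the values, and the threshold are all hypothetical.

```python
from statistics import mean, pstdev


def flag_anomalies(metrics, threshold=2.0):
    """Return names whose value lies more than `threshold` standard deviations from the mean."""
    values = list(metrics.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: nothing stands out.
        return []
    return [name for name, value in metrics.items() if abs(value - mu) / sigma > threshold]


# Hypothetical per-function complexity scores (no labels needed).
complexity = {
    "parse_config": 4,
    "load_user": 5,
    "render_page": 6,
    "save_order": 5,
    "send_email": 4,
    "format_date": 7,
    "check_token": 5,
    "legacy_billing": 48,
}

suspects = flag_anomalies(complexity)
print(suspects)
```

A supervised detector would instead need examples already labeled as buggy or clean, which is why it is limited to known bug patterns.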

---

### Q3: Why Bias Mitigation Is Critical in AI-Driven UX Personalization

- **Fairness:** Prevents discrimination or exclusion of minority user groups.
- **Trust Building:** Users are more likely to engage with systems that treat them equitably.
- **Regulatory Compliance:** Mitigates legal risks under privacy and anti-bias laws.
- **Ethical Design:** Ensures inclusive access and diverse representation in personalized content.



## Task 2: Case Study Analysis

### How AIOps Improves Software Deployment Efficiency

AIOps (Artificial Intelligence for IT Operations) leverages machine learning, predictive analytics, and automation to streamline the deployment process. It enhances both speed and reliability by minimizing manual intervention and proactively managing issues.

---

### Example 1: Predictive Deployment and Automated Rollbacks

AIOps platforms analyze historical deployment patterns and system behavior to forecast potential issues **before** code reaches production.  
- If anomalies or performance degradation are detected, AIOps can **automatically trigger rollbacks**, ensuring service continuity.
- _Example:_ AI-driven observability tools integrated with Kubernetes can detect failures post-release and revert to a previous stable version.
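
The rollback decision can be sketched as a comparison between post-release error rates and a historical baseline. This is a minimal sketch, assuming a simple "mean plus three standard deviations" rule rather than any particular AIOps product's model; the function name, the sample rates, and the threshold are hypothetical.

```python
from statistics import mean, pstdev


def should_roll_back(baseline_error_rates, post_deploy_error_rates, sigmas=3.0):
    """Return True when the post-deploy error rate exceeds the baseline by `sigmas` deviations."""
    limit = mean(baseline_error_rates) + sigmas * pstdev(baseline_error_rates)
    return mean(post_deploy_error_rates) > limit


# Hypothetical error rates (fraction of failed requests per minute).
baseline = [0.010, 0.012, 0.011, 0.009, 0.010]
healthy_release = [0.011, 0.010, 0.012]
degraded_release = [0.05, 0.07, 0.06]

print(should_roll_back(baseline, healthy_release))   # stays deployed
print(should_roll_back(baseline, degraded_release))  # triggers a rollback

# In a Kubernetes setup, the actual revert could then be issued with
# `kubectl rollout undo deployment/<name>`; that step is not executed here.
```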

---

### Example 2: Intelligent Build Optimization

By analyzing usage trends and pipeline telemetry, AIOps can allocate build resources more effectively, skipping redundant steps and reducing overhead.
- Techniques like **intelligent caching**, failure pattern recognition, and dynamic resource scaling shorten build times.
- _Example:_ Cloud CI/CD tools that use AIOps automatically prioritize faster builds based on historical success rates and project complexity.
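
The caching idea can be illustrated with a content-addressed build cache: a step is skipped when its inputs hash to a key that has already been built. This sketch covers only the caching half, with no learned model involved; `BuildCache`, `fingerprint`, and the artifact naming are hypothetical.

```python
import hashlib


def fingerprint(sources):
    """Hash file names and contents in a stable order."""
    digest = hashlib.sha256()
    for name in sorted(sources):
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(sources[name].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


class BuildCache:
    """Skips a build when identical inputs were already built."""

    def __init__(self):
        self._artifacts = {}
        self.builds_run = 0

    def build(self, sources):
        key = fingerprint(sources)
        if key not in self._artifacts:
            # Cache miss: run the (simulated) build and remember the result.
            self.builds_run += 1
            self._artifacts[key] = f"artifact-{key[:8]}"
        return self._artifacts[key]


cache = BuildCache()
first = cache.build({"app.py": "print('v1')"})
repeat = cache.build({"app.py": "print('v1')"})   # unchanged inputs: cache hit
changed = cache.build({"app.py": "print('v2')"})  # changed inputs: rebuild
print(cache.builds_run)
```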

---


## Part 2

### Manual Implementation


```python
# Manual function to sort a list of dictionaries by key.
# Bubble sort: repeatedly swaps adjacent out-of-order items; sorts `data` in place.
def sort_dicts_by_key_manual(data, key):
    for i in range(len(data)):
        for j in range(0, len(data) - i - 1):
            if data[j].get(key) > data[j + 1].get(key):
                data[j], data[j + 1] = data[j + 1], data[j]
    return data

# Example list of dictionaries
people = [
    {"name": "Joyce", "age": 34},
    {"name": "Odongo", "age": 28},
    {"name": "palpable", "age": 40}
]

# Sort by the key 'age'
sorted_people = sort_dicts_by_key_manual(people, "age")

# Print the sorted list
print("Sorted by age (manual sort):")
for person in sorted_people:
    print(person)
```

Output:

```
Sorted by age (manual sort):
{'name': 'Odongo', 'age': 28}
{'name': 'Joyce', 'age': 34}
{'name': 'palpable', 'age': 40}
```


### AI-Suggested Code (GitHub Copilot / Tabnine)

```python
# Copilot/Tabnine-suggested function
def sort_dicts_by_key(data, key):
    return sorted(data, key=lambda x: x.get(key))

# Example list of dictionaries
people = [
    {"name": "Joyce", "age": 34},
    {"name": "Odongo", "age": 28},
    {"name": "palpable", "age": 40}
]

# Sort by the key 'age'
sorted_people = sort_dicts_by_key(people, "age")

# Print the sorted list
print("Sorted by age:")
for person in sorted_people:
    print(person)
```

Output:

```
Sorted by age:
{'name': 'Odongo', 'age': 28}
{'name': 'Joyce', 'age': 34}
{'name': 'palpable', 'age': 40}
```


## AI-Suggested vs. Manual Sorting Analysis

The AI-suggested function delegates to Python's built-in `sorted()`, which uses Timsort (O(n log n) in the worst case) and returns a new list, leaving the input untouched. The result is short, readable, and easy to maintain.

The manual version implements bubble sort (O(n²)). It is a useful teaching example, but it is slow on large datasets, sorts the input list in place, and involves more index arithmetic that can go wrong.

Both produce the same result on the small example above, but the AI-suggested approach is significantly **more efficient and practical** for real-world use.
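
The complexity difference can be checked empirically by counting comparisons instead of timing. The sketch below reuses the two functions from this part and wraps each value in a hypothetical `Counted` helper that increments a counter on every `<` or `>`; the input size of 200 and the random seed are arbitrary choices.

```python
import random


class Counted:
    """Wraps a value and counts every comparison made on it."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value

    def __gt__(self, other):
        Counted.comparisons += 1
        return self.value > other.value


def sort_dicts_by_key_manual(data, key):
    # Bubble sort, as in the manual implementation.
    for i in range(len(data)):
        for j in range(0, len(data) - i - 1):
            if data[j].get(key) > data[j + 1].get(key):
                data[j], data[j + 1] = data[j + 1], data[j]
    return data


def sort_dicts_by_key(data, key):
    # Built-in Timsort, as in the AI-suggested version.
    return sorted(data, key=lambda x: x.get(key))


def count_comparisons(sort_fn, values):
    """Sort fresh records with `sort_fn` and return (comparison count, sorted values)."""
    records = [{"age": Counted(v)} for v in values]
    Counted.comparisons = 0
    result = sort_fn(records, "age")
    return Counted.comparisons, [r["age"].value for r in result]


random.seed(0)
values = random.sample(range(1000), 200)
manual_count, manual_sorted = count_comparisons(sort_dicts_by_key_manual, values)
builtin_count, builtin_sorted = count_comparisons(sort_dicts_by_key, values)
print(manual_count, builtin_count)
```

Because this bubble sort has no early exit, it always performs n(n-1)/2 comparisons, which is 19,900 for 200 items; the built-in sort needs far fewer.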


## Task 2: Automated Testing with AI

![selenium.PNG](attachment:6f7fd92e-6c47-4217-b107-e06c82b7d2ea.PNG)

### Test Results Overview

| Scenario                               | Expected Outcome        | Actual Outcome         | Status     |
|----------------------------------------|--------------------------|------------------------|------------|
| Valid username & password              | Successful login         | Dashboard loaded       | ✅ Passed  |
| Invalid username & valid password      | Login failed             | Error message shown    | ✅ Passed  |
| Valid username & invalid password      | Login failed             | Error message shown    | ✅ Passed  |
| Empty username & password              | Login blocked            | Error message shown    | ✅ Passed  |


## 🤖 How AI Enhances Test Coverage Compared to Manual Testing

AI-driven testing tools significantly improve software quality by automating and expanding the scope of test coverage. Here's how:

---

### 🔍 1. Automated Test Case Generation
- AI analyzes source code, user interactions, and historical bug data to generate large volumes of relevant test scenarios.
- It detects edge cases and rare usage paths that manual testers often overlook.

---

### 🔁 2. Self-Healing Tests
- When UI elements change, AI tools (like Testim or Mabl) automatically adapt, minimizing false positives and broken tests.
- This reduces the time spent on test maintenance and increases test stability.
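
The basic idea behind self-healing can be sketched as a lookup that falls back to alternative locators when the preferred one stops matching. This is a deliberately simplified sketch and not a description of how Testim or Mabl work internally; the page is modeled as a plain dictionary, and the locator strings and `find_element` helper are hypothetical.

```python
def find_element(page, locators):
    """Return (element, locator_used), trying locators in priority order."""
    for locator in locators:
        if locator in page:
            return page[locator], locator
    raise LookupError("no locator matched: " + ", ".join(locators))


# Ordered from most to least specific.
LOGIN_BUTTON_LOCATORS = ["id=login-btn", "css=button[type=submit]", "text=Log in"]

# The same button before and after a UI change that renamed its id.
page_before = {
    "id=login-btn": "button#1",
    "css=button[type=submit]": "button#1",
    "text=Log in": "button#1",
}
page_after = {
    "css=button[type=submit]": "button#1",
    "text=Log in": "button#1",
}

_, used_before = find_element(page_before, LOGIN_BUTTON_LOCATORS)
_, used_after = find_element(page_after, LOGIN_BUTTON_LOCATORS)
print(used_before, "->", used_after)
```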

---

### 🎯 3. Intelligent Coverage Optimization
- AI focuses on untested paths, maximizing **branch** and **decision coverage**.
- It can prioritize tests using risk-based algorithms that consider code churn and historical defect patterns.
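
Risk-based prioritization can be sketched as a scoring function over the modules each test covers. The formula below (churn multiplied by one plus the historical defect count) is one plausible heuristic, not a standard; the test names, module names, and numbers are hypothetical.

```python
def prioritize(tests, churn, defects):
    """Order tests so those covering risky modules run first."""
    def risk(test):
        return sum(
            churn.get(module, 0) * (1 + defects.get(module, 0))
            for module in test["covers"]
        )
    return sorted(tests, key=risk, reverse=True)


tests = [
    {"name": "test_login", "covers": ["auth"]},
    {"name": "test_checkout", "covers": ["cart", "payment"]},
    {"name": "test_profile", "covers": ["profile"]},
]
churn = {"auth": 3, "cart": 10, "payment": 8, "profile": 1}  # recent commits per module
defects = {"payment": 4, "auth": 1}                          # past defects per module

ordered = [t["name"] for t in prioritize(tests, churn, defects)]
print(ordered)
```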

---

### ⏱️ 4. Accelerated Regression Testing
- AI clusters similar test cases and flags anomalies with high speed, making full-suite regression testing feasible within tight CI/CD cycles.

---

### 🗣️ 5. Natural Language Testing
- AI tools can convert user stories or bug reports into executable test cases using NLP.
- This bridges the gap between business requirements and automated scripts.

---

## 🧠 Manual vs AI-Powered Testing Comparison

| Feature                    | Manual Testing             | AI-Powered Testing                      |
|---------------------------|----------------------------|-----------------------------------------|
| Test Volume               | Limited by human effort    | Scales with available compute           |
| Edge Case Detection       | Based on intuition         | Driven by data and behavior analysis    |
| Maintenance               | Manual script updates      | Self-healing based on DOM/UX patterns   |
| Coverage Tracking         | Manual tracking            | Automated path & risk-based coverage    |
| Speed                     | Slower                     | Fast, automated generation & execution  |

---

## ✅ Summary
AI enhances both the **breadth** and **depth** of test coverage while reducing human effort. By augmenting traditional QA workflows, it enables faster releases and more reliable software—all while freeing testers to focus on exploratory and user-centric testing.




# Task 3: Predictive Analytics for Resource Allocation


## 📊 Objective
Develop a machine learning model to predict issue priority levels—**High**, **Medium**, or **Low**—to guide smarter resource allocation in healthcare-related contexts.

## 📂 Dataset
- **Source:** Kaggle Breast Cancer Dataset
- **Rationale:** Chosen for its comprehensive diagnostic features, making it suitable for modeling complex decision tasks.

## 🛠️ Methodology

### 1. Preprocessing
- Cleaned the dataset to handle missing values and irrelevant features.
- Labeled the target variable with issue priorities: `High`, `Medium`, `Low`.
- Split the dataset into training and testing sets (e.g., 80/20 split).

### 2. Model Training
- Used a **Random Forest Classifier**, chosen for its strong performance on tabular data and its support for class weighting to counter class imbalance.

### 3. Evaluation
- **F1-score:** `0.83`, indicating a strong precision-recall balance across classes.
- Accuracy was also considered, but the F1-score was emphasized because of class imbalance.
![score.PNG](attachment:c882342b-b587-49d8-8bde-636557e4bdac.PNG)
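
To make the preference for F1 over accuracy concrete, here is a minimal sketch of a macro-averaged F1 computed by hand. The report does not state which averaging produced the 0.83 figure, so macro averaging is an assumption here, and the toy labels below are illustrative rather than drawn from the dataset.

```python
def macro_f1(y_true, y_pred):
    """Average the per-class F1 scores, giving every class equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


# A model that always predicts the majority class on imbalanced toy data.
y_true = ["Low"] * 8 + ["High"] * 2
y_pred = ["Low"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, macro F1={f1:.2f}")
```

The always-"Low" model reaches 80% accuracy while missing every "High" case, and the macro F1 drops to roughly 0.44 to reflect that.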

## ✅ Outcome
The trained model reliably predicts issue priority levels, offering a viable tool for supporting **efficient decision-making** in resource-limited healthcare environments. The strong F1 performance (0.83) highlights its potential for integration into operational workflows.

---




# ⚖️ Part 3: Ethical Reflection (10%)

## 📌 Scenario

In Task 3, a Random Forest model was trained using the **Kaggle Breast Cancer Dataset** to predict **issue priority levels (High, Medium, Low)** within a company context. While the dataset offers structured clinical features, its real-world application in corporate workflows introduces important ethical concerns.

---

## 🚨 Potential Biases in the Dataset

### 🔻 Class Imbalance
- If the priority labels are imbalanced (e.g., far more “low” than “high” cases), the model may **underperform on critical cases**, failing to elevate truly urgent issues.
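
One common mitigation is to weight classes inversely to their frequency. The sketch below uses the `n_samples / (n_classes * class_count)` heuristic, which is the formula scikit-learn applies for `class_weight="balanced"`; the label counts are hypothetical and not taken from the project data.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical imbalanced priority labels.
labels = ["Low"] * 70 + ["Medium"] * 20 + ["High"] * 10
weights = balanced_class_weights(labels)
for label, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{label}: {weight:.2f}")
```

Rare classes receive larger weights, so errors on "High" cases cost more during training.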

### 👥 Demographic Feature Gaps
- The original breast cancer dataset lacks demographic diversity (e.g., race, region, socioeconomic status). If priority is influenced by underlying demographics in real-world applications, the model risks **overgeneralizing from a narrow population**.

### 🔄 Historical Labeling Bias
- If historical priorities were labeled by subjective human decisions, they may encode bias—such as certain departments or issue types being routinely undervalued. The model could **reinforce these skewed historical judgments**.

---

## 🛠️ Addressing Bias with IBM AI Fairness 360

### What It Can Do:

- **📏 Bias Detection**
  - Evaluate fairness metrics like:
    - Statistical Parity Difference
    - Equal Opportunity Difference
    - Disparate Impact
  - Compare prediction outcomes across user groups (e.g., issue origin, reporter role)

- **⚖️ Preprocessing**
  - Use tools like **Reweighing** to balance underrepresented priority classes or departmental tickets.

- **🧠 In-Processing**
  - Apply techniques like **Adversarial Debiasing** to enforce fairness constraints while training.

- **📤 Post-processing**
  - Use **Reject Option Classification** to adjust predictions that fall near the decision boundary, reducing outcome disparities across teams or priority tiers.
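
Two of the detection metrics listed above can be computed by hand, which helps when interpreting what a fairness toolkit reports. The sketch below is plain Python rather than the AIF360 API; the group names (`team_a`, `team_b`), the choice of which group counts as privileged, and the predictions are hypothetical.

```python
def selection_rate(predictions, groups, group):
    """Fraction of favorable predictions (1) within one group."""
    chosen = [p for p, g in zip(predictions, groups) if g == group]
    return sum(chosen) / len(chosen)


def statistical_parity_difference(predictions, groups, unprivileged, privileged):
    """P(favorable | unprivileged) - P(favorable | privileged); 0 means parity."""
    return (selection_rate(predictions, groups, unprivileged)
            - selection_rate(predictions, groups, privileged))


def disparate_impact(predictions, groups, unprivileged, privileged):
    """P(favorable | unprivileged) / P(favorable | privileged); 1 means parity."""
    return (selection_rate(predictions, groups, unprivileged)
            / selection_rate(predictions, groups, privileged))


# 1 = ticket predicted as High priority (the favorable outcome here).
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["team_a"] * 4 + ["team_b"] * 4

spd = statistical_parity_difference(predictions, groups, "team_b", "team_a")
di = disparate_impact(predictions, groups, "team_b", "team_a")
print(f"SPD={spd:.2f}, DI={di:.2f}")
```

Here `team_b` tickets are flagged as High priority a third as often as `team_a` tickets, which a negative parity difference and a disparate impact well below 1 both indicate.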

---

## ✅ Outcome

By incorporating AIF360, the predictive model can:
- Reduce unjust under-prioritization of certain teams
- Offer transparent, auditable decision paths
- Align with ethical AI principles for enterprise software

> 📣 *Fairness in prediction ensures fairness in action—especially when every ticket could represent a patient, a user, or a mission-critical incident.*



# Bonus Task: Innovation Challenge (Extra 10%)

## Proposal: LegacyLens — AI-Powered Codebase Understanding Assistant

### 🛠 Problem Addressed

One of the biggest hurdles in modern software engineering is **understanding large, undocumented legacy systems**. Engineers spend hours deciphering old code with little to no documentation, which leads to high onboarding time, fragile refactors, and duplicate logic.

### 🎯 Purpose

**LegacyLens** is an AI assistant that reverse-engineers complex legacy codebases into understandable maps, diagrams, and summaries. It accelerates comprehension, refactoring, and documentation by providing semantic overviews and logic flow insights of existing code.

---

### ⚙️ Workflow

1. **Codebase Analysis**  
   - Scans through legacy code repositories (e.g. COBOL, C, Python, Java).
   - Performs static and semantic analysis to detect modules, dependencies, and inheritance chains.

2. **Flow & Feature Mapping**  
   - Visualizes control and data flow.
   - Groups code by functional purpose (e.g., UI logic, DB access, business rules).

3. **Natural Language Summarization**  
   - Generates concise module-level summaries (e.g., “This class performs customer lookup using cached records”).
   - Provides traceable “explanation threads” that walk devs through what happens from input to output.

4. **Refactor Suggestions**  
   - Highlights duplicate or dead code.
   - Offers structure recommendations (e.g., extract method, convert to class, remove unreachable logic).

5. **Collaboration Hooks**  
   - Integrates with IDEs and GitHub/GitLab.
   - Enables devs to ask: “What does this function do?” or “Where is payment status updated?”
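
The static-analysis step can be sketched for Python sources with the standard-library `ast` module. This is a minimal sketch of steps 1 and 4 only, assuming Python input; `summarize_module` and the sample source are hypothetical, and the proposal's other target languages (COBOL, C, Java) would need their own parsers.

```python
import ast


def summarize_module(source):
    """Map each function name to its docstring and the plain-name functions it calls."""
    tree = ast.parse(source)
    summary = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            calls = sorted({
                child.func.id
                for child in ast.walk(node)
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name)
            })
            summary[node.name] = {"doc": ast.get_docstring(node), "calls": calls}
    return summary


legacy_source = '''
def lookup_customer(customer_id):
    """Return the cached record for a customer."""
    return read_cache(customer_id)

def read_cache(key):
    return CACHE.get(key)

def unused_helper():
    return 42
'''

summary = summarize_module(legacy_source)
called = {name for info in summary.values() for name in info["calls"]}
# Functions never called inside this module: entry points or dead-code candidates.
never_called = sorted(set(summary) - called)
print(never_called)
```

The list mixes genuine entry points (`lookup_customer`) with likely dead code (`unused_helper`), so a real tool would need cross-module call data before recommending removal.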

---

### 🌍 Potential Impact

- **Saves Weeks of Reverse Engineering**: Useful for onboarding, audits, and modernization.
- **Improves Quality**: Safer refactoring with better context.
- **Preserves Institutional Knowledge**: Captures logic that exists only in developer memory.
- **Cross-Team Collaboration**: Business analysts and junior engineers can navigate legacy systems confidently.

> By translating tangled code into human insight, LegacyLens revives forgotten systems—and helps modernize them with confidence.

