<a href="https://colab.research.google.com/github/faithtinarwo/ai-software-crafting-lab/blob/main/Part_2_Task_1_AI_Powered_Code_Completion.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Part 2: Practical Implementation (60%)
# Task 1: AI-Powered Code Completion

# Objective:
# Write a Python function to sort a list of dictionaries by a specific key.
# Compare the AI-suggested code with your manual implementation.
# Document which version is more efficient and why.

# --- Manual Implementation ---
# This version explicitly defines a lambda function for the key argument
# and relies on Python's built-in `sort()` method or `sorted()` function.
# It's a common and idiomatic way to sort in Python.

def sort_list_of_dicts_manual(list_of_dicts, key_to_sort_by, reverse=False):
    """
    Manually sorts a list of dictionaries by a specified key.
    Handles missing keys by placing those dictionaries at the end (for ascending sort)
    or beginning (for descending sort) without raising TypeError.

    Args:
        list_of_dicts (list): The list of dictionaries to sort.
        key_to_sort_by (str): The key in the dictionaries to sort by.
        reverse (bool, optional): If True, sort in descending order. Defaults to False.

    Returns:
        list: A new sorted list of dictionaries.
    """
    if not isinstance(list_of_dicts, list) or not all(isinstance(d, dict) for d in list_of_dicts):
        print("Error: Input must be a list of dictionaries.")
        return []

    # Custom key function to handle None values gracefully.
    # If the key is missing, d.get(key_to_sort_by) returns None.
    # We want None values to sort consistently (e.g., at the end for ascending order).
    # Python 3 cannot order None against other values, so sorting on the raw value raises TypeError.
    # Strategy: Return a tuple (is_none, value).
    # If value is None, is_none will be True, placing it after non-None values.
    # If value is not None, is_none will be False, placing it before None values.
    # For actual values, comparison proceeds normally.

    return sorted(list_of_dicts,
                  key=lambda d: (d.get(key_to_sort_by) is None, d.get(key_to_sort_by)),
                  reverse=reverse)


# --- AI-Suggested Implementation (Simulated/Example) ---
# An AI tool like Copilot might suggest a very similar or identical approach,
# or it might provide a slightly more verbose or perhaps less Pythonic version
# depending on its training data and prompt. For this demonstration, we'll
# simulate an "AI-suggested" version that is essentially the same best practice,
# emphasizing that often AI converges on optimal common solutions.
# The key difference might be how explicit it is, or if it adds extra checks.
# In a real scenario, after you type `sorted(list_of_dicts, key=lambda d: d[`, Copilot would often
# complete the rest of the expression (for example, `key_to_sort_by])`).

def sort_list_of_dicts_ai_suggested(data_list, sort_key, descending=False):
    """
    AI-suggested function to sort a list of dictionaries by a specific key.
    This version might appear if an AI tool quickly completes based on common patterns.
    Handles missing keys gracefully by placing those dictionaries at the end (for ascending sort)
    or beginning (for descending sort).

    Args:
        data_list (list): The list of dictionaries.
        sort_key (str): The dictionary key to sort by.
        descending (bool, optional): True for descending order, False for ascending. Defaults to False.

    Returns:
        list: The sorted list.
    """
    if not all(isinstance(item, dict) for item in data_list):
        print("Warning: List contains non-dictionary items. Returning an empty list.")
        return [] # Or raise an error, depending on desired strictness

    # The AI would likely suggest this standard Pythonic approach due to its prevalence.
    # Modified to handle None values using the same tuple comparison strategy.
    return sorted(data_list,
                  key=lambda item: (item.get(sort_key) is None, item.get(sort_key)),
                  reverse=descending)


# --- Example Usage ---
people = [
    {"name": "Alice", "age": 30, "city": "New York"},
    {"name": "Bob", "age": 25, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"},
    {"name": "David", "age": 25, "city": "Houston"}
]

print("Original List:")
for p in people:
    print(p)

print("\n--- Manual Implementation ---")
sorted_by_age_manual = sort_list_of_dicts_manual(people, "age")
print("\nSorted by Age (Ascending):")
for p in sorted_by_age_manual:
    print(p)

sorted_by_name_desc_manual = sort_list_of_dicts_manual(people, "name", reverse=True)
print("\nSorted by Name (Descending):")
for p in sorted_by_name_desc_manual:
    print(p)

# Demonstrate sorting by a key that might not exist, now handled gracefully
people_with_zip = [
    {"name": "Alice", "age": 30, "zip_code": "10001"},
    {"name": "Bob", "age": 25}, # No zip_code
    {"name": "Charlie", "age": 35, "zip_code": "60601"},
    {"name": "David", "age": 25, "zip_code": "77001"}
]

print("\n--- Manual Implementation with Missing Key ---")
print("Original List with Some Missing 'zip_code':")
for p in people_with_zip:
    print(p)

sorted_by_zip_manual = sort_list_of_dicts_manual(people_with_zip, "zip_code")
print("\nSorted by 'zip_code' (Ascending, missing values at end):")
for p in sorted_by_zip_manual:
    print(p)

sorted_by_zip_desc_manual = sort_list_of_dicts_manual(people_with_zip, "zip_code", reverse=True)
print("\nSorted by 'zip_code' (Descending, missing values at beginning):")
for p in sorted_by_zip_desc_manual:
    print(p)


print("\n--- AI-Suggested Implementation ---")
sorted_by_age_ai = sort_list_of_dicts_ai_suggested(people, "age")
print("\nSorted by Age (Ascending):")
for p in sorted_by_age_ai:
    print(p)

sorted_by_city_desc_ai = sort_list_of_dicts_ai_suggested(people, "city", descending=True)
print("\nSorted by City (Descending):")
for p in sorted_by_city_desc_ai:
    print(p)

# Example with potential error handling (mixed list)
mixed_list = [{"name": "Eve"}, 123, {"name": "Frank"}]
print("\nTesting AI-Suggested with mixed list:")
sort_list_of_dicts_ai_suggested(mixed_list, "name")

# --- Analysis (200-word analysis) ---
"""
Analysis: Comparing Manual vs. AI-Suggested Code for Sorting Dictionaries (Updated)

Both the "manual" and "AI-suggested" implementations now handle dictionaries that lack the sorting key. The earlier `TypeError: '<' not supported between instances of 'NoneType' and 'NoneType'` occurred because `d.get(key)` returns `None` for a missing key, and Python 3 defines no ordering for `None`, so the default sort cannot compare it.

The updated solution uses a common Python idiom for robust sorting: `key=lambda d: (d.get(key) is None, d.get(key))`. This creates a tuple for each element's sort key, and Python compares tuples element by element. `d.get(key) is None` evaluates to `True` if the key is missing (or its value is `None`), and `False` otherwise. Since `False` sorts before `True`, dictionaries with actual values for the key sort before those with missing keys. The second element of the tuple, `d.get(key)`, then orders the non-None items by their actual values. For `None` items, `(True, None)` compares equal to every other `(True, None)` tuple, so Python never has to order `None` with `<`; these items raise no error and keep their original relative order because the sort is stable.

In terms of computational efficiency, this modification does not change the core O(N log N) time complexity of Python's `sorted()` function. It merely refines the comparison logic within that framework, ensuring robustness. The AI's strength continues to be its ability to quickly suggest or complete such robust, Pythonic patterns, accelerating development and reducing debugging time related to edge cases like missing dictionary keys. Both versions now exhibit the same high efficiency and correctness for this task.
"""
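
The tuple-key idiom described in the analysis can be checked in isolation. The sketch below is a minimal illustration, not part of the original notebook: the `records` list and the `none_last_key` helper are hypothetical names introduced here. It first shows a bare `d.get(...)` key failing, then the tuple key placing missing or `None` values last for an ascending sort and first for a descending one.

```python
# Hypothetical sample data: two records have no usable zip_code.
records = [
    {"name": "Alice", "zip_code": "10001"},
    {"name": "Bob"},                        # key is missing
    {"name": "Charlie", "zip_code": "60601"},
    {"name": "Dana", "zip_code": None},     # key is present but None
]

# A bare .get() key yields None for Bob and Dana; None cannot be ordered against str.
try:
    sorted(records, key=lambda d: d.get("zip_code"))
    bare_key_failed = False
except TypeError as exc:
    bare_key_failed = True
    print("Bare key failed:", exc)


def none_last_key(d, key="zip_code"):
    """Return (is_none, value) so that real values sort before None."""
    value = d.get(key)
    return (value is None, value)


ascending = [d["name"] for d in sorted(records, key=none_last_key)]
descending = [d["name"] for d in sorted(records, key=none_last_key, reverse=True)]

print(ascending)   # ['Alice', 'Charlie', 'Bob', 'Dana']
print(descending)  # ['Bob', 'Dana', 'Charlie', 'Alice']
```

Bob and Dana keep their original relative order in both directions because `sorted()` is stable, including when `reverse=True` is passed.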


Original List:
{'name': 'Alice', 'age': 30, 'city': 'New York'}
{'name': 'Bob', 'age': 25, 'city': 'Los Angeles'}
{'name': 'Charlie', 'age': 35, 'city': 'Chicago'}
{'name': 'David', 'age': 25, 'city': 'Houston'}

--- Manual Implementation ---

Sorted by Age (Ascending):
{'name': 'Bob', 'age': 25, 'city': 'Los Angeles'}
{'name': 'David', 'age': 25, 'city': 'Houston'}
{'name': 'Alice', 'age': 30, 'city': 'New York'}
{'name': 'Charlie', 'age': 35, 'city': 'Chicago'}

Sorted by Name (Descending):
{'name': 'David', 'age': 25, 'city': 'Houston'}
{'name': 'Charlie', 'age': 35, 'city': 'Chicago'}
{'name': 'Bob', 'age': 25, 'city': 'Los Angeles'}
{'name': 'Alice', 'age': 30, 'city': 'New York'}

--- Manual Implementation with Missing Key ---
Original List with Some Missing 'zip_code':
{'name': 'Alice', 'age': 30, 'zip_code': '10001'}
{'name': 'Bob', 'age': 25}
{'name': 'Charlie', 'age': 35, 'zip_code': '60601'}
{'name': 'David', 'age': 25, 'zip_code': '77001'}

Sorted by 'zip_code' (Ascending, 
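
The analysis states that both versions share the same O(N log N) cost. One rough way to check such a claim is to time the key function directly. The sketch below is an assumption-laden illustration rather than a benchmark from the notebook: `sort_two_lookups` mirrors the key used by both implementations, `sort_one_lookup` is a hypothetical variant that calls `.get()` once per row, and the synthetic `rows` data is made up for the measurement. Timings depend on the machine, so no numbers are claimed here.

```python
import random
import timeit


def sort_two_lookups(rows, key):
    # Same key shape as the notebook's functions: d.get() runs twice per row.
    return sorted(rows, key=lambda d: (d.get(key) is None, d.get(key)))


def sort_one_lookup(rows, key):
    # Hypothetical variant: look the value up once, then build the tuple.
    def sort_key(d):
        value = d.get(key)
        return (value is None, value)

    return sorted(rows, key=sort_key)


random.seed(0)
# Synthetic data: every tenth row has no "age" key.
rows = [
    {"id": i, "age": random.randint(18, 90)} if i % 10 else {"id": i}
    for i in range(2000)
]

t_two = timeit.timeit(lambda: sort_two_lookups(rows, "age"), number=20)
t_one = timeit.timeit(lambda: sort_one_lookup(rows, "age"), number=20)
same_result = sort_two_lookups(rows, "age") == sort_one_lookup(rows, "age")

print(f"two lookups: {t_two:.4f}s, one lookup: {t_one:.4f}s, same result: {same_result}")
```

Both functions produce identical output because they compute the same keys and the sort is stable; any difference is a constant factor in building the keys, not a change in the asymptotic complexity.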


In [None]:
!pip install selenium


Collecting selenium
  Downloading selenium-4.33.0-py3-none-any.whl.metadata (7.5 kB)
Collecting trio~=0.30.0 (from selenium)
  Downloading trio-0.30.0-py3-none-any.whl.metadata (8.5 kB)
Collecting trio-websocket~=0.12.2 (from selenium)
  Downloading trio_websocket-0.12.2-py3-none-any.whl.metadata (5.1 kB)
Collecting typing_extensions~=4.13.2 (from selenium)
  Downloading typing_extensions-4.13.2-py3-none-any.whl.metadata (3.0 kB)
Collecting outcome (from trio~=0.30.0->selenium)
  Downloading outcome-1.3.0.post0-py2.py3-none-any.whl.metadata (2.6 kB)
Collecting wsproto>=0.14 (from trio-websocket~=0.12.2->selenium)
  Downloading wsproto-1.2.0-py3-none-any.whl.metadata (5.6 kB)
Downloading selenium-4.33.0-py3-none-any.whl (9.4 MB)
Downloading trio-0.30.0-py3-none-any.whl (499 kB)

In [None]:
# Part 2: Practical Implementation (60%)
# Task 2: Automated Testing with AI

# Objective:
# Automate a test case for a login page (valid/invalid credentials).
# Run the test and capture results (success/failure rates).
# Explain how AI improves test coverage compared to manual testing.

# Deliverable: Test script + screenshot of results + 150-word summary.
# Note: This code is designed to be run in an environment where Selenium WebDriver
# and a browser driver (e.g., ChromeDriver) are installed and configured.
# You need to replace the placeholder `login_url` with the URL of a real login page.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, NoSuchElementException
from selenium.webdriver.chrome.service import Service # Import Service for cleaner driver setup
import time # Used only for short pauses between demo test cases; real tests should rely on explicit waits

class LoginPageTester:
    def __init__(self, driver_path='path/to/your/chromedriver', login_url='http://your-test-login-page.com/login'):
        """
        Initializes the WebDriver for testing.
        Args:
            driver_path (str): Path to the browser WebDriver executable (e.g., chromedriver.exe).
                               If chromedriver is in PATH, this can be left as default.
            login_url (str): The URL of the login page to test.
        """
        print("Initializing WebDriver (make sure chromedriver is in your PATH or specified).")
        self.driver = None
        self.results = {"success": 0, "failure": 0, "details": []}
        self.login_url = login_url

        try:
            # Configure Chrome options for automated environments
            chrome_options = webdriver.ChromeOptions()
            # --no-sandbox: commonly needed when Chrome runs as root inside Docker/CI containers
            chrome_options.add_argument('--no-sandbox')
            # --disable-dev-shm-usage: avoids crashes caused by a small /dev/shm in containers
            chrome_options.add_argument('--disable-dev-shm-usage')
            # For headless execution (no browser UI), uncomment the next line.
            # Hosts without a display (e.g., Colab) generally need it for Chrome to start:
            # chrome_options.add_argument('--headless')

            # Initialize the Chrome WebDriver
            # If driver_path is not specified, Selenium will look for chromedriver in system PATH
            if driver_path and driver_path != 'path/to/your/chromedriver':
                service = Service(driver_path)
                self.driver = webdriver.Chrome(service=service, options=chrome_options)
            else:
                self.driver = webdriver.Chrome(options=chrome_options)

            self.driver.implicitly_wait(5) # Implicit wait for elements to appear
            print(f"WebDriver initialized. Testing URL: {self.login_url}")
        except Exception as e:
            print(f"Error initializing WebDriver: {e}")
            print("Please ensure you have Chrome browser installed and chromedriver in your PATH or provided its path correctly.")
            print("If running in a constrained environment (e.g., Docker, CI/CD), --no-sandbox and --disable-dev-shm-usage arguments are crucial.")

    def _navigate_to_login_page(self):
        """Navigates to the login page."""
        if self.driver:
            try:
                self.driver.get(self.login_url)
                WebDriverWait(self.driver, 10).until(
                    EC.presence_of_element_located((By.ID, "username")) # Assuming input fields have these IDs
                )
                print(f"Navigated to {self.login_url}")
                return True
            except TimeoutException:
                print(f"Timeout while navigating to {self.login_url} or elements not found.")
                return False
            except Exception as e:
                print(f"Error navigating to page: {e}")
                return False
        return False

    def _login(self, username, password):
        """
        Performs a login attempt.
        Returns: True if login elements are found and interacted with, False otherwise.
        """
        if not self.driver: return False
        try:
            # Use more robust selectors if IDs are not stable, e.g., By.NAME, By.CSS_SELECTOR
            username_field = self.driver.find_element(By.ID, "username")
            password_field = self.driver.find_element(By.ID, "password")
            login_button = self.driver.find_element(By.ID, "loginButton")

            username_field.clear()
            password_field.clear()
            username_field.send_keys(username)
            password_field.send_keys(password)
            login_button.click()
            print(f"Attempted login with user: {username}, pass: {'*' * len(password)}")
            return True
        except NoSuchElementException as e:
            print(f"Could not find login elements: {e}")
            return False
        except Exception as e:
            print(f"Error during login attempt: {e}")
            return False

    def test_valid_login(self, username, password, expected_success_url_part="dashboard"):
        """Tests a valid login scenario."""
        print(f"\n--- Running Test: Valid Login for {username} ---")
        if not self._navigate_to_login_page():
            self.results["failure"] += 1
            self.results["details"].append({"test": "valid_login", "status": "FAIL", "reason": "Navigation failed"})
            return

        if not self._login(username, password):
            self.results["failure"] += 1
            self.results["details"].append({"test": "valid_login", "status": "FAIL", "reason": "Login action failed"})
            return

        try:
            WebDriverWait(self.driver, 10).until(
                EC.url_contains(expected_success_url_part)
            )
            print(f"Valid Login Success: User '{username}' successfully logged in.")
            self.results["success"] += 1
            self.results["details"].append({"test": "valid_login", "status": "PASS"})
        except TimeoutException:
            print(f"Valid Login Failed: Did not redirect to '{expected_success_url_part}' or timeout.")
            self.results["failure"] += 1
            self.results["details"].append({"test": "valid_login", "status": "FAIL", "reason": "Timeout/Wrong redirect"})
        except Exception as e:
            print(f"Valid Login Failed due to an unexpected error: {e}")
            self.results["failure"] += 1
            self.results["details"].append({"test": "valid_login", "status": "FAIL", "reason": f"Unexpected error: {e}"})

    def test_invalid_login(self, username, password, expected_error_message_id="error-message"):
        """Tests an invalid login scenario."""
        print(f"\n--- Running Test: Invalid Login for {username} ---")
        if not self._navigate_to_login_page():
            self.results["failure"] += 1
            self.results["details"].append({"test": "invalid_login", "status": "FAIL", "reason": "Navigation failed"})
            return

        if not self._login(username, password):
            self.results["failure"] += 1
            self.results["details"].append({"test": "invalid_login", "status": "FAIL", "reason": "Login action failed"})
            return

        try:
            WebDriverWait(self.driver, 10).until(
                EC.presence_of_element_located((By.ID, expected_error_message_id))
            )
            error_message = self.driver.find_element(By.ID, expected_error_message_id).text
            print(f"Invalid Login Success: Error message '{error_message}' found as expected.")
            self.results["success"] += 1
            self.results["details"].append({"test": "invalid_login", "status": "PASS"})
        except TimeoutException:
            print(f"Invalid Login Failed: Error message '{expected_error_message_id}' not found or timeout.")
            self.results["failure"] += 1
            self.results["details"].append({"test": "invalid_login", "status": "FAIL", "reason": "Timeout/Error message not found"})
        except Exception as e:
            print(f"Invalid Login Failed due to an unexpected error: {e}")
            self.results["failure"] += 1
            self.results["details"].append({"test": "invalid_login", "status": "FAIL", "reason": f"Unexpected error: {e}"})

    def get_results(self):
        """Returns the accumulated test results."""
        return self.results

    def close(self):
        """Closes the browser."""
        if self.driver:
            print("Closing WebDriver.")
            self.driver.quit()

# --- Main execution block for running the tests ---
if __name__ == "__main__":
    # IMPORTANT: Replace 'http://your-test-login-page.com/login' with an actual login page URL.
    # If needed, replace 'path/to/your/chromedriver' with the actual path to your chromedriver executable
    # (download: https://chromedriver.chromium.org/downloads) and make sure its version matches
    # your Chrome browser version. When no path is given, Selenium 4.6+ can usually locate or
    # download a matching driver itself via Selenium Manager.

    # For a simple, public test page, you might consider something like:
    # login_page_url = "https://www.demoblaze.com/index.html" # A demo site, but may not have predictable element IDs

    # Placeholder URL for demonstration purposes
    mock_login_url = "http://example.com/login" # Replace with your actual target URL!

    tester = LoginPageTester(login_url=mock_login_url)

    if tester.driver: # Only proceed if WebDriver initialized successfully
        try:
            print("\n--- Running Test Cases ---")

            # Test Case 1: Valid Credentials
            tester.test_valid_login("correct_user", "correct_password", expected_success_url_part="success_page")
            time.sleep(1) # Give some time for observations

            # Test Case 2: Invalid Credentials
            tester.test_invalid_login("wrong_user", "wrong_password", expected_error_message_id="error-message")
            time.sleep(1)

            # Test Case 3: Invalid Username, Valid Password (example)
            tester.test_invalid_login("non_existent_user", "correct_password", expected_error_message_id="error-message")
            time.sleep(1)

            # Test Case 4: Valid Username, Invalid Password (example)
            tester.test_invalid_login("correct_user", "wrong_password", expected_error_message_id="error-message")
            time.sleep(1)

        finally:
            final_results = tester.get_results()
            print("\n--- Test Results Summary ---")
            print(f"Total Tests Run: {final_results['success'] + final_results['failure']}")
            print(f"Successful Tests: {final_results['success']}")
            print(f"Failed Tests: {final_results['failure']}")
            print("\nDetails:")
            for detail in final_results["details"]:
                print(f"- Test: {detail['test']}, Status: {detail['status']}, Reason: {detail.get('reason', 'N/A')}")

            # In a real scenario, you would capture a screenshot here, before the driver is closed,
            # e.g., tester.driver.save_screenshot("test_results_summary.png")
            print("\nNote: A screenshot capturing these results would typically be saved here.")
            tester.close() # Ensure driver is closed even if an error occurs during tests
    else:
        print("\nAutomated tests could not be run due to WebDriver initialization failure. Please check the error message above for details.")


Initializing WebDriver (make sure chromedriver is in your PATH or specified).
Error initializing WebDriver: Message: session not created: probably user data directory is already in use, please specify a unique value for --user-data-dir argument, or don't use --user-data-dir
Stacktrace:
#0 0x5a6a85ffa45a <unknown>
#1 0x5a6a85a9f760 <unknown>
#2 0x5a6a85ada0d8 <unknown>
#3 0x5a6a85ad52cf <unknown>
#4 0x5a6a85b258d6 <unknown>
#5 0x5a6a85b24f96 <unknown>
#6 0x5a6a85b16c23 <unknown>
#7 0x5a6a85ae34a5 <unknown>
#8 0x5a6a85ae4111 <unknown>
#9 0x5a6a85fbef1b <unknown>
#10 0x5a6a85fc2e19 <unknown>
#11 0x5a6a85fa5ac9 <unknown>
#12 0x5a6a85fc39c8 <unknown>
#13 0x5a6a85f8a34f <unknown>
#14 0x5a6a85fe7a28 <unknown>
#15 0x5a6a85fe7c06 <unknown>
#16 0x5a6a85ff9336 <unknown>
#17 0x7fe1eb0adac3 <unknown>

Please ensure you have Chrome browser installed and chromedriver in your PATH or provided its path correctly.
If running in a constrained environment (e.g., Docker, CI/CD), --no-sandbox and --disable-
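
The error message above suggests giving Chrome a unique `--user-data-dir`, and hosts without a display generally also need headless mode. The following is a minimal sketch under those assumptions, not a confirmed fix for this notebook's environment: `chrome_arguments` and `make_driver` are hypothetical helpers, and `--headless=new` assumes a reasonably recent Chrome (109 or later). Selenium is imported lazily so that the flag-building helper can be inspected without a browser installed.

```python
import tempfile


def chrome_arguments(headless=True, profile_dir=None):
    """Build Chrome flags for a containerised host without a display."""
    if profile_dir is None:
        # A fresh temporary directory gives each session its own profile.
        profile_dir = tempfile.mkdtemp(prefix="chrome-profile-")
    args = [
        "--no-sandbox",
        "--disable-dev-shm-usage",
        f"--user-data-dir={profile_dir}",
    ]
    if headless:
        args.append("--headless=new")
    return args


def make_driver(headless=True):
    """Create a Chrome WebDriver with the flags above (requires selenium and Chrome)."""
    from selenium import webdriver  # imported lazily; only needed when a driver is created

    options = webdriver.ChromeOptions()
    for arg in chrome_arguments(headless=headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)


# Inspect the flags without starting a browser (the path is an illustrative value).
print(chrome_arguments(profile_dir="/tmp/example-profile"))
```

A driver built this way could be assigned to `self.driver` inside `LoginPageTester.__init__` in place of the direct `webdriver.Chrome(...)` calls; whether that resolves the session error depends on the Chrome build available on the host.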

In [None]:
# Part 2: Practical Implementation (60%)
# Task 3: Predictive Analytics for Resource Allocation
# Goal: Use Kaggle Breast Cancer Dataset to predict issue priority (high/medium/low).
# (Note: The Breast Cancer Dataset is a binary classification dataset. We will simulate
# "issue priority" by mapping the malignant/benign outcome to high/low priority for demonstration.
# For a true multi-class "issue priority" prediction, a different dataset with priority labels
# would be needed. This serves as an example of classification for resource allocation.)

# Deliverable: Jupyter Notebook + performance metrics.

import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, classification_report
from sklearn.preprocessing import LabelEncoder
import numpy as np

# --- 1. Load and Explore Data ---
# The Breast Cancer Wisconsin (Diagnostic) dataset is included in scikit-learn.
# It's a classic dataset for binary classification.
print("Loading Breast Cancer Dataset...")
cancer = load_breast_cancer()
X = pd.DataFrame(cancer.data, columns=cancer.feature_names)
y = pd.Series(cancer.target) # 0 for malignant, 1 for benign

print("Dataset Loaded. Features (X) shape:", X.shape)
print("Target (y) shape:", y.shape)
print("\nFirst 5 rows of Features (X):")
print(X.head())
print("\nTarget unique values (0: malignant, 1: benign):", y.unique())
print("Target value counts:\n", y.value_counts())

# --- 2. Preprocess Data ---
# The dataset is relatively clean, but we need to map the target to "issue priority"
# and split into training and testing sets.

# For the purpose of this assignment, we'll map:
# Malignant (0) -> High Priority
# Benign (1)    -> Low Priority
# This is a simplification to fit the "issue priority" theme.
# In a real scenario, you'd have actual issue priority labels.

# Create a mapping for priority
priority_map = {0: 'High', 1: 'Low'}
y_priority = y.map(priority_map)

print("\nTransformed Target (y_priority) value counts:\n", y_priority.value_counts())

# Split the dataset into training and testing sets
# Stratify ensures that the proportion of target labels is similar in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y_priority, test_size=0.3, random_state=42, stratify=y_priority
)

print("\nData Split:")
print(f"X_train shape: {X_train.shape}, y_train shape: {y_train.shape}")
print(f"X_test shape: {X_test.shape}, y_test shape: {y_test.shape}")
print(f"y_train value counts:\n{y_train.value_counts()}")
print(f"y_test value counts:\n{y_test.value_counts()}")


# --- 3. Train a Model (Random Forest Classifier) ---
print("\nTraining Random Forest Classifier...")
# Initialize the Random Forest Classifier
# random_state for reproducibility
model = RandomForestClassifier(n_estimators=100, random_state=42, class_weight='balanced')
# class_weight='balanced' reweights classes inversely to their frequency, which can help
# with class imbalance, though this dataset is not severely imbalanced.

# Train the model
model.fit(X_train, y_train)
print("Model training complete.")

# --- 4. Evaluate Model Performance ---
print("\nEvaluating Model Performance...")

# Make predictions on the test set
y_pred = model.predict(X_test)

# Calculate Accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

# Calculate F1-score
# For multi-class (if we had High/Medium/Low as true distinct classes),
# 'weighted' F1-score is often appropriate as it accounts for class imbalance.
# Since we mapped to binary, 'binary' or 'weighted' can be used.
# 'pos_label' needed for 'binary' F1 if target is not 0/1.
# Here we'll use 'weighted' to be generalizable to more classes if we extend this.
f1 = f1_score(y_test, y_pred, average='weighted')
print(f"F1-Score (Weighted): {f1:.4f}")

# Generate a detailed classification report
print("\nClassification Report:")
print(classification_report(y_test, y_pred))

# --- Feature Importance (Bonus Insight) ---
# Random Forest can also tell us which features were most important for prediction.
print("\nFeature Importances (Top 10):")
feature_importances = pd.Series(model.feature_importances_, index=X.columns)
print(feature_importances.nlargest(10))

# --- Interpretation for Resource Allocation ---
print("\n--- Interpretation for Resource Allocation ---")
print("This model, trained on features analogous to 'symptoms' of an 'issue',")
print("can predict if an 'issue' (e.g., a bug report, a system alert) is 'High Priority' (malignant) or 'Low Priority' (benign).")
print("\nPotential Use Cases:")
print("- Automatically triage incoming bug reports based on their characteristics.")
print("- Allocate senior engineers to 'High Priority' issues and junior engineers to 'Low Priority' issues.")
print("- Prioritize monitoring alerts, escalating 'High Priority' ones to on-call teams.")
print("- Automate responses for 'Low Priority' issues (e.g., sending an FAQ link).")
print(f"\nThe model achieved an accuracy of {accuracy:.2f} and a weighted F1-score of {f1:.2f},")
print("indicating good performance in distinguishing between 'High' and 'Low' priority issues based on the given features.")
print("The classification report provides detailed precision, recall, and F1-scores for each priority class.")


Loading Breast Cancer Dataset...
Dataset Loaded. Features (X) shape: (569, 30)
Target (y) shape: (569,)

First 5 rows of Features (X):
   mean radius  mean texture  mean perimeter  mean area  mean smoothness  \
0        17.99         10.38          122.80     1001.0          0.11840   
1        20.57         17.77          132.90     1326.0          0.08474   
2        19.69         21.25          130.00     1203.0          0.10960   
3        11.42         20.38           77.58      386.1          0.14250   
4        20.29         14.34          135.10     1297.0          0.10030   

   mean compactness  mean concavity  mean concave points  mean symmetry  \
0           0.27760          0.3001              0.14710         0.2419   
1           0.07864          0.0869              0.07017         0.1812   
2           0.15990          0.1974              0.12790         0.2069   
3           0.28390          0.2414              0.10520         0.2597   
4           0.13280          0.19