# Introduction

This assessment aims to evaluate your understanding and application of the concepts covered in the Data Analytics course. You will be tasked with analyzing a dataset related to remote work and mental health, utilizing various data manipulation, statistical analysis, and visualization techniques learned throughout the course. This exercise will help reinforce your skills in data handling, exploratory analysis, and drawing meaningful insights from data.

### Submission Details:

The deadline for submission is 16 November at 11:59 PM. Specific submission details will be shared with you shortly.

### Passing Criteria:

To successfully pass this assessment, you must achieve a score of 80% or higher.
We encourage you to engage with the material and demonstrate your analytical skills. Good luck!


---



# Section 1 - Beginner (25%)


## Shopping Cart System with Discounts

Write a Python program to simulate a shopping cart system for an online store. The program will calculate the total cost of items, apply discounts, and check if the total exceeds a specified budget.

1.	Variables and Lists:
  - Define a `budget` variable with an initial value of 200.
  
  - Create two empty lists called `item_names_list` and `item_prices_list` to store the name and price of each item separately.

In [None]:
# Write your code here
class ShoppingCart:
    def __init__(self, budget):
        """
        Initialize the shopping cart with an empty list and a budget.
        """
        self.cart = []  # List to store items in the cart
        self.budget = budget  # User's budget

    def add_item(self, item_name, price, quantity=1):
        """
        Add an item to the shopping cart.
        """
        self.cart.append({"item_name": item_name, "price": price, "quantity": quantity})
        print(f"Added {quantity} x {item_name} to the cart.")

    def view_cart(self):
        """
        Display all items in the shopping cart.
        """
        if not self.cart:
            print("Your cart is empty.")
            return
        print("\nShopping Cart Contents:")
        for item in self.cart:
            print(f"{item['quantity']} x {item['item_name']} @ ${item['price']:.2f} each")
        print("\n")

    def calculate_total(self):
        """
        Calculate the total cost of items in the cart.
        """
        total = sum(item["price"] * item["quantity"] for item in self.cart)
        return total

    def apply_discount(self, discount_percent):
        """
        Apply a discount to the total cost and return the discounted total.
        """
        total = self.calculate_total()
        discount = total * (discount_percent / 100)
        print(f"A discount of {discount_percent}% has been applied. You saved ${discount:.2f}.")
        return total - discount

    def check_budget(self, discounted_total):
        """
        Check if the total cost exceeds the user's budget.
        """
        if discounted_total > self.budget:
            print(f"Warning: Your total (${discounted_total:.2f}) exceeds your budget of ${self.budget:.2f}.")
        else:
            print(f"Your total (${discounted_total:.2f}) is within your budget of ${self.budget:.2f}.")

def main():
    """
    Main function to run the shopping cart program.
    """
    print("Welcome to the Shopping Cart System!")
    budget = float(input("Enter your budget: "))
    cart = ShoppingCart(budget)  # Create a shopping cart instance with the specified budget
    discounted_total = None  # Set once a discount is applied; reset whenever the cart changes

    while True:
        # Display menu options
        print("\nOptions:")
        print("1. Add item to cart")
        print("2. View cart")
        print("3. Calculate total")
        print("4. Apply discount")
        print("5. Check budget")
        print("6. Exit")

        choice = input("Choose an option: ")

        if choice == "1":
            # Add an item to the cart
            item_name = input("Enter the item name: ")
            price = float(input("Enter the item price: "))
            quantity = int(input("Enter the quantity: "))
            cart.add_item(item_name, price, quantity)
            discounted_total = None  # Any earlier discounted total is now stale
        elif choice == "2":
            # View the cart contents
            cart.view_cart()
        elif choice == "3":
            # Calculate and display the total cost
            total = cart.calculate_total()
            print(f"The total cost of items in your cart is: ${total:.2f}")
        elif choice == "4":
            # Apply a discount to the total cost
            discount = float(input("Enter the discount percentage: "))
            discounted_total = cart.apply_discount(discount)
        elif choice == "5":
            # Check the discounted total if one exists, otherwise the plain total
            if discounted_total is not None:
                cart.check_budget(discounted_total)
            else:
                total = cart.calculate_total()
                cart.check_budget(total)
        elif choice == "6":
            # Exit the program
            print("Thank you for shopping with us!")
            break
        else:
            # Handle invalid input
            print("Invalid option. Please try again.")

if __name__ == "__main__":
    main()


Welcome to the Shopping Cart System!
Enter your budget: 200

Options:
1. Add item to cart
2. View cart
3. Calculate total
4. Apply discount
5. Check budget
6. Exit
Choose an option: 1
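
Step 1 of the task only asks for a `budget` variable and two empty lists. As a minimal sketch of just that step, independent of the class-based answer, the setup could look like the following. The sample item (`"notebook"` at 4.50) is hypothetical and is only there to show how the two lists stay index-aligned.

```python
# Minimal sketch of step 1 only: a budget and two parallel lists.
budget = 200
item_names_list = []   # names, in the order items are added
item_prices_list = []  # prices, index-aligned with item_names_list

# Hypothetical example item, appended to both lists so they stay aligned
item_names_list.append("notebook")
item_prices_list.append(4.50)

remaining = budget - sum(item_prices_list)
print(f"Remaining budget: ${remaining:.2f}")
```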


2. Functions:
  - Write a function `add_item_to_cart(item_name, item_price)` that takes the item’s name and price as arguments, appends the name to `item_names` and the price to `item_prices`, and returns both updated lists.
  
  - Write a function `calculate_total(item_prices)` that calculates and returns the total cost of all items in `item_prices`.

    Conditions:
    - If the total cost exceeds the budget after adding an item, print "Budget exceeded!" and stop adding more items.
    - If the total cost is within budget and exceeds $100, apply a 10% discount on the total and print the discounted total.

In [2]:
def add_item_to_cart(item_name, item_price, item_names, item_prices, budget):
    """
    Add an item to the cart if the total stays within budget.
    """
    total = calculate_total(item_prices) + item_price
    if total > budget:
        print("Budget exceeded! Cannot add more items.")
        return item_names, item_prices, False
    else:
        item_names.append(item_name)
        item_prices.append(item_price)
        print(f"Added {item_name} for ${item_price:.2f}.")
        return item_names, item_prices, True


def calculate_total(item_prices):
    """
    Calculate the total cost of items in the cart.
    """
    return sum(item_prices)


def main():
    """
    Run the shopping cart program: add items, check budget, and apply discounts.
    """
    item_names = []  # Store names of items
    item_prices = []  # Store prices of items
    budget = float(input("Enter your budget: "))  # User budget input

    while True:
        # Input item details or finish adding items
        item_name = input("Enter the item name (or 'done' to finish): ")
        if item_name.lower() == 'done':
            break
        item_price = float(input(f"Enter the price for {item_name}: "))

        # Add item to the cart
        item_names, item_prices, can_continue = add_item_to_cart(
            item_name, item_price, item_names, item_prices, budget
        )
        if not can_continue:
            break

    # Calculate the final total and check for discounts
    total = calculate_total(item_prices)
    print("\nCart Summary:")
    print("Items:", item_names)
    print("Prices:", item_prices)
    print(f"Total cost: ${total:.2f}")

    if total > 100:
        discount = total * 0.1
        discounted_total = total - discount
        print(f"Discount applied! Final total after 10% discount: ${discounted_total:.2f}")
    else:
        print("No discount applied.")

if __name__ == "__main__":
    main()


Enter your budget: 200
Enter the item name (or 'done' to finish): apple
Enter the price for apple: 10
Added apple for $10.00.
Enter the item name (or 'done' to finish): mouse
Enter the price for mouse: 100
Added mouse for $100.00.
Enter the item name (or 'done' to finish): done

Cart Summary:
Items: ['apple', 'mouse']
Prices: [10.0, 100.0]
Total cost: $110.00
Discount applied! Final total after 10% discount: $99.00
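
The discount rule from the conditions (10% off when the total exceeds $100) can also be checked without interactive input. The sketch below is one possible way to isolate it; `apply_bulk_discount` and its `threshold` and `rate` parameters are hypothetical names, not part of the task statement.

```python
def apply_bulk_discount(total, threshold=100.0, rate=0.10):
    """Return the total after the discount, applied only when total exceeds threshold."""
    if total > threshold:
        return total * (1 - rate)
    return total


prices = [10.0, 100.0]  # same prices as in the sample run
total = sum(prices)
final_total = apply_bulk_discount(total)
print(f"Total: ${total:.2f}, final: ${final_total:.2f}")
```

Keeping the rule in a pure function makes it easy to test edge cases such as a total of exactly $100, which does not qualify because the condition is "exceeds".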


3.	Loop and Input:
  - Begin accepting items only after the user types 'start'.
  - Use a loop to allow the user to add items to the cart by entering an item name and price. The loop should stop when the user types 'done'.
  - For each item, add it to item_names and item_prices using add_item_to_cart, then update the total cost using calculate_total.

Output:
  - After the loop ends, display the final cart with each item and its price, the initial total, any applicable discount, and the final total.


In [4]:
def add_item_to_cart(item_name, item_price, item_names, item_prices):
    """
    Add an item to the cart.
    """
    item_names.append(item_name)
    item_prices.append(item_price)
    print(f"Added {item_name} for ${item_price:.2f}.")
    return item_names, item_prices


def calculate_total(item_prices):
    """
    Calculate the total cost of items in the cart.
    """
    return sum(item_prices)


def main():
    """
    Run the shopping cart program: wait for user input, add items, and calculate totals.
    """
    print("Welcome to the Shopping Cart!")
    start_command = input("Type 'start' to begin: ").strip().lower()

    # Wait for the user to type 'start' before proceeding
    while start_command != "start":
        start_command = input("Type 'start' to begin: ").strip().lower()

    item_names = []  # List to store names of items
    item_prices = []  # List to store prices of items

    print("\nLet's start adding items to the cart!")
    while True:
        # Get user input for item name and price
        item_name = input("Enter the item name (or 'done' to finish): ").strip()
        if item_name.lower() == "done":
            break

        item_price = float(input(f"Enter the price for {item_name}: "))
        # Add item to the cart
        item_names, item_prices = add_item_to_cart(item_name, item_price, item_names, item_prices)

    # Calculate the final total
    initial_total = calculate_total(item_prices)

    print("\nFinal Cart Summary:")
    for name, price in zip(item_names, item_prices):
        print(f"{name}: ${price:.2f}")
    print(f"Initial Total: ${initial_total:.2f}")

    # Apply discount if applicable
    if initial_total > 100:
        discount = initial_total * 0.1
        final_total = initial_total - discount
        print(f"Discount Applied: ${discount:.2f}")
    else:
        discount = 0
        final_total = initial_total
        print("No discount applied.")

    print(f"Final Total: ${final_total:.2f}")


if __name__ == "__main__":
    main()


Welcome to the Shopping Cart!
Type 'start' to begin: starrt
Type 'start' to begin: start

Let's start adding items to the cart!
Enter the item name (or 'done' to finish): apple
Enter the price for apple: 100000
Added apple for $100000.00.
Enter the item name (or 'done' to finish): done

Final Cart Summary:
apple: $100000.00
Initial Total: $100000.00
Discount Applied: $10000.00
Final Total: $90000.00
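
The loop calls `float()` directly on whatever the user types, so a non-numeric price raises `ValueError` and ends the program. A hedged sketch of one way to guard against that is shown below; `parse_price` is a hypothetical helper, and rejecting negative prices is an assumption rather than a stated requirement.

```python
def parse_price(text):
    """Return the price as a float, or None if text is not a valid non-negative number."""
    try:
        price = float(text.strip())
    except ValueError:
        return None
    # Reject negatives and the special floats 'nan' and 'inf'
    if price < 0 or price != price or price == float("inf"):
        return None
    return price


for raw in ["12.50", "abc", "-3", "nan"]:
    print(repr(raw), "->", parse_price(raw))
```

In the interactive loop, a `None` result would prompt the user again instead of crashing.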


In [None]:
# Write your code here

# Section 2 - Intermediate (55%) - Remote Work and Mental Health Analysis

Dataset source: Kaggle (https://www.kaggle.com/datasets/waqi786/remote-work-and-mental-health)




## Objective:
- In the following sections, you will explore the "Remote Work and Mental Health" dataset using Python and different data science libraries such as Pandas, NumPy and Matplotlib.
- Follow the instructions below to complete each task. Please provide code for each question and any observations as comments when necessary.

In [5]:
# Import necessary modules and libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


## 1. Load Dataset (2 marks)
- Instructions: Load the dataset using Pandas and display the first few rows.
- Question: Describe the overall structure (rows, columns, data types) as a comment at the end of your code.


In [None]:
# Load the dataset
data = pd.read_csv('/mnt/data/Impact_of_Remote_Work_on_Mental_Health.csv')

# Display first few rows
print(data.head())

# Describe the dataset structure
print(f"Dataset has {data.shape[0]} rows and {data.shape[1]} columns.")
print("Column data types:")
print(data.dtypes)



## 2. Display 'n' Rows (3 marks)
- Instructions: Display the first 13 rows of the dataset.

In [None]:
# Write code here
# Display the first 13 rows
print(data.head(13))


- Instructions: Display the last 7 rows of the dataset.

In [None]:
# Write code here
print("\nLast 7 rows of the dataset:")
print(data.tail(7))

## 3. Find the Number of Null Values in the Dataset (2 marks)

In [None]:
# Write code here
# Count missing values per column
null_counts = data.isnull().sum()
print("\nNumber of null values in each column:")
print(null_counts)

## 4. Statistical Summary for Numeric Columns (10 marks)
Instructions: Use individual commands to find the statistical summary.

- Count

In [None]:
# Write code here
count_values = data.select_dtypes(include=['int64', 'float64']).count()
print("Count for each numeric column:")
print(count_values)

- Mean

In [None]:
# Write code here
mean_values = data.select_dtypes(include=['int64', 'float64']).mean()
print("\nMean for each numeric column:")
print(mean_values)

- Standard Deviation

In [None]:
# Write code here
std_values = data.select_dtypes(include=['int64', 'float64']).std()
print("\nStandard Deviation for each numeric column:")
print(std_values)

- Quartiles

In [None]:
# Write code here
quartiles = data.select_dtypes(include=['int64', 'float64']).quantile([0.25, 0.5, 0.75])
print("\nQuartiles for each numeric column:")
print(quartiles)

## 5. Calculate Extrema (2 marks)

In [None]:
# Write code here
min_values = data.min(numeric_only=True)
max_values = data.max(numeric_only=True)
print("\nMinimum values for numeric columns:")
print(min_values)
print("\nMaximum values for numeric columns:")
print(max_values)
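
The individual statistics from tasks 4 and 5 can be cross-checked against `DataFrame.describe()`, which reports count, mean, standard deviation, minimum, quartiles and maximum in one table. The sketch below uses a small made-up frame: the values are hypothetical and only the column names are taken from the dataset.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset
toy = pd.DataFrame({
    "Age": [25, 32, 41, 38],
    "Hours_Worked_Per_Week": [40, 45, 38, 50],
    "Job_Role": ["HR", "Sales", "HR", "IT"],
})

# Keep only numeric columns, then summarize them in one call
numeric = toy.select_dtypes(include="number")
summary = numeric.describe()
print(summary)

# Each row of the summary should match the corresponding individual command
print(summary.loc["mean", "Age"] == numeric["Age"].mean())
print(summary.loc["50%", "Age"] == numeric["Age"].quantile(0.5))
```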

## 6. Find Unique Values in a Categorical Column (3 marks)

- Instructions: Identify the unique values in the `Job_Role` column (2 marks)
- Question: How many unique roles are represented in the dataset? (1 mark)

In [None]:
# Write code here
unique_roles = data['Job_Role'].unique()
unique_roles_count = len(unique_roles)
print(f"\nNumber of unique job roles: {unique_roles_count}")
print("Unique job roles:", unique_roles)

## 7. Group Data and Calculate Mean (4 marks)
- Instructions: Group the dataset by `Job_Role` and calculate the mean of the `Work_Life_Balance_Rating` for each role.
- Question: Which job role has the highest average Work Life Balance rating?

In [None]:
# Write code here

mean_wlb = data.groupby('Job_Role')['Work_Life_Balance_Rating'].mean()
highest_wlb_role = mean_wlb.idxmax()
highest_wlb_rating = mean_wlb.max()
print(f"\nJob role with the highest average Work Life Balance Rating: '{highest_wlb_role}' with a rating of {highest_wlb_rating:.2f}.")

## 8. Filter Data Based on Condition (4 marks)
- Instructions: Filter the dataset to show only rows where `Hours_Worked_Per_Week` is greater than 40.
- Question: How many employees are working overtime?

In [None]:
# Write code here
overtime_employees = data[data['Hours_Worked_Per_Week'] > 40]
overtime_count = len(overtime_employees)
print(f"\nNumber of employees working more than 40 hours per week: {overtime_count}")


## 9. Histogram of Work Hours per Week (5 marks)
- Instructions: Create a histogram of `Hours_Worked_Per_Week` (4 marks).
- Question: Describe the distribution of work hours. Are most employees working around a certain number of hours per week? (1 mark)

In [None]:
# Write code here
plt.figure(figsize=(8, 5))
plt.hist(data['Hours_Worked_Per_Week'], bins=10, edgecolor='black')
plt.xlabel('Work Hours Per Week')
plt.ylabel('Frequency')
plt.title('Distribution of Work Hours Per Week')
plt.show()

## 10. Scatter Plot of Work Hours vs. Years_of_Experience (4 marks)
- Instructions: Create a scatter plot with `Hours_Worked_Per_Week` on the x-axis and `Years_of_Experience` on the y-axis.

In [None]:
# Write code here

plt.figure(figsize=(8, 5))
plt.scatter(data['Hours_Worked_Per_Week'], data['Years_of_Experience'], alpha=0.7)
plt.xlabel('Work Hours Per Week')
plt.ylabel('Years of Experience')
plt.title('Work Hours vs. Years of Experience')
plt.show()

## 11. Bar Chart of Average Work Life Balance by Job Role (5 marks)
- Instructions: Create a bar chart showing the average `Work_Life_Balance_Rating` for each `Job_Role` (4 marks).
- Question: Which job roles have the highest and lowest average Work Life Balance? (1 mark)

In [None]:
# Write code here
mean_wlb.sort_values().plot(kind='bar', figsize=(10, 5), color='skyblue', edgecolor='black')
plt.xlabel('Job Role')
plt.ylabel('Average Work Life Balance Rating')
plt.title('Average Work Life Balance by Job Role')
plt.xticks(rotation=45, ha='right')
plt.show()

## 12. Pie Chart of Access to Mental Health Resources (5 marks)
- Instructions: Use a pie chart to show the proportion of `Access_to_Mental_Health_Resources` (Yes and No) in the dataset (4 marks).
- Question: What percentage of employees have access to mental health resources? (1 mark)

In [None]:
# Write code here

mental_health_resources = data['Access_to_Mental_Health_Resources'].value_counts()
plt.figure(figsize=(6, 6))
# Labels come from the value_counts() index, so each slice keeps its true category
mental_health_resources.plot(kind='pie', autopct='%1.1f%%', startangle=90, colors=['lightblue', 'lightcoral'])
plt.title('Access to Mental Health Resources')
plt.ylabel('')
plt.show()
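
The question asks for a percentage, which can be read off the chart or computed directly. A minimal sketch using `value_counts(normalize=True)` follows; the five sample answers are hypothetical and do not reflect the real dataset.

```python
import pandas as pd

# Hypothetical answers standing in for the real column
access = pd.Series(
    ["Yes", "No", "No", "Yes", "No"],
    name="Access_to_Mental_Health_Resources",
)

# normalize=True returns proportions instead of raw counts
percentages = access.value_counts(normalize=True) * 100
yes_pct = percentages["Yes"]
print(f"Employees with access: {yes_pct:.1f}%")
```

Note that `value_counts()` orders its result by frequency, so the most common answer comes first regardless of whether it is "Yes" or "No".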

## 13. Scatter Plot of Age vs. Social Isolation Rating (6 marks)
- Instructions: Create a scatter plot with `Age` on the x-axis and `Social_Isolation_Rating` on the y-axis (4 marks).
- Question: Do you observe any trends or relationships between age and social isolation? Is there a noticeable impact of age on isolation? (2 marks)

In [None]:
# Write code here
plt.figure(figsize=(8, 5))
plt.scatter(data['Age'], data['Social_Isolation_Rating'], alpha=0.7)
plt.xlabel('Age')
plt.ylabel('Social Isolation Rating')
plt.title('Age vs. Social Isolation Rating')
plt.show()
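
A scatter plot of a continuous variable against a small set of integer ratings can be hard to read by eye, so a correlation coefficient is a useful companion when answering the question. The sketch below computes the Pearson correlation with `Series.corr`; the five rows are hypothetical and chosen only to illustrate the call, so the resulting value says nothing about the real dataset.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the dataset
toy = pd.DataFrame({
    "Age": [22, 30, 41, 55, 60],
    "Social_Isolation_Rating": [1, 2, 3, 4, 5],
})

# Pearson correlation: close to +1 or -1 suggests a linear trend, close to 0 suggests none
r = toy["Age"].corr(toy["Social_Isolation_Rating"])
print(f"Pearson r = {r:.3f}")
```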

# Section 3 - Long Answer/Advanced (20%)



## Job Role and Workload Level Impact on Mental Health

Instructions: Investigate the influence of job role and workload level on mental health.
- Create a new column `workload_level` that labels each entry as "High" if the `Hours_Worked_Per_Week` is above its mean, otherwise "Low." (5 marks)
- Group the dataset by `Industry` and calculate the average `Hours_Worked_Per_Week` for each combination. (5 marks)
- Use a bar chart to display the average `Stress_Level` for each job role, with separate bars for high and low workload levels. (5 marks)
- Analyze the results: Which job roles and workload levels appear to have the greatest impact on mental health? (5 marks)


In [None]:

# 1. Create a new column 'workload_level'
mean_hours = data['Hours_Worked_Per_Week'].mean()
data['workload_level'] = data['Hours_Worked_Per_Week'].apply(lambda x: 'High' if x > mean_hours else 'Low')
print("\nAdded 'workload_level' column based on mean Hours_Worked_Per_Week.")

# 2. Group the dataset by Industry and calculate the average Hours_Worked_Per_Week for each combination
industry_avg_hours = data.groupby('Industry')['Hours_Worked_Per_Week'].mean()
print("\nAverage Hours Worked Per Week by Industry:")
print(industry_avg_hours)

# 3. Bar chart: Average Stress_Level for each job role with separate bars for high and low workload levels
# Map Stress_Level to numeric values for aggregation
stress_mapping = {'Low': 1, 'Medium': 2, 'High': 3}
data['Stress_Level_Numeric'] = data['Stress_Level'].map(stress_mapping)

# Calculate average stress levels by Job_Role and workload_level
avg_stress = data.groupby(['Job_Role', 'workload_level'])['Stress_Level_Numeric'].mean().unstack()

# Plotting the bar chart
avg_stress.plot(kind='bar', figsize=(10, 6), color=['skyblue', 'salmon'], edgecolor='black')
plt.xlabel('Job Role')
plt.ylabel('Average Stress Level')
plt.title('Average Stress Level by Job Role and Workload Level')
# Keep the legend labels pandas derives from the columns ('High', 'Low') so bars are not mislabeled
plt.legend(title='Workload Level')
plt.xticks(rotation=45, ha='right')
plt.show()

# 4. Analysis of results
print("\nAnalysis:")
print("Job roles with the highest stress levels under high workload:")
highest_stress_high = avg_stress['High'].idxmax()
print(f"- {highest_stress_high}: {avg_stress['High'].max():.2f} (High Workload)")

print("\nJob roles with the highest stress levels under low workload:")
highest_stress_low = avg_stress['Low'].idxmax()
print(f"- {highest_stress_low}: {avg_stress['Low'].max():.2f} (Low Workload)")

# Additional insight: Differences in stress levels between workload levels
stress_diff = avg_stress['High'] - avg_stress['Low']
most_impactful_job_role = stress_diff.idxmax()
print("\nJob role most impacted by workload level difference:")
print(f"- {most_impactful_job_role}: Difference = {stress_diff.max():.2f}")
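
One detail worth verifying in this kind of analysis is the column order produced by `unstack()`: the columns are sorted, so `'High'` comes before `'Low'`, and any manually supplied legend or color list has to follow that order. The sketch below reproduces the pipeline on a hypothetical four-row frame and uses `np.where` as a vectorized alternative to `apply` for building `workload_level`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names come from the dataset
toy = pd.DataFrame({
    "Job_Role": ["HR", "HR", "IT", "IT"],
    "Hours_Worked_Per_Week": [30, 50, 35, 55],
    "Stress_Level": ["Low", "High", "Medium", "High"],
})

# Label rows above the mean number of hours as "High", the rest as "Low"
mean_hours = toy["Hours_Worked_Per_Week"].mean()
toy["workload_level"] = np.where(toy["Hours_Worked_Per_Week"] > mean_hours, "High", "Low")

# Map the ordinal stress labels to numbers so they can be averaged
toy["Stress_Level_Numeric"] = toy["Stress_Level"].map({"Low": 1, "Medium": 2, "High": 3})

avg_stress = (
    toy.groupby(["Job_Role", "workload_level"])["Stress_Level_Numeric"]
    .mean()
    .unstack()
)
print(list(avg_stress.columns))
print(avg_stress)
```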
