### **TDI Data Science Track** <br>Week 5 Task: Data Structures and Object Oriented Programming

In [1]:
import csv

#### **1. Classes and Inheritance**

a.    Define a class Employee with attributes for name, age, and salary (private attribute) <br>
ii.	A method to get and set the value of the salary. <br>
iii.	A method to display the employee’s details.

In [2]:
class Employee:
    def __init__(self, name, age, salary):
        self.name = name 
        self.age = age
        self.__salary = salary    #private attribute
        
    # method to get the salary
    def get_salary(self):
        return self.__salary

    # method to set the salary
    def set_salary(self, salary):
        self.__salary = salary

    # method to display employee details
    def display_details(self):
        print(f"Name:{self.name}, Age:{self.age}, Salary: {self.salary}")

i.	Describe the difference between private and public attributes in Python classes. 
- When would you choose to make an attribute private?

**Public Attributes** <br> 
- **Access:** Public Attributes are accessible from both inside and outside the class.
- **Convention:** Attributes without any underscores (e.g name) are considered public.
- **Modification:** Can be modified freely
- **Purpose:** Used for attributes that can be openly accessed and unchanged. 

**Private Attributes** <br> 
- **Access:** Private Attributes are accessible from only inside the class.
- **Convention:** Attributes with two leading underscores (e.g __salary) are considered private.
- **Modification:** Cannot be modified directly, requires methods to access or change.
- **Purpose:** Used for sensitive data that should be hidden and protected.

**When to make an attribute private:** <br>
- **Encapsulation:** When the data needs to be hidden which helps control access to sensitive data and ensures integrity of the data.
- **Prevent External Modification:** This is for attributes that are critical to the object's internal state or logic.


b.	Explain how inheritance can be used to create a class hierarchy for different employee types such as Employee, Manager, and Intern. You can provide code examples if necessary.


**Inheritance:** Inheritance allows creation of a class hierarchy where one class (the child class) inherits attributes and methods from another class (the parent class). This is useful for representing relationships between different types of objects, like different types of employees in a company. <br>
An example:
- **Employee:** The base class that contains common attributes and methods for employees.
- **Manager:** A subclass that extends Employee and adds additional features like managing department.
- **Interns:** Another subclass that extends Employee but might have different methods and attributes specific to interns.

In [3]:
# Manager inheriting Employee class attributes
class Manager(Employee):
    def __init__(self, name, age, salary, department, position):
        # call the parent class constructor using super()
        super().__init__(name, age, salary)
        self.department = department
        self.position = position

c.	Create a Python program where the Manager class extends the Employee class. Override the method responsible for displaying employee details to include extra information about the manager's departments and their respective positions.

#### **2.	Reading and Slicing Datasets**

a.	Write a Python function to read data from a CSV file into a list of dictionaries, where each dictionary represents a row in the file. <br> Limit the data read to the first 20 rows.

In [4]:
# read csv file into a list of dictionaries
file_path = "Employees-Dataset01.csv"
with open(file_path, 'r') as file:
    # create a CSV reader object
    csvreader = csv.DictReader(file)
    
    # convert to a list of dictionaries
    data = [row for row in csvreader]

In [5]:
# display only first 20 rows
limit = 0
for record in data:
    if limit < 20:
        limit = limit + 1
        print(f"{limit} {record}")

1 {'Name': 'John Doe', 'Age': '28', 'Salary': '5000000', 'Department': 'Engineering'}
2 {'Name': 'Jane Smith', 'Age': '32', 'Salary': '5500000', 'Department': 'Marketing'}
3 {'Name': 'Emily Johnson', 'Age': '45', 'Salary': '7000000', 'Department': 'Finance'}
4 {'Name': 'Michael Brown', 'Age': '30', 'Salary': '6000000', 'Department': 'Engineering'}
5 {'Name': 'Linda Davis', 'Age': '27', 'Salary': '4800000', 'Department': 'Sales'}
6 {'Name': 'James Wilson', 'Age': '40', 'Salary': '7200000', 'Department': 'Finance'}
7 {'Name': 'Mary Moore', 'Age': '35', 'Salary': '6200000', 'Department': 'Engineering'}
8 {'Name': 'David Taylor', 'Age': '50', 'Salary': '7500000', 'Department': 'Marketing'}
9 {'Name': 'Laura Anderson', 'Age': '29', 'Salary': '5100000', 'Department': 'Sales'}
10 {'Name': 'Chris Thomas', 'Age': '42', 'Salary': '6800000', 'Department': 'Engineering'}
11 {'Name': 'Patricia Jackson', 'Age': '33', 'Salary': '5900000', 'Department': 'Marketing'}
12 {'Name': 'Robert Harris', 'Age':

b. Slice the dataset to get all employees with a salary above average salary. <br> Extract only the Name and Department fields from each dictionary. <br>The output should be sorted by name in ascending order.


In [6]:
# load salary data into a list
salary_data = [item["Salary"] for item in data]

# create an empty list to save salary data
int_salary = []

# drop salary data that has string values
for salary in salary_data:
    try:
        converted_salary = int(salary)
    except ValueError:
        convert_salary = None
    int_salary.append(converted_salary)

# replace salary with int_salary in data
num = 0
for item in data:
    item["Salary"] = int_salary[num]
    num = num + 1

# calculating the average value
avg_salary = sum(int_salary)/len(int_salary)

print(f"Average salary earned by workers: {avg_salary}.")
print("These are the workers who earn above the average pay and their departments:")

# sort data according to name
sorted_data = sorted(data, key=lambda data: data["Name"])

for item in sorted_data:
    if item["Salary"] > avg_salary:
        print(f"({item["Name"]}, {item['Department']}, N{item['Salary']})")

Average salary earned by workers: 3354651.35.
These are the workers who earn above the average pay and their departments:
(Alexander Rogers, Sales, N6200000)
(Alice -16, Finance, N4853380)
(Alice -2, , N3383944)
(Alice -31, , N4171861)
(Alice -36, Marketing, N3532971)
(Alice -49, Sales, N3501467)
(Alice 17, , N4564862)
(Alice 65, Sales, N4544799)
(Alice 73, Finance, N3852378)
(Alice 98, , N4039515)
(Amanda Young, Marketing, N5900000)
(Angela Green, Finance, N5200000)
(Ashley Turner, Finance, N6000000)
(Ava Carter, Marketing, N5300000)
(Barbara Walker, Marketing, N6100000)
(Bob -21, Sales, N3861301)
(Bob -34, Engineering, N4805822)
(Bob 50, Marketing, N4344360)
(Bob 80, Sales, N4704784)
(Brian Wright, Engineering, N6800000)
(Charles Scott, Sales, N6000000)
(Charlie 23, , N4872582)
(Charlie 42, Sales, N4805211)
(Charlie 6, Finance, N4864284)
(Charlie 99, , N4547808)
(Charlotte Wright, Finance, N5500000)
(Chloe Coleman, Marketing, N6600000)
(Chris Thomas, Engineering, N6800000)
(Daniel Ha

#### **3.	Integrating Classes with Dataset Operations**

#### **a.	Create Employee Objects from CSV: <br>**

**i.	Write a function using the Employee class that reads employee data from a CSV file. <br>For each row in the CSV, create an employee object and store these objects in a list.**

In [7]:
# creating Employee class
class Employee:
    def __init__(self, name, age, salary, department):
        self.name = name 
        self.age = age
        self.salary = salary
        self.department = department

    def __repr__(self):
        return f"[{self.name}, {self.age}, {self.salary}, {self.department}]"
    
# create a function to convert salary and age to integer
def convert_to_int(value):
        try:
            return int(value)
        except ValueError:
            return None     # return None if the conversion fails

In [8]:
# creating a function that reads employee data and stores them in a list
def read_employee_from_csv(file_path):
    employees = []
    with open(file_path, "r") as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            # create Employee object for each row and append to the list
            employee = Employee(name=row["Name"], 
                                age = convert_to_int(row["Age"]), 
                                salary = convert_to_int(row["Salary"]), 
                                department = row["Department"])
            employees.append(employee)
    return employees

In [9]:
# call the function and view employee data
employees = read_employee_from_csv("Employees-Dataset01.csv")
for employee in employees:
    print(employee)

[John Doe, 28, 5000000, Engineering]
[Jane Smith, 32, 5500000, Marketing]
[Emily Johnson, 45, 7000000, Finance]
[Michael Brown, 30, 6000000, Engineering]
[Linda Davis, 27, 4800000, Sales]
[James Wilson, 40, 7200000, Finance]
[Mary Moore, 35, 6200000, Engineering]
[David Taylor, 50, 7500000, Marketing]
[Laura Anderson, 29, 5100000, Sales]
[Chris Thomas, 42, 6800000, Engineering]
[Patricia Jackson, 33, 5900000, Marketing]
[Robert Harris, 37, 6400000, Finance]
[Jennifer Martin, 26, 4700000, Sales]
[William Lewis, 31, 5300000, Engineering]
[Barbara Walker, 34, 6100000, Marketing]
[Daniel Hall, 38, 6500000, Finance]
[Elizabeth Young, 39, 6700000, Sales]
[Joseph Allen, 25, 4500000, Engineering]
[Sarah King, 44, 6900000, Marketing]
[Charles Scott, 30, 6000000, Sales]
[Angela Green, 29, 5200000, Finance]
[Richard Adams, 28, 5400000, Engineering]
[Jessica Nelson, 31, 5700000, Marketing]
[Matthew Carter, 45, 7000000, Sales]
[Nancy Mitchell, 37, 6200000, Finance]
[Kevin Perez, 40, 6500000, Engine

#### **b. Filter by Age** <br>
**i.	How many employees are over 30 years old?**

In [10]:
ova_30 = [employee for employee in employees 
          if employee.age is not None 
          and employee.age > 30]

print(f"{len(ova_30)} employees are above 30yrs of age.")

135 employees are above 30yrs of age.


**ii.	Which employees are younger than 25 years?**

In [11]:
print("Employees below 25 yrs:")
[employee for employee in employees 
    if employee.age is not None 
    and employee.age < 25]

Employees below 25 yrs:


[[Frank -22, 24, None, Marketing],
 [Grace -4, 22, 2897633, ],
 [Charlie 34, 24, 820924, Engineering],
 [Isaac 35, 24, 1101174, Marketing],
 [Isaac 48, 23, 3060254, Engineering],
 [Bob 62, 23, 2245289, Finance],
 [Frank 84, 24, 2808696, Engineering],
 [Frank 87, 22, 2458722, Finance],
 [Frank 89, 23, 465751, Engineering]]

**iii. List the employees who are between 28 and 35 years old.**

In [12]:
print("Employees between 28-35 years old:")

[employee 
    for employee in employees 
    if employee.age is not None 
    and employee.age >= 28 
    and employee.age <= 35]

Employees between 28-35 years old:


[[John Doe, 28, 5000000, Engineering],
 [Jane Smith, 32, 5500000, Marketing],
 [Michael Brown, 30, 6000000, Engineering],
 [Mary Moore, 35, 6200000, Engineering],
 [Laura Anderson, 29, 5100000, Sales],
 [Patricia Jackson, 33, 5900000, Marketing],
 [William Lewis, 31, 5300000, Engineering],
 [Barbara Walker, 34, 6100000, Marketing],
 [Charles Scott, 30, 6000000, Sales],
 [Angela Green, 29, 5200000, Finance],
 [Richard Adams, 28, 5400000, Engineering],
 [Jessica Nelson, 31, 5700000, Marketing],
 [Megan Roberts, 32, 5800000, Marketing],
 [Joshua Evans, 29, 5300000, Sales],
 [Ashley Turner, 35, 6000000, Finance],
 [Amanda Young, 33, 5900000, Marketing],
 [Edward Cooper, 31, 5500000, Sales],
 [Sophia Hall, 30, 5600000, Finance],
 [Isabella King, 29, 5100000, Finance],
 [Ava Carter, 28, 5300000, Marketing],
 [Alexander Rogers, 35, 6200000, Sales],
 [Charlotte Wright, 32, 5500000, Finance],
 [Henry Reed, 31, 5700000, Engineering],
 [Lily Ross, 30, 5400000, Marketing],
 [Jack Peterson, 29, 510

**iv. Who are the employees that are exactly 40 years old?**

In [13]:
print("Below are employees exactly 40yrs:")
[employee 
    for employee in employees 
    if employee.age == 40]

Below are employees exactly 40yrs:


[[James Wilson, 40, 7200000, Finance],
 [Kevin Perez, 40, 6500000, Engineering],
 [Jacob Morris, 40, 6700000, Engineering],
 [Chloe Coleman, 40, 6600000, Marketing],
 [Alice -48, 40, 2032806, Sales],
 [Diana -14, 40, 1787738, Engineering],
 [Charlie 30, 40, 2534602, Sales],
 [Judy 82, 40, None, Finance]]

#### **Filter by Salary**

**v.	Which employees have a salary higher than ₦1,500,000 per month?**

In [14]:
print("Below are employees who earn above 1500000:")
[employee 
    for employee in employees 
    if employee.salary is not None 
    and employee.salary > 1500000]

Below are employees who earn above 1500000:


[[John Doe, 28, 5000000, Engineering],
 [Jane Smith, 32, 5500000, Marketing],
 [Emily Johnson, 45, 7000000, Finance],
 [Michael Brown, 30, 6000000, Engineering],
 [Linda Davis, 27, 4800000, Sales],
 [James Wilson, 40, 7200000, Finance],
 [Mary Moore, 35, 6200000, Engineering],
 [David Taylor, 50, 7500000, Marketing],
 [Laura Anderson, 29, 5100000, Sales],
 [Chris Thomas, 42, 6800000, Engineering],
 [Patricia Jackson, 33, 5900000, Marketing],
 [Robert Harris, 37, 6400000, Finance],
 [Jennifer Martin, 26, 4700000, Sales],
 [William Lewis, 31, 5300000, Engineering],
 [Barbara Walker, 34, 6100000, Marketing],
 [Daniel Hall, 38, 6500000, Finance],
 [Elizabeth Young, 39, 6700000, Sales],
 [Joseph Allen, 25, 4500000, Engineering],
 [Sarah King, 44, 6900000, Marketing],
 [Charles Scott, 30, 6000000, Sales],
 [Angela Green, 29, 5200000, Finance],
 [Richard Adams, 28, 5400000, Engineering],
 [Jessica Nelson, 31, 5700000, Marketing],
 [Matthew Carter, 45, 7000000, Sales],
 [Nancy Mitchell, 37, 62

**vi. Can you find the employees earning less than ₦800,000 per month?**

In [15]:
print("Below are employees who earn below 800000:")
[employee 
    for employee in employees 
    if employee.salary is not None 
    and employee.salary < 800000]

Below are employees who earn below 800000:


[[Alice -24, 43, 342318, Sales],
 [Eve 0, None, 494658, Engineering],
 [Isaac 5, 29, 184861, ],
 [Charlie 33, 42, 511145, Sales],
 [Frank 36, 27, 496734, Finance],
 [Judy 37, 45, 169870, Sales],
 [Eve 40, None, 789253, Sales],
 [Isaac 52, 39, 460634, Sales],
 [Judy 54, 29, 79511, Finance],
 [Grace 61, 26, 100843, ],
 [Bob 66, 43, 718757, Sales],
 [Eve 68, 27, 581845, Sales],
 [Frank 72, 35, 273081, Marketing],
 [Hannah 83, 48, 220203, ],
 [Diana 85, 50, 149248, Sales],
 [Frank 89, 23, 465751, Engineering]]

**vii. List the employees whose salaries are between ₦1,000,000 and ₦2,000,000 per month.**

In [16]:
print("Below are employees who earn between 1000000 and 2000000:")

[employee for employee in employees 
    if employee.salary is not None 
    and employee.salary >= 1000000
    and employee.salary <= 2000000]

Below are employees who earn between 1000000 and 2000000:


[[Grace -46, 37, 1301649, Marketing],
 [Isaac -44, 32, 1245294, Engineering],
 [Charlie -41, 25, 1153283, Sales],
 [Bob -39, 50, 1098982, Finance],
 [Bob -26, 35, 1495519, Sales],
 [Bob -23, None, 1677722, Engineering],
 [Eve -19, None, 1035627, Finance],
 [Isaac -18, 33, 1687415, Engineering],
 [Isaac -17, None, 1321984, Marketing],
 [Alice -15, 26, 1728769, Engineering],
 [Diana -14, 40, 1787738, Engineering],
 [Eve -12, 25, 1183812, Engineering],
 [Eve -11, None, 1634495, Sales],
 [Eve -8, 26, 1979065, ],
 [Eve -7, 35, 1197784, Marketing],
 [Frank -3, 42, 1636208, Engineering],
 [Grace -1, 27, 1971925, Finance],
 [Charlie 8, 48, 1577020, Finance],
 [Judy 14, 38, 1426708, Sales],
 [Diana 18, 48, 1150468, Marketing],
 [Alice 25, 33, 1436477, Sales],
 [Judy 27, None, 1676923, Engineering],
 [Eve 32, 34, 1140081, Sales],
 [Isaac 35, 24, 1101174, Marketing],
 [Frank 46, 50, 1432713, Sales],
 [Eve 53, 27, 1808817, Engineering],
 [Bob 69, 39, 1876938, Marketing],
 [Isaac 81, 28, 1600737, F

**viii. Who earns exactly ₦1,200,000 per month?**

In [17]:
print("Below are employees who earn exactly 1200000:")
[employee
    for employee in employees
    if employee.salary == 1200000]

Below are employees who earn exactly 1200000:


[]

#### **Combined Filters**

**ix. Which employees over 30 years old earn more than ₦1,500,000 per month?**

In [18]:
[employee 
    for employee in employees
    if employee.age is not None and employee.age > 30 
    and employee.salary is not None and employee.salary > 1500000]

[[Jane Smith, 32, 5500000, Marketing],
 [Emily Johnson, 45, 7000000, Finance],
 [James Wilson, 40, 7200000, Finance],
 [Mary Moore, 35, 6200000, Engineering],
 [David Taylor, 50, 7500000, Marketing],
 [Chris Thomas, 42, 6800000, Engineering],
 [Patricia Jackson, 33, 5900000, Marketing],
 [Robert Harris, 37, 6400000, Finance],
 [William Lewis, 31, 5300000, Engineering],
 [Barbara Walker, 34, 6100000, Marketing],
 [Daniel Hall, 38, 6500000, Finance],
 [Elizabeth Young, 39, 6700000, Sales],
 [Sarah King, 44, 6900000, Marketing],
 [Jessica Nelson, 31, 5700000, Marketing],
 [Matthew Carter, 45, 7000000, Sales],
 [Nancy Mitchell, 37, 6200000, Finance],
 [Kevin Perez, 40, 6500000, Engineering],
 [Megan Roberts, 32, 5800000, Marketing],
 [Ashley Turner, 35, 6000000, Finance],
 [Brian Wright, 42, 6800000, Engineering],
 [Amanda Young, 33, 5900000, Marketing],
 [Edward Cooper, 31, 5500000, Sales],
 [Jacob Morris, 40, 6700000, Engineering],
 [Samuel Allen, 36, 6400000, Sales],
 [Ethan Mitchell, 3

**x. Can you list the employees who are under 35, earn between ₦1,000,000 and ₦2,000,000 per month, and work in the Sales department?**

In [19]:
[employee
    for employee in employees
    if employee.age is not None and employee.salary is not None
    and employee.salary >= 1000000 and employee.salary <= 2000000
    and employee.age < 35 and employee.department == "Sales" ]

[[Charlie -41, 25, 1153283, Sales],
 [Alice 25, 33, 1436477, Sales],
 [Eve 32, 34, 1140081, Sales]]

#### **Filter by Department and Position**

**xi.	Who are the employees working in the Engineering department?**

In [20]:
[employee
    for employee in employees
    if employee.department == "Engineering"]

[[John Doe, 28, 5000000, Engineering],
 [Michael Brown, 30, 6000000, Engineering],
 [Mary Moore, 35, 6200000, Engineering],
 [Chris Thomas, 42, 6800000, Engineering],
 [William Lewis, 31, 5300000, Engineering],
 [Joseph Allen, 25, 4500000, Engineering],
 [Richard Adams, 28, 5400000, Engineering],
 [Kevin Perez, 40, 6500000, Engineering],
 [Brian Wright, 42, 6800000, Engineering],
 [Jacob Morris, 40, 6700000, Engineering],
 [Ethan Mitchell, 38, 6900000, Engineering],
 [Henry Reed, 31, 5700000, Engineering],
 [Mason Sanders, 37, 6300000, Engineering],
 [Charlie -47, 37, 2132539, Engineering],
 [Isaac -44, 32, 1245294, Engineering],
 [Judy -38, 30, 2383899, Engineering],
 [Bob -34, 48, 4805822, Engineering],
 [Judy -27, None, None, Engineering],
 [Bob -23, None, 1677722, Engineering],
 [Frank -20, 39, 3276405, Engineering],
 [Isaac -18, 33, 1687415, Engineering],
 [Alice -15, 26, 1728769, Engineering],
 [Diana -14, 40, 1787738, Engineering],
 [Eve -12, 25, 1183812, Engineering],
 [Alice -

**xii.	List the employees who hold the title of Manager.**

In [21]:
department_max_salary = {}

for employee in employees:
    if employee.salary is not None:
        if employee.salary not in department_max_salary:
            department_max_salary[employee.department] = f'[Name: {employee.name}, Salary: {employee.salary}]'
        else:
            department_max_salary[employee.department] = max(department_max_salary[employee.department], employee.salary)

print(department_max_salary)

{'Engineering': '[Name: Isaac 100, Salary: 2752815]', 'Marketing': '[Name: Diana 96, Salary: 2978765]', 'Finance': '[Name: Bob 101, Salary: 1925325]', 'Sales': '[Name: Judy 93, Salary: 4288673]', '': '[Name: Charlie 99, Salary: 4547808]'}


**xiii.	Which employees are Managers in the Finance department?**