<a href="https://colab.research.google.com/github/cpython-projects/da_2603/blob/main/lesson_13_hw.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Employee Performance Analysis

#### Input data:

A CSV file `employees.csv` with the following structure:

```
id,name,department,salary,projects_completed,hours_worked
1,John Smith,Engineering,72000,10,1600
2,Anna Johnson,Marketing,68000,8,1500
3,Robert Brown,Engineering,80000,12,1700
4,Linda Davis,HR,60000,5,1400
5,James Wilson,Marketing,70000,9,1550
...
```

**Task 1.** Create an `Employee` class that describes an employee.  
Efficiency can be computed as:  
`efficiency = projects completed / hours worked`

In [2]:
class Employee:
    def __init__(self, name: str, salary: float | int, projects_completed: int, hours_worked: int | float):
        self.name = name.strip().title()
        self.salary = salary
        self.projects_completed = projects_completed
        self.hours_worked = hours_worked

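    def efficiency(self) -> float:
        # Projects completed per hour worked; 0.0 when no hours are logged,
        # which avoids a ZeroDivisionError.
        if self.hours_worked == 0:
            return 0.0
        return self.projects_completed / self.hours_worked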

    def __str__(self):
        return f"{self.name}"
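To sanity-check the formula, the sketch below applies the same rule as `Employee.efficiency` to the first three sample rows listed under the input data. The standalone `efficiency` function and the `sample_rows` list are illustrative names, not part of the notebook.

```python
def efficiency(projects_completed: int, hours_worked: float) -> float:
    """Projects per hour; 0.0 when no hours were logged (mirrors Employee.efficiency)."""
    if hours_worked == 0:
        return 0.0
    return projects_completed / hours_worked

# (name, projects_completed, hours_worked) taken from the sample rows of employees.csv
sample_rows = [
    ("John Smith", 10, 1600),
    ("Anna Johnson", 8, 1500),
    ("Robert Brown", 12, 1700),
]

results = {name: efficiency(p, h) for name, p, h in sample_rows}
for name, value in results.items():
    # Multiply by 1000 to read the value as "projects per 1000 hours".
    print(f"{name}: {value:.5f} projects/hour ({value * 1000:.2f} per 1000 h)")
```

The raw values are small (thousandths of a project per hour), so rescaling to "per 1000 hours" can make them easier to compare by eye.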

**Task 2.** Create a `Department` class that aggregates the employees of a single department.  
`Department` should be able to:
*   add employees
*   compute the mean and median efficiency (`summary_statistics`)
*   identify the best and worst employee by efficiency

*Compare and consider which mean (arithmetic, geometric, or harmonic) best reflects the employees' real efficiency.*

*Use the `statistics` module for the calculations:*  
`from statistics import mean, median, geometric_mean, harmonic_mean`
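One way to approach the comparison: efficiency is a ratio (projects per hour), and the three means treat ratios differently. The sketch below uses two made-up employees (hypothetical numbers, not taken from `employees.csv`) to show the ordering arithmetic ≥ geometric ≥ harmonic. It also compares the means with the pooled ratio, total projects / total hours. The pooled ratio is not one of the required statistics and is included only as a reference point.

```python
from statistics import mean, geometric_mean, harmonic_mean

# Hypothetical employees: (projects_completed, hours_worked)
staff = [(2, 1000), (18, 2000)]

rates = [p / h for p, h in staff]  # individual efficiencies, projects/hour

am = mean(rates)
gm = geometric_mean(rates)
hm = harmonic_mean(rates)

# Pooled efficiency: everything the group produced over everything it worked.
total_projects = sum(p for p, _ in staff)
total_hours = sum(h for _, h in staff)
pooled = total_projects / total_hours

# Weighting each individual rate by hours worked reproduces the pooled ratio.
weighted_am = sum((p / h) * h for p, h in staff) / total_hours

print(f"arithmetic={am:.6f} geometric={gm:.6f} harmonic={hm:.6f} pooled={pooled:.6f}")
```

The unweighted arithmetic mean matches the pooled ratio when everyone worked the same number of hours, and the unweighted harmonic mean matches it when everyone completed the same number of projects. When neither holds, as in this example, none of the unweighted means equals the pooled figure, which is worth keeping in mind when interpreting department-level numbers.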

In [13]:
import math
from statistics import mean, median, geometric_mean, harmonic_mean

class Department:
    def __init__(self, name):
        self.name = name
        self.employees = []

    def add_employee(self, employee: Employee):
        if employee not in self.employees:
            self.employees.append(employee)

    def best_employee(self):
        # Return every employee tied for the highest efficiency.
        top = max(emp.efficiency() for emp in self.employees)
        return [emp for emp in self.employees if math.isclose(emp.efficiency(), top)]

    def worst_employee(self):
        # Return every employee tied for the lowest efficiency.
        low = min(emp.efficiency() for emp in self.employees)
        return [emp for emp in self.employees if math.isclose(emp.efficiency(), low)]

    def summary_statistics(self):
        # Efficiency of every employee in the department, in insertion order.
        efficiency_list = [emp.efficiency() for emp in self.employees]
        return {
            "mean": mean(efficiency_list),
            "median": median(efficiency_list),
            "geometric_mean": geometric_mean(efficiency_list),
            "harmonic_mean": harmonic_mean(efficiency_list)
        }

    def __str__(self):
        return f"{self.name}"
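A note on `add_employee`: the `not in` check relies on `==`, and a class without `__eq__` compares by identity, so two `Employee` objects built from the same CSV row count as different employees. If duplicates should be detected by value, one option is a dataclass, which generates `__eq__` from its fields. The sketch below is an assumption about how that could look; `PlainEmployee`, `EmployeeRecord`, and the `emp_id` field are hypothetical names (the notebook's `Employee` does not store the `id` column).

```python
from dataclasses import dataclass

class PlainEmployee:
    """No __eq__ defined: instances compare by identity."""
    def __init__(self, name: str):
        self.name = name

@dataclass(frozen=True)
class EmployeeRecord:
    """Hypothetical value-based variant; equality compares all fields."""
    emp_id: int
    name: str
    projects_completed: int
    hours_worked: float

roster = [PlainEmployee("John Smith")]
# A second object with the same data is NOT found: identity comparison.
plain_duplicate_found = PlainEmployee("John Smith") in roster

records = [EmployeeRecord(1, "John Smith", 10, 1600)]
# A second record with the same field values IS found: field comparison.
record_duplicate_found = EmployeeRecord(1, "John Smith", 10, 1600) in records

print(plain_duplicate_found, record_duplicate_found)
```

Including an identifier such as `emp_id` matters here because names alone may repeat across rows.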

**Task 3.** Read the data from the file and create `Employee` objects.

In [4]:
from google.colab import files
uploaded = files.upload()

Saving employees (1).csv to employees (1).csv


In [15]:
import csv
departments = {}

with open("employees (1).csv", newline="") as file:
    reader = csv.DictReader(file)
    for row in reader:
        name = row["name"]
        department = row["department"]
        salary = float(row["salary"])
        projects_completed = int(row["projects_completed"])
        hours_worked = float(row["hours_worked"])

        emp = Employee(name, salary, projects_completed, hours_worked)

        if department not in departments:
            departments[department] = Department(department)
        departments[department].add_employee(emp)

# Print the analytics for each department
for dept_name, dept in departments.items():
    print(dept_name)
    print(dept.summary_statistics())




HR
{'mean': 0.006586440402709784, 'median': 0.00633934535738143, 'geometric_mean': 0.006298863994331656, 'harmonic_mean': 0.006018877388171993}
Marketing
{'mean': 0.005853444208853715, 'median': 0.0062367124708322536, 'geometric_mean': 0.005098746946214836, 'harmonic_mean': 0.004372687616529013}
Engineering
{'mean': 0.005545436313161862, 'median': 0.005962977296181631, 'geometric_mean': 0.005176467854601394, 'harmonic_mean': 0.004819914169031123}
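The loading step can also be tried without uploading a file by feeding `csv.DictReader` an in-memory string. The sketch below uses the five sample rows listed under the input data, which need not match the uploaded file, and groups efficiencies by department with a plain dictionary instead of the `Department` class. `SAMPLE_CSV` and `by_department` are illustrative names.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

SAMPLE_CSV = """id,name,department,salary,projects_completed,hours_worked
1,John Smith,Engineering,72000,10,1600
2,Anna Johnson,Marketing,68000,8,1500
3,Robert Brown,Engineering,80000,12,1700
4,Linda Davis,HR,60000,5,1400
5,James Wilson,Marketing,70000,9,1550
"""

by_department = defaultdict(list)  # department name -> list of efficiencies

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
for row in reader:
    hours = float(row["hours_worked"])
    projects = int(row["projects_completed"])
    # Same zero-hours rule as Employee.efficiency.
    by_department[row["department"]].append(projects / hours if hours else 0.0)

for department, values in by_department.items():
    print(f"{department}: n={len(values)}, mean efficiency={mean(values):.6f}")
```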


**Insights (1-3 sentences):**

In [17]:
import pandas as pd

In [18]:
df = pd.read_csv("employees (1).csv", delimiter=",", encoding="utf-8")
df.head()

| | id | name | department | salary | projects_completed | hours_worked |
|---|---|---|---|---|---|---|
| 0 | 1 | James Wilson | HR | 80765 | 13 | 1950 |
| 1 | 2 | Robert Brown | Marketing | 57577 | 6 | 1752 |
| 2 | 3 | Emily Clark | Engineering | 55679 | 11 | 1938 |
| 3 | 4 | James Wilson | Engineering | 63244 | 5 | 1596 |
| 4 | 5 | Robert Brown | HR | 81951 | 15 | 1460 |




In [19]:
df['efficiency'] = df['projects_completed'] / df['hours_worked']
df.head()

| | id | name | department | salary | projects_completed | hours_worked | efficiency |
|---|---|---|---|---|---|---|---|
| 0 | 1 | James Wilson | HR | 80765 | 13 | 1950 | 0.006667 |
| 1 | 2 | Robert Brown | Marketing | 57577 | 6 | 1752 | 0.003425 |
| 2 | 3 | Emily Clark | Engineering | 55679 | 11 | 1938 | 0.005676 |
| 3 | 4 | James Wilson | Engineering | 63244 | 5 | 1596 | 0.003133 |
| 4 | 5 | Robert Brown | HR | 81951 | 15 | 1460 | 0.010274 |


In [21]:
summary_statistics = df.groupby('department')['efficiency'].agg(['mean', 'median', 'min', 'max'])
summary_statistics

| department | mean | median | min | max |
|---|---|---|---|---|
| Engineering | 0.005545 | 0.005963 | 0.002933 | 0.009848 |
| HR | 0.006586 | 0.006339 | 0.004071 | 0.010274 |
| Marketing | 0.005853 | 0.006237 | 0.002317 | 0.009893 |
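The `groupby` summary does not include the geometric and harmonic means reported by `Department.summary_statistics`. A possible way to add them is to pass the `statistics` functions through small wrappers in a named aggregation. The sketch below runs on the five rows shown by `df.head()` rather than the full file, and the wrapper names `geo_mean` and `harm_mean` are illustrative.

```python
import pandas as pd
from statistics import geometric_mean, harmonic_mean

# The five rows displayed by df.head(); the full employees file has more.
df = pd.DataFrame({
    "name": ["James Wilson", "Robert Brown", "Emily Clark", "James Wilson", "Robert Brown"],
    "department": ["HR", "Marketing", "Engineering", "Engineering", "HR"],
    "projects_completed": [13, 6, 11, 5, 15],
    "hours_worked": [1950, 1752, 1938, 1596, 1460],
})
df["efficiency"] = df["projects_completed"] / df["hours_worked"]

def geo_mean(series: pd.Series) -> float:
    # Convert to plain Python floats before handing off to the statistics module.
    return geometric_mean(series.tolist())

def harm_mean(series: pd.Series) -> float:
    return harmonic_mean(series.tolist())

# Named aggregation: each keyword becomes a column in the result.
summary = df.groupby("department")["efficiency"].agg(
    mean="mean",
    median="median",
    geometric_mean=geo_mean,
    harmonic_mean=harm_mean,
)
print(summary)
```

With only five rows the figures will differ from the full-file table; the point is the shape of the aggregation, not the numbers.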
