# 🐍 Python for Data Science - A Deep Dive
### **Understanding Python’s Core Concepts for AI & Machine Learning**


### 📌 Table of Contents
1. [Introduction to Python](#introduction)
2. [Python Data Structures](#data-structures)
    - [Lists](#lists)
    - [Tuples](#tuples)
    - [Dictionaries](#dictionaries)
    - [Sets](#sets)
3. [Functions & Functional Programming](#functions)
4. [NumPy for Efficient Computation](#numpy)
5. [Pandas for Data Analysis](#pandas)
6. [Object-Oriented Programming (OOP)](#oop)
7. [Final Comprehensive Project](#project)


## 📌 📂 1️⃣ Introduction to Python


### **What is Python?**
Python is a **general-purpose, high-level programming language** that is widely used in **Data Science, Machine Learning, and AI**.  
It was created by **Guido van Rossum**, first released in 1991, and emphasizes **code readability and simplicity**.

---

### **Why is Python Used in Data Science?**
Python dominates **Data Science, Machine Learning, and AI** for the following reasons:

🔹 **Simple & Readable Syntax** – Python has an intuitive, **English-like syntax**, making it easy for beginners to learn and implement complex algorithms.  
🔹 **Rich Ecosystem of Libraries** – Python provides **powerful libraries** like NumPy, Pandas, Scikit-Learn, TensorFlow, and PyTorch, which simplify data handling and machine learning implementation.  
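🔹 **Scalability via Optimized Libraries** – Pure Python is slower than compiled languages, but libraries such as NumPy, TensorFlow, and PyTorch run their heavy computation in optimized native code, which makes Python practical for **large-scale data processing and deep learning**.  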
🔹 **Strong Community Support** – Python has one of the largest **open-source communities**, making it easy to find solutions, tutorials, and pre-built models.  
🔹 **Versatility Across Domains** – Python is used in **Finance, Healthcare, Robotics, NLP, Computer Vision, and more**.

---

### 📌 **Python vs. Other Languages for Data Science**
| Feature                        | Python  | R        | Java     | C++        |
|--------------------------------|---------|----------|----------|------------|
| **Ease of Learning**           | ✅ Easy | Moderate | Difficult | Very Difficult |
| **Library Support**            | ✅ Extensive | Moderate | Limited | Limited |
| **Performance**                | Moderate (fast via native libraries) | Moderate | High | Very High |
| **Machine Learning Support**   | ✅ Excellent | Limited | Basic | Basic |

---




Python in Data Science: Code Example

In [1]:
# Python's Simplicity: Creating a Variable
name = "Anandi"
age = 25
print(f"My name is {name} and I am {age} years old.")

# List Comprehensions: Efficient Iteration
numbers = [1, 2, 3, 4, 5]
squared = [x ** 2 for x in numbers]
print(squared)  


My name is Anandi and I am 25 years old.
[1, 4, 9, 16, 25]


# 📂 2️⃣ Python Data Structures


Data structures define **how information is stored, accessed, and manipulated**.  
Choosing the right data structure **improves efficiency**, making programs run faster and consume less memory.

---

## **📌 Why Are Data Structures Important in Data Science?**
✔ **Efficient Data Processing** – Handling large datasets efficiently is crucial for AI models.  
✔ **Faster Lookups & Computations** – Reducing processing time enhances machine learning model performance.  
✔ **Memory Optimization** – Using the right structure prevents memory wastage, improving scalability.  
✔ **Data Integrity & Consistency** – Ensures structured data representation for ML models.  

---

## 📌 Common Data Structures in Python

✔ **Lists** – Ordered, Mutable (used for storing multiple values).  
✔ **Tuples** – Ordered, Immutable (used for fixed values).  
✔ **Dictionaries** – Key-value pairs, Fast lookups (used in ML model metadata).  
✔ **Sets** – Unordered, Unique Elements (used for filtering unique categories).  

---

## **📌 Time Complexity of Data Structures**
Understanding **Big-O Notation** helps in optimizing algorithms.

| Operation | List | Tuple | Dictionary | Set |
|-----------|------|-------|------------|-----|
| **Access (by index or key)** | ✅ Fast (`O(1)`) | ✅ Fast (`O(1)`) | ✅ Fast (`O(1)` average, by key) | ❌ Not Applicable |
| **Search (membership test)** | 🔴 Slow (`O(n)`) | 🔴 Slow (`O(n)`) | ✅ Fast (`O(1)` average) | ✅ Fast (`O(1)` average) |
| **Insertion** | ✅ Fast at the end (`O(1)` amortized), `O(n)` elsewhere | ❌ Not Possible | ✅ Fast (`O(1)` average) | ✅ Fast (`O(1)` average) |
| **Deletion** | 🔴 Slow (`O(n)`), except from the end (`O(1)`) | ❌ Not Possible | ✅ Fast (`O(1)` average) | ✅ Fast (`O(1)` average) |

📌 **Key Takeaways**:
- **Dictionaries & Sets** → Fastest for lookups because they use **hash tables**.
- **Lists & Tuples** → Good for ordered collections but **slower for searches**.
- **Tuples** → Cannot be modified, making them useful for **constant values**.
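
The lookup difference in the table can be observed directly. The following is a minimal sketch, assuming a hypothetical collection of 100,000 integer IDs (`ids_list`, `ids_set`, and `target` are illustrative names). Exact timings depend on the machine, so none are shown here.

```python
import timeit

# Hypothetical data: the same 100,000 integer IDs stored two ways.
ids_list = list(range(100_000))
ids_set = set(ids_list)

target = 99_999  # worst case for the list: the value sits at the very end

# Both containers give the same answer to the membership question...
print(target in ids_list)  # True
print(target in ids_set)   # True

# ...but the list is scanned element by element (O(n)),
# while the set hashes the value and jumps to it (O(1) on average).
list_time = timeit.timeit(lambda: target in ids_list, number=100)
set_time = timeit.timeit(lambda: target in ids_set, number=100)
print(f"List lookup: {list_time:.6f} s, Set lookup: {set_time:.6f} s")
```

On typical hardware the set lookup should be faster by several orders of magnitude, and the gap grows with the size of the collection.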

---

## **📌 When to Use Each Data Structure in Data Science?**
✅ **Lists** → When storing **dataset rows**, like individual user data points.  
✅ **Tuples** → When defining **fixed values** like **hyperparameters** in ML models.  
✅ **Dictionaries** → When working with **key-value pairs**, like **mapping feature names to values**.  
✅ **Sets** → When **removing duplicate values** from a dataset.  

---



### 📌 Lists: Ordered, Mutable Data Structure

In [2]:
# Creating a List
fruits = ["apple", "banana", "cherry"]
print(fruits)

# Accessing Elements
print(fruits[0])  

# Modifying Elements
fruits.append("orange")
print(fruits)  

['apple', 'banana', 'cherry']
apple
['apple', 'banana', 'cherry', 'orange']
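
The other three structures described earlier (tuples, dictionaries, and sets) follow the same pattern. Here is a short sketch of each. The hyperparameter values, feature names, and labels are made up for illustration.

```python
# Tuple: ordered and immutable, suited to fixed values.
hyperparams = (0.01, 32)                 # hypothetical (learning_rate, batch_size)
learning_rate, batch_size = hyperparams  # unpack into separate names
try:
    hyperparams[0] = 0.1                 # tuples reject item assignment
except TypeError as err:
    print(f"Tuples cannot be modified: {err}")

# Dictionary: key-value pairs with fast lookup by key.
features = {"age": 25, "income": 50000}  # hypothetical feature values
features["city"] = "Pune"                # add a new key
print(features["age"])                   # 25
print(features.get("height", "missing")) # missing (default for an absent key)

# Set: unordered collection of unique elements.
labels = ["cat", "dog", "cat", "bird", "dog"]
unique_labels = set(labels)              # duplicates are dropped
print(sorted(unique_labels))             # ['bird', 'cat', 'dog']
```

Sets do not preserve order, which is why the result is sorted before printing.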


# 📌📂3️⃣ Functions & Functional Programming - Deep Dive

## 📌 Theoretical Explanation

### 🔹 What are Functions?
Functions in Python are **self-contained blocks of code** that perform a specific task. They improve **code organization, reusability, and modularity**.

In **Data Science & Machine Learning**, functions are used for:
✔ **Data preprocessing** (cleaning, transforming, encoding)  
✔ **Feature engineering** (creating new features, scaling)  
✔ **Model evaluation** (performance metrics)  
✔ **Automating repetitive tasks**  

---

### 🔹 Why Use Functions in Data Science?
✔ **Code Reusability** → Write once, use multiple times  
✔ **Improved Readability** → Code becomes easier to manage  
✔ **Abstraction** → Complex logic is hidden inside functions  
✔ **Less Duplication** → Logic lives in one place, so a fix or optimization applies everywhere the function is called  

---

## 📌 3.1 Defining & Calling Functions in Python

A function definition consists of:

- `def` → Keyword to define a function  
- **Function Name** → The name of the function  
- **Parameters (optional)** → Inputs required for the function  
- **A return statement (optional)** → Defines the function output  

In [3]:
# Function to calculate the square of a number
def square(num):
    return num ** 2

# Calling the function
result = square(5)
print(f"The square of 5 is: {result}")


The square of 5 is: 25


📌 **Explanation:**
- `square(num)` → Defines a function that takes `num` as an argument.  
- `return num ** 2` → Computes the square of the number.  
- `square(5)` → Calls the function with `5` as input.  

## 📌 3.2 Function Parameters & Default Arguments

Functions can have:  
✔ **Positional Arguments** → Order matters  
✔ **Keyword Arguments** → Named arguments for better clarity  
✔ **Default Arguments** → Provide default values  
✔ **Variable-Length Arguments** → `*args` and `**kwargs`  


In [4]:
# Function with default argument
def greet(name="User"):
    print(f"Hello, {name}!")

greet("Anandi")  
greet() 


Hello, Anandi!
Hello, User!


📌 **Explanation:**
- `name="User"` → Sets a **default value** for the parameter.  
- If no argument is provided, it defaults to `"User"`.  
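
Variable-length arguments from the list above are not covered by the `greet()` example. This is a minimal sketch of `*args` and `**kwargs`. The function `describe_model` and its parameter names are hypothetical, chosen only for illustration.

```python
def describe_model(name, *layers, **options):
    """Summarize a model configuration (illustrative helper, not a library API).

    name     -> regular positional argument
    *layers  -> any extra positional arguments, collected into a tuple
    **options -> any extra keyword arguments, collected into a dict
    """
    layer_text = ", ".join(str(size) for size in layers)
    option_text = ", ".join(f"{key}={value}" for key, value in options.items())
    return f"{name}: layers=[{layer_text}] options=({option_text})"

# Two extra positional arguments and two keyword arguments
print(describe_model("mlp", 64, 32, activation="relu", dropout=0.2))
# mlp: layers=[64, 32] options=(activation=relu, dropout=0.2)

# Both collections may be empty
print(describe_model("empty"))
# empty: layers=[] options=()
```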

---

## 📌 3.3 Returning Multiple Values

Python allows returning **multiple values** as a **tuple**.



In [5]:
# Function returning multiple values
def math_operations(a, b):
    sum_result = a + b
    product = a * b
    return sum_result, product

# Calling the function
sum_result, product_result = math_operations(5, 3)
print(f"Sum: {sum_result}, Product: {product_result}")


Sum: 8, Product: 15


📌 **Explanation:**
- The function **returns multiple values** as a tuple.  
- The values are **unpacked** when calling the function.  

---

## 📌 3.4 Higher-Order Functions (Functional Programming)

A **higher-order function** is a function that does at least one of the following:  
✔ **Takes another function as an argument**  
✔ **Returns a function as a result**  

📌 **Examples of Higher-Order Functions in Data Science:**  
✔ `map()` → Applies a function to each element in a sequence  
✔ `filter()` → Filters elements based on a condition  
✔ `reduce()` → Performs cumulative operations  
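
Both halves of the definition can be shown with user-defined functions before turning to the built-ins. This is a small sketch in which `make_scaler` and `apply_to_all` are hypothetical helper names.

```python
def make_scaler(factor):
    """Return a NEW function that multiplies its input by `factor`."""
    def scale(value):
        return value * factor  # `factor` is remembered from the enclosing call
    return scale

def apply_to_all(func, values):
    """Take a function as an argument and apply it to every value."""
    return [func(v) for v in values]

double = make_scaler(2)                  # a function returned as a result
print(apply_to_all(double, [1, 2, 3]))   # [2, 4, 6]
```

`make_scaler` returns a function and `apply_to_all` accepts one, so each is a higher-order function in its own right.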

---

## 📌 3.4.1 Using `map()`

### 📜 Theoretical Explanation  
The `map()` function **applies a function** to each item in an iterable (e.g., list).


In [6]:
numbers = [1, 2, 3, 4, 5]

# Using map() with a function
def square(num):
    return num ** 2

squared_numbers = list(map(square, numbers))
print(squared_numbers) 


[1, 4, 9, 16, 25]


📌 **Explanation:**  
- `map(square, numbers)` → Applies `square()` function to every element in `numbers`.  

---

## 📌 3.4.2 Using `filter()`

### 📜 Theoretical Explanation  
The `filter()` function **filters elements** from an iterable based on a condition.



In [7]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Using filter() to get even numbers
even_numbers = list(filter(lambda x: x % 2 == 0, numbers))
print(even_numbers)  

[2, 4, 6, 8, 10]


📌 **Explanation:**  
- `lambda x: x % 2 == 0` → Defines a **lambda function** to check if a number is even.  
- `filter()` **filters elements that satisfy the condition**.  

---

## 📌 3.4.3 Using `reduce()`

### 📜 Theoretical Explanation  
The `reduce()` function performs **cumulative** operations over elements.



In [8]:
from functools import reduce

numbers = [1, 2, 3, 4, 5]

# Using reduce() to find the product of numbers
product = reduce(lambda x, y: x * y, numbers)
print(product) 


120


📌 **Explanation:**  
- `reduce(lambda x, y: x * y, numbers)` → **Multiplies all numbers together**.  

---


## 📌 3.5 Anonymous Functions (Lambda)

### 📜 Theoretical Explanation  
✔ **Lambda functions** are anonymous functions limited to a single expression.  
✔ Useful in **functional programming** for short, simple operations.  

📌 **Why Use Lambda?**  
✔ Saves space for **small one-time-use functions**.  
✔ Can be used **inside map(), filter(), and reduce()**.  

In [9]:
# Lambda function to square a number
square = lambda x: x ** 2
print(square(5))


25
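
A common place for a one-off lambda is the `key` argument of `sorted()` or `max()`. The sketch below assumes a hypothetical list of `(model_name, accuracy)` pairs. The names and scores are invented for illustration.

```python
# Hypothetical (model_name, accuracy) pairs
results = [("tree", 0.81), ("svm", 0.86), ("knn", 0.78)]

# Sort by the second item of each pair, highest accuracy first
ranked = sorted(results, key=lambda pair: pair[1], reverse=True)
print(ranked)     # [('svm', 0.86), ('tree', 0.81), ('knn', 0.78)]

# Pick the name of the single best model
best_name = max(results, key=lambda pair: pair[1])[0]
print(best_name)  # svm
```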


# 📌📂4️⃣ NumPy for Efficient Computation (In-Depth Explanation & Code)

## 📌 Theoretical Explanation

### 🔹 Why NumPy?
NumPy (**Numerical Python**) is the foundation of numerical computing in Python. It provides:

✔ **High-performance array operations** – Faster than Python lists  
✔ **Efficient Memory Management** – Uses contiguous memory blocks  
✔ **Vectorized Computations** – Eliminates explicit loops for numerical operations  
✔ **Essential for Machine Learning & AI** – Used in deep learning frameworks like **TensorFlow, PyTorch**  

---

### 🔹 Python Lists vs NumPy Arrays
Python lists are **dynamic** and support heterogeneous data, but they **lack efficiency** when handling large datasets.  

NumPy arrays (**ndarrays**) provide:  

✔ **Faster computations** – Operations on large datasets are significantly faster  
✔ **Compact, homogeneous storage** – All elements share one data type, which reduces memory overhead  
✔ **Built-in mathematical functions** – Eliminates the need for explicit Python loops  

---

## 📌 Performance Comparison: Python Lists vs NumPy Arrays


In [10]:
import numpy as np
import time

# Creating large dataset
size = 1_000_000  # 1 million elements

# Python List Squaring (Slow)
python_list = list(range(size))
start = time.perf_counter()  # high-resolution clock intended for timing code
squared_list = [x**2 for x in python_list]
end = time.perf_counter()
print(f"Python List Execution Time: {end - start:.5f} seconds")

# NumPy Array Squaring (Fast)
numpy_array = np.array(python_list)
start = time.perf_counter()
squared_numpy = numpy_array**2
end = time.perf_counter()
print(f"NumPy Execution Time: {end - start:.5f} seconds")


Python List Execution Time: 0.04374 seconds
NumPy Execution Time: 0.00286 seconds


### 🔹 NumPy Memory Efficiency

NumPy stores elements in contiguous memory locations, which:

✔ Reduces overhead compared to Python lists  
✔ Allows fast retrieval and modification of elements  
✔ Optimizes vectorized operations for large datasets  

## 📌 Memory Usage Comparison: Python List vs NumPy Array

In [11]:
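import sys
import numpy as np

size = 1_000_000  # 1 million elements

# Python List Memory Consumption
# Note: sys.getsizeof() counts only the list object and its array of
# references, not the integer objects those references point to.
python_list = list(range(size))
print(f"Python List Memory: {sys.getsizeof(python_list)} bytes")

# NumPy Array Memory Consumption (nbytes is the size of the raw data buffer)
numpy_array = np.array(python_list)
print(f"NumPy Array Memory: {numpy_array.nbytes} bytes")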


Python List Memory: 8000056 bytes
NumPy Array Memory: 8000000 bytes
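
The two figures above look almost identical because `sys.getsizeof()` on a list measures only the container, not the integer objects it refers to. The following is a rough sketch of a fuller comparison. It adds the per-element object sizes, which is an approximation because CPython shares some small integers, and it fixes the array dtype to `int64` as an assumption so the result does not depend on the platform.

```python
import sys
import numpy as np

size = 1_000_000  # 1 million elements

python_list = list(range(size))
numpy_array = np.array(python_list, dtype=np.int64)  # assumed dtype: 8 bytes per element

# Size of the list object itself (header plus one reference per element)
container_bytes = sys.getsizeof(python_list)

# Approximate size of the integer objects the references point to
element_bytes = sum(sys.getsizeof(x) for x in python_list)

list_total = container_bytes + element_bytes
print(f"Python list (container only): {container_bytes} bytes")
print(f"Python list (container + elements, approx.): {list_total} bytes")
print(f"NumPy array data buffer: {numpy_array.nbytes} bytes")
print(f"Approximate ratio: {list_total / numpy_array.nbytes:.1f}x")
```

Exact byte counts vary between Python versions and platforms, so no output is shown here. The list total should come out several times larger than the array buffer.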


## 📌 Core NumPy Functionalities

### 🔹 1. Creating NumPy Arrays
NumPy arrays can be created from Python lists, tuples, or using built-in functions.

#### 📌 Basic Array Creation

In [12]:
import numpy as np

# Creating Arrays from Python Lists
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])  # 2D Array

# Creating Arrays using NumPy Functions
zeros = np.zeros((3, 3))  # 3x3 array of zeros
ones = np.ones((2, 4))  # 2x4 array of ones
random_values = np.random.rand(3, 3)  # 3x3 array of random values

print(f"1D Array: {arr1}")
print(f"2D Array:\n{arr2}")
print(f"Zeros Array:\n{zeros}")
print(f"Ones Array:\n{ones}")
print(f"Random Values:\n{random_values}")


1D Array: [1 2 3 4 5]
2D Array:
[[1 2 3]
 [4 5 6]]
Zeros Array:
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
Ones Array:
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
Random Values:
[[0.19338469 0.03162071 0.25292853]
 [0.49780889 0.77452738 0.04345748]
 [0.57514902 0.69366269 0.10075391]]



### 🔹 2. NumPy Array Indexing & Slicing

NumPy provides efficient indexing mechanisms to access, modify, and slice arrays.

#### 📌 Indexing & Slicing Examples

In [13]:
import numpy as np

# Creating a 1D NumPy array
arr = np.array([10, 20, 30, 40, 50])

# Accessing elements
print(arr[0])  
print(arr[-1])  

# Slicing the array
print(arr[1:4])  
print(arr[:3])   

# Modifying elements
arr[0] = 100
print(arr)  


10
50
[20 30 40]
[10 20 30]
[100  20  30  40  50]


#### 📌 Slicing in Multi-Dimensional Arrays

In [14]:
import numpy as np

# Creating a 2D array
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Accessing elements
print(matrix[0, 0])  
print(matrix[1, 2]) 

# Slicing rows and columns
print(matrix[0, :])  
print(matrix[:, 1]) 


1
6
[1 2 3]
[2 5 8]
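
Beyond integer indices and slices, NumPy arrays also accept boolean masks, which is how rows or values are usually filtered by a condition. This is a brief sketch that reuses the same 3x3 matrix, with the variable `mask` introduced for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A comparison produces a boolean array of the same shape
mask = matrix > 5
print(mask)

# Indexing with the mask returns the matching values as a 1D array
print(matrix[mask])             # [6 7 8 9]

# The condition can also be written inline
print(matrix[matrix % 2 == 0])  # [2 4 6 8]

# Slices can be combined across both axes: rows 1-2, columns 0-1
print(matrix[1:, :2])
```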


### 🔹 3. NumPy Mathematical Operations

NumPy provides optimized mathematical operations that work on entire arrays.

#### 📌 Element-wise Mathematical Operations

In [15]:
import numpy as np

# Creating Arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([10, 20, 30, 40, 50])

# Arithmetic Operations
print(arr1 + arr2)  
print(arr1 * 2)  
print(arr1 ** 2)  


[11 22 33 44 55]
[ 2  4  6  8 10]
[ 1  4  9 16 25]
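
Element-wise operations also work between arrays of different shapes through broadcasting. A typical use is feature standardization, sketched below on a small made-up dataset (`data` holds 3 samples with 2 features, and all values are hypothetical).

```python
import numpy as np

# Hypothetical dataset: 3 samples (rows) x 2 features (columns)
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

col_means = data.mean(axis=0)  # one mean per column, shape (2,)
col_stds = data.std(axis=0)    # one standard deviation per column, shape (2,)

# The (2,) arrays are broadcast across all 3 rows of the (3, 2) array,
# so every column is shifted and scaled without an explicit loop.
standardized = (data - col_means) / col_stds

print(standardized)
print(standardized.mean(axis=0))  # approximately 0 for each column
print(standardized.std(axis=0))   # approximately 1 for each column
```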


#### 📌 Matrix Operations

In [16]:
import numpy as np

# Creating Matrices
matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])

# Matrix Multiplication
dot_product = np.dot(matrix1, matrix2)
print(f"Matrix Multiplication:\n{dot_product}")


Matrix Multiplication:
[[19 22]
 [43 50]]


### 🔹 4. NumPy Statistics & Aggregations

NumPy provides built-in functions for statistical analysis.

In [17]:
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print(f"Mean: {np.mean(arr)}")  # Average value
print(f"Median: {np.median(arr)}")  # Middle value
print(f"Standard Deviation: {np.std(arr)}")  # Spread of values
print(f"Sum: {np.sum(arr)}")  # Sum of all elements
print(f"Min: {np.min(arr)}, Max: {np.max(arr)}")  # Minimum and Maximum


Mean: 30.0
Median: 30.0
Standard Deviation: 14.142135623730951
Sum: 150
Min: 10, Max: 50
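
On multi-dimensional arrays the same aggregation functions accept an `axis` argument. This is a minimal sketch, assuming a hypothetical table of exam scores with one row per student and one column per exam.

```python
import numpy as np

# Hypothetical scores: 2 students (rows) x 3 exams (columns)
scores = np.array([[80, 90, 70],
                   [60, 75, 85]])

print(scores.sum())         # 460 (all elements)
print(scores.mean(axis=0))  # mean per exam (collapses the rows)
print(scores.mean(axis=1))  # mean per student (collapses the columns)
print(scores.max(axis=1))   # best score of each student
```

`axis=0` aggregates down the rows and yields one value per column, while `axis=1` aggregates across the columns and yields one value per row.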



## 📌📂5️⃣ Pandas for Data Analysis
### 🔍 What is Pandas?
Pandas is **a widely used library for data manipulation and analysis** in Python. It provides:
✔ **DataFrames**: A table-like data structure similar to SQL tables or Excel spreadsheets.  
✔ **Efficient Data Handling**: Filtering, sorting, aggregating, and reshaping large datasets.  
✔ **Seamless Integration**: Works well with NumPy, Matplotlib, and Scikit-Learn.  

### 🔹 Why is Pandas Important in Data Science?
✔ **Data Cleaning & Preprocessing**: Handling missing values, duplicates, and outliers.  
✔ **Feature Engineering**: Transforming raw data into features that models can learn from.  
✔ **Data Aggregation & Grouping**: Summarizing and analyzing large datasets efficiently.  
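
The cleaning and grouping tasks listed above can be sketched on a tiny in-memory table. The DataFrame `df_demo` and its values are invented for illustration, and filling missing values with the median is only one of several reasonable strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing salary and one duplicated row
df_demo = pd.DataFrame({
    "department": ["IT", "IT", "HR", "HR", "HR"],
    "salary": [70000.0, np.nan, 50000.0, 60000.0, 60000.0],
})

# Cleaning step 1: fill the missing salary with the column median
median_salary = df_demo["salary"].median()
cleaned = df_demo.fillna({"salary": median_salary})

# Cleaning step 2: drop rows that are exact duplicates of an earlier row
deduped = cleaned.drop_duplicates()

# Aggregation: average salary per department
mean_by_dept = deduped.groupby("department")["salary"].mean()
print(mean_by_dept)
```

With these values the median is 60000, one duplicated HR row is removed, and the department means come out to 55000 for HR and 65000 for IT.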

### 📌 Common Operations in Pandas
#### 📂 1️⃣ Loading and Exploring Data



In [18]:
import pandas as pd

# Load a sample dataset
df = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv")

# Display first five rows
print(df.head())

# Basic dataset info
print(df.info())

# Summary statistics
print(df.describe())


   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   total_bill  244 non-null    float64
 1   tip         244 non-null    float64
 2   sex         244 non-null    object 
 3   smoker      244 non-null    object 
 4   day         244 non-null    object 
 5   time        244 non-null    object 
 6   size        244 non-null    int64  
dtypes: float64(2), int64(1), object(4)
memory usage: 13.5+ KB
None
       total_bill         tip        size
count  244.000000  244.000000  244.000000
mean    19.785943    2.998279    2.569672
std      

## **📌 6️⃣ Object-Oriented Programming (OOP)**


### **🔍 What is OOP?**
Object-Oriented Programming (OOP) **organizes code into reusable objects**. It is useful in:
✔ **Machine Learning Pipelines**: Organizing models into reusable classes.  
✔ **Scalability**: Helps manage large-scale AI projects.  
✔ **Encapsulation**: Bundles related data and functions together.

### **🔹 Key Principles of OOP**
✔ **Encapsulation** → Bundling data and methods into classes.  
✔ **Inheritance** → Creating new classes from existing ones.  
✔ **Polymorphism** → Different classes can implement a method with the same name, so the same call works on objects of different types.  

---

### **📌 Basic OOP Implementation**
📜 **1️⃣ Defining a Class & Creating Objects**

In [20]:
# Define a Class
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary
    
    def show_details(self):
        print(f"Employee: {self.name}, Salary: ${self.salary}")

# Create an Object
emp1 = Employee("Alice", 70000)
emp1.show_details()


Employee: Alice, Salary: $70000
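
The class above demonstrates encapsulation. Inheritance and polymorphism can be sketched by adding a subclass. The `Manager` class, the `team_size` attribute, and the `details()` method are hypothetical additions for illustration, and `Employee` is redefined in reduced form so the example runs on its own.

```python
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def details(self):
        return f"Employee: {self.name}, Salary: ${self.salary}"


class Manager(Employee):  # Inheritance: Manager reuses Employee's attributes
    def __init__(self, name, salary, team_size):
        super().__init__(name, salary)  # let the parent set name and salary
        self.team_size = team_size

    def details(self):  # Polymorphism: same method name, different behavior
        return f"Manager: {self.name}, Salary: ${self.salary}, Team size: {self.team_size}"


staff = [Employee("Alice", 70000), Manager("Bob", 90000, 5)]
for person in staff:
    # The same call works on both types; each object picks its own version
    print(person.details())
```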


# 🔍 Final Comprehensive Project - Employee Salary Analysis
## 📌 Overview
This project integrates **Python, Pandas, NumPy, and Object-Oriented Programming** into a real-world data processing pipeline.

## 🔥 Key Concepts Covered:
✔ **Data Handling** - Using Pandas to load and clean datasets.  
✔ **Optimized Computation** - NumPy for efficient salary calculations.  
✔ **Scalable Code Structure** - OOP to model employee details.  

## 🗂 Dataset Description
- `employees.csv` → Contains employee IDs, names, departments, ages, and salaries. In this notebook the same data is created in memory as a DataFrame.



## 📂 Creating the Employee Dataset

In [21]:
import pandas as pd

# Create the dataset
data = {
    "Employee_ID": [101, 102, 103, 104, 105],
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Department": ["Engineering", "Finance", "Marketing", "HR", "IT"],
    "Age": [28, 32, 29, 35, 30],
    "Salary": [70000, 65000, 72000, 68000, 73000]
}

# Convert to Pandas DataFrame
df = pd.DataFrame(data)

# Display the dataset
df


   Employee_ID     Name   Department  Age  Salary
0          101    Alice  Engineering   28   70000
1          102      Bob      Finance   32   65000
2          103  Charlie    Marketing   29   72000
3          104    David           HR   35   68000
4          105      Eve           IT   30   73000


In [22]:
# Check for missing values
print(df.isnull().sum())

# Basic dataset information
print(df.info())

# Summary statistics
print(df.describe())


Employee_ID    0
Name           0
Department     0
Age            0
Salary         0
dtype: int64
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 5 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Employee_ID  5 non-null      int64 
 1   Name         5 non-null      object
 2   Department   5 non-null      object
 3   Age          5 non-null      int64 
 4   Salary       5 non-null      int64 
dtypes: int64(3), object(2)
memory usage: 332.0+ bytes
None
       Employee_ID        Age        Salary
count     5.000000   5.000000      5.000000
mean    103.000000  30.800000  69600.000000
std       1.581139   2.774887   3209.361307
min     101.000000  28.000000  65000.000000
25%     102.000000  29.000000  68000.000000
50%     103.000000  30.000000  70000.000000
75%     104.000000  32.000000  72000.000000
max     105.000000  35.000000  73000.000000


In [23]:
# Sort employees by salary in descending order
df_sorted = df.sort_values(by="Salary", ascending=False)
print(df_sorted)

# Filter employees earning above 70,000
high_salary = df[df["Salary"] > 70000]
print(high_salary)


   Employee_ID     Name   Department  Age  Salary
4          105      Eve           IT   30   73000
2          103  Charlie    Marketing   29   72000
0          101    Alice  Engineering   28   70000
3          104    David           HR   35   68000
1          102      Bob      Finance   32   65000
   Employee_ID     Name Department  Age  Salary
2          103  Charlie  Marketing   29   72000
4          105      Eve         IT   30   73000


## 📌 Using NumPy for Efficient Salary Calculations

In [24]:
import numpy as np

# Convert salaries to a NumPy array (to_numpy() is the recommended accessor)
salaries = df["Salary"].to_numpy()

# Compute mean salary
mean_salary = np.mean(salaries)
print(f"Average Salary: ${mean_salary:.2f}")

# Compute highest & lowest salary
max_salary = np.max(salaries)
min_salary = np.min(salaries)

print(f"Highest Salary: ${max_salary}")
print(f"Lowest Salary: ${min_salary}")

# Apply a 10% salary increase using NumPy broadcasting
df["Updated_Salary"] = salaries * 1.1
df


Average Salary: $69600.00
Highest Salary: $73000
Lowest Salary: $65000


   Employee_ID     Name   Department  Age  Salary  Updated_Salary
0          101    Alice  Engineering   28   70000         77000.0
1          102      Bob      Finance   32   65000         71500.0
2          103  Charlie    Marketing   29   72000         79200.0
3          104    David           HR   35   68000         74800.0
4          105      Eve           IT   30   73000         80300.0


In [25]:
# Define an Employee class
class Employee:
    def __init__(self, emp_id, name, department, age, salary):
        self.emp_id = emp_id
        self.name = name
        self.department = department
        self.age = age
        self.salary = salary
    
    def display_info(self):
        print(f"Employee {self.name} (ID: {self.emp_id}) works in {self.department} earning ${self.salary}.")
    
    def apply_bonus(self, percentage):
        self.salary *= (1 + percentage / 100)
        print(f"New Salary for {self.name}: ${self.salary:.2f}")


In [26]:
# Create employee instances
emp1 = Employee(101, "Alice", "Engineering", 28, 70000)
emp2 = Employee(102, "Bob", "Finance", 32, 65000)

# Display employee info
emp1.display_info()
emp2.display_info()

# Apply a 10% bonus to Alice and a 5% bonus to Bob
emp1.apply_bonus(10)
emp2.apply_bonus(5)


Employee Alice (ID: 101) works in Engineering earning $70000.
Employee Bob (ID: 102) works in Finance earning $65000.
New Salary for Alice: $77000.00
New Salary for Bob: $68250.00
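
The DataFrame and the `Employee` class can also be connected, so that objects are built from the table rather than typed by hand. This is one possible sketch rather than the only way to do it. It rebuilds the same dataset, uses a trimmed copy of the class so that it runs on its own, and introduces a hypothetical `Bonus_Salary` column.

```python
import pandas as pd


class Employee:
    """Trimmed copy of the project's Employee class (printing removed)."""
    def __init__(self, emp_id, name, department, age, salary):
        self.emp_id = emp_id
        self.name = name
        self.department = department
        self.age = age
        self.salary = salary

    def apply_bonus(self, percentage):
        self.salary *= (1 + percentage / 100)


# Same data as the project dataset
df = pd.DataFrame({
    "Employee_ID": [101, 102, 103, 104, 105],
    "Name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "Department": ["Engineering", "Finance", "Marketing", "HR", "IT"],
    "Age": [28, 32, 29, 35, 30],
    "Salary": [70000, 65000, 72000, 68000, 73000],
})

# Build one Employee object per row; itertuples() exposes columns as attributes
employees = [
    Employee(row.Employee_ID, row.Name, row.Department, row.Age, row.Salary)
    for row in df.itertuples(index=False)
]

# Apply a 10% bonus through the class, then write the results back
for emp in employees:
    emp.apply_bonus(10)

df["Bonus_Salary"] = [emp.salary for emp in employees]  # hypothetical column name
print(df[["Name", "Salary", "Bonus_Salary"]])
```

For a simple uniform raise, the vectorized `salaries * 1.1` shown earlier is shorter and faster. The object-based route is more useful when each employee needs custom logic.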
