# Week 3: NumPy and Pandas for Data Manipulation 
- Theory: Study NumPy arrays, operations, broadcasting, and Pandas 
DataFrames, Series, indexing, and data grouping. 
- Hands-On: Perform operations with NumPy and manipulate datasets with 
Pandas. 
- Client Project: Clean and aggregate a dataset (e.g., remove missing values, 
calculate averages). 
- Submit: Python script and a summary of the concepts learned (on Google 
Classroom). 


## NumPy

In [1]:
import numpy as np

# Create array
arr = np.array([1, 2, 3, 4, 5])
print(arr)

[1 2 3 4 5]


In [3]:
# Array of zeros
zeros = np.zeros((2,3))
zeros

array([[0., 0., 0.],
       [0., 0., 0.]])

In [5]:
# Array of ones

ones = np.ones((3,3))
ones

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [7]:
# Range of numbers

rng = np.arange(0, 20, 2)
rng

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [9]:
# Reshape array

reshaped = np.arange(12).reshape(3,4)
reshaped

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [11]:
# Random numbers

rand_nums = np.random.randint(1, 100, [3,4])
rand_nums

array([[25, 91, 26, 26],
       [90, 23, 65, 59],
       [27, 30, 28, 70]])
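
The random draw above changes on every run. As a minimal sketch (the seed value 42 is an arbitrary choice, not something from the original notebook), a seeded `Generator` makes the result repeatable:

```python
import numpy as np

# Two generators built from the same (arbitrary) seed produce the same numbers.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

# integers() excludes the upper bound by default, like np.random.randint.
sample_a = rng_a.integers(1, 100, size=(3, 4))
sample_b = rng_b.integers(1, 100, size=(3, 4))

print(sample_a)
print(np.array_equal(sample_a, sample_b))  # True: same seed, same draw
```

Fixing the seed is mainly useful when results need to be compared between runs or shared with a reviewer.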

In [13]:
# Sum

array = np.array([11,22,33,44,55,66])
print(f" The sum of the elements in array is: {array.sum()}")

 The sum of the elements in array is: 231


In [15]:
# Mean

print(f" The mean is {array.mean()}")

 The mean is 38.5


In [17]:
# Standard deviation

print(array.std())

18.786076404259266
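
The value above is the population standard deviation, because NumPy divides by N by default (`ddof=0`). The sketch below, which reuses the same six numbers, shows how the sample version (`ddof=1`) differs and that pandas uses `ddof=1` by default:

```python
import numpy as np
import pandas as pd

values = np.array([11, 22, 33, 44, 55, 66])

population_std = values.std()          # ddof=0: divide by N
sample_std = values.std(ddof=1)        # ddof=1: divide by N - 1
pandas_std = pd.Series(values).std()   # pandas defaults to ddof=1

print(population_std)  # matches the NumPy result above
print(sample_std)
print(pandas_std)      # same as sample_std
```

This matters when comparing `array.std()` with `DataFrame.describe()`, since the two can disagree slightly on the same data.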


In [19]:
# Element-wise powers

arr = np.array([2, 3, 4, 5])
print("Squared:", np.power(arr, 2))
print("Cube:", np.power(arr, 3))


Squared: [ 4  9 16 25]
Cube: [  8  27  64 125]


In [21]:
dim = np.array([[1,2,3],[1,2,3]])

print(dim)

[[1 2 3]
 [1 2 3]]


In [53]:
print(f"\n shape is {dim.shape}")


 shape is (2, 3)


In [55]:
print(f"\nThe array is in {dim.ndim}D")


The array is in 2D


In [93]:
# Broadcasting: the scalar 2 is applied to every element of the matrix

matrix = np.array([[1,2],[3,4]])
print(matrix * 2)

[[2 4]
 [6 8]]
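
Multiplying by a scalar is the simplest case of broadcasting. As an illustrative sketch with made-up values, the same rule also lets NumPy combine a 2-D array with a 1-D row or with a column vector:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])      # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
col = np.array([[100], [200]])      # shape (2, 1)

# The row is stretched down the rows; the column is stretched across the columns.
plus_row = matrix + row
plus_col = matrix + col

print(plus_row)
print(plus_col)
```

Shapes are compared from the last axis backwards; two axes are compatible when they are equal or when one of them is 1.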


In [488]:
# Unique elements

arr = np.array([1, 2, 2, 3, 4, 4, 5])
unique_elements = np.unique(arr)
print("Unique Values:", unique_elements)


Unique Values: [1 2 3 4 5]


In [490]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Vertical Stack:\n", np.vstack((a, b)))
print("Horizontal Stack:\n", np.hstack((a, b)))


Vertical Stack:
 [[1 2 3]
 [4 5 6]]
Horizontal Stack:
 [1 2 3 4 5 6]


In [480]:
# transpose
matrix = np.array([[1, 2], [3, 4], [5, 6]])

print("Original:\n", matrix)
print("Transpose:\n", matrix.T)



Original:
 [[1 2]
 [3 4]
 [5 6]]
Transpose:
 [[1 3 5]
 [2 4 6]]


## Pandas: Data wrangling with the NBA dataset

In [250]:
import pandas as pd

In [332]:
nba = pd.read_csv("nba.csv")
nba.head(3)

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
0,Avery Bradley,Boston Celtics,0.0,PG,25.0,6-2,180.0,Texas,7730337.0
1,Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0
2,John Holland,Boston Celtics,30.0,SG,27.0,6-5,205.0,Boston University,


In [400]:
nba.tail(3)

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
455,Tibor Pleiss,Utah Jazz,21.0,C,26.0,7-3,256.0,,2900000.0
456,Jeff Withey,Utah Jazz,24.0,C,26.0,7-0,231.0,Kansas,947276.0
457,,,,,,,,,


In [334]:
nba.shape

(458, 9)

In [342]:
nba.dtypes.value_counts()

object     5
float64    4
Name: count, dtype: int64

In [344]:
nba.describe()

Unnamed: 0,Number,Age,Weight,Salary
count,457.0,457.0,457.0,446.0
mean,17.678337,26.938731,221.522976,4842684.0
std,15.96609,4.404016,26.368343,5229238.0
min,0.0,19.0,161.0,30888.0
25%,5.0,24.0,200.0,1044792.0
50%,13.0,26.0,220.0,2839073.0
75%,25.0,30.0,240.0,6500000.0
max,99.0,40.0,307.0,25000000.0


In [406]:
nba[['Name', 'Team', 'Number']]


Unnamed: 0,Name,Team,Number
0,Avery Bradley,Boston Celtics,0.0
1,Jae Crowder,Boston Celtics,99.0
2,John Holland,Boston Celtics,30.0
3,R.J. Hunter,Boston Celtics,28.0
4,Jonas Jerebko,Boston Celtics,8.0
...,...,...,...
453,Shelvin Mack,Utah Jazz,8.0
454,Raul Neto,Utah Jazz,25.0
455,Tibor Pleiss,Utah Jazz,21.0
456,Jeff Withey,Utah Jazz,24.0


In [408]:
nba.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 458 entries, 0 to 457
Data columns (total 9 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   Name      457 non-null    object 
 1   Team      457 non-null    object 
 2   Number    457 non-null    float64
 3   Position  457 non-null    object 
 4   Age       457 non-null    float64
 5   Height    457 non-null    object 
 6   Weight    457 non-null    float64
 7   College   373 non-null    object 
 8   Salary    446 non-null    float64
dtypes: float64(4), object(5)
memory usage: 32.3+ KB


In [410]:
nba['Position'].value_counts()

Position
SG    102
PF    100
PG     92
SF     85
C      78
Name: count, dtype: int64

In [412]:
nba.columns

Index(['Name', 'Team', 'Number', 'Position', 'Age', 'Height', 'Weight',
       'College', 'Salary'],
      dtype='object')

In [416]:
nba["Team"].unique()

array(['Boston Celtics', 'Brooklyn Nets', 'New York Knicks',
       'Philadelphia 76ers', 'Toronto Raptors', 'Golden State Warriors',
       'Los Angeles Clippers', 'Los Angeles Lakers', 'Phoenix Suns',
       'Sacramento Kings', 'Chicago Bulls', 'Cleveland Cavaliers',
       'Detroit Pistons', 'Indiana Pacers', 'Milwaukee Bucks',
       'Dallas Mavericks', 'Houston Rockets', 'Memphis Grizzlies',
       'New Orleans Pelicans', 'San Antonio Spurs', 'Atlanta Hawks',
       'Charlotte Hornets', 'Miami Heat', 'Orlando Magic',
       'Washington Wizards', 'Denver Nuggets', 'Minnesota Timberwolves',
       'Oklahoma City Thunder', 'Portland Trail Blazers', 'Utah Jazz',
       nan], dtype=object)

In [404]:
nba.isnull().sum()

Name         1
Team         1
Number       1
Position     1
Age          1
Height       1
Weight       1
College     85
Salary      12
dtype: int64

In [418]:
nba.dropna(axis = 0,how='all', inplace = True)
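
The `how='all'` argument only removes rows in which every column is missing, which is why it drops the empty last row but keeps players with no College or Salary. A small sketch on a toy frame (the values are invented for illustration) contrasts it with the default `how='any'`:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Name": ["Ann", "Bob", np.nan],
    "College": ["Texas", np.nan, np.nan],
    "Salary": [100.0, 200.0, np.nan],
})

only_empty_rows_removed = toy.dropna(how="all")  # drops the fully empty row only
any_missing_removed = toy.dropna()               # default how="any": drops any row with a NaN

print(len(only_empty_rows_removed))
print(len(any_missing_removed))
```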

In [426]:
nba['Salary'] = nba['Salary'].fillna(nba['Salary'].mean())


In [428]:
nba.sort_values('Salary', ascending = False)

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
109,Kobe Bryant,Los Angeles Lakers,24.0,SF,37.0,6-6,212.0,,25000000.0
169,LeBron James,Cleveland Cavaliers,23.0,SF,31.0,6-8,250.0,,22970500.0
33,Carmelo Anthony,New York Knicks,7.0,SF,32.0,6-8,240.0,Syracuse,22875000.0
251,Dwight Howard,Houston Rockets,12.0,C,30.0,6-11,265.0,,22359364.0
339,Chris Bosh,Miami Heat,1.0,PF,32.0,6-11,235.0,Georgia Tech,22192730.0
...,...,...,...,...,...,...,...,...,...
175,Jordan McRae,Cleveland Cavaliers,12.0,SG,25.0,6-5,179.0,Tennessee,111196.0
135,Alan Williams,Phoenix Suns,15.0,C,23.0,6-8,260.0,UC Santa Barbara,83397.0
130,Phil Pressey,Phoenix Suns,25.0,PG,25.0,5-11,175.0,Missouri,55722.0
291,Orlando Johnson,New Orleans Pelicans,0.0,SG,27.0,6-5,220.0,UC Santa Barbara,55722.0


In [436]:
# Drop the remaining rows that still contain NaN values (at this point only the College column has any)
nba.dropna(inplace= True)

In [454]:
nba.head(10)

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
0,Avery Bradley,Boston Celtics,0.0,PG,25.0,6-2,180.0,Texas,7730337.0
1,Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0
2,John Holland,Boston Celtics,30.0,SG,27.0,6-5,205.0,Boston University,4842684.0
3,R.J. Hunter,Boston Celtics,28.0,SG,22.0,6-5,185.0,Georgia State,1148640.0
6,Jordan Mickey,Boston Celtics,55.0,PF,21.0,6-8,235.0,LSU,1170960.0
7,Kelly Olynyk,Boston Celtics,41.0,C,25.0,7-0,238.0,Gonzaga,2165160.0
8,Terry Rozier,Boston Celtics,12.0,PG,22.0,6-2,190.0,Louisville,1824360.0
9,Marcus Smart,Boston Celtics,36.0,PG,22.0,6-4,220.0,Oklahoma State,3431040.0
10,Jared Sullinger,Boston Celtics,7.0,C,24.0,6-9,260.0,Ohio State,2569260.0
11,Isaiah Thomas,Boston Celtics,4.0,PG,27.0,5-9,185.0,Washington,6912869.0


In [458]:
nba.reset_index(drop= True, inplace= True)

In [460]:
nba.isnull().sum()

Name        0
Team        0
Number      0
Position    0
Age         0
Height      0
Weight      0
College     0
Salary      0
dtype: int64

In [470]:
nba.loc[1:5]

Unnamed: 0,Name,Team,Number,Position,Age,Height,Weight,College,Salary
1,Jae Crowder,Boston Celtics,99.0,SF,25.0,6-6,235.0,Marquette,6796117.0
2,John Holland,Boston Celtics,30.0,SG,27.0,6-5,205.0,Boston University,4842684.0
3,R.J. Hunter,Boston Celtics,28.0,SG,22.0,6-5,185.0,Georgia State,1148640.0
4,Jordan Mickey,Boston Celtics,55.0,PF,21.0,6-8,235.0,LSU,1170960.0
5,Kelly Olynyk,Boston Celtics,41.0,C,25.0,7-0,238.0,Gonzaga,2165160.0


In [474]:
nba.iloc[2,7]

'Boston University'
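
`loc` selects by index label and includes the end label, while `iloc` selects by integer position and excludes the end position. The sketch below uses a toy frame (not the NBA data) to show the difference, including what happens when rows have been filtered out and the index has not been reset:

```python
import pandas as pd

toy = pd.DataFrame({"Name": list("abcdef"),
                    "Score": [10, 20, 30, 40, 50, 60]})

by_label = toy.loc[1:3]        # labels 1, 2 and 3 (end included)
by_position = toy.iloc[1:3]    # positions 1 and 2 (end excluded)

# After filtering, labels no longer match positions.
filtered = toy[toy["Score"] > 20]
first_by_position = filtered.iloc[0]["Name"]   # first remaining row
same_row_by_label = filtered.loc[2, "Name"]    # its original label is 2

print(len(by_label), len(by_position))
print(first_by_position, same_row_by_label)
```

Calling `reset_index(drop=True)` after cleaning, as done above, brings labels and positions back in line.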

## Client Project: Clean and aggregate an employee dataset

In [252]:
emp = pd.read_csv("employees.csv")
emp.head()

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
0,Douglas,Male,08-06-1993,12:42 PM,97308,6.945,True,Marketing
1,Thomas,Male,3/31/1996,6:53 AM,61933,4.17,True,
2,Maria,Female,4/23/1993,11:17 AM,130590,11.858,False,Finance
3,Jerry,Male,03-04-2005,1:00 PM,138705,9.34,True,Finance
4,Larry,Male,1/24/1998,4:47 PM,101004,1.389,True,Client Services


In [253]:
emp.dtypes.value_counts()

object     6
int64      1
float64    1
Name: count, dtype: int64

In [254]:
emp.dtypes

First Name            object
Gender                object
Start Date            object
Last Login Time       object
Salary                 int64
Bonus %              float64
Senior Management     object
Team                  object
dtype: object

In [255]:
emp.describe()

Unnamed: 0,Salary,Bonus %
count,1000.0,1000.0
mean,90662.181,10.207555
std,32923.693342,5.528481
min,35013.0,1.015
25%,62613.0,5.40175
50%,90428.0,9.8385
75%,118740.25,14.838
max,149908.0,19.944


In [256]:
emp.describe(include = "object")

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Senior Management,Team
count,933,855,1000,1000,933,957
unique,200,2,972,720,2,10
top,Marilyn,Female,10/30/1994,1:35 PM,True,Client Services
freq,11,431,2,5,468,106


In [257]:
emp.shape

(1000, 8)

In [264]:
emp.isnull().sum()

First Name            67
Gender               145
Start Date             0
Last Login Time        0
Salary                 0
Bonus %                0
Senior Management     67
Team                  43
dtype: int64

Four columns (First Name, Gender, Senior Management, and Team) contain NaN values, so those rows are removed before aggregating.

In [275]:
emp.dropna(inplace=True)

In [281]:
emp.reset_index(drop=True, inplace=True)
emp.head(3)

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
0,Douglas,Male,08-06-1993,12:42 PM,97308,6.945,True,Marketing
1,Maria,Female,4/23/1993,11:17 AM,130590,11.858,False,Finance
2,Jerry,Male,03-04-2005,1:00 PM,138705,9.34,True,Finance


In [273]:
emp.isnull().sum()

First Name           0
Gender               0
Start Date           0
Last Login Time      0
Salary               0
Bonus %              0
Senior Management    0
Team                 0
dtype: int64

In [220]:
emp.shape

(764, 8)

In [283]:
emp.head(2)

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
0,Douglas,Male,08-06-1993,12:42 PM,97308,6.945,True,Marketing
1,Maria,Female,4/23/1993,11:17 AM,130590,11.858,False,Finance


In [295]:
emp.loc[1:5]  # label-based: the end label 5 is included


Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
1,Maria,Female,4/23/1993,11:17 AM,130590,11.858,False,Finance
2,Jerry,Male,03-04-2005,1:00 PM,138705,9.34,True,Finance
3,Larry,Male,1/24/1998,4:47 PM,101004,1.389,True,Client Services
4,Dennis,Male,4/18/1987,1:35 AM,115163,10.125,False,Legal
5,Ruby,Female,8/17/1987,4:20 PM,65476,10.012,True,Product


In [301]:
emp.iloc[1, 5]  # position-based: second row, sixth column (Bonus %)


11.858

In [313]:
result = emp["Salary"].agg(["mean", "max","min"])

print(result)

mean     90433.196335
max     149908.000000
min      35013.000000
Name: Salary, dtype: float64


In [329]:
result = emp["Bonus %"].agg(["mean", "max", "min"])
result

mean    10.148041
max     19.944000
min      1.015000
Name: Bonus %, dtype: float64

In [327]:
result = emp.agg({
    "Salary": ["min", "max", "mean"],
    "Bonus %": ["mean", "sum"]
})
result

Unnamed: 0,Salary,Bonus %
min,35013.0,
max,149908.0,
mean,90433.196335,10.148041
sum,,7753.103


In [325]:
result = emp.groupby("Gender").agg({
    "Salary": "mean",
    "Bonus %": "max"
})
result

Gender,Salary,Bonus %
Female,89736.834606,19.85
Male,91170.851752,19.944


In [323]:
# For each team, calculate the mean and max salary, and the mean and total bonus %

result = emp.groupby("Team").agg({
    "Salary": ["mean", "max"],
    "Bonus %": ["mean", "sum"]
})
result


Team,Salary mean,Salary max,Bonus % mean,Bonus % sum
Business Development,90520.397727,147417,10.611989,933.855
Client Services,89336.658824,147183,10.341941,879.065
Distribution,85849.1,149105,9.221533,553.292
Engineering,94369.405063,147362,10.153494,802.126
Finance,94519.075,149908,9.847038,787.763
Human Resources,91145.171053,149903,10.242921,778.462
Legal,88066.402985,148985,10.661881,714.346
Marketing,90764.081081,146812,10.446135,773.014
Product,86935.963855,149684,9.714759,806.325
Sales,91724.819444,144887,10.067431,724.855
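
Passing a dictionary of lists to `.agg()` produces two-level column names. One possible alternative, sketched here on a toy frame with invented numbers, is named aggregation, which gives flat and self-describing column names (the names `mean_salary`, `max_salary`, and `total_bonus` are illustrative choices):

```python
import pandas as pd

toy = pd.DataFrame({
    "Team": ["Finance", "Finance", "Legal", "Legal"],
    "Salary": [100, 200, 150, 250],
    "Bonus %": [1.0, 3.0, 2.0, 4.0],
})

# Each keyword is the output column name; the tuple is (source column, function).
summary = toy.groupby("Team").agg(
    mean_salary=("Salary", "mean"),
    max_salary=("Salary", "max"),
    total_bonus=("Bonus %", "sum"),
)

print(summary)
```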


In [376]:
emp["Team"].unique()

array(['Marketing', 'Finance', 'Client Services', 'Legal', 'Product',
       'Engineering', 'Business Development', 'Human Resources', 'Sales',
       'Distribution'], dtype=object)

In [370]:
# Female employees working in the Finance team (first five rows)
mask1 = emp['Gender'] == 'Female'
mask2 = emp['Team'] == 'Finance'

emp[mask1 & mask2].head()

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
1,Maria,Female,4/23/1993,11:17 AM,130590,11.858,False,Finance
11,Kimberly,Female,1/14/1999,7:13 AM,41426,14.543,True,Finance
49,Rachel,Female,8/16/1999,6:53 AM,51178,9.735,True,Finance
63,Doris,Female,8/20/2004,5:51 AM,83072,7.511,False,Finance
71,Cynthia,Female,3/21/1994,8:34 AM,142321,1.737,False,Finance


In [378]:
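# Rows where the employee is in senior management OR the start date compares below '1980-01-01'.
# Note: 'Start Date' is still a string column here, so `<` compares text, not dates.
# For a true date filter, parse the column to datetime first, and use & to require both conditions.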

mask1 = emp['Senior Management']
mask2 = emp['Start Date'] < '1980-01-01'

emp[mask1 | mask2].head()

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
0,Douglas,Male,08-06-1993,12:42 PM,97308,6.945,True,Marketing
2,Jerry,Male,03-04-2005,1:00 PM,138705,9.34,True,Finance
3,Larry,Male,1/24/1998,4:47 PM,101004,1.389,True,Client Services
5,Ruby,Female,8/17/1987,4:20 PM,65476,10.012,True,Product
6,Angela,Female,11/22/2005,6:29 AM,95570,18.523,True,Engineering
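
Because `Start Date` is stored as text, comparing it with `'1980-01-01'` orders the values character by character. The sketch below uses a few invented rows to show how the text comparison and a real date comparison can disagree. It assumes every date is written month-day-year with either `-` or `/` as the separator, which should be checked against the actual file:

```python
import pandas as pd

# Hypothetical rows; the date strings mix "-" and "/" like the source file.
toy = pd.DataFrame({
    "First Name": ["Ann", "Bob", "Cid", "Dee"],
    "Start Date": ["08-06-1993", "3/31/1976", "01-15-1979", "4/23/1993"],
    "Senior Management": [True, True, False, False],
})

# Assumption: month-day-year everywhere, so unify the separator and parse.
parsed = pd.to_datetime(
    toy["Start Date"].str.replace("-", "/", regex=False),
    format="%m/%d/%Y",
)

text_mask = toy["Start Date"] < "1980-01-01"       # compares characters
date_mask = parsed < pd.Timestamp("1980-01-01")    # compares dates

text_hits = toy.loc[text_mask, "First Name"].tolist()
date_hits = toy.loc[date_mask, "First Name"].tolist()

# Senior managers who started before 1980: both conditions must hold, so use &.
senior_before_1980 = toy[toy["Senior Management"] & date_mask]

print(text_hits)
print(date_hits)
print(senior_before_1980["First Name"].tolist())
```

Here the text comparison selects any date string that starts with `0`, regardless of the year, while the parsed comparison selects the rows that really fall before 1980.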


In [384]:
# Female employees in Finance, plus everyone in Marketing
# (to keep only female employees in both teams, use mask1 & (mask2 | fas))

mask1 = emp['Gender'] == 'Female'
mask2 = emp['Team'] == 'Finance'
fas = emp['Team'] == 'Marketing'

emp[(mask1 & mask2) | fas]

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
0,Douglas,Male,08-06-1993,12:42 PM,97308,6.945,True,Marketing
1,Maria,Female,4/23/1993,11:17 AM,130590,11.858,False,Finance
11,Kimberly,Female,1/14/1999,7:13 AM,41426,14.543,True,Finance
17,Matthew,Male,09-05-1995,2:12 AM,100612,13.645,False,Marketing
19,Craig,Male,2/27/2000,7:45 AM,37598,7.757,True,Marketing
...,...,...,...,...,...,...,...,...
715,Lori,Female,11/20/2015,1:15 PM,75498,6.537,True,Marketing
752,Donna,Female,11/26/1982,7:04 AM,82871,17.999,False,Marketing
753,Gloria,Female,12-08-2014,5:08 AM,136709,10.331,True,Finance
756,Rose,Female,8/25/2002,5:12 AM,134505,11.051,True,Marketing
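
The placement of the parentheses decides which rows come back. A short sketch on invented rows shows the two groupings side by side (the variable names are illustrative):

```python
import pandas as pd

toy = pd.DataFrame({
    "Gender": ["Female", "Male", "Female", "Male", "Female"],
    "Team": ["Finance", "Finance", "Marketing", "Marketing", "Legal"],
})

is_female = toy["Gender"] == "Female"
in_finance = toy["Team"] == "Finance"
in_marketing = toy["Team"] == "Marketing"

# Females in Finance, plus everyone in Marketing (male or female).
finance_women_or_marketing = toy[(is_female & in_finance) | in_marketing]

# Females only, in either Finance or Marketing.
women_in_both_teams = toy[is_female & (in_finance | in_marketing)]

print(len(finance_women_or_marketing))
print(len(women_in_both_teams))
```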


In [386]:
# Employees in the Legal, Sales, or Product teams

mask1 = emp['Team'] == 'Legal'
mask2 = emp['Team'] == 'Sales'
mask3 = emp['Team'] == 'Product'

emp[mask1 | mask2 | mask3]

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
4,Dennis,Male,4/18/1987,1:35 AM,115163,10.125,False,Legal
5,Ruby,Female,8/17/1987,4:20 PM,65476,10.012,True,Product
8,Julie,Female,10/26/1997,3:19 PM,102508,12.637,True,Legal
10,Gary,Male,1/27/2008,11:40 PM,109831,5.831,False,Sales
12,Lillian,Female,06-05-2016,6:09 AM,59414,1.256,False,Product
...,...,...,...,...,...,...,...,...
744,Sarah,Female,12-04-1995,9:16 AM,124566,5.949,False,Product
746,Ernest,Male,7/20/2013,6:41 AM,142935,13.198,True,Product
748,James,Male,1/15/1993,5:19 PM,148985,19.280,False,Legal
761,Russell,Male,5/20/2013,12:39 PM,96914,1.421,False,Product
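
When several values of the same column are accepted, `Series.isin` is a compact alternative to chaining masks with `|`. A minimal sketch on a toy frame (rows invented for illustration):

```python
import pandas as pd

toy = pd.DataFrame({
    "First Name": ["Dee", "Raj", "Gus", "Mia"],
    "Team": ["Legal", "Product", "Sales", "Finance"],
})

wanted_teams = ["Legal", "Sales", "Product"]
selected = toy[toy["Team"].isin(wanted_teams)]

print(selected)
```

The list of teams can then be changed in one place without adding or removing masks.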


In [354]:
mask = emp['Team'].isnull()
emp[mask].head()

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team


In [396]:
# salary above 50000

emp[emp['Salary'] > 50000]

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
0,Douglas,Male,08-06-1993,12:42 PM,97308,6.945,True,Marketing
1,Maria,Female,4/23/1993,11:17 AM,130590,11.858,False,Finance
2,Jerry,Male,03-04-2005,1:00 PM,138705,9.340,True,Finance
3,Larry,Male,1/24/1998,4:47 PM,101004,1.389,True,Client Services
4,Dennis,Male,4/18/1987,1:35 AM,115163,10.125,False,Legal
...,...,...,...,...,...,...,...,...
758,Tina,Female,5/15/1997,3:53 PM,56450,19.040,True,Engineering
759,George,Male,6/21/2013,5:47 PM,98874,4.479,True,Marketing
761,Russell,Male,5/20/2013,12:39 PM,96914,1.421,False,Product
762,Larry,Male,4/20/2013,4:45 PM,60500,11.985,False,Business Development


In [394]:
# Employees with a bonus between 2% and 5% (both ends included), highest first
emp[emp['Bonus %'].between(2.0, 5.0)].sort_values('Bonus %', ascending = False)

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
59,Bonnie,Female,11/13/1988,3:30 PM,115814,4.990,False,Product
266,Ronald,Male,2/24/2009,2:09 PM,96633,4.990,True,Engineering
156,Willie,Male,06-06-2006,9:45 AM,55281,4.935,True,Marketing
636,Lillian,Female,8/26/2002,8:53 AM,103854,4.924,True,Distribution
410,Clarence,Male,8/26/1982,9:47 AM,146589,4.905,True,Business Development
...,...,...,...,...,...,...,...,...
611,Mary,Female,11-06-2011,8:32 AM,115057,2.089,False,Finance
460,Louis,Male,4/15/2011,5:02 AM,95198,2.075,False,Business Development
571,Lisa,Female,04-11-2007,1:04 AM,128042,2.030,True,Legal
373,Howard,Male,04-09-2012,6:36 AM,37984,2.021,False,Distribution
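
By default `between` includes both endpoints, so it behaves like `>=` combined with `<=`. The sketch below checks this on a few invented bonus values:

```python
import pandas as pd

bonus = pd.Series([1.9, 2.0, 3.5, 5.0, 5.1])

with_between = bonus.between(2.0, 5.0)            # both ends included by default
with_operators = (bonus >= 2.0) & (bonus <= 5.0)  # equivalent explicit form

print(with_between.tolist())
print(with_between.equals(with_operators))
```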


In [398]:
# Salary between 60000 and 70000

emp[emp['Salary'].between(60000, 70000)]

Unnamed: 0,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
5,Ruby,Female,8/17/1987,4:20 PM,65476,10.012,True,Product
35,Kathy,Female,6/22/2005,4:51 AM,66820,9.000,True,Client Services
42,Henry,Male,6/26/1996,1:44 AM,64715,15.107,True,Human Resources
44,Irene,Female,05-07-1997,9:32 AM,66851,11.279,False,Engineering
47,Steve,Male,11-11-2009,11:44 PM,61310,12.428,True,Distribution
...,...,...,...,...,...,...,...,...
733,Catherine,Female,9/25/1989,1:31 AM,68164,18.393,False,Client Services
738,Alice,Female,09-03-1988,8:54 PM,63571,15.397,True,Product
741,Harry,Male,8/30/2011,6:31 PM,67656,16.455,True,Client Services
745,Sean,Male,1/17/1983,2:23 PM,66146,11.178,False,Human Resources


## Week 3: NumPy and Pandas
Concepts Learned:

- NumPy Arrays: Fast, memory-efficient arrays for numerical data
- Array Operations: Element-wise addition, subtraction, multiplication, division
- Broadcasting: Applying operations between arrays of different shapes
- Indexing & Slicing: Extracting and modifying parts of arrays
- Pandas Series & DataFrames: Handling tabular and labeled data
- Indexing & Filtering: Selecting rows/columns based on conditions
- Aggregation: Using .groupby() and .agg() for summaries
- Missing Values: Handling NaN with .dropna() and .fillna()

Hands-On:

- Performed NumPy operations such as array creation (zeros, ones, arange), reshape, random integers, sum, mean, std, element-wise powers, broadcasting, unique, stacking, and transpose
- Practiced Pandas dataset manipulation: filtering rows, selecting columns, applying functions, grouping, and aggregating data

Client Project:
- Cleaned the dataset by dropping rows with missing values and resetting the index
- Aggregated data to calculate averages of Salary and Bonus % by Team
- Exported the cleaned dataset for reporting
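
The export step is not shown in the cells above. One way it could look is sketched below, with a small stand-in frame and a hypothetical file name (`employees_clean.csv`). The frame is written to an in-memory buffer here so the example runs without touching the disk:

```python
import io
import pandas as pd

# Stand-in for the cleaned employee frame (values are illustrative).
cleaned = pd.DataFrame({
    "First Name": ["Ann", "Bob"],
    "Salary": [50000, 60000],
})

# index=False keeps the row numbers out of the file, which avoids an
# extra "Unnamed: 0" column when the CSV is read back.
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# On disk this would be, for example:
# cleaned.to_csv("employees_clean.csv", index=False)  # hypothetical file name

reloaded = pd.read_csv(io.StringIO(csv_text))
print(csv_text)
```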