# **Python (NumPy and Pandas)**

**Objectives**

**The objective of this assignment is to familiarize students with NumPy, a powerful Python library for numerical computing. Students will practice creating and manipulating arrays, performing vectorized operations, using broadcasting, and applying mathematical functions efficiently. Another objective of this assignment is to develop students’ understanding of data analysis using Pandas. Students will practice creating DataFrames, reading data from CSV, filtering, grouping, sorting, and performing basic statistics using Pandas operations.**

**Task#01: Array Creation**

In [23]:
import numpy as np
import pandas as pd
# Create the following arrays using NumPy:
#  o An array from a Python list [1, 2, 3, 4, 5]
print(np.array([1, 2, 3, 4, 5]))
#  o An array of 10 zeros
print(np.zeros(10))
#  o An array of 10 ones
print(np.ones(10))
#  o Anarray of numbers from 0 to 9
print(np.array(range(10)))
# or print(np.arange(10))


[1 2 3 4 5]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[0 1 2 3 4 5 6 7 8 9]


**Task#02: Array Properties**

In [24]:
# Create a 2D array [[1, 2, 3], [4, 5, 6]]
arr = np.array([[1, 2, 3], [4, 5, 6]])
#  Print:
#  o Shape
print("Shape = ",arr.shape)
#  o Dimensions
print("Dimensions = ",arr.ndim)
#  o Datatype
print("Data Type = ", arr.dtype)

Shape =  (2, 3)
Dimensions =  2
Data Type =  int64


**Task#03: Indexing & Slicing**

In [25]:
# Create an array from 10 to 20
arr = np.array(range(10,21))
#  Print:
#  o First 5 elements
print(arr[:5])
#  o Last 3 elements
print(arr[-3:])
#  o Reverse the array
print(arr[::-1])

[10 11 12 13 14]
[18 19 20]
[20 19 18 17 16 15 14 13 12 11 10]


**Task#04: Reshape Array**

In [26]:
# Create a 1D array with values from 1 to 12
arr_1d = np.array(range(1,13))
#  Reshape it into a 3x4 matrix
print(arr_1d.reshape(3,4))

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


**Task#05: Basic Arithmetic Operations**

In [27]:
#  Create two arrays:
#  a = np.array([1, 2, 3])
#  b = np.array([4, 5, 6])
#  Perform:
#  o Addition, Subtraction, Multiplication, Division
#  o Dot product
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Addition: ",a+b)
print("Subtraction: ",a-b)
print("Multiplication: ",a*b)
print("Division: ",a/b)

# Dot Product
#Method 1
print("Dot Product: ", np.dot(a,b))
#Method 2
print("Dot Product: ", a@b)
#Method 3
print("Dot Product: ", sum(a*b))

Addition:  [5 7 9]
Subtraction:  [-3 -3 -3]
Multiplication:  [ 4 10 18]
Division:  [0.25 0.4  0.5 ]
Dot Product:  32
Dot Product:  32
Dot Product:  32
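
The objectives mention broadcasting, which the element-wise operations above use only in its simplest form, since `a` and `b` share one shape. A minimal sketch of broadcasting between different shapes could look like the following; the arrays `matrix`, `row`, and `col` are hypothetical examples and not part of the assignment.

```python
import numpy as np

# Hypothetical example arrays (not from the assignment)
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])      # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
col = np.array([[100], [200]])      # shape (2, 1)

# A scalar is stretched across every element
print(matrix * 2)      # [[ 2  4  6] [ 8 10 12]]

# The shape (3,) array is added to each of the two rows
print(matrix + row)    # [[11 22 33] [14 25 36]]

# The shape (2, 1) array is added to each of the three columns
print(matrix + col)    # [[101 102 103] [204 205 206]]
```

NumPy compares shapes from the trailing dimension backwards; two dimensions are compatible when they are equal or when one of them is 1.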


**Task#06: Statistical Functions**

In [28]:
# Create an array of 15 random integers between 10 and 100
ran_num = np.random.randint(10, 101, 15)  # the upper bound is exclusive, so 101 allows 100
#  Find:
#  o Minimum and Maximum
print("Minimum: ", ran_num.min())
print("Maximum: ", ran_num.max())
#  o Mean and Standard Deviation
print("Mean: ", ran_num.mean())
print("Standard Deviation: ", ran_num.std())
#  o Sum and Product
print("Sum: ", ran_num.sum())
# Note: the product of 15 values this large can exceed the int64 range and wrap around silently
print("Product: ", ran_num.prod())


Minimum:  10
Maximum:  95
Mean:  54.13333333333333
Standard Deviation:  26.147955093191424
Sum:  812
Product:  5803892416555614208
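
The product printed above is probably not the true product: 15 integers of this size usually multiply to far more than the int64 maximum (about 9.22e18), and NumPy integer arithmetic wraps around without raising an error. A minimal sketch of two ways around this is shown below, assuming Python 3.8+ for `math.prod`; the seed and the name `values` are arbitrary choices for illustration.

```python
import math
import numpy as np

# Arbitrary seed so the sketch is repeatable; not part of the assignment
rng = np.random.default_rng(seed=0)
values = rng.integers(10, 101, size=15)   # integers from 10 to 100 inclusive

# Exact result: convert to Python ints, which have arbitrary precision
exact_product = math.prod(values.tolist())

# Approximate result: accumulate in float64 instead of int64
approx_product = values.prod(dtype=np.float64)

print("Exact product: ", exact_product)
print("Float product: ", approx_product)
```

The exact version is preferable when every digit matters; the float version stays inside NumPy and is usually enough for a magnitude check.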


**Task#07: Array Filtering (Boolean Indexing)**

In [29]:
#  Create an array of numbers from 1 to 20
arr1 = np.arange(1,21)
print(arr1)
#  Extract:
#  o All even numbers
even_arr = arr1[arr1%2==0]
print(even_arr)
#  o Numbers greater than 10
filtered_arr = arr1[arr1>10]
print(filtered_arr)

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
[ 2  4  6  8 10 12 14 16 18 20]
[11 12 13 14 15 16 17 18 19 20]
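
The two conditions above can also be combined into a single mask. The sketch below assumes the same range of 1 to 20; the names `numbers`, `even_and_large`, and `even_or_large` are illustrative only. Each condition needs its own parentheses because `&` and `|` bind more tightly than comparisons.

```python
import numpy as np

numbers = np.arange(1, 21)

# Element-wise AND: even numbers that are also greater than 10
even_and_large = numbers[(numbers % 2 == 0) & (numbers > 10)]
print(even_and_large)   # [12 14 16 18 20]

# Element-wise OR: numbers that are even, greater than 10, or both
even_or_large = numbers[(numbers % 2 == 0) | (numbers > 10)]
print(even_or_large)
```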


**Task#08: Create Identity and Diagonal Matrices**

In [30]:
#  Create a 4x4 identity matrix
identity = np.identity(4)
print(identity)
#  Create a diagonal matrix with values [10, 20, 30, 40]
elements = [10,20,30,40]
diagonal = np.diag(elements)
print(diagonal)

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
[[10  0  0  0]
 [ 0 20  0  0]
 [ 0  0 30  0]
 [ 0  0  0 40]]


**Task#09: List Filtering with Loops**

In [31]:
# Given a list of numbers, use a loop to:
#  o Create a new list containing only even numbers
numbers = [12, 7, 9, 24, 18, 5, 42, 33, 10]
even_numbers = []
for n in numbers:
    # Keep the value only when it is divisible by 2
    if n % 2 == 0:
        even_numbers.append(n)

print("Original list:", numbers)
print("Even numbers:", even_numbers)

Original list: [12, 7, 9, 24, 18, 5, 42, 33, 10]
Even numbers: [12, 24, 18, 42, 10]


**Task#10: Load Dataset**

In [32]:
# Load the CSV file using pd.read_csv()
#  Display the first 5 rows
#  Print column names and data types

import pandas as pd
df = pd.read_csv(r"D:\Semester V\Data Science\pandas_practice_dataset.csv")  # raw string keeps the backslashes literal
print(df.head())
print("-"*50)
print(df.dtypes)

    ID    Name  Age Department  Salary  Experience_Years
0  101     Ali   22         IT   50000               1.0
1  102    Sara   24         HR   55000               2.0
2  103    Zain   23    Finance   52000               1.5
3  104  Fatima   21         IT   51000               1.0
4  105    Omar   25         HR   58000               3.0
--------------------------------------------------
ID                    int64
Name                 object
Age                   int64
Department           object
Salary                int64
Experience_Years    float64
dtype: object
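
The CSV path above exists only on the author's machine. For readers who want to run the remaining tasks without the file, a small stand-in can be built in memory. This is a sketch that copies the five rows printed by `df.head()`; the name `sample` is hypothetical, and the real file is assumed to contain the same values.

```python
import pandas as pd

# Rows copied from the df.head() output above; the actual CSV is assumed to match
sample = pd.DataFrame({
    "ID": [101, 102, 103, 104, 105],
    "Name": ["Ali", "Sara", "Zain", "Fatima", "Omar"],
    "Age": [22, 24, 23, 21, 25],
    "Department": ["IT", "HR", "Finance", "IT", "HR"],
    "Salary": [50000, 55000, 52000, 51000, 58000],
    "Experience_Years": [1.0, 2.0, 1.5, 1.0, 3.0],
})

print(sample.head())
print(sample.dtypes)
```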




**Task#11: Basic Data Exploration**

In [33]:
# Show the number of rows and columns
print(df.shape)
#  Get basic statistics using .describe()
print(df.describe())
#  Print the number of unique departments
print(df["Department"].nunique())
#  List the unique departments themselves
print(df["Department"].unique())

(5, 6)
               ID        Age        Salary  Experience_Years
count    5.000000   5.000000      5.000000           5.00000
mean   103.000000  23.000000  53200.000000           1.70000
std      1.581139   1.581139   3271.085447           0.83666
min    101.000000  21.000000  50000.000000           1.00000
25%    102.000000  22.000000  51000.000000           1.00000
50%    103.000000  23.000000  52000.000000           1.50000
75%    104.000000  24.000000  55000.000000           2.00000
max    105.000000  25.000000  58000.000000           3.00000
3
['IT' 'HR' 'Finance']


**Task#12: Selecting Data**

In [34]:
# Print all employee names and their salaries
print(df[["Name","Salary"]])
print("-"*30)
#  Show all rows where Department is "IT"
print(df[df["Department"]=="IT"])

     Name  Salary
0     Ali   50000
1    Sara   55000
2    Zain   52000
3  Fatima   51000
4    Omar   58000
------------------------------
    ID    Name  Age Department  Salary  Experience_Years
0  101     Ali   22         IT   50000               1.0
3  104  Fatima   21         IT   51000               1.0


**Task#13: Filtering and Conditions**

In [35]:
# Show all employees with salary > 52000
#Method 1
print(df[df["Salary"]>52000][["Name","Salary"]])
print("-"*30)
#Method 2
print(df.query("Salary>52000")[["Name","Salary"]])

#  Show employees with experience >= 2 years
print("="*30)
print(df[df["Experience_Years"]>=2][["Name","Experience_Years"]])

print("-"*30)
print(df.query("Experience_Years>=2")[["Name","Experience_Years"]])


   Name  Salary
1  Sara   55000
4  Omar   58000
------------------------------
   Name  Salary
1  Sara   55000
4  Omar   58000
==============================
   Name  Experience_Years
1  Sara               2.0
4  Omar               3.0
------------------------------
   Name  Experience_Years
1  Sara               2.0
4  Omar               3.0


**Task#14: Sorting and Indexing**

In [36]:
# Sort the data by salary in descending order
print(df.sort_values(by="Salary", ascending=False))
#  Set the “ID” column as the index
print("-"*60)
print(df.set_index("ID"))

    ID    Name  Age Department  Salary  Experience_Years
4  105    Omar   25         HR   58000               3.0
1  102    Sara   24         HR   55000               2.0
2  103    Zain   23    Finance   52000               1.5
3  104  Fatima   21         IT   51000               1.0
0  101     Ali   22         IT   50000               1.0
------------------------------------------------------------
       Name  Age Department  Salary  Experience_Years
ID                                                   
101     Ali   22         IT   50000               1.0
102    Sara   24         HR   55000               2.0
103    Zain   23    Finance   52000               1.5
104  Fatima   21         IT   51000               1.0
105    Omar   25         HR   58000               3.0


**Task#15: Adding/Modifying Columns**

In [37]:
# Create a new column Bonus = 10% of Salary
df["Bonus"] = 0.1*df["Salary"]
print(df.head())
#  Update the Age of “Zain” to 24
df.loc[df["Name"]=="Zain", "Age"] = 24
print(df)

    ID    Name  Age Department  Salary  Experience_Years   Bonus
0  101     Ali   22         IT   50000               1.0  5000.0
1  102    Sara   24         HR   55000               2.0  5500.0
2  103    Zain   23    Finance   52000               1.5  5200.0
3  104  Fatima   21         IT   51000               1.0  5100.0
4  105    Omar   25         HR   58000               3.0  5800.0
    ID    Name  Age Department  Salary  Experience_Years   Bonus
0  101     Ali   22         IT   50000               1.0  5000.0
1  102    Sara   24         HR   55000               2.0  5500.0
2  103    Zain   24    Finance   52000               1.5  5200.0
3  104  Fatima   21         IT   51000               1.0  5100.0
4  105    Omar   25         HR   58000               3.0  5800.0


**Task#16: Grouping and Aggregation**

In [38]:
# Group employees by Department and calculate:
#  o Average Salary
print(df.groupby("Department")["Salary"].mean())
#  o Total Experience
print(df.groupby("Department")["Experience_Years"].sum())


Department
Finance    52000.0
HR         56500.0
IT         50500.0
Name: Salary, dtype: float64
Department
Finance    1.5
HR         5.0
IT         2.0
Name: Experience_Years, dtype: float64
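
Both aggregations can also be computed in one pass with named aggregation. The sketch below rebuilds only the three columns it needs from the values shown earlier; the frame name `employees` and the output column names `Average_Salary` and `Total_Experience` are illustrative choices.

```python
import pandas as pd

# Values copied from the dataset shown earlier; only the needed columns are kept
employees = pd.DataFrame({
    "Department": ["IT", "HR", "Finance", "IT", "HR"],
    "Salary": [50000, 55000, 52000, 51000, 58000],
    "Experience_Years": [1.0, 2.0, 1.5, 1.0, 3.0],
})

# Named aggregation: output_column=(input_column, function)
summary = employees.groupby("Department").agg(
    Average_Salary=("Salary", "mean"),
    Total_Experience=("Experience_Years", "sum"),
)
print(summary)
```

The result is a single DataFrame indexed by department, which is easier to sort or export than two separate Series.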


**Task#17: Advanced Filtering**

In [39]:
# Show employees who are either in “HR” or have a salary greater than 55000
print(df[(df["Department"]=="HR") | (df["Salary"]>55000)][["Name", "Department", "Salary"]])

   Name Department  Salary
1  Sara         HR   55000
4  Omar         HR   58000


**Task#18: Apply and Lambda**

In [40]:
# Create a new column Salary_Category:
#  o "High" if salary > 53000
#  o "Low" otherwise
df["Category"] = df["Salary"].apply(lambda x: "High" if x>53000 else "Low")
print(df)

    ID    Name  Age Department  Salary  Experience_Years   Bonus Category
0  101     Ali   22         IT   50000               1.0  5000.0      Low
1  102    Sara   24         HR   55000               2.0  5500.0     High
2  103    Zain   24    Finance   52000               1.5  5200.0      Low
3  104  Fatima   21         IT   51000               1.0  5100.0      Low
4  105    Omar   25         HR   58000               3.0  5800.0     High
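
For a simple two-way split like this one, `np.where` is a vectorized alternative to `apply` with a lambda. The sketch below uses the salaries shown above; the frame name `salaries` is hypothetical.

```python
import numpy as np
import pandas as pd

# Names and salaries copied from the output above
salaries = pd.DataFrame({
    "Name": ["Ali", "Sara", "Zain", "Fatima", "Omar"],
    "Salary": [50000, 55000, 52000, 51000, 58000],
})

# np.where(condition, value_if_true, value_if_false) evaluates the whole column at once
salaries["Category"] = np.where(salaries["Salary"] > 53000, "High", "Low")
print(salaries)
```

On five rows the difference is invisible, but on large frames the vectorized form avoids calling a Python function once per row.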


**Task#19: Dropping and Renaming**

In [41]:
# Drop the Experience_Years column
print(df.drop(columns="Experience_Years"))

#  Rename column Salary to Monthly_Salary
print(df.rename(columns={"Salary":"Monthly_Salary"}))

    ID    Name  Age Department  Salary   Bonus Category
0  101     Ali   22         IT   50000  5000.0      Low
1  102    Sara   24         HR   55000  5500.0     High
2  103    Zain   24    Finance   52000  5200.0      Low
3  104  Fatima   21         IT   51000  5100.0      Low
4  105    Omar   25         HR   58000  5800.0     High
    ID    Name  Age Department  Monthly_Salary  Experience_Years   Bonus  \
0  101     Ali   22         IT           50000               1.0  5000.0   
1  102    Sara   24         HR           55000               2.0  5500.0   
2  103    Zain   24    Finance           52000               1.5  5200.0   
3  104  Fatima   21         IT           51000               1.0  5100.0   
4  105    Omar   25         HR           58000               3.0  5800.0   

  Category  
0      Low  
1     High  
2      Low  
3      Low  
4     High  


**Task#20: Export Updated Data**

In [42]:
# Save the updated DataFrame, with its new columns and changes, to a new CSV file
df.to_csv('updated_students.csv', index=False)
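
Note that `drop` and `rename` return new DataFrames, so printing their results in Task#19 leaves `df` itself unchanged; the file written above still contains `Experience_Years` and `Salary`. If the dropped and renamed columns should reach the CSV, the result has to be assigned first. A minimal sketch follows, using a hypothetical frame `staff`, a hypothetical file name, and a temporary directory so that nothing is written to the working directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the assignment's DataFrame
staff = pd.DataFrame({
    "ID": [101, 102, 103, 104, 105],
    "Name": ["Ali", "Sara", "Zain", "Fatima", "Omar"],
    "Salary": [50000, 55000, 52000, 51000, 58000],
    "Experience_Years": [1.0, 2.0, 1.5, 1.0, 3.0],
})

# Assign the result; drop() and rename() do not modify staff in place
updated = (
    staff.drop(columns="Experience_Years")
         .rename(columns={"Salary": "Monthly_Salary"})
)

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "updated_employees.csv")  # hypothetical file name
    updated.to_csv(out_path, index=False)
    # Read the file back to confirm what was actually saved
    reloaded = pd.read_csv(out_path)

print(reloaded.columns.tolist())
```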

## **The End**