# Week 4: NumPy & Pandas


### Introduction
In this notebook, we will dive deeper into:

- NumPy arrays: creation, slicing, operations
- Pandas DataFrames: creation, accessing, filtering, adding columns
- Combining both libraries for data handling

Please complete the sections marked # YOUR CODE HERE.
Each section gives you practice examples and short tasks to complete on your own.

### Section 1: NumPy Arrays

🏫 Example: Create a 1D array and explore it

In [None]:
listNum = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(listNum)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [None]:
import numpy as np

In [None]:
# Create a 1D array containing numbers from 0 to 9
myArray = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
print("Array:", myArray)

Array: [0 1 2 3 4 5 6 7 8 9]


In [None]:
# Access the element at index 4 (the fifth element, since indexing starts at 0)
print("Fifth element:", myArray[4])

# np.array also accepts a range; this one holds the numbers 0 to 19
myArray2 = np.array(range(20))
print("Alternative array:", myArray2)

Fifth element: 4
Alternative array: [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]


In [None]:
testArray = myArray*2/10+5
print(testArray)

[5.  5.2 5.4 5.6 5.8 6.  6.2 6.4 6.6 6.8]
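
The expression above works because NumPy applies arithmetic to every element of the array at once. As a minimal sketch of how that differs from a plain Python list (the names `plain_list` and `as_array` are illustrative and not part of the notebook):

```python
import numpy as np

plain_list = [0, 1, 2, 3, 4]
as_array = np.array(plain_list)

# Multiplying a list repeats it; multiplying an array scales each element.
repeated = plain_list * 2
scaled = as_array * 2

print(repeated)  # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
print(scaled)    # [0 2 4 6 8]
```

This is why the exercises below use arrays rather than lists for element-wise math.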


🏫 Exercise 1: Perform math operations

In [None]:
# TO DO: Create another array with values [10,20,30,40,50,60,70,80,90,100]
myArray3 = np.array([10,20,30,40,50,60,70,80,90,100])  # YOUR CODE HERE

# Add myArray and myArray3 element-wise
arraySum = myArray + myArray3  # YOUR CODE HERE
print("Result of addition:", arraySum)

# Multiply myArray3 by 5
myArray4 = myArray3*5  # YOUR CODE HERE
print("Result of multiplication:", myArray4)

Result of addition: [ 10  21  32  43  54  65  76  87  98 109]
Result of multiplication: [ 50 100 150 200 250 300 350 400 450 500]


<details> <summary>✅ Solution</summary>

myArray3 = np.array([10,20,30,40,50,60,70,80,90,100])
arraySum = myArray + myArray3
myArray4 = myArray3 * 5
print("Result of addition:", arraySum)
print("Result of multiplication:", myArray4)

</details>

🏫 Exercise 2: Work with 2D arrays

In [None]:
# TO DO: Create a 4x3 array [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
array_2d = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])  # YOUR CODE HERE
print(array_2d)

# Divide all elements by 2
array_2d_div = array_2d/2
print(array_2d_div)  # YOUR CODE HERE

# Multiply all elements by 6
result = array_2d*6  # YOUR CODE HERE
print(result)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
[[0.5 1.  1.5]
 [2.  2.5 3. ]
 [3.5 4.  4.5]
 [5.  5.5 6. ]]
[[ 6 12 18]
 [24 30 36]
 [42 48 54]
 [60 66 72]]


In [None]:
print(array_2d[1,2])

6


In [None]:
print(array_2d[1,:])

[4 5 6]


In [None]:
print(array_2d[:,2])

[ 3  6  9 12]
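
The cells above index a 2D array as [row, column], with a colon meaning "everything along this axis". A short sketch of the same idea, assuming a 4x3 array like the one in this exercise (`table` and the other names are illustrative):

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9],
                  [10, 11, 12]])

single = table[1, 2]      # row 1, column 2
last_row = table[-1, :]   # negative indices count from the end
corner = table[:2, :2]    # first two rows and first two columns

print(single)    # 6
print(last_row)  # [10 11 12]
print(corner)
```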


In [None]:
print( array_2d.reshape((3, 4)) )

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
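
Reshaping only changes how the same elements are laid out, so the new shape has to hold exactly the same number of elements. A small sketch, assuming a 12-element array like the one above (`grid` and `wide` are illustrative names):

```python
import numpy as np

grid = np.arange(1, 13).reshape((4, 3))   # 4 rows, 3 columns

# -1 asks NumPy to infer that dimension from the total size (12 / 3 = 4).
wide = grid.reshape((3, -1))
print(wide.shape)  # (3, 4)

# A shape that does not multiply out to 12 raises a ValueError.
try:
    grid.reshape((5, 2))
except ValueError as err:
    print("Cannot reshape:", err)
```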


<details> <summary>✅ Solution</summary>

array_2d = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
print(array_2d)
print(array_2d / 2)
result = array_2d * 6
print(result)

</details>

🏫 Exercise 3: Slicing and reshaping

In [None]:
# TO DO: Slice the first row of array_2d
first_row = array_2d[0,:]  # YOUR CODE HERE
print("First row:", first_row)

# TO DO: Slice the second column of array_2d
second_col = array_2d[:,1]  # YOUR CODE HERE
print("Second column:", second_col)

# TO DO: Reshape array_2d into a 1D array
reshaped = array_2d.reshape((-1))  # YOUR CODE HERE
print("Reshaped array:", reshaped)


First row: [1 2 3]
Second column: [ 2  5  8 11]
Reshaped array: [ 1  2  3  4  5  6  7  8  9 10 11 12]


In [None]:
print(array_2d[0:3,2])

[3 6 9]


<details> <summary>✅ Solution</summary>

first_row = array_2d[0, :]
second_col = array_2d[:, 1]
reshaped = array_2d.reshape(-1)
print("First row:", first_row)
print("Second column:", second_col)
print("Reshaped array:", reshaped)

</details>
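
One thing worth knowing about slices like the ones in this exercise: basic slicing in NumPy returns a view onto the original data, not a separate copy. The sketch below illustrates this with made-up names (`source`, `row_view`, `row_copy`):

```python
import numpy as np

source = np.array([[1, 2, 3],
                   [4, 5, 6]])

row_view = source[0, :]          # a view that shares memory with source
row_copy = source[0, :].copy()   # an independent copy

row_view[0] = 99                 # this also changes source[0, 0]
row_copy[1] = -1                 # this leaves source untouched

print(source)
```

If you want to change a slice without touching the original array, call .copy() first.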

### Section 2: Pandas DataFrames

🏫 Example: Create a DataFrame

In [None]:
import pandas as pd

# Create a DataFrame from a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [27, 33, 38, 42],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}

print("This is a dictionary:", data)

df = pd.DataFrame(data)
print("This is a DataFrame:\n", df)

This is a dictionary: {'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [27, 33, 38, 42], 'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
This is a DataFrame:
       Name  Age         City
0    Alice   27     New York
1      Bob   33  Los Angeles
2  Charlie   38      Chicago
3    David   42      Houston
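
Once a DataFrame exists, a few attributes and methods give a quick overview of it. A minimal sketch, assuming a smaller version of the table above (`people` is an illustrative name):

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [27, 33, 38, 42],
})

print(people.shape)          # (rows, columns)
print(list(people.columns))  # column labels
print(people.head(2))        # first two rows
print(people['Age'].mean())  # 35.0
```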


🏫 Exercise 4: Access and slice columns

In [None]:
# TO DO: Access the 'Age' column
ages = df['Age']  # YOUR CODE HERE
print(ages)

0    27
1    33
2    38
3    42
Name: Age, dtype: int64


In [None]:
# TO DO: Slice the second and third elements from 'Age'
age_slice = ages[1:3]  # YOUR CODE HERE
print(age_slice)

1    33
2    38
Name: Age, dtype: int64


<details> <summary>✅ Solution</summary>

ages = df['Age']
age_slice = df['Age'][1:3]
print(ages)
print(age_slice)

</details>
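
When slicing, it can help to state whether you mean positions or labels. In pandas, .iloc selects by position and excludes the end of the slice, while .loc selects by label and includes it. A sketch with an illustrative Series (not the notebook's `df`):

```python
import pandas as pd

ages = pd.Series([27, 33, 38, 42], name='Age')

by_position = ages.iloc[1:3]   # positions 1 and 2; the end is excluded
by_label = ages.loc[1:3]       # labels 1 through 3; the end is included

print(by_position.tolist())  # [33, 38]
print(by_label.tolist())     # [33, 38, 42]
```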

🏫 Exercise 5: Add a new column

In [None]:
# TO DO: Add a 'Gender' column with ['Female', 'Male', 'Male', 'Male']
df['Gender'] = ['Female', 'Male', 'Male', 'Male']  # YOUR CODE HERE
print(df)

      Name  Age         City  Gender
0    Alice   27     New York  Female
1      Bob   33  Los Angeles    Male
2  Charlie   38      Chicago    Male
3    David   42      Houston    Male


<details> <summary>✅ Solution</summary>

df['Gender'] = ['Female', 'Male', 'Male', 'Male']
print(df)

</details>

🏫 Exercise 6: Filter data

In [None]:
print(df['Age'])

0    27
1    33
2    38
3    42
Name: Age, dtype: int64


In [None]:
print(df['Age']>=30)

0    False
1     True
2     True
3     True
Name: Age, dtype: bool


In [None]:
print(df[ df['Age']>=30 ])

      Name  Age         City Gender
1      Bob   33  Los Angeles   Male
2  Charlie   38      Chicago   Male
3    David   42      Houston   Male


In [None]:
# TO DO: Filter DataFrame to only show people aged 35 and above
filtered_df = df[ df['Age']>=35 ]  # YOUR CODE HERE
print("Filtered DataFrame:\n", filtered_df)

# TO DO: Filter DataFrame to only show people whose 'Gender' is 'Male'
filtered_df = df[ df['Gender']=='Male' ]  # YOUR CODE HERE
print("Filtered DataFrame:\n", filtered_df)

Filtered DataFrame:
       Name  Age     City Gender
2  Charlie   38  Chicago   Male
3    David   42  Houston   Male
Filtered DataFrame:
       Name  Age         City Gender
1      Bob   33  Los Angeles   Male
2  Charlie   38      Chicago   Male
3    David   42      Houston   Male


<details> <summary>✅ Solution</summary>

filtered_df = df[df['Age'] >= 35]
print("Filtered DataFrame:\n", filtered_df)

</details>
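
Filters can also be combined. The sketch below, which rebuilds a small table under the illustrative name `people`, uses & for "and" and | for "or". Each condition needs its own parentheses because of operator precedence.

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [27, 33, 38, 42],
    'Gender': ['Female', 'Male', 'Male', 'Male'],
})

# Both conditions must hold.
older_men = people[(people['Age'] >= 35) & (people['Gender'] == 'Male')]

# At least one condition must hold.
young_or_female = people[(people['Age'] < 30) | (people['Gender'] == 'Female')]

print(older_men['Name'].tolist())        # ['Charlie', 'David']
print(young_or_female['Name'].tolist())  # ['Alice']
```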

🏫 Exercise 7: Grouping data

In [None]:
# TO DO: Group by 'Gender' and calculate the average age
grouped = df.groupby('Gender')['Age']  # YOUR CODE HERE
print(grouped.mean())
print(grouped.std())

Gender
Female    27.000000
Male      37.666667
Name: Age, dtype: float64
Gender
Female        NaN
Male      4.50925
Name: Age, dtype: float64


<details> <summary>✅ Solution</summary>

grouped = df.groupby('Gender')['Age']
print(grouped.mean())

</details>
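
Several statistics can be computed per group in one call with agg. A sketch under the assumption of a small table like the one above (`people` and `summary` are illustrative names):

```python
import pandas as pd

people = pd.DataFrame({
    'Age': [27, 33, 38, 42],
    'Gender': ['Female', 'Male', 'Male', 'Male'],
})

# One row per group, one column per statistic.
summary = people.groupby('Gender')['Age'].agg(['count', 'mean', 'std'])
print(summary)
```

The standard deviation for a group with a single member comes out as NaN, because the sample standard deviation is undefined for one value.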

🏫 Exercise 8: Sorting data

In [None]:
# TO DO: Sort the DataFrame by 'Age' in descending order
sorted_df = df.sort_values('Age', ascending=False)  # YOUR CODE HERE
print(sorted_df)

      Name  Age         City  Gender
3    David   42      Houston    Male
2  Charlie   38      Chicago    Male
1      Bob   33  Los Angeles    Male
0    Alice   27     New York  Female


<details> <summary>✅ Solution</summary>

sorted_df = df.sort_values('Age', ascending=False)
print(sorted_df)

</details>
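
Sorting keeps the original row labels, so the index no longer runs from 0 upward. If a fresh index is wanted, one option is to chain reset_index, as in this sketch (`people` and `sorted_people` are illustrative names):

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [27, 33, 38, 42],
})

# drop=True discards the old index instead of keeping it as a column.
sorted_people = people.sort_values('Age', ascending=False).reset_index(drop=True)

print(sorted_people)
```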

### Section 3: Combining NumPy and Pandas

🏫 Exercise 9: Create a DataFrame from NumPy array

In [None]:
# TO DO: Create a 4x2 NumPy array with numbers of your choice
array = np.array( [ [4,7], [8,5], [12,11], [0,3] ] )  # YOUR CODE HERE
# print(array)

# TO DO: Convert to DataFrame with columns ['A', 'B']
df_array = pd.DataFrame(array, columns = ['A', 'B'])  # YOUR CODE HERE
print(df_array)

    A   B
0   4   7
1   8   5
2  12  11
3   0   3


<details> <summary>✅ Solution</summary>

array = np.array([[1,2],[3,4],[5,6],[7,8]])
df_array = pd.DataFrame(array, columns=['A', 'B'])
print(df_array)

</details>
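
The arrays above are typed in by hand. If you would rather generate the numbers, one possible approach is NumPy's random generator, sketched below. The seed, the value range, and the names `rng`, `values`, `frame`, and `round_trip` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)          # seed chosen arbitrarily
values = rng.integers(0, 13, size=(4, 2))    # integers from 0 to 12

frame = pd.DataFrame(values, columns=['A', 'B'])
print(frame)

# to_numpy() converts the DataFrame back into a NumPy array.
round_trip = frame.to_numpy()
print(round_trip.shape)  # (4, 2)
```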

### Summary

✅ You have practised:

- Advanced NumPy array operations (slicing, reshaping, math)

- Pandas DataFrame creation, slicing, filtering, grouping, sorting

- Connecting NumPy and Pandas for flexible data handling
