Here’s an exercise to test your skills with NumPy:

### **Exercise: NumPy Array Manipulation and Operations**

You are given a dataset that contains the number of hours a student studied each day for a month. Using NumPy, perform various tasks and operations on this dataset.

```python
import numpy as np

# Dataset: Hours studied each day for a month (30 days)
hours_studied = np.array([1, 3, 2, 4, 6, 2, 5, 3, 7, 5, 
                          6, 8, 7, 4, 5, 6, 8, 9, 2, 5, 
                          3, 4, 8, 6, 7, 9, 4, 5, 3, 2])
```

### **Tasks:**

1. **Reshape the Array**:
   Reshape the `hours_studied` array into a 2D array with 5 rows and 6 columns. Display the reshaped array.

2. **Calculate the Total Hours**:
   Calculate the total number of hours the student studied over the month using NumPy's sum function.

3. **Daily Average**:
   Find the average number of hours studied per day for the entire month.

4. **Find Maximum and Minimum Hours**:
   Identify the day with the maximum and minimum hours of study.

5. **Replace Values**:
   Replace all occurrences of 5 hours or more with 10. Print the modified array.

6. **Cumulative Sum**:
   Calculate the cumulative sum of the hours studied for the month.

7. **Standard Deviation**:
   Compute the standard deviation of the number of hours studied.

8. **Boolean Masking**:
   Create a Boolean mask for days where the student studied more than 6 hours. Use the mask to print the days with more than 6 hours of study.

9. **Matrix Operations**:
   Generate another random 2D array of shape (5,6) with values between 1 and 10, and add this new array to the reshaped `hours_studied` array from Task 1.

10. **Linear Space Array**:
    Create a 1D array with 50 equally spaced numbers between the minimum and maximum hours studied.

---

### **Extra Challenge:**

11. **Sorting**:
    Sort the `hours_studied` array in descending order. Display the sorted array.

12. **Dot Product**:
    Create a new 1D array representing the difficulty level of each day (random integers between 1 and 3) and compute the dot product between this array and the original `hours_studied` array.

---

### **Expected Skills Tested**:
- Array creation, reshaping
- Mathematical operations on arrays
- Boolean indexing, slicing
- Aggregation functions (sum, mean, etc.)
- Broadcasting and matrix operations


In [2]:
import numpy as np

# Dataset: Hours studied each day for a month (30 days)
hours_studied = np.array([1, 3, 2, 4, 6, 2, 5, 3, 7, 5, 
                          6, 8, 7, 4, 5, 6, 8, 9, 2, 5, 
                          3, 4, 8, 6, 7, 9, 4, 5, 3, 2])
print(hours_studied)

[1 3 2 4 6 2 5 3 7 5 6 8 7 4 5 6 8 9 2 5 3 4 8 6 7 9 4 5 3 2]


In [6]:
# 1 Reshape the Array: Reshape the hours_studied array into a 2D array with 5 rows and 6 columns. Display the reshaped array.
hour_2d = hours_studied.reshape(5, 6)
print(f"shape of new array = {hour_2d.shape}")
print(hour_2d)

shape of new array = (5, 6)
[[1 3 2 4 6 2]
 [5 3 7 5 6 8]
 [7 4 5 6 8 9]
 [2 5 3 4 8 6]
 [7 9 4 5 3 2]]
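
One detail matters for the later tasks: `reshape` on a contiguous array returns a view rather than a copy, so `hour_2d` and `hours_studied` share memory. The sketch below illustrates this on a small stand-in array; the names `a`, `b`, and `c` are illustrative and not part of the exercise.

```python
import numpy as np

a = np.arange(6)               # stand-in data: [0 1 2 3 4 5]
b = a.reshape(2, 3)            # a view onto the same buffer
c = a.reshape(2, -1).copy()    # -1 lets NumPy infer the column count; copy() detaches it

a[0] = 99                      # write through the original array
print(np.shares_memory(a, b))  # True: b sees the change
print(b[0, 0], c[0, 0])        # 99 0
```

If the 2D array should stay independent of later edits to the 1D data, taking an explicit `.copy()` is one way to do it.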


In [11]:
# 2 Calculate the Total Hours: Calculate the total number of hours the student studied over the month using NumPy's sum function.
total_hr = np.sum(hours_studied)
print(f"Total hours studied = {total_hr}")

Total hours studied = 149


In [15]:
# 3 Daily Average: Find the average number of hours studied per day for the entire month.
daily_avg = np.mean(hours_studied)  # equivalent to the total divided by the number of days
print(f"Daily average = {daily_avg} hr/day")

Daily average = 4.966666666666667 hr/day


In [17]:
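# 4 Find Maximum and Minimum Hours: Identify the day with the maximum and minimum hours of study.
max_hr = hours_studied.max()
min_hr = hours_studied.min()
# argmax/argmin return the index of the first occurrence; add 1 to get a 1-indexed day
max_day = hours_studied.argmax() + 1
min_day = hours_studied.argmin() + 1
print(f"Max hours studied = {max_hr} (day {max_day}), min hours studied = {min_hr} (day {min_day})")

Max hours studied = 9 (day 18), min hours studied = 1 (day 1)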


In [20]:
# 5 Replace Values: Replace all occurrences of 5 hours or more with 10. Print the modified array
print(hours_studied)
# Boolean-mask assignment replaces every matching element in one step (no Python loop needed).
# Caution: this overwrites hours_studied in place, and hour_2d (a view created by reshape)
# changes with it, so all later cells operate on the modified values.
hours_studied[hours_studied >= 5] = 10
print(hours_studied)

[1 3 2 4 6 2 5 3 7 5 6 8 7 4 5 6 8 9 2 5 3 4 8 6 7 9 4 5 3 2]
[ 1  3  2  4 10  2 10  3 10 10 10 10 10  4 10 10 10 10  2 10  3  4 10 10
 10 10  4 10  3  2]
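
Because the assignment above overwrites `hours_studied` in place, every later cell works on the modified values. A non-destructive alternative, sketched here on the assumption that the original data should be preserved, is `np.where`, which returns a new array. The names `original` and `replaced` are illustrative.

```python
import numpy as np

# Unmodified dataset from the exercise statement
original = np.array([1, 3, 2, 4, 6, 2, 5, 3, 7, 5,
                     6, 8, 7, 4, 5, 6, 8, 9, 2, 5,
                     3, 4, 8, 6, 7, 9, 4, 5, 3, 2])

# Where the condition holds take 10, otherwise keep the original value
replaced = np.where(original >= 5, 10, original)

print(replaced)
print(original.sum())  # 149, the source array is untouched
```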


In [21]:
# 6 Cumulative Sum: Calculate the cumulative sum of the hours studied for the month.
np.cumsum(hours_studied)

array([  1,   4,   6,  10,  20,  22,  32,  35,  45,  55,  65,  75,  85,
        89,  99, 109, 119, 129, 131, 141, 144, 148, 158, 168, 178, 188,
       192, 202, 205, 207], dtype=int32)
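
The running totals above were computed after Task 5 had overwritten the data, which is why they end at 207 rather than the monthly total of 149. A sketch of the same call on the unmodified dataset follows, with a row-wise variant added for illustration; `original`, `running_total`, and `by_row` are illustrative names.

```python
import numpy as np

# Unmodified dataset from the exercise statement
original = np.array([1, 3, 2, 4, 6, 2, 5, 3, 7, 5,
                     6, 8, 7, 4, 5, 6, 8, 9, 2, 5,
                     3, 4, 8, 6, 7, 9, 4, 5, 3, 2])

running_total = np.cumsum(original)
print(running_total[-1])   # 149, the last entry equals the overall sum

# On a 2D array, axis=1 restarts the running total in every row
by_row = np.cumsum(original.reshape(5, 6), axis=1)
print(by_row[0])           # [ 1  4  6 10 16 18]
```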

In [23]:
# 7 Standard Deviation: Compute the standard deviation of the number of hours studied.
std_dev=np.std(hours_studied)
print(f"Standard deviation of hours studied: {std_dev}")

Standard deviation of hours studied: 3.5995370072644994
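
The value above is likewise computed on the array modified in Task 5. Note also that `np.std` defaults to the population formula (`ddof=0`); passing `ddof=1` gives the sample standard deviation. A minimal sketch on the unmodified dataset, with illustrative variable names:

```python
import numpy as np

# Unmodified dataset from the exercise statement
original = np.array([1, 3, 2, 4, 6, 2, 5, 3, 7, 5,
                     6, 8, 7, 4, 5, 6, 8, 9, 2, 5,
                     3, 4, 8, 6, 7, 9, 4, 5, 3, 2])

pop_std = np.std(original)             # divides by n
sample_std = np.std(original, ddof=1)  # divides by n - 1

# The population value written out by hand, for comparison
manual = np.sqrt(np.mean((original - original.mean()) ** 2))

print(pop_std, sample_std, manual)
```

Which variant is appropriate depends on whether the 30 days are treated as the whole population or as a sample of a longer period.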


In [37]:
# 8 Boolean Masking: Create a Boolean mask for days where the student studied more than 6 hours. 
# Use the mask to print the days with more than 6 hours of study.
mask = hours_studied > 6  # strictly more than 6 hours, as the task asks
# Days (1-indexed) and the corresponding hours with more than 6 hours of study
days = np.where(mask)[0] + 1
hours = hours_studied[mask]
print(days)

[ 5  7  9 10 11 12 13 15 16 17 18 20 23 24 25 26 28]
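
After Task 5 every value of 5 or more has become 10, so the list above contains every day with at least 5 hours in the original data. Applied to the unmodified dataset, a strict `> 6` mask gives a shorter list. A sketch, with illustrative variable names:

```python
import numpy as np

# Unmodified dataset from the exercise statement
original = np.array([1, 3, 2, 4, 6, 2, 5, 3, 7, 5,
                     6, 8, 7, 4, 5, 6, 8, 9, 2, 5,
                     3, 4, 8, 6, 7, 9, 4, 5, 3, 2])

mask = original > 6              # Boolean array, True where hours exceed 6
days = np.flatnonzero(mask) + 1  # indices of True entries, shifted to 1-indexed days

print(days)            # [ 9 12 13 17 18 23 25 26]
print(original[mask])  # [7 8 7 8 9 8 7 9]
```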


In [49]:
# 9 Matrix Operations: Generate another random 2D array of shape (5,6) with values between 1 and 10, 
# and add this new array to the reshaped hours_studied array from Task 1.
new_2d=np.random.randint(1,11,size=(5,6))
print(hour_2d+new_2d)

[[10 12  4 11 14  8]
 [17  5 19 18 19 11]
 [13  6 20 17 18 18]
 [ 9 15  6  7 16 17]
 [11 13 13 14  6  8]]
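
The sums above come from an unseeded `np.random.randint`, so rerunning the cell gives different numbers; they also include the values that Task 5 wrote through to `hour_2d`. A reproducible sketch using NumPy's `Generator` API on the unmodified data is shown below. The seed `0` and the names `grid`, `extra`, and `total` are arbitrary choices for illustration.

```python
import numpy as np

# Unmodified dataset from the exercise statement
original = np.array([1, 3, 2, 4, 6, 2, 5, 3, 7, 5,
                     6, 8, 7, 4, 5, 6, 8, 9, 2, 5,
                     3, 4, 8, 6, 7, 9, 4, 5, 3, 2])

rng = np.random.default_rng(0)                # fixed seed, so results repeat
grid = original.reshape(5, 6)
extra = rng.integers(1, 11, size=grid.shape)  # upper bound is exclusive: values 1..10

total = grid + extra                          # elementwise addition of same-shaped arrays
print(total)
```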


In [42]:
# 10 Linear Space Array: Create a 1D array with 50 equally spaced numbers between the minimum and maximum hours studied.
# num=50 is also the default, but passing it makes the requirement explicit
linear = np.linspace(hours_studied.min(), hours_studied.max(), num=50)
print(linear)
print(f"length = {len(linear)}")

[ 1.          1.18367347  1.36734694  1.55102041  1.73469388  1.91836735
  2.10204082  2.28571429  2.46938776  2.65306122  2.83673469  3.02040816
  3.20408163  3.3877551   3.57142857  3.75510204  3.93877551  4.12244898
  4.30612245  4.48979592  4.67346939  4.85714286  5.04081633  5.2244898
  5.40816327  5.59183673  5.7755102   5.95918367  6.14285714  6.32653061
  6.51020408  6.69387755  6.87755102  7.06122449  7.24489796  7.42857143
  7.6122449   7.79591837  7.97959184  8.16326531  8.34693878  8.53061224
  8.71428571  8.89795918  9.08163265  9.26530612  9.44897959  9.63265306
  9.81632653 10.        ]
length = 50
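
The grid above runs from 1 to 10 because the maximum had become 10 after Task 5. On the unmodified data the endpoints are 1 and 9. The sketch below also passes `retstep=True`, which makes `np.linspace` return the spacing alongside the array; variable names are illustrative.

```python
import numpy as np

# Unmodified dataset from the exercise statement
original = np.array([1, 3, 2, 4, 6, 2, 5, 3, 7, 5,
                     6, 8, 7, 4, 5, 6, 8, 9, 2, 5,
                     3, 4, 8, 6, 7, 9, 4, 5, 3, 2])

grid, step = np.linspace(original.min(), original.max(), num=50, retstep=True)

print(grid[0], grid[-1], len(grid))  # 1.0 9.0 50
print(step)                          # spacing: (9 - 1) / 49
```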


In [53]:
# 11 Sorting: Sort the hours_studied array in descending order. Display the sorted array.
# np.sort returns an ascending copy; reversing it with [::-1] gives descending order
print(np.sort(hours_studied)[::-1])


[10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10 10  4  4  4  4  3  3  3
  3  2  2  2  2  1]


In [55]:
# 12 Dot Product: Create a new 1D array representing the difficulty level of each day (random integers between 1 and 3) 
# and compute the dot product between this array and the original hours_studied array.
difficulty_levels = np.random.randint(1, 4, size=hours_studied.shape)
dot_product = np.dot(hours_studied, difficulty_levels)

print(f"Difficulty levels: {difficulty_levels}")
print(f"Dot product: {dot_product}")

Difficulty levels: [1 1 3 1 3 2 2 3 3 3 3 2 1 1 1 3 2 2 2 3 1 1 1 2 2 3 3 2 1 3]
Dot product: 413
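
For two 1D arrays, the dot product is the sum of the elementwise products, so here it acts as a difficulty-weighted total of study hours. The cell above draws unseeded random numbers, and by this point in the notebook `hours_studied` has already been modified by Task 5, whereas the task asks for the original data. A reproducible sketch on the unmodified dataset follows; the seed `0` and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# Unmodified dataset from the exercise statement
original = np.array([1, 3, 2, 4, 6, 2, 5, 3, 7, 5,
                     6, 8, 7, 4, 5, 6, 8, 9, 2, 5,
                     3, 4, 8, 6, 7, 9, 4, 5, 3, 2])

rng = np.random.default_rng(0)                        # fixed seed, so results repeat
difficulty = rng.integers(1, 4, size=original.shape)  # upper bound exclusive: values 1..3

weighted = np.dot(original, difficulty)
by_hand = np.sum(original * difficulty)               # same quantity, written out

print(weighted, by_hand, original @ difficulty)       # all three agree
```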
