# NumPy Practice Notebook
### Dataset: AusApparalSales4thQrt2020.csv (Australian Apparel Sales Q4 2020)

**Columns:** Date, Time, State, Group, Unit, Sales

**Instructions:**
- Each question is in a markdown cell
- Write your solution in the empty code cell below each question
- Run cells with `Shift + Enter`
- Refer to `numpy_basics.md` for syntax help

In [None]:
# Setup - Run this cell first!
import numpy as np
import pandas as pd

# Load the dataset
df = pd.read_csv('../AusApparalSales4thQrt2020.csv')

# Extract numeric columns as NumPy arrays
units = df['Unit'].to_numpy()
sales = df['Sales'].to_numpy()

print(f"Dataset loaded: {df.shape[0]} rows, {df.shape[1]} columns")
print(f"Units array shape: {units.shape}")
print(f"Sales array shape: {sales.shape}")
print("\nFirst 5 rows:")
df.head()

---
## Section 1: Array Basics

**Q1.** Print the shape, dimensions, size, and data type of the `sales` array.

In [None]:
# Q1: Your solution here


**Q2.** Print the first 10 elements, last 5 elements, and every 3rd element of the `units` array.

In [None]:
# Q2: Your solution here


**Q3.** Reshape the first 60 elements of `sales` into a 2D array of shape (10, 6). Print the result.

In [None]:
# Q3: Your solution here


**Q4.** Flatten the reshaped array from Q3 back to 1D and verify it matches the original 60 elements.

In [None]:
# Q4: Your solution here


**Q5.** Create a copy of the `units` array. Modify the first element of the copy to 999. Verify the original array is unchanged.

In [None]:
# Q5: Your solution here


---
## Section 2: Indexing & Slicing

**Q6.** Extract all sales values from index 100 to 120 (inclusive).

In [None]:
# Q6: Your solution here


**Q7.** Extract every 50th element from the `sales` array.

In [None]:
# Q7: Your solution here


**Q8.** Using boolean indexing, find all `units` values that are greater than 20.

In [None]:
# Q8: Your solution here


**Q9.** Using boolean indexing, find all `sales` values where `units` is exactly 10.
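
A hint sketch on made-up toy arrays (not the dataset): a boolean mask built from one array can index a second array of the same length. The names `toy_units` and `toy_sales` are illustrative only.

```python
import numpy as np

# Toy values, not taken from the dataset
toy_units = np.array([10, 5, 10])
toy_sales = np.array([100, 200, 300])

# Mask is True where toy_units equals 10
mask = toy_units == 10

# Apply the mask to the *other* array; both arrays must have the same length
picked = toy_sales[mask]
print(picked)  # the values 100 and 300
```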

In [None]:
# Q9: Your solution here


**Q10.** Use fancy indexing to extract sales at indices [0, 100, 500, 1000, 5000].

In [None]:
# Q10: Your solution here


---
## Section 3: Mathematical & Statistical Operations

**Q11.** Calculate the mean, median, standard deviation, and variance of the `sales` array.

In [None]:
# Q11: Your solution here


**Q12.** Find the minimum and maximum values in `units`. Also find their index positions using `argmin()` and `argmax()`.

In [None]:
# Q12: Your solution here


**Q13.** Calculate the total (sum) of all sales. What is the cumulative sum of the first 20 sales values?

In [None]:
# Q13: Your solution here


**Q14.** Calculate the average sales per unit. (Hint: divide `sales` array by `units` array; handle division by zero if any)
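
One possible way to guard against division by zero, sketched on made-up toy arrays (not the dataset). It assumes you want `NaN` wherever the divisor is 0; other choices (such as 0, or dropping those rows) are equally valid.

```python
import numpy as np

# Toy values, not taken from the dataset
toy_sales = np.array([100.0, 250.0, 90.0])
toy_units = np.array([4, 0, 3])

# Pre-fill the output with NaN; positions where the divisor is 0 keep that NaN
per_unit = np.full(toy_sales.shape, np.nan)

# Divide only where the mask is True, writing results into per_unit
np.divide(toy_sales, toy_units, out=per_unit, where=toy_units != 0)

print(per_unit)  # 25.0, NaN, 30.0
```

Pre-filling `out` matters here: positions skipped by `where` keep whatever value `out` already held.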

In [None]:
# Q14: Your solution here


**Q15.** What percentage of total sales does each individual sale represent? Store the result in a new array.

In [None]:
# Q15: Your solution here


---
## Section 4: Array Operations

**Q16.** Add 500 to every element in the `sales` array (broadcasting). Print the first 10 results.

In [None]:
# Q16: Your solution here


**Q17.** Multiply all `units` values by 2.5 (simulating a price increase). Print the first 10 results.

In [None]:
# Q17: Your solution here


**Q18.** Create a boolean array where `sales > 30000`. How many True values are there? What percentage of total rows is that?

In [None]:
# Q18: Your solution here


**Q19.** Use `np.where()` to create a new array that labels each sale as `'High'` if sales > 25000, else `'Low'`.
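
A hint sketch of the three-argument form of `np.where()` on a made-up toy array (the threshold and labels here are illustrative, not the ones from the question):

```python
import numpy as np

# Toy values, not taken from the dataset
toy = np.array([5, 15, 25])

# np.where(condition, value_if_true, value_if_false) works element by element
labels = np.where(toy > 10, 'big', 'small')

print(labels)  # ['small' 'big' 'big']
```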

In [None]:
# Q19: Your solution here


**Q20.** Clip the `sales` array so that all values are between 10000 and 40000. Print the first 20 values.

In [None]:
# Q20: Your solution here


---
## Section 5: Sorting & Searching

**Q21.** Sort the `sales` array in ascending order. What are the 5 smallest and 5 largest sales values?

In [None]:
# Q21: Your solution here


**Q22.** Use `np.argsort()` on `sales` to find the indices of the top 10 highest sales.
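
A hint sketch on a made-up toy array: `np.argsort()` returns the indices that would sort the array in ascending order, so the largest values sit at the end of that index array.

```python
import numpy as np

# Toy values, not taken from the dataset
toy = np.array([30, 10, 50, 20, 40])

# Indices that sort toy ascending: 1, 3, 0, 4, 2
order = np.argsort(toy)

# Take the last two indices and reverse them so the largest comes first
top2 = order[-2:][::-1]

print(top2)       # indices 2 and 4
print(toy[top2])  # values 50 and 40
```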

In [None]:
# Q22: Your solution here


**Q23.** Find all unique values in the `units` array. How many unique unit values exist?

In [None]:
# Q23: Your solution here


**Q24.** Use `np.unique()` with `return_counts=True` on `units` to find the frequency of each unit value.

In [None]:
# Q24: Your solution here


**Q25.** Use `np.searchsorted()` on a sorted `sales` array to find where the value 25000 would be inserted.
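
A hint sketch on a made-up toy array. `np.searchsorted()` assumes its first argument is already sorted ascending; the `side` argument decides where a value lands when it ties with existing entries.

```python
import numpy as np

# Toy values, already sorted ascending (not taken from the dataset)
toy_sorted = np.array([10, 20, 30, 30, 40])

# 25 is absent, so it would be inserted before the first 30
pos_25 = np.searchsorted(toy_sorted, 25)

# 30 is present twice: 'left' (the default) goes before the ties, 'right' after
pos_30_left = np.searchsorted(toy_sorted, 30)
pos_30_right = np.searchsorted(toy_sorted, 30, side='right')

print(pos_25, pos_30_left, pos_30_right)  # 2 2 4
```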

In [None]:
# Q25: Your solution here


---
## Section 6: Reshaping & Stacking

**Q26.** Take the first 120 sales values. Reshape them into a (10, 12) matrix. Find the sum of each row and each column.

In [None]:
# Q26: Your solution here


**Q27.** Split the first 100 elements of `units` into 5 equal arrays using `np.split()`.

In [None]:
# Q27: Your solution here


**Q28.** Create two arrays: one with the first 50 sales values and another with the next 50. Stack them vertically using `np.vstack()` and horizontally using `np.hstack()`.
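
A hint sketch on made-up toy arrays showing how the two stacking functions differ for 1D inputs: `np.vstack()` makes each input a row of a 2D result, while `np.hstack()` joins them end to end into one longer 1D array.

```python
import numpy as np

# Toy values, not taken from the dataset
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

v = np.vstack((a, b))  # each 1D input becomes one row
h = np.hstack((a, b))  # inputs are concatenated end to end

print(v.shape)  # (2, 3)
print(h.shape)  # (6,)
```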

In [None]:
# Q28: Your solution here


**Q29.** Take the (10, 12) matrix from Q26. Transpose it. What is the new shape?

In [None]:
# Q29: Your solution here


**Q30.** Add a new axis to the `units` array to make it a column vector. Print the shape.
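
A hint sketch on a made-up toy array. Indexing with `np.newaxis` and calling `reshape(-1, 1)` are two common ways to turn a 1D array into a column vector; both give the same shape.

```python
import numpy as np

# Toy values, not taken from the dataset
toy = np.array([1, 2, 3])

col_a = toy[:, np.newaxis]  # insert a length-1 axis after the existing one
col_b = toy.reshape(-1, 1)  # -1 lets NumPy infer the number of rows

print(toy.shape)    # (3,)
print(col_a.shape)  # (3, 1)
print(col_b.shape)  # (3, 1)
```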

In [None]:
# Q30: Your solution here


---
## Section 7: Aggregation with Axis

**Q31.** Reshape the first 200 sales values into a (10, 20) matrix. Calculate:
- Sum along axis=0 (column-wise sum)
- Sum along axis=1 (row-wise sum)
- Mean along axis=0
- Mean along axis=1
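
A hint sketch on a small made-up matrix showing what the `axis` argument does: `axis=0` collapses the rows (one result per column) and `axis=1` collapses the columns (one result per row).

```python
import numpy as np

# Toy 2x3 matrix, not taken from the dataset:
# [[0, 1, 2],
#  [3, 4, 5]]
m = np.arange(6).reshape(2, 3)

col_sums = m.sum(axis=0)    # one value per column: 3, 5, 7
row_sums = m.sum(axis=1)    # one value per row: 3, 12
col_means = m.mean(axis=0)  # 1.5, 2.5, 3.5
row_means = m.mean(axis=1)  # 1.0, 4.0

print(col_sums, row_sums)
print(col_means, row_means)
```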

In [None]:
# Q31: Your solution here


**Q32.** For the same matrix, find the max value in each row and the min value in each column.

In [None]:
# Q32: Your solution here


**Q33.** For the same matrix, use `np.argmax(axis=1)` to find which column has the highest value in each row.

In [None]:
# Q33: Your solution here


---
## Section 8: Random & Simulation

**Q34.** Set a random seed of 42. Generate a random array of 100 values from a normal distribution with mean=25000 and std=5000. Compare its mean and std with the actual `sales` array.

In [None]:
# Q34: Your solution here


**Q35.** Use `np.random.choice()` to randomly sample 50 values from the `sales` array (without replacement).

In [None]:
# Q35: Your solution here


**Q36.** Simulate 1000 random daily sales by sampling from the `units` array with replacement. What is the mean of this simulated data?

In [None]:
# Q36: Your solution here


**Q37.** Generate a 5x5 random integer matrix with values between 1 and 50. Find its determinant and inverse (if it exists).

In [None]:
# Q37: Your solution here


---
## Section 9: Linear Algebra (Bonus)

**Q38.** Create a 3x3 matrix from the first 9 sales values. Calculate its:
- Transpose
- Determinant
- Trace (sum of diagonal)

In [None]:
# Q38: Your solution here


**Q39.** Create two 3x3 matrices from sales data. Perform matrix multiplication using `@` operator and `np.dot()`.

In [None]:
# Q39: Your solution here


**Q40.** Solve the system of equations:
```
2x + 3y = 25000
4x + y  = 30000
```
Use `np.linalg.solve()`.
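
A hint sketch using a different, made-up system (x + y = 3 and x - y = 1) to show how the inputs are arranged: the coefficients go into a square matrix, one row per equation, and the right-hand sides go into a 1D array.

```python
import numpy as np

# Made-up system, not the one from the question:
#   x + y = 3
#   x - y = 1
A = np.array([[1.0, 1.0],
              [1.0, -1.0]])  # one row of coefficients per equation
b = np.array([3.0, 1.0])     # right-hand sides, in the same order

solution = np.linalg.solve(A, b)  # expected: x = 2, y = 1

# Check the result by substituting it back into the system
print(solution)
print(np.allclose(A @ solution, b))  # True
```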

In [None]:
# Q40: Your solution here


---
## Section 10: Real-World Analysis with NumPy

**Q41.** Calculate the z-score for each value in the `sales` array. How many sales are more than 2 standard deviations from the mean?
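
A hint sketch on a made-up toy array. The z-score is (value - mean) / standard deviation. Note that `np.std()` uses the population formula (`ddof=0`) by default; whether to use that or the sample formula (`ddof=1`) is a choice the question leaves open. The threshold of 1 below is illustrative only.

```python
import numpy as np

# Toy values, not taken from the dataset; mean is 5 and population std is 2
toy = np.array([2, 4, 4, 4, 5, 5, 7, 9])

z = (toy - toy.mean()) / toy.std()  # -1.5, -0.5, -0.5, -0.5, 0, 0, 1, 2

# Count values strictly more than 1 standard deviation from the mean
n_far = np.sum(np.abs(z) > 1)

print(z)
print(n_far)  # 2
```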

In [None]:
# Q41: Your solution here


**Q42.** Calculate the correlation coefficient between `units` and `sales` using `np.corrcoef()`.
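
A hint sketch on made-up toy arrays. `np.corrcoef()` returns a 2x2 correlation matrix for two inputs, so the coefficient between them is the off-diagonal entry.

```python
import numpy as np

# Toy values, not taken from the dataset; y is exactly 2 * x
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

corr_matrix = np.corrcoef(x, y)  # shape (2, 2), with 1s on the diagonal
r = corr_matrix[0, 1]            # correlation between x and y

print(corr_matrix.shape)  # (2, 2)
print(r)                  # approximately 1.0 for a perfect linear relation
```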

In [None]:
# Q42: Your solution here


**Q43.** Use `np.percentile()` to find the 25th, 50th, 75th, and 90th percentiles of `sales`.

In [None]:
# Q43: Your solution here


**Q44.** Bin the `sales` data into 5 equal-width bins using `np.histogram()`. Print the bin edges and counts.
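
A hint sketch on a made-up toy array. `np.histogram()` returns two arrays, counts first and bin edges second; there is always one more edge than there are bins.

```python
import numpy as np

# Toy values, not taken from the dataset
toy = np.array([1, 2, 2, 3, 9])

# Two equal-width bins spanning min..max: [1, 5) and [5, 9]
counts, edges = np.histogram(toy, bins=2)

print(edges)   # edges at 1, 5, 9
print(counts)  # 4 values in the first bin, 1 in the second
```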

In [None]:
# Q44: Your solution here


**Q45.** Calculate the moving average of `sales` with a window size of 7 using `np.convolve()`.
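
A hint sketch on a made-up toy array with a window of 3 (not the window from the question). Convolving with a kernel of equal weights that sum to 1 gives a simple moving average. `mode='valid'` is one reasonable choice, assumed here: it keeps only positions where the window fits entirely inside the data, so the result is shorter than the input.

```python
import numpy as np

# Toy values, not taken from the dataset
toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
window = 3

# Equal weights that sum to 1
kernel = np.ones(window) / window

# 'valid' returns len(toy) - window + 1 values
moving_avg = np.convolve(toy, kernel, mode='valid')

print(moving_avg)        # approximately 2, 3, 4
print(moving_avg.shape)  # (3,)
```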

In [None]:
# Q45: Your solution here


---
## Congratulations!
You've completed all 45 NumPy practice questions. 

**Next steps:**
- Review any questions you found difficult
- Move on to the **Pandas** practice notebook
- Refer to `numpy_basics.md` for any syntax you want to revisit