
# Numpy 2

## **Content**

- Aggregate Functions
- Logical Operations
  - `np.any()`
  - `np.all()`
  - `np.where()`
- Sorting
- Vectorization
- Broadcasting

---

## **Aggregate Functions**

NumPy provides aggregate functions that reduce the elements of an array to a summary value (a sum, a mean, a minimum, and so on). They run in compiled code, so they are much **faster than an equivalent Python loop**.

#### How would we calculate the sum of the elements of an array?

#### `np.sum()`

- It sums all the values in a numpy array.

In [None]:
import numpy as np
a = np.arange(1, 11)
a

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [None]:
np.sum(a)

np.int64(55)

#### What if we want to find the average value or median value of all the elements in an array?

#### `np.mean()`

- It gives us the mean of all values in a numpy array.

In [None]:
np.mean(a)

np.float64(5.5)

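Similar to `np.sum()` and `np.mean()`, the other aggregate functions work the same way (e.g., `np.min()`, `np.max()`, `np.median()`, `np.std()`). Note that NumPy has no `count()` function; use `arr.size` for the number of elements or `np.count_nonzero()` to count elements that satisfy a condition.

Let's apply aggregate functions to a 2D array.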
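
As a quick sketch of a few other aggregates, the snippet below applies `np.min`, `np.max`, `np.median`, and `np.count_nonzero` to a small array. The array `values` is only illustrative data, not part of the original notebook.

```python
import numpy as np

values = np.arange(1, 11)  # illustrative data: 1..10

smallest = np.min(values)                    # 1
largest = np.max(values)                     # 10
middle = np.median(values)                   # 5.5 (average of 5 and 6)
n_even = np.count_nonzero(values % 2 == 0)   # 5 even numbers

print(smallest, largest, middle, n_even)
```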

In [None]:
a = np.arange(12).reshape(3, 4)
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [None]:
np.sum(a)  # sums all the values present in the array

np.int64(66)

### What if we want to sum the elements row-wise or column-wise?

- By **setting the `axis` parameter**

#### What will `np.sum(a, axis=0)` do?

- `np.sum(a, axis=0)` adds together values from **different rows**, giving one sum per column.
- `axis = 0` $\rightarrow$ **The operation runs along the vertical axis**
- Summation of values happens **in the vertical direction**.
- Rows collapse/merge when we use `axis=0`.

In [None]:
np.sum(a, axis=0)

array([12, 15, 18, 21])

#### What if we specify `axis=1`?

- `np.sum(a, axis=1)` adds together values from **different columns**, giving one sum per row.
- `axis = 1` $\rightarrow$ **The operation runs along the horizontal axis**
- Summation of values happens **in the horizontal direction**.
- Columns collapse/merge when we use `axis=1`.

In [None]:
np.sum(a, axis=1)

array([ 6, 22, 38])
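
One way to remember which axis does what is to look at the shapes: the axis you pass is the one that disappears from the result. The sketch below rebuilds the same 3x4 array under the hypothetical name `m` and also shows that other aggregates accept `axis` in the same way.

```python
import numpy as np

m = np.arange(12).reshape(3, 4)   # same layout as the 3x4 array above

col_sums = m.sum(axis=0)   # collapses the 3 rows -> one value per column
row_sums = m.sum(axis=1)   # collapses the 4 columns -> one value per row

print(m.shape, col_sums.shape, row_sums.shape)  # (3, 4) (4,) (3,)

# Other aggregates take `axis` the same way.
col_max = m.max(axis=0)     # [ 8  9 10 11]
row_mean = m.mean(axis=1)   # [1.5 5.5 9.5]
```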

---

## **Logical Operations**

#### What if we want to check whether **"any"** element of an array satisfies a specific condition?

#### `np.any()`

- Returns `True` if **at least one element** of the boolean array produced by the **condition** is `True`.


Imagine you have a shopping list with items you need to buy, but you're not sure if you have enough money to buy everything.

You want to check if there's at least one item on your list that you can afford.

In this case, you can use `np.any`:


In [None]:
import numpy as np

# Prices of items on your shopping list
prices = np.array([50, 45, 25, 20, 35])

# Your budget
budget = 30

# Check if there's at least one item you can afford
can_afford = np.any(prices <= budget)

if can_afford:
    print("You can buy at least one item on your list!")
else:
    print("Sorry, nothing on your list fits your budget.")

You can buy at least one item on your list!


---

#### What if we want to check whether "all" the elements in our array follow a specific condition?

#### `np.all()`

- Returns `True` only if **every element** of the boolean array produced by the **condition** is `True`.



Let's consider a scenario where you have a list of chores, and you want to make sure all the chores are done before you can play video games.

You can use `np.all` to check if all the chores are completed.

In [None]:
import numpy as np

# Chores status: 1 for done, 0 for not done
chores = np.array([1, 1, 1, 1, 0])

# Check if all chores are done
all_chores_done = np.all(chores == 1)

if all_chores_done:
    print("Great job! You've completed all your chores. Time to play!")
else:
    print("Finish all your chores before you can play.")


Finish all your chores before you can play.


**Multiple conditions with `.all()`:** combine them with the element-wise `&` operator and wrap each condition in parentheses.

In [None]:
a = np.array([1, 2, 3, 2])
b = np.array([2, 2, 3, 2])
c = np.array([6, 4, 4, 5])

((a <= b) & (b <= c)).all()

np.True_
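
Both `any` and `all` also accept an `axis` argument, just like the aggregate functions, which is handy for per-row or per-column checks. The sketch below uses a hypothetical `scores` table (one row per student, one column per test) and an assumed pass mark of 50.

```python
import numpy as np

# Hypothetical scores: one row per student, one column per test
scores = np.array([[72, 85, 90],
                   [40, 65, 58],
                   [91, 95, 88]])

passed = scores >= 50                     # boolean array, same shape as scores

passed_every_test = passed.all(axis=1)    # per student: [ True False  True]
failed_any_test = (~passed).any(axis=1)   # per student: [False  True False]

print(passed_every_test)
print(failed_any_test)
```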

---

#### `np.where()`

- Syntax: `np.where(condition, x, y)`
- Returns an `ndarray` whose elements are taken from `x` where `condition` is `True` and from `y` elsewhere.


Suppose you have a list of product prices, and you want to apply a **10%** discount to all products with prices above **$50**.

You can use `np.where` to adjust the prices.

In [None]:
import numpy as np

# Product prices
prices = np.array([45, 55, 60, 75, 40, 90])

# Apply a 10% discount to prices above $50
discounted_prices = np.where(prices > 50, prices * 0.9, prices)

print("Original prices:", prices)
print("Discounted prices:", discounted_prices)

Original prices: [45 55 60 75 40 90]
Discounted prices: [45.  49.5 54.  67.5 40.  81. ]


**Notice that it didn't change the original array.**
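
`np.where` can also be called with only the condition. In that form it returns the positions where the condition holds, as a tuple with one index array per dimension. A minimal sketch using the same prices (the name `expensive_idx` is just illustrative):

```python
import numpy as np

prices = np.array([45, 55, 60, 75, 40, 90])

# With only a condition, np.where returns a tuple of index arrays
# (one per dimension) marking where the condition is True.
(expensive_idx,) = np.where(prices > 50)

print(expensive_idx)          # [1 2 3 5]
print(prices[expensive_idx])  # [55 60 75 90]
```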

---

## Sorting

- `np.sort` returns a sorted copy of an array.

In [None]:
a = np.array([4, 7, 0, 3, 8, 2, 5, 1, 6, 9])
a

array([4, 7, 0, 3, 8, 2, 5, 1, 6, 9])

In [None]:
b = np.sort(a)
b

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
a # no change is reflected in the original array

array([4, 7, 0, 3, 8, 2, 5, 1, 6, 9])

#### We can also call the `sort` method directly on the array, but it changes the original array because it sorts in place.

In [None]:
a.sort() # sorting is performed inplace
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

### **Sorting in 2D array**

In [None]:
a = np.array([[1,5,3], [2,5,7], [400, 200, 300]])
a

array([[  1,   5,   3],
       [  2,   5,   7],
       [400, 200, 300]])

In [None]:
np.sort(a, axis=0) # sorting every column

array([[  1,   5,   3],
       [  2,   5,   7],
       [400, 200, 300]])
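
The columns of the array above happen to be sorted already, so the output is identical to the input. To see `axis=0` actually reorder values, here is a small sketch with a hypothetical array `m` whose columns are unsorted:

```python
import numpy as np

# Hypothetical array whose columns are not already sorted
m = np.array([[9, 1, 5],
              [3, 8, 2],
              [6, 4, 7]])

sorted_cols = np.sort(m, axis=0)   # each column sorted top to bottom
print(sorted_cols)
# [[3 1 2]
#  [6 4 5]
#  [9 8 7]]
```

Each column is sorted independently, so values never move between columns.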

In [None]:
np.sort(a, axis=1) # sorting every row

array([[  1,   3,   5],
       [  2,   5,   7],
       [200, 300, 400]])

**Note**: By default, the `np.sort()` function sorts along the last axis.

In [None]:
a = np.array([[23,4,43], [12, 89, 3], [69, 420, 0]])

In [None]:
np.sort(a) # default axis = -1 (last axis)

array([[  4,  23,  43],
       [  3,  12,  89],
       [  0,  69, 420]])
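
Two related tasks come up often: sorting in descending order and finding the order of the indices rather than the sorted values. `np.sort` always sorts ascending, so one common approach is to reverse the result; `np.argsort` returns the indices that would sort the array. A short sketch with an illustrative array `arr`:

```python
import numpy as np

arr = np.array([4, 7, 0, 3, 8])

descending = np.sort(arr)[::-1]   # sort ascending, then reverse: [8 7 4 3 0]

order = np.argsort(arr)           # indices that would sort arr: [2 3 0 1 4]
print(descending)
print(arr[order])                 # [0 3 4 7 8], same as np.sort(arr)
```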

---

## **Vectorization**

Vectorization in NumPy means applying an operation to an entire array at once instead of looping over its elements in Python. Because the looping happens in compiled code, it is significantly faster and more concise than an explicit loop.
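
To make the contrast concrete, the sketch below computes the same squares twice: once with an explicit Python loop and once with a single array expression. The array `data` is illustrative only.

```python
import numpy as np

data = np.arange(5)

# Explicit Python loop: one element at a time
squared_loop = np.array([x ** 2 for x in data])

# Vectorized: the whole array in a single expression
squared_vec = data ** 2

print(squared_vec)   # [ 0  1  4  9 16]
```

Both give the same values; the vectorized form is shorter and, on large arrays, typically much faster.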

In [None]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

**Note:**
- 1d np array --> vector
- 2d np array --> matrix
- 3d onwards --> tensors

In [None]:
def random_operation(x):
    if x % 2 == 0:
        x += 2
    else:
        x -= 2

    return x

In [None]:
random_operation(a)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The function fails because `x % 2 == 0` produces a whole boolean array, while an `if` statement needs a single `True` or `False`.

In [None]:
cool_operation = np.vectorize(random_operation)

In [None]:
type(cool_operation)

numpy.vectorize

#### `np.vectorize()`

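- It is a general-purpose wrapper for applying a scalar function to arrays.
- It takes a function and returns a callable object that accepts an array and applies the function to each element.
- It is provided mainly for convenience, not for performance: internally it is essentially a loop.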

In [None]:
cool_operation(a)

array([ 2, -1,  4,  1,  6,  3,  8,  5, 10,  7])
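
For this particular function, the same result can be obtained without `np.vectorize` by expressing the `if`/`else` as an `np.where` call, which keeps all the work inside NumPy. This is a sketch of one alternative, not the only way to write it:

```python
import numpy as np

a = np.arange(10)

# Even elements get +2, odd elements get -2, all in one array expression
result = np.where(a % 2 == 0, a + 2, a - 2)
print(result)   # [ 2 -1  4  1  6  3  8  5 10  7]
```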

---

## **Broadcasting**

Broadcasting in NumPy is the automatic and implicit extension of array dimensions to enable element-wise operations between arrays with different shapes.

![bro.jpg](https://d2beiqkhq929f0.cloudfront.net/public_assets/assets/000/047/364/original/download.jpeg?1694345633)

---

#### **Case 1:** If both arrays have the same shape, the addition is done element-wise.

In [None]:
a = np.tile(np.arange(0,40,10), (3,1))
a

array([[ 0, 10, 20, 30],
       [ 0, 10, 20, 30],
       [ 0, 10, 20, 30]])

**Note:**

* `np.tile(array, reps)` constructs an array by repeating `array` the number of times given by `reps` along each dimension.
* `np.tile(array, (repetition_rows, repetition_cols))`
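
A small sketch of how `reps` shapes the result, using an illustrative 1D array `row`:

```python
import numpy as np

row = np.array([0, 10, 20])

tiled_1d = np.tile(row, 2)          # repeat twice along the only axis
tiled_rows = np.tile(row, (3, 1))   # 3 copies stacked as rows, 1 copy across
tiled_grid = np.tile(row, (2, 2))   # 2 copies down, 2 copies across

print(tiled_1d)          # [ 0 10 20  0 10 20]
print(tiled_rows.shape)  # (3, 3)
print(tiled_grid.shape)  # (2, 6)
```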


In [None]:
a=a.T
a

array([[ 0,  0,  0],
       [10, 10, 10],
       [20, 20, 20],
       [30, 30, 30]])

In [None]:
b = np.tile(np.arange(0,3), (4,1))
b

array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])

In [None]:
print(a.shape, b.shape)

(4, 3) (4, 3)


Since a and b have the same shape, they can be added without any issues.

In [None]:
a+b

array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22],
       [30, 31, 32]])

---

#### **Case 2:** If the right array is 1-D and its length equals the number of columns of the left array, NumPy automatically repeats (tiles) it across every row.

In [None]:
a

array([[ 0,  0,  0],
       [10, 10, 10],
       [20, 20, 20],
       [30, 30, 30]])

In [None]:
c = np.array([0,1,2])
c

array([0, 1, 2])

In [None]:
print(a.shape, c.shape)

(4, 3) (3,)


In [None]:
a + c

array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22],
       [30, 31, 32]])

- `c` was broadcast along the rows (vertically), as if it had been copied into each of the 4 rows,
- so that `a` and `c` become compatible.
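
To visualize what the stretched version of `c` looks like, `np.broadcast_to` can materialize it explicitly. NumPy does not need this call to perform the addition; it is used here only as an illustration.

```python
import numpy as np

c = np.array([0, 1, 2])

# Read-only view of c "stretched" to shape (4, 3); no data is copied
c_stretched = np.broadcast_to(c, (4, 3))
print(c_stretched)
# [[0 1 2]
#  [0 1 2]
#  [0 1 2]
#  [0 1 2]]
```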

---

#### **Case 3:** If the left array is a column matrix (exactly 1 column) and the right array is a row matrix, NumPy tiles both so that element-wise addition is possible.

In [None]:
d = np.array([0,10,20,30]).reshape(4,1)
d

array([[ 0],
       [10],
       [20],
       [30]])

In [None]:
c = np.array([0,1,2])
c

array([0, 1, 2])

In [None]:
print(d.shape, c.shape)

(4, 1) (3,)


In [None]:
d + c

array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22],
       [30, 31, 32]])

- `d` was stretched (broadcast) along the columns (horizontally)
- `c` was stretched (broadcast) along the rows (vertically)
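
Besides `reshape`, a column can also be created by indexing with `np.newaxis`, which inserts an axis of length 1. The sketch below builds the same table from two 1D arrays; the names `tens`, `units`, and `table` are illustrative.

```python
import numpy as np

tens = np.arange(0, 40, 10)     # shape (4,)
units = np.arange(3)            # shape (3,)

col = tens[:, np.newaxis]       # shape (4, 1): a column
table = col + units             # (4, 1) + (3,) -> (4, 3)

print(col.shape, table.shape)   # (4, 1) (4, 3)
```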

---

**Will broadcasting work in this case?**

In [None]:
a = np.arange(8).reshape(2,4)
a

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [None]:
b = np.arange(16).reshape(4,4)
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [None]:
a+b

ValueError: operands could not be broadcast together with shapes (2,4) (4,4) 

#### Broadcasting in 2D Arrays

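- A + A (same shape) -> Works
- A + A (1D, length equal to the number of columns of A) -> Works
- A + number -> Works
- A + A (different 2D shape) -> Works only if, in every dimension, the sizes are equal or one of them is 1; otherwise it DOES NOT WORK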
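
These cases can be checked without building any arrays. The sketch below wraps `np.broadcast_shapes` (assumed available, i.e. NumPy 1.20 or newer) in a hypothetical helper, `can_broadcast`, that reports whether two shapes are compatible.

```python
import numpy as np

def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: True if the two shapes are broadcast-compatible."""
    try:
        np.broadcast_shapes(shape_a, shape_b)
        return True
    except ValueError:
        return False

print(can_broadcast((4, 3), (4, 3)))   # True: same shape
print(can_broadcast((4, 3), (3,)))     # True: (3,) is treated as (1, 3)
print(can_broadcast((4, 1), (1, 3)))   # True: every mismatch involves a 1
print(can_broadcast((2, 4), (4, 4)))   # False: 2 vs 4, neither is 1
```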

**Is broadcasting possible in this case?**

In [None]:
A = np.arange(1,10).reshape(3,3)
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [None]:
B = np.array([-1, 0, 1])
B

array([-1,  0,  1])

In [None]:
A*B

array([[-1,  0,  3],
       [-4,  0,  6],
       [-7,  0,  9]])

Yes! Broadcasting applies to all element-wise operations (`+`, `-`, `*`, `/`, comparisons, and so on), not just addition.
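
As one more illustration with other operators, the sketch below subtracts each column's mean from the same 3x3 array and then compares the array against those means. The names `col_means`, `centered`, and `above_mean` are illustrative.

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)

col_means = A.mean(axis=0)      # shape (3,): [4. 5. 6.]
centered = A - col_means        # (3, 3) - (3,): each column now has mean 0

above_mean = A > col_means      # comparisons broadcast too

print(centered)
print(above_mean)
```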





---