
# Lesson 10: NumPy Arrays II


### Teacher-Student Activities

In the previous lesson, we learnt how to create a NumPy array using different functions. In this lesson, we are going to learn a few of the mathematical operations that can be done on a NumPy array.

At the end of the class, we will compare the performance of a NumPy array with a Python list.

---

#### Activity 1: Array Length

Let's begin the lesson with array length. The length of an array (or array length) is:

- The number of items in a one-dimensional array.

- The number of rows in a two-dimensional array.

- The number of blocks in a three-dimensional array.

To calculate the length of a NumPy array, you can use the `len()` function.

Let's create one array for each of the three dimensions and calculate their lengths using the `len()` function.




In [None]:
# Teacher Action: Create a NumPy array for all the three different dimensions and calculate their lengths.
import numpy as np

one_dim_ar = np.arange(1, 25) # Creates a one-dimensional NumPy array having 24 items.
two_dim_ar = np.arange(1, 29).reshape(4, 7) # Creates a two-dimensional NumPy array having 4 rows and 7 columns. No. of items = 28
three_dim_ar = np.arange(1, 31).reshape(2, 5, 3) # Creates a three-dimensional array having 2 blocks, 5 rows and 3 columns. No. of items = 30
print("Number of items in one_dim_ar:", len(one_dim_ar))
print("Number of rows in two_dim_ar:", len(two_dim_ar))
print("Number of blocks in three_dim_ar:", len(three_dim_ar))

Number of items in one_dim_ar: 24
Number of rows in two_dim_ar: 4
Number of blocks in three_dim_ar: 2


**Note:** You do not have to create the arrays with the exact same items. You can choose to have different items in the arrays.

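As you can see, for a multi-dimensional NumPy array the `len()` function does not return the total number of items. It returns only the size of the first dimension: the number of rows for a two-dimensional array and the number of blocks for a three-dimensional array.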
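A brief sketch of why this happens, assuming the `shape` attribute (which this lesson has not used so far): `len()` reports the first entry of an array's `shape`. The array names below are illustrative only.

```python
import numpy as np

# Hypothetical example arrays, built the same way as in Activity 1.
a1 = np.arange(1, 25)                    # shape (24,)
a2 = np.arange(1, 29).reshape(4, 7)      # shape (4, 7)
a3 = np.arange(1, 31).reshape(2, 5, 3)   # shape (2, 5, 3)

for arr in (a1, a2, a3):
    # len() always equals the first entry of the shape tuple.
    print(arr.shape, len(arr), len(arr) == arr.shape[0])
```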

---

#### Activity 2: The `size` Attribute

To find the number of items in a NumPy array, you can use the `size` attribute. Regardless of the number of dimensions, `size` always returns the total number of items in the array.

In [None]:
# Student Action: Compute the number of items in the 'one_dim_ar', 'two_dim_ar' and 'three_dim_ar' NumPy arrays.
print("the size of one dimensional array is:",one_dim_ar.size)
print("the size of two dimensional array is:",two_dim_ar.size)
print("the size of three dimensional array is:",three_dim_ar.size)

the size of one dimensional array is: 24
the size of two dimensional array is: 28
the size of three dimensional array is: 30


**Note:** The `size` attribute is not available for a Python list.
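A small sketch of this difference, using a hypothetical list called `my_items`: a list reports its length through `len()`, while asking it for `.size` raises an `AttributeError`.

```python
import numpy as np

my_items = [10, 20, 30, 40]                   # hypothetical Python list
my_array = np.array(my_items).reshape(2, 2)   # the same items as a 2 x 2 array

print(len(my_items))               # number of items in the list
print(hasattr(my_items, "size"))   # a list has no 'size' attribute
print(my_array.size)               # total number of items in the array

try:
    my_items.size
except AttributeError as err:
    print("AttributeError:", err)
```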

---

#### Activity 3: Descriptive Statistics

Using NumPy arrays, we can easily do some statistical calculations.

Suppose you are a smartphone retailer and you have a few smartphones in your inventory.

|Smartphone Model|Price (INR)|Units Available|
|-|-|-|
|Samsung Galaxy M30S|13999|9|
|Realme C2|6298|8|
|Xiaomi Redmi Note 7 Pro|10999|9|
|Xiaomi Redmi Note 8 Pro|14999|9|
|Realme C2 32GB|7298|8|
|Realme C2 2GB RAM|6385|8|
|Realme 5|8999|9|
|Xiaomi Redmi Note 7S 64GB|9999|6|
|Xiaomi Redmi Note 8|9999|5|
|Vivo Z1 Pro|13868|7|

Suppose you decided to do some analysis of your inventory. In the process, you want to find answers to the following questions:

1. What is the total monetary value of the inventory?

2. What is the average (or mean) price of a smartphone?

3. What is the price of the cheapest smartphone in the inventory?

4. What is the price of the most expensive smartphone in the inventory?

5. What is the median price of a smartphone?

6. What is the most commonly occurring price of a smartphone?

You can answer the first five questions in a few seconds by creating NumPy arrays and applying the `sum()`, `mean()`, `min()`, `max()` and `median()` functions. The sixth question needs a little more work, as we will see.

**Note**: The median value is the middle value in an array when the values are arranged in increasing order. Consider the five numbers `6, 1, 5, 32` and `13`.

**How to find the median value?**

To find the median value, follow two steps:

1. First, arrange all the numbers in the increasing order, i.e., `1, 5, 6, 13, 32`.

2. Look for the middle value which in this case is `6`. So, the required median value is `6`.

In general, let $n$ be the number of numbers in a set.

1. If $n$ is odd, the median value lies at the $\left(\frac{n + 1}{2}\right)^{th}$ position after arranging the numbers in ascending order.

2. If $n$ is even, the median value is the mean (or average) of the values at the $\left(\frac{n}{2}\right)^{th}$ and $\left(\frac{n}{2} + 1\right)^{th}$ positions after arranging the numbers in ascending order.

Let's say we want to find the median of the numbers `34, 12, 8, 7, 21, 19`.

1. First, arrange the numbers in increasing order, i.e., `7, 8, 12, 19, 21, 34`.

2. There are 6 numbers, so the middle values are `12` and `19`. Their mean (or average) is $\frac{12+19}{2} = 15.5$. So, the required median value is `15.5`.
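The position rule above can be sketched as a small function. This is only an illustration: `median_by_position` is a hypothetical helper name, and in practice NumPy's `median()` function does the same job.

```python
import numpy as np

def median_by_position(values):
    """Return the median using the odd/even position rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th value; subtract 1 for 0-based indexing.
        return ordered[(n + 1) // 2 - 1]
    # Even n: mean of the (n / 2)-th and (n / 2 + 1)-th values.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(median_by_position([6, 1, 5, 32, 13]))        # 6
print(median_by_position([34, 12, 8, 7, 21, 19]))   # 15.5
```

Both results agree with the two worked examples above.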

Let's first create a NumPy array for the phone data given above, then find the answers to these questions one-by-one.


In [None]:
# Student Action: Create two NumPy arrays: one for the smartphone prices and another for the number of units available.
import numpy as np
prices = np.array([13999, 6298, 10999, 14999, 7298, 6385, 8999, 9999, 9999, 13868])
units_available = np.array([9, 8, 9, 9, 8, 8, 9, 6, 5, 7])


Now, let's answer the first question. To find the total monetary value of the inventory, you have to multiply each smartphone price by its corresponding number of units available and then add up all the products.

Let the total monetary value be $M$, the price of a smartphone be $p$, the number of units available be $u$ and the number of smartphone varieties be $n$. Then

$M = p_1 \times u_1 + p_2 \times u_2 + p_3 \times u_3 + \dots + p_n \times u_n$

Therefore, we have to multiply the `prices` array values by the `units_available` array values to get a new array containing the total price for each smartphone. Then, using the `sum()` function, we will add all the values of the new array.

In [None]:
# Teacher Action: Compute the total monetary value of the inventory.

# Create an array containing total price for each smartphone.
total_price_for_each_smartphone = prices * units_available
print(total_price_for_each_smartphone)

# Calculate the total monetary value.
total_monetary_value = np.sum(total_price_for_each_smartphone)
total_monetary_value

[125991  50384  98991 134991  58384  51080  80991  59994  49995  97076]


807877

**Note:** Unlike NumPy arrays, two Python lists cannot be added, subtracted, multiplied or divided item by item.

Now, using the `mean()` function, compute the average price of a smartphone.

In [None]:
# Student Action: Compute the average price of a smartphone.
mean_price = np.mean(prices)
print(mean_price)

10284.3


Now, using the `min()` function, compute the price of the cheapest smartphone.

In [None]:
# Student Action: Using the 'min()' function, compute the lowest price of a smartphone.
min_price = np.min(prices)
min_price

6298

Now, using the `max()` function, compute the price of the most expensive smartphone.

In [None]:
# Student Action: Using the 'max()' function, compute the highest price of a smartphone.
max_price = np.max(prices)
max_price

14999

Now, using the `median()` function, compute the median price of a smartphone.

In [None]:
# Student Action: Using the 'median()' function, compute the median price of a smartphone.
median_price = np.median(prices)
median_price

9999.0

Now, let's compute the most commonly occurring price of a smartphone. If you look at the dataset, the most commonly occurring price is `9999` because it occurs twice. The rest of the prices occur only once.

The value which occurs the most number of times is called the **modal** value or simply **mode**.

Unfortunately, the `numpy` module does not have a dedicated function to calculate the modal value. So, we can either create our own function, which takes a few more steps, or use the `mode()` function from the `scipy` library.

For the time being we will choose the second option. At the end of the class we will create our own version of the `mode()` function.

In the `scipy` library, there is a module called `stats` which contains the `mode()` function. So we have to import the `stats` module from the `scipy` library.

In [None]:
# Teacher Action: Compute the modal value using the 'mode()' function from the 'scipy.stats' module.
from scipy import stats
stats.mode(prices)

ModeResult(mode=array([9999]), count=array([2]))

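In the output, you can see that `9999` is the modal value and it occurs twice in the `prices` array. Depending on the installed SciPy version, the two values may be displayed as plain numbers instead of one-element arrays.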

**Note:** `from library_name import module_name` is another way of importing a module. It is also a standard practice.
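As a small illustration of the two import styles, here is a sketch that uses the standard-library `os` package instead of `scipy`, so that it runs without any extra installation. The file path in it is made up for the example.

```python
# Two equivalent ways of getting at the same standard-library module.
import os.path           # access the module as os.path
from os import path      # access the module as path

print(path is os.path)   # both names refer to the same module object
print(path.basename("lessons/lesson_10.ipynb"))   # hypothetical file path
```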

---

#### Activity 4: A Few More Operations On A NumPy Array^^

Performing mathematical operations on a NumPy array is easier compared to a Python list.

Let's say you have a NumPy array with the radii of 20 circles and want to compute the area of every circle. You can simply use the double-asterisk (`**`) operator on the NumPy array to square the values and then multiply the result by `pi`.

**Note:** The area of a circle with radius $r$ is $\pi r^{2}$.

In [None]:
# Teacher Action: Square the values in a numpy array.
import random
import numpy as np

# 1. First create a Python list having the radii of 20 circles, where each radius is a random number from 1 to 10.
radii = [random.randint(1, 10) for i in range(20)]
print(radii)

# 2. Convert the list into a NumPy array using the 'array()' function.
np_radii = np.array(radii)
print(np_radii)

# 3. Square the elements of the NumPy array using the exponent (**) operator, then multiply by 'pi' to get the areas. You can use the 'np.pi' constant to get the value of 'pi'.
np_area_circles = np.pi * (np_radii ** 2)
np_area_circles

[7, 3, 1, 9, 1, 9, 4, 1, 2, 5, 7, 9, 8, 8, 1, 1, 2, 3, 5, 8]
[7 3 1 9 1 9 4 1 2 5 7 9 8 8 1 1 2 3 5 8]


array([153.93804003,  28.27433388,   3.14159265, 254.46900494,
         3.14159265, 254.46900494,  50.26548246,   3.14159265,
        12.56637061,  78.53981634, 153.93804003, 254.46900494,
       201.06192983, 201.06192983,   3.14159265,   3.14159265,
        12.56637061,  28.27433388,  78.53981634, 201.06192983])

Notice that when you print a NumPy array (in this case `np_radii`) using the `print()` function, the items are not separated by commas in the output. This is simply how NumPy formats arrays for display; it does not affect the values stored in the array.

If you try to square the radii values stored in a Python list using the same process, Python will raise a `TypeError`.

In [None]:
# Teacher Action: Directly apply the exponent (**) operator on a Python list.
radii ** 2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

Even if you simply multiply a list containing numeric values by a floating-point number, Python will raise a `TypeError`.

In [None]:
# Teacher Action: Directly multiply a Python list with a number.
3.14 * radii

TypeError: can't multiply sequence by non-int of type 'float'
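The reason lists behave this way is that the `+` and `*` operators mean something different for a list than for an array. Here is a minimal sketch with hypothetical names (`radii_list`, `radii_array`) and made-up values:

```python
import numpy as np

radii_list = [1, 2, 3]              # hypothetical radii
radii_array = np.array(radii_list)

# Lists: '+' joins two lists and '* int' repeats the list.
print(radii_list + radii_list)      # [1, 2, 3, 1, 2, 3]
print(radii_list * 2)               # [1, 2, 3, 1, 2, 3]

# Arrays: the same operators work item by item.
print(radii_array + radii_array)    # [2 4 6]
print(radii_array * 2)              # [2 4 6]
```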

To find the area of the circles whose radii are stored in a Python list, you will have to use a loop.

In [None]:
# Teacher Action: Compute the area of each circle by looping through the Python list.
for radius in radii:
  print(np.pi * (radius ** 2))

153.93804002589985
28.274333882308138
3.141592653589793
254.46900494077323
3.141592653589793
254.46900494077323
50.26548245743669
3.141592653589793
12.566370614359172
78.53981633974483
153.93804002589985
254.46900494077323
201.06192982974676
201.06192982974676
3.141592653589793
3.141592653589793
12.566370614359172
28.274333882308138
78.53981633974483
201.06192982974676


Now, using the same approach, you create two NumPy arrays: one having radii (numbers from `1` to `10`) of `10` cylinders and another having their corresponding heights (numbers from `11` to `20`).

**Note:** The volume of a cylinder is $\pi r^{2}h$, where $h$ is the height of the cylinder and $r$ is the radius of the cylinder.

In [None]:
# Student Action: Create two NumPy arrays: one having the radii of 10 cylinders and another having their corresponding heights.
# Compute the volumes of the 10 cylinders by multiplying the NumPy arrays and store the new NumPy array in a new variable.
import numpy as np

radii = np.arange(1, 11)     # Radii from 1 to 10
heights = np.arange(11, 21)  # Heights from 11 to 20
volume = np.pi * (radii ** 2) * heights
print(volume)


[  34.55751919  150.79644737  367.56634047  703.7167544  1178.0972451
 1809.55736847 2616.94668044 3619.11473694 4834.91109387 6283.18530718]


So, here we got an array containing the volumes of the corresponding cylinders.

---

#### Activity 5: Python List And NumPy Array Performance Comparison^

As we discussed earlier, operations on a NumPy array generally take less time than the same operations on a Python list. The difference is most significant when the lists and arrays have thousands of items or more.

Let's first create a Python list and a NumPy array both having 100 thousand (or 1 lakh) items. Then let's compute how much time (in seconds) is taken to create the list and the array.

In [None]:
# Student Action: Run the code shown below to see that NumPy arrays are faster than Python lists.
# 1. Import the 'numpy' and 'time' modules.
import numpy as np
import time

# 2. Record the time taken to create a Python list having 1 lakh items.
list_start = time.time()
my_list = [i for i in range(1, 100001)]
list_end = time.time()
list_time = list_end - list_start
print("Time taken to create a Python list:", list_time, "seconds")

# 3. Record the time taken to create a NumPy array having 1 lakh items.
array_start = time.time()
array_1 = np.arange(1, 100001)
array_end = time.time()
array_time = array_end - array_start
print("Time taken to create a NumPy array:", array_time, "seconds")

# 4. Compare the two timings using floor division.
print("Creating the NumPy array was", list_time // array_time, "times faster than creating the Python list of the same size")

Time taken to create a Python list: 0.008270740509033203 seconds
Time taken to create a NumPy array: 0.00022602081298828125 seconds
Creating the NumPy array was 36.0 times faster than creating the Python list of the same size


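If you run the above code several times, you will see that creating the NumPy array is almost always much faster than creating the Python list. The exact timings, and therefore the ratio, will vary from run to run and from machine to machine.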
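The comparison above measures only how long it takes to create the list and the array. As a rough sketch of how an operation on the items could be timed as well, the code below uses Python's built-in `timeit` module, which the lesson itself does not use. The function names and the choice of doubling every item are assumptions made for illustration, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

n_items = 100_000
numbers_list = list(range(1, n_items + 1))
numbers_array = np.arange(1, n_items + 1)

def double_list():
    # A list needs an explicit loop (here, a list comprehension).
    return [x * 2 for x in numbers_list]

def double_array():
    # An array is doubled item by item in a single expression.
    return numbers_array * 2

# Run each function 20 times and report the total time in seconds.
list_time = timeit.timeit(double_list, number=20)
array_time = timeit.timeit(double_array, number=20)

print("List :", list_time, "seconds")
print("Array:", array_time, "seconds")
```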

---

#### Activity 6: The User-Defined `mode()` Function^^^

Now let's create our own version of the `mode()` function. It should take a one-dimensional NumPy array (`input_array`) as an input and should return a pair of the modal value and its count as an output.

To create this function:

1. First we will create an empty Python list to store the count of every item in the `input_array`.

  ```
  counts_list = []
  ```

2. Next, we will convert the `input_array` to a Python list and store it in the `input_list` variable.

  ```
  input_list = list(input_array)
  ```

3. We will iterate through each item in the `input_list` and count how many times it occurs using the `count()` function. Then we will add each count to the `counts_list` using the `append()` function.

  ```
  for item in input_list:
      item_count = input_list.count(item)
      counts_list.append(item_count)
  ```

4. Then we will convert the `counts_list` into a NumPy array.

  ```
  counts_array = np.array(counts_list)
  ```

5. Next, we will compute the maximum count value in the `counts_array` using the `np.max()` function.

  ```
  max_count = np.max(counts_array)
  ```

6. Then, we will find the index of the `max_count` value in the `counts_list` using the `index()` function.

  ```
  max_count_index = counts_list.index(max_count)
  ```

7. Finally, we will find the modal value using the list indexing method.
  
  ```
  mode = input_list[max_count_index]
  ```

**Note:** There could be other ways to create the `mode()` function. You are free to explore them in your own time.

In [None]:
# Student Action: Create the 'mode()' function which takes a 1D NumPy array as an input and returns the modal value and its count as an output.
def mode(input_1d_ar):
  counts_list = []
  input_list = list(input_1d_ar)
  # Count how many times each item occurs in the list.
  for item in input_list:
    item_count = input_list.count(item)
    counts_list.append(item_count)
  counts_array = np.array(counts_list)
  max_count = np.max(counts_array)
  # Find the index of the first item having the highest count.
  max_count_index = counts_list.index(max_count)
  modal_value = input_list[max_count_index]
  return modal_value, max_count

mode(prices)


(9999, 2)

So, in this way, we can create the `mode()` function which takes a one-dimensional array as an input and returns a pair of the modal value and its count as an output.
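As one example of another way to do this, here is a sketch based on NumPy's `unique()` function with `return_counts=True`, which returns the distinct values together with how often each one occurs. The names `mode_with_unique` and `sample_prices` are hypothetical. Note one difference from the function above: if several values share the highest count, this version returns the smallest of them, not the one that appears first in the array.

```python
import numpy as np

def mode_with_unique(input_array):
    """Return (modal value, count) for a 1D array using np.unique."""
    # 'values' holds the sorted distinct items, 'counts' their frequencies.
    values, counts = np.unique(input_array, return_counts=True)
    best = np.argmax(counts)   # index of the highest count
    return values[best], counts[best]

# Prices taken from the smartphone table in Activity 3.
sample_prices = np.array([13999, 6298, 10999, 14999, 7298,
                          6385, 8999, 9999, 9999, 13868])
print(mode_with_unique(sample_prices))
```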

Now, we are well equipped to build the actual Mind Reader game algorithm. We will create it in the next class.

---