###  **------------------------------------------------- Advanced Numpy -----------------------------------------------------**

Topics covered in this notebook:
1. Broadcasting
2. Masking
3. Fancy Indexing
4. Combined Indexing 
    - Indexing with slicing
    - Indexing with Masking
5. Structured array properties

In [2]:
import numpy as np

In [2]:
arr1 = np.array([11,21,31,41,49])
arr1

array([11, 21, 31, 41, 49])

In [3]:
arr2 = np.array([[1,9,21],[29,41,51],[58,72,8]])
arr2

array([[ 1,  9, 21],
       [29, 41, 51],
       [58, 72,  8]])

In [4]:
arr3 = np.array([[15,7,10,-11,20,75],[-3,9,24,1,100,-14]])
arr3

array([[ 15,   7,  10, -11,  20,  75],
       [ -3,   9,  24,   1, 100, -14]])

In [5]:
arr1.ndim, arr2.ndim, arr3.ndim

(1, 2, 2)

In [6]:
arr1.sum()

153

In [7]:
arr2.sum()

290

In [8]:
arr3.sum()

233

In [9]:
arr2.sum(axis = 0)  # axis=0 collapses the rows: one sum per column

array([ 88, 122,  80])

In [10]:
arr2.sum(axis = 1)  # axis=1 collapses the columns: one sum per row

array([ 31, 121, 138])

In [11]:
arr3.sum(axis = 0)

array([ 12,  16,  34, -10, 120,  61])

In [12]:
arr3.sum(axis = 1)

array([116, 117])

In [13]:
np.average(arr1)

30.6

In [14]:
np.average(arr2,axis = 0)

array([29.33333333, 40.66666667, 26.66666667])

In [15]:
np.average(arr2,axis = 1)

array([10.33333333, 40.33333333, 46.        ])

In [16]:
np.average(arr3,axis = 0)

array([ 6. ,  8. , 17. , -5. , 60. , 30.5])

In [17]:
np.average(arr3,axis = 1)

array([19.33333333, 19.5       ])

In [18]:
arr2.min()

1

In [19]:
arr1.min()

11

In [20]:
arr3.max()

100

In [21]:
arr3.max(axis=0)

array([ 15,   9,  24,   1, 100,  75])

In [22]:
arr3.max(axis=1)

array([ 75, 100])

In [23]:
arr2.mean()

32.22222222222222

#### *Computation on Arrays: Broadcasting*

The term broadcasting refers to how NumPy treats arrays of different shapes during arithmetic operations. Subject to certain constraints, the smaller array is broadcast across the larger array so that the two have compatible shapes.

Broadcasting provides a means of vectorising array operations so that looping occurs in C rather than in Python, since NumPy is implemented in C.

It does this without making unnecessary copies of the data, which usually leads to efficient algorithm implementations.

In some cases, broadcasting is a bad idea: when the broadcast result is very large, it uses memory inefficiently and slows the computation down.
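
One way to see that broadcasting avoids copying the input is `np.broadcast_to`, which returns a read-only view whose repeated axis has a stride of 0. The sketch below is only an illustration: the names `row`, `tiled`, `col` and `outer` are not part of this notebook, and the dtype is fixed to `int64` so that the byte counts are predictable.

```python
import numpy as np

row = np.array([1, 3, 7, 5], dtype=np.int64)

# Broadcast the 1-D array to 3 rows without copying its data.
tiled = np.broadcast_to(row, (3, 4))

print(tiled.shape)                   # (3, 4)
print(tiled.strides)                 # (0, 8): moving down a row advances 0 bytes
print(np.shares_memory(row, tiled))  # True

# The *result* of an arithmetic operation is a new, full-size array.
# A (1000, 1) column combined with a (1000,) row allocates 1000 x 1000 values.
col = np.arange(1000, dtype=np.int64)[:, np.newaxis]
outer = col + np.arange(1000, dtype=np.int64)
print(outer.shape, outer.nbytes)     # (1000, 1000) 8000000
```

The second half shows where the memory cost comes from: the inputs hold 2,000 values between them, but the output holds a million.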

In [24]:
a1 = np.array([1,3,7,5])
b1 = np.array([90,50,0,30])

In [26]:
c1 = a1 * b1
c1

array([ 90, 150,   0, 150])

In [27]:
d1 = 3

In [28]:
e1 = a1 * d1
e1

array([ 3,  9, 21, 15])

In [29]:
f1 = a1 + d1
f1

array([ 4,  6, 10,  8])

Broadcasting Rules:
• The following rules determine whether two arrays can be broadcast together:
1. If the arrays do not have the same rank, prepend 1s to the shape of the lower-rank array until both shapes have the same length
2. Two arrays are compatible in a dimension if they have the same size in that dimension, or if one of them has size 1 in that dimension
3. The arrays can be broadcast together if they are compatible in all dimensions
4. After broadcasting, each array behaves as if its shape were the element-wise maximum of the shapes of the two input arrays
5. In any dimension where one array has size 1 and the other has size greater than 1, the first array behaves as if it were copied along that dimension
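
The rules above can be checked on a small pair of arrays. This is a minimal sketch with made-up arrays `a` and `b` (not the ones used in the surrounding cells): one pair of shapes that broadcasts and one that does not.

```python
import numpy as np

a = np.ones((3, 1))      # shape (3, 1)
b = np.arange(4)         # shape (4,), treated as (1, 4) by rule 1

# Rules 2-4: sizes 3 vs 1 and 1 vs 4 are compatible, so the result is (3, 4).
print((a + b).shape)             # (3, 4)
print(np.broadcast(a, b).shape)  # (3, 4), computed without doing the sum

# Shapes (3,) and (4,) fail rule 2: the sizes differ and neither is 1.
try:
    np.ones(3) + np.ones(4)
except ValueError as err:
    print("cannot broadcast:", err)
```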

In [30]:
a1 =np.array([[12,23,34],[11,21,31]])
a1

array([[12, 23, 34],
       [11, 21, 31]])

In [31]:
b1 = 4
b1

4

In [32]:
c1 = a1 + b1
c1

array([[16, 27, 38],
       [15, 25, 35]])

In [35]:
a1= np.array([1,3,5,7])
d1= np.array([1,5,5,3])

In [36]:
e1= d1*a1
e1

array([ 1, 15, 25, 21])

#### *random function*

A two-dimensional array of random integers is generated below with 5 rows and 8 columns; the values lie between 10 (inclusive) and 50 (exclusive).

In [37]:
arr_random = np.random.randint(10, 50, size = (5, 8))
arr_random

array([[36, 25, 18, 26, 42, 24, 37, 31],
       [12, 40, 28, 26, 27, 37, 34, 23],
       [24, 27, 35, 31, 23, 39, 43, 44],
       [45, 19, 35, 48, 22, 16, 33, 28],
       [26, 20, 35, 37, 15, 10, 43, 27]])

In [38]:
arr2_rand = np.random.randint(1, 20, size = (2, 3, 6))
arr2_rand

array([[[16, 17, 14,  3,  2, 15],
        [ 5, 10,  6, 14,  4, 10],
        [ 5,  7,  9, 12,  2, 11]],

       [[ 2, 12,  6, 11,  4,  3],
        [ 4, 11, 14, 18, 11, 13],
        [11, 14, 18, 18, 17,  9]]])

In [39]:
arr2_rand.ndim

3
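
The bounds passed to `np.random.randint` are half-open: the lower bound is included and the upper bound is excluded. The sketch below checks this using NumPy's newer `default_rng` generator. The seed 42 and the name `sample` are arbitrary choices made here so that the run is reproducible.

```python
import numpy as np

rng = np.random.default_rng(42)  # the seed is an arbitrary choice

# integers() uses the same half-open interval as randint: [10, 50)
sample = rng.integers(10, 50, size=(5, 8))

print(sample.shape)        # (5, 8)
print(sample.min() >= 10)  # True
print(sample.max() < 50)   # True
```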

In [3]:
a = np.array([0,2,3,0,1,6,5,2])
a

array([0, 2, 3, 0, 1, 6, 5, 2])

In [4]:
np.greater(a,2)

array([False, False,  True, False, False,  True,  True, False])

In [5]:
np.greater_equal(a,2)

array([False,  True,  True, False, False,  True,  True,  True])

In [6]:
np.less(a,2)

array([ True, False, False,  True,  True, False, False, False])

In [7]:
np.less_equal(a,2)

array([ True,  True, False,  True,  True, False, False,  True])

In [8]:
a =np.reshape(np.arange(25),(5,5))
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [9]:
greater_values = (a > 10)
greater_values

array([[False, False, False, False, False],
       [False, False, False, False, False],
       [False,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True]])
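
A Boolean array such as `greater_values` can be used directly as an index, which is the usual meaning of Boolean masking. The following sketch rebuilds the same 5 x 5 array so that it runs on its own; the combined condition at the end is an extra illustration, not something taken from the cells above.

```python
import numpy as np

a = np.reshape(np.arange(25), (5, 5))
greater_values = a > 10

# Indexing with a Boolean array returns the selected values as a 1-D array.
selected = a[greater_values]
print(selected)                          # the values 11 through 24
print(np.count_nonzero(greater_values))  # 14

# Conditions combine with & (and), | (or) and ~ (not); parentheses are required.
even_and_large = a[(a > 10) & (a % 2 == 0)]
print(even_and_large)                    # [12 14 16 18 20 22 24]
```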

#### *Masking in Numpy*

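1. The numpy.ma.mask_rows() function masks every row of a 2-dimensional array that contains a masked value.
2. It is a shortcut for mask_rowcols with axis equal to 0.
3. The example below creates the masked array itself with masked_equal, which masks every element equal to a given value.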

In [1]:
import numpy as np
import numpy.ma as MA

In [2]:
array = np.zeros((4,4),dtype = int)
array

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])

In [3]:
array[2,2] = 1
array

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 0]])

In [4]:
array1 = MA.masked_equal(array,1)
array1

masked_array(
  data=[[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, --, 0],
        [0, 0, 0, 0]],
  mask=[[False, False, False, False],
        [False, False, False, False],
        [False, False,  True, False],
        [False, False, False, False]],
  fill_value=1)
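
The `mask_rows` function described at the start of this section can be applied to a masked array like the one above. This is a small sketch that rebuilds the array under the hypothetical names `grid`, `masked` and `rows_masked`.

```python
import numpy as np
import numpy.ma as MA

grid = np.zeros((4, 4), dtype=int)
grid[2, 2] = 1
masked = MA.masked_equal(grid, 1)   # only element (2, 2) is masked

# mask_rows masks every row that contains at least one masked value.
rows_masked = MA.mask_rows(masked)

print(rows_masked.mask[2].all())    # True: the whole of row 2 is now masked
print(rows_masked.mask[0].any())    # False: row 0 is untouched
print(rows_masked.count())          # 12 unmasked elements remain
```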

#### *Fancy Indexing*
- Fancy indexing is like simple indexing, but we pass arrays of indices instead of single scalars
- It lets us quickly access and modify complicated subsets of an array's values

Exploring Fancy Indexing
- Fancy indexing is conceptually simple: we pass an array of indices to access multiple array elements at once
- For instance, consider the array below:

In [3]:
x = np.random.randint(100,size = 10)
x

array([88, 57, 45, 70,  2, 32, 42, 68, 58, 97])

• Suppose we want to access three different elements: x[3], x[7] and x[2]

In [4]:
[x[3], x[7], x[2]]

[70, 68, 45]

In [5]:
# the indices can also be collected in a list
indices = [3,7,4]
indices

[3, 7, 4]

In [6]:
# index the array with the list of indices
x[indices]

array([70, 68,  2])

In [7]:
indices2 = np.array([[3,4],[5,7]])
indices2

array([[3, 4],
       [5, 7]])

In [8]:
x[indices2]

array([[70,  2],
       [32, 68]])

• Fancy indexing also works in multiple dimensions. See the example below:

In [9]:
y = np.arange(12).reshape((3,4))
y

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [24]:
row = np.array([0,1,2])
col = np.array([2,1,3])
row,col

(array([0, 1, 2]), array([2, 1, 3]))

In [26]:
# the indices are paired element by element: y[0, 2], y[1, 1], y[2, 3]
y[row,col]

array([ 2,  5, 11])

• The pairing of indices in fancy indexing follows the broadcasting rules.

For instance, we get a two-dimensional result if we combine a column vector and a row vector within the indices:

In [27]:
y

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [33]:
# turn row into a column vector so that it broadcasts against col
y[row[:,np.newaxis],col]

array([[ 2,  1,  3],
       [ 6,  5,  7],
       [10,  9, 11]])

- With fancy indexing, always remember that the return value reflects the broadcasted shape of the indices, not the shape of the array being indexed
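
The point about result shapes can be verified directly. The sketch below repeats the `y`, `row` and `col` arrays from the cells above and compares the shape of each result with the broadcast shape of its index arrays; `paired` and `grid` are names introduced here for illustration.

```python
import numpy as np

y = np.arange(12).reshape((3, 4))
row = np.array([0, 1, 2])
col = np.array([2, 1, 3])

paired = y[row, col]               # indices paired element by element
grid = y[row[:, np.newaxis], col]  # (3, 1) index broadcast against (3,)

print(paired.shape)   # (3,)
print(grid.shape)     # (3, 3)

# The result shape equals the broadcast shape of the index arrays.
print(np.broadcast(row[:, np.newaxis], col).shape == grid.shape)  # True
```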

##### *Combined Indexing*
- Fancy indexing can be combined with the other indexing schemes for more powerful operations.

- We can combine fancy indexing with slicing.

- Fancy indexing can also be combined with masking.

- Together, these indexing options give a very flexible set of operations for accessing and modifying array values.

In [34]:
y

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [35]:
y[2,[2,0,2]]

array([10,  8, 10])

In [36]:
y[1:,[2,0,3]]

array([[ 6,  4,  7],
       [10,  8, 11]])

In [37]:
mask = np.array([0,0,1,0],dtype = bool)
mask

array([False, False,  True, False])

In [38]:
y

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [39]:
y[row[:,np.newaxis],mask]

array([[ 2],
       [ 6],
       [10]])
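
The cells above only read values, but the same index expressions can appear on the left-hand side of an assignment. The following is a minimal sketch of modifying values with fancy indexing; the arrays `x`, `idx` and `counts` are made up for this example. It also shows one caveat: with repeated indices, `+=` is applied only once per position, while `np.add.at` accumulates.

```python
import numpy as np

x = np.arange(10)
idx = np.array([2, 1, 8, 4])

x[idx] = 99          # assign one value to several positions
print(x)             # [ 0 99 99  3 99  5  6  7 99  9]

x[idx] -= 10         # in-place arithmetic on the same positions
print(x)             # [ 0 89 89  3 89  5  6  7 89  9]

# Repeated indices: += touches position 0 once, np.add.at touches it twice.
counts = np.zeros(5, dtype=int)
counts[[0, 0, 2]] += 1
print(counts)        # [1 0 1 0 0]
np.add.at(counts, [0, 0, 2], 1)
print(counts)        # [3 0 2 0 0]
```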

#### *NumPy’s Structured Array*

- NumPy’s structured array is similar to a struct in the C programming language. It is used to group data of different types and sizes.

- A structured array stores its data in named containers called fields. Each field has its own name, type and size. Fields are accessed by name using square brackets; dot notation is available only through a record array (np.recarray).

##### **Structured Array Properties**

• All structs in the array have the same number of fields

• All structs have the same field names

• For instance, consider a structured array of students with fields such as name, year, and cgpa

• Every record in the student array has the same structure, which is defined by the array's dtype. A new field therefore cannot be added to a single record; the fields apply to every record in the array

In [66]:
a1 = np.array([('Anand',2022,8.88),('Anil',2021,9.88) ,('Shubham',2020,8.90)], 
                dtype = [('name',(np.str_,10)),('year',np.int32),('cgpa',np.float64)])
a1

array([('Anand', 2022, 8.88), ('Anil', 2021, 9.88),
       ('Shubham', 2020, 8.9 )],
      dtype=[('name', '<U10'), ('year', '<i4'), ('cgpa', '<f8')])

In [44]:
a1[0]

('Anand', 2022, 8.88)

In [46]:
a1[0][0]

'Anand'
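
Besides positional access, a whole field can be read by name. The sketch below rebuilds the same data under the hypothetical name `students` and also shows dot notation, which works only after viewing the array as a record array.

```python
import numpy as np

students = np.array(
    [('Anand', 2022, 8.88), ('Anil', 2021, 9.88), ('Shubham', 2020, 8.90)],
    dtype=[('name', (np.str_, 10)), ('year', np.int32), ('cgpa', np.float64)],
)

print(students['name'])      # ['Anand' 'Anil' 'Shubham']
print(students[0]['year'])   # 2022
print(students.dtype.names)  # ('name', 'year', 'cgpa')

# Attribute (dot) access needs a record-array view of the same data.
records = students.view(np.recarray)
print(records.cgpa.mean())
```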

In [50]:
a1['name']=='Shubham'

array([False, False,  True])

In [51]:
a1[a1['name']=='Shubham']

array([('Shubham', 2020, 8.9)],
      dtype=[('name', '<U10'), ('year', '<i4'), ('cgpa', '<f8')])

In [54]:
b = np.sort(a1,order='name')
b

array([('Anand', 2022, 8.88), ('Anil', 2021, 9.88),
       ('Shubham', 2020, 8.9 )],
      dtype=[('name', '<U10'), ('year', '<i4'), ('cgpa', '<f8')])

In [55]:
c = np.sort(a1, order = 'year')
c

array([('Shubham', 2020, 8.9 ), ('Anil', 2021, 9.88),
       ('Anand', 2022, 8.88)],
      dtype=[('name', '<U10'), ('year', '<i4'), ('cgpa', '<f8')])

In [56]:
d = np.sort(a1, order = 'cgpa')
d

array([('Anand', 2022, 8.88), ('Shubham', 2020, 8.9 ),
       ('Anil', 2021, 9.88)],
      dtype=[('name', '<U10'), ('year', '<i4'), ('cgpa', '<f8')])

In [63]:
# name in descending order
d_desc = np.sort(a1, order = 'name')[::-1]
d_desc

array([('Shubham', 2020, 8.9 ), ('Anil', 2021, 9.88),
       ('Anand', 2022, 8.88)],
      dtype=[('name', '<U10'), ('year', '<i4'), ('cgpa', '<f8')])
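
Reversing a sorted copy is one way to get descending order. Another option, sketched here under the hypothetical name `students`, is to reverse the positions returned by `np.argsort` on a single field. The `order` argument of `np.sort` also accepts a list of fields for tie-breaking.

```python
import numpy as np

students = np.array(
    [('Anand', 2022, 8.88), ('Anil', 2021, 9.88), ('Shubham', 2020, 8.90)],
    dtype=[('name', (np.str_, 10)), ('year', np.int32), ('cgpa', np.float64)],
)

# argsort gives ascending positions; reversing them yields descending order.
by_cgpa_desc = students[np.argsort(students['cgpa'])[::-1]]
print(by_cgpa_desc['name'])       # ['Anil' 'Shubham' 'Anand']

# order accepts a list of fields: sort by year, break ties by cgpa.
by_year_then_cgpa = np.sort(students, order=['year', 'cgpa'])
print(by_year_then_cgpa['year'])  # [2020 2021 2022]
```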