# NumPy Basics: Arrays and Vectorized Computation


NumPy is a Python library for numerical computing, and it handles much of the mathematical side of data analysis. We can import the NumPy module with <code>import numpy as np</code>. Here <code>np</code> is the alias given to NumPy, through which we can call the functions in this module.

## The NumPy ndarray: A multidimensional array object

First, let us understand what an array is. An array is a container that holds elements of a single data type and supports vectorization. This means that to apply an operation to every element of a list or a tuple we have to write a loop, but with an array we don't need to do that. We can apply the operation to the whole array object at once, and NumPy performs it element by element. \
Let's see a basic difference between a list and an array.

In [3]:
import numpy as np
List=[1,2,3,4,5] #Declaration of a list
#Let's turn it into an array
arr=np.array(List)
arr

array([1, 2, 3, 4, 5])
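
To make the difference concrete, here is a small sketch that doubles every value, first with a plain list and then with an array. The variable names are only illustrative.

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# With a list we need an explicit loop (here a list comprehension)
doubled_list = [v * 2 for v in values]

# With an array the multiplication is applied to every element at once
doubled_arr = np.array(values) * 2

print(doubled_list)  # [2, 4, 6, 8, 10]
print(doubled_arr)   # [ 2  4  6  8 10]

# Note: values * 2 on a list repeats the list instead of doubling it
repeated = values * 2
print(repeated)      # [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
```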

An array is often called an **ndarray** because it is N-dimensional in nature. The array can be one dimensional or multidimensional. \
We can use <code>array.shape</code> to find the shape of an array. Let's see some arrays with different dimensions and shapes.

In [14]:
#This is a one dimensional array
data_1D=[1,3,4,6]
data_1D=np.array(data_1D)

print(data_1D.shape)
data_1D

(4,)


array([1, 3, 4, 6])

In [15]:
#Let us look at a 2D array
data2D=[[1,2,3,4,5],[4,5,6,7,8]]
data2D=np.array(data2D)
print(data2D.shape)
data2D

(2, 5)


array([[1, 2, 3, 4, 5],
       [4, 5, 6, 7, 8]])

In [16]:
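#Another two dimensional array, this time with 3 rows and 4 columns
#(three nested lists give three rows, not three dimensions)
data2D_b=[[1,2,3,4],[4,5,6,7],[6,7,8,9]]

data2D_b=np.array(data2D_b)
print(data2D_b.shape)
data2D_b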

(3, 4)


array([[1, 2, 3, 4],
       [4, 5, 6, 7],
       [6, 7, 8, 9]])

So here we saw some arrays with different dimensions.
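
A truly three dimensional array needs three levels of nesting. The sketch below uses illustrative names and also prints <code>ndim</code>, the attribute that reports the number of dimensions.

```python
import numpy as np

# Two blocks, each with 2 rows and 3 columns
nested = [[[1, 2, 3], [4, 5, 6]],
          [[7, 8, 9], [10, 11, 12]]]
cube = np.array(nested)

print(cube.ndim)   # 3
print(cube.shape)  # (2, 2, 3)
```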

## Creating ndarrays

We saw that we can create ndarrays using <code>np.array(<*object*>)</code>. Here the object can be a list, a tuple, etc. \
There are more ways to create an ndarray:
* <code>np.zeros()</code> - Creates an ndarray of the given shape filled with 0
* <code>np.ones()</code> - Creates an ndarray of the given shape filled with 1
* <code>np.empty()</code> - Creates an ndarray of the given shape without initializing its values, so it holds whatever was already in memory
* <code>np.arange()</code> - Creates a one dimensional ndarray of the numbers from 0 up to, but not including, the given number


Let's look at <code>np.zeros()</code> first.

In [31]:
#Creating 1D array with np.zeros
zeros=np.zeros(4)
print(zeros)
print(zeros.shape)

#Creating 2d arrays
zeros2d=np.zeros((4,3))
print(zeros2d)
print(zeros2d.shape)

#Creating 3d arrays
zeros3d=np.zeros((3,4,5))
print(zeros3d)
print(zeros3d.shape)

[0. 0. 0. 0.]
(4,)
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
(4, 3)
[[[0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]]

 [[0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]]

 [[0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]
  [0. 0. 0. 0. 0.]]]
(3, 4, 5)


Let's do it for <code>np.ones()</code>.

In [32]:
#Creating 1D array with np.ones
ones=np.ones(4)
print(ones)
print(ones.shape)

#Creating 2d arrays
ones2d=np.ones((4,3))
print(ones2d)
print(ones2d.shape)

#Creating 3d arrays
ones3d=np.ones((3,4,5))
print(ones3d)
print(ones3d.shape)

[1. 1. 1. 1.]
(4,)
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
(4, 3)
[[[1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]]

 [[1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]]

 [[1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]]]
(3, 4, 5)
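
The other two functions from the list, <code>np.empty()</code> and <code>np.arange()</code>, can be sketched in the same way. The variable names are illustrative, and the contents of the empty array are not printed because they are unpredictable.

```python
import numpy as np

# np.empty only allocates memory; the values are whatever was there before
empty = np.empty((2, 3))
print(empty.shape)  # (2, 3)

# np.arange(stop) counts from 0 up to, but not including, stop
counts = np.arange(5)
print(counts)       # [0 1 2 3 4]

# np.arange(start, stop, step) works like Python's range
stepped = np.arange(2, 10, 3)
print(stepped)      # [2 5 8]
```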


## Data Types for ndarrays

NumPy has different **dtypes** such as **float64**, **int32**, etc. <br>
The dtype tells NumPy how the chunk of memory behind the ndarray should be interpreted.

![dtypes in NumPy](\n1.png)

The number in each dtype name is the number of bits of memory that one element takes up. <br>
Let's declare an array with its dtype. <br>
First with <code>dtype=np.float64</code> and then with <code>dtype=np.int32</code>.

In [35]:
arr1=np.array([1,2,3],dtype=np.float64)
arr1

array([1., 2., 3.])

In [36]:
arr2=np.array([1,2,3],dtype=np.int32)
arr2

array([1, 2, 3])
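
The link between the dtype name and memory use can be checked with the <code>itemsize</code> and <code>nbytes</code> attributes. This is a small sketch with illustrative variable names.

```python
import numpy as np

arr_f64 = np.array([1, 2, 3], dtype=np.float64)
arr_i32 = np.array([1, 2, 3], dtype=np.int32)

# itemsize is the size of one element in bytes (bits / 8)
print(arr_f64.itemsize)  # 8
print(arr_i32.itemsize)  # 4

# nbytes is the total size of the data, itemsize * number of elements
print(arr_f64.nbytes)    # 24
print(arr_i32.nbytes)    # 12
```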

## Type casting in Data types

We can convert between data types implicitly or explicitly. Let's first see an implicit conversion, which NumPy performs on its own when arrays of different dtypes are combined.

In [41]:
arr1=np.array([1,2,3],dtype=np.float64)
arr2=np.array([3,2,1],dtype=np.int32)
arr3=arr2+arr1
arr3.dtype

dtype('float64')

In the above example **arr1** and **arr2** had the <code>dtype</code> <code>float64</code> and <code>int32</code> respectively, but when they were added the resulting **arr3** had the dtype <code>float64</code>, because NumPy upcasts to the type that can hold both.

Now let us do explicit type casting of an ndarray. For this we will use the syntax <code><*array*>.astype(<*dtype*>)</code>.

In [49]:
arr1=np.array([2,3,4,5],dtype=np.int64) #Declaration of an ndarray of dtype "int64"
new_arr=arr1.astype(np.float64) #Explicit type casting of arr1
new_arr.dtype

dtype('float64')

In [50]:
arr2=np.array([5,6,7,8],dtype=np.float64) #Declaration of an array of dtype "float64"
new_arr2=arr2.astype(np.int32) #Explicit type casting of arr2
new_arr2.dtype

dtype('int32')
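
Two details of <code>astype</code> are worth a quick sketch: casting floats to integers drops the fractional part, and the method returns a new array instead of changing the original. The arrays below are made up for illustration.

```python
import numpy as np

floats = np.array([1.7, -2.9, 3.2])

# Casting float to int drops the fractional part (truncates toward zero)
ints = floats.astype(np.int32)
print(ints)    # [ 1 -2  3]

# astype returns a new array; the original is unchanged
print(floats)  # [ 1.7 -2.9  3.2]

# Strings that hold numbers can be cast as well
numeric_strings = np.array(["1.25", "-9.6", "42"])
parsed = numeric_strings.astype(np.float64)
print(parsed.dtype)  # float64
```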

## Arithmetic with NumPy arrays

NumPy is a great module for arithmetic calculations, and the reason is simple: NumPy uses **vectorization**. That is, it performs element-wise operations without us writing any loop.

Let's have a look at some examples.
* Operation on 2D array
* Array with a scalar
* Comparison between two same sized arrays.

In [60]:
#Operation on 2D arrays
arr=np.arange(6).reshape((2,3))
print(arr)

#Squaring of each element
print(arr*arr)
print(arr**2)

print(arr-arr)

print(arr**0.5) #Square root of each element in an array.

#Operations with a scalar
print(1/arr) #arr contains a 0, so this gives inf and emits a divide-by-zero RuntimeWarning
print(arr+2)

#Comparison between two same sized arrays
arr2=np.arange(2,8).reshape((2,3))
arr2>arr

[[0 1 2]
 [3 4 5]]
[[ 0  1  4]
 [ 9 16 25]]
[[ 0  1  4]
 [ 9 16 25]]
[[0 0 0]
 [0 0 0]]
[[0.         1.         1.41421356]
 [1.73205081 2.         2.23606798]]
[[       inf 1.         0.5       ]
 [0.33333333 0.25       0.2       ]]
[[2 3 4]
 [5 6 7]]




array([[ True,  True,  True],
       [ True,  True,  True]])

## Indexing and Slicing

Indexing and slicing of ndarrays are very useful when we want to work with only a part of the data.
* Accessing by index
* Slicing by indices
* Assigning a scalar value to a slice of an array
* Indexing in a multidimensional array <br>
Let's see them one by one.

In [61]:
#Accessing an element by index
arr=np.array([2,32,4,5])
arr[2]

4

In [63]:
#Slicing by indices
arr=np.array([23,45,67,89,21,45,65,12,45,81])
new_arr=arr[2:4] #Sliced array
new_arr

array([67, 89])

In [66]:
#Assigning a scalar value to a slice of an array
arr=np.arange(13)
arr[2:7]=46 #Assigning a scalar to a sliced array
arr


array([ 0,  1, 46, 46, 46, 46, 46,  7,  8,  9, 10, 11, 12])

Note that a slice is a view of the original array, so the modifications we make to a slice change the data of the original array. <br>
If we do not want the original data to change when we modify a slice, we have to make a copy with <code><*array*>[:].copy()</code>, or simply <code><*array*>.copy()</code> for the whole array. <br>
Let us see that in the next cell.

In [68]:
#Declaring an array 
arr=np.arange(16)
arr_copy=arr.copy() #Making of the copy of the array
arr[3:7]=12         #Sliced array given a scalar value
print(arr)
print(arr_copy)

[ 0  1  2 12 12 12 12  7  8  9 10 11 12 13 14 15]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]
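
The view behaviour can also be shown directly on a slice. The following sketch uses illustrative names and <code>np.shares_memory</code> to check whether two arrays use the same data.

```python
import numpy as np

base = np.arange(8)

view = base[2:5]   # a slice is a view onto the same memory
view[:] = 99       # writing through the view changes base
print(base)        # [ 0  1 99 99 99  5  6  7]

independent = base[2:5].copy()  # an explicit copy owns its data
independent[:] = -1
print(base)        # [ 0  1 99 99 99  5  6  7]

print(np.shares_memory(base, view))         # True
print(np.shares_memory(base, independent))  # False
```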


In [70]:
#Indexing and accessing a value in a multi-dimensional array
array=np.arange(9).reshape(3,3)
array

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In a multidimensional array we can access a value with basic Python indexing, for example <code><*array*>[row][column]</code>, but a NumPy ndarray also supports the more convenient form <code><*array*>[row,column]</code>. <br>
Let us see this in the following example.

In [72]:
array
print(array[0][2])
print(array[0,2])

2
2


![](\n2.png)

## Accessing 1D array from multidimensional array

In [75]:
new_arr=np.arange(32).reshape(8,4)
new_arr


array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [76]:
#Accessing 1D array from multidimensional array
new_arr[1]

array([4, 5, 6, 7])

## Indexing with slicing

Let us see how we can do different sorts of slicing.

In [79]:
#Basic slicing
array=np.arange(9)
array[0:5]

array([0, 1, 2, 3, 4])

In [83]:
#Operation on a 2D array
array=np.arange(12).reshape((3,4))
array

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [86]:
array[0:2]

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [88]:
array[0:2, :3]

array([[0, 1, 2],
       [4, 5, 6]])

In [89]:
array[0:3, 1:]

array([[ 1,  2,  3],
       [ 5,  6,  7],
       [ 9, 10, 11]])

## Boolean Indexing

Boolean indexing means using the boolean values <code>True</code> and <code>False</code> to select values. An element is selected where the index holds <code>True</code> and skipped where it holds <code>False</code>. <br>
Let's understand this with examples.

In [94]:
data=np.array([23,45,65,47,67,18])
data[data>40]

array([45, 65, 47, 67])

In the above example we passed a boolean expression as the index, and only the values for which it evaluates to <code>True</code> are returned.

In [97]:
beast=np.array(["bull","goat","bull","lion","elephant","bull"])
beast=="bull"

array([ True, False,  True, False, False,  True])

In the above expression we produced an array of boolean values. This array can now be used to select values from another array.

In [99]:
data=np.arange(1,7)
data[beast=="bull"]

array([1, 3, 6])

In the case above the boolean array must have the same length as the data array.

In [103]:
# Using "!=" and ~condition
data=np.arange(1,7)
print(data[beast!="bull"])
print(data[~(beast=="bull")])

[2 4 5]
[2 4 5]
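
Boolean conditions can also be combined, and a mask can be used to assign values. This is a minimal sketch on a made-up array; note that <code>&</code> and <code>|</code> are used instead of the Python keywords <code>and</code> and <code>or</code>.

```python
import numpy as np

data = np.array([23, 45, 65, 47, 67, 18])

# Combine conditions with & (and) and | (or); the parentheses are required
mask = (data > 40) & (data < 66)
in_range = data[mask]
print(in_range)  # [45 65 47]

extremes = data[(data < 20) | (data > 66)]
print(extremes)  # [67 18]

# A mask can also be used on the left-hand side of an assignment
data[data < 40] = 0
print(data)      # [ 0 45 65 47 67  0]
```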


## Fancy Indexing



Fancy indexing is a type of indexing in which integer arrays (or lists of integers) are used as indices.

In [106]:
#Firstly let us declare an array.
arr=np.empty((8,4))
for i in range(8):
    arr[i]=i  
arr

array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])

In [107]:
arr[[4,3,0,6]] #Passing array of integer as indices

array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])

In [108]:
arr[[-3,-5,-7]]

array([[5., 5., 5., 5.],
       [3., 3., 3., 3.],
       [1., 1., 1., 1.]])

In [110]:
#Passing a multi-dimensional array
arr=np.arange(16).reshape((4,4))
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [112]:
print(arr[1,2]) #Passing indices 
print(arr[[1,2]]) #Passing an array of indices

6
[[ 4  5  6  7]
 [ 8  9 10 11]]


In [114]:
#Passing multiple arrays as indices
arr[[1,2],[2,3]]

array([ 6, 11])
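
Passing two index lists picks one element per pair, which is often not what one expects. The sketch below, with illustrative names, contrasts that with selecting a rectangular block, and shows that fancy indexing returns a copy of the data.

```python
import numpy as np

arr = np.arange(16).reshape((4, 4))

# Pairs (1, 2) and (2, 3): one element per pair
pairs = arr[[1, 2], [2, 3]]
print(pairs)      # [ 6 11]

# Rows 1 and 2, then columns 2 and 3: a 2x2 block
block = arr[[1, 2]][:, [2, 3]]
print(block)
# [[ 6  7]
#  [10 11]]

# Fancy indexing copies the data, so editing the result leaves arr alone
block[0, 0] = -1
print(arr[1, 2])  # 6
```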

## Transposing arrays 

The axes of an ndarray can be transposed, which means that the order of the axes, and therefore the shape, gets reversed. <br>
For a 2D array, <code><*array*>.T</code> transposes the axes and changes the shape from <code>(x,y)</code> to <code>(y,x)</code>.

In [116]:
arr=np.arange(8).reshape((4,2))
arr

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7]])

In [118]:
print(arr.T)
print(arr.T.shape)

[[0 2 4 6]
 [1 3 5 7]]
(2, 4)


If the array has more than two dimensions, we can use <code><*array*>.transpose()</code>. With no arguments it reverses the order of the axes, and it can also take the new order of the axes as arguments. <br>
Let us see an example of that.

In [124]:
#Declaring a multi-dimensional array
arr=np.arange(42).reshape((7,2,3))
arr

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]],

       [[12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23]],

       [[24, 25, 26],
        [27, 28, 29]],

       [[30, 31, 32],
        [33, 34, 35]],

       [[36, 37, 38],
        [39, 40, 41]]])

In [125]:
print(arr.transpose())
print(arr.transpose().shape)

[[[ 0  6 12 18 24 30 36]
  [ 3  9 15 21 27 33 39]]

 [[ 1  7 13 19 25 31 37]
  [ 4 10 16 22 28 34 40]]

 [[ 2  8 14 20 26 32 38]
  [ 5 11 17 23 29 35 41]]]
(3, 2, 7)


As in the above example, the shape has changed from (x,y,z) to (z,y,x). <br>
If we just want to interchange the x and y axes, we can do it as below.

In [126]:
print(arr.transpose(1,0,2))
print(arr.transpose(1,0,2).shape)

[[[ 0  1  2]
  [ 6  7  8]
  [12 13 14]
  [18 19 20]
  [24 25 26]
  [30 31 32]
  [36 37 38]]

 [[ 3  4  5]
  [ 9 10 11]
  [15 16 17]
  [21 22 23]
  [27 28 29]
  [33 34 35]
  [39 40 41]]]
(2, 7, 3)


## Universal Functions 

NumPy has universal functions (ufuncs), which perform element-wise operations on ndarrays. These can be unary (taking one array as input) or binary (taking two arrays as input).

Let's look at some universal functions
* <code>np.sqrt(arr)</code>
* <code>np.modf(arr)</code> - A unary function which returns two arrays: the fractional parts and the integral parts of the elements
* <code>np.exp(arr)</code>
* <code>np.maximum(x,y)</code>- A binary function
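
The functions listed above can be tried on small made-up arrays, as in the sketch below. Note the order of the two results from <code>np.modf</code>: the fractional part comes first.

```python
import numpy as np

arr = np.array([1.0, 4.0, 9.0])
roots = np.sqrt(arr)
print(roots)   # [1. 2. 3.]

# np.exp computes e**x for every element
powers = np.exp(np.array([0.0, 1.0]))
print(powers)  # approximately [1, 2.718]

# np.modf returns (fractional part, integral part), both keeping the sign
frac, whole = np.modf(np.array([2.5, -1.25]))
print(frac)    # [ 0.5  -0.25]
print(whole)   # [ 2. -1.]

# np.maximum compares two arrays element by element
x = np.array([1, 7, 3])
y = np.array([4, 2, 9])
larger = np.maximum(x, y)
print(larger)  # [4 7 9]
```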

## Mathematical and Statistical Methods

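* <code>arr.mean()</code> - Arithmetic mean of the elements; <code>axis</code> is an optional argument
* <code>arr.sum()</code> - Sum of the elements; <code>axis</code> is an optional argument
* <code>arr.cumsum()</code> - Cumulative sum of the elements; <code>axis</code> is an optional argument
* <code>arr.cumprod()</code> - Cumulative product of the elements; <code>axis</code> is an optional argument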
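
Here is a short sketch of these methods on a small 2D array, with and without the <code>axis</code> argument. The array is made up for illustration.

```python
import numpy as np

arr = np.arange(1, 7).reshape((2, 3))
# [[1 2 3]
#  [4 5 6]]

print(arr.sum())         # 21
print(arr.mean())        # 3.5
print(arr.sum(axis=0))   # [5 7 9]  (one value per column)
print(arr.mean(axis=1))  # [2. 5.]  (one value per row)

# Without an axis, cumsum and cumprod work on the flattened array
print(arr.cumsum())      # [ 1  3  6 10 15 21]
print(arr.cumprod(axis=1))
# [[  1   2   6]
#  [  4  20 120]]
```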


## Methods for boolean arrays

* <code>(arr>0).sum()</code> - <code>True</code> is counted as 1 and <code>False</code> as 0, so the sum is the number of <code>True</code> values.
* <code>bools.any()</code> - Returns <code>True</code> if at least one value is <code>True</code>.
* <code>bools.all()</code> - Returns <code>True</code> if all the values are <code>True</code>.
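
A minimal sketch of the three methods, using made-up arrays named <code>arr</code> and <code>bools</code> to match the list above:

```python
import numpy as np

arr = np.array([-2, 5, 0, 7, -1])

# Count how many elements are positive
positives = (arr > 0).sum()
print(positives)    # 2

bools = np.array([False, False, True, False])
print(bools.any())  # True
print(bools.all())  # False
```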