## Python - Numpy Guide


#### Priyaranjan Mohanty

In [45]:
2 + 4  # This is a comment



6

### Numpy: what is it?

Numpy is the fundamental package for numerical computing and working with data in Python.

If you are going to work on data analysis or machine learning projects, then having a solid understanding of numpy is nearly mandatory.

This is because other data analysis packages (like pandas) are built on top of numpy, and the scikit-learn package, which is used to build machine learning applications, works heavily with numpy as well.

So what does numpy provide?

At the core, numpy provides the excellent ndarray objects, short for n-dimensional arrays.

In an ‘ndarray’ object, aka ‘array’, you can store multiple items of the same data type. It is the facilities around the array object that make numpy so convenient for performing math and data manipulations.

You might wonder, ‘I can store numbers and other objects in a python list itself and do all sorts of computations and manipulations through list comprehensions, for-loops etc. What do I need a numpy array for?’

Well, there are very significant advantages of using numpy arrays over lists.

To understand this, let’s first see how to create a numpy array.

### How to create NDArray ( Numpy Array )

There are multiple ways to create a numpy array. 

However, one of the most common ways is to create one from a list or a list-like object by passing it to the np.array function.

In [46]:
# Install Numpy 

!pip install numpy



In [47]:
# Import Numpy 

import numpy as np

In [48]:
# Create a 1-dimensional array from a list

list1 = [0,1,2,3,4]

Arr_1d = np.array(list1)

In [49]:
# Print the array and its type
print(type(Arr_1d))

print("\n")

print(Arr_1d)

<class 'numpy.ndarray'>


[0 1 2 3 4]


In [50]:
# Print the shape of the Array

print(Arr_1d.shape)

(5,)


The key difference between an array and a list is that arrays are designed to handle vectorized operations, while a Python list is not.

That means an arithmetic operation or numpy function applied to an array is performed on every item in the array, rather than on the array object as a whole.

Let’s suppose you want to add the number 2 to every item in the list. The intuitive way to do it is something like this, but it raises an error on a list and only works on an array:

In [51]:
print(list1)

[0, 1, 2, 3, 4]


In [52]:
list1 + 2

TypeError: can only concatenate list (not "int") to list

In [53]:
print(Arr_1d)

[0 1 2 3 4]


In [54]:
Arr_1d + 2

array([2, 3, 4, 5, 6])
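To make the contrast concrete, here is a small sketch of the two ways to add 2 to every item. The variable names (`values`, `list_plus_two`, `arr_plus_two`) are illustrative only and are not used elsewhere in this guide.

```python
import numpy as np

values = [0, 1, 2, 3, 4]

# Plain list: element-wise addition needs an explicit loop or comprehension
list_plus_two = [v + 2 for v in values]

# ndarray: the same operation is a single vectorized expression
arr_plus_two = np.array(values) + 2

print(list_plus_two)  # [2, 3, 4, 5, 6]
print(arr_plus_two)   # [2 3 4 5 6]
```

Both give the same numbers; the array version simply pushes the loop into numpy.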

### Next Step - Let's create a 2 Dimensional Array



In [55]:
# Create a 2d array from a list of lists

list_2D = [[0,1,2,3], 
           [4,5,6,7], 
           [8,9,10,11]]

Arr_2D = np.array(list_2D)

In [56]:
Arr_2D

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [57]:
print(type(Arr_2D))

print("\n")

print(Arr_2D)

print("\n")

print(Arr_2D.shape)

<class 'numpy.ndarray'>


[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


(3, 4)


In [58]:
# We can also create a 2-D Array from a flat list ( Not a nested list )

List_Num = [0,1,2,3,4,5,6,7,8,9,10,11]

# Array_1D_2 = np.array(List_Num)

# Array_2D_2 = Array_1D_2.reshape(3,4)

Array_2D_2 = np.array(List_Num).reshape(3,4)

Array_2D_2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [59]:
Array_2D_2.shape

(3, 4)

In [60]:
Array_2D_3 = Array_2D_2.reshape(6,2)

Array_2D_3

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11]])
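Two details of reshape are worth knowing, sketched below with illustrative names (`flat`, `inferred`, `reshape_error`): one dimension can be given as -1 so that numpy infers it, and the new shape must hold exactly the same number of elements as the original.

```python
import numpy as np

flat = np.arange(12)  # the values 0..11 as a 1-D array

# -1 asks numpy to infer that dimension from the total number of elements
inferred = flat.reshape(3, -1)
print(inferred.shape)  # (3, 4)

# A shape whose product is not 12 cannot hold the data and raises ValueError
reshape_error = None
try:
    flat.reshape(5, 3)
except ValueError as exc:
    reshape_error = str(exc)
    print("reshape failed:", reshape_error)
```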

We can also specify the datatype by setting the dtype argument. 

Some of the most commonly used numpy dtypes are: 'float', 'int', 'bool', 'str'

In [61]:
Arr_2D_2 = np.array(list_2D)

print(Arr_2D_2)


[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [63]:
# Create a float 2d array

Arr_2D_2 = np.array(list_2D, dtype='float')

print(Arr_2D_2)

print(Arr_2D_2.dtype)

[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
float64
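The dtype of an existing array can also be converted with the astype method, which returns a new array. The sketch below uses made-up values (`arr_float` and the derived names are illustrative) to show what happens for 'int', 'bool' and 'str'.

```python
import numpy as np

arr_float = np.array([[0.0, 1.9], [2.0, 3.7]])

arr_int = arr_float.astype('int')    # fractional part is dropped, not rounded
arr_bool = arr_float.astype('bool')  # 0.0 becomes False, anything else True
arr_str = arr_float.astype('str')    # each number becomes its string form

print(arr_int)
print(arr_bool)
print(arr_str.dtype.kind)  # 'U' means a unicode string dtype
```

Note that `arr_float` itself is left unchanged; astype always produces a copy with the requested dtype.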


#### Note -

Unlike a list, all items in a numpy array must be of the same data type. This is another significant difference.

### Note :

We can always convert a numpy array back to a list with the tolist() method.

In [64]:
print(Arr_1d)

print(type(Arr_1d))

[0 1 2 3 4]
<class 'numpy.ndarray'>


In [65]:
# Convert an ndarray to list 

print(Arr_1d.tolist())

print("\n")

print(type(Arr_1d.tolist()))

[0, 1, 2, 3, 4]


<class 'list'>


In [66]:
print(Arr_2D)

print("\n")

print(type(Arr_2D))

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


<class 'numpy.ndarray'>


In [67]:
# Convert 2-d array to list 

print(Arr_2D.tolist())

print("\n")

print(type(Arr_2D.tolist()))

[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]


<class 'list'>
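One subtlety, shown here as a small sketch with illustrative names: tolist() converts the elements to plain Python numbers, whereas calling the built-in list() on an array keeps each element as a numpy scalar.

```python
import numpy as np

arr = np.array([0, 1, 2])

via_tolist = arr.tolist()  # elements are plain Python ints
via_list = list(arr)       # elements are still numpy integer scalars

print(type(via_tolist[0]))                  # <class 'int'>
print(isinstance(via_list[0], np.integer))  # True
```

This matters mainly when the result is handed to code that expects native Python types, such as JSON serialisation.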


#### To summarise, 

The main differences with python lists are:

Arrays support vectorised operations, while lists don’t.

Once an array is created, you cannot change its size. You will have to create a new array or overwrite the existing one.

Every array has one and only one dtype. All items in it should be of that dtype.

An equivalent numpy array occupies much less space than a python list of lists.
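The fixed-size and memory points can be checked with a rough sketch. The names below are illustrative, and the list estimate (container size plus the size of each int object, via sys.getsizeof) is only an approximation that assumes the standard CPython interpreter.

```python
import sys
import numpy as np

# Fixed size: np.append does not grow the array in place, it returns a new one
arr = np.arange(5)
bigger = np.append(arr, 5)
print(arr.size, bigger.size)  # 5 6

# Approximate memory comparison for 1000 integers
n = 1000
py_list = list(range(n))
np_arr = np.arange(n)

# List cost = the list object itself + every int object it points to
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
print(np_arr.nbytes, list_bytes)
```

The exact byte counts depend on the platform and Python version, so only the relative difference is meaningful.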

### Some Functions, Methods and Attributes associated with the Numpy Array Object

In [70]:
print(Arr_2D)

print(Arr_2D.shape)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
(3, 4)


In [68]:
# Get the number of dimensions of a Numpy array using ndim attribute

print(Arr_2D.ndim)

2


In [71]:
# get the shape of the numpy array object

print(Arr_2D.shape)

(3, 4)


In [72]:
# get the size (total number of elements) of the numpy array object

print(Arr_2D.size)

12


In [73]:
# get the data type of the numpy array object

print(Arr_2D.dtype)

int32


What if I try to create an array from a list containing heterogeneous data? Numpy coerces every item to a common dtype; here the integers become strings.

In [74]:
List_var = [1,2,'a','b']

Array_2D_3 = np.array(List_var).reshape(2,2)

print(Array_2D_3)

[['1' '2']
 ['a' 'b']]
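If the original Python types need to be preserved, one option is to request dtype=object explicitly. The sketch below (names are illustrative) compares the default coercion with an object array; note that object arrays give up most of the speed and memory benefits of numpy.

```python
import numpy as np

mixed = [1, 2, 'a', 'b']

as_str = np.array(mixed)                # default: everything becomes a string
as_obj = np.array(mixed, dtype=object)  # elements keep their Python types

print(as_str.dtype.kind)                 # 'U' (unicode string)
print(as_obj.dtype)                      # object
print(type(as_obj[0]), type(as_obj[2]))  # <class 'int'> <class 'str'>
```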


### Extracting specific items from an array

In [75]:
print(Arr_1d)

[0 1 2 3 4]


In [76]:
print(Arr_1d[0])

0


In [77]:
print(Arr_1d[-1])

4


In [79]:
print(Arr_1d[-3])

2


Extracting elements from a multi-dimensional array

In [80]:
print(Arr_2D)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [81]:
List_2_Var = [[ 0 , 1 , 2 , 3] ,  [ 4 , 5 , 6 , 7] ,  [ 8 , 9 ,10 , 11]]

In [82]:
print(Arr_2D[0])

[0 1 2 3]


In [33]:
# Extract the element which is in 2nd row and 1st column of the array 

Arr_2D[1,0]

4

In [83]:
# Extract the elements which are in 2nd row ( all columns )

print(Arr_2D[1,:])

[4 5 6 7]


In [84]:
# Extract the elements which are in 1st column ( all rows )

Arr_2D[:,0]

array([0, 4, 8])

In [85]:
# Extract the elements which are in the 1st & 2nd rows of the 1st column

Arr_2D[:2,0]

array([0, 4])
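The same row, column slicing syntax extends to rectangular blocks, steps and negative steps. A short sketch on a fresh 3 x 4 array (the name `grid` and the result names are illustrative):

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)  # same values as the 3 x 4 array above

block = grid[:2, 1:3]               # rows 0-1, columns 1-2
every_other_col = grid[:, ::2]      # all rows, columns 0 and 2
last_row_reversed = grid[-1, ::-1]  # last row, read right to left

print(block)              # [[1 2] [5 6]]
print(every_other_col)    # [[0 2] [4 6] [8 10]]
print(last_row_reversed)  # [11 10  9  8]
```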

Additionally, numpy arrays support boolean indexing.

In [86]:
Arr_2D

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [90]:
Bool_Idx = Arr_2D > 5

print(Bool_Idx)

[[False False False False]
 [False False  True  True]
 [ True  True  True  True]]


In [91]:
Arr_2D[Bool_Idx]

array([ 6,  7,  8,  9, 10, 11])

In [92]:
print(type(Arr_2D[Bool_Idx]))

print(Arr_2D[Bool_Idx].shape)

<class 'numpy.ndarray'>
(6,)
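Boolean conditions can be combined with & (and) and | (or). Each condition needs its own parentheses, because these operators bind more tightly than the comparisons. A minimal sketch with illustrative names:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

# Items greater than 2 AND less than 8
in_range = grid[(grid > 2) & (grid < 8)]

# Items that are even OR greater than 9
even_or_big = grid[(grid % 2 == 0) | (grid > 9)]

print(in_range)     # [3 4 5 6 7]
print(even_or_big)  # [ 0  2  4  6  8 10 11]
```

As with the single-condition case, the result is always a flat 1-D array of the matching items.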


Modifying / updating the contents of an array 

In [93]:
print(Arr_2D)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [94]:
Arr_2D[2,1]

9

In [97]:
# Update the element in the last row, 2nd column of the array

Arr_2D[2,1] = 99

In [98]:
Arr_2D

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8, 99, 10, 11]])
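Assignment also works on whole slices and on boolean masks. The sketch below (illustrative names, fresh array) additionally shows that basic indexing returns a view onto the same data, so writing through it changes the original unless .copy() is used.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

grid[0, :] = -1     # overwrite the whole first row
grid[grid > 9] = 0  # overwrite every item greater than 9 (here 10 and 11)

row_view = grid[1]  # a view: shares memory with grid
row_view[0] = 100   # this also changes grid[1, 0]

row_copy = grid[2].copy()  # an independent copy
row_copy[0] = -5           # grid is NOT affected

print(grid)
# [[ -1  -1  -1  -1]
#  [100   5   6   7]
#  [  8   9   0   0]]
```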