# Introduction to Numpy

Many of the libraries you will use to perform data analysis in Python, <br>
as well as many of the mathematical functions you'll use, will involve working with <b>`Numpy`</b>. <br>

<b>`Numpy`</b> (short for <b>Numerical Python</b>):
 - is used for numeric computing (see https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html), and
 - includes support for multi-dimensional arrays and matrices, along with a variety of mathematical functions to apply to them.

In this lesson, we will learn about :
- Numpy's primary data structures and 
- how to apply some basic math functions to them.

# The numpy import convention

In order to use Numpy, you must first import it.<br>
It is common to alias it to <b>np</b> using the <b>as</b> keyword so that you don't have to spell out "numpy" every time you want to call one of its functions.<br>
Once the library has been imported, it is ready to use.

In [1]:
# Install (or upgrade) the Numpy library
!pip install numpy --upgrade --user

Requirement already up-to-date: numpy in c:\users\shcle\appdata\roaming\python\python37\site-packages (1.18.5)


In [2]:
# Importing numpy  
import numpy as np

In [3]:
# Checking numpy version
np.__version__

'1.18.5'

# Numpy Arrays

The basic data structures in Numpy are <b>arrays</b>, which can be used to represent tabular data. <br>

You can think of arrays as lists of lists, where all the elements of a list are of the same type <br>
(typically numeric since the reason you use Numpy is to do numeric computing). <br>

A <b>matrix</b> is just a two-dimensional array.
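
To make the "same type" idea concrete, here is a minimal sketch. The names `vector`, `matrix` and `mixed` are made up for illustration. It checks the number of dimensions of two arrays and shows that mixing integers and floats still gives a single element type.

```python
import numpy as np

# A 1-D array and a 2-D array (a matrix) built from plain Python lists
vector = np.array([1, 3, 5, 10])
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(vector.ndim)   # 1
print(matrix.ndim)   # 2

# Mixing integers and floats yields one common element type for the whole array
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64
```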

## Creating a numpy array from a list (np.array)

Converting other data structures to arrays: <br>
If you have data in another type of data structure and would like to convert it to an array <br>
so that you can take advantage of Numpy's mathematical functions, <br>
you can convert it with the <b>np.array()</b> function as follows.

This works the same way whether you have: <br>
 - a list of lists, 
 - a list of tuples, 
 - a tuple of lists, or 
 - a tuple of tuples.
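
As a quick check of the four cases listed above, the sketch below converts each kind of nested sequence and compares the results. The variable names are illustrative only.

```python
import numpy as np

list_of_lists = [[1, 2], [3, 4]]
list_of_tuples = [(1, 2), (3, 4)]
tuple_of_lists = ([1, 2], [3, 4])
tuple_of_tuples = ((1, 2), (3, 4))

arrays = [np.array(data) for data in
          (list_of_lists, list_of_tuples, tuple_of_lists, tuple_of_tuples)]

# Every input produces the same 2 x 2 array
all_equal = all(np.array_equal(arrays[0], arr) for arr in arrays)
print(all_equal)         # True
print(arrays[0].shape)   # (2, 2)
```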

In [4]:
a = [1,3,5,10]
a

[1, 3, 5, 10]

In [5]:
# Creating array from the list
b = np.array(a)
b

array([ 1,  3,  5, 10])

In [6]:
type(b)

numpy.ndarray

## Multiplication operator (*) : List and Array

In [7]:
a * 3 # multiplying a list repeats its contents 3 times

[1, 3, 5, 10, 1, 3, 5, 10, 1, 3, 5, 10]

In [8]:
b * 3 # multiplying an array multiplies each element by 3

array([ 3,  9, 15, 30])

In [9]:
a.append

<function list.append(object, /)>

## ndarray .max(),  .mean(), etc

In [10]:
b.max()

10

In [11]:
b.mean()

4.75

In [12]:
b.argmax() # Return indices of the maximum values along the given axis

3
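
The same pattern works for the minimum. The short sketch below uses a hypothetical array named `values` and also shows that the position returned by `argmax()` can be used as an index to get the maximum back.

```python
import numpy as np

values = np.array([1, 3, 5, 10])

print(values.min())      # 1
print(values.argmin())   # 0

# argmax returns a position; indexing with it gives back the maximum
print(values[values.argmax()] == values.max())   # True
```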

## .size / .shape attributes

The <b>size</b> of an array is the <b>total number of elements</b> it contains.<br>
The <b>shape</b> of an array is the size of the array along each <b>dimension</b> (e.g. number of rows and number of columns for a two-dimensional array). 
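
The relationship between the two can be sketched as follows: the size is the product of the entries of the shape. The array `grid` is a made-up example filled with zeros.

```python
import numpy as np

grid = np.zeros((10, 4))   # hypothetical 10 x 4 example array

print(grid.shape)   # (10, 4)
print(grid.size)    # 40

# size is the product of the entries of shape
size_from_shape = int(np.prod(grid.shape))
print(size_from_shape == grid.size)   # True
```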


In [13]:
b.shape # returns a tuple with the size along each dimension

(4,)

In [14]:
b.size

4

>Let's create a two-dimensional 10 x 4 array containing random numbers and look at the shape and size of the array <br>
using the shape and size attributes.

In [15]:
a = np.random.random(size=(10,4))
print(a)

[[0.97264408 0.91390406 0.60206469 0.05807829]
 [0.38766936 0.2482337  0.62200297 0.72149846]
 [0.12939149 0.55329541 0.62710751 0.24389763]
 [0.89965999 0.26061702 0.5933696  0.70898908]
 [0.56363745 0.6236581  0.55350421 0.46233396]
 [0.98402263 0.79228775 0.83368814 0.87327925]
 [0.02163462 0.11737332 0.57466983 0.02463547]
 [0.33247733 0.31709108 0.81495117 0.2105703 ]
 [0.80065657 0.78094553 0.38276631 0.60204031]
 [0.77986665 0.95171827 0.04642336 0.45801398]]


In [16]:
print(a.shape)  # the array has a shape of 10 x 4
print(a.size)   # the total number of elements in the array

(10, 4)
40


Now that we have seen an example of a basic two-dimensional array (a matrix), let's look at other ways to create arrays.

## Creating a numpy array using (np.random)

### 1-D array 

In [17]:
np.random.randint(0,100)

29

In [18]:
np.random.randint(0,10,size=4)

array([8, 5, 4, 5])

In [19]:
np.random.randint(0,10,size=4).shape

(4,)

In [20]:
np.random.random(size=4)

array([0.48904863, 0.82527715, 0.06013474, 0.19851036])

### 2-D array: row, column (matrix)

In [21]:
# list of lists
a = [[1,2,3],[4,5,6]]
a[0][-1]

3

In [22]:
# array from a list of lists
b = np.array([[1,2,3],[4,5,6]])
b

array([[1, 2, 3],
       [4, 5, 6]])

In [23]:
b[0][-1]

3

In [24]:
try:
    a[0, -1]   # Note: a is the list
except TypeError as err:
    # catch only the expected error and print its real message
    print('TypeError:', err)

TypeError: list indices must be integers or slices, not tuple


In [25]:
b[0,2]    # Note: b is the array

3

In [26]:
b[1,1] # array b[row,column]

5

In [27]:
b

array([[1, 2, 3],
       [4, 5, 6]])

In [28]:
b.shape

(2, 3)

In [29]:
b.size

6

In [30]:
np.random.random(size=(7,4))

array([[0.35606633, 0.90857115, 0.99526776, 0.87056735],
       [0.77702312, 0.57736628, 0.41528303, 0.78736859],
       [0.09480554, 0.69250909, 0.05614038, 0.29720873],
       [0.8103311 , 0.06208363, 0.24159428, 0.09873172],
       [0.42674746, 0.71908501, 0.70953005, 0.79455183],
       [0.12961238, 0.24104547, 0.85581   , 0.08928644],
       [0.56492111, 0.33154482, 0.80573743, 0.88156248]])

### 3-D array

In [31]:
# array from a list of lists of lists
a = np.array([[[1,2,3],[4,5,6],[7,8,9]],
          [[10,11,12],[13,14,15],[16,17,18]]])
a

array([[[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9]],

       [[10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]]])

In [32]:
a[0]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [33]:
a[0][1]

array([4, 5, 6])

In [34]:
a[0][1][2]

6

In [35]:
a[0,1,2]

6

In [36]:
a.shape

(2, 3, 3)

In [37]:
b = np.random.random(size=(4,2,3))
b

array([[[0.03912102, 0.12332493, 0.84712197],
        [0.01469771, 0.6520845 , 0.67054801]],

       [[0.24044921, 0.90580931, 0.86247705],
        [0.36835502, 0.76719953, 0.13112322]],

       [[0.57005749, 0.02372064, 0.68408735],
        [0.75637716, 0.03304779, 0.80605725]],

       [[0.33995035, 0.83007823, 0.02477581],
        [0.12752838, 0.21495783, 0.49103396]]])

> Let's build a three-dimensional array of random numbers and see what that looks like.

In [38]:
b = np.random.random((5,2,3))
print(b)

[[[0.55843522 0.75986576 0.47419928]
  [0.58312417 0.51910087 0.61295389]]

 [[0.26236254 0.73910188 0.18298601]
  [0.58092336 0.23190847 0.6068777 ]]

 [[0.50588895 0.23801929 0.87079208]
  [0.61488314 0.32777566 0.16697057]]

 [[0.63748261 0.7820031  0.90812883]
  [0.25742787 0.71667828 0.26399887]]

 [[0.13276537 0.02812167 0.26280355]
  [0.8173207  0.68389972 0.66677852]]]


> This created an array with five groups of 2 x 3 matrices. <br>
Let's see what happens if we pass four dimensions (4-D).
This time, we get two groups of three 4 x 5 matrices.

In [39]:
# 4-D array
c = np.random.random((2,3,4,5))
print(c)

[[[[0.42275869 0.89124276 0.72863214 0.95025922 0.91309055]
   [0.55712938 0.98622104 0.62911251 0.33040489 0.44652362]
   [0.49759122 0.08122534 0.55523916 0.87271376 0.28396455]
   [0.14258229 0.83967578 0.73747461 0.18319822 0.34541145]]

  [[0.18206722 0.95994919 0.2983086  0.81844966 0.89272442]
   [0.96196467 0.25538861 0.93652137 0.16649178 0.15175771]
   [0.00772952 0.38657863 0.93079492 0.20565585 0.04514345]
   [0.03678942 0.3401864  0.09066498 0.3710891  0.29901031]]

  [[0.05812957 0.87377265 0.21939888 0.8936456  0.49784732]
   [0.54613018 0.86464698 0.5044984  0.71798213 0.54390955]
   [0.61464179 0.69822837 0.88730535 0.98027137 0.130173  ]
   [0.55894034 0.93395928 0.5285937  0.57119376 0.67476817]]]


 [[[0.95122154 0.65791293 0.66572824 0.6793423  0.09136804]
   [0.76572673 0.60000518 0.82813698 0.48070886 0.78393416]
   [0.0900991  0.07365034 0.21242312 0.50744792 0.73986173]
   [0.38825337 0.41181635 0.49528307 0.54015113 0.31807642]]

  [[0.58507943 0.96677487 0.53

## Slices

In [40]:
# 2-D array 
a = np.random.randint(0, 10, size=(4,3)) 

In [41]:
my_list = [1,2,4,7,8,10,4,6]
my_list

[1, 2, 4, 7, 8, 10, 4, 6]

In [42]:
# length of the list
len(my_list) 

8

In [43]:
# accessing the elements from index 4 to the end (index len(my_list)-1)
my_list[4:]  

[8, 10, 4, 6]

In [44]:
# accessing the elements from index 0 to 2
my_list[:3]

[1, 2, 4]

In [45]:
my_list[:]  # accessing all elements

[1, 2, 4, 7, 8, 10, 4, 6]

In [46]:
a

array([[0, 3, 7],
       [7, 5, 2],
       [2, 0, 9],
       [8, 6, 8]])

In [47]:
# a[row, column]
a[3,2]

8

In [48]:
# a[row slice, column slice]: rows from index 2 onward, all columns
a[2:, :] 

array([[2, 0, 9],
       [8, 6, 8]])

In [49]:
a[:, 2]

array([7, 2, 9, 8])
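
Here are a few more slicing patterns on a small array whose values are easy to follow. The array `grid` and the result names are illustrative only.

```python
import numpy as np

grid = np.array([[0, 1, 2],
                 [3, 4, 5],
                 [6, 7, 8],
                 [9, 10, 11]])

top_left = grid[:2, :2]       # first two rows, first two columns
last_col = grid[:, -1]        # every row, last column
every_other_row = grid[::2]   # rows 0 and 2 (step of 2)

print(top_left.tolist())          # [[0, 1], [3, 4]]
print(last_col.tolist())          # [2, 5, 8, 11]
print(every_other_row.tolist())   # [[0, 1, 2], [6, 7, 8]]
```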

## Other creation functions (np.zeros / np.empty / np.ones / np.arange)

In [50]:
np.zeros(shape=(3,2))

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

In [51]:
np.empty(shape=(3,3))

array([[0.00000000e+000, 0.00000000e+000, 0.00000000e+000],
       [0.00000000e+000, 0.00000000e+000, 7.84576246e-321],
       [1.69121095e-306, 2.22522596e-306, 0.00000000e+000]])

In [52]:
np.ones((3, 3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
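
Note that `np.empty` only allocates memory without initializing it, which is why its output above looks arbitrary. The sketch below (variable names are made up) also shows the optional `dtype` argument and how to fill an empty array before using it.

```python
import numpy as np

zeros_int = np.zeros((2, 3), dtype=int)   # integer zeros instead of the default floats
ones_float = np.ones(4)                   # default element type is float64

scratch = np.empty((2, 2))   # contents are arbitrary until assigned
scratch[:] = 7.0             # fill every element before reading it

print(zeros_int.tolist())   # [[0, 0, 0], [0, 0, 0]]
print(ones_float.dtype)     # float64
print(scratch.tolist())     # [[7.0, 7.0], [7.0, 7.0]]
```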

In [53]:
range(10)

range(0, 10)

In [54]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [55]:
np.arange(0, 5, 0.5)

array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])

In [56]:
try:
    range(0,5,0.5)   # range only accepts integer steps
except TypeError as err:
    # catch only the expected error and print its real message
    print('TypeError:', err)

TypeError: 'float' object cannot be interpreted as an integer


In [57]:
np.arange(1,10, 0.5)

array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5, 7. ,
       7.5, 8. , 8.5, 9. , 9.5])

In [58]:
# Dot product of two 1-D arrays (they must have the same length)
np.dot(np.arange(1,10, 0.5) , np.arange(1,10, 0.5) )

617.25
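
For 1-D arrays, the dot product is the sum of the element-wise products. Here is a minimal sketch with two small hypothetical vectors, `u` and `v`.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

dot = np.dot(u, v)
manual = (u * v).sum()   # 1*4 + 2*5 + 3*6

print(dot)      # 32.0
print(manual)   # 32.0
```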

# Accessing elements

In [59]:
# 1-D array
array_1d = np.random.random(10)
array_1d

array([0.87076397, 0.57584231, 0.73316413, 0.30912802, 0.96821007,
       0.37741321, 0.15769662, 0.99161228, 0.04338683, 0.22306615])

In [60]:
# 2-D array 
array_2d = np.random.random((5,3))  
array_2d

array([[0.47187499, 0.14164039, 0.92905626],
       [0.09315554, 0.01182963, 0.59473099],
       [0.26776317, 0.5508637 , 0.15637292],
       [0.31473477, 0.62146033, 0.98185892],
       [0.49993511, 0.90746657, 0.76127938]])

## For 2-D arrays, there are two ways of accessing an element

In [61]:
array_2d[0]

array([0.47187499, 0.14164039, 0.92905626])

In [62]:
array_2d[4,2]

0.761279377675901

In [63]:
array_2d[4][2]

0.761279377675901

In [64]:
# 3-D array
array_3d = np.random.random((5,3,4))
array_3d

array([[[0.43716134, 0.32893775, 0.48753007, 0.69421521],
        [0.38087701, 0.25075918, 0.81058399, 0.98101371],
        [0.67530971, 0.4903889 , 0.00108924, 0.7810294 ]],

       [[0.66627386, 0.58904787, 0.01851396, 0.82310482],
        [0.40945864, 0.21365801, 0.80208858, 0.39455756],
        [0.51973733, 0.86582604, 0.47390035, 0.65393763]],

       [[0.32944074, 0.81787244, 0.68012183, 0.79861558],
        [0.81744917, 0.33627601, 0.48927045, 0.62459366],
        [0.73809855, 0.60584605, 0.25883981, 0.74873626]],

       [[0.35010326, 0.86686463, 0.98078758, 0.31993208],
        [0.89463584, 0.35583778, 0.1657596 , 0.1189355 ],
        [0.79707833, 0.43940976, 0.29641702, 0.77755866]],

       [[0.28726511, 0.12199856, 0.17804699, 0.54679848],
        [0.50415595, 0.28258287, 0.93102123, 0.56200618],
        [0.13540744, 0.43062144, 0.9756213 , 0.58634222]]])

In [65]:
array_3d[0,0,-1]

0.6942152106624052
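
The access patterns above can be summarized on a small array with known values. The array `table` and the result names are made up for this sketch.

```python
import numpy as np

table = np.array([[10, 11, 12],
                  [20, 21, 22]])

first_row = table[0]       # a whole row
by_tuple = table[1, 2]     # row 1, column 2
by_chain = table[1][2]     # same element, fetched in two steps
from_end = table[-1, -1]   # negative indices count from the end

print(first_row.tolist())             # [10, 11, 12]
print(by_tuple, by_chain, from_end)   # 22 22 22
```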

# Converting things to np.array

In [66]:
lst = [1,4,7,8]
np.array(lst)

array([1, 4, 7, 8])

In [67]:
list_of_lists = [[1,2,3],[4,5,6],[7,8,9]]
array_lst_lst = np.array(list_of_lists)
array_lst_lst

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [68]:
len(list_of_lists)

3

In [69]:
array_lst_lst.shape

(3, 3)

In [70]:
a = [[1,3,np.array([5,6,8])]]
a

[[1, 3, array([5, 6, 8])]]

In [71]:
np.array((1,3,5))

array([1, 3, 5])

# Math functions

## .sum() / .sum(axis=1) / a[1].sum() / np.sum()

In [72]:
a = np.random.randint(0, 10, size=(2,3))
a

array([[3, 7, 7],
       [4, 5, 4]])

In [73]:
a.shape

(2, 3)

In [74]:
# sum of all elements
a.sum()

30

In [75]:
a.sum(axis=0) 

array([ 7, 12, 11])

In [76]:
a.sum(axis=1)

array([17, 13])

In [77]:
np.sum(a, axis=1)

array([17, 13])

In [78]:
a[0].sum() 

17

In [79]:
np.sum(a[0]) 

17

In [80]:
a.sum()        # sum of all elements in matrix a

a.sum(axis=0)  # sum of each column in matrix a (sum over rows, dimension 0)

a.sum(axis=1)  # sum of each row in matrix a (sum over columns, dimension 1)

a[0].sum()     # sum of all the elements in the first row of matrix a

17
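
One way to remember the `axis` argument is that the axis you sum over disappears from the shape. The sketch below rebuilds the same 2 x 3 values used above, with illustrative result names.

```python
import numpy as np

a = np.array([[3, 7, 7],
              [4, 5, 4]])

col_sums = a.sum(axis=0)   # collapses the rows: one value per column
row_sums = a.sum(axis=1)   # collapses the columns: one value per row

print(col_sums.tolist(), col_sums.shape)   # [7, 12, 11] (3,)
print(row_sums.tolist(), row_sums.shape)   # [17, 13] (2,)

# Either way, adding up the partial sums gives the total
print(col_sums.sum() == row_sums.sum() == a.sum())   # True
```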

## mean() / np.mean()

In [81]:
a

array([[3, 7, 7],
       [4, 5, 4]])

In [82]:
a.sum()/a.size

5.0

In [83]:
a.mean()

5.0

In [84]:
a.mean(axis=1)  # np.mean(a, axis=1)

array([5.66666667, 4.33333333])

In [85]:
np.mean(a)           # mean of all elements in matrix a
# np.mean(a).round()

np.mean(a, axis=0)   # mean of each column in matrix a

a.mean(axis=1)       # mean of each row in matrix a

a[0].mean()          # mean of all the elements in the first row of matrix a

5.666666666666667

## np.add(x,y) / np.subtract(x,y) / np.multiply(x,y) / np.divide(x,y)
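
These functions work element by element and correspond to the `+`, `-`, `*` and `/` operators. Here is a minimal sketch with two hypothetical 2 x 3 arrays, `x` and `y`. The division uses a scalar to avoid dividing by the zero in `y`.

```python
import numpy as np

x = np.array([[3, 7, 7], [4, 5, 4]])
y = np.array([[6, 0, 5], [2, 3, 9]])

total = np.add(x, y)             # same as x + y
difference = np.subtract(x, y)   # same as x - y
product = np.multiply(x, y)      # same as x * y (element-wise, not a matrix product)
quotient = np.divide(x, 2)       # same as x / 2 (true division, float result)

print(total.tolist())        # [[9, 7, 12], [6, 8, 13]]
print(difference.tolist())   # [[-3, 7, 2], [2, 2, -5]]
print(product.tolist())      # [[18, 0, 35], [8, 15, 36]]
print(quotient.tolist())     # [[1.5, 3.5, 3.5], [2.0, 2.5, 2.0]]
```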

## .transpose() / .T

In [86]:
a

array([[3, 7, 7],
       [4, 5, 4]])

In [87]:
a.T

array([[3, 4],
       [7, 5],
       [7, 4]])

In [88]:
a.transpose()

array([[3, 4],
       [7, 5],
       [7, 4]])
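
Transposing swaps the two axes, so the shape is reversed and element `[i, j]` moves to `[j, i]`. Here is a short check, using a hypothetical array `m` with the same values as above.

```python
import numpy as np

m = np.array([[3, 7, 7],
              [4, 5, 4]])
t = m.T

print(m.shape, t.shape)            # (2, 3) (3, 2)
print(m[0, 2] == t[2, 0])          # True
print(np.array_equal(t.T, m))      # True: transposing twice restores the original
```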

# Operations between np.arrays

In [89]:
a

array([[3, 7, 7],
       [4, 5, 4]])

In [90]:
a.shape

(2, 3)

In [91]:
c = np.random.randint(0,10, size=(2,3))
c

array([[6, 0, 5],
       [2, 3, 9]])

In [92]:
a + c

array([[ 9,  7, 12],
       [ 6,  8, 13]])

In [93]:
a - c

array([[-3,  7,  2],
       [ 2,  2, -5]])

In [94]:
x = a[0, 0] + c[-1, -1]
x

12

## Shallow Copy / Deep Copy

In [95]:
a

array([[3, 7, 7],
       [4, 5, 4]])

In [96]:
# Plain assignment: b is another name for the same array, so no data is copied
b = a 

In [97]:
b[0,0] = 1000

In [98]:
b

array([[1000,    7,    7],
       [   4,    5,    4]])

In [99]:
# Independent copy of the data
b = a.copy()
# Note: b = a[:] does NOT copy a numpy array (unlike a list); it returns a view that shares a's data

In [100]:
a

array([[1000,    7,    7],
       [   4,    5,    4]])

In [101]:
b

array([[1000,    7,    7],
       [   4,    5,    4]])

In [102]:
b[0,0] = -1
b

array([[-1,  7,  7],
       [ 4,  5,  4]])

In [103]:
a

array([[1000,    7,    7],
       [   4,    5,    4]])

In [104]:
b = a[:]
b

array([[1000,    7,    7],
       [   4,    5,    4]])

In [105]:
a

array([[1000,    7,    7],
       [   4,    5,    4]])

In [106]:
b[0,0] = -1
b

array([[-1,  7,  7],
       [ 4,  5,  4]])

In [107]:
a

array([[-1,  7,  7],
       [ 4,  5,  4]])
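
The three cases seen above (plain assignment, slicing and `.copy()`) can be told apart with `np.shares_memory`. This is a minimal sketch, and the names `original`, `alias`, `view` and `independent` are made up for illustration.

```python
import numpy as np

original = np.array([[3, 7, 7], [4, 5, 4]])

alias = original                # same object under another name
view = original[:]              # new array object that shares the same data
independent = original.copy()   # new array with its own data

print(alias is original)                          # True
print(np.shares_memory(original, view))           # True
print(np.shares_memory(original, independent))    # False

view[0, 0] = -1            # writing through the view changes the original
print(original[0, 0])      # -1
print(independent[0, 0])   # 3
```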

In [108]:
for item in a:
    for another_item in item:
        print(item, another_item)

[-1  7  7] -1
[-1  7  7] 7
[-1  7  7] 7
[4 5 4] 4
[4 5 4] 5
[4 5 4] 4


In [109]:
a.shape

(2, 3)

In [110]:
a

array([[-1,  7,  7],
       [ 4,  5,  4]])

In [111]:
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        print(i, j, a[i,j])

0 0 -1
0 1 7
0 2 7
1 0 4
1 1 5
1 2 4
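
As an alternative to nested `range` loops, `np.ndenumerate` yields each index tuple together with its value. The sketch below collects them in a list, where `grid` and `visited` are illustrative names.

```python
import numpy as np

grid = np.array([[-1, 7, 7],
                 [4, 5, 4]])

visited = []
for (i, j), value in np.ndenumerate(grid):
    # (i, j) is the position, value is grid[i, j]
    visited.append((i, j, int(value)))

print(len(visited))   # 6
print(visited[0])     # (0, 0, -1)
print(visited[-1])    # (1, 2, 4)
```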


# Reshaping arrays : .reshape()

In [112]:
a = np.arange(1, 10)
a

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [113]:
# Reshape the 9-element 1-D array to 3x3
a.reshape((3,3))

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
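
Reshaping only works when the new shape holds exactly the same number of elements. The sketch below also uses `-1`, which asks numpy to infer one dimension. The variable names are made up.

```python
import numpy as np

a = np.arange(1, 10)       # 9 elements

square = a.reshape(3, 3)
auto = a.reshape(3, -1)    # -1 lets numpy work out the missing dimension

print(auto.shape)                    # (3, 3)
print(np.array_equal(square, auto))  # True

# 9 elements cannot fill a 2 x 5 array, so this raises a ValueError
reshape_failed = False
try:
    a.reshape(2, 5)
except ValueError as err:
    reshape_failed = True
    print('ValueError:', err)
```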

# Concatenating : np.concatenate()

In [114]:
# with lists, x + y concatenates; with arrays, use np.concatenate
x = np.array([1, 2, 3])
y = np.array([6, 7, 8])

np.concatenate([x, y])

array([1, 2, 3, 6, 7, 8])

In [115]:
# + on arrays: element-wise sum
x + y

array([ 7,  9, 11])

In [116]:
x.shape

(3,)
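
For 2-D arrays, `np.concatenate` takes an `axis` argument that chooses whether to stack rows or columns. Here is a minimal sketch with two hypothetical 1 x 3 arrays, `top` and `bottom`.

```python
import numpy as np

top = np.array([[1, 2, 3]])
bottom = np.array([[6, 7, 8]])

stacked_rows = np.concatenate([top, bottom], axis=0)   # one array below the other
side_by_side = np.concatenate([top, bottom], axis=1)   # one array next to the other

print(stacked_rows.shape)      # (2, 3)
print(side_by_side.shape)      # (1, 6)
print(side_by_side.tolist())   # [[1, 2, 3, 6, 7, 8]]
```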

---