# Numpy

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.

[Numpy.org](https://www.numpy.org/)
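To illustrate the point about arbitrary data types, here is a minimal sketch of a structured dtype. The field names (`name`, `age`) and the sample records are made up for illustration and do not come from the NumPy site.

```python
import numpy as np

# Hypothetical record layout: a 10-character string and a 32-bit integer.
person = np.dtype([("name", "U10"), ("age", "i4")])

# Each tuple becomes one record in the array.
people = np.array([("Ann", 31), ("Bob", 25)], dtype=person)

# Fields can be pulled out by name, much like columns in a table.
print(people["name"])
print(people["age"].mean())
```

This table-like access by field name is one reason NumPy arrays map well onto rows coming from a database.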

NumPy is widely used in data science, since almost all of the PyData ecosystem relies on it. For numerical work, NumPy arrays are usually a better choice than plain Python lists because they use less memory and are faster to operate on. For more info on why you would want to use arrays instead of lists, check out this great [StackOverflow post](http://stackoverflow.com/questions/993984/why-numpy-instead-of-python-lists).
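As a rough check of the memory claim, the sketch below compares the footprint of a list of integers with an equivalent array. The size `n = 1000` and the `int64` dtype are arbitrary choices for illustration, and the exact list size depends on the Python build.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list holds pointers to separate int objects, so count the container
# and every element it points to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores the raw values in one contiguous buffer (8 bytes each here).
arr_bytes = np_arr.nbytes

print("list :", list_bytes, "bytes")
print("array:", arr_bytes, "bytes")
```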

In this course, we will learn the basics of NumPy for data analysis (vectors, arrays, and matrices) before moving on to Pandas, which is more SQL-like.


## Numpy Arrays
NumPy arrays can have any number of dimensions, but the two kinds used most in this course are vectors and matrices. Vectors are strictly 1-d arrays and matrices are 2-d (note that a matrix can still have only one row or one column).
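The difference between a vector and a one-row or one-column matrix shows up in `ndim` and `shape`. A small sketch, with variable names chosen only for illustration:

```python
import numpy as np

vector = np.array([1, 2, 3])             # 1-d: a vector
row_matrix = np.array([[1, 2, 3]])       # 2-d: a matrix with one row
col_matrix = np.array([[1], [2], [3]])   # 2-d: a matrix with one column

for name, a in [("vector", vector),
                ("row_matrix", row_matrix),
                ("col_matrix", col_matrix)]:
    # ndim is the number of dimensions, shape is the size along each one.
    print(name, a.ndim, a.shape)
```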




### Numpy from Python Lists

In [1]:
my_list = [1,2,3]
my_list

[1, 2, 3]

In [2]:
import numpy as np

In [3]:
np.array(my_list)

array([1, 2, 3])

In [4]:
#list of lists
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
my_matrix

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [5]:
np.array(my_matrix)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

### arange, zeros & ones, linspace, rand, randn, randint, eye

In [6]:
#np.arange(start, stop): stop is excluded
np.arange(0,10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [7]:
np.arange(0,11,2)

array([ 0,  2,  4,  6,  8, 10])

In [10]:
#np.zeros and ones
np.zeros((3,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [12]:
np.ones(3)

array([1., 1., 1.])

In [13]:
np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [14]:
#identity matrix
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [18]:
#linspace: returns evenly spaced numbers within a specified range
np.linspace(0,10,5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])
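`arange` and `linspace` are easy to mix up: the third argument of `arange` is a step size, while the third argument of `linspace` is the number of points. A short sketch of the difference, using values chosen for illustration:

```python
import numpy as np

# arange: step of 2.5, the stop value (10) is excluded.
by_step = np.arange(0, 10, 2.5)

# linspace: 5 points, the stop value (10) is included by default.
by_count = np.linspace(0, 10, 5)

print(by_step)   # 4 values: 0, 2.5, 5, 7.5
print(by_count)  # 5 values: 0, 2.5, 5, 7.5, 10
```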

In [33]:
#np.random.randint
np.random.seed(52)
np.random.randint(1,100)

29

In [35]:
#np.random.randn: samples from the standard normal distribution (mean 0, variance 1)
np.random.randn(3,3)

array([[ 0.14620613,  2.2414261 ,  0.34679536],
       [ 0.7603473 ,  0.7597781 ,  1.451122  ],
       [-0.44436722,  0.74676116, -0.13459157]])

In [38]:
#np.random.rand: samples from a uniform distribution over [0, 1)
np.random.rand(3,3)

array([[0.38049568, 0.09495426, 0.32489071],
       [0.41511219, 0.74227395, 0.65790887],
       [0.20131683, 0.80848791, 0.78640244]])
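The sketch below summarises how the three random functions differ and shows that re-seeding reproduces the same draws. The sample size of 1000 is an arbitrary choice for illustration.

```python
import numpy as np

# Re-seeding with the same value reproduces the same sequence.
np.random.seed(52)
first = np.random.rand(3)
np.random.seed(52)
second = np.random.rand(3)
print(np.array_equal(first, second))  # True

ints = np.random.randint(1, 100, size=1000)  # integers, low included, high excluded
uniform = np.random.rand(1000)               # floats, uniform over [0, 1)
normal = np.random.randn(1000)               # floats, standard normal, unbounded

print(ints.min(), ints.max())
print(uniform.min(), uniform.max())
print(normal.mean(), normal.std())
```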

### array shape and reshape

In [40]:
arr = np.arange(25)

In [42]:
arr.shape

(25,)

In [43]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

In [46]:
#reshape to a 1 by 25 matrix (one row, 25 columns)
arr.reshape(1,25)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 21, 22, 23, 24]])
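The same 25 elements can also be arranged as a 5 by 5 matrix. A short sketch of that, along with the `-1` shorthand and what happens when the sizes do not match:

```python
import numpy as np

arr = np.arange(25)

square = arr.reshape(5, 5)      # 5 rows, 5 columns
inferred = arr.reshape(5, -1)   # -1 asks NumPy to work out the missing dimension
print(square)
print(inferred.shape)

# The new shape must hold exactly the same number of elements.
try:
    arr.reshape(4, 6)           # 24 slots for 25 elements
except ValueError as err:
    print("reshape failed:", err)

# reshape returns a new array object; arr keeps its original shape.
print(arr.shape)
```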

## Numpy Built-in methods

In [47]:
ranarr = np.random.randint(0,50,10)

In [48]:
ranarr

array([15, 31, 21, 26,  0, 26, 38, 30, 49,  9])

In [49]:
ranarr.max()

49

In [50]:
ranarr.min()

0

In [52]:
ranarr.argmax()

8
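A few related methods follow the same pattern. The sketch below rebuilds the array from the values printed above so that it does not depend on the random state.

```python
import numpy as np

# Same values as the ranarr shown above, typed in by hand.
ranarr = np.array([15, 31, 21, 26, 0, 26, 38, 30, 49, 9])

print(ranarr.argmin())           # index of the smallest value: 4
print(ranarr[ranarr.argmax()])   # value at the index of the maximum: 49
print(ranarr.mean())             # 24.5
```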

## Numpy array indexing, selection with comparison operators, and broadcasting

In [54]:
ranarr

array([15, 31, 21, 26,  0, 26, 38, 30, 49,  9])

In [56]:
ranarr[3:]

array([26,  0, 26, 38, 30, 49,  9])

In [57]:
arr_2d = np.array([[5,10,15],[20,25,30],[35,40,45]])

#Show
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [58]:
arr_2d[0]

array([ 5, 10, 15])

In [59]:
arr_2d[0][1]

10

In [61]:
arr_2d[arr_2d>3]

array([ 5, 10, 15, 20, 25, 30, 35, 40, 45])
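The cells above can be extended with comma-style indexing, an explicit boolean mask, and broadcasting. This is a minimal sketch; the threshold `20` and the row `[1, 0, -1]` are arbitrary values chosen for illustration.

```python
import numpy as np

arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# Comma-style indexing is equivalent to arr_2d[0][1].
print(arr_2d[0, 1])      # 10

# Slicing both axes: first two rows, last two columns.
print(arr_2d[:2, 1:])

# A comparison produces a boolean mask; indexing with it keeps the True entries.
mask = arr_2d > 20
print(arr_2d[mask])      # [25 30 35 40 45]

# Broadcasting: the scalar is applied to every element.
print(arr_2d + 100)

# Broadcasting: the 1-d row is applied to every row of the matrix.
print(arr_2d * np.array([1, 0, -1]))
```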

## Numpy Operations

In [62]:
#sum of the array's elements (ranarr.sum() is the idiomatic NumPy equivalent)
sum(ranarr)

245

In [63]:
arr_2d+arr_2d

array([[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]])

In [64]:
#finding standard deviation of the dataset
ranarr.std()

13.425721582097552

In [65]:
np.std(ranarr)

13.425721582097552

In [66]:
#element-wise division; 0/0 yields nan and NumPy emits a RuntimeWarning
ranarr/ranarr

array([ 1.,  1.,  1.,  1., nan,  1.,  1.,  1.,  1.,  1.])

In [67]:
ranarr

array([15, 31, 21, 26,  0, 26, 38, 30, 49,  9])
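Two follow-ups to the operations above: summing along an axis, and dividing without the warning. This is a sketch under the assumption that a result of `0` is acceptable where the divisor is zero; the arrays are retyped from the outputs shown earlier.

```python
import numpy as np

ranarr = np.array([15, 31, 21, 26, 0, 26, 38, 30, 49, 9])
arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

print(ranarr.sum())          # 245
print(arr_2d.sum(axis=0))    # column sums: [60 75 90]
print(arr_2d.sum(axis=1))    # row sums: [ 30  75 120]

# Option 1: silence the warning locally and keep the nan.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = ranarr / ranarr
print(np.isnan(ratio))

# Option 2: divide only where the divisor is non-zero; other slots keep
# the value already in `out` (0.0 here, an assumption for this example).
safe = np.divide(ranarr, ranarr,
                 out=np.zeros(ranarr.shape),
                 where=ranarr != 0)
print(safe)
```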