# NumPy: Numerical Python

Most examples come from 

a) the NumPy website: https://docs.scipy.org/doc/numpy/user/quickstart.html 

b) the book: Python for Data Analysis by Wes McKinney (O'Reilly)

c) the book: Python Data Science Handbook by Jake VanderPlas (O'Reilly) - chapter 2


In [1]:
# the following import statement allows us to import the NumPy module
# and we give it an alias

import numpy as np

In [4]:
# names that start and end with two underscores are called "dunder" (double underscore) names
# they are special names, not private ones; __version__ holds the installed version
np.__version__

'1.16.4'

# NumPy 

NumPy has many built-in functions. The main building block is an n-dimensional array called *ndarray*.

NumPy arrays can be vectors or matrices. Vectors are strictly 1-d arrays and matrices are 2-d arrays; arrays with more dimensions are also possible.

An array is a table of elements, usually numbers, all of the same type. It is indexed by a tuple, e.g., (row, column).
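
As a minimal sketch of indexing by a (row, column) tuple, the example below builds a small 2-d array and reads one element. The names `table` and `vector` and the values are made up for illustration.

```python
import numpy as np

# A small 2-d array (illustrative values only)
table = np.array([[10, 20, 30],
                  [40, 50, 60]])

# Index with a (row, column) tuple: row 1, column 2
value = table[1, 2]      # equivalent to table[(1, 2)]
print(value)             # 60

# A 1-d array (vector) needs only a single index
vector = np.array([7, 8, 9])
print(vector[0])         # 7
```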

Most data (images, sound, text, documents) can be converted to numeric form (arrays of numbers), which is the form best suited to data analysis with NumPy.

In [85]:
# use this to see all available attributes and methods
#np.<TAB>

# use this to pull up documentation
np?

## Creating NumPy arrays

To create an array, you only need to provide a list or tuple of values. The type of the resulting array is deduced from the types of the elements in the sequence.

In [3]:
# from a Python list
import numpy as np
my_list=['a',1,2.9]


#my_list = [10,20,30, 40]
np.array(my_list)

array(['a', '1', '2.9'], dtype='<U3')
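
The string result above is a consequence of type deduction: NumPy picks one type that can hold every element. A short sketch (the variable names are illustrative) that inspects the deduced `dtype` for three inputs:

```python
import numpy as np

ints = np.array([1, 2, 3])          # all integers -> an integer dtype
mixed = np.array([1, 2.5, 3])       # int + float -> float64
with_str = np.array(['a', 1, 2.9])  # any string -> every element becomes a string

print(ints.dtype)      # platform-dependent integer type (e.g. int32 or int64)
print(mixed.dtype)     # float64
print(with_str.dtype)  # a Unicode string type; the exact width depends on the NumPy version
```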

In [30]:
# from a Python tuple

my_tuple = (10,20,30, 40)
np.array(my_tuple)

array([10, 20, 30, 40])

In [63]:
# from a Python 2D list

mat1 = [[1,2,3],[4,5,6],[7,8,9]]
np.array(mat1)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [65]:
# from a custom function - default type is float

np.fromfunction(lambda i, j: i + j, (3, 3))

array([[0., 1., 2.],
       [1., 2., 3.],
       [2., 3., 4.]])

In [67]:
# from a string, specify the separator - default type is float

np.fromstring('1, 2, 3', sep=',')

array([1., 2., 3.])

## Other ways to create special arrays


### arange

Returns an array of evenly spaced values within a given interval.

You can specify the start (inclusive), the end (exclusive), and the step, similar to Python's range() function.

In [8]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [22]:
np.arange(0,10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [23]:
np.arange(0,11,2)

array([ 0,  2,  4,  6,  8, 10])
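
A brief sketch comparing `arange()` with Python's `range()`. The start, end and step values here are arbitrary choices for illustration.

```python
import numpy as np

# Same values as Python's range(), but returned as an array
a = np.arange(2, 10, 3)
print(a)                        # [2 5 8]
print(list(range(2, 10, 3)))    # [2, 5, 8]

# Unlike range(), arange() also accepts a non-integer step
b = np.arange(0, 1, 0.25)
print(b)                        # the values 0, 0.25, 0.5, 0.75 (the end value 1 is excluded)
```

With non-integer steps, rounding can make the number of elements hard to predict, so `linspace` is often the safer choice when the count matters.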

### zeros and ones

The functions zeros() and ones() generate arrays of zeros or ones.

The argument is the shape: an int for a 1-d array, or a list or tuple of dimensions, e.g. zeros([3,3]) or zeros((3,3)).

In [24]:
np.zeros(3)

array([ 0.,  0.,  0.])

In [14]:
np.zeros([3,3])

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [27]:
np.ones(3)

array([ 1.,  1.,  1.])

In [24]:
np.ones([2,2])

array([[1., 1.],
       [1., 1.]])

The function empty() returns an array of uninitialized (arbitrary) data of the given shape.

Object arrays are initialized to None.

In [27]:
# the array values are not initialized and
# depend on the state of memory at the time

np.empty([2,3])

array([[0., 0., 0.],
       [0., 0., 0.]])
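
The zeros in the output above are not guaranteed. A minimal sketch of the usual pattern, allocating with `empty()` and then overwriting every element before reading it, plus the object-array case (`buf` and `objs` are illustrative names):

```python
import numpy as np

# Allocate without initializing, then assign every element before use
buf = np.empty((2, 3))
buf[:] = 7.0            # overwrite all elements
print(buf)

# With dtype=object, the elements are initialized to None
objs = np.empty(2, dtype=object)
print(objs)             # [None None]
```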

full() creates an array of the specified shape and fills it with the given value.

In [30]:
# initialize all values in an array of the shape given in arg1 with the value in arg2
e = np.full((2,3), 3.14)
e

array([[3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14]])

Use zeros_like() or ones_like() to create an array of zeros or ones with the same shape as the array passed as the argument.

In [33]:
np.zeros_like(e, dtype=int)

array([[0, 0, 0],
       [0, 0, 0]])

In [32]:
np.ones_like(e)

array([[1., 1., 1.],
       [1., 1., 1.]])

### identity - eye

Creates an identity matrix: ones on the main diagonal and zeros elsewhere.

In [37]:
np.eye(4)

array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])
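
Beyond the square case shown above, `eye()` also accepts a column count and a diagonal offset `k`, while `identity()` is always square. A small sketch with illustrative variable names:

```python
import numpy as np

I3 = np.identity(3)       # always square, same result as np.eye(3)
rect = np.eye(2, 3)       # 2 rows, 3 columns, ones on the main diagonal
upper = np.eye(3, k=1)    # ones on the first diagonal above the main one

print(rect)
print(upper)
```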

### linspace
Returns evenly spaced numbers over a specified interval. We can specify exactly how many elements we want (as opposed to arange, where we specify the step).

See also logspace, which spaces the values evenly on a log scale.

In [12]:
np.linspace(0,5,3)

array([0. , 2.5, 5. ])

In [13]:
np.linspace(0,10,20)

array([ 0.        ,  0.52631579,  1.05263158,  1.57894737,  2.10526316,
        2.63157895,  3.15789474,  3.68421053,  4.21052632,  4.73684211,
        5.26315789,  5.78947368,  6.31578947,  6.84210526,  7.36842105,
        7.89473684,  8.42105263,  8.94736842,  9.47368421, 10.        ])
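
A short sketch of the options that distinguish `linspace` from `arange`, followed by `logspace`. The intervals and counts are arbitrary illustrative choices.

```python
import numpy as np

# linspace: you choose the number of points; the end value is included by default
pts, step = np.linspace(0, 1, 5, retstep=True)   # retstep=True also returns the spacing
print(pts)     # the values 0, 0.25, 0.5, 0.75, 1
print(step)    # 0.25

# endpoint=False leaves out the end value
half_open = np.linspace(0, 1, 4, endpoint=False)  # 0, 0.25, 0.5, 0.75

# logspace: the first two arguments are exponents (base 10 by default)
powers = np.logspace(0, 2, 3)                     # 10**0, 10**1, 10**2
print(powers)
```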

## Creating array of random elements

### rand()
Creates an array of the given shape and populates it with random samples from a uniform distribution over ``[0, 1)``.

The args are ints giving the size of each dimension.

In [6]:
# set the seed for repeatability
np.random.seed(3)

In [18]:
np.random.rand(3)

array([0.14295718, 0.75671987, 0.02503297])

In [42]:
np.random.rand(3,3)

array([[0.65047686, 0.72393914, 0.47508861],
       [0.59666377, 0.06696942, 0.07256214],
       [0.19897603, 0.151861  , 0.10010434]])
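
To illustrate what setting the seed buys us, this sketch resets the same seed before two calls and checks that the samples match. The seed value 3 follows the cell above; the variable names are illustrative.

```python
import numpy as np

np.random.seed(3)
first = np.random.rand(3)

np.random.seed(3)            # reset to the same seed
second = np.random.rand(3)

# The same seed reproduces the same sequence of samples
print(np.array_equal(first, second))               # True

# All samples fall in the half-open interval [0, 1)
print((first >= 0).all() and (first < 1).all())    # True
```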

### random()

In [41]:
# this picks values from the continuous uniform distribution over [0,1)
# the arg is an int or a tuple of ints
np.random.random((3,3))


array([[0.33984866, 0.57279387, 0.32580716],
       [0.44514505, 0.06152893, 0.24267542],
       [0.97160261, 0.2305842 , 0.69147751]])

### randn()

Return a sample (or samples) from the "standard normal" distribution (mean 0, standard deviation 1).

In [20]:
np.random.randn(3)

array([ 0.99634182,  0.25357991, -0.72042125])

In [76]:
np.random.randn(3,3)

array([[-0.52509337,  1.03423899, -0.24778646],
       [ 0.02611462, -0.56440771, -0.96428119],
       [-0.71731016, -0.12469301,  0.51265729]])

In [10]:
# np.random.<TAB> will pull up the many available distributions
# to use a gaussian/normal distribution with mean 3 and std dev 2.5

np.random.normal(3,2.5, (2,4))

array([[1.8498947 , 2.8551979 , 8.19393534, 1.49671881],
       [5.34809097, 0.04897846, 2.13167456, 3.16574683]])
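
A rough sketch of how `randn()` relates to `normal()`: scaling standard-normal samples by the standard deviation and shifting by the mean gives samples from the same distribution. The sample size and the seed are arbitrary choices made so the statistics are stable.

```python
import numpy as np

mu, sigma = 3, 2.5          # mean and standard deviation from the cell above
n = 100000                  # large sample so the estimates are close

np.random.seed(0)
scaled = mu + sigma * np.random.randn(n)     # shift and scale standard-normal samples
direct = np.random.normal(mu, sigma, n)      # ask for the distribution directly

# Both sample means are close to 3 and both sample std devs are close to 2.5
print(scaled.mean(), scaled.std())
print(direct.mean(), direct.std())
```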

In [13]:
np.random.rand?

### randint
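Returns random integers from the half-open interval [start, end): start is included, end is not. With a single argument, the interval is [0, end).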

In [25]:
np.random.randint(10)

6

In [91]:
np.random.randint(1,100,10)

array([93, 28, 30,  6, 71, 20, 42, 99, 37, 19])

In [7]:
np.random.randint(1,100,(3,3))

array([[25,  4, 57],
       [73,  1, 22],
       [20, 75, 42]])
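
Because the end value is excluded, simulating a six-sided die needs an end of 7. A small sketch (the die example, the seed, and the name `rolls` are illustrative):

```python
import numpy as np

np.random.seed(0)
rolls = np.random.randint(1, 7, 1000)   # values 1..6; 7 itself is never returned

# Every roll stays inside the closed range 1..6
print(rolls.min() >= 1, rolls.max() <= 6)   # True True
```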

# Creating arrays with additional parameters


## Specifying the type of the array

See the slides for the available data types.

In [126]:
# dtype can be a standard Python type (int, float, complex, bool)
# or a NumPy type (np.int32, np.float64, np.double, ...)
# default NumPy types - float64 for floats; the default integer is platform-dependent (int32 or int64)

np.array([ [1,2], [3,4] ], dtype=float)

array([[1., 2.],
       [3., 4.]])

In [48]:
np.array([ [1,2], [3,4] ], dtype=np.double)

array([[1., 2.],
       [3., 4.]])
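
Besides setting `dtype` at creation time, an existing array can be converted with `astype()`, which returns a converted copy. A brief sketch with illustrative names and values; note that converting floats to integers discards the fractional part.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)    # 1 byte per element
as_float = small.astype(np.float64)           # converted copy; small is unchanged

print(small.dtype, as_float.dtype)            # int8 float64

# float -> int conversion truncates toward zero
truncated = np.array([1.7, 2.2, -0.5]).astype(int)
print(truncated)                              # [1 2 0]
```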

# Attributes of ndarray

In [70]:
myarr1 = np.arange(10)
myarr2 = np.array([[1,2,3],[4,5,6]],dtype=float)  # np.float was removed in newer NumPy releases; the built-in float is equivalent

In [73]:
myarr2

array([[1., 2., 3.],
       [4., 5., 6.]])

In [72]:
# ndim shows the number of axes (dimensions)
print(myarr1.ndim)
print(myarr2.ndim)

1
2


In [74]:
# For a vector, the shape holds one value: the number of elements
# For a matrix, the shape is (m, n): (rows, columns)
print(myarr1.shape)
print(myarr2.shape)

(10,)
(2, 3)


In [68]:
# size gives the number of elements
print(myarr1.size)
print(myarr2.size)

10
6


In [93]:
# dtype gives the data type of the elements
print(myarr1.dtype)
print(myarr2.dtype)

int32
float64


In [95]:
# type is a Python function - it tells us the type of the object,
# not the data type it contains
type(myarr1)

numpy.ndarray

In [98]:
# size in bytes of each element in the array
print(myarr1.itemsize)
print(myarr2.itemsize)

4
8
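
The attributes above are related to each other: `size` is the product of the entries in `shape`, and the total memory used by the elements (`nbytes`) is `size * itemsize`. A small sketch using an explicit `float64` array so the numbers do not depend on the platform (`arr` is an illustrative name):

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)

print(arr.ndim)       # 2
print(arr.shape)      # (2, 3)
print(arr.size)       # 6  = 2 * 3
print(arr.itemsize)   # 8  bytes per float64 element
print(arr.nbytes)     # 48 = size * itemsize
```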
