## NumPy

- NumPy (short for Numerical Python) provides an efficient interface for storing and operating on dense data buffers.
- NumPy arrays resemble Python’s built-in list type, but they provide much more efficient storage and data operations as the arrays grow larger.
- NumPy arrays form the core of nearly the entire ecosystem of data science tools in Python.


In [None]:
import numpy as np

- The NumPy ndarray: a multidimensional array object
    - The N-dimensional array object, or ndarray, is a fast, flexible container for large data sets.
    - Arrays let you perform mathematical operations on whole blocks of data using syntax similar to the equivalent operations on scalars.
    - An ndarray is a generic multidimensional container for homogeneous data: all of the elements must be the same type.
    - Every array has two key attributes:
        - shape, a tuple indicating the size of each dimension
        - dtype, an object describing the data type of the array
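
As a small illustration of the homogeneity rule and the two attributes described above, the sketch below builds two arrays and inspects them. The variable names `mixed` and `grid` are made up for this example.

```python
import numpy as np

# Mixed ints and floats are upcast to a single common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64

# shape is a tuple with one entry per dimension.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.shape)    # (2, 3)
print(grid.dtype)    # an integer dtype; the exact width depends on the platform
```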

### Creating ndarrays
- The array function (np.array) accepts any sequence-like object (including other arrays) and produces a new NumPy array containing the passed data.


In [None]:
data1 = [1,3,5,7,9,11]

In [None]:
arr1 = np.array(data1)

In [None]:
arr1

In [None]:
data2 = [1,3.7,5,7,9,11.4]

In [None]:
arr2 = np.array(data2)

In [None]:
arr2

In [None]:
arr2 *100

In [None]:
arr2+arr1

In [None]:
arr2.ndim

In [None]:
arr2.shape

In [None]:
data_2 = [[1,2,3],[4,5,6]]

In [None]:
arr_2 = np.array(data_2)

In [None]:
arr_2

In [None]:
arr_2.ndim

In [None]:
arr_2.shape

In [None]:
arr_3=np.array([[[1,2],[3,4]],[[5,6],[7,8]]])

In [None]:
arr_3.shape

In [None]:
arr_3.ndim

In [None]:
arr_3

- Other techniques for initializing ndarrays

In [None]:
np.zeros(16)

In [None]:
np.zeros((3,4))

In [None]:
np.empty((3,2,3)) #allocates without initializing: the values are arbitrary, not guaranteed zeros

In [None]:
np.ones((2,2))

In [None]:
np.eye(4)

- When constructing an array, you can specify the data type with the dtype argument, for example as a string

In [None]:
np.ones(10, dtype='float32')

### Basic array manipulations
- Attributes of arrays
    - Determining the size, shape, memory consumption, and data types of arrays


In [None]:
a1 = np.random.randint(10, size = 10) #one-dimensional array
a2 = np.random.randint(10, size = (10,4)) #two-dimensional array
a3 = np.random.randint(10, size = (10,3,3)) #three-dimensional array

In [None]:
a1.ndim

In [None]:
a1.shape

In [None]:
a2.shape

In [None]:
a3.dtype

In [None]:
a3.itemsize #the size (in bytes) of each array element

In [None]:
a3.nbytes #the total size (in bytes) of the array
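
The size attributes are related in a simple way: the total byte count is the number of elements times the bytes per element. A minimal check, using a hypothetical array `block` with an explicitly chosen dtype so the numbers are predictable:

```python
import numpy as np

# int32 is requested explicitly so that itemsize is 4 on every platform.
block = np.zeros((10, 3, 3), dtype=np.int32)

print(block.size)      # 90 elements (10 * 3 * 3)
print(block.itemsize)  # 4 bytes per element
print(block.nbytes)    # 360 bytes in total
```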

- Indexing and slicing arrays
    - Getting and setting the value of individual array elements
    - Getting and setting smaller subarrays within a larger array

In [None]:
arr = np.arange(10)

In [None]:
arr

In [None]:
arr[5]

In [None]:
arr[2:5]

In [None]:
arr[2:5] = 111 #the scalar is assigned to every element of the slice; a slice is a view, not a copy, so the change is made in arr itself

In [None]:
arr
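
Because slices are views, keeping an independent piece of an array requires an explicit copy. The following sketch contrasts the two cases; the names `source`, `view`, and `independent` are illustrative only.

```python
import numpy as np

source = np.arange(10)

view = source[2:5]                 # a view: shares memory with source
view[:] = 111                      # writes through to source

independent = source[5:8].copy()   # an explicit copy with its own memory
independent[:] = -1                # source is not affected

print(source)                      # [  0   1 111 111 111   5   6   7   8   9]
```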

In [None]:
lista= [0,1,2,3,4,5,6,7,8,9]

In [None]:
lista

In [None]:
lista[2:5]

In [None]:
lista[2:5]=111 #raises TypeError: unlike a NumPy array, a list slice can only be assigned an iterable

In [None]:
array2D = np.array([[1,2,3],[4,5,6],[7,8,9]])

In [None]:
array2D[1]

In [None]:
array2D[1][0]

In [None]:
array2D[1,0]

In [None]:
array2D[1][0]=44

In [None]:
array2D

In [None]:
array2D[1:,1:]

In [None]:
array2D[2:,1:]

In [None]:
array2D[:,1:]

- Boolean indexing

In [None]:
days = np.array(['Mon', "Tue", "Sat", "Sat", "Thu", "Fri", "Sat"])

In [None]:
data = np.random.randn(7,5)

In [None]:
data

In [None]:
days=="Sat"

In [None]:
data[days=="Sat"]

In [None]:
data[days=="Mon"]

In [None]:
data[days!="Mon"]

In [None]:
data[~(days=="Mon")]

In [None]:
data[(days=="Mon") | (days=="Sat")]

In [None]:
data[data<0]

In [None]:
data[data<0]=0

In [None]:
data
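
Boolean conditions can also be combined across different arrays, as long as the resulting mask has a matching length. A minimal sketch with made-up arrays `labels` and `values`; note that the Python keywords `and` and `or` do not work element-wise, so `&`, `|`, and `~` are used instead.

```python
import numpy as np

labels = np.array(["a", "b", "a", "c"])
values = np.array([10, -20, 30, -40])

# Parentheses are required: & binds more tightly than == and >.
mask = (labels == "a") & (values > 15)
selected = values[mask]

print(mask)       # [False False  True False]
print(selected)   # [30]
```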

- Fancy Indexing
    - Fancy indexing is a term adopted by NumPy to describe indexing using integer arrays.
    - To select out a subset of the rows in a particular order, you can simply pass a list or ndarray of integers specifying the desired order

In [None]:
arr = np.empty((10,6))

In [None]:
for i in range(10):
    arr[i] = i

In [None]:
arr

In [None]:
arr[[2,1,4]]

In [None]:
arr[np.array([1,5,4,3,6,6,7])]
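
Two further properties of fancy indexing are worth seeing in a small example: passing one index array per axis selects individual elements pairwise, and the result is a copy rather than a view. The array `grid` below is a hypothetical stand-in.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

# The index arrays are paired up: this selects (0, 1), (2, 3) and (1, 0).
picked = grid[[0, 2, 1], [1, 3, 0]]
print(picked)        # [ 1 11  4]

# Fancy indexing returns a copy, so modifying it leaves grid unchanged.
rows = grid[[2, 0]]
rows[:] = 0
print(grid[2])       # [ 8  9 10 11]
```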

- Reshaping of arrays
    - Changing the shape of a given array


In [None]:
arr.shape

In [None]:
arr

In [None]:
arr.reshape((6,10))

In [None]:
arr.reshape((3,2,10))
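
A reshape only works when the new shape holds exactly the same number of elements. The sketch below, on a made-up array `flat`, shows the `-1` shorthand for letting NumPy infer one dimension and what happens when the sizes do not match.

```python
import numpy as np

flat = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size.
table = flat.reshape(3, -1)
print(table.shape)   # (3, 4)

# A size mismatch raises ValueError rather than truncating or padding.
try:
    flat.reshape(5, 3)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)   # True
```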

- Changing the data type of an array
    - The astype method returns a new array with the elements converted to the requested type

In [None]:
x = np.array([[2, 4, 6], [6, 8, 10]], np.int32)

In [None]:
y= x.astype(float)

In [None]:
y

- Joining and splitting of arrays
    - Combining multiple arrays into one, and splitting one array into many
        - np.concatenate takes a tuple or list of arrays as its first argument

In [None]:
x = np.array([1,2,3,4,5])
y = np.array([6,7,8,9,10])

In [None]:
np.concatenate([x,y])

In [None]:
np.concatenate([x,y,x,x,y])

In [None]:
arrSmall = np.array([[1,2,3],[4,5,6]])

In [None]:
arrSmall

In [None]:
np.concatenate([arrSmall,arrSmall])

In [None]:
np.concatenate([arrSmall,arrSmall], axis =1)

- For working with arrays of mixed dimensions, it can be clearer to use the np.vstack (vertical stack) and np.hstack (horizontal stack) functions
- The opposite of concatenation is splitting, which is implemented by the function np.split; its second argument is a list of indices giving the split points


In [None]:
x = np.array([1,2,3])
y = np.array([[4,5,6],[7,8,9]])

In [None]:
np.vstack([x,y])

In [None]:
np.hstack([y,y])

In [None]:
x = np.hstack([x,x,x,x])

In [None]:
x

In [None]:
np.split(x,[3,6,7])

In [None]:
z,k = np.split(y,[0]) #splitting at index 0: z is empty (shape (0, 3)) and k contains all of y

In [None]:
z

In [None]:
k
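
The second argument of np.split can be either an integer or a list of indices. A short sketch of both forms on a hypothetical one-dimensional array `values`:

```python
import numpy as np

values = np.arange(12)

# An integer asks for that many equal-sized pieces.
thirds = np.split(values, 3)
print([part.tolist() for part in thirds])
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# A list gives the indices at which to cut: N indices produce N + 1 pieces.
head, middle, tail = np.split(values, [2, 9])
print(head, tail)   # [0 1] [ 9 10 11]
```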

## Computation on NumPy Arrays: Universal Functions

- Computation on NumPy arrays can be very fast, or it can be very slow. 
    - The key to making it fast is to use vectorized operations, generally implemented through NumPy’s universal functions (ufuncs). 

### UFuncs

- The vectorized approach is designed to push the loop into the compiled layer that underlies NumPy, leading to much faster execution.
- You can accomplish this by simply performing an operation on the array, which will then be applied to each element. 



- Any arithmetic operation between equal-size arrays is applied elementwise
- An arithmetic operation between an array and a scalar applies the scalar to each element
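
To make the contrast concrete, here is a minimal sketch that computes reciprocals twice, once with an explicit Python loop and once with a vectorized expression. The array `values` is made up for the example; both versions give the same numbers, but only the second runs its loop in compiled code.

```python
import numpy as np

values = np.arange(1, 6)

# Explicit Python loop: one interpreted iteration per element.
looped = np.empty(len(values))
for i in range(len(values)):
    looped[i] = 1.0 / values[i]

# Vectorized form: the loop runs inside NumPy's compiled code.
vectorized = 1.0 / values

print(np.allclose(looped, vectorized))   # True
```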

In [None]:
arr1 = np.random.randint(10,size= (10,10))
arr2 = np.random.randint(10,size= (10,10))

In [None]:
arr1

In [None]:
arr1 *125

In [None]:
arr1 + arr2

In [None]:
arr1 * arr2 - arr1 / ( arr2 +1)

- Transposing arrays and computing the matrix product

In [None]:
arr1

In [None]:
arr1.T

In [None]:
np.dot(arr1,arr1.T)

- Other operations


<img src="fig/ufunc_numpy.png" alt="table" width="700"/>
<img src="fig/ufunc_bin_numpy.png" alt="table" width="700"/>

In [None]:
np.add(arr1,arr2) #np.subtract, np.multiply and np.divide work the same way

In [None]:
np.sqrt(arr2)

In [None]:
np.max(arr1) #the single largest value in arr1

In [None]:
arr1

In [None]:
arr2

In [None]:
np.maximum(arr1, arr2) #element-wise maximum of the two arrays

- np.where
    - np.where(condition, x, y) returns an array that takes its elements from x where condition is True and from y elsewhere.

In [None]:
a = np.arange(10)

In [None]:
a

In [None]:
np.where(a>5,0,10)

In [None]:
b = np.arange(10,20)

In [None]:
b

In [None]:
np.where(a%2==1,a,b)

## Mathematical and Statistical Methods

- A set of mathematical functions which compute statistics about an entire array or about the data along an axis are accessible as array methods. 


<img src="fig/stat_numpy.png" alt="table" width="500"/>

In [None]:
a.mean()

In [None]:
a.sum()

In [None]:
a.std()

In [None]:
a.var()

In [None]:
a.cumsum()

In [None]:
b.cumprod()
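
The cells above work on one-dimensional arrays, so they do not show the "along an axis" case. A small sketch with a hypothetical two-dimensional array `table` illustrates how the `axis` argument changes the result:

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(table.sum())          # 21, over all elements
print(table.sum(axis=0))    # [5 7 9], one value per column
print(table.mean(axis=1))   # [2. 5.], one value per row
```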

## Boolean Arrays

- Boolean values are coerced to 1 (True) and 0 (False) in the above methods. 
- Sum is often used as a means of counting True values in a boolean array


In [None]:
boola = np.random.randint(2, size=100) #random 0/1 integers standing in for False/True

In [None]:
boola

In [None]:
boola.sum()

In [None]:
bools = np.array([True,True,False,False, True])

In [None]:
bools.sum()

In [None]:
bools.any()

In [None]:
bools.all()

In [None]:
bools[:2].all()
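
In practice the boolean array usually comes from a comparison rather than being typed out. A minimal sketch, with made-up data in `readings`, of counting and testing elements that satisfy a condition:

```python
import numpy as np

readings = np.array([3, -1, 4, -1, 5, -9])

negative = readings < 0    # boolean array, one entry per reading
print(negative.sum())      # 3 negative values
print(negative.any())      # True: at least one is negative
print(negative.all())      # False: not all are negative
```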

## Sorting
- Like Python’s built-in list type, NumPy arrays can be sorted in-place using the sort method

In [None]:
boola.sort()

In [None]:
boola

In [None]:
arrsort =np.random.randn(4,3)

In [None]:
arrsort

In [None]:
arrsort.sort(0) #sort along axis 0: each column is sorted independently

In [None]:
arrsort

In [None]:
arrsort.sort(1) #sort along axis 1: each row is sorted independently

In [None]:
arrsort
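
When the original order must be preserved, the top-level function np.sort can be used instead of the in-place method, since it returns a sorted copy. The sketch below also shows np.argsort, which is not covered above and is included here only as a related illustration; the array `original` is made up.

```python
import numpy as np

original = np.array([3, 1, 2])

sorted_copy = np.sort(original)   # returns a new, sorted array
print(sorted_copy)                # [1 2 3]
print(original)                   # [3 1 2], unchanged

order = np.argsort(original)      # indices that would sort the array
print(order)                      # [1 2 0]
```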

### Unique and Other Set Logic

- NumPy has some basic set operations for one-dimensional ndarrays.

In [None]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

In [None]:
np.unique(names)

<img src="fig/unique_numpy.png" alt="table" width="500"/>
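
Beyond np.unique, a few of the other one-dimensional set functions can be sketched as follows. The arrays `left` and `right` are hypothetical inputs chosen so the results are easy to check by eye.

```python
import numpy as np

left = np.array([1, 2, 3, 4])
right = np.array([3, 4, 5])

print(np.intersect1d(left, right))   # [3 4]
print(np.union1d(left, right))       # [1 2 3 4 5]
print(np.setdiff1d(left, right))     # [1 2]
print(np.isin(left, right))          # [False False  True  True]
```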