# Introduction to NumPy

Datasets can come from a wide range of sources and a wide range of formats, including collections of documents, collections of images, collections of sound clips, collections of
numerical measurements, or nearly anything else. Despite this apparent heterogeneity, it will help us to think of all data fundamentally as arrays of numbers.


For example, images—particularly digital images—can be thought of as simply two-
dimensional arrays of numbers representing pixel brightness across the area. Sound
clips can be thought of as one-dimensional arrays of intensity versus time. Text can be
converted in various ways into numerical representations, perhaps binary digits representing the frequency of certain words or pairs of words. No matter what the data
are, the first step in making them analyzable will be to transform them into arrays of
numbers.

For this reason, efficient storage and manipulation of numerical arrays is absolutely
fundamental to the process of doing data science. Python has specialized tools for
handling such numerical arrays: the NumPy package and the Pandas package.

Numerical Python, or "NumPy" for short, is the fundamental package for scientific computing in Python and the foundation on which many of the most common data science packages are built. It provides a high-performance multidimensional array object, the ndarray, which we can use as a vector or a matrix, along with various derived objects (such as masked arrays and matrices). It also offers an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more. The ndarray, a time- and space-efficient multidimensional array, is at the core of NumPy.

In some ways, NumPy arrays are like Python’s built-in list type, but NumPy arrays provide much
more efficient storage and data operations as the arrays grow larger in size. NumPy
arrays form the core of nearly the entire ecosystem of data science tools in Python, so
time spent learning to use NumPy effectively will be valuable no matter what aspect
of data science interests you.
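As a rough illustration of the difference, the sketch below doubles the same values stored first in a list and then in a NumPy array. The variable names are illustrative only.

```python
import numpy as np

values = list(range(5))          # plain Python list
arr = np.array(values)           # the same data as a NumPy array

# A list needs an explicit loop (here a comprehension) to double each element
doubled_list = [v * 2 for v in values]

# A NumPy array applies the operation to every element at once
doubled_arr = arr * 2

print(doubled_list)
print(doubled_arr)
print(arr.dtype)                 # a single dtype shared by every element
```

Note that `values * 2` on a list would repeat the list rather than double its elements.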

![images/numpy-array-xyz-axis.png](attachment:numpy-array-xyz-axis.png)

Python offers several different options for storing data in efficient, fixed-type data
buffers. The built-in array module can be used to create
dense arrays of a uniform type:

In [None]:
import array
L = list(range(10))
A = array.array('i', L)
A

Much more useful, however, is the ndarray object of the NumPy package. While
Python’s array object provides efficient storage of array-based data, NumPy adds to
this efficient operations on that data.

In [None]:
import numpy as np
np.__version__

### NumPy Standard Data Types

![images/Screenshot%20from%202019-09-18%2014-34-12.png](attachment:Screenshot%20from%202019-09-18%2014-34-12.png)

### Creating Arrays from Python Lists

In [None]:
#1D array

a = np.array([5,6,7])
a

Remember that unlike Python lists, a NumPy array is constrained to hold elements of
a single type. If the types do not match, NumPy will upcast if possible (here, integers are
upcast to floating point):

In [None]:
#2D array

b = np.array([(2,3.5,4),(9,7,6)])
b
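To see the upcasting directly, one option is to inspect the dtype attribute of the resulting arrays. This is a small sketch with illustrative variable names:

```python
import numpy as np

ints = np.array([1, 2, 3])        # all integers
mixed = np.array([1, 2.5, 3])     # one float forces upcasting

print(ints.dtype)
print(mixed.dtype)
print(mixed)
```

The integer array keeps an integer dtype, while a single floating-point value is enough to make every element of the second array a float.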

If we want to explicitly set the data type of the resulting array, we can use the dtype
keyword

In [None]:
#3D array

c = np.ones((2,3,2),dtype = "float32")
c

In [None]:
### Exercise - change the above 3D array dtype to int32

Unlike Python lists, NumPy arrays can explicitly be multidimensional; here’s
one way of initializing a multidimensional array using a list of lists

In [None]:
# nested lists result in multidimensional arrays
d = np.array([range(i, i + 3) for i in [2, 4, 6]])
d

In [None]:
### Exercise - Check the shape of all the above arrays using x.shape

### Create an array using np.arange

The arange() function is used to get evenly spaced values within a given interval.
Values are generated within the half-open interval [start, stop), so stop itself is excluded. For integer arguments the function is equivalent to the Python built-in range function, but returns an ndarray rather than a range object.
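A short sketch of the start, stop and step arguments, using arbitrary values chosen for illustration, shows that the stop value is not included:

```python
import numpy as np

r = np.arange(3, 12, 3)      # start=3, stop=12 (excluded), step=3
print(r)                     # [3 6 9]

# Same values as the built-in range, but stored in an ndarray
print(list(range(3, 12, 3)))
print(type(r))
```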

In [None]:
import numpy as np
np.arange(5)

In [None]:
import numpy as np
np.arange(5.0)

In [None]:
### Exercise 

# Create an array filled with a linear sequence starting at 0, ending at 20, stepping by 2

### Create an array using np.linspace

In [None]:
import numpy as np
np.linspace(0, 1, 5)

In [None]:
### Exercise 

# Create an array of ten values evenly spaced between 0 and 2

### Create an array using np.random.random

In [None]:
import numpy as np
np.random.random((3, 3))

In [None]:
import numpy as np
np.random.randint(0, 10, (3, 3))

In [None]:
### Exercise

# Create a 2 * 2 array of uniformly distributed random values between 0 and 1
# Create a 4 * 4 array of random integers in the interval [0, 100]

In [None]:
import numpy as np
np.random.seed(0) # seed for reproducibility
x1 = np.random.randint(10, size=6) # One-dimensional array
x2 = np.random.randint(10, size=(3, 4)) # Two-dimensional array
x3 = np.random.randint(10, size=(3, 4, 5)) # Three-dimensional array

In [None]:
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)
print("dtype:", x3.dtype)
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")

In [None]:
### Exercise 

### Find the ndim,shape,size,dtype,itemsize,nbytes for x1 and x2

### Create a 2-dimensional array of size 2 x 3

In [None]:
import numpy as np
a = np.array([[3, 4, 5], [6, 7, 8]], np.int32)
print(a)
print(a.shape)

### Array Indexing and Slicing

In [None]:
### find the 1st element of the 2nd row in a

a[1,0]

To index from the end of the array, you can use negative indices

In [None]:
a[-1]

Values can also be modified using any of the above index notation:

In [None]:
a[1,0] = 130
a

Keep in mind that, unlike Python lists, NumPy arrays have a fixed type. This means,
for example, that if you attempt to insert a floating-point value to an integer array, the
value will be silently truncated. Don’t be caught unaware by this behavior!

In [None]:
a[1,1] = 3.14 # this will be truncated!
a
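If the fractional part matters, one possible approach is to convert the array to a floating-point dtype before assigning. The sketch below uses astype() and illustrative variable names:

```python
import numpy as np

ints = np.array([1, 2, 3])
ints[0] = 3.99               # stored as 3: the fractional part is dropped
print(ints)

floats = ints.astype(float)  # astype returns a new array with the requested dtype
floats[0] = 3.99             # now the value is kept
print(floats)
```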

Slicing - Syntax x[start:stop:step]

In [None]:
x = np.arange(10)

In [None]:
x[:5] # first five elements

In [None]:
x[5:] # elements from index 5 onward

In [None]:
x[4:7] # middle subarray

In [None]:
x[::2] # every other element

In [None]:
x[1::2] # every other element, starting at index 1

A potentially confusing case is when the step value is negative. In this case, the
defaults for start and stop are swapped. This becomes a convenient way to reverse
an array

In [None]:
x[::-1] # all elements, reversed

In [None]:
x[5::-2] # reversed every other from index 5

In [None]:
### Exercise 

# reversed every element from index 5 of x

In [None]:
import numpy as np
a = np.array([[3, 4, 5], [6, 7, 8], [4,1,8],  [3,9,7]], np.int32)
a

In [None]:
### Separate the 2nd column of a

b = a[:,1]
print(b)

In [None]:
a[0]

In [None]:
a[:2, :3] # two rows, three columns

In [None]:
a[:3, ::2] # first three rows, every other column

In [None]:
a[::-1, ::-1]

In [None]:
### Exercise 

# Try out various slicing and indexing combinations of x1,x2,x3

### Creating copies of arrays

In [None]:
import numpy as np
z = np.array([[3, 4, 5], [6, 7, 8], [4,1,8],  [3,9,7]], np.int32)
a1 = z[:2,:2].copy()
a1

If we now modify this subarray, the original array is not touched:

In [None]:
a1[1,1] = 100

In [None]:
a1

In [None]:
z

In [None]:
### Exercise - See what happens to z if we do not use copy function for a1

### Reshape a 2 * 3 array into a 3 * 2 array

In [None]:
import numpy as np
x = np.array([[2,3,4], [5,6,7]])
print("x shape: ",x.shape)
z = np.reshape(x, (3, 2))
print("z :", z)
print("z shape: ",z.shape)

In [None]:
grid = np.arange(1, 10).reshape((3, 3))
print(grid)

## Ones and Zeros

### Create a new array of 2*2 integers, without initializing entries.

The empty() function is used to create a new array of given shape and type, without initializing entries

In [None]:
np.empty([2,2], int)

### Let X = np.array([[1,2,3], [4,5,6]], np.int32). Create a new array with the same shape and type as X.

The empty_like() function is used to create a new array with the same shape and type as a given array.

In [None]:
X = np.array([[1,2,3], [4,5,6]], np.int32)
np.empty_like(X)

### Create a 3*3 array with ones on the diagonal and zeros elsewhere.

The eye() function is used to create a 2-D array with ones on the diagonal and zeros elsewhere.

In [None]:
np.eye(3,3)

The identity array is a square array with ones on the main diagonal and zeros elsewhere. The identity() function returns the identity array.

In [None]:
np.identity(3)
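The two functions overlap for square arrays, but eye() is more general. As a brief sketch (variable names are illustrative), its k argument shifts the diagonal and its second argument allows a non-square shape:

```python
import numpy as np

square = np.eye(3)            # same values as np.identity(3)
shifted = np.eye(3, k=1)      # ones on the first diagonal above the main one
rect = np.eye(2, 3)           # eye() also allows non-square shapes

print(np.array_equal(square, np.identity(3)))
print(shifted)
print(rect)
```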

### Create a new array of 3*2 float numbers, filled with ones.

In [None]:
np.ones([3,2],float)

In [None]:
import numpy as np
x = np.arange(4)
print(x)

In [None]:
x = x.reshape((2,2))
print(x)

The ones_like() function is used to get an array of ones with the same shape and type as a given array.

In [None]:
np.ones_like(x)

### Create a new array of 3*2 float numbers, filled with zeros.

The zeros() function is used to get a new array of given shape and type, filled with zeros.

In [None]:
import numpy as np
a = (3,2)
np.zeros(a)

In [None]:
np.zeros(6)

In [None]:
np.zeros((3,3,3))

In [None]:
import numpy as np
a = np.arange(4)
a = a.reshape((2, 2))
a

The zeros_like() function is used to get an array of zeros with the same shape and type as a given array.

In [None]:
np.zeros_like(a)

In [None]:
### Exercise

# Create a length-10 integer array filled with zeros
# Create a 3x5 floating-point array filled with 1s

### Create a 3*3 array of 5

The full() function returns a new array of given shape and type, filled with fill_value.

In [None]:
import numpy as np
np.full((3, 3), 5)

In [None]:
### Exercise

# Create a 3x5 array filled with 3.14

### Create a 2*3 array

The asarray() function is used to convert a given input to an array.

In [None]:
import numpy as np
a = [[2, 3, 4],[1,2,3]]
b = np.asarray(a)
print(b.shape)
print(b)

### Create a 2*2 matrix

The asmatrix() function is used to interpret the input as a matrix.

In [None]:
import numpy as np
x = np.array([[1,2], [3,4]])
n = np.asmatrix(x)
x[0,0] = 5
n

### Separate the diagonal from an array

The diag() function is used to extract a diagonal or construct a diagonal array.

In [None]:
import numpy as np
a = np.arange(12).reshape((4,3))
print(a)
np.diag(a)

### Make an array filled with ones and zeros

The tri() function is used to get an array with ones at and below the given diagonal and zeros elsewhere. It returns an ndarray of shape (N, M) with its lower triangle filled with ones and zeros elsewhere; in other words, T[i,j] == 1 for j <= i + k, and 0 otherwise.

In [None]:
import numpy as np
np.tri(4, 4, 0, dtype=int)
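The effect of the diagonal offset k can be checked against the rule directly. The sketch below reads the rule as T[i, j] == 1 where j <= i + k and rebuilds the array with np.indices; the variable names are illustrative.

```python
import numpy as np

k = -1
t = np.tri(3, 3, k, dtype=int)   # ones strictly below the main diagonal
print(t)

# Rebuild the same array directly from the rule: 1 where j <= i + k
i, j = np.indices((3, 3))
rule = (j <= i + k).astype(int)
print(np.array_equal(t, rule))
```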

### Create a contiguous flattened array from a 2 * 3 array using ravel

The ravel() function is used to create a contiguous flattened array.

In [None]:
import numpy as np
x = np.array([[1, 2, 3], [4, 5, 6]])
print("x")
print(type(x))
print(x.shape)
print(x)
z = np.ravel(x)
print("z")
print(z.shape)
print(type(z))
print(z)

The flatten() function is used to get a copy of a given array collapsed into one dimension.

In [None]:
import numpy as np
y = np.array([[2,3], [4,5]])
y.flatten()
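The practical difference between the two is that ravel() returns a view of the original data when the memory layout allows it, while flatten() always returns a copy. A minimal sketch using np.shares_memory to check this:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])

r = np.ravel(x)      # a view of x when the memory layout allows it
f = x.flatten()      # always a new copy

print(np.shares_memory(x, r))
print(np.shares_memory(x, f))

r[0] = 100           # writing through the view changes x as well
print(x)
```

Because flatten() copied the data before the assignment, `f` still holds the original first value.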

### Change the axes of the array using swapaxes

The swapaxes() function is used to interchange two axes of an array.

In [None]:
import numpy as np
a = np.array([[2,3,4]])
print(a.shape)
b = np.swapaxes(a,0,1)
print(b)
print(b.shape)

In [None]:
import numpy as np
y = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
print(y)
print(y.shape)

In [None]:
z = np.swapaxes(y,1,2)
print(z)
print(z.shape)

### Find the transpose of an array

The T attribute returns the array with its dimensions permuted (the transpose). If the array has fewer than two dimensions, the array itself is returned.

In [None]:
import numpy as np
a = np.array([[2.,3.],[4.,5.]])
print("a",a)
print("a.T",a.T)

### Increase the dimension of an array

The expand_dims() function is used to expand the shape of an array. Insert a new axis that will appear at the axis position in the expanded array shape.  The number of dimensions is one greater than that of the input array.

In [None]:
import numpy as np
a = np.array([2, 4])
print(a)
print(a.shape)
b = np.expand_dims(a, axis=0)
print(b)
print(b.shape)

In [None]:
import numpy as np
a = np.array([2, 4])
print(a)
print(a.shape)
b = np.expand_dims(a, axis=1)
print(b)
print(b.shape)
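The same results can also be obtained by indexing with np.newaxis. The following sketch compares both spellings; the names `row` and `col` are illustrative.

```python
import numpy as np

a = np.array([2, 4])

row = a[np.newaxis, :]    # same result as np.expand_dims(a, axis=0)
col = a[:, np.newaxis]    # same result as np.expand_dims(a, axis=1)

print(row.shape)
print(col.shape)
print(np.array_equal(row, np.expand_dims(a, axis=0)))
print(np.array_equal(col, np.expand_dims(a, axis=1)))
```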

### Squeeze Function

The squeeze() function is used to remove axes of length one from the shape of an array.

In [None]:
import numpy as np
a = np.array([[[0], [2], [4]]])
print(a)
print(a.shape)
b = np.squeeze(a)
print(b.shape)
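By default every axis of length one is removed. As a small sketch, the axis argument can be used to remove only a selected one (variable names are illustrative):

```python
import numpy as np

a = np.array([[[0], [2], [4]]])   # shape (1, 3, 1)

first = np.squeeze(a, axis=0)     # drop only the leading axis
last = np.squeeze(a, axis=2)      # drop only the trailing axis

print(first.shape)
print(last.shape)
```

Asking squeeze() to drop an axis whose length is not one raises an error.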

### Concatenate two arrays of two different shapes

The concatenate() function joins a sequence of arrays along an existing axis. The arrays must have the same shape, except in the dimension corresponding to axis.

In [None]:
import numpy as np
x = np.array([[3, 4], [5, 6]])
y = np.array([[7, 8]])
print("x: ", x)
print("x shape:", x.shape)
print("y : ", y)
print("y shape:", y.shape)
z = np.concatenate((x,y), axis=0)
print("z", z)
print("z shape", z.shape)

In [None]:
import numpy as np
x = np.array([[3, 4]])
y = np.array([[7, 8]])
print("x: ", x)
print("x shape:", x.shape)
print("y : ", y)
print("y shape:", y.shape)
z = np.concatenate((x,y), axis=1)
print("z", z)
print("z shape", z.shape)

In [None]:
### Exercise

## concatenate along the first axis and second axis  - grid = np.array([[1, 2, 3],[4, 5, 6]]) 

### Stacking

The stack() function is used to join a sequence of arrays along a new axis. The axis parameter specifies the index of the new axis in the dimensions of the result. For example, if axis=0 it will be the first dimension and if axis=-1 it will be the last dimension.

In [None]:
import numpy as np
x = np.array([2, 3, 4])
print(x)
print(x.shape)
print("\n")
y = np.array([3, 4, 5])
print(y)
print(y.shape)
print("\n")
z = np.stack((x, y))
print(z)
print(z.shape)
print("\n")
t = np.stack((x,y),axis = -1)
print(t)
print(t.shape)

Stack 1-D arrays as columns into a 2-D array

In [None]:
import numpy as np
x = np.array((3,4,5))
y = np.array((4,5,6))
z = np.column_stack((x,y))
d = np.dstack((x,y))
print(z)

The dstack() is used to stack arrays in sequence depth wise (along third axis).
This is equivalent to concatenation along the third axis after 2-D arrays of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of shape (N,) have been reshaped to (1,N,1). 

In [None]:
import numpy as np
x = np.array((3, 5, 7))
y = np.array((5, 7, 9))
z = np.dstack((x,y))
print(z)
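The reshaping described above can be verified by building the same result by hand. This is a sketch under the stated equivalence, with `manual` as an illustrative name:

```python
import numpy as np

x = np.array((3, 5, 7))
y = np.array((5, 7, 9))

z = np.dstack((x, y))
print(z.shape)            # 1-D inputs of shape (3,) give a (1, 3, 2) result

# The same result built by hand: reshape to (1, N, 1), then join on axis 2
manual = np.concatenate((x.reshape(1, 3, 1), y.reshape(1, 3, 1)), axis=2)
print(np.array_equal(z, manual))
```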

The hstack() function is used to stack arrays in sequence horizontally (column wise).
This is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis.

In [None]:
import numpy as np
x = np.array((3,5,7))
y = np.array((5,7,9))
z = np.hstack((x,y))
print(z)

In [None]:
import numpy as np
x = np.array([[3], [5], [7]])
y = np.array([[5], [7], [9]])
z = np.hstack((x,y))
print(z)

In [None]:
### Exercise

y = np.array([[99],[99]])
grid = np.array([[9, 8, 7],[6, 5, 4]])

# Find np.hstack([grid, y])

The vstack() function is used to stack arrays in sequence vertically (row wise).
This is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N).

In [None]:
import numpy as np
x = np.array([3, 5, 7])
y = np.array([5, 7, 9])
z = np.vstack((x,y))
print(z)

In [None]:
import numpy as np
x = np.array([[3], [5], [7]])
y = np.array([[5], [7], [9]])
z = np.vstack((x,y))
print(z)

In [None]:
### Exercise 

x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],[6, 5, 4]])

#Find np.vstack([x, grid])

### Splitting of arrays

The opposite of concatenation is splitting, which is implemented by the functions
np.split, np.hsplit, and np.vsplit.

In [None]:
x = [1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])
print(x1, x2, x3)
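The second argument of np.split can be either an integer or a list of indices. The sketch below, with illustrative names, shows both forms; a list of N split points produces N + 1 subarrays.

```python
import numpy as np

x = np.arange(6)

# An integer asks for that many equal-sized pieces
halves = np.split(x, 2)

# A list of N indices gives N + 1 pieces: x[:1], x[1:4], x[4:]
parts = np.split(x, [1, 4])

print(halves)
print(parts)
```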

In [None]:
grid = np.arange(16).reshape((4, 4))
grid

In [None]:
upper, lower = np.vsplit(grid, [2])
print(upper)
print(lower)

In [None]:
left, right = np.hsplit(grid, [2])
print(left)
print(right)

### Array arithmetic

In [None]:
x = np.arange(4)
print("x=",x)
print("x + 5 =",x + 5)
print("x - 5 =",x - 5)
print("x * 2 =",x * 2)
print("x / 2 =", x / 2)
print("x // 2 =", x // 2)  # floor division
print("-x= ", -x)
print("x ** 2 = ", x ** 2)
print("x % 2 = ", x % 2)

In [None]:
### Exercise 

# Find -(0.5*x + 1) ** 2

In [None]:
np.add(x, 2)

![images/Screenshot%20from%202019-09-18%2015-51-00.png](attachment:Screenshot%20from%202019-09-18%2015-51-00.png)

In [None]:
### Exercise

# Apply all of the above operators to x

### Absolute value

In [None]:
x = np.array([-2, -1, 0, 1, 2])

In [None]:
abs(x)

In [None]:
np.absolute(x)

In [None]:
np.abs(x)

In [None]:
x = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])
np.abs(x)

### Multidimensional aggregates

In [None]:
M = np.random.random((3, 4))
print(M)

In [None]:
M.sum()

In [None]:
M.min(axis=0)

In [None]:
M.max(axis=1)

### Aggregate Functions

![images/Screenshot%20from%202019-09-18%2015-59-54.png](attachment:Screenshot%20from%202019-09-18%2015-59-54.png)

In [None]:
### Exercise

# Apply all of the above functions to various arrays

### What is the broadcast property?

Broadcasting is simply a
set of rules for applying binary ufuncs (addition, subtraction, multiplication, etc.) on
arrays of different sizes.

In [None]:
import numpy as np
a = np.array([[2], [3], [4]])
print(a.shape)
b = np.array([5, 6, 7])
print(b.shape)
c = a+b
print(c)
print(c.shape)

In [None]:
import numpy as np
a = np.array([0, 1, 2])
b = np.array([5, 5, 5])
a + b

Broadcasting allows these types of binary operations to be performed on arrays of different sizes. For example, we can add the scalar 5 to the array a. We can think of this as an operation that stretches or duplicates the value 5 into the
array [5, 5, 5] and adds the results:

In [None]:
a + 5

![images/b1.png](attachment:b1.png)

In [None]:
### Exercise 

# Create a 3 * 3 array of ones and add a to it

In [None]:
a = np.arange(3)
print(a)
b = np.arange(3)[:, np.newaxis]
print(b)

In [None]:
a + b

Rules of Broadcasting

Broadcasting in NumPy follows a strict set of rules to determine the interaction
between the two arrays:
    
• Rule 1: If the two arrays differ in their number of dimensions, the shape of the
one with fewer dimensions is padded with ones on its leading (left) side.

• Rule 2: If the shape of the two arrays does not match in any dimension, the array
with shape equal to 1 in that dimension is stretched to match the other shape.

• Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is
raised.
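The three rules can be sketched as a small function that works on shape tuples. The helper `broadcast_shape` below is hypothetical and written only for illustration; it is not part of NumPy. Its result is compared with the shape NumPy actually produces.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Apply the three broadcasting rules to two shape tuples (illustrative helper)."""
    # Rule 1: pad the shorter shape with ones on its left side
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(a, b):
        if size_a == size_b or size_b == 1:
            result.append(size_a)   # sizes agree, or b is stretched (Rule 2)
        elif size_a == 1:
            result.append(size_b)   # a is stretched (Rule 2)
        else:
            # Rule 3: sizes disagree and neither is 1
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)

col = np.arange(4).reshape(4, 1)   # shape (4, 1)
row = np.arange(5)                 # shape (5,)

print(broadcast_shape(col.shape, row.shape))
print((col + row).shape)
```

Here the shape (5,) is first padded to (1, 5), and then both arrays are stretched to the common shape (4, 5).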

In [None]:
### Exercise 

M = np.ones((2, 3))
a = np.arange(3)

# Find M + a


N = np.ones((3, 2))
b = np.arange(3)

# Find N + b

### Comparison Operators

In [None]:
x = np.array([1, 2, 3, 4, 5])

In [None]:
x < 3 # less than

In [None]:
x > 3 # greater than

In [None]:
x <= 3 # less than or equal

In [None]:
x >= 3 # greater than or equal

In [None]:
x != 3 # not equal

In [None]:
x == 3 # equal

![images/Screenshot%20from%202019-09-18%2016-25-00.png](attachment:Screenshot%20from%202019-09-18%2016-25-00.png)

In [None]:
### Exercise 

# Find (2 * x) == (x ** 2)
# Create a 2D array x of size 3 * 4 and apply the functions given below to it
# x < 6
#np.count_nonzero(x < 6)
#np.sum(x < 6)
#np.sum(x < 6, axis=1)
#np.any(x > 8)
#np.all(x < 10)


### Sorting

In [None]:
x = np.array([2, 1, 4, 3, 5])
np.sort(x)

A related function is argsort, which instead returns the indices of the sorted
elements:

In [None]:
x = np.array([2, 1, 4, 3, 5])
i = np.argsort(x)
print(i)
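These indices can then be used to index the original array, which gives back the sorted values. A minimal sketch:

```python
import numpy as np

x = np.array([2, 1, 4, 3, 5])
i = np.argsort(x)

# Indexing x with the argsort result gives the sorted array
print(x[i])
print(np.array_equal(x[i], np.sort(x)))
```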

Sorting along rows or columns

In [None]:
rand = np.random.RandomState(42)
X = rand.randint(0, 10, (4, 6))
print(X)

In [None]:
# sort each row of X
np.sort(X, axis=1)

In [None]:
# sort each column of X
np.sort(X, axis=0)

### Delete

In [None]:
import numpy as np
arr = np.array([[0,1,2], [4,5,6], [7,8,9]])
print(arr)
print("\n")
b = np.delete(arr, 1, 0)
print(b)
print("\n")
c = np.delete(arr, 1, 1)
print(c)

### Insert

The insert() function is used to insert values along the given axis before the given indices.

Syntax - numpy.insert(arr, obj, values, axis=None)

In [None]:
import numpy as np
x = np.array([[0,0], [1,1], [2,2]])
print(x)
print("\n")
y = np.insert(x, 2, 4)
print(y)
print("\n")
z = np.insert(x, 2, 4, axis=1)
print(z)
print("\n")

### Append

The append() function is used to append values to the end of a given array.

Syntax - numpy.append(arr, values, axis=None)

In [None]:
import numpy as np
a = np.append([0, 1, 2], [[3, 4, 5], [6, 7, 8]])
print(a)
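With the default axis=None both inputs are flattened before joining. The sketch below (illustrative names) contrasts this with axis=0, and also shows that append() returns a new array instead of changing the original:

```python
import numpy as np

a = np.array([[0, 1, 2], [3, 4, 5]])

flat = np.append(a, [6, 7, 8])             # axis=None: everything is flattened
rows = np.append(a, [[6, 7, 8]], axis=0)   # axis=0: the values become a new row

print(flat)
print(rows)
print(a.shape)                             # the original array is unchanged
```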

### Resize

The resize() function is used to create a new array with the specified shape.
If the new array is larger than the original array, then the new array is filled with repeated copies of a.

In [None]:
import numpy as np
a = np.array([[1,2], [3,4]])
print(a)
print(a.shape)
print("\n")
b = np.resize(a, (3,2))
print(b)
print(b.shape)

### Unique

The unique() function is used to find the unique elements of an array. It returns them in sorted order.

In [None]:
import numpy as np
np.unique([0,1,2,0,2,3,4,3,0,4])
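If the number of occurrences is also of interest, unique() accepts a return_counts argument. A short sketch on the same values, with illustrative variable names:

```python
import numpy as np

data = np.array([0, 1, 2, 0, 2, 3, 4, 3, 0, 4])

# values holds the sorted unique elements, counts how often each one appears
values, counts = np.unique(data, return_counts=True)
print(values)
print(counts)
```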