# NumPy Overview

This section gives a brief introduction to NumPy. Because pandas is built on top of NumPy, many NumPy operations are omitted here; they are handled through pandas instead.

NumPy is one of the most widely used Python libraries. Together with pandas, it provides essential tools for data manipulation, analysis, and mathematical operations. This section briefly covers its most common functions.

NumPy is a numerical library built around multi-dimensional arrays and matrices. It offers a large number of mathematical, algebraic, and transformation functions that operate on them.


## NumPy Arrays

NumPy provides a large number of data types. The most frequently used is the n-dimensional array (`ndarray`), which is a grid of values, **all of the same type**, indexed by a tuple of nonnegative integers.
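
As a small illustration of the "same type" rule and of tuple indexing, here is a minimal sketch. The variable names (`mixed`, `ints`, `grid`) are made up for this example.

```python
import numpy as np

# Mixing integers and floats: NumPy upcasts every element to one common type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                              # float64

# An integer-only list keeps an integer dtype (the exact width depends on the platform).
ints = np.array([1, 2, 3])
print(np.issubdtype(ints.dtype, np.integer))    # True

# A 2-D array is indexed by a tuple of nonnegative integers: (row, column).
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid[(1, 2)])                             # 6, the same element as grid[1, 2]
```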

We can create different arrays by using the `array` function or other built-in functions. The code below shows how to create a 1-D array (i.e., a vector) and a 2-D array (i.e., a matrix), as well as how to inspect them.

In [3]:
import numpy as np 

# 1-D array

a = np.array([1, 2, 3])   # Create: Create a rank 1 array
print(type(a))            # Type: Prints "<class 'numpy.ndarray'>"
print(a.shape)            # Shape: Prints "(3,)"
print(a[0], a[1], a[2])   # Access elements: Prints "1 2 3"
a[0] = 5                  # Updates: Change an element of the array
print(a)                  # Prints "[5 2 3]"

# 2-D array
b = np.array([[1,2,3],[4,5,6]])    # Create a rank 2 array
print(b.shape)                     # Prints "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0])   # Prints "1 2 4"



<class 'numpy.ndarray'>
(3,)
1 2 3
[5 2 3]
(2, 3)
1 2 4


### Arrays with all elements as zeros, ones, and constant

In [4]:
a = np.zeros((2,2))   # Create an array of shape 2x2 of all zeros
print(a)              

b = np.ones((1,2))    # Create an array of shape 1x2 all ones
print(b)              

c = np.full((2,2), 7)  # Create an array of shape 2x2 with constant value 7
print(c)               

[[0. 0.]
 [0. 0.]]
[[1. 1.]]
[[7 7]
 [7 7]]


### Identity Matrix

In [5]:
d = np.eye(3)         # Create a 3x3 identity matrix
print(d)              

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


### An array with a set sequence

In [6]:
a = np.arange(0, 20, 2)
print(a)

# create an array of evenly spaced values over the given range (both endpoints included)
a = np.linspace(0, 1, 5)
print(a)

[ 0  2  4  6  8 10 12 14 16 18]
[0.   0.25 0.5  0.75 1.  ]
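
The two functions treat the end of the range differently, which is easy to miss. The following sketch contrasts them; `evens`, `points`, and `open_points` are names introduced for this example.

```python
import numpy as np

# arange(start, stop, step): the stop value is excluded.
evens = np.arange(0, 20, 2)
print(evens[-1])           # 18

# linspace(start, stop, num): the stop value is included by default.
points = np.linspace(0, 1, 5)
print(points[-1])          # 1.0

# endpoint=False drops the stop value, so the spacing becomes 1/5 instead of 1/4.
open_points = np.linspace(0, 1, 5, endpoint=False)
print(open_points)         # [0.  0.2 0.4 0.6 0.8]
```

As a rule of thumb, `arange` is convenient when the step size matters and `linspace` when the number of points matters.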


### Random Arrays

In [7]:
# set a random seed (optional)
np.random.seed(0)  # makes the sequence of random numbers reproducible; useful for testing

# create a 3x3 array of samples from a normal distribution with mean 0 and standard deviation 1
a = np.random.normal(0, 1, (3,3))
print(a)    

e = np.random.random((2,2))  # Create an array filled with random values
print(e)   

x1 = np.random.randint(10, size=6) #one dimension
print(x1)
x2 = np.random.randint(15, size=(3,4)) #two dimension
print(x2)


[[ 1.76405235  0.40015721  0.97873798]
 [ 2.2408932   1.86755799 -0.97727788]
 [ 0.95008842 -0.15135721 -0.10321885]]
[[0.79172504 0.52889492]
 [0.56804456 0.92559664]]
[8 9 4 3 0 3]
[[ 5 14  0  2]
 [ 3  8  1  3]
 [13  3  3 14]]
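
NumPy also offers a generator-based interface, `np.random.default_rng`, as an alternative to the `np.random.*` functions used above. The sketch below mirrors the same four draws with it. It is an alternative shown for comparison, not the approach used in this section, and it will not reproduce the numbers printed above because it uses a different random stream.

```python
import numpy as np

rng = np.random.default_rng(0)             # seeded generator, independent of np.random.seed

normal = rng.normal(0, 1, (3, 3))          # 3x3 samples, mean 0, standard deviation 1
uniform = rng.random((2, 2))               # 2x2 samples, uniform on [0, 1)
ints_1d = rng.integers(10, size=6)         # 6 integers drawn from 0..9
ints_2d = rng.integers(15, size=(3, 4))    # 3x4 integers drawn from 0..14

print(normal.shape, uniform.shape, ints_1d.shape, ints_2d.shape)
```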


### Dimension, Length, Size of an array

In [8]:
x3 = np.random.randint(10, size=(3,4,5)) #three dimension

print("x3 ndim:", x3.ndim)  #number of dim
print("x3 shape:", x3.shape) #length in each dimension
print("x3 size: ", x3.size) # total num of elements


x3 ndim: 3
x3 shape: (3, 4, 5)
x3 size:  60
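
These attributes are related to one another, as the following sketch shows. It uses an all-zeros array of the same shape (an assumption made so the result does not depend on random values) and also prints a few further standard attributes.

```python
import numpy as np

x3 = np.zeros((3, 4, 5))

print(x3.ndim == len(x3.shape))           # True: one shape entry per dimension
print(x3.size == 3 * 4 * 5)               # True: size is the product of the shape entries
print(len(x3))                            # 3: len() reports only the first dimension
print(x3.dtype, x3.itemsize, x3.nbytes)   # float64 8 480
```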


# Operations on arrays

We can apply many operations on arrays. For example, we can:
* Perform element-wise multiplication and matrix multiplication
* Find the transpose of a matrix (an operator that switches the rows and columns)
* Compute the trace of a matrix (the sum of the diagonal elements)
* Reshape a matrix
* Select a slice of a matrix with boolean arrays

### Pointwise addition and multiplication

In [9]:
import numpy as np 

A = np.array([[1, 2, 3], [4, 5, 6]], dtype= np.float64)
B = np.array([[1, 3, 4], [3, 2, 1]], dtype= np.float64)
C = np.array([1,2,3], dtype = np.float64) 
print(A)
print(B)
print(C)
print(A + B)
print(A * B)



[[1. 2. 3.]
 [4. 5. 6.]]
[[1. 3. 4.]
 [3. 2. 1.]]
[1. 2. 3.]
[[2. 5. 7.]
 [7. 7. 7.]]
[[ 1.  6. 12.]
 [12. 10.  6.]]


### Matrix multiplication

In [10]:
np.dot(A, C)

array([14., 32.])
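
The `@` operator and `np.matmul` give the same result as `np.dot` for these 1-D and 2-D inputs. The sketch below redefines `A`, `B`, and `C` so that it runs on its own.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)
B = np.array([[1, 3, 4], [3, 2, 1]], dtype=np.float64)
C = np.array([1, 2, 3], dtype=np.float64)

# Matrix-vector product: (2x3) times (3,) gives a vector of length 2.
print(A @ C)                # [14. 32.], same as np.dot(A, C)

# Matrix-matrix product: the inner dimensions must match, so B is transposed first.
product = A @ B.T           # (2x3) times (3x2) gives a 2x2 matrix
print(product)              # rows: [19. 10.] and [43. 28.]
```

Note that `A * B` is the element-wise product shown earlier, which is a different operation from `A @ B.T`.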

### Transpose of a matrix

Transpose: Columns become rows and rows become columns.

In [11]:
A.T

array([[1., 4.],
       [2., 5.],
       [3., 6.]])

### Trace of an Array

The trace is the sum of the elements on the main diagonal of the array.

In [12]:
np.trace(A)

6.0
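
Since `A` here is 2x3 rather than square, it may help to see which elements are summed. A minimal sketch, with `A` redefined so the block is self-contained:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)

diagonal = np.diag(A)                  # main diagonal of a 2x3 array: A[0, 0] and A[1, 1]
print(diagonal)                        # [1. 5.]
print(np.trace(A))                     # 6.0
print(np.trace(A) == diagonal.sum())   # True
```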

### Reshape an array

In [13]:
#reshape the array
X = np.arange(9)
X.reshape((3,3))


array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
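
Two details of `reshape` are worth a short sketch: it returns a new array object rather than changing `X`, and one dimension may be given as `-1` so that NumPy infers it. The name `M` is introduced for this example.

```python
import numpy as np

X = np.arange(9)
M = X.reshape((3, 3))

print(X.shape)                      # (9,): X itself is unchanged
print(M.shape)                      # (3, 3)

# -1 asks NumPy to infer that dimension from the total number of elements.
print(X.reshape((3, -1)).shape)     # (3, 3)

# The element count must match: 9 values cannot fill a 2x4 grid.
try:
    X.reshape((2, 4))
except ValueError as err:
    print("reshape failed:", err)
```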

### Accessing array elements with boolean vector

In [14]:
#Use boolean to return a specific row
A[[True, False],]

#Or column
A[:, [True, False, False]]

array([[1.],
       [4.]])
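
In practice the boolean array is usually computed from a condition rather than typed by hand. A minimal sketch, with `A` redefined and the names `mask` and `row_mask` chosen for this example:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)

# An element-wise comparison produces a boolean array of the same shape.
mask = A > 3
print(mask)

# Indexing with it returns the matching elements as a 1-D array.
print(A[mask])                    # [4. 5. 6.]

# A 1-D mask with one entry per row selects whole rows.
row_mask = A.sum(axis=1) > 6      # row sums are 6 and 15
print(A[row_mask])                # [[4. 5. 6.]]
```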

# Common Numpy Operations for Data Science

Some of the numpy operations that are frequently used in data science are as follows: 

1. Reshape the matrix
1. Flatten a matrix
1. Change the type of elements
1. Create a copy of A with its values clipped to a specified range
1. Concatenate 2 matrices
1. Add the elements in the same rows or columns to create a new vector
1. Shuffle the elements in the array

In [35]:
A = np.arange(9)
print(A)

# 1. reshape
A = A.reshape((3,3))
print(A)

# 2. Flatten a matrix by row
X = A.flatten()
print(X)

# 2. Flatten a matrix by column
X = A.flatten('F')
print(X)

# 3. Change the type of the elements (returns a new array; A itself is unchanged)
A.astype(np.float64)

# 4. Create a copy of A
newA = np.copy(A)

# 4. Create a copy of A with every entry limited to the range 0 to 5
# Entries below the range become 0 and entries above it become 5
np.clip(A, 0.0, 5)

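# 5. Concatenate A with the matrix B defined earlier
D = B.T                                         # D has shape 3x2
np.concatenate([A,D], axis = 1)                 # append D as extra columns (result is 3x5)
np.concatenate([A,D.reshape(2, 3)], axis = 0)   # append D, reshaped to 2x3, as extra rows (result is 5x3)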

print("-" * 10)

# 6. Add the elements in the same row (or column) to create a new vector
x = np.sum(A, axis = 1)  # axis = 1 sums across the columns: one total per row
print(x)
x = np.sum(A, axis = 0)  # axis = 0 sums down the rows: one total per column
print(x)

# 6. Take the average of the elements in the same row (or column) to create a new vector
x = np.average(A, axis = 1) 
print(x)
x = np.average(A, axis = 0) 
print(x)


print("-" * 10)
x = np.arange(10)
print(x)

# 7. Shuffle the elements in the array
np.random.shuffle(x)
print(x)

# Or shuffle the rows of a matrix
print(A)
np.random.shuffle(A)
print(A)



[0 1 2 3 4 5 6 7 8]
[[0 1 2]
 [3 4 5]
 [6 7 8]]
[0 1 2 3 4 5 6 7 8]
[0 3 6 1 4 7 2 5 8]
----------
[ 3 12 21]
[ 9 12 15]
[1. 4. 7.]
[3. 4. 5.]
----------
[0 1 2 3 4 5 6 7 8 9]
[4 0 2 7 9 6 1 8 3 5]
[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[3 4 5]
 [0 1 2]
 [6 7 8]]
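
Several calls in the cell above (`astype`, `clip`, `concatenate`) return a new array and leave `A` untouched, whereas `np.random.shuffle` changes its argument in place. The sketch below makes the difference visible; the names `A_float`, `A_clipped`, `rows`, and `result` are introduced for this example.

```python
import numpy as np

A = np.arange(9).reshape((3, 3))

A_float = A.astype(np.float64)    # new array; A keeps its integer dtype
A_clipped = np.clip(A, 0, 5)      # new array; values above 5 become 5

print(A_float.dtype)              # float64
print(A_clipped[2])               # [5 5 5]
print(A.max())                    # 8: A itself is untouched

# np.random.shuffle reorders its argument in place and returns None.
# For a 2-D array it shuffles whole rows, not individual elements.
rows = A.copy()
result = np.random.shuffle(rows)
print(result is None)                   # True
print(sorted(rows[:, 0].tolist()))      # [0, 3, 6]: the same rows, possibly reordered
```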


# Importing and exporting data in NumPy

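The cells below write an array to `nptest.txt` and `nptest.csv` in the current working directory and then read it back. To load your own data instead, create files with these names.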

In [49]:
arr = np.random.random((3,3))  # Create an array filled with random values
print(arr)  

[[0.17949026 0.1709866  0.46345098]
 [0.87457296 0.94411975 0.60825287]
 [0.59665541 0.78364425 0.5000263 ]]


Save the array to two files: txt and csv.

In [52]:
np.savetxt('nptest.txt',arr,delimiter=' ') # writes to a text file
np.savetxt('nptest.csv',arr,delimiter=',') # writes to a CSV file

Read the array from the saved files. 

In [53]:
data = np.loadtxt('./nptest.txt') # read from a text file
print(data)

[[0.17949026 0.1709866  0.46345098]
 [0.87457296 0.94411975 0.60825287]
 [0.59665541 0.78364425 0.5000263 ]]


In [54]:
data = np.genfromtxt('nptest.csv',delimiter=',') # read from a CSV file
print(data)

[[0.17949026 0.1709866  0.46345098]
 [0.87457296 0.94411975 0.60825287]
 [0.59665541 0.78364425 0.5000263 ]]
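
The save and load steps can also be checked as a round trip. The sketch below writes to a temporary directory so it leaves no files behind; the file name `roundtrip.csv` and the header text are hypothetical, and `fmt` and `header` are optional `savetxt` parameters not used in the cells above.

```python
import os
import tempfile

import numpy as np

# Values chosen so that four decimal places represent them exactly.
arr = np.arange(6, dtype=np.float64).reshape((2, 3)) / 4

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "roundtrip.csv")   # hypothetical file name

    # header= writes a first line prefixed with '# '; fmt controls the text precision.
    np.savetxt(csv_path, arr, delimiter=",", header="a,b,c", fmt="%.4f")

    # loadtxt skips lines that start with '#', so the header is ignored on reading.
    loaded = np.loadtxt(csv_path, delimiter=",")

print(loaded.shape)                # (2, 3)
print(np.allclose(arr, loaded))    # True
```

Text formats keep only the digits that `fmt` writes, so for arbitrary floating-point data a comparison such as `np.allclose` is safer than exact equality.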
