
## Python for Data Analysis book

Chapter 4

Introduction to Numpy

--  Then we'll talk about your exam!
 

 
## Numpy:  Numerical Python

NumPy is a core library that serves as the backbone of most high-level
scientific and numerical analysis packages in Python.

NumPy provides the following data structures and functionality:
 

![Jupyter Notebooks](jupyter_start.png)

![Environments](env.png)

 
ndarray: an efficient multi-dimensional array providing fast array-oriented arithmetic operations and flexible broadcasting capabilities.
 
Mathematical functions for fast operations on entire arrays of data without having to write loops

 

 
Tools for reading/writing array data to disk and working with memory-mapped files
 
Linear Algebra, random number generation, Fourier transform capabilities
 
A C API for connecting NumPy with libraries written in C, C++ and FORTRAN
 

NumPy stores data efficiently by keeping it in contiguous blocks of memory, independent of
other Python objects.
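
To make the "no loops" point concrete, here is a minimal sketch that doubles the same numbers twice: once with a Python loop over a list and once with a single NumPy expression. The names `py_list` and `np_arr` are made up for this illustration and are not from the book.

```python
import numpy as np

# Hypothetical example data: the same numbers as a list and as an ndarray
py_list = list(range(100_000))
np_arr = np.arange(100_000)

# Pure Python: an explicit loop over every element
doubled_list = [x * 2 for x in py_list]

# NumPy: one expression applied to the whole array, no Python-level loop
doubled_arr = np_arr * 2

# Both approaches produce the same values
print(np.array_equal(doubled_arr, doubled_list))

# The array's data sits in one contiguous block of memory
print(np_arr.flags['C_CONTIGUOUS'], np_arr.nbytes)
```

In a notebook you can put `%timeit` in front of each of the two doubling lines to compare their speed on your own machine.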


In [4]:
# Import the numpy library, aliasing it to np (a community-wide convention)
import numpy as np

In [5]:
data = np.random.randn(2,3)

In [6]:
data

array([[-0.46216476, -0.12961292, -0.38355299],
       [-0.04468236, -0.18326396, -0.66955045]])

In [7]:
# multiply each item in the array by 10
data * 10

array([[-4.62164761, -1.29612919, -3.83552988],
       [-0.44682358, -1.83263959, -6.69550451]])

In [8]:
# Add data to itself (element-wise addition of two multi-dimensional arrays)
data + data

array([[-0.92432952, -0.25922584, -0.76710598],
       [-0.08936472, -0.36652792, -1.3391009 ]])

In [9]:
# Size of each dimension: (rows, columns)
data.shape

(2, 3)

In [None]:
# Data type of the elements stored in the array

data.dtype


The book refers to an 'ndarray' object in several interchangeable ways:

1. Array
2. NumPy array
3. ndarray


In [None]:
#Create arrays
list_data1 = [6, 7.5, 8, 0, 1]
ndarray_data1 = np.array(list_data1)

In [None]:
ndarray_data1

In [None]:
ndarray_data1.shape

In [None]:
ndarray_data1.dtype

In [None]:
# Create a multi dimensional array
list_data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
ndarray_data2 = np.array(list_data2)

In [None]:
ndarray_data2

In [None]:
ndarray_data2.dtype

In [None]:
ndarray_data2.ndim

In [None]:
ndarray_data2.shape


dtype is an 'inferred' metadata attribute of the array.

When you don't specify a dtype, NumPy picks one that fits the data you pass in.
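
A small sketch of that inference, using made-up arrays (`ints`, `mixed`, `flags` and `words` are names chosen for this example). The exact integer width depends on your platform and NumPy version, so the comments name only the kind of type.

```python
import numpy as np

ints = np.array([1, 2, 3])        # all integers -> an integer dtype
mixed = np.array([1, 2.5, 3])     # one float -> every element becomes float64
flags = np.array([True, False])   # booleans -> bool
words = np.array(['a', 'bcd'])    # strings -> fixed-width Unicode, sized to the longest

for arr in (ints, mixed, flags, words):
    print(arr.dtype)
```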




Numpy has a suite of functions for creating new arrays


In [None]:
# Create an array with a lot of zeros
np.zeros(10)

In [None]:
np.zeros((3, 6))

In [None]:
np.empty((2,3,2))

 
# ******  WARNING ***** 

It's not safe to assume that np.empty will return an array of all zeros.
In some cases, it may return uninitialized 'garbage' values.
Uninitialized garbage means: whatever was already in memory.
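
One way to stay safe, sketched below under the assumption that you want a known starting value: either fill the array from `np.empty` completely before reading it, or ask for the values directly with `np.zeros` or `np.full`. The name `buf` is just for illustration.

```python
import numpy as np

# np.empty only reserves memory; treat its initial contents as unknown
buf = np.empty((2, 3))

# Safe pattern: assign every element before reading any of them
buf[:] = 7.0
print(buf)

# If you need known starting values, ask for them directly
zeros = np.zeros((2, 3))
sevens = np.full((2, 3), 7.0)
print(np.array_equal(buf, sevens))
```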
 

In [None]:
# arange: NumPy's array-valued version of the built-in range function
for x in range(10):
    print(x)

In [None]:
np.arange(15)


* array
* asarray
* arange
* ones
* ones_like - duplicates the structure/shape of an existing array
* zeros
* zeros_like
* empty
* empty_like
* full
* full_like
* eye, identity - create a square N x N identity matrix (1s on the diagonal, 0s elsewhere)
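
A quick tour of a few functions from the list, as a sketch rather than a complete reference. The array `template` is a made-up example used only to show what the `_like` variants copy.

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]])   # hypothetical 2 x 3 integer array

ones = np.ones((2, 3))                  # all 1.0, shape given explicitly
zeros_like = np.zeros_like(template)    # same shape and dtype as template, all 0
full = np.full((2, 2), 3.14)            # every element set to 3.14
full_like = np.full_like(template, 9)   # same shape and dtype as template, all 9

# asarray returns its input unchanged when it is already an ndarray,
# while array() makes a copy by default
same = np.asarray(template)
copy = np.array(template)

print(same is template, copy is template)
```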



In [None]:
np.eye(12)

In [None]:
np.identity(12)

In [None]:
"""
You can explicitly tell an ndarray which data type to store.
This is done with the dtype parameter of the array-creation functions.
"""
np.array([1,2,3], dtype=np.float64)

In [None]:
np.array([1,2,3], dtype=np.float64).dtype

In [None]:
np.array([1,2,3], dtype=np.int32)

In [None]:
np.array([1,2,3], dtype=np.int32).dtype

 
## NumPy supports a wide range of data types, but you don't have to memorize each and every one

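* int (8/16/32/64)
* float (16, 32, 64, 128) - float128 is only available on some platforms
* complex (64, 128, 256) - complex type represented by two floats; complex256 is only available on some platforms
* bool
* object
* string_ (fixed-size ASCII, S abbreviation, e.g. S10) - named bytes_ in NumPy 2.0 and later
* unicode_ (fixed-size Unicode, U abbreviation, e.g. U12) - named str_ in NumPy 2.0 and later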
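
The numbers in those names are sizes in bits. The sketch below checks a few of them through `itemsize` (which reports bytes) and shows what "fixed size" means for strings. The array names are invented for the example.

```python
import numpy as np

# The number in a dtype name is its size in bits; itemsize reports bytes
for name in ('int8', 'int32', 'float64', 'complex128', 'bool'):
    print(name, np.dtype(name).itemsize)

# Fixed-size strings: S10 holds up to 10 bytes, U12 up to 12 Unicode characters
ascii_arr = np.array(['numpy'], dtype='S10')
unicode_arr = np.array(['numpy'], dtype='U12')
print(ascii_arr.dtype, unicode_arr.dtype)

# Values longer than the fixed size are silently truncated
short = np.array(['abcdefghij'], dtype='U4')
print(short[0])
```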
 

In [None]:
"""
We can explicitly cast/convert some dtypes to others
"""
int_arr = np.array([1,2,3,4])
int_arr

In [None]:
int_arr.dtype

In [None]:
float_arr = int_arr.astype(np.float64)
float_arr


In [None]:
float_arr.dtype
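
Two more casts worth knowing, sketched with made-up data (`prices` and `numeric_strings` are illustrative names): going from float to int drops the fractional part, and strings that look like numbers can be converted to floats.

```python
import numpy as np

# Float -> int drops the fractional part (it truncates, it does not round)
prices = np.array([3.7, -1.2, 2.5])
print(prices.astype(np.int32))

# Strings that look like numbers can be converted to floats
numeric_strings = np.array(['1.25', '-9.6', '42'])
print(numeric_strings.astype(np.float64))

# By default astype returns a new array; the original keeps its dtype
print(prices.dtype)
```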

 
I'm skipping a bunch of information on 'slicing' the array and other utilities like 'resizing'

Please read chapter 4.1 in your book.  That material CAN be used in a quiz.
 


## Fun functions for arrays

# UNARY Functions

Functions that act on one ndarray

* exp - calculate the exponential of each number
* sqrt -  square root 
* exp2 - 2**x
* abs, fabs
* square
* log, log10, log2, log1p
* sign
* ceil
* floor
* rint
* modf
* isnan
* isfinite, isinf
* cos, cosh, sin, sinh, tan, tanh
* arccos, arccosh, arcsin, arcsinh, arctan, arctanh
* logical_not
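
A few of the unary functions above in action. This is only a sketch with invented input arrays (`values` and `mixed`); each function is applied element by element.

```python
import numpy as np

values = np.array([0.0, 1.0, 4.0, 9.0])
print(np.sqrt(values))     # element-wise square root
print(np.square(values))   # element-wise x**2

mixed = np.array([-2.5, -0.4, 1.6, 3.5])
print(np.sign(mixed))                    # -1, 0 or 1 for each element
print(np.ceil(mixed), np.floor(mixed))   # round up, round down
print(np.rint(mixed))                    # round to nearest integer, dtype stays float

# modf returns two arrays: the fractional parts and the integer parts
frac, whole = np.modf(mixed)
print(frac, whole)

# isnan gives a boolean array marking NaN values
print(np.isnan(np.array([1.0, np.nan])))
```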
 

In [None]:
int_arr2 = np.arange(12, dtype=np.int32)
int_arr2

In [None]:
np.exp2(int_arr2)

In [None]:
np.exp2(int_arr2).dtype

In [None]:
np.exp2(int_arr2).astype(np.int32)


# BINARY Functions

Functions that act on TWO ndarrays

* add
* subtract
* multiply
* divide, floor_divide
* power
* maximum, fmax - (fmax ignores NaN)
* minimum, fmin - (fmin ignores NaN)
* mod - modulus
* copysign - copies sign of second argument to first
* greater, greater_equal, less, less_equal, equal, not_equal
* logical_and, logical_or, logical_xor
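
The difference between `maximum` and `fmax` is easiest to see with NaN in the data. The sketch below uses made-up arrays `a`, `b`, `x` and `y`, and also touches `copysign` and the comparison and logical functions.

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])
b = np.array([2.0, 5.0, np.nan])

print(np.maximum(a, b))   # NaN propagates wherever either input is NaN
print(np.fmax(a, b))      # NaN is ignored when the other value is a number

# copysign: magnitudes from the first argument, signs from the second
print(np.copysign([1.0, 2.0], [-1.0, 1.0]))

x = np.array([1, 2, 3, 4])
y = np.array([4, 3, 2, 1])
print(np.greater(x, y))               # same result as x > y
print(np.logical_and(x > 1, y > 1))   # True where both conditions hold
```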

 

In [None]:
first_arr = np.array([1,2,3,4])
second_arr = np.array([1,2,3,4])
np.add(first_arr, second_arr)

In [None]:
first_arr = np.array([1,2,3,4])
second_arr = np.array([10,20,30,40])
np.subtract(first_arr, second_arr)

In [None]:
np.subtract(second_arr, first_arr)

In [None]:
first_arr

In [None]:
second_arr

In [None]:
third_arr = np.subtract(second_arr, first_arr)

In [None]:
third_arr


## Fun visualization, if we get this far.


![Install Matplotlib](matplotlib.png)


In [None]:
import matplotlib.pyplot as plt
points = np.arange(-5, 5, 0.01)
xs, ys = np.meshgrid(points, points)

In [None]:
xs

In [None]:
ys

In [None]:
z = np.sqrt(xs ** 2 + ys **2)

z
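
The full grid is too large to read, so here is the same idea on three points. This is a sketch with a made-up array `small` standing in for `points`; the names `gx`, `gy` and `dist` are also just for illustration.

```python
import numpy as np

small = np.array([0, 1, 2])   # hypothetical stand-in for `points`
gx, gy = np.meshgrid(small, small)

print(gx)   # every row is a copy of `small`: x varies along the columns
print(gy)   # every column is a copy of `small`: y varies along the rows

# Distance of each grid point from the origin, same formula as for z
dist = np.sqrt(gx ** 2 + gy ** 2)
print(dist[2, 1])   # the grid point with x=1, y=2
```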

In [None]:
plt.imshow(z, cmap=plt.cm.gray)
plt.colorbar()

In [None]:
new_xs = np.arange(10)
new_xs

In [None]:
plt.plot(new_xs)
plt.ylabel('some numbers')
plt.show()

In [None]:
np.power(new_xs, 2)

In [None]:
plt.plot(np.power(new_xs, 2))
plt.ylabel("More Numbers")
plt.show()

In [None]:
t = np.arange(0., 5., 0.2)

# red dashes, blue squares and green triangles
"""
t, t, 'r--'    # x = t, y = t, red dashes
t, t**2, 'bs'  # x = t, y = t squared, blue squares
t, t**3, 'g^'  # x = t, y = t cubed, green triangles
"""
plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
plt.show()

In [None]:
def f(t):
    return np.exp(-t) * np.cos(2*np.pi*t)

t1 = np.arange(0.0, 5.0, 0.1)
t2 = np.arange(0.0, 5.0, 0.02)

plt.figure()
plt.subplot(211)
plt.plot(t1, f(t1), 'bo', t2, f(t2), 'k')

plt.subplot(212)
plt.plot(t2, np.cos(2*np.pi*t2), 'r--')
plt.show()

In [None]:
from matplotlib import pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
ax.axis('equal')
langs = ['C', 'C++', 'Java', 'Python', 'PHP']
students = [23,17,35,29,12]
ax.pie(students, labels = langs,autopct='%1.2f%%')
plt.show()

In [None]:
from mpl_toolkits import mplot3d
import numpy as np
import matplotlib.pyplot as plt
x = np.outer(np.linspace(-2, 2, 30), np.ones(30))
y = x.copy().T # transpose
z = np.cos(x ** 2 + y ** 2)

fig = plt.figure()
ax = plt.axes(projection='3d')

ax.plot_surface(x, y, z,cmap='viridis', edgecolor='none')
ax.set_title('Surface plot')
plt.show()

https://www.tutorialspoint.com/matplotlib/matplotlib_3d_contour_plot.htm