![data-x](http://oi64.tinypic.com/o858n4.jpg)

---
# Numpy Data X - BKHW 

**Author:** Kunal Desai and Ikhlaq Sidhu, 1/22/2017, modified June 2017

**License Agreement:** Feel free to do whatever you want with this code

___

# Introduction to NumPy

# What is NumPy:  

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

* a powerful N-dimensional array object
* sophisticated (broadcasting) functions
* tools for integrating C/C++ and Fortran code
* useful linear algebra, Fourier transform, and random number capabilities


# NumPy contains an array object that is "fast"


<img src="https://github.com/ikhlaqsidhu/data-x/raw/master/imgsource/threefundamental.png">


It stores:
* location of a memory block (allocated all at one time)
* a shape (3 x 3 or 1 x 9, etc)
* data type / size of each element

The core feature that NumPy supports is its multi-dimensional array. In NumPy, dimensions are called axes, and the number of axes is called the rank.
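As a small sketch of the three pieces of information listed above, the following inspects an array's shape, element type, and element size. The names `grid` and `flat` are illustrative, and `dtype=np.int64` is set explicitly (an assumption made here) so the element size does not depend on the platform.

```python
import numpy as np

# Nine consecutive integers arranged as a 3 x 3 array
grid = np.arange(9, dtype=np.int64).reshape(3, 3)

print(grid.shape)     # (3, 3)
print(grid.ndim)      # 2 (the number of axes)
print(grid.dtype)     # int64
print(grid.itemsize)  # 8 (bytes per element)

# Reshaping to 1 x 9 changes only the shape; the memory block is shared
flat = grid.reshape(1, 9)
print(flat.shape)                    # (1, 9)
print(np.shares_memory(grid, flat))  # True
```

Because only the shape changes, reshaping a contiguous array like this one does not copy the data.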

In [20]:
# written for Python 3.6
import numpy as np

In [50]:
# Creating a NumPy array - simplest possible
# We pass a list as the argument when making a NumPy array

list1 = [1, 2, 3, 4]
data = np.array(list1)
data

array([1, 2, 3, 4])

In [55]:
# it could be much longer
list2 = range(10000)
data = np.array(list2)
data

array([   0,    1,    2, ..., 9997, 9998, 9999])

In [56]:
# data = np.array(1,2,3,4, 5,6,7,8,9) # wrong: the values must be passed as one sequence
data = np.array([1,2,3,4,5,6,7,8,9]) # right: a single list argument
data

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [57]:
# Accessing elements - similar to slicing Python lists:
print(data[:])
print(data[0:3])
print(data[3:])
print(data[::-2])

[1 2 3 4 5 6 7 8 9]
[1 2 3]
[4 5 6 7 8 9]
[9 7 5 3 1]


## Arrays are like lists, but different

In [78]:
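# Arrays are faster and more efficient

x = list(range(10000))
%timeit [i**2 for i in x]
# %timeit does not keep variables assigned inside the timed statement,
# so compute y separately before printing it
y = [i**2 for i in x]
print (y[0:5])


100 loops, best of 3: 2.73 ms per loop
[0, 1, 4, 9, 16]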


In [79]:
z = np.array(x)
%timeit z**2
y = z**2  # compute y outside %timeit so it can be printed
print (y[0:5])

The slowest run took 12.43 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.89 µs per loop
[ 0  1  4  9 16]
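Outside IPython, where the `%timeit` magic is not available, a similar comparison can be sketched with the standard-library `timeit` module. The names `values`, `arr`, `list_time`, and `array_time` are illustrative, and the measured times will vary from machine to machine.

```python
import timeit
import numpy as np

values = list(range(10000))
arr = np.array(values)

# Time 100 repetitions of each approach (results are in seconds)
list_time = timeit.timeit(lambda: [i**2 for i in values], number=100)
array_time = timeit.timeit(lambda: arr**2, number=100)
print("list comprehension:", list_time)
print("NumPy array:       ", array_time)

# Both approaches produce the same values
squares_list = [i**2 for i in values]
squares_arr = arr**2
print(squares_arr.tolist() == squares_list)  # True
```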


In [85]:
# Arrays differ from lists in another way:
# x and y are lists
x = list(range(5))
y = list(range(5,10))
print ("x = ", x)
print ("y = ", y)
print ("x+y = ", x+y)
print (2 * x)  # multiplying a list by an integer repeats it

x =  [0, 1, 2, 3, 4]
y =  [5, 6, 7, 8, 9]
x+y =  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]


In [87]:
# now let's try with NumPy arrays:
xn = np.array(x)
yn = np.array(y)
print (xn)
print (yn)
print ("xn + yn = ", xn + yn)
print (2 * xn)  # multiplying an array by an integer scales each element

[0 1 2 3 4]
[5 6 7 8 9]
xn + yn =  [ 5  7  9 11 13]
[0 2 4 6 8]


In [84]:
# If you need to join two NumPy arrays, try hstack, vstack, column_stack, or concatenate
print (np.hstack((xn,yn)))
print (np.concatenate((xn,yn)))

[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
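The comment above also mentions `vstack` and `column_stack`. As a brief sketch of how they differ from `hstack`, the following stacks the same two arrays as rows and as columns (`rows` and `cols` are illustrative names):

```python
import numpy as np

xn = np.arange(5)
yn = np.arange(5, 10)

# vstack places each 1-D array on its own row
rows = np.vstack((xn, yn))
print(rows)
print(rows.shape)  # (2, 5)

# column_stack places each 1-D array in its own column
cols = np.column_stack((xn, yn))
print(cols.shape)  # (5, 2)

# For 1-D inputs, the column stack is the transpose of the row stack
print(np.array_equal(cols, rows.T))  # True
```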


In [90]:
# An array is a sequence that can be manipulated easily
# All elements of an array should have the same type
# An arithmetic operation is applied to each element individually
# When two arrays are added, they must have the same shape (or shapes that
# are compatible under broadcasting); corresponding elements are added in the result

print (3* x)
print (3 * xn)


[0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
[ 0  3  6  9 12]
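To illustrate the shape requirement for element-wise arithmetic, here is a minimal sketch. The array names are made up for this example; it shows one addition that fails because the shapes do not match and one that succeeds through broadcasting.

```python
import numpy as np

short = np.array([1, 2])
longer = np.array([1, 2, 3])

# Shapes (2,) and (3,) are incompatible, so NumPy raises a ValueError
mismatch_error = None
try:
    short + longer
except ValueError as err:
    mismatch_error = err
    print("cannot add:", err)

# Shapes (2, 3) and (3,) are compatible: the 1-D array is added to each row
table = np.array([[1, 2, 3],
                  [4, 5, 6]])
result = table + longer
print(result)
```

The second addition works because the trailing dimensions match, so `longer` is treated as if it were repeated for every row of `table`.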


Creating arrays with two axes:


In [96]:
# This list has two dimensions
list3 = [[1, 2, 3],
 [4, 5, 6]]

In [97]:
# data = np.array([[1, 2, 3], [4, 5, 6]])
data = np.array(list3)
print (data)

[[1 2 3]
 [4 5 6]]


In [105]:
# You can also transpose an array (matrix)
print ('Transpose: \n', data.T, '\n')
print ('Transpose: \n', np.transpose(data))

# print (list3.T) # note, this would not work

Transpose: 
 [[1 4]
 [2 5]
 [3 6]] 

Transpose: 
 [[1 4]
 [2 5]
 [3 6]]


Remember that every time you create an array with np.array, the argument must be a single sequence, such as a Python list. Ranges are a great tool for creating these arrays.

In [113]:
# np.arange(end) creates an array of consecutive integers from 0 up to,
# but not including, end. Note that you don't have to make a list first.

np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [114]:
# Array increasing from start to end: np.arange(start, end)
np.arange(10, 20)

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

In [115]:
# Array increasing from start to end by step: np.arange(start, end, step)
# The range always includes start but excludes end
np.arange(1, 10, 2)

array([1, 3, 5, 7, 9])

Here is a quick example of a NumPy array and some helpful methods:

In [122]:
# Reshape is used to change the shape
a = np.arange(0, 15)
a = a.reshape(3, 5)
# a = np.arange(0, 15).reshape(3, 5)  # same thing
print (a)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
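Reshaping only works when the new shape holds exactly the same number of elements. The sketch below uses names chosen for illustration; it shows `-1`, which asks NumPy to infer one dimension, and a reshape that fails because the sizes do not match.

```python
import numpy as np

a = np.arange(0, 15)

# -1 lets NumPy work out the missing dimension: 15 elements / 5 rows = 3 columns
b = a.reshape(5, -1)
print(b.shape)  # (5, 3)

# 15 elements cannot fill a 4 x 4 array, so this raises a ValueError
reshape_error = None
try:
    a.reshape(4, 4)
except ValueError as err:
    reshape_error = err
    print("reshape failed:", err)
```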


In [123]:
# If you want to know the shape, use 'shape'
a.shape

(3, 5)

In [124]:
# ndim tells us the number of dimensions of the array
a.ndim

2

In [125]:
# dtype.name tells us the type of each element in the array
a.dtype.name

'int64'

In [126]:
# And for total size:
a.size

15

In [139]:
# sum, min, max, ... are easy
print (a)
print (a.sum())
# Check against the arithmetic-series formula: (first + last) * count / 2
print ((0+14)*15/2)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
105
105.0


In [132]:
print (a.sum(axis=0))
print (a.sum(axis=1))

[15 18 21 24 27]
[10 35 60]
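The `axis` argument names the axis that is collapsed: `axis=0` combines values down the rows (one result per column), and `axis=1` combines values across the columns (one result per row). A short sketch of the same idea with `min`, `max`, and `mean`:

```python
import numpy as np

a = np.arange(0, 15).reshape(3, 5)

col_sums = a.sum(axis=0)  # collapses the 3 rows, leaving 5 values
row_sums = a.sum(axis=1)  # collapses the 5 columns, leaving 3 values
print(col_sums.shape)     # (5,)
print(row_sums.shape)     # (3,)

print(a.min(axis=0))  # [0 1 2 3 4]
print(a.max(axis=1))  # [ 4  9 14]
print(a.mean())       # 7.0
```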


To get the cumulative product:

In [134]:
print (np.arange(1, 10))
print (np.cumprod(np.arange(1, 10)))

[1 2 3 4 5 6 7 8 9]
[     1      2      6     24    120    720   5040  40320 362880]


To get the cumulative sum:

In [141]:
print (np.arange(1, 10))
np.cumsum((np.arange(1, 10)))

[1 2 3 4 5 6 7 8 9]


array([ 1,  3,  6, 10, 15, 21, 28, 36, 45])
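As a quick check on the cumulative results above, the last entry of a cumulative sum equals the total sum, and the last entry of the cumulative product of 1 through 9 equals 9 factorial. The variable names here are illustrative.

```python
import math
import numpy as np

n = np.arange(1, 10)
running_sum = np.cumsum(n)
running_prod = np.cumprod(n)

print(running_sum[-1] == n.sum())             # True
print(running_prod[-1] == math.factorial(9))  # True

# Differences between consecutive running sums recover n[1:]
print(np.diff(running_sum))  # [2 3 4 5 6 7 8 9]
```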

You can also compare arrays

In [34]:
# Does this array have any elements that are equal to 0?
print(np.arange(1, 10) == 0)
np.any(np.arange(1, 10) == 0)

[False False False False False False False False False]


False

In [35]:
# Are all elements of this array equal to 1?
print(np.array([1, 1, 1, 1]) == 1)
np.all(np.array([1, 1, 1, 1]) == 1)

[ True  True  True  True]


True
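Comparisons also work between two arrays of the same shape, producing one boolean per pair of elements. The following sketch uses made-up arrays `left` and `right`:

```python
import numpy as np

left = np.array([1, 2, 3, 4])
right = np.array([1, 5, 3, 0])

# Element-wise comparison gives a boolean array
equal_mask = left == right
print(equal_mask)          # [ True False  True False]
print(np.any(equal_mask))  # True: at least one pair matches
print(np.all(equal_mask))  # False: not every pair matches

print(left > right)                 # [False False False  True]
print(np.array_equal(left, right))  # False
```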

Creating a 3D array:

In [36]:
a = np.arange(0, 24).reshape(2, 3, 4)



Printing these arrays is easy as well: just use Python's built-in print function.

In [37]:
print(a)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
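A 3D array with shape (2, 3, 4) can be read as 2 blocks, each with 3 rows and 4 columns. The sketch below shows how indexing follows that order (block, row, column):

```python
import numpy as np

a = np.arange(0, 24).reshape(2, 3, 4)
print(a.shape)  # (2, 3, 4)

# One index per axis: block 1, row 2, column 3
print(a[1, 2, 3])  # 23

# A single index selects a whole 3 x 4 block
first_block = a[0]
print(first_block.shape)  # (3, 4)

# A colon keeps an axis: the first row of every block
first_rows = a[:, 0, :]
print(first_rows)

# Summing over axis 0 adds the two blocks element by element
print(a.sum(axis=0).shape)  # (3, 4)
```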


# Basic Operations

One of the coolest parts of NumPy is that you can run operations directly on whole arrays. Here are some basic operations:

In [38]:
a = np.arange(11, 21)
b = np.arange(0, 10)
print("a = ", a)
print("b = ", b)
print(a + b)

a =  [11 12 13 14 15 16 17 18 19 20]
b =  [0 1 2 3 4 5 6 7 8 9]
[11 13 15 17 19 21 23 25 27 29]


In [39]:
a * b

array([  0,  12,  26,  42,  60,  80, 102, 126, 152, 180])

In [46]:
a ** 2

array([121, 144, 169, 196, 225, 256, 289, 324, 361, 400])

You can even do things like matrix operations

In [None]:
# Dot product of two 1-D arrays: the sum of the element-wise products
a.dot(b)

In [47]:
# Matrix multiplication
c = np.arange(1,5).reshape(2,2)
print("c = ")
print(c)
d = np.arange(5,9).reshape(2,2)
print("d = ")
print(d)


c = 
[[1 2]
 [3 4]]
d = 
[[5 6]
 [7 8]]


In [48]:
print(d.dot(c))
print(d)

[[23 34]
 [31 46]]
[[5 6]
 [7 8]]
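Note that `*` multiplies element by element, while `dot` (or the `@` operator, available in Python 3.5 and later) performs matrix multiplication, and the order of the operands matters. A short sketch using the same `c` and `d`:

```python
import numpy as np

c = np.arange(1, 5).reshape(2, 2)
d = np.arange(5, 9).reshape(2, 2)

# Element-wise product: each entry of c times the matching entry of d
elementwise = c * d
print(elementwise)

# Matrix products: d @ c is the same as d.dot(c)
matrix_dc = d @ c
matrix_cd = c @ d
print(matrix_dc)
print(matrix_cd)

# Matrix multiplication is not commutative in general
print(np.array_equal(matrix_dc, matrix_cd))  # False
```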
