<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#1.-Importing-Python-Libraries" data-toc-modified-id="1.-Importing-Python-Libraries-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>1. Importing Python Libraries</a></span></li><li><span><a href="#Numpy-Arrays-vs.-Python-Lists" data-toc-modified-id="Numpy-Arrays-vs.-Python-Lists-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Numpy Arrays vs. Python Lists</a></span></li><li><span><a href="#Numpy-Data-Types" data-toc-modified-id="Numpy-Data-Types-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Numpy Data Types</a></span></li><li><span><a href="#Other-ways-to-create-Numpy-arrays" data-toc-modified-id="Other-ways-to-create-Numpy-arrays-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Other ways to create Numpy arrays</a></span></li><li><span><a href="#Applying-Operations-to-Arrays" data-toc-modified-id="Applying-Operations-to-Arrays-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Applying Operations to Arrays</a></span></li><li><span><a href="#Reshaping" data-toc-modified-id="Reshaping-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Reshaping</a></span></li><li><span><a href="#Broadcasting" data-toc-modified-id="Broadcasting-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Broadcasting</a></span></li></ul></div>

## 1. Importing Python Libraries

Part of the reason why Python is such a powerful tool for data science is that other people have written and optimized functions and wrapped them into **libraries** that we can bring into our own work.

![numpy](https://raw.githubusercontent.com/donnemartin/data-science-ipython-notebooks/master/images/numpy.png)

[NumPy](https://www.numpy.org/) is the fundamental package for scientific computing with Python. It's best known for its efficiency when working with arrays. Today we'll cover some basic array manipulations in NumPy.

**Attributes of arrays:** Determining the size, shape, memory consumption, and data types of arrays

**Indexing of arrays:** Getting and setting the value of individual array elements

**Slicing of arrays:** Getting and setting smaller subarrays within a larger array

**Reshaping of arrays:** Changing the shape of a given array

**Joining and splitting of arrays:** Combining multiple arrays into one, and splitting one array into many


To use a package in your current workspace, type `import` followed by the name of the library, as shown below.

In [2]:
import numpy as np

That worked because NumPy is [included with Anaconda](https://docs.anaconda.com/anaconda/packages/py3.7_osx-64/), so it was installed when you installed Anaconda. Other packages will need to be installed before you can use them. Many packages have standard import aliases, which we set with the Python keyword `as`. For NumPy, the standard alias is `np`.

![](https://qph.fs.quoracdn.net/main-qimg-8868f07e6ddb4f294bad22ca348d1e2d)

## Numpy Arrays vs. Python Lists 

An array is a data structure that stores values of the same data type. This is the main difference between NumPy arrays and Python lists: a list can hold values of different data types, while a NumPy array holds values of a single data type.
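As a small illustration of the single-data-type rule, the sketch below builds an array from mixed values and inspects the resulting `dtype`. The variable names are made up for this example.

```python
import numpy as np

# A list keeps each element's own Python type.
mixed_list = [1, 2.5, 3]
list_types = [type(x).__name__ for x in mixed_list]

# An array picks one dtype that can hold every value.
float_array = np.array(mixed_list)         # the ints are upcast to floats
text_array = np.array([1, "two", 3.0])     # everything is stored as a string

print(list_types)
print(float_array, float_array.dtype)
print(text_array, text_array.dtype)
```

Mixing numbers and strings rarely gives a useful array, so it is usually worth checking `dtype` after creating an array from mixed input.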

**Advantages of Numpy**
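1. It's faster. Operations apply to the whole array at once, so you don't need to write a `for` loop as you do with lists, and because each array has a single data type, Python doesn't need to check the type of every element.
2. It uses less memory. A Python list stores a pointer to each element (8 bytes per pointer on a 64-bit build), and each element is a full Python object with its own overhead (a small integer takes 28 bytes in CPython). An array stores the raw values directly, without per-element pointers, and the type and itemsize are the same for every element.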
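The memory claim can be checked roughly with `sys.getsizeof` and the array attribute `nbytes`. This is only an estimate, assuming CPython on a 64-bit machine: it counts every integer object in the list as if it were stored separately, and it ignores the small fixed overhead of the array object itself.

```python
import sys
import numpy as np

n = 1000
values_list = list(range(n))
values_array = np.arange(n, dtype=np.int64)

# List: the pointer table plus every separate int object it points to.
list_bytes = sys.getsizeof(values_list) + sum(sys.getsizeof(v) for v in values_list)

# Array: n raw 8-byte integers stored contiguously.
array_bytes = values_array.nbytes

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```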

In [3]:
my_list = [5, 10, 15]

#use numpy.array for numbers. Look at the docstring 

my_array = np.array([5, 10, 15])

print(my_list)
print(my_array)

[5, 10, 15]
[ 5 10 15]


In [4]:
for i in range(len(my_list)):
    my_list[i] *= 3
my_list

[15, 30, 45]

In [5]:
my_array = my_array * 3 
my_array

array([15, 30, 45])

## Numpy Data Types

Numerical types: 
* integers(int)
* unsigned integers(uint)
* floating point(float)

Other data types:
* booleans(bool)
* Strings 
* Datetime 
* Python objects 
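A short sketch of how these types show up in practice, using the `dtype` argument and `astype`. The array names are chosen only for this example.

```python
import numpy as np

small_ints = np.array([1, 2, 3], dtype=np.uint8)   # unsigned 8-bit integers
floats = small_ints.astype(np.float64)             # converted copy, 64-bit floats
flags = np.array([True, False, True])              # inferred as bool
dates = np.array(["2024-01-01", "2024-01-02"], dtype="datetime64[D]")
objects = np.array([{"a": 1}, "text", None], dtype=object)  # arbitrary Python objects

# Print the data type and the size in bytes of one element.
for arr in (small_ints, floats, flags, dates, objects):
    print(arr.dtype, arr.itemsize)

# Datetime arrays support arithmetic: the difference is a timedelta.
print(dates[1] - dates[0])
```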


## Other ways to create Numpy arrays

In [6]:
'''To create an array of strings'''
str_array = np.char.array(['Bob', 'Bill', 'Joe'])
str_array

chararray(['Bob', 'Bill', 'Joe'], dtype='<U4')

In [16]:
'''arange creates an array from 1 up to (but not including) 20, taking steps of 2'''
my_array = np.arange(1, 20, 2)
my_array

array([ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19])

In [8]:
'''linspace creates a floating point array of 5 evenly spaced elements from 1 to 20, both endpoints included'''
my_array = np.linspace(1, 20, 5)
my_array

array([ 1.  ,  5.75, 10.5 , 15.25, 20.  ])
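The difference in how `arange` and `linspace` treat the stop value is easy to miss, so here is a minimal side-by-side sketch. The `retstep=True` option asks `linspace` to also return the spacing it used.

```python
import numpy as np

# arange: you choose the step, and the stop value (20) is excluded.
odds = np.arange(1, 20, 2)

# linspace: you choose the number of points, and the stop value is included.
points, step = np.linspace(1, 20, 5, retstep=True)

print(odds[-1], len(odds))
print(points[-1], step)
```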

In [11]:
'''creating a multidimensional array'''
multi_array = np.array([(1,2,3), (4,5,6)])
multi_array

array([[1, 2, 3],
       [4, 5, 6]])

In [12]:
'''creating a 2x3 array of random floats from 0 (inclusive) to 1 (exclusive)'''
#np.set_printoptions
rand_array = np.random.random((2,3))
rand_array

array([[0.09347606, 0.62731745, 0.01302436],
       [0.37232324, 0.17525368, 0.25664876]])

In [13]:
'''creating an array of 5 random integers from 0 to 9 (the upper bound 10 is excluded)'''
rand_array = np.random.randint(0,10,5)
rand_array

array([4, 1, 0, 1, 8])
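Random arrays change on every run, which makes results hard to reproduce. One option, sketched below, is a seeded generator from `np.random.default_rng`. Note that this is a different interface from the `np.random.random` and `np.random.randint` calls used above, and the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

floats_a = rng_a.random((2, 3))           # floats in [0, 1), shape (2, 3)
floats_b = rng_b.random((2, 3))
ints_a = rng_a.integers(0, 10, size=5)    # integers from 0 to 9

print(floats_a)
print(np.array_equal(floats_a, floats_b))
print(ints_a)
```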

## Applying Operations to Arrays
* array.min(), .max()
* array.mean(), .var(), .std()

In [None]:
# Type `rand_array.` and press Tab to list the available methods
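Here is a minimal sketch of the methods listed above, applied to a small fixed array so the results are easy to check by hand. The array `data` is made up for this example.

```python
import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

print("min :", data.min())
print("max :", data.max())
print("mean:", data.mean())
print("var :", data.var())   # population variance (divides by n)
print("std :", data.std())   # square root of the variance
```

By default `var` and `std` divide by `n`. Passing `ddof=1` gives the sample versions, which divide by `n - 1`.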

## Reshaping 

In [17]:
print(my_array)

[ 1  3  5  7  9 11 13 15 17 19]


In [23]:
my_array = my_array.reshape(5,2)  

In [24]:
my_array.size 

10

In [25]:
my_array.shape #what happened here? 

(5, 2)

In [26]:
my_array.dtype

dtype('int64')

In [27]:
my_array.itemsize 

8

In [29]:
print(my_array)

[[ 1  3]
 [ 5  7]
 [ 9 11]
 [13 15]
 [17 19]]


In [31]:
my_array.sum(axis=0) # axis=0 sums down the rows (one total per column); axis=1 sums across the columns (one total per row)

array([45, 55])
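To make the `axis` argument concrete, the sketch below rebuilds the same 5x2 array and sums it along each axis. Passing `-1` to `reshape` asks NumPy to infer that dimension from the array's size. The name `grid` is used only here.

```python
import numpy as np

grid = np.arange(1, 20, 2).reshape(-1, 2)   # -1 is inferred as 5 rows

column_totals = grid.sum(axis=0)   # collapses the rows: one total per column
row_totals = grid.sum(axis=1)      # collapses the columns: one total per row
grand_total = grid.sum()           # no axis: sums every element

print(grid.shape)
print(column_totals)
print(row_totals)
print(grand_total)
```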

## Broadcasting 
The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes.
Two arrays can be broadcast together if, comparing their shapes from the last dimension backwards, each pair of dimensions either has the same value or one of them is 1. Missing leading dimensions are treated as 1.
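The rule can be tried out on shapes alone, without building any arrays. The sketch below assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available. The helper name `broadcast_result` is hypothetical and exists only for this example.

```python
import numpy as np

def broadcast_result(shape_a, shape_b):
    """Return the broadcast shape, or None if the shapes are incompatible."""
    try:
        return np.broadcast_shapes(shape_a, shape_b)
    except ValueError:
        return None

print(broadcast_result((2, 3), (3,)))    # trailing dimensions match
print(broadcast_result((3, 1), (2,)))    # a dimension of 1 stretches
print(broadcast_result((2, 3), ()))      # a scalar broadcasts with anything
print(broadcast_result((2, 3), (2,)))    # trailing 3 vs 2: incompatible
```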

In [32]:
a = np.array([10, 20, 30]) # 1-D array of shape (3,)
b = 5
print(a)
print(b)

[10 20 30]
5


In [33]:
c = a + b
print(c)

[15 25 35]


In [34]:
a = np.array([[10, 20, 30], [40, 50, 60]]) #2d array
print(a)
print(b)


[[10 20 30]
 [40 50 60]]
5


In [35]:
c = a + b
print(c)

[[15 25 35]
 [45 55 65]]


In [36]:
v = np.array([12, 24, 36])
w = np.array([45, 55])

# To compute an outer product, we first
# reshape v to a column vector of shape 3x1,
# then broadcast it against w to yield an output
# of shape 3x2, which is the outer product of v and w.
print(np.reshape(v, (3, 1)) * w)


[[ 540  660]
 [1080 1320]
 [1620 1980]]


In [37]:
v.shape

(3,)
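As a cross-check on the outer product example, the sketch below compares the reshape-and-broadcast approach with indexing by `np.newaxis` and with NumPy's built-in `np.outer`. All three should give the same 3x2 result.

```python
import numpy as np

v = np.array([12, 24, 36])
w = np.array([45, 55])

by_reshape = np.reshape(v, (3, 1)) * w   # (3, 1) * (2,) broadcasts to (3, 2)
by_newaxis = v[:, np.newaxis] * w        # same column vector, different spelling
by_outer = np.outer(v, w)                # built-in outer product

print(by_reshape)
print(np.array_equal(by_reshape, by_newaxis))
print(np.array_equal(by_reshape, by_outer))
```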


In [None]:
x = np.array([[12, 22, 33], [45, 55, 66]]) 
  
# x has shape 2x3 and v has shape (3,),
# so they broadcast to 2x3.
print(x + v)

# Add a vector to each column of a matrix: x has
# shape 2x3 and w has shape (2,). If we transpose x,
# it has shape 3x2 and can be broadcast against w
# to yield a result of shape 3x2.

# Transposing this yields the final result of
# shape 2x3: the matrix x with w added to each column.
print((x.T + w).T) 

In [None]:
# Another solution is to reshape w to be a column
# vector of shape 2x1; we can then broadcast it
# directly against x to produce the same output.
print(x + np.reshape(w, (2, 1))) 

In [None]:
# Multiply a matrix by a constant: x has shape 2x3.
# NumPy treats scalars as arrays of shape (),
# which can be broadcast against x to shape 2x3.
print(x * 2) 
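A common practical use of these broadcasting patterns is centering data, that is, subtracting the mean of each column or row. The sketch below reuses the same `x`. The `keepdims=True` option keeps the reduced axis as size 1, so the row means come back with shape (2, 1) and broadcast without a manual reshape.

```python
import numpy as np

x = np.array([[12, 22, 33], [45, 55, 66]])

# Column means have shape (3,), which broadcasts against (2, 3).
column_means = x.mean(axis=0)
column_centered = x - column_means

# Row means keep shape (2, 1), which also broadcasts against (2, 3).
row_means = x.mean(axis=1, keepdims=True)
row_centered = x - row_means

print(column_centered)
print(row_centered)
```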