# 1. Introduction

# Using Jupyter Notebooks
Jupyter is an online computational notebook that enables modular coding in Python, R, and Julia (among others). There are four important actions you will be using throughout the course via Jupyter:

## Creating code blocks
You can create code blocks in four ways:
1. Press `Ctrl+Alt+Enter` on your keyboard
2. Click on the 'Insert code cell' that appears at the bottom of your current cell
3. Click on the 'Add code cell' that appears at the bottom of your notebook page
4. Click on the plus `+` symbol that appears at the top-right of your current cell

You can move these blocks further up or down by clicking on the three horizontal dots (the menu) that appear on the top-right of your current cell and selecting either 'Move cell up' (`Ctrl+Shift+$\uparrow$`) or 'Move cell down' (`Ctrl+Shift+$\downarrow$`).

## Running code blocks
Once you have written some code in a block, you can run the code in one of three ways:
1. Click on the 'Play' icon that appears on the top-left of your current cell
2. Click on the menu at the top-right of your current cell and select 'Run cell'
3. Press 'Shift+Enter' on your keyboard. 

You also have the option of running all the code that you have written before or after your current cell by clicking on the three horizontal dots and selecting either 'Run all above' or 'Run selected cell and all below'. A convenient feature of notebook environments like DataLore is that variables defined in one code block remain available in subsequent code blocks, so they do not have to be redefined (unless, of course, you would like to change their values).

In [5]:
x = 2
y = 9.9
x**y;

## Using libraries
A Python library is a "bundle" of useful functions, usually used for the same general purpose (plotting figures, performing statistical analysis, scraping data from the web, etc.). In DataLore, you can use a library by importing it into your code block using the `import python_lib as p_lib` syntax. Once a library is imported, you can use it in your current code block and all following code blocks. 

Here, we are using the `numpy` library for scientific computations. Note that this cell assigns new values to $x$ and $y$, which replace the values defined earlier.

In [7]:
import numpy as np
x = 3.2
y = 5
np.power(x,y);

## Creating and using markdown cells
Creating markdown cells is a useful method of organizing your code into readable blocks. In DataLore, you can create a markdown cell in one of three ways:
1. Create a new code cell and press `Ctrl+M`
2. Create a new code cell. On the top-right of the cell, click on the icon in between `+` and the "trash" icon. Select the 'Convert to markdown' option
3. At the bottom of your current cell, select 'More cell types' and select the 'Markdown' option

You can modify markdown text, use numbered or point-form lists, or insert code snippets and equations using the options available on the top-left of the markdown cell. [A handy markdown cheatsheet can be found here](https://www.markdownguide.org/cheat-sheet/).

Let's try creating some markdown text!

# One hash for section headings 
## Two hashes for subsection headings 
### Three for subsubsection headings
...and four is too many

Here are some other useful markdown functions:
* You can use "* " or "- " to list items
* Backticks `your code here` can be used to highlight code
* Create bold or italic text by hitting `Ctrl+b` or `Ctrl+i` respectively. Alternatively, you can wrap your text with two asterisks for **bold font**, or one asterisk for *italic font*.
* Skip one line to begin a new paragraph
* Type as usual for other text. 

Double-click on this cell to view these functions (and more) in action!

# Introduction to Python
Python is a zero-indexed programming language used throughout engineering due to its readability and versatility. Here are some things that you can achieve using Python:
1. Scrape and load data from the web 
2. Perform basic mathematical computation
3. Organize, filter, and analyze complex datasets 
4. Create interactive charts and other visualizations

In this session, we will learn to identify Python data types, write basic Python code, use the DataLore web interface, load Python libraries containing useful functions, and use Python's Pandas and Matplotlib libraries.

# Python Basics
## Commenting using Python
Commenting your code not only helps you clarify your code, but also makes it more readable to other individuals using it. Leaving brief comments also enables you to explain your code to yourself and prevent mistakes from being made. In Python, comments are indicated using a hash sign `#`.
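
As a brief illustration of the comment syntax described above, here is a minimal sketch. The variable names and the circle-area calculation are hypothetical and chosen only for this example.

```python
# A full-line comment: everything after the hash sign is ignored by Python.
radius = 2.0  # an inline comment can follow code on the same line

# Comments are a good place to record units or assumptions.
area = 3.14159 * radius**2  # approximate area of a circle, in square units
```
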
## Printing in Python 
Printing in Python is a relatively straightforward step you can take to help identify potential errors in your code or to view the output of your computations.
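
A short sketch of `print()` in use is given below. The variable `total` is a made-up name for illustration. `sep` is a standard optional argument of `print()`.

```python
# print() accepts several values and separates them with a space by default.
total = 3 + 4
print("The total is", total)    # The total is 7

# The optional sep argument changes the separator between values.
print("a", "b", "c", sep="-")   # a-b-c
```
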
## Performing basic computations in Python
Python's built-in `math` library provides functions for performing mathematical computations. Some functions that you might find useful throughout this course include `math.sqrt()`, `math.exp()`, `math.log()`, `math.sin()`, `math.cos()` and `math.tan()`. The full list of functions in this library can be found at [this link](https://docs.python.org/3/library/math.html).

In this next cell we find the square root of 2 using the `math.sqrt()` function.

In [8]:
# Let's perform some basic computations using the math library
import math
math.sqrt(2)

1.4142135623730951

We can also suppress the output of the function by using a semicolon ";" or by assigning the output to a variable (here we use $x$):

In [9]:
math.sqrt(2);

If you would still like to view the value of the square root of 2, you can use the `print()` function. 

In [13]:
x = math.sqrt(2)
print("Value of x: ", str(x))

Value of x:  1.4142135623730951


You can format your output using the `format` method alongside the `print()` function:

In [20]:
print('{0} = {1:1.5f} and {2} = {3:2.3f}'.format('y', math.sqrt(2), 'z', 9.3**5))

y = 1.41421 and z = 69568.837

# Break 

**Cat Fact**: Did you know the cat is the mammal with the greatest difference between dilated and undilated pupils?

Take some time to stand up and stretch, then head over to tab **2. Python Intro I**

# 2. Python Intro I

# An Introduction to the Python Programming Language I
Let's go at top speed over some basics. Press 'Shift+Enter' to execute a cell, or click the top left play button.
## Functions
A function is a block of reusable code that runs only when it is called. It can take any number of input parameters, or none at all. Likewise, it can return any number of outputs of any combination of data types, or none at all. A Python function *does not* have to return a value. However, when returning more than one output, all outputs will be grouped together in a tuple.

The Python function is defined using the `def` keyword. Let's try running a few Python functions to get an idea of how this works.

`add_numbers` is a function that takes two numbers and adds them together.

In [21]:
def add_numbers(x, y):
    return x + y

add_numbers(1, 2)

a = 3
b = 1.1
def minus_numbers():
    return a-b
c = minus_numbers()
print(c)

1.9
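
The earlier point about multiple return values being grouped into a tuple can be sketched as follows. The function `min_and_max` is a hypothetical helper written only for this illustration.

```python
def min_and_max(values):
    # Returning two values separated by a comma packs them into a tuple.
    return min(values), max(values)

result = min_and_max([3, 1, 4, 1, 5])
print(type(result))   # <class 'tuple'>
print(result)         # (1, 5)

# The tuple can be unpacked directly into separate variables.
lowest, highest = min_and_max([3, 1, 4, 1, 5])
```
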


Let's update `add_numbers` to take an optional 3rd parameter. Using `print` allows printing of multiple expressions within a single cell.

In [25]:
def add_numbers(x,y,z=None):
    if z is None:
        return x+y
    else:
        return x+y+z

#print(add_numbers(1, 2))
print(add_numbers(1, 2, 3))

6


Flag parameters are optional parameters that do not require a value assigned to them when the function is called. If a value is assigned to a flag parameter, it will take on the value assigned to it when the function is called. Otherwise, it takes the default parameter value assigned to it when the function was originally defined. Let's update `add_numbers` to take an optional flag parameter.

In [27]:
def add_numbers(x, y, z=None, flag=False):
    if (flag==False):
        print('Flag is false!')
    if z is None:
        return x + y
    else:
        return x + y + z
    
print(add_numbers(1, 2, flag=True))

3


We can also assign a function itself to a variable. Let's assign the function `add_numbers` to variable `a`, then call it through `a`.

In [29]:
def add_numbers(x,y):
    return x+y

a = add_numbers
a(2,2.4)

4.4

## Data types
Let's quickly review several key data types in Python. You can identify the data type of a value using the `type()` function.

### Integers
The `int` data type represents integers, or whole numbers without decimals.

In [None]:
# integers
type(1)

int

### Floats
The `float` data type represents numbers that contain a decimal point.

In [None]:
# float
type(1.0)

float

### Lists
The `list` data structure is used to store multiple items in a single variable. Items can be of the same or different data types. Lists are mutable (can be altered).

In [30]:
x = [1, 'a', 2, 'b', 6.33]
type(x)

list

In [32]:
b = 'integer'
type(b)

str

Use `append` to append an object to a list.

In [31]:
x.append(3.3)
print(x)

[1, 'a', 2, 'b', 6.33, 3.3]


Here is an example of how to loop through each item in the list.

In [33]:
for item in x:
    print(item)

1
a
2
b
6.33
3.3


In [34]:
len(x)

6

Or using the indexing operator:

In [35]:
i=0
while i != len(x):
    print(i)
    print(x[i])   # print the i-th element of list x
    i = i + 1   # add one to i

0
1
1
a
2
2
3
b
4
6.33
5
3.3


Now try re-writing the while loop above as a for loop!

In [39]:
# write for loop here
# range() excludes its second parameter (the stop value)
for i in range(0,len(x)):
    print("current element: ", str(x[i]))

current element:  1
current element:  a
current element:  2
current element:  b
current element:  6.33
current element:  3.3


You can also use the `+` operator to concatenate lists.

In [None]:
[1,2] + [3,4]

[1, 2, 3, 4]

You can use `*` to repeat lists.

In [40]:
[1,"a","c"]*3

[1, 'a', 'c', 1, 'a', 'c', 1, 'a', 'c']

Use the `in` operator to check if something is inside a list. This operator returns a boolean value (true or false).

In [42]:
5 in x

False

Let's iterate from 0 to 999 and return the even numbers in a list.

In [None]:
my_list = []
for number in range(0, 1000):
    if number % 2 == 0:
        my_list.append(number)
my_list

### Strings
Now let's look at strings. The `str` data type is used for text. In Python, a string is a sequence of characters, so it can be indexed much like a list. Use bracket notation to slice a string.

In [47]:
# strings
type('This is a string in Python')

x = 'This is a string in Python'
print(x[0]) #first character
print(x[0:1]) #first character, but we have explicitly set the end character
print(x[0:2]) #first two characters

T
T
Th


You can use this method to return the last element of the string.

In [48]:
x[-1]

'n'

You can also select unique orders of string elements using slicing. This slicing method will return the slice starting from the 4th element from the end and stopping before the 2nd element from the end.

In [49]:
x[-4:-2]

'th'

<br>
This is a slice from the beginning of the string, stopping before index 3 (the 4th element).

In [51]:
x[:3]

'Thi'

<br>
And this is a slice starting from the 4th element of the string and going all the way to the end.

In [None]:
x[3:]

's is a string in Python'

You can also reverse the order of the string. This syntax is applicable for any list or array.

In [52]:
x[::-1]

'nohtyP ni gnirts a si sihT'

You can also concatenate two (or more) strings with the `+` operator.

In [None]:
firstname = 'Christopher'
lastname = 'Brooks'

print(firstname + ' ' + lastname)
print(firstname*3)
print('Chris' in firstname)

Christopher Brooks
ChristopherChristopherChristopher
True


Make sure you convert objects to strings before concatenating!

In [55]:
'''
# Incorrect
str_wrong = 'Chris' + 2
print(str_wrong)
'''
# Correct
str_right = 'Chris ' + str(2)
print(str_right)

Chris 2


In contrast, you can split a string using `split`, which returns a list of all the words in a string, or a list split on a specific character.

In [58]:
firstname = 'Christopher Arthur Hansen Brooks'.split(' ')[0] # [0] selects the first element of the list
lastname = 'Christopher Arthur Hansen Brooks'.split(' ')[-1] # [-1] selects the last element of the list
#print(firstname)
print(lastname)

Brooks


Python also has a built-in method for convenient string formatting.

In [None]:
sales_record = {
'price': 3.24,
'num_items': 4,
'person': 'Chris'}

sales_statement = '{} bought {} item(s) at a price of {} each for a total of {}'

print(sales_statement.format(sales_record['person'],
                             sales_record['num_items'],
                             sales_record['price'],
                             sales_record['num_items']*sales_record['price']))

Chris bought 4 item(s) at a price of 3.24 each for a total of 12.96
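
As an alternative sketch of the same sales statement, Python 3.6 and later also support f-strings, which place the expressions directly inside the string. The variable names `total` and `statement` are introduced here only for illustration.

```python
sales_record = {'price': 3.24, 'num_items': 4, 'person': 'Chris'}
total = sales_record['num_items'] * sales_record['price']

# Expressions inside {} are evaluated; :.2f formats a float to two decimals.
statement = (f"{sales_record['person']} bought {sales_record['num_items']} item(s) "
             f"at a price of {sales_record['price']:.2f} each for a total of {total:.2f}")
print(statement)
```
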


### Tuples
The `tuple` data structure is used to store multiple items in a single variable. Items can be the same or different data types. Tuples are an immutable data structure (cannot be altered).

In [None]:
x = (1, 'a', 2, 'b')
type(x)

tuple
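
To make the immutability of tuples concrete, the sketch below tries to change an item and catches the resulting error. The names `point` and `updated` are hypothetical.

```python
point = (1, 'a', 2, 'b')

try:
    point[0] = 99          # tuples do not support item assignment
except TypeError as err:
    print("Cannot modify a tuple:", err)

# To "change" a tuple, build a new one instead.
updated = (99,) + point[1:]
print(updated)             # (99, 'a', 2, 'b')
```
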

### Dictionaries
`Dictionaries` are data types that associate keys with values.

In [60]:
x = {'Christopher Brooks': 'brooksch@umich.edu', 'Bill Gates': 'billg@microsoft.com'}
x['Bill Gates'] # Retrieve a value by using the indexing operator

'billg@microsoft.com'

In [61]:
x['Kevyn Collins-Thompson'] = 'kct@cornell.edu'
x['Kevyn Collins-Thompson']

'kct@cornell.edu'

You can iterate over all of the keys in a dictionary:

In [None]:
# only returns dictionary keys
for name in x:
    print(name)

Christopher Brooks
Bill Gates
Kevyn Collins-Thompson


You can also iterate over all values in a dictionary...

In [None]:
# returns the values associated with the keys
for email in x.values():
    print(email)

brooksch@umich.edu
billg@microsoft.com
kct@cornell.edu


...or over all items in the dictionary.

In [63]:
for name, email in x.items():
    print(name, " : ", email)
    #print(email)

Christopher Brooks  :  brooksch@umich.edu
Bill Gates  :  billg@microsoft.com
Kevyn Collins-Thompson  :  kct@cornell.edu


### Sequences
You can unpack a `sequence` data structure into different variables:

In [None]:
x = ('Christopher', 'Brooks', 'brooksch@umich.edu')
fname, lname, email = x

In [None]:
fname

'Christopher'

In [None]:
lname

'Brooks'

<br>
Make sure the number of values you are unpacking matches the number of variables being assigned.

In [None]:
x = ('Christopher', 'Brooks', 'brooksch@umich.edu', 'Ann Arbor')
fname, lname, email = x

ValueError: too many values to unpack (expected 3)

### None, NoneTypes and Functions
Some other data types that might be useful to know are the `NoneType` and the `function` data types. We have seen the function data type prior to this. `None` is a Python object that represents the absence of a value; it is also the default return value of a Python function that has no explicit return value. Do not confuse `None` with `NoneType`, which is the data type of `None`.

In [3]:
# NoneType
type(None)

NoneType

In [None]:
type(add_numbers)

function
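
The relationship between `None` and functions without a return value can be sketched as follows. The function `greet` and the name passed to it are made up for this example.

```python
def greet(name):
    # This function prints a message but has no return statement.
    print("Hello,", name)

result = greet("Ada")
print(result is None)    # True
print(type(result))      # <class 'NoneType'>
```

Comparing with `is None` rather than `== None` is the usual idiom, since there is only one `None` object.
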

## Dates and Times
Python provides an easy and intuitive method to format dates and times using the `time` and `datetime` libraries. The `time` library is primarily used to measure the length of time that a certain number of lines of code or function takes to run. It also provides a quick way of identifying start, end, and current clock times. On the other hand, the `datetime` library allows more detailed formatting of time (years, months, etc.). 

Let's begin by importing these two libraries into our code.

In [None]:
import datetime as dt
import time as tm

The `time()` function returns the *current* time in seconds since the Epoch.

In [None]:
tm.time()

We use the `datetime.fromtimestamp()` function to convert the timestamp to datetime.

In [None]:
dtnow = dt.datetime.fromtimestamp(tm.time())
dtnow

Here are some other handy `datetime` attributes:

In [None]:
dtnow.year, dtnow.month, dtnow.day, dtnow.hour, dtnow.minute, dtnow.second # get year, month, day, etc. from a datetime

The `timedelta()` function returns a duration, which can express the difference between two dates.

In [None]:
delta = dt.timedelta(days = 100) # create a timedelta of 100 days
delta

Next, the `date.today()` function returns the current local date.

In [None]:
today = dt.date.today()
print(today - delta) # the date 100 days ago
print(today > today-delta) # compare dates
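
The two uses described at the start of this section (formatting dates and timing code) can be sketched as below. The specific date and the summation being timed are arbitrary choices for illustration. `strftime` is the standard `datetime` method for formatting.

```python
import datetime as dt
import time as tm

# A fixed datetime (rather than "now") keeps the output reproducible.
moment = dt.datetime(2021, 3, 14, 15, 9, 26)
formatted = moment.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)               # 2021-03-14 15:09:26

# Subtracting two datetimes yields a timedelta.
later = moment + dt.timedelta(days=100)
print((later - moment).days)   # 100

# Timing a piece of code with the time library.
start = tm.time()
total = sum(range(100000))
elapsed = tm.time() - start
print("Elapsed seconds:", elapsed)
```
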

# Break 

**Python Fact**: Pythons (specifically, ball pythons) are the most popular pet snake subspecies in the US. Unfortunately, well-intentioned (but misguided) python releasing in the Florida Everglades has caused them to be classified as an invasive species!

If you're yawning as widely as a starved python, now's a good time to take a quick water break! We'll head on to **3. Python Intro II** in a bit.

# 3. Python Intro II

# An Introduction to the Python Programming Language II
## Python Objects
Python is an object-oriented programming language. This means that many things (such as dates, strings, etc.) are objects that have their own properties. You can also define your own objects using the `class` keyword.

Here is a quick example of a class in Python that we call `Person`. This class has two properties (a name and a location), two setter functions (`set_name` and `set_location`) that allow you (the user) to set the object's properties, and one getter function (`get_name`) that returns the name.

In [6]:
class Person:
    department = 'School of Information' # a class variable

    # setter functions
    def set_name(self, new_name): #a method
        self.name = new_name
    def set_location(self, new_location):
        self.location = new_location

    # getter function
    def get_name(self):
        return self.name 
    

In [8]:
# Making the object and setting its properties
mark = Person()
mark.set_name('Mark Zuckerberg')
mark.set_location('Ann Arbor, MI, USA')
print('{} lives in {} and works in the department {}'.format(mark.name, mark.location, mark.department))

Mark Zuckerberg lives in Ann Arbor, MI, USA and works in the department School of Information


In [9]:
mark.get_name()

'Mark Zuckerberg'

### Try making your own class!
You are a product manager for the new IKEA lamp. Write a class for the lamp that contains the following information about the lamp:
1. Name of the product
2. Product height
3. Base width
4. Colors
5. Projected cost of production

Your class should include setter and getter functions so the user has both access to the product information, and can set their own desired product features. 

In [12]:
# define your IKEA lamp here
class ikea_lamp():
    company = 'Ikea' # a class variable

    # setter functions
    def set_name(self, new_name): #a method
        self.name = new_name
    def set_height(self, new_height):
        self.height = new_height
    def set_colors(self, colors):
        self.colors = colors

    # getter function
    def get_colors(self):
        return self.colors 
    

In [13]:
laamp = ikea_lamp()
laamp.set_name('Laampe')
laamp.set_colors(['purple', 'blue', 'maroon'])

In [14]:
laamp.get_colors()

['purple', 'blue', 'maroon']

## The `map()` function
The `map()` function:
1. Applies a given function (`min`, `max`, etc.) to each item of a given iterable (lists, tuples, arrays)
2. Returns a `map` object, which itself can be iterated over

You can pass any number of iterables to the `map()` function. Let's take a look at the example below.

In [21]:
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]
cheapest = map(min, store1, store2)
print(cheapest)

<map object at 0x7f1fd6543520>


<br>
Now let's iterate through the map object to see the values.

In [4]:
for item in cheapest:
    print(item)

9.0
11.0
12.34
2.01


### Challenge: Can you find the store that stocks the cheaper option for each of the four items?
Write your code below!

In [None]:
# Write your code below
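
One possible approach to the challenge, offered as a sketch rather than the only answer, is to pass `map()` a function that compares the two prices and returns a label. The function `cheaper_store` and the labels `'store1'`, `'store2'`, and `'tie'` are assumptions made for this illustration.

```python
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]

def cheaper_store(price1, price2):
    # Compare one item's price at both stores and name the cheaper one.
    if price1 < price2:
        return 'store1'
    elif price2 < price1:
        return 'store2'
    return 'tie'

# map() calls cheaper_store once per pair of prices.
cheapest_store = list(map(cheaper_store, store1, store2))
print(cheapest_store)   # ['store2', 'store1', 'tie', 'store2']
```
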

## Lambdas
In Python, a lambda can be thought of as a 'small' or 'short-cut' function. It can take any number of parameters, but its body is limited to a single expression, whose value is returned.

Here's an example of a lambda that takes in three parameters and adds the first two.

In [16]:
my_function = lambda a, b, c : a + b
my_function(1, 2, 5)

3
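
Lambdas are most often passed directly to other functions. The sketch below shows two common cases, using made-up data: a lambda given to `map()`, and a lambda used as a sort key.

```python
prices = [10.00, 11.00, 12.34, 2.34]

# Apply a lambda to every item with map(): the cost of buying two of each.
doubled = list(map(lambda p: p * 2, prices))
print(doubled)     # [20.0, 22.0, 24.68, 4.68]

# Use a lambda as a sort key: order names by their length.
names = ['Christopher', 'Bill', 'Kevyn']
by_length = sorted(names, key=lambda name: len(name))
print(by_length)   # ['Bill', 'Kevyn', 'Christopher']
```
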

Try writing a lambda that performs addition only if $b$ is greater than 5.

In [None]:
# add a and b only when b is greater than 5; otherwise return 0
my_function = lambda a, b: a + b if b > 5 else 0
print(my_function(1, 4))
print(my_function(3, 7))

## List comprehensions
List comprehensions are a readable way of iterating over a list or array to select only the items that meet a certain condition. The main use of list comprehensions is to create a subset of items from a known list. The syntax is as follows:

In [None]:
# Do not run this line of code, for demonstration purposes only
newlist = [x for x in list if condition == True]

Now let's try it out!

In [None]:
my_list = [number for number in range(0,1000) if number % 2 == 0]
my_list
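
To show how a list comprehension maps onto an ordinary loop, the sketch below writes the same computation both ways. The example (squares of odd numbers below 10) and the variable names are chosen only for illustration.

```python
# Loop version: collect the squares of the odd numbers below 10.
squares_loop = []
for number in range(10):
    if number % 2 == 1:
        squares_loop.append(number**2)

# Equivalent list comprehension: expression, then loop, then condition.
squares_comp = [number**2 for number in range(10) if number % 2 == 1]

print(squares_comp)   # [1, 9, 25, 49, 81]
```
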

# Break 

**Python Fact**: Ball pythons can suffer from obesity - pet pythons only need to be fed every 1-2 weeks, and overfeeding them can lead to weight gain!

Speaking of food, get a quick snack or drink and we'll proceed to  **4. Python CSV**.

# 4. Python CSV

# Reading and Writing CSV files
Let's import our datafile `mpg.csv`, which contains fuel economy data for 234 cars.
* `manufacturer` automobile manufacturer
* `model` model of car
* `displ` engine displacement in liters
* `year` model year
* `cyl` number of cylinders
* `trans` type of transmission
* `drv` f = front-wheel drive; r = rear-wheel drive; 4 = 4wd
* `cty` city mpg
* `hwy` highway mpg
* `fl` fuel (e = ethanol E85, d = diesel, r = regular, p = premium, c = CNG)
* `class` car classification

In [3]:
import csv

%precision 2   
with open('mpg.csv') as csvfile:
    mpg = list(csv.DictReader(csvfile))

mpg[:3] # The first three dictionaries in our list.

Let's see how the data is organized, and how many dictionaries (CSV file rows) the dataset contains.

`csv.DictReader` has read in each row of our csv file as a dictionary. `len` shows that our list is comprised of 234 dictionaries.

In [2]:
len(mpg)

234
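
If `mpg.csv` is not at hand, the behaviour of `csv.DictReader` can still be explored with a small in-memory stand-in. The sketch below assumes made-up rows (they are not taken from `mpg.csv`) and uses `io.StringIO` to make a string behave like an open file.

```python
import csv
import io

# A tiny in-memory stand-in for a CSV file (made-up rows, not from mpg.csv).
sample_text = "manufacturer,cyl,cty\nacme,4,20\nacme,6,15\nglobex,4,24\n"

rows = list(csv.DictReader(io.StringIO(sample_text)))

print(len(rows))              # 3
print(list(rows[0].keys()))   # ['manufacturer', 'cyl', 'cty']
print(rows[0]['cty'])         # 20  (a string, not a number)

# Values must be converted before doing arithmetic.
average_cty = sum(float(r['cty']) for r in rows) / len(rows)
```
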

`keys` gives us the column names of our csv.

In [None]:
mpg[0].keys()

This is how to find the average `cty` fuel economy across all cars. All values in the dictionaries are strings, so we need to convert these values to floats.

In [None]:
sum(float(d['cty']) for d in mpg) / len(mpg)

Similarly, this is how to find the average `hwy` fuel economy across all cars.

In [None]:
sum(float(d['hwy']) for d in mpg) / len(mpg)

You can use the `set` function to return unique values for the number of cylinders the cars in our dataset have.

In [None]:
cylinders = set(d['cyl'] for d in mpg)
cylinders

Here's a more complex example where we are grouping the cars by number of cylinders, and finding the average `cty` mpg for each group.

In [None]:
CtyMpgByCyl = []

for c in cylinders: # iterate over all the cylinder levels
    summpg = 0
    cyltypecount = 0
    for d in mpg: # iterate over all dictionaries
        if d['cyl'] == c: # if the cylinder level type matches,
            summpg += float(d['cty']) # add the cty mpg
            cyltypecount += 1 # increment the count
    CtyMpgByCyl.append((c, summpg / cyltypecount)) # append the tuple ('cylinder', 'avg mpg')

CtyMpgByCyl.sort(key=lambda x: x[0])
CtyMpgByCyl
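
The nested loops above scan the whole dataset once per cylinder level. An alternative sketch, which is not the approach used in this notebook, groups the values in a single pass with `collections.defaultdict`. The rows below are made up and only mimic the shape that `csv.DictReader` produces.

```python
from collections import defaultdict

# Made-up rows in the same shape that csv.DictReader produces.
rows = [
    {'cyl': '4', 'cty': '20'},
    {'cyl': '6', 'cty': '15'},
    {'cyl': '4', 'cty': '24'},
    {'cyl': '6', 'cty': '17'},
]

# Collect every cty value under its cylinder count in a single pass.
cty_by_cyl = defaultdict(list)
for row in rows:
    cty_by_cyl[row['cyl']].append(float(row['cty']))

# Average each group and sort by cylinder count.
averages = sorted((cyl, sum(vals) / len(vals)) for cyl, vals in cty_by_cyl.items())
print(averages)   # [('4', 22.0), ('6', 16.0)]
```
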

Here, we use `set` again to return the unique values for the class types in our dataset.

In [None]:
vehicleclass = set(d['class'] for d in mpg) # what are the class types
vehicleclass

And here's an example of how to find the average `hwy` mpg for each class of vehicle in our dataset.

In [None]:
HwyMpgByClass = []

for t in vehicleclass: # iterate over all the vehicle classes
    summpg = 0
    vclasscount = 0
    for d in mpg: # iterate over all dictionaries
        if d['class'] == t: # if the vehicle class matches,
            summpg += float(d['hwy']) # add the hwy mpg
            vclasscount += 1 # increment the count
    HwyMpgByClass.append((t, summpg / vclasscount)) # append the tuple ('class', 'avg mpg')

HwyMpgByClass.sort(key=lambda x: x[1])
HwyMpgByClass

## Try this out!
Modify the code for the prior example where we grouped cars by number of cylinders. Instead, group cars by their manufacturer and find the *maximum* number of cylinders each manufacturer's car has. Identify the manufacturer(s) that produce cars with the greatest number of cylinders.

In [None]:
# Write your code here.

# Break 

**Blob fish Fact**: While other fish species use gas bladders to remain buoyant, blob fish achieve this by simply being - you guessed it - a blob! They have no bones or muscles - they get by simply by floating around in the deep sea and eating whatever they encounter.

If you're feeling like a blob fish on land, it's time to get up and stretch, then move onward to **5. Python NumPy**.

# 5. Python NumPy

# Numerical Python (NumPy)
Numpy is one of the fundamental packages that you will be using a lot throughout this semester. It's mainly used for scientific computing. It provides convenient and efficient ways to manipulate data in arrays, as well as conduct vector operations. You can refer to its [Documentation page](https://numpy.org/doc/stable/) for more detailed information.

Before we can achieve anything remotely productive, we first need to import the Numpy package.

In [17]:
import numpy as np

## Creating Arrays
Numpy provides a number of different ways to initialize and populate arrays. The first method is by first creating a list and converting it to a Numpy array. You can click on the 'raw' button to see the actual list.

In [20]:
mylist = [1, 2, 3]
x = np.array(mylist)
x

You can also just pass in a list directly.

In [21]:
y = np.array([4, 5, 6])
y

To create a multidimensional array (a matrix), you can pass in a list of lists.

In [22]:
m = np.array([[7, 8, 9], [10, 11, 12]])
m

Use the `shape` attribute to find the dimensions of the array, which are returned as (rows, columns).

In [25]:
# shape returns a tuple
m.shape[1]

3

The `arange` function returns evenly-spaced values within a given interval.

In [29]:
n = np.arange(0, 30, 2) # start at 0 count up by 2, stop before 30
print(n)

[ 0  2  4  6  8 10 12 14 16 18 20 22 24 26 28]


Numpy's `reshape` returns an array with the same data with a new shape.

In [32]:
n = n.reshape(5, 3) # reshape array to be 5x3 (5 rows, 3 columns)
n

The `linspace` function returns evenly-spaced numbers over a specified interval.

In [34]:
# linspace returns an array of floats by default
o = np.linspace(0, 4, 9) # return 9 evenly spaced values from 0 to 4
print(o)

[0.  0.5 1.  1.5 2.  2.5 3.  3.5 4. ]


The `resize` function changes the shape and size of an array in-place. That is, it modifies the original array.

In [35]:
o.resize(3, 3)
o
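
The difference between `reshape` and `resize` described above can be sketched as follows. The array names `a`, `b`, and `c` are arbitrary.

```python
import numpy as np

a = np.arange(6)          # [0 1 2 3 4 5]

# reshape returns a new array object (often a view of the same data);
# the original array a keeps its shape.
b = a.reshape(2, 3)
print(a.shape, b.shape)   # (6,) (2, 3)

# resize changes the array itself and returns None.
c = np.arange(6)
c.resize(3, 2)
print(c.shape)            # (3, 2)
```
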

Numpy `ones` returns a new array of given shape and type, filled with ones.

In [45]:
np.ones((1,3)).T

The `zeros` function returns a new array of given shape and type, filled with zeros. The first parameter can be an integer (1D array) or a tuple (a matrix). The second parameter is *optional*; it uses the `dtype` keyword that allows the user to indicate if they would like to specify an array or a matrix of integers or floats.

In [48]:
np.zeros((2, 3), dtype=int)

### Try this out!
Try to create a Numpy 2D float matrix of size (3,6). Next, in a for loop, populate each row with identical arrays of 6 values that are evenly-spaced from 0 to 3.

In [53]:
# Write your code here
sixes = np.zeros((3,6), dtype=float)
for i in range(0, sixes.shape[0]):
    sixes[i,:] = np.linspace(0,3,6)
print(sixes)

[[0.  0.6 1.2 1.8 2.4 3. ]
 [0.  0.6 1.2 1.8 2.4 3. ]
 [0.  0.6 1.2 1.8 2.4 3. ]]


The `eye` function returns a 2-D array with ones on the diagonal and zeros elsewhere.

In [52]:
np.eye(3)

Numpy `diag` extracts a diagonal or constructs a diagonal array.

In [54]:
np.diag(sixes)
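
Both behaviours of `diag` (extracting and constructing) can be seen in this small sketch. The arrays `m`, `d`, and `D` are made up for illustration.

```python
import numpy as np

m = np.arange(9).reshape(3, 3)

# Given a 2-D array, diag extracts the main diagonal as a 1-D array.
d = np.diag(m)
print(d)            # [0 4 8]

# Given a 1-D array, diag builds a 2-D array with those values on the diagonal.
D = np.diag(d)
print(D)
```
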

Create an array using repeating list (or see `np.tile`)

In [56]:
np.array([1, 2, 3] * 3).reshape(3,3)

You can repeat elements of an array using `repeat`.

In [58]:
np.repeat([1, 2, 3], 3).reshape(3,3)
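
The contrast between repeating a whole sequence and repeating each element is easiest to see before reshaping. A minimal sketch, using the same values as above:

```python
import numpy as np

tiled = np.tile([1, 2, 3], 3)        # repeats the whole sequence
repeated = np.repeat([1, 2, 3], 3)   # repeats each element in turn

print(tiled)      # [1 2 3 1 2 3 1 2 3]
print(repeated)   # [1 1 1 2 2 2 3 3 3]
```
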

## Combining Arrays
Sometimes, it might be useful to combine arrays that contain the same (or similar) information. Let's look at a couple of ways to combine arrays using Numpy.

First, we initialize a Numpy matrix of ones.

In [59]:
p = np.ones([2, 3], int)
p

Use `vstack` to stack arrays in sequence vertically (row wise, or on top of each other).

In [60]:
np.vstack([p, 2*p])

Use `hstack` to stack arrays in sequence horizontally (column wise, or beside each other).

In [61]:
np.hstack([p, 2*p])
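
Checking the resulting shapes is a quick way to confirm which direction each function stacks in. The names `v` and `h` below are introduced only for this sketch.

```python
import numpy as np

p = np.ones([2, 3], int)

v = np.vstack([p, 2 * p])   # rows are added: (2, 3) and (2, 3) -> (4, 3)
h = np.hstack([p, 2 * p])   # columns are added: (2, 3) and (2, 3) -> (2, 6)

print(v.shape)   # (4, 3)
print(h.shape)   # (2, 6)
```
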

## Operations
### Basic Operations
The basic operations in Numpy are your usual suspects- use `+`, `-`, `*`, `/` and `**` to perform element-wise addition, subtraction, multiplication, division and power.

In [62]:
x = np.array([1,2,3])
y = np.array([4,5,6])
print(x + y) # elementwise addition     [1 2 3] + [4 5 6] = [5  7  9]
print(x - y) # elementwise subtraction  [1 2 3] - [4 5 6] = [-3 -3 -3]

[5 7 9]
[-3 -3 -3]


In [None]:
print(x * y) # elementwise multiplication  [1 2 3] * [4 5 6] = [4  10  18]
print(x / y) # elementwise division        [1 2 3] / [4 5 6] = [0.25  0.4  0.5]

In [63]:
print(x**2) # elementwise power  [1 2 3] ^2 =  [1 4 9]
print(np.power(x,2))

[1 4 9]
[1 4 9]


### The dot product
The dot product (implemented using the `dot()` function) is a concise way of multiplying two vectors element-wise and summing the results.

$ \begin{bmatrix}x_1 & x_2 & x_3\end{bmatrix}
\cdot
\begin{bmatrix}y_1 \\ y_2 \\ y_3\end{bmatrix}
= x_1 y_1 + x_2 y_2 + x_3 y_3$

In [64]:
x.dot(y) # dot product  1*4 + 2*5 + 3*6

32
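
As a check on the formula, the sketch below computes the dot product by hand (multiply element-wise, then sum) and compares it with `dot()`. It also shows that `dot()` performs matrix multiplication when given a 2-D array. The names `manual` and `m` are assumptions for this example.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Multiply element-wise, then sum: the same value as x.dot(y).
manual = (x * y).sum()
print(manual, x.dot(y))    # 32 32

# For a 2-D array, dot performs matrix multiplication (one result per row).
m = np.array([y, y**2])    # shape (2, 3)
print(m.dot(x))            # [ 32 174]
```
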

In [None]:
z = np.array([y, y**2])
print(len(z)) # number of rows of array

### Transposing arrays 
Let's look at transposing arrays. Transposing permutes the dimensions of the array. Let's take a look at how transposing affects our new array `z`.

In [65]:
z = np.array([y, y**2])
z

The shape of array `z` is `(2,3)` before transposing.

In [66]:
z.shape

(2, 3)

Use `.T` to get the transpose.

In [67]:
(z.T).shape

(3, 2)

The number of rows has swapped with the number of columns!

In [None]:
z.T.shape

Use the Numpy `.dtype` attribute to see the data type of the elements in the array.

In [68]:
z.dtype

dtype('int64')

Use `.astype` to cast all elements in the array to a specific type.

In [69]:
z = z.astype('f')
z.dtype

dtype('float32')

## Math Functions
Numpy has many built-in math functions that can be performed on arrays. We will demonstrate them on a new array `a`.

In [71]:
a = np.array([-4, -2, 1, 3, 5])

In [72]:
a.sum()

3

In [73]:
a.max()

5

In [74]:
a.min()

-4

In [76]:
np.abs(a).mean()

3.0

In [77]:
a.std()

3.2619012860600183

`argmax` and `argmin` return the index of the maximum and minimum values in the array.

In [78]:
a.argmax()

4

In [79]:
a.argmin()

0

## Indexing / Slicing
In Numpy, we can use bracket notation to get the value at a specific index. Remember that indexing in Python starts at 0. We explore indexing with Numpy using a new array `s`.

In [None]:
s = np.arange(13)**2
s

In [None]:
s[0], s[4], s[-1]

Use `:` to indicate a range. `array[start:stop]`

Leaving `start` or `stop` empty will default to the beginning/end of the array.

In [None]:
s[1:5]

<br>
Use negatives to count from the back.

In [None]:
s[-4:]

A second `:` can be used to indicate step size where `array[start:stop:stepsize]`

Here we are starting at the 5th element from the end, and counting backwards by 2 until the beginning of the array is reached.

In [None]:
s[-5::-2]

Now, let's look at a multidimensional array.

In [None]:
r = np.arange(36)
r.resize((6, 6))
r

We can still use bracket notation to slice, but this time, we add one more dimension to it where the first dimension is the row, and the second dimension is the column: `array[row, column]`

In [None]:
r[2, 2]

Similarly, we can use `:` to select a range of rows and/or columns

In [None]:
r[3, 3:6]

Here we are selecting all the rows up to (and not including) row 2, and all the columns up to (and not including) the last column.

In [None]:
r[:2, :-1]

This is a slice of the last row, and only every other element.

In [None]:
r[-1, ::2]

We can also perform conditional indexing. Here we are selecting values from the array that are greater than 30. (Also see `np.where`)

In [None]:
r[r > 30]

Here we are assigning all values in the array that are greater than 30 to the value of 30.

In [None]:
r[r > 30] = 30
r
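
The `np.where` function mentioned above offers a related approach. The sketch below rebuilds a fresh 6-by-6 array so that it is self-contained, and shows both the one-argument and three-argument forms. The names `rows`, `cols`, and `capped` are hypothetical.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

# With one argument, where returns the indices at which the condition holds.
rows, cols = np.where(r > 30)
print(rows, cols)         # [5 5 5 5 5] [1 2 3 4 5]

# With three arguments, where picks between two alternatives element by
# element and returns a new array, leaving the original unchanged.
capped = np.where(r > 30, 30, r)
print(capped.max())       # 30
print(r.max())            # 35
```
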

## Copying Data
Be careful with copying and modifying arrays in NumPy! Let's walk through some common errors using the `r` array.

`r2` is a slice of `r`

In [None]:
r2 = r[:3,:3]
r2

Set this slice's values to zero (`[:]` selects the entire array)

In [None]:
r2[:] = 0
r2

Note that `r` has also been changed!

In [None]:
r

To avoid this, use `r.copy()` to create a copy that will not affect the original array.

In [None]:
r_copy = r.copy()
r_copy

Now when `r_copy` is modified, `r` will not be changed.

In [None]:
r_copy[:] = 10
print(r_copy, '\n')
print(r)
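
One way to check whether two arrays share the same underlying data is `np.shares_memory`. The sketch below is self-contained, and the names `view` and `copied` are introduced only for illustration.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

view = r[:3, :3]            # basic slicing returns a view of r's data
copied = r[:3, :3].copy()   # copy() allocates separate data

print(np.shares_memory(r, view))     # True
print(np.shares_memory(r, copied))   # False

copied[:] = 0               # modifying the copy leaves r untouched
print(r[0, 1])              # 1
```
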

## Iterating Over Arrays
Let's demonstrate how to iterate over Numpy arrays. We start by creating a new 4-by-3 array of random integers from 0 to 9.

In [None]:
test = np.random.randint(0, 10, (4,3))
test

Iterate by row:

In [None]:
for row in test:
    print(row)

Iterate by index:

In [None]:
for i in range(len(test)):
    print(test[i])

Iterate by row and index:

In [None]:
for i, row in enumerate(test):
    print('row', i, 'is', row)

We can use `zip` to iterate over multiple iterables.

In [None]:
test2 = test**2
test2

In [None]:
for i, j in zip(test, test2):
    print(i,'+',j,'=',i+j)

# Break 

**Dumbo Octopus Fact**: The dumbo octopus is officially the cutest deep-sea octopus. No, this is not actually a fact, but I dare you to fight me on this - Google 'Dumbo octopus' and you'll know why.

We'll shortly move on to **6. Pandas**, so take a breather before we learn another Python package!