## Introduction to Python

This notebook covers the basics of programming in Python. If you have programmed in another language before, Python should be fairly easy to pick up; otherwise, the notebook aims to teach the main programming concepts required to use Python for data science in the rest of the class.

There are no data science concepts here, but an understanding of variables, control flow, data structures, and the syntax of Python will be very helpful moving forward.

Original Author: J. Pickard 
Last Modified by J. Gryak on 09/06/22

In [None]:
# Basic Python Functions
print('Hello world')

# Fundamentals of Programming

The goal of using Python is to let us explore the implementation of concepts covered in earlier sessions. To do this, we need to cover the main parts of any program:

1. Variables
2. Data Structures
3. Control Flow

These 3 concepts are the minimum programming background needed to explore data science and machine learning.

## Variables

The purpose of a variable is to store data in the program. A variable stores the smallest amount of data that we can work with in a programming language. The three main types of variables Python uses are:

1. Numbers (Integers or floats)
2. Boolean (True/False)
3. Strings (Words and lists of characters)

Since we will be working with data, we will primarily be interested in working with numbers. Python automatically handles setting the type of variable when it is created. Every type of variable should allow for:

1. Assignment (create a variable and assign it a value)
2. Modification (reassign the value)
3. Reference (access the value)

In the case of a number, the above operations can be accomplished with the assignment operator `=` and the standard arithmetic operators (`+`, `-`, `*`, `/`). Variable names must start with a letter or an underscore (not a digit).
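
As a minimal sketch of the three variable types listed above, the following cell assigns one value of each kind and asks Python which type it inferred. The variable names are hypothetical and chosen only for illustration.

```python
# Python infers the type of a variable from the value assigned to it.
count = 3          # integer
height = 1.75      # float
is_ready = True    # Boolean
name = "Python"    # string

# type(...) reports the type Python chose for each value
for value in (count, height, is_ready, name):
    print(value, type(value).__name__)
```

Running the cell should show `int`, `float`, `bool`, and `str` next to the four values.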

### Assignment

Variable assignment creates a new variable: a section of computer memory is set aside, a value is saved in that memory, and a name is attached to it so that the program can refer to it. When a variable is assigned, the variable name is on the left side of `=` and the value is on the right side.

### Modification

Modifying a variable is similar to assigning a value to a variable that already exists. Using the name of a preexisting variable, a new value for the variable is saved. It is best practice for the new value of the variable to be the same type of data as the original (i.e. if a variable is initialized as `x=5`, then when `x` is modified its value should be set to another number). Similar to assignment, when modifying the value of a variable, the variable name is on the left of `=` and the new value is on the right.

### Reference

Referencing a variable accesses the value of the variable and passes the saved data to another piece of the program. When a variable is referenced, it appears on the right side of `=`, within a function call, or in the condition of a control flow statement.

In [None]:
# Assignment of a number
x = 5          # x is the name of a new variable
              # 5 is saved in memory as the value of x

print(x)       # The value of x is passed to the print function
              # This is a reference to the variable x

In [None]:
# Assignment of a string
y = "Michigan"          # y is the name of a new variable
              # The string "Michigan" is saved in memory
              
print(y)       # The value of y is passed to the print function
              # This is a reference to the variable y

In [None]:
# Modification of a number
x = 10         # x is already created as a variable
              # the old value 5 is erased and 10 is saved as the new value
print(x)

In [None]:
# Modification of a string
y = 'Go Blue'

print(y)
y = y + '!'
print(y)

In [None]:
# Referencing a number
print('x+1=' + str(x+1))
print('x-1=' + str(x-1))
print('x*2=' + str(x*2))
print('x/2=' + str(x/2))

In [None]:
# Referencing a number in comparisons
# i == j is True if variables i and j have the same value and False otherwise
# i != j is True if variables i and j have different values and False otherwise
print('x=' + str(x))
print('x==5: ' + str(x==5))
print('x==6: ' + str(x==6))
print('x!=5: ' + str(x!=5))
print('x!=6: ' + str(x!=6))
print('x<6: ' + str(x<6))
print('x>6: ' + str(x>6))

In [None]:
# Referencing a string
print(y[0:2])   # Prints only the first 2 characters of the string
print(y[3:7])   # Prints the characters at indices 3 through 6 of the string
z = y           # References y and assigns value of y to z
print(z)        # Reference to z

In [None]:
# References and assignment
x = x + 1
print(x)
x = x - 1
print(x)
x = x * 2
print(x)
x = x / 2
print(x)
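
One detail worth noticing about `x = x / 2`: in Python 3 the `/` operator always produces a float, even when both operands are integers. The sketch below illustrates this and shows floor division (`//`) as one way to keep an integer; the variable names are hypothetical.

```python
n = 10
print(type(n).__name__)      # int

n = n / 2                    # true division always produces a float
print(n, type(n).__name__)   # 5.0 float

m = 10
m = m // 2                   # floor division of two integers stays an integer
print(m, type(m).__name__)   # 5 int
```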

## Data Structures

A data structure is like a larger variable. Data structures are used to store larger amounts of information than single variables. The most common type of data structure we will use is an array (a `list` in Python), which stores a value at each index in the array. Similar to a variable, data structures must provide the following:

1. Assignment
2. Modification
3. Reference

Each of these fills the same role as it did for single variables.

An array is a contiguous segment of memory, and each section of memory stores the same information as one single variable. Sections of an array can be accessed for the purpose of assignment, modification, or reference by using `[index]`, where `index` is a number referencing a specific position in the array. This is similar to how we referenced elements of a string in a previous cell.

The cell below shows an example of using arrays similar to how variables were used in the previous section.

In [None]:
# Assign arr to be an array of numbers
arr = [0,1,2,3]
print('arr=' + str(arr))

# Modify a single element in arr
arr[0] = 5
print('arr=' + str(arr))

# Reference arr
arr2 = arr
print('arr2=' + str(arr2))

# Reference an element in arr
arr_element = arr[0]
print('arr_element=' + str(arr_element))

# Reference a set of elements in arr (indices 1 and 2)
arr_elements = arr2[1:3]
print('arr_elements=' + str(arr_elements))
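
A point that often surprises newcomers: assigning one array to another name (as in `arr2 = arr`) does not copy the data. Both names refer to the same list, so a modification through one name is visible through the other. The following is a minimal sketch with hypothetical names; `list.copy()` is one standard way to get an independent copy.

```python
original = [0, 1, 2, 3]
alias = original               # a second name for the SAME list
independent = original.copy()  # a new list with the same elements

alias[0] = 5                   # modify the list through the alias

print(original)           # [5, 1, 2, 3]  (changed, because alias is the same list)
print(independent)        # [0, 1, 2, 3]  (unchanged, because it is a separate copy)
print(alias is original)  # True
```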

## Control Flow

Control flow is the order in which lines of code are executed. When a program begins, the top line of code is always executed first, and each line of code directs which line will be next. We will see control flow in the form of:
1. If/Else
2. Loops
3. Functions

TIP: In Python, the indentation of a line of code determines which code block the line belongs to. All consecutive lines of code at the same level of indentation constitute a code block, and all code in a code block executes in order.
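
To make the tip concrete, here is a small sketch (the `if` statement itself is explained in the next section). The value of `temperature` and the messages are hypothetical; the point is only which lines belong to the indented block.

```python
temperature = 30   # hypothetical value for illustration
messages = []

if temperature > 25:
    messages.append("warm")          # indented: part of the if block
    messages.append("drink water")   # same indentation: same block
messages.append("done")              # not indented: runs regardless of the condition

print(messages)  # ['warm', 'drink water', 'done']
```

If `temperature` were set to 20 instead, only `"done"` would be recorded, because the two indented lines would be skipped together.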

### If/Else

In an `if` statement, when the condition is true, the following lines of indented code are executed, but when the condition is not true, the execution step will jump to the next `if`, `elif` (else if), or `else` statement. For a set of `if` statements, once a section has been executed, the code skips checking the remaining `if`, `elif`, or `else` conditions. The condition being checked must take the form of an expression that evaluates to `True` or `False`.

For example, conditions such as `x < 5`, `x > 5`, and `x == 5` (`x` equals 5) must always be either `True` or `False`, and the truth value of these statements depends on the value of `x`. Conditions can be combined using the `and` and `or` operators.

For example, `x < 5 or x == 5` is `True` when `x` equals 5 or `x` has a value less than 5. Sometimes this condition is true, and sometimes it is false, making it a good condition to check.

However, the statements `x < 5 and x > 5` and `x == 5 or x != 5` are both bad predicates. The first one is bad because it is always `False`: no number can be both smaller and greater than 5. The second predicate is bad because it is always `True`: every number is either equal to 5 or not equal to 5. So no matter the value of `x`, the first predicate is `False` and the second predicate is `True`.
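
The three predicates discussed above can be checked directly. This sketch evaluates each one for a few sample values of `x` (the sample values are arbitrary, and the `for` loop used here is covered later in this notebook).

```python
useful, never, always = [], [], []

for x in [3, 5, 7]:
    useful.append(x < 5 or x == 5)    # depends on the value of x
    never.append(x < 5 and x > 5)     # False for every x
    always.append(x == 5 or x != 5)   # True for every x

print(useful)  # [True, True, False]
print(never)   # [False, False, False]
print(always)  # [True, True, True]
```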

In Python, we use the following operators to check relationships among variables:
* Greater than: `>`
* Less than: `<`
* Greater than or equal to: `>=`
* Less than or equal to: `<=`
* Equal: `==` (note how this is different from variable assignment with `=`)
* Not equal: `!=`

In [None]:
# TODO: Fill in the conditions of the if statement block
x = 5
if x == 5:
    print("x==5 is true")
elif x < 6:    # Only checked when the first condition is False
    print("x<6 is true")
else:          # Only reached when neither condition is True
    print("else: neither condition was true")

Note: The `if`, `elif`, and `else` lines have no indentation, but the `print` lines are indented. This reflects that each condition must be checked before its `print` line runs.
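
As a further illustration of how only one branch of an `if`/`elif`/`else` chain runs, consider this sketch. The variable names and thresholds are hypothetical and chosen only for the example.

```python
score = 72   # hypothetical value for illustration

if score >= 90:
    grade = "A"
elif score >= 70:   # checked only because score >= 90 was False
    grade = "B"
else:               # reached only if every condition above was False
    grade = "C"

print(grade)  # B
```

Note that `score >= 70` would also be true for a score of 95, but the `elif` is never checked in that case because the first branch already ran.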

### Loops in Python

A loop in Python is a block of code that executes repeatedly as long as a condition is `True`. The conditions are formed similarly to the condition statements used above for `If/Else` statements.

Below we see a `while` loop and a `for` loop. The below loops have different syntax, but will produce the same result.

In the `for` loop, the loop variable, in this case `x`, begins at 0 and is increased by 1 at each iteration, because `range(5)` produces the values 0 through 4. It is common in most programming languages, including Python, for loops and indexes to begin at 0 rather than 1. This is similar to how arrays are accessed.

In [None]:
# TODO: Fill in a while loop printing the values 0 through 4
x = 0
while x < 5:        # Check the condition
    print(x)        # Execute line 1 in the loop: print value of x
    x = x + 1       # Execute line 2 in the loop: increase counter

In [None]:
# TODO: Fill in a for loop printing the values 0 through 4
for x in range(5):  # For loop conditions have a different syntax
    print(x)        # Execute line 1 in the loop: print the value of x
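
Two common variations on the `for` loop are sketched below: looping directly over the elements of an array instead of over indexes, and calling `range` with a start, stop, and step. The array contents and variable names are hypothetical.

```python
arr = [10, 20, 30]

# Loop over indexes, then look up each element
for i in range(len(arr)):
    print(i, arr[i])

# Loop over the elements directly (no index needed)
for value in arr:
    print(value)

# range(start, stop, step): start at 0, stop before 10, count by 2
evens = []
for n in range(0, 10, 2):
    evens.append(n)      # append adds an element to the end of the list
print(evens)  # [0, 2, 4, 6, 8]
```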

### Functions

A function is a section of code that will be executed each time it is called. We have been using the `print` function already to print the values of each variable.

Functions are useful because they allow us to write code once but use it many times. When the function is declared, no code is actually executed; it is just stored in memory for later use.

A function declaration always has the syntax `def function_name(parameters):`. The function name can be any valid identifier (letters, digits, and underscores, not starting with a digit), but it is best to choose a name that describes the code it executes. The parameters are a set of variables that will be used within the function.

A function call has the syntax `function_name(variables)`. This will execute the function code with the variable values passed as parameters.
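
A minimal sketch of the declaration and call syntax described above. The function `rectangle_area` and its parameters are hypothetical and exist only for this example.

```python
# Declaration: nothing is executed yet, the code is only stored
def rectangle_area(width, height):
    """Return the area of a rectangle."""
    return width * height

# Calls: the function body runs with the given values as parameters
area = rectangle_area(3, 4)
print(area)                    # 12
print(rectangle_area(2.5, 2))  # 5.0
```

The same function is written once and called twice with different values, which is the main benefit of using functions.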

In [None]:
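# Function declaration
def sum_elements(arr):
    # TODO: fill in the function body
    total = 0
    for i in arr:          # Visit each element of arr
        total = total + i  # Add the element to the running total
    z = 10                 # A second value, showing that several values can be returned
    return total, z        # Both values are returned together as a tuple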


In [None]:
# Calling the function

# Set array
x = [0,1,2,3,4, 12, 13, -6]
# Call function and unpack the two returned values
array_sum, z_value = sum_elements(x)
print(array_sum)
print(z_value)

# Python Libraries

A library is a collection of code with built-in functions that we can use. The main libraries we will be using are `NumPy`, which stores data in array (matrix) data structures and provides a series of built-in functions for scientific computing, and `Matplotlib`, which we will use to generate plots.

The first step is to import each library. If you run the code on your own machine, you may need to install the libraries first, but Colab handles installing all libraries for us.

In [None]:
# Import libraries
import numpy as np
import matplotlib.pyplot as plt

### NumPy Matrices
A `NumPy` matrix works similarly to an array and is created using similar syntax. The following cell creates a 2D matrix and prints the matrix as well as its shape and size, 2 attributes of the matrix data structure.

In [None]:
# Create a numpy matrix
x = np.array([[1,2,3],[4,5,6]])
print(x)
print(x.shape)
print(x.size)

Matrices can be accessed similarly to arrays with `[index]` syntax, and entire rows or columns of the matrix can be accessed with slices.

In [None]:
# Iterating over x yields one row at a time
for i in x:
    print(i)

In [None]:
# Access rows of x
for i in range(len(x)):
  print(x[i,:])

In [None]:
# Access columns of x
for i in range(len(x[0])):
  print(x[:,i])
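
Rows, columns, and rectangular blocks can also be pulled out directly with slices, without a loop. This is a minimal sketch using a small hypothetical matrix `m` with the same values as the matrix above.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

first_row = m[0, :]     # row 0, all columns
last_column = m[:, -1]  # all rows, last column
block = m[0:2, 1:3]     # rows 0-1, columns 1-2

print(first_row)    # [1 2 3]
print(last_column)  # [3 6]
print(block)        # [[2 3]
                    #  [5 6]]
```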

Individual elements of `x` can also be accessed one at a time with the following 2 types of syntax.

In [None]:
# Indexing
for i in x:
    for j in i:
        print(j)

In [None]:
for i in range(len(x)):
  for j in range(len(x[0])):
    print(x[i,j])

`NumPy` has a series of built-in functions for quickly accessing and modifying matrices. Matrices can be reshaped, multiplied, have their norms computed, and much more.

In [None]:
# Reshape a matrix
print(x)
y = x.reshape(6,1)
print(y)


In [None]:
# Matrix multiplication: x has shape (2, 3) and y has 3 entries, so z has 2 entries
y = np.array([1,2,3])
z = np.matmul(x,y)
print(z)
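
For matrix multiplication to work, the number of columns of the first operand must match the length of the second. The sketch below repeats the product with hypothetical names `a` and `v`, checks the resulting shape, and shows that the `@` operator gives the same result as `np.matmul`.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
v = np.array([1, 2, 3])        # shape (3,)

product = np.matmul(a, v)      # (2, 3) times (3,) gives shape (2,)
print(product)                 # [14 32]
print(product.shape)           # (2,)

same = a @ v                   # @ is the matrix multiplication operator
print(np.array_equal(product, same))  # True
```

Note that `a * v` would instead multiply element by element, which is a different operation.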

In [None]:
# Compute the norm of a matrix or vector
# Valid norms:
#   1. fro: Frobenius
#   2. nuc: nuclear

# TODO: Fill in a norm for the 'ord' argument
np.linalg.norm(x, ord='fro')
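
To see what the Frobenius norm computes, it can be reproduced by hand: it is the square root of the sum of the squared entries. The following sketch compares the manual calculation with `np.linalg.norm` on a small hypothetical matrix `m`.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

by_hand = np.sqrt(np.sum(m ** 2))         # sqrt(1 + 4 + 9 + 16 + 25 + 36) = sqrt(91)
built_in = np.linalg.norm(m, ord='fro')

print(by_hand)
print(built_in)
print(np.isclose(by_hand, built_in))      # True
```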

### Additional `NumPy` Functions
The `NumPy` library has many more functions than we can cover here, but the following cell contains a few basic functions that you may be interested in using. For a reference of all capabilities and built-in functionality of `NumPy`, please see: https://numpy.org/

In the remaining code, we may use new functions from `NumPy`, and the above is a valuable reference. Learning to quickly read, evaluate, and use new libraries based on documentation like this is an important part of programming.

In [None]:
# TODO: Create a new array
x = np.array([1,2,3,5,1,4,2]) # Create a new array

# Functions for NumPy arrays
print(np.max(x))    # Maximum value in the array
print(np.min(x))    # Minimum value in the array
print(np.argmax(x)) # Location of the maximum value in the array 
print(np.argmin(x)) # Location of the minimum value in the array
print(np.mean(x))   # Average or mean value in the array
print(np.median(x)) # Median value in the array
print(np.std(x))    # Standard deviation in the array
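
Many of these functions also accept an `axis` argument when applied to a 2D matrix, which computes the result per column (`axis=0`) or per row (`axis=1`) instead of over all entries. A minimal sketch with a hypothetical matrix `m`:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

overall_mean = np.mean(m)          # mean of all 6 entries
column_means = np.mean(m, axis=0)  # one mean per column
row_means = np.mean(m, axis=1)     # one mean per row
row_max = np.max(m, axis=1)        # largest entry in each row

print(overall_mean)   # 3.5
print(column_means)   # [2.5 3.5 4.5]
print(row_means)      # [2. 5.]
print(row_max)        # [3 6]
```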

# Matplotlib.pyplot

The following cells use `NumPy` to create arrays and apply a math function (`log`, `exp`, or `sin`) to them. The results are then plotted using the `plt.plot()` function.

In [None]:
# Special Functions in NumPy
x = np.arange(0.01,10,0.01)
y = np.log(x)
plt.plot(x,y)

# TODO: Fill in plotting arguments
plt.title("Title here")
plt.xlabel("x label")
plt.ylabel("y label")

In [None]:
x = np.arange(0,10,0.01)
y = np.exp(x)

# TODO: Fill in plotting arguments
plt.plot(x,y)
plt.title("")
plt.xlabel("")
plt.ylabel("")

In [None]:
x = np.arange(0,3*np.pi, 0.01)
y = np.sin(x)

# TODO: Fill in plotting arguments
plt.plot(x,y)
plt.title("")
plt.xlabel("")
plt.ylabel("")

In [None]:
# Generate random data
x = np.random.rand(10, 3)
print(x)

# Scatter plot
plt.scatter(x[:,0],x[:,2])

# TODO: Fill in plotting arguments
plt.title("")
plt.xlabel("")
plt.ylabel("")
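
As one possible way to fill in the plotting arguments, the sketch below draws two curves on the same axes and adds a title, axis labels, and a legend. The label text is hypothetical; choose whatever describes your own data.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 2 * np.pi, 0.01)

plt.plot(x, np.sin(x), label="sin(x)")  # label is used by the legend
plt.plot(x, np.cos(x), label="cos(x)")

plt.title("Sine and cosine")
plt.xlabel("x (radians)")
plt.ylabel("Function value")
plt.legend()                            # show which curve is which
```

In a notebook the figure is displayed automatically at the end of the cell.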