![Data Applications](https://www.durhamtech.edu/themes/custom/durhamtech/images/durham-tech-logo-web.svg) 

## Introduction
This lecture provides an overview of the technical requirements needed to complete the course, followed by a Python review.

---

# Table of Contents

### Jupyter Overview
#### <a href='#1'>Useful Links</a>
#### <a href='#2'>Introduction to Jupyter Notebooks</a>
#### <a href='#3'>Cell Types</a>
* Markdown 
* Code
    1. Running One Cell
    2. Other Run Options

#### <a href='#4'>Tips and Tricks</a>

### Python Review
#### <a href='#5'>Programming Basics</a>
#### <a href='#6'>Python as a Calculator</a>
#### <a href='#7'>Variable Assignment</a>
#### <a href='#8'>String Manipulation</a>
#### <a href='#9'>Loops</a>
* For 
* While

#### <a href='#10'>If Statements</a>
#### <a href='#11'>Try/Except Clauses</a>
#### <a href='#12'>Data Structures</a>
* Lists 
* Sets
* Dictionaries

#### <a href='#13'>Functions</a>
* Higher Order Functions

#### <a href='#14'>Random</a>
* User Input
* Importing Packages
* Return/Print

#### <a href='#15'>Weekly Readings/Videos</a>
#### <a href='#16'>Extra Practice</a>


<a id='1'></a>
# Useful Links
1. Anaconda/Jupyter Basics:
    - https://www.edureka.co/blog/python-anaconda-tutorial/
2. Tableau (resources to be added)
3. Github (Recommended that you have an account to download our lectures from):
    - https://readwrite.com/2013/09/30/understanding-github-a-journey-for-beginners-part-1/
    - https://readwrite.com/2013/10/02/github-for-beginners-part-2/
4. Learning Python (resources):
    - https://www.datacamp.com/courses/intro-to-python-for-data-science
    - [Great Python Exercises for Practice](https://www.practicepython.org)
5. Matt and Ben's Website: https://www.nolansmithsolutions.com
6. Course's Github Account: https://github.com/NolanSmithSolutions/Lectures


<a id='2'></a>
# Introduction to Jupyter Notebooks

From the [Project Jupyter Website](https://jupyter.org/):

* *__Project Jupyter__ exists to develop open-source software, open-standards, and services for interactive computing across dozens of programming languages.*

* *__The Jupyter Notebook__ is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.*

<a id='3'></a>
# Cell Types

## Markdown cells

Where you write text, equations, matrices, example code, images... virtually anything you don't need to have evaluated by the programming language.

Example with Python code:
```python
# example function
def return_middle_name(first,middle,last):
    return middle
```

An internal link to the bottom section in the notebook using HTML:

## <a href='#bottom'>Link: Take me to the bottom of the notebook</a>

___

**Find a lot of useful Markdown commands here:** https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet

___

## Code Cells
Code cells let you run Python commands interactively.

### Run a single block of code
Cell -> Run Cells

or

ctrl + enter

### Other Running Options

- Run All
- Run All Below
- Run All Above

(These options are listed under the Cell menu on the top taskbar.)

In [None]:
print('hello world!')
print("It's so good to see you")

# Comment out code in a cell by putting a hash sign (#) at the beginning of the line

In [None]:
# Lines are evaluated sequentially; a cell displays the value of its last line only, unless a print statement is used

# Not printed out at the bottom - not the last line
9+1

# Printed out at the bottom - below is the last uncommented line
5+3

In [None]:
# Because each line is evaluated sequentially, an error in a previous line that is run will
# error out the cell and prevent correct lines after it from running (unless advanced coding techniques
# such as a try/except are used)

print('Line 1 works as this is correct code')
print('This line will not print as there is an error in it', a)
print('If the line above this did not have an error, it would have printed')

In [None]:
# You can still manually run code after a cell block that has an error above it
print('This cell can possibly run independently of the ones above if you manually run it after an error above')

In [None]:
# Watch out for code that gets stuck in an infinite loop - you will have to manually stop programs with
# the stop button above on the taskbar that has the square icon on it
while True:
    continue

In [None]:
# Cells can be run independently, but variables persist in memory outside the cell that defined them

tmp_str = 'This sentence is stored in memory'
print(tmp_str)

In [None]:
# The same variable "tmp_str" exists outside of the cell it was defined in
print(tmp_str)

<a id='4'></a>
# Tips and Tricks

In [181]:
# Basic Unix commands
# http://mally.stanford.edu/~sr/computing/basic-unix.html

print('This tells you what directory you currently are in')
%pwd

This tells you what directory you currently are in


'/Users/Matthew/Documents/Consulting/Lectures/Intro'

In [182]:
print('\nThis lists the files in the current directory you are in')
%ls

print('\nOther useful commands are "cd", "mkdir", and "mv"')


This lists the files in the current directory you are in
Introduction.ipynb                 anaconda_install_instructions.pdf

Other useful commands are "cd", "mkdir", and "mv"


In [176]:
print('Timing just one operation')
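# Time this operation
# Note: %time only times code written on the same line, so as laid out here
# it times an empty statement rather than the list comprehension on the next line
%time 
[x for x in range(300000)];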

print('\nTiming several operations')
# Time several runs of the same operation
%timeit [x for x in range(300000)];

Timing just one operation
CPU times: user 8 µs, sys: 2 µs, total: 10 µs
Wall time: 36 µs

Timing several operations
24.5 ms ± 2.57 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [None]:
def our_function(x):
    """This is a multi-line docstring for a function.  It is often used to describe the function you have just written,
    e.g. this function prints out the line number and a phrase x times"""
    for i in range(int(x)):
        print(str(i+1)+' I love python')

our_function(6)

## Shortcuts
1. Enter selection mode / Cell mode (Esc / Return)
2. Insert cells (press A to insert above or B to insert below in selection mode)
3. Delete / Cut cells (press X in selection mode)
4. Mark several cells (Shift + Up/Down in selection mode)
5. Merge cells (Select, then Shift+M)
6. The folder .ipynb_checkpoints (where Jupyter stores its checkpoint saves)

## Printing to PDF
File -> Print Preview. Then print that page as a PDF (Ctrl + P).

## -------------PRACTICE-------------
1. Edit the code below to see how long it takes to run 50 iterations of "our_function"

In [None]:
our_function(100)

2. Create a folder named "Test" in your current directory. Then navigate to that folder in Jupyter Notebook and also open it on your computer.

3. Delete the blank cell below this cell.  Then recreate it by inserting a new cell, changing it to a markdown cell, and typing in a header that says "Hello".

4. Print this page to PDF and open up the lecture there.

# Review of Python Topics

### Why Python?
Python has experienced incredible growth over the last couple of years, and many of the state-of-the-art machine learning libraries being developed today support Python (scikit-learn, TensorFlow, etc.).  Additionally, Python has many uses outside of the data science world that specialized languages such as R don't have quite as readily available.

![](https://149351115.v2.pressablecdn.com/wp-content/uploads/2017/09/growth_major_languages-1-1024x878.png)

Source: https://stackoverflow.blog/2017/09/06/incredible-growth-python/

### Check what Python distribution you are running

In [None]:
!which python

In [None]:
# Check that it is Python 3
import sys # import built in package
print(sys.version)

<a id='5'></a>
# Programming Basics

Programs are simply a sequence of instructions that can be executed by a computer to perform a specific task by manipulating values.  Values come in a few different types:

1. Strings: "Hello", "I love coding"
2. Floats: 5.7, 3.14, -2.99996
3. Integers: -8, 12, 100016
4. Booleans: True, False

In [None]:
# Print the type of different values

print(2, type(2))

print(2.1, type(2.1))

print("2", type("2"))

print("too", type("too"))

print(True, type(True))

In [None]:
# While 8 and 8.0 are equal in value, Python stores them as different types (int and float)
type(8) == type(8.0)

In [None]:
#Another way to check if a value is a certain type
print (isinstance(8,int))
print (isinstance(23.7,int))

<a id='6'></a>
# Python as a Calculator
1. (+) operator (addition)
2. (-) operator (subtraction)
3. (*) operator (multiplication)
4. (/) operator (floating point division)
 - divides the first number by the second, evaluating to a number with a decimal point even if the numbers divide evenly
 
5. (//) operator (floor division) 
 - divides the first number by the second and then rounds down, evaluating to an integer when both numbers are integers.
6. (%) operator (modulo) 
 - evaluates to the remainder left over from division; the result takes the sign of the second number, so it is never negative when dividing by a positive number.

Parentheses may be used to group subexpressions together; the entire expression is evaluated in PEMDAS (Parentheses, Exponentiation, Multiplication / Division, Addition / Subtraction) order.
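
A short sketch of how precedence and the division-related operators behave when run in a code cell. The variable names are illustrative only and are not used elsewhere in the lecture.

```python
# Multiplication is applied before addition unless parentheses say otherwise
without_parens = 2 + 3 * 4      # 3 * 4 is evaluated first, then 2 is added
with_parens = (2 + 3) * 4       # the parentheses are evaluated first

# Exponentiation binds tighter than multiplication
power_first = 2 * 3 ** 2        # 3 ** 2 is evaluated first, then multiplied by 2

# Floor division and modulo fit together: a == (a // b) * b + (a % b)
quotient = 17 // 5
remainder = 17 % 5

# With a negative first number, // rounds down (toward negative infinity)
# and % takes the sign of the second number
neg_quotient = -7 // 2
neg_remainder = -7 % 2

print(without_parens, with_parens, power_first)
print(quotient, remainder, neg_quotient, neg_remainder)
```

Running the cell and comparing the printed values against a hand calculation is a quick way to check your understanding of the ordering.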

In [None]:
# Addition
5.8 + 12

In [None]:
# Subtraction
12 - 8

In [None]:
# Multiplication
7*77

In [None]:
# Floor division
8//3

In [None]:
# Floating point division, note the difference
8/3

In [None]:
# Rounding, again note the difference
round(8/3)

<a id='7'></a>
# Variable Assignment

An assignment statement consists of a name (often a variable) and an expression (often an equation or function). It changes the state of the variable/name by evaluating the expression to the right of the = sign and binding its value to the name on the left.

This is not to be confused with the double equals sign (==) which checks for equality between two variables/values.
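
As a minimal sketch of the difference between `=` and `==` (the names `total`, `is_ten`, and `is_five` are made up for this illustration):

```python
total = 10             # single = binds the value 10 to the name total
is_ten = total == 10   # double == compares two values and evaluates to a Boolean
is_five = total == 5

print(total, is_ten, is_five)
```

Note that the comparison never changes `total`; only the assignment on the first line does.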

In [None]:
# The variable "a" is assigned the value 4 and "b" is assigned the value 2
a = 4; b = 2
print(b**a) # ** is exponentiation

In [None]:
# Note the use of parentheses!
print(a%(b+1))  # modulus operator = remainder

In [154]:
# Boolean checks
# Refer to this document for more https://www.lotame.com/what-is-boolean-logic/
a = True; b = False; c = True
print(a and b)
print(a or b)
print(a != b) # not equal to

False
True
True


In [186]:
# Changing variables
x = 5
print(x)

# Add to x
x = x +1
print(x)

# Another way
x += 1
print(x)

# Multiply x
x = x * 3
print(x)

# Another way
x *= 8
print(x)

5
6
7
21
168


In [None]:
# Conditional programming
if 5 == 5:
    print('correct')
else:
    print('incorrect...')

<a id='8'></a>
# String Manipulation

In [184]:
# Strings and slicing with the alphabet
alphabet = "abcdefghijklmnopqrstuvwxyz"

In [None]:
print(alphabet[1]) # strings are zero indexed, so index 1 is the second letter
print(alphabet[0]) # i.e. index 0 is the first letter in the string

In [None]:
print (type(alphabet))

In [None]:
print (len(alphabet))

In [None]:
print(alphabet)

In [None]:
print (alphabet[1:4:1]) # start:stop:step, start at index 1, stop before index 4, move 1 position at a time

In [None]:
print (alphabet[::3]) # every 3 letters

In [None]:
print (alphabet[::-1]) # the step is -1, indicating that it traverses the alphabet in opposite order

In [188]:
print (alphabet[len(alphabet)-1:1:-4])

zvrnjf


In [None]:
# Triple quotes are useful for multiple line strings
y = '''Sally sells 
Seashells by 
The sea shore.'''
print (y)

In [None]:
# tokenize by space
words = y.split(' ')
print (words)

In [None]:
# remove break line character
[w.replace('\n','') for w in words]

<a id='9'></a>
# Loops
A loop statement allows users to execute a statement or group of statements many times.  An example can be seen in this for loop logic diagram below.

## Loop Types

#### For 
Executes a sequence of statements multiple times and abbreviates the code that manages the loop variable.

#### While 
Repeats a statement or group of statements while a given condition is TRUE. It tests the condition before executing the loop body.

## Loop Controls

#### Continue
Causes the loop to skip the remainder of its body and immediately retest its condition prior to reiterating.

#### Pass
The pass statement in Python is used when a statement is required syntactically but you do not want any command or code to execute.

#### Break
Terminates the loop statement and transfers execution to the statement immediately following the loop.

source: https://www.tutorialspoint.com/python/python_loops.htm
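
The three loop controls described above are easy to mix up, so here is a small side-by-side sketch. The list names are hypothetical and exist only to record which numbers each loop keeps.

```python
numbers = [1, 2, 3, 4, 5]

# pass does nothing, so the line after the if block still runs for 3
with_pass = []
for n in numbers:
    if n == 3:
        pass
    with_pass.append(n)

# continue skips the rest of the body for this iteration only
with_continue = []
for n in numbers:
    if n == 3:
        continue
    with_continue.append(n)

# break leaves the loop entirely
with_break = []
for n in numbers:
    if n == 3:
        break
    with_break.append(n)

print(with_pass)
print(with_continue)
print(with_break)
```

The only difference between the three loops is the single keyword inside the if block, which makes the effect of each one easy to compare.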

In [164]:
# For loop, this goes through the animals list and says hello to each animal until it reaches 'Cats'
animals = ['Lemurs', 'Elephants', 'Dogs', 'Cats', 'Zebras']
for animal in animals:
    if animal == 'Cats':
        #end loop if Cats comes up
        break
    print('hello', animal[:-1])

print('\n') # blank line prints out

# While loop
count = 2
while count < 100:
    if count >= 30 and count <= 50:
        count = count + 10
        # Doesn't print out counts between 30 and 50
        continue
    print('I am', count, 'years old')
    count = count + 10

print('\n')

# Loop with enumerated index and items
# This assigns an index number to the list while still allowing you to traverse through the elements
for index,animal in enumerate(animals):
    print ("animal ", index," is ", animal[:-1] )

hello Lemur
hello Elephant
hello Dog


I am 2 years old
I am 12 years old
I am 22 years old
I am 52 years old
I am 62 years old
I am 72 years old
I am 82 years old
I am 92 years old


animal  0  is  Lemur
animal  1  is  Elephant
animal  2  is  Dog
animal  3  is  Cat
animal  4  is  Zebra


<a id='10'></a>
# If Statements

This statement evaluates whether the conditional expression is true or false.  When the condition is true, the statement executes; otherwise an "else" statement can provide an alternative statement for Python to execute.  "elif" (else if) adds a further condition to check within the same if statement.

In [157]:
age = 20

if age < 15:
    print('can not drive')
elif age == 15:
    print('has a temp license')
elif age >= 16:
    #nested if statement
    if age > 21:
        print('most likely has a drivers license')
    else:
        print('potentially has a drivers license')
else:
    print('there is an error with the age') #wont ever be evaluated since all ages are covered with our if/elif statements

potentially has a drivers license


<a id='11'></a>
# Try and Except Clauses

Python lets you handle errors without stopping your program by using try and except clauses, which are structured as follows:

```python
try:
    <try suite>
except <exception class> as <name>: 
    <except suite>
```
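
As a hedged illustration of the `as <name>` part of this structure, the sketch below catches one specific exception class and uses the bound name to show Python's own error message. The function `safe_divide` is a hypothetical helper written for this example.

```python
def safe_divide(numerator, denominator):
    """Hypothetical helper: returns the quotient, or None if the division fails."""
    try:
        return numerator / denominator
    except ZeroDivisionError as err:
        # err is the exception object; printing it shows Python's error message
        print('Could not divide:', err)
        return None

print(safe_divide(10, 4))
print(safe_divide(10, 0))
```

Naming the exception class keeps unrelated errors (for example, a misspelled variable) from being silently swallowed.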

In [167]:
people = ['Beatrice','Adam','Mary','Matt','Jane','Jordan','David','Hannah','Laura']

#This except clause avoids an error if the person does not have at least 6 letters
for person in people:
    try:
        print(person[5],'is the 6th letter in', person+"'s name")
    except IndexError:  # raised when the name has fewer than 6 letters
        print(person, "does not have at least 6 letters in their name")
        
print('\nNotice how this block of code errors out')
for person in people:
    print(person[5],'is the 6th letter in', person+"'s name")

i is the 6th letter in Beatrice's name
Adam does not have at least 6 letters in their name
Mary does not have at least 6 letters in their name
Matt does not have at least 6 letters in their name
Jane does not have at least 6 letters in their name
n is the 6th letter in Jordan's name
David does not have at least 6 letters in their name
h is the 6th letter in Hannah's name
Laura does not have at least 6 letters in their name

Notice how this block of code errors out
i is the 6th letter in Beatrice's name


IndexError: string index out of range

## -------------PRACTICE-------------

1. Multiply 127 * 9853 then divide that by 311.  What is the remainder?

2. Assign the variable "test_var1" to the value 2072 divided by 96.  Print whether this variable is equal to 6.  Then print what the variable test_var1 is equal to.

3. Using the "alphabet" variable, print the last 4 letters.

4. Edit the code below to use a pass statement to skip printing 32 and 42.  This one may be a bit challenging.

In [190]:
count = 2
while count < 100:
    if count >= 30 and count <= 50:
        count = count + 10
        continue
    print('I am', count, 'years old')
    count = count + 10

I am 2 years old
I am 12 years old
I am 22 years old
I am 52 years old
I am 62 years old
I am 72 years old
I am 82 years old
I am 92 years old


5. For the items in the list "groceries" below, only print the items that have the second letter as a vowel.  You may want to use the "in" keyword and the list of vowels provided below.

In [None]:
vowels = ['a','e','i','o','u']
groceries = ['apple','bananas', 'grapes', 'bread', 'milk', 'poptarts','coffee','cabbage','lettuce','cereal','spinach']


<a id='12'></a>
# Data Structures

## Lists 
Sequence of Python objects

In [125]:
#Create blank list
list1 = list()
print(type(list1))

# Empty list type
print(type([]))

<class 'list'>
<class 'list'>


In [126]:
# Append to list
list1.append('hello')
list1.append('world')
print(list1)

#remove item from list
list1.pop(1)
print(list1)

# Merge list
list1 = list1 + ['people']
print(list1)

# Multiply the list
list1 = list1 * 3
print(list1)

['hello', 'world']
['hello']
['hello', 'people']
['hello', 'people', 'hello', 'people', 'hello', 'people']


In [None]:
# list of numbers
even_nbrs = list(range(0,20,2)) # range has lazy evaluation
print (even_nbrs)

In [None]:
# supports objects of different data types
z = [1,4,'c',4, 2, 6]
print (z)

# list length (number of elements)
print(len(z))

# it's easy to know if an element is in a list
print ('c' in z, 'd' in z)

print (z[2])  # print element in position #2

In [None]:
# lists can be sorted, 
# but not when they mix data types that cannot be compared (this call raises a TypeError)
z.sort()

In [None]:
#Get rid of different data type
z.pop(2)
z.sort()
z

In [None]:
print(z.count(4))  # how many times is there a 4

In [None]:
# print all even numbers up to an integer
for i in range(0,10,2):
    print (i)

In [None]:
# list comprehension is like f(x) for x as an element of Set X

# S = {x² : x in {0 ... 9}}
S = [x**2 for x in range(10)]
print (S)

# All even elements from S

# M = {x | x in S and x even}
M = [x for x in S if x % 2 == 0]
print (M)

## Sets 
Collection of unique elements

In [128]:
# a set is not ordered
a = set([1, 2, 3, 3, 3, 4, 5,'a']); print (a)

{1, 2, 3, 4, 5, 'a'}


In [129]:
b = set('abaacdef'); print (b) # not ordered

{'d', 'a', 'c', 'e', 'b', 'f'}


In [130]:
print (a|b) # a or b

{1, 2, 3, 4, 5, 'd', 'a', 'c', 'e', 'b', 'f'}


In [131]:
print(a&b) # a and b

{'a'}


In [133]:
c = set([1,3,5,'c', 'e']) 
print ((a|b) - c) # (a or b) but not in c

{2, 4, 'd', 'a', 'f', 'b'}


In [134]:
a.remove(5); print (a) # removes the '5'

{1, 2, 3, 4, 'a'}


## Dictionaries: 
Key, Value pairs

In [169]:
# Dictionaries can be created in several ways
# Create a dictionary through assignment
b1 = {'Lemonade': 70, 'Automobile': 38, 'Tourism':94}              
print ("Use key 'Lemonade' to access value: ", b1['Lemonade'])

Use key 'Lemonade' to access value:  70


In [172]:
# Another way, start with empty dictionary
dictionary = {}
dictionary['Lemonade'] = 70
dictionary['Automobile'] = 38
print (dictionary['Lemonade'])

70


In [173]:
# traversing by key
# keys must be immutable; a key can be a number or a string
for k in b1.keys():
    print (k)

Lemonade
Automobile
Tourism


In [174]:
# traversing by values
for v in dictionary.values(): print(v)

70
38


In [175]:
# traversing by key and value together uses items()
for k, v in b1.items():                # tuples with keys and values
    print (k,v)

Lemonade 70
Automobile 38
Tourism 94


## -------------PRACTICE-------------

1. For the items in the list "groceries" print out the even grocery items.  Then remove those items.  After that, add the item "pizza" to that list.

2. Print the 5th item of the list "groceries" and the list's length.

3. Create a dictionary with your top 5 movies as the keys and your rating out of 100 for them.  Then print out the movie titles one by one.

<a id='13'></a>
# Functions
Functions allow you to avoid repeating code by putting it into a format that you can reuse throughout your program.  A function call has the form:

operator(operand1 , operand2 , ... , operandN)

Python evaluates a function call as follows:
1. Evaluate the operator (the function being called)
2. Evaluate the operands (the arguments)
3. Apply the operator to the [evaluated] operands
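
The evaluation steps above can be sketched with a small example. The function `add` is hypothetical and defined only for this illustration.

```python
def add(x, y):
    """Hypothetical two-argument function used only for this illustration."""
    return x + y

# The operands 2 * 3 and max(1, 4) are evaluated first (to 6 and 4),
# then add is applied to those two values
result = add(2 * 3, max(1, 4))
print(result)

# Calls can be nested: each inner call is an operand of the outer call
nested = add(add(1, 2), add(3, 4))
print(nested)
```

In both calls Python finishes evaluating every operand before it runs the body of the outer `add`.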


In [141]:
# Say we want to multiply 8*7 and 5*3 and 6 * 12 but don't want to repeat the code to do so every time...
# We can then create a multiplier function that gives us a template to do so

def multiplier(a,b):
    """Multiplies a function's arguments against each other 
    and prints out the multiplication inputs and result"""
    print('I am multiplying',a,'times',b)
    answer = a*b
    return answer

print(multiplier(8,7))

print(multiplier(6,12))

I am multiplying 8 times 7
56
I am multiplying 6 times 12
72


In [None]:
# As always, functions exist within the scope of the whole notebook, not just the cell they are defined in
multiplier(5,3)

## Higher Order Functions

Functions can also take other functions as arguments.

In [152]:
def happy_greeter(gender,time):
    """Gives a happy greeting to a string inputted gender ("M","F",string of user input) and time of day string input"""
    greet="Good "+str(time)+", "
    if gender=="F":
        greet+="Ma'am"
    elif gender=="M":
        greet+="Sir"
    else:
        greet+=gender
    greet+="! "
    return greet

def mad_greeter(gender,time):
    """Gives a mad greeting to a string inputted gender ("M","F",string of user input) and time of day string input"""
    greet="Horrible "+str(time)+", "
    if gender=="F":
        greet+="Ma'am"
    elif gender=="M":
        greet+="Sir"
    else:
        greet+=gender
    greet+="! "
    return greet

def welcomer(greeter, gender, time, times):
    """Takes a greeter function, the gender and time strings to pass to it, and an integer (times),
    and repeats the greeting that many times"""
    return int(times)*greeter(gender, time)

#Notice the first argument of the welcomer function is the mad_greeter function itself (no parentheses);
#welcomer then calls it with the arguments "M" and "Morning"
welcomer(mad_greeter, "M", "Morning", 5)

'Horrible Morning, Sir! Horrible Morning, Sir! Horrible Morning, Sir! Horrible Morning, Sir! Horrible Morning, Sir! '

<a id='14'></a>
# Random
## User input

In [None]:
print ("What is your name?")
string = input()  # returns a string
print ("Your name is", string)

## Import packages

In [None]:
import math as ma

#Using the imported package using what we renamed it to
#Dot notation
ma.sqrt(100)

## Return vs. Print

Most functions that you define will contain a return statement. The return statement will give the result of some computation back to the caller of the function and exit the function. For example, a function named square could take in a number x and return its square.

When Python executes a return statement, the function terminates immediately. If Python reaches the end of the function body without executing a return statement, it will automatically return None.

In contrast, the print function is used to display values in the Terminal. This can lead to some confusion between print and return because calling a function in the Python interpreter will print out the function's return value.

However, unlike a return statement, when Python evaluates a print expression, the function does not terminate immediately.

source: https://cs61a.org/lab/lab01/
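
A minimal sketch of the difference, assuming two illustrative functions named `square` and `print_square` (neither is part of any library):

```python
def square(x):
    """Returns the square of x to the caller."""
    return x * x

def print_square(x):
    """Prints the square of x but has no return statement, so it returns None."""
    print(x * x)

returned = square(4)
printed = print_square(4)   # displays the square, but the value handed back is None

print(returned)
print(printed)
```

Only the value from `square` can be stored and reused in later calculations; the result of `print_square` is shown on screen and then lost.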

In [None]:
import random

def infuriating_function(a):
    """This function prints a different value than it returns"""
    print("This is what I am printing", a)
    b = random.random()
    return b

stored_variable = infuriating_function(10)
print("This is what I am returning", stored_variable)

## -------------PRACTICE-------------

1. Create a function that asks for a person's age and stores it as an integer called person_age.

2. Create a function called mult2 that takes three inputs: a number, a multiplier to multiply the number by, and an adder to add to the result of that multiplication.

<a id='15'></a>
# Weekly Readings/Videos

https://blog.trinket.io/why-python/
    
https://towardsdatascience.com/top-16-python-applications-in-real-world-a0404111ac23

<a id='16'></a>
# Extra Practice

<div id='bottom'></div>