# General Test and Review Advice
* The multiple choice portion of the exam will account for 40% of your mid-term grade, while the programming portion will account for 60%.
* The multiple choice portion is entirely closed book (taken through Respondus Browser) and primarily will assess your knowledge of major Python constructs and syntax.  Sessions 3-15 (and this review) are most pertinent for this portion.
* The programming and analysis portion will require some very basic data cleaning and summarization tasks to be performed.  You will have the option of submitting this using either .ipynb (recommended) or .py format.  Sessions 15-20 are the most pertinent for this portion.

# Topic 1: Flow of Control, Conditions, and Iteration
* A crucial aspect to programming is the order in which operations are executed (flow of control).
* Like most languages, default execution of statements in Python is linear and in sequential order.
* Python control statements enable one to subvert this standard flow of control. 
    * Selection statements provide the means to execute one or more statements should a condition or conditions be met.
    * Iteration statements repeat one or more statements (potentially with modification) should a condition or conditions be met.

## Selection Statements (Session 3)
* Python provides three selection statements that execute code based on a condition (an expression that evaluates to either `True` or `False`): 
    * `if` performs an action if a condition is `True` or skips the action if the condition is `False`. 
    * `if`…`else` statement performs an action if a condition is `True` or performs a different action if the condition is `False`. 
    * `if`…`elif`…`else` statement performs one of many different actions, depending on the truth or falsity of several conditions. 
* Anywhere a single action can be placed, a group of actions can be placed (make sure you understand how indentation and suites work!). 

In [None]:
#A sample if...elif...else statement in action (see also Boolean Operators on Another Slide)
numwins = int(input('Enter the number of games your team won: '))
numloss = int(input('Enter the number of games your team lost: '))

if (numwins+numloss<1):
    print('Are you sure you played any games this season?')    
elif (numwins>numloss):
    print('Congratulations on the winning season.')
elif (numwins==numloss):
    print('50/50 is not that bad.')
else:
    print('A shame -- better luck next season.')

## Iteration Statements (Session 3)
* In Python, iteration statements perform an action so long as a condition (or conditions) holds _or_ for a finite set of repetitions connected to an **iterable** argument.
    * The former is addressed by the **while** statement.
    * The latter is addressed by the **for** statement.
* As with selection statements, a group of actions can be substituted for a single action.

## `while` Statement
* Repeats one or more actions while a condition(s) remains `True`. 
* To prevent an infinite loop, something in the `while` suite must change the loop's condition so that it eventually becomes `False`. 
* Unlike Python's `for` loop, its `while` loop behaves much like the `while` loops found in most other programming languages. 

In [None]:
#sample while loop to find the largest power of 2 less than some provided number
capnum = int(input('Input a number: I will find the largest power of 2 less than your number: '))
product = 1
while product*2<capnum:
    product = product*2
print('The largest power of two less than your number is ' + str(product) + '.')

## `for` Statement
* The `for` loop repeats an action or several actions for every item in a sequence of items.
* Any **iterable** expression may be used with a `for` loop.
* It is akin to **for each** loops in other languages (ex: Java)
* Unlike the `while` loop, we generally don't worry about infinite loops with Python's `for` statement.
* "Vanilla" `for` loop behavior can be obtained through the use of the built-in `range` function, which produces a sequence of integers on demand.

In [None]:
#Remember that the range function's end-interval argument is NOT inclusive
#Also make sure you understand how the 1, 2, and 3 argument variations of range work.
print ('The positive odd numbers less than 100 are as follows:')
for i in range(1, 101, 2):
    print(i, end = ' ')
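
The one-, two-, and three-argument forms of `range` mentioned in the comments above can be compared side by side. This is a minimal sketch; the variable names are chosen only for illustration, and `list` is used simply to make the generated values visible.

```python
# One argument: start defaults to 0, and the stop value is exclusive
one_arg = list(range(5))
print(one_arg)       # [0, 1, 2, 3, 4]

# Two arguments: explicit start, exclusive stop
two_args = list(range(2, 6))
print(two_args)      # [2, 3, 4, 5]

# Three arguments: start, exclusive stop, and step (here a negative step)
three_args = list(range(10, 0, -3))
print(three_args)    # [10, 7, 4, 1]
```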
    

## More on Iterables
* The sequence to the right of the `for` statement’s `in` keyword must be an iterable. 
* One of the most common iterables is a list, which is a comma-separated collection of items enclosed in square brackets (`[` and `]`). 
* Remember that many other structures and expressions (ex: strings) can also be used.

In [None]:
#Using a list as an iterable
total = 0 #the running total must be initialized before the loop
for number in [2, -3, 0, 17, 9]:
    total = total + number
print(total)

In [None]:
#Using a string as an iterable
for char in 'apple':
    print(char)
    

## Other Elements Related to Topic 1 (Session 4)
* You should have a basic idea of the following syntax and concepts:
    * `break` and `continue` statements 
    * `and`, `or`, and `not` operators
    *  /, //, and %
    * Operator precedence

In [None]:
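inpcount = 0
total = 0 #named total rather than sum to avoid shadowing the built-in sum function
while True:
    curinput = int(input('Enter a minimum of three positive integers to sum (enter 0 to quit): '))
    if curinput > 0:
        inpcount += 1
        total += curinput
    if (inpcount >= 3) and (curinput == 0):
        break #leave the loop only after three valid inputs and a 0
print(total)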
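
The remaining items from the list above (`continue`, the three division-related operators, and operator precedence) can be sketched as follows. The numbers are arbitrary values chosen for illustration.

```python
print(7 / 2)     # true division: 3.5
print(7 // 2)    # floor division: 3
print(7 % 2)     # remainder: 1
print(-7 // 2)   # floor division rounds toward negative infinity: -4

print(2 + 3 * 4)              # multiplication binds tighter than addition: 14
print((2 + 3) * 4)            # parentheses override precedence: 20
print(True or False and False)  # 'and' binds tighter than 'or': True

# continue skips the rest of the current iteration and moves to the next item
odd_total = 0
for n in range(1, 11):
    if n % 2 == 0:
        continue  # even numbers are skipped
    odd_total += n
print(odd_total)  # 1 + 3 + 5 + 7 + 9 = 25
```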

# Topic 2: Functions
* A function is a block of code devoted to a particular task or set of tasks.
* It may or may not have input parameters, and may or may not return a value.
* Functions can themselves be passed as arguments to other functions (Higher Order Functions)
* Many basic mathematical, programmatic, and logical operations already have an efficiently written implementation (AKA don't reinvent the wheel.)
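
As a minimal sketch of passing functions as arguments, the example below defines a hypothetical `apply_twice` helper (not part of the course materials) and also passes the built-in `len` to `sorted` through its `key` parameter.

```python
def apply_twice(func, value):
    """Call func on value, then call func again on the result."""
    return func(func(value))

def add_three(x):
    return x + 3

# The function add_three is passed as an argument, without parentheses
result = apply_twice(add_three, 10)
print(result)      # 16

# Built-in functions can be passed the same way: sort words by their length
words = ['kiwi', 'fig', 'banana']
by_length = sorted(words, key=len)
print(by_length)   # ['fig', 'kiwi', 'banana']
```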


## Function Basics (Session 5)
* Definition begins with the **`def` keyword**, followed by the function name, a set of parentheses and a colon (`:`). 
* By convention function names should begin with a lowercase letter and in multiword names underscores should separate each word. 
* Required parentheses contain the function’s **parameter list**.
* Empty parentheses mean no parameters. 
* The indented lines after the colon (`:`) are the function’s **block** -- a special kind of suite.
* Functions are invoked using their name and appropriate expressions to be passed as parameters.


In [None]:
def sayhello():
    print("Hello.  It's nice to meet you.")

for i in range(0,3):
    sayhello()

## Functions and Scope (Session 5)
* Parameters exist only during the function call. 
* They are created on each call to a function that has parameters.
* They are then destroyed when the function returns its result to the caller. 
* A function’s parameters and variables defined in its block are all **local variables**.

In [None]:
def minval(value1, value2, value3):
    """Return the minimum of three values."""
    smallest = value1 #local variable; named differently from the function to avoid confusion
    if value2<smallest:
        smallest = value2
    if value3<smallest:
        smallest = value3
    return smallest

print(minval(5, -10, 7))
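
To illustrate that parameters are local, the sketch below rebinds a parameter inside a function and shows that a global variable of the same name is left unchanged. The names `count` and `bump` are hypothetical.

```python
count = 10  # a global variable

def bump(count):
    """Return count + 1; the parameter count is local to this call."""
    count = count + 1  # rebinds only the local name
    return count

result = bump(count)
print(result)  # 11
print(count)   # 10: the global variable was not modified
```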

## Other Details Relevant to Topic 2 (Sessions 5 and 6)
* You should have a basic idea about the following:
    * Methods (functions called from an object)
    * Importing functions from modules
    * Random number generation (e.g. `randrange`)
    * matplotlib (just its general purpose)
    * default parameter values

In [None]:
def compute_area(radius=1): #default assumption is a unit radius circle
    return 3.14159*radius**2

print(compute_area(10))
print(compute_area())


In [None]:
from random import randrange
inp = 'y'
while inp == 'y':
    print(f'Rolling the Die: {randrange(1,7)}')
    inp = input('Roll again? (y/n): ')

# Topic 3: Lists, Tuples, and Sets
* Lists, tuples, and sets are among the most fundamental data structures in Python.
* All three store **collections** of elements, but each has specific properties that set it apart from the other two.
* All three structures can store _heterogeneous_ objects -- that is, a mixture of different data types and objects.
* All three allow for easy construction using one of the other two types.

## List Basics (Sessions 6-7)
* Lists are the most fundamental **mutable** collection data type in Python -- that is, elements may be added, deleted, or modified at any time.
* Comma-separated elements within square brackets `[ele1, ele2, etc.]` provide the standard means of creating a list with a given set of elements
* An empty list may be created using empty square brackets
* Any iterable expression may be used in constructing a list (including tuples or sets)
* Simple arithmetic and assignment operators can be used to combine lists (ex: `+` to concatenate)
* See also section below regarding list comprehensions

In [None]:
list1 = [1, 2, 3, 4] #a list of integers
list2 = ['alpha','beta','gamma'] #a list of strings
list3 = [] #an empty list
listcomb = list1+list2+list3 #concatenate the three lists
print(listcomb)

In [None]:
mytuple = (1, 4, 9)
mylist = list(mytuple) #construct a list using tuple's elements
print(mytuple)
print(mylist)

## List Operations (Session 6)
* While you need not know every operation or method available to a list off-hand, you should at the minimum know the following:
    * `index` returns the first indexed location of an object in a list, while `in` may be used to test if a particular element is present in the list **(remember that `in` can be used with a variety of other data structures in Python as well!)**
    * `remove` deletes the first instance of a given item in a list.
    * `append` is used to add a single element to a list, while `extend` adds an iterable expression to a list element by element.
    * `copy` is used to create a shallow copy of a list: a new list object whose elements refer to the same objects as the original list (compare this to the assignment operator, which only creates a second name for the same list!)

In [None]:
mynums = [1, 1, 2, 3, 5, 8, 1]
print(mynums.index(3)) #find position of 3 in list
mynums.remove(1) #removes first instance of 1 in list
print(mynums)
mynums.append(13)
print(mynums)
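
The difference between `append` and `extend`, and the use of `in`, can be sketched with two small hypothetical lists:

```python
appended = ['a', 'b']
appended.append(['c', 'd'])   # adds the whole list as ONE new element
print(appended)               # ['a', 'b', ['c', 'd']]

extended = ['a', 'b']
extended.extend(['c', 'd'])   # adds each item of the iterable separately
print(extended)               # ['a', 'b', 'c', 'd']

print('c' in extended)        # True
print('c' in appended)        # False: 'c' sits inside the nested list
print('pp' in 'apple')        # True: with strings, in tests for substrings
```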

In [None]:
mynumsV2 = mynums
mynumsV3 = mynums.copy()
mynums[0] = 10
mynumsV2[1] = 20
mynumsV3[2] = 30
print(mynums)
print(mynumsV2)
print(mynumsV3)
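
The word "shallow" matters once a list contains other mutable objects. The sketch below, using a hypothetical nested list, shows that the copy is a separate list at the top level while the inner lists are still shared.

```python
nested = [[1, 2], [3, 4]]
shallow = nested.copy()

shallow.append([5, 6])   # top-level change: affects only the copy
shallow[0][0] = 99       # inner list is shared: visible through both names

print(nested)    # [[99, 2], [3, 4]]
print(shallow)   # [[99, 2], [3, 4], [5, 6]]
```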

## Indexing via Slices (Session 7)
* Slices may be used to index a specific range of elements in a list (for access or assignment)
* Make sure you are comfortable with the three arguments  of slicing index syntax (**\[arg1:arg2:arg3\]**)
    * If present, the first argument indicates the initial index for the slice (default of 0)
    * If present, the second argument indicates the final index for the slice (exclusive, default is the length of the list)
    * If present, the third argument provides a step-size (default of 1)
    * Remember that negative step-sizes can be used for reverse indexing.
* Slices may be used (albeit to a limited extent) with other structures.

In [None]:
mysquares = [1, 4, 9, 16, 25, 36]
print(mysquares[3:]) #all elements at position 3+
print(mysquares[:3]) #all elements up to (but not including) position 3
print(mysquares[::-2]) #every other element in reverse order 


In [None]:
mywords = ['It', 'is', 'a', 'quiet', 'evening']
mywords[3:5] = ['bustling', 'morning'] #Using slice notation with assignment
print(mywords)

## List Comprehensions (Session 9)
* List comprehensions provide a crucial means for applying an iteration mechanism **and** conditions while constructing a list
* In constructing a new list, the simplest list comprehension format we can use is \<list_name\>=\[item `for` item in \<iterable expression\>\]
* We can include an `if` clause to include a condition that selects items from the iterable expression.
* List comprehensions with the appropriate constructors provide a convenient means of constructing _other_ collection types using comprehensions

In [None]:
mybase = [1, 2, 3, 5, 8, 13]
mytriples = [3*item for item in mybase] #list comprehension using another list
print(mytriples)

In [None]:
myevensquares = [newitem**2 for newitem in range(1,11) if newitem%2 == 0]
print(myevensquares)

In [None]:
mynumset = set([item//3 for item in range(0,20)]) #Creating a set using list comprehension
print(mynumset)


## Tuples (Session 7)
* Tuples are effectively immutable (cannot be changed after creation) lists.
* Tuples are constructed using either parentheses `()` or the built-in `tuple` constructor.
* Single element tuples include a trailing `,`
* Multiple return values from functions are often handled using tuples.

In [None]:
mytuple1 = (1, 2, 3) #tuple constructed using standard parenthetical construction
mytuple2 = tuple(['a', 'b', 'c']) #tuple constructed from list
mytuple3 = (1,) #single element tuple
print(mytuple1)
print(mytuple2)
print(mytuple3)


In [None]:
mytuplecomb = mytuple1 + mytuple2
print(mytuplecomb) #creating a new tuple through concatenation is acceptable

In [None]:
mytuple1[2] = 'z' #but modifying an existing tuple is not! (raises a TypeError)
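
A function that returns several values separated by commas is in fact returning a single tuple, which the caller may unpack. A minimal sketch, using a hypothetical `min_and_max` function:

```python
def min_and_max(values):
    """Return the smallest and largest items as a two-element tuple."""
    return min(values), max(values)

result = min_and_max([4, -1, 9, 3])
print(result)        # (-1, 9)

low, high = result   # tuple unpacking assigns each element to its own name
print(low, high)     # -1 9
```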

## Sets (Sessions 11-12)
* Sets are **unordered** collections consisting of **unique values**. 
* Sets may only contain **immutable objects**, like strings, `int`s, `float`s and tuples that contain only immutable elements. 
* Unlike lists, sets do not support indexing and slicing. 
* Set creation is similar to that of lists or tuples, but uses curly braces.
    * **An empty set must be created using `set()` notation, since {} denotes an empty dictionary.**
* You should be familiar with the following operations in reference to sets: `in`, `len`, `union`, `intersection`, `difference`, `add`, `remove`

In [None]:
workproj1 = {'Mon','Wed','Fri'} #set creation using braces
workproj2 = set(['Tue','Wed','Thu']) #set creation using constructor
busydays = workproj1.union(workproj2) #all days that appear in either set
print(busydays) #Note that the resulting set has arbitrary order of elements!
print('Wed' in busydays)
print('Sat' in busydays)


In [None]:
workproj2.remove('Thu')
workproj2.add('Fri')
print(workproj1)
print(workproj2)
proj1solo = workproj1.difference(workproj2) #all days in first set but not in second
print(proj1solo)
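
The operations not shown above (`intersection`, `len`, and empty-set creation) can be sketched as follows. The set names are hypothetical, and `sorted` is used only to print the elements in a predictable order.

```python
team_a = {'Mon', 'Wed', 'Fri'}
team_b = {'Wed', 'Fri', 'Sat'}

shared = team_a.intersection(team_b)  # days present in BOTH sets
print(sorted(shared))                 # ['Fri', 'Wed']
print(len(shared))                    # 2

unique = set([1, 1, 2, 3, 3])         # duplicates are discarded
print(len(unique))                    # 3

empty_set = set()                     # an empty set
empty_dict = {}                       # empty braces create a dictionary
print(type(empty_set).__name__, type(empty_dict).__name__)  # set dict
```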

# Topic 4: Dictionaries
* A **dictionary** is an _unordered_ collection which stores **key–value pairs** that map immutable keys to values, just as a conventional dictionary maps words to definitions. 
* While each **key** is associated with a single **value**, values may themselves be any arbitrary structure or collection. 
* Key representation has some flexibility, but each must be _immutable_ and unique.

## Dictionary Basics (Session 10)
* Create a dictionary by enclosing in curly braces, `{pair1, pair2, etc.}`, a comma-separated list of key–value pairs, each of the form _key_: _value_.  
* Dictionaries are treated as _unordered_ collections here: although Python 3.7 and later preserve insertion order, the convention is that code not be written that depends upon key order.
* Use bracket notation with a key-name to access the corresponding value or create a new pairing using the assignment operator.
* We can use dictionary method `keys` to return an iterable of all keys in a dictionary or `values` to return an iterable of all values.
* Alternatively, dictionary method `items` returns each key–value pair as a tuple.

In [None]:
SpringCourses = {} #empty dictionary
SpringCourses['Karem'] = ('CSE302','CSE310','CSE590') #Create a new key-value pair
SpringCourses['Narovsky'] = ('MAT403','MAT425') #Create another key-value pair
print(f'{"Instructor":>20}{"Course":>20}') 
for ins,courses in SpringCourses.items(): #Iterate over key-value pairs
    for course in courses: #Iterate over tuple of courses
        print(f'{ins:>20}{course:>20}')

## Views and Other Dictionary Operations (Sessions 10-11)
* Methods `items`, `keys` and `values` each return a **view** of a dictionary’s data. 
    * When you iterate over a **`view`**, it “sees” the dictionary’s **current contents**—it does **not** have its own copy of the data.
* The safest way to retrieve keys in sorted order is to run the `sorted` function on the keys, which returns a new list of the keys in ascending order (the keys must be mutually comparable, e.g. all strings or all numbers).
* You should also be familiar with `len`, `del`, `get`, and `in` in the context of dictionaries.
    * Some operations have nuances with dictionaries that are not present with simpler collections (Ex: recall that `in` only looks at _keys_ and not _values_)

In [None]:
dayweek = {'Sun':1,'Mon':2,'Tue':3, 'Wed':3,'Thu':5,'Fri':6, 'Sat':7}
print('Sun' in dayweek)
print(1 in dayweek)
print(1 in dayweek.values())



In [None]:
dwitems = dayweek.items()
dayweek['Wed'] = 4
for dname,day in dwitems:
    print(f'Day {day}: {dname}')
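
The remaining dictionary operations named above (`len`, `get`, `del`, and `sorted`) can be sketched with a small hypothetical dictionary:

```python
stock = {'pears': 4, 'apples': 10, 'figs': 0}

print(len(stock))             # 3 key-value pairs
print(stock.get('apples'))    # 10
print(stock.get('plums'))     # None: get does not raise an error for a missing key
print(stock.get('plums', 0))  # 0: an optional default may be supplied

del stock['figs']             # remove a key-value pair
print('figs' in stock)        # False

sorted_names = sorted(stock)  # iterating a dictionary yields its keys
print(sorted_names)           # ['apples', 'pears']
```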

# Topic 5: NumPy Arrays and Pandas Dataframes
* NumPy arrays provide the preferred high-performance and efficient Python array representation.
    * Array processing is much faster than list processing.
    * Arrays can have an arbitrary number of dimensions.
* Pandas `Series` and `DataFrame` structures are ideal for addressing heterogeneous data and multiple variables in data
    * The Pandas structures are built using NumPy arrays as a basis.
    * They generally include much more intuitive methods of accessing data and performing analysis.
* Note: most dataframe manipulation and analysis will be addressed in the programming portion of the test -- if you have not reviewed Session 20, it is **highly recommended** that you do so. 

## Array Basics (Session 12)

* NumPy arrays are usually generated from existing data structures using the **`array`** function, whose argument must be an `array` or other iterable.
* The result is a **new** `array` containing the argument’s elements.
* Array access is similar to that of lists (i.e. via brackets)
    * Slice notation is also available.
* The `dtype` attribute indicates the data type of the array's elements, while the `ndim` and `shape` attributes detail the number of dimensions and the shape of the array

In [None]:
import numpy as np
myarray = np.zeros((4,5)) #a 4 by 5 array of zeros
for i in range(0,myarray.shape[0]):
    for j in range(0,myarray.shape[1]):
        myarray[i, j] = i**2 + j #make array assignments based on looping values
print(myarray)        
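
The attributes and slice notation described above can be sketched on a small hypothetical two-dimensional array. The exact integer `dtype` printed depends on the platform, so no specific value is shown for it.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])  # array built from a list of lists

print(grid.dtype)     # element type (an integer type; exact name is platform dependent)
print(grid.ndim)      # 2 dimensions
print(grid.shape)     # (2, 3): 2 rows, 3 columns

print(grid[1, 2])     # single element at row 1, column 2: 6
print(grid[:, 1])     # every row, column 1: [2 5]
print(grid[0, ::-1])  # row 0 in reverse order: [3 2 1]
```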


## Other Array Operations (Session 13)
* At a minimum, you should be familiar with the following array concepts/operations:
    * Array **broadcasting**
    * How comparators (==, >, etc.) work with arrays
    * Standard NumPy calculation methods (`mean`, `max`, etc.) 


In [None]:
myarr1 = np.array([1, 4, 9 , 16, 25])
myarr2 = np.array([2, 2, 2 , 2, 2])
print(myarr1*myarr2) #multiplying  by a uniform array of equal size
print(myarr1*2) #broadcasting with a scalar has the same effect

In [None]:
print(myarr1>myarr2) #element-wise comparison between 2 arrays

In [None]:
print(myarr1.mean())
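
Broadcasting also applies between arrays of different dimensions, and the result of a comparison can be used to select elements. A minimal sketch with hypothetical arrays:

```python
import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6]])
offsets = np.array([10, 20, 30])

# The 1-D array is broadcast across each row of the 2-D array
shifted = table + offsets
print(shifted)             # [[11 22 33]
                           #  [14 25 36]]

# A comparison produces a Boolean array of the same shape
mask = table > 3
print(table[mask])         # elements where the condition holds: [4 5 6]

print(table.max())         # largest element overall: 6
print(table.mean(axis=0))  # mean of each column: [2.5 3.5 4.5]
```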

## pandas `Series` (Session 14)
* A pandas series is effectively an enhanced one-dimensional `array`
* It supports custom indexing, including even non-integer indices like strings
* It offers additional capabilities that make it more convenient for many data-science oriented tasks
    * `Series` may have missing data
    * Many `Series` operations ignore missing data by default
* The `pd.Series` constructor can be used to create a series from any iterable

In [None]:
import pandas as pd
myseries = pd.Series([10, 7.5, 5, 2.5, 0])
print(myseries)


In [None]:
print(myseries.describe())
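
Custom indices and missing data can be sketched with a small hypothetical `Series`, where `np.nan` stands in for a missing value:

```python
import numpy as np
import pandas as pd

# String labels serve as the index; Ben's grade is missing
grades = pd.Series([90.0, np.nan, 70.0], index=['Ana', 'Ben', 'Cy'])

print(grades['Ana'])        # access by label: 90.0
print(grades.isna().sum())  # number of missing values: 1
print(grades.count())       # number of non-missing values: 2
print(grades.mean())        # missing data is skipped by default: 80.0
```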

## pandas `DataFrames` (Session 15)

* `DataFrame`s provide an enhanced two-dimensional `array` structure.
* They offer additional operations and capabilities that make them fundamental for many data-science oriented tasks
* Support missing data (important for real world considerations)
* Each column in a `DataFrame` is a `Series`
* The two principal ways we have observed data-frame creation are via
    * The  `pd.DataFrame` constructor with a dictionary argument
    * Loading from a `csv` file using `pd.read_csv`

In [None]:
import pandas as pd
fiscal_dict = {'Tr Type': ['Purchase', 'Sales', 'Sales', 'Rental'], 'Item/Svc Code': ['001AB', '1CCC9', 'QX900','TM211'],
               'Credit/Debit': [-900.32, 2000.35, -540.53, -412.12], 'Units': [200, 1000, 15, 1]}
fiscal_df = pd.DataFrame(fiscal_dict) #dataframe with heterogeneous data (strings, floats, and integers)
print(fiscal_df)

In [None]:
print(fiscal_df.describe())
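
Since each column of a `DataFrame` is a `Series`, a column can be pulled out by name and summarized on its own. A minimal sketch using a small hypothetical dataframe (not the fiscal data above):

```python
import pandas as pd

sales_df = pd.DataFrame({'Item': ['A', 'B', 'C'], 'Units': [3, 0, 5]})

units = sales_df['Units']    # bracket notation with a column name
print(type(units).__name__)  # Series
print(units.sum())           # 8
print(sales_df.shape)        # (3, 2): 3 rows, 2 columns
```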