# Module 3: Python Syntax, Data Types & Control Flow

## Python Syntax

- Syntax refers to the structure of the language (i.e., what constitutes a correctly-formed program). 


- If your Python code contains a syntax error, the interpreter will report a __SyntaxError__ and the code will not run!!


A simple example from "__A Whirlwind Tour of Python__":

In [1]:
''' This triple-quoted string acts as a multiline comment. Python has no
dedicated multiline comment syntax, so an unassigned string is often used instead'''

# set the midpoint
midpoint = 5

# make two empty lists
lower = []; upper = []

# split the numbers into lower and upper
for i in range(10):
    if (i < midpoint):
        lower.append(i)
    else:
        upper.append(i)
        
print("lower:", lower)
print("upper:", upper)

lower: [0, 1, 2, 3, 4]
upper: [5, 6, 7, 8, 9]


### Note the following:

- Comments in Python are most frequently demarcated by the pound sign ('#') character. Everything following a pound sign in a line of Python code is ignored by the Python interpreter.


- Python has no dedicated multiline comment syntax. A common substitute is a string enclosed in triple quotes (single or double) that is not assigned to any variable; the interpreter evaluates the string and discards it.


- Multiple Python statements can be placed within a single line of code. To do this, separate the statements using a semicolon (';'). However, for clarity/readability, the practice of placing multiple statements on a single line is generally discouraged.


- Indentation (rather than special characters or brackets) is used to delimit subsections (aka __blocks__) of Python code. In Python, each line of code that __precedes__ a subsection/block must end with a colon (':')


- All indented code within a subsection/block __MUST__ be preceded by the exact same number of blank spaces. If you fail to adhere to this rule, your code will not execute properly!!


- Parentheses are used for both grouping statements/equations and for invoking Python functions.


- Python __methods__ are built-in functions that are inherently associated with a Python object, e.g., __lower.append()__, __upper.append()__.


- Methods are accessed via the syntax 'object.method()'


- Methods can both provide information about an object AND enable the modification of the object "in place"
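
As a minimal sketch of the last point, the example below contrasts a method that only reports information (count) with methods that modify a list in place (append, sort). The variable names are illustrative and are not part of the earlier example.

```python
# a small list to experiment with (illustrative name)
nums = [3, 1, 2, 3]

# count() reports information and leaves the list unchanged
threes = nums.count(3)

# append() and sort() modify the list in place and return None
nums.append(0)
result = nums.sort()

print(threes)   # 2
print(nums)     # [0, 1, 2, 3, 3]
print(result)   # None
```

Because in-place methods return None, writing something like `nums = nums.sort()` would discard the list.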

In [2]:
# use the dir() function to get a list of attributes and methods for an object

f1 = 4.798
dir(f1)

['__abs__',
 '__add__',
 '__bool__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__divmod__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getformat__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__le__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rmod__',
 '__rmul__',
 '__round__',
 '__rpow__',
 '__rsub__',
 '__rtruediv__',
 '__set_format__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__trunc__',
 'as_integer_ratio',
 'conjugate',
 'fromhex',
 'hex',
 'imag',
 'is_integer',
 'real']

In [3]:
# the list of methods will vary depending on the data type of the object
dir(lower)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

## Introspection

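- In Jupyter/IPython, use a question mark ('?') either before or after a variable to get basic info on its type, structure and content. This is a feature of the notebook environment rather than of the Python language itself.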

In [2]:
# example of introspection using variable from earlier example
lower?
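
Outside of Jupyter/IPython the '?' syntax is not available. A rough equivalent can be sketched with built-in functions; the list is re-created here so that the snippet is self-contained, and the variable name summary is illustrative.

```python
# re-create the list from the earlier example
lower = [0, 1, 2, 3, 4]

# type() reports the class of the object
print(type(lower))

# len() reports how many elements it holds
print(len(lower))

# the docstring of the object's class gives a short description
summary = type(lower).__doc__
print(summary.splitlines()[0])
```

The built-in help() function prints a fuller description of any object or class.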

## Binary Math and Comparison Operators

__Math__: +, -, \*, /, // (floor division, i.e., divide and drop the remainder), \** (exponent), % (modulo, i.e., returns the remainder of division)

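__Comparisons__: <, >, <=, >=, == (equals), != (does not equal)

__Logical__: and, or, not (combine the True/False results of comparisons)

__Bitwise__: & (AND), | (OR), ^ (exclusive OR). These operate on the binary digits of integers. Applied to True/False values they behave like logical operators, but they bind more tightly than comparisons, so use parentheses when mixing them with comparisons.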

__Particular to Python__: is, is not (when testing whether two Python variables reference the exact same Python object)
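
A short sketch of how 'is' differs from '==', using illustrative variable names: two lists can hold equal contents while still being separate objects.

```python
# two separate list objects with equal contents
a = [1, 2, 3]
b = [1, 2, 3]

# a second name for the same object as a
c = a

print(a == b)      # True: contents are equal
print(a is b)      # False: two distinct objects
print(a is c)      # True: both names reference one object
print(a is not b)  # True
```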

### Some Simple Math Examples:

In [5]:
5 + 2

7

In [6]:
8 - 17

-9

In [28]:
# multiplication
3 * 9

27

In [29]:
# division
33/11

3.0

In [7]:
# "floor", i.e., divide and drop the remainder
8//5

1

In [8]:
# exponent
2**5

32

In [56]:
# modulo
9 % 7

2

### Some Simple Comparison Examples

In [32]:
5 > 7

False

In [33]:
5 < 7

True

In [34]:
# is 5 greater than or equal to (2 + 3)?
5 >= 2 + 3

True

In [35]:
# is 5 less than or equal to (7-4)?
5 <= 7 - 4

False

In [36]:
# are both 7 and 5 greater than 4?
7 > 4 and 5 > 4

True

In [37]:
# are both 7 and 5 greater than 6?
7 > 6 and 5 > 6

False

In [3]:
# '&' is the bitwise AND operator: binary 101 & 100 = 100, i.e., 4
5 & 4

4

In [38]:
# is either 7 or 5 greater than 6?
7 > 6 or 5 > 6

True

In [42]:
# exclusive OR: True when exactly one of the two arguments is True
True ^ False

True

In [5]:
# exclusive OR: True when exactly one of the two arguments is True
True ^ True

False

In [43]:
# exclusive OR: True when exactly one of the two arguments is True
False ^ False

False

In [44]:
# is 5 equal to (2 + 3)?
5 == (2 + 3)

True

In [1]:
# is 5 NOT equal to (2 + 3)?
5 != (2 + 3)

False
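
In Python, '&', '|' and '^' are bitwise operators that bind more tightly than comparisons, while the keywords 'and' and 'or' combine truth values. The sketch below, with illustrative variable names, shows why the two families should not be confused.

```python
# bitwise AND compares binary digits: 111 & 101 = 101, i.e., 5
bitwise = 7 & 5
print(bitwise)          # 5

# '&' binds more tightly than '>', so this is evaluated as (7 & 5) > 4
mixed = 7 & 5 > 4
print(mixed)            # True

# the keyword 'and' requires both comparisons to be True
both = 7 > 6 and 5 > 6
print(both)             # False

# the keyword 'or' requires at least one comparison to be True
either = 7 > 6 or 5 > 6
print(either)           # True
```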

## Basic Data Types

- integers (int)


- floating point (float)


- strings


- dates & times


Data types are generally defined implicitly when you assign a value to a variable.

In [9]:
# define an integer
x = 516
type(x)

int

In [1]:
# define a float
y = 516.7
type(y)

float

In [2]:
# define a string
z = "this is a simple string"
type(z)

str

In [52]:
# Adding an int to a float yields a float
newvar = x + y
type(newvar)

float

In [3]:
# Strings can be concatenated using the '+' operator
newstring = z + "- and this is an extension to the string"
print(newstring)

this is a simple string- and this is an extension to the string


In [57]:
# Not possible to add an int or float to a string
z + y

TypeError: can only concatenate str (not "float") to str
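
One way around this error, sketched below, is to convert the number to a string before combining the two. The variables are re-created so that the snippet runs on its own; the names combined and formatted are illustrative.

```python
z = "this is a simple string"
y = 516.7

# convert the float to a string explicitly before concatenating
combined = z + " " + str(y)
print(combined)     # this is a simple string 516.7

# an f-string performs the conversion inside the string literal
formatted = f"{z} {y}"
print(formatted)    # this is a simple string 516.7
```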

In [3]:
# dates + time manipulation is made easy via the pre-built datetime module
from datetime import datetime, date, time
dt = datetime(2011, 10, 29, 20, 30, 21)
print(dt)
print(dt.day)
dt.minute

2011-10-29 20:30:21
29


30
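
As a brief sketch of the kind of manipulation the datetime module supports, the example below shifts the same datetime forward and measures the difference. The offsets and the names later and elapsed are illustrative choices.

```python
from datetime import datetime, timedelta

dt = datetime(2011, 10, 29, 20, 30, 21)

# adding a timedelta shifts the datetime forward
later = dt + timedelta(days=3, hours=4)
print(later)                    # 2011-11-02 00:30:21

# subtracting two datetimes yields a timedelta
elapsed = later - dt
print(elapsed)                  # 3 days, 4:00:00

# strftime() formats a datetime as a string
date_text = dt.strftime("%Y-%m-%d")
print(date_text)                # 2011-10-29
```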

## Lists

- A list is a Python __data structure__.


- Unlike a simple data type, a data structure is a __collection__ of data values, the elements of which may or may not share the same data type.

In [8]:
# how to create an empty list
emptylist = []

# example of a list containing data
list1 = [1,2,3,4,5]
list2 = [6,7,8,9,10]

# concatenation of two lists via the '+' operator
list3 = list1 + list2

print(list3)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [10]:
# example of a list containing mixed data
mixedlist = [1, 'far', 3.14159, 'pi']

print(type(mixedlist[2]))
print(type(mixedlist[3]))

<class 'float'>
<class 'str'>


## Indexing: How to access data in a list

Elements in a list can be accessed via their corresponding index value.

Index values start at 0 (zero) and end at (length of list - 1). This is known as "__zero-based counting__" and is a remnant of the early days of computer programming.

So the index values of the __mixedlist__ list shown above are 0, 1, 2, 3
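
The relationship between the length of a list and its valid index values can be checked directly. This is a minimal sketch that re-creates the list so it runs on its own; the names last_index and idx are illustrative.

```python
mixedlist = [1, 'far', 3.14159, 'pi']

# the last valid index is always len(list) - 1
last_index = len(mixedlist) - 1
print(last_index)             # 3

# range(len(...)) generates every valid index: 0, 1, 2, 3
for idx in range(len(mixedlist)):
    print(idx, mixedlist[idx])
```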

In [11]:
# example of list indices  

print(mixedlist[0])
print(type(mixedlist[0]))
      
print(mixedlist[3])
print(type(mixedlist[3]))

1
<class 'int'>
pi
<class 'str'>


## What happens if we try to use an invalid index?

In [6]:
# invalid index
mixedlist[4]

IndexError: list index out of range

## Slicing: Extracting a portion of a list


In [20]:
slist = [1,2,3,4,5]

# display the first two items from the list
print(slist[0:2])

[1, 2]

In [21]:
# display items 3 and 4 from the list
print(slist[2:4])

[3, 4]


In [22]:
# display the first 3 items from the list
print(slist[:3])

[1, 2, 3]


In [23]:
# display all items from the list starting with the second item
print(slist[1:])

[2, 3, 4, 5]
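
The slices above follow one rule: [start:stop] includes the item at index start and stops just before index stop. The sketch below spells this out; the names middle and copy_of_slist are illustrative.

```python
slist = [1, 2, 3, 4, 5]

# a slice [start:stop] includes start but stops just before stop
middle = slist[1:4]
print(middle)               # [2, 3, 4]

# so the number of items returned is stop - start
print(len(middle))          # 3

# slicing returns a new list; the original is unchanged
print(slist)                # [1, 2, 3, 4, 5]

# omitting both bounds copies the whole list
copy_of_slist = slist[:]
print(copy_of_slist == slist, copy_of_slist is slist)  # True False
```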


## Built-In Functions

In [4]:
# Some of the most frequently used Python built-in functions
print(newstring)

this is a simple string- and this is an extension to the string


In [5]:
# get the length of an object
print(len(newstring))

63


In [61]:
# sum a list of integers
listsum = sum([1, 2, 3, 4])
print(listsum)

10


In [65]:
# get the max of a list
listmax = max([1, 2, 3, 4])
print(listmax)

4


In [66]:
# get the min of a list
listmin = min([1, 2, 3, 4])
print(listmin)

1


In [63]:
# change data type of content of a variable when assigning it to a new variable
# changing a data type is referred to as "type casting"
newint = int(newvar)
type(newint)

int

In [64]:
# change an integer to float
newfloat = float(x)
type(newfloat)

float
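
A few details of type casting are easy to miss, so here is a short sketch using illustrative values: int() drops the fractional part rather than rounding, and numeric strings can also be converted.

```python
# int() drops the fractional part; it does not round
truncated = int(1032.7)
print(truncated)            # 1032
negative = int(-2.9)
print(negative)             # -2

# round() rounds to the nearest integer instead
rounded = round(1032.7)
print(rounded)              # 1033

# numeric strings can be converted as well
from_text = int("516") + 1
print(from_text)            # 517
float_from_text = float("516.7")
print(float_from_text)      # 516.7
```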

In [6]:
# range() automatically generates a sequence of values
# often used to facilitate iteration in a loop statement
rx = range(3)
print(rx[0])
print(rx[1])
print(rx[2])

0
1
2


In [7]:
# as with a list, valid range indices go from 0 (zero) to (length of range - 1)
# so if we attempt to access an index value that is too large relative to the data object
# we will get an error message
print(rx[3])

IndexError: range object index out of range

A reference guide to Python's built-in functions is posted in Canvas:

https://docs.python.org/3/library/functions.html

More usage to come in the upcoming weeks !!

## Control Flow

Three basic control flow constructs:

- if - then - else

- while loop

- for loop


### if - then - else
    
Logic:
&nbsp;&nbsp;__if__ some condition is true, do something; <br />
&nbsp;&nbsp;__else if__ some other condition is true, do something else <br />
&nbsp;&nbsp;__else__ (i.e., if none of the preceding conditions are true), perform a "catchall" task

In [73]:
# if then else example taken from "A Whirlwind Tour of Python"
# Note that 'elif' is Python syntax for "else if"
x = -15

if x == 0:
    print(x, "is zero")
elif x > 0:
    print(x, "is positive")
elif x < 0:
    print(x, "is negative")
else:
    print(x, "is unlike anything I've ever seen...")

-15 is negative


### while loop

Logic:
    iterate as long as a condition remains true; stop once it becomes false
    

In [67]:
# while loop example from "A Whirlwind Tour of Python"
i = 0
while i < 10:
    print(i, end=' ')
    i += 1

0 1 2 3 4 5 6 7 8 9 

### for loop

Logic: 
    iterate a finite number of times, e.g., by sequentially iterating through the elements of a list or a finite range of numeric values

In [2]:
# a simple for loop
for i in range(10):
    print(i, end=' ')

0 1 2 3 4 5 6 7 8 9 

### Nested for loops

Logic: iterate a finite number of times across each of TWO iterators

In [6]:
# a nested for loop
# let's start with a list of lists
nlist = [[1, 2, 3], ['A', 'B', 'C'], [4, 5], ['D', 'E']]

# a new list we will fill with the individual items of the 'nlist' lists
flist = []

# nested for loop: 'for x in nlist' extracts a sublist from 'nlist'
# 'for y in x' extracts each item from that sublist
for x in nlist:
    for y in x:
        flist.append(y)
print(flist)


[1, 2, 3, 'A', 'B', 'C', 4, 5, 'D', 'E']


### Using 'break' and 'continue' to control your loops

- use 'break' to entirely escape from the loop

- use 'continue' to skip ahead to the next iteration of the loop

In [70]:
# From "A Whirlwind Tour of Python": generating all Fibonacci numbers < 100
a, b = 0, 1
amax = 100
L = []

# iterate until a > amax; when a > amax, break out of the while loop
while True:
    (a, b) = (b, a + b)
    if a > amax:
        break
    L.append(a)

print(L)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]


In [69]:
# 'continue' example from "A Whirlwind Tour of Python"
for n in range(20):
    # if the remainder of n / 2 is 0, skip the rest of the loop
    if n % 2 == 0:
        continue
    print(n, end=' ')

1 3 5 7 9 11 13 15 17 19 

## Python Classes and Objects

- Object oriented programming is a method of software engineering that relies on the concepts of __objects__ and __classes__.


- A __class__ is basically a "schematic" that defines the attributes and class-specific software functions (known as __methods__) that pertain to a type of user-defined data __object__. When following an object oriented approach to software engineering, we typically start by thinking about the types of data we plan to work with / manipulate, and then create __class__ definitions that will support that data. Once a __class__ is defined, we can create __instances__ of a class whenever we instantiate a new data item that pertains to the __class__ we've defined.


- A __class variable__ is a variable that is shared by all instances of a given class. A very common example of a class variable would be a counter that is used to keep track of the number of objects of a given class that have been instantiated so far.


- An __instance variable__ is a variable whose data value is specific to a given instance of a particular class. 


- __An example__: Let's say we want to create a pre-defined collection of data attributes + associated software functions that can be used for storing and manipulating information about single family houses. We might want to include information such as the street address of the home, the square footage of the house, the room count, the number of bathrooms, the type of materials comprising the exterior of the structure (e.g., brick, wood, cement blocks, etc.) the square footage of the land on which the house is situated, an indicator that tells us whether or not the house has an attached garage, an attribute that indicates the number of vehicles that are legally allowed to be parked/stored on the property, etc. All of these attributes can be encapsulated within a single __class__ that we define for purposes of managing data related to single family homes. Once we've defined our class, we can create a new __instance__ of the class for each house for which we are collecting / managing information.
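
Before looking at a full class, the difference between a class variable and an instance variable can be sketched with a deliberately tiny, hypothetical class. Sensor, sensor_count and name are made-up names used only for illustration.

```python
# hypothetical class used only to illustrate the two kinds of variables
class Sensor:
    # class variable: one value shared by every instance
    sensor_count = 0

    def __init__(self, name):
        # instance variable: each instance stores its own value
        self.name = name
        # update the shared counter on the class itself
        Sensor.sensor_count += 1

s1 = Sensor("north")
s2 = Sensor("south")

print(s1.name, s2.name)      # north south
print(Sensor.sensor_count)   # 2
print(s1.sensor_count)       # 2 (instances see the shared value)
```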


#### What might such a class definition look like in Python?

In [1]:
# class definition for single family houses
class SF_House:
    '''Common base class for all single family houses'''
    
    # define a class variable that will be incremented each time a new 
    # SF_House object is created
    SFH_Count = 0

    # define the structure of the attributes for the class
    def __init__(self, address, sqft, room_count, num_baths, ext_material,
                garage, vehicles):
        # define how user-supplied data values will be assigned 
        # to each instance variable within the class
        self.address = address
        self.sqft = sqft
        self.room_count = room_count
        self.num_baths = num_baths
        self.ext_material = ext_material
        self.garage = garage
        self.vehicles = vehicles
        # increment the counter each time a new instance of the class is created
        SF_House.SFH_Count += 1
        
    # define a class-specific method that will display the count
    # of the number of SF_House objects we've created so far
    def displayCount(self):
        print ("Total Houses %d" % SF_House.SFH_Count)
        
    # define a class-specific method that will display the address
    # and square footage of a house
    def displayAddressAndSqft(self):
        print ("Address : ", self.address,  ", sqft: ", self.sqft)

Now that we've defined our class specification for single family home data, let's create two instances of that class. Note that each instance we are creating represents a new Python data __object__.

In [4]:
# create a new object of class SF_House
h1 = SF_House("734 East 29th Street", 1475, 5, 2, "Brick", 1, 3)

# create another object of class SF_House
h2 = SF_House("22 Acacia Avenue", 2642, 6, 3, "Wood and Stucco", 1, 5)

Now let's use the methods we included within the SF_House class to extract some content from the new objects we've created:

In [13]:
# Display the address and square footage of the second SF_House object we created
h2.displayAddressAndSqft()

# how many SF_House objects have we created so far?
# use the displayCount() method
h1.displayCount()

Address :  22 Acacia Avenue , sqft:  2642
Total Houses 2


Access the docstring for the class we've created:

In [3]:
print ("SF_House.__doc__:", SF_House.__doc__)

SF_House.__doc__: Common base class for all single family houses


Display all attributes of an SF_House object:

In [5]:
print(h1.__dict__)

{'address': '734 East 29th Street', 'sqft': 1475, 'room_count': 5, 'num_baths': 2, 'ext_material': 'Brick', 'garage': 1, 'vehicles': 3}


Display all of the attributes and methods of a class we've created using the __dir()__ function. Note how all Python-provided methods contain double underscore characters both before and after the method name. We must include these double underscore characters when attempting to utilize these methods.

Also note how the class-specific functions we've defined appear within this list.

In [9]:
dir(SF_House)

['SFH_Count',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'displayAddressAndSqft',
 'displayCount']

__IMPORTANT__: Object oriented programming serves as the basis for Python’s data types and data structures. For example, when we assign a value to a variable in Python, we are actually assigning that value to an instance of a class that contains many pre-built functions (referred to as “methods”) and whose structure defines how the variable can be utilized within our code.

## Proper Use of Comments in Code

- Comments should be applied within your code in a concise and consistent manner


- A good rule of thumb is to __include an explanatory comment at the start of each code block__ as well as for individual lines of code that may be difficult for other Python users to interpret / understand


__*NEVER, EVER leave any Python program you create free of comments!! Such behavior is considered both lazy and sloppy and will not earn you the respect of your peers or superiors.*__


### Which of these two code samples would you prefer to work with?

In [None]:
# CODE SAMPLE 1
# -----------------

def sortwithloops(input):
    # first check the length of the list to ensure it is > 1
    if len(input) < 2:
        return input
        
    # use the first item in the list as the pivot value
    # use a list comprehension to extract all instances of the pivot value
    pivot = [x for x in input if x == input[0]]
    
    # extract all items < pivot; sort that new list via recursion
    smallers = sortwithloops([x for x in input if x < input[0]])
    
    # extract all items > pivot; sort that new list via recursion
    largers = sortwithloops([x for x in input if x > input[0]])

    # combine the three sorted pieces into the final sorted list
    return smallers + pivot + largers

In [None]:
# CODE SAMPLE 2
# ----------------------

avg_lat_long_by_ward_df = df[df['longitude'] != 0]. \
groupby(['ward'])['latitude','longitude'].mean().reset_index()
avg_lat_long_by_ward_df.columns=['ward','avg_latitude','avg_longitude']

avg_lat_long_by_lga_df = df[df['longitude'] != 0]. \
groupby(['lga'])['latitude','longitude'].mean().reset_index()
avg_lat_long_by_lga_df.columns=['lga','avg_latitude','avg_longitude']

avg_lat_long_by_region_df = df[df['longitude'] != 0]. \
groupby(['region'])['latitude','longitude'].mean().reset_index()

avg_lat_long_by_region_df.columns=['region','avg_latitude','avg_longitude']


avg_lat_ward_dict = dict(zip(list(avg_lat_long_by_ward_df['ward']),list(avg_lat_long_by_ward_df['avg_latitude'])))
avg_long_ward_dict = dict(zip(list(avg_lat_long_by_ward_df['ward']),list(avg_lat_long_by_ward_df['avg_longitude'])))

### "Hands On" Exercises

1. Write a loop that accepts a string and produces a new string with the characters in reverse order. For example: 'analytics' would become 'scitylana'. Make use of Python's __input__ function to capture the string from a user.

In [None]:
# using input() to capture a user provided string 
word = input("Input a word")
word

2. Write a loop that accepts a string and produces a list composed of items that each have at least 2 sequential characters from the original string, with the exception of the last element of the list which will contain any remaining individual character from the original string. For example: 'analytics' would become ['an', 'al', 'yt', 'ic', 's']