# Python Introduction through Jupyter Notebook

Python is a popular programming language used to solve a wide range of problems.  Like PHP, Python is open source; it is licensed under the Python Software Foundation License, a GPL-compatible license.

## Python

* Python is dynamically typed: you do not need to declare a variable's type before using it.

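* Python is whitespace-delimited, meaning that code blocks are denoted by indentation rather than braces.  PEP 8 recommends four spaces per indentation level rather than tabs.  Inconsistent indentation, such as mixing tabs and spaces, can make bugs harder to spot.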

* No semicolon is needed to end a statement.

* Function and variable names should use underscores to separate words rather than mixedCase or camelCase: `variable_name` instead of `variableName`.  See https://www.python.org/dev/peps/pep-0008/

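* Python passes references to objects (sometimes called pass-by-object-reference).  Assigning a variable to another name does not copy the object, so changes made to a mutable object such as a list through one name are visible through the other, even within functions.  Rebinding a name to a new object does not affect the original.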

* You can bring in code from other files or modules with the `import` statement, similar to `require_once` or `include` in PHP.
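
A short sketch of what the bullets on dynamic typing and variable assignment mean in practice. It distinguishes mutating a shared object from rebinding a name; the helper names `append_item` and `rebind` are hypothetical and chosen only for illustration.

```python
# A name can be rebound to a value of a different type at any time.
value = 9
value = "nine"            # no type declaration needed

# Assignment does not copy: both names refer to the same list object.
original = [1, 2, 3]
alias = original
alias.append(4)           # visible through both names

# An explicit copy is an independent object.
independent = original.copy()
independent.append(5)     # does not affect original

def append_item(items):
    # Mutates the caller's list in place (hypothetical helper).
    items.append(99)

def rebind(items):
    # Rebinding the local name leaves the caller's list untouched.
    items = [0]
    return items

append_item(original)
rebind(original)
print(original)      # [1, 2, 3, 4, 99]
print(independent)   # [1, 2, 3, 4, 5]
```

When an independent copy is needed, `list.copy()` (or the `copy` module for nested structures) makes the intent explicit.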

## Jupyter (IPython) Notebook

* Within Jupyter Notebook, code is written in cells.  Execute a cell by clicking Run, pressing Shift-Enter, or using one of several other key combinations.  See Help -> Keyboard Shortcuts.

* Run code in small increments to build a larger program.  This works well for data mining, where a program can be built piece by piece to manipulate data.

* Variables and other executed code remain in memory as long as the kernel is running.

* The autosave feature is helpful, but remember to save your code and keep it under version control.

* Like Vi, Jupyter Notebook has a command mode, which makes creating and running code fast and mouse-free.



## Additional Info:

See the "Help" menu!

IPython Notebook:  https://nbviewer.jupyter.org/github/ipython/ipython/blob/3.x/examples/Notebook/Index.ipynb 

Language Reference: https://docs.python.org/3/reference/

The Python Tutorial: https://docs.python.org/3/tutorial/index.html#tutorial-index


## 1.1 Basic Python Data Types and Operations

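Vanilla Python has standard data types that are familiar from other languages, such as integer, floating point, Boolean, and string.  Python 3 has a single `int` type of unlimited size (there is no separate long integer) and no separate character type; a character is simply a string of length one.  Here are some basic examples: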


In [1]:
#Integer
num = 9
print(num, type(num))

#Floating Point
val = 6.7       
print(val, type(val))

#Boolean
sb = True           
print(sb, type(sb))

#String
sentence = "A python string."
print(sentence, type(sentence))

# Execute this code by selecting "Cell" -> "Run Cells" or by pressing Shift-Enter

9 <class 'int'>
6.7 <class 'float'>
True <class 'bool'>
A python string. <class 'str'>


Basic math works like you might expect from other languages.

In [2]:
#Basics
addnum = num + 4
subnum = num - 1
multnum = num * 3       
divnum = num / 2

#Other operators
num += 2           
num2 = num       
num *= 3          

print(addnum,subnum,multnum,divnum,num2)

# or 

print("addnum is: ", addnum)



13 8 27 4.5 11
addnum is:  13
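
Beyond the four basic operators, a few others are worth a quick look. This is a small sketch with made-up variable names (`total`, `per_box`), not part of the cells above:

```python
total = 17
per_box = 5

print(total / per_box)    # true division always returns a float: 3.4
print(total // per_box)   # floor division: 3
print(total % per_box)    # remainder (modulo): 2
print(total ** 2)         # exponentiation: 289
print(divmod(total, per_box))  # quotient and remainder together: (3, 2)
```

Note that `/` returns a float even when both operands are integers, which is why `num / 2` printed `4.5` above.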


The math module provides some additional functions.

In [3]:
import math

#A built-in for pi
y = 3*math.pi            
print(math.sin(y))       
print(math.tanh(y))      

#Built-in for infinity
i = math.inf
print(math.isinf(i))

#Not a number
n = math.nan             
print(math.isnan(n))


#Other functions
x = 9
print(math.sqrt(x))      
print(math.pow(x,2))     
print(math.exp(x))               
print(math.factorial(x)) 

#floating point
y = 0.2
#Ceiling
print(math.ceil(y))   
#Floor
print(math.floor(y))
#Truncate
print(math.trunc(y))

3.6739403974420594e-16
0.9999999869751758
True
True
3.0
81.0
8103.083927575384
362880
1
0
0


Logical Operators - Use for testing within conditionals and the like

In [4]:
trval = True
flval = False

print(trval and flval)
print(trval or flval)
print(trval and not flval)  

False
True
True


Selected string operations and more

In [5]:
school = "UWSP"

#Length
print(len(school)) 

#Type cast length to a string instead of int
print("Length of string ", school, " is now a string ", str(len(school)))

#change to upper or lower case
print(school.upper())
print(school.lower())

#last three characters (a negative index counts from the end)
print(school[-3:])

school_dept = "Computing and New Media Technologies"

#split on a space
split_dept = school_dept.split(' ')
print(split_dept[1])

#Replacement
print(school_dept.replace("New","Old"))

#Concatenate
print(school + " " + school_dept)

#Search for something in the string
print(school_dept.find("Media"))
print("Comput" in school_dept)

#Make 79 equals signs to look like a double line.
print("".join(["="]*79))

4
Length of string  UWSP  is now a string  4
UWSP
uwsp
WSP
and
Computing and Old Media Technologies
UWSP Computing and New Media Technologies
18
True
===============================================================================
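
As an alternative to `str()` and `+`, formatted string literals (f-strings, available since Python 3.6) embed expressions directly in a string. A brief sketch reusing the same example value; the variable name `message` is illustrative:

```python
school = "UWSP"

# Expressions inside braces are evaluated and converted to text.
message = f"Length of string {school} is {len(school)}"
print(message)

# Negative indices count from the end of the string.
print(school[-3:])        # last three characters: WSP

# A string can be repeated with the * operator.
print("=" * 20)

# Format specifiers control width and precision.
print(f"{3.14159:.2f}")   # 3.14
```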


## 1.2 Sequence and Map Data Types: Lists and Dictionaries

This section covers lists and dictionaries. Tuples are not covered here; see https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences for more information on tuples.

A list object - Similar to a numerically indexed array in PHP.

In [6]:
fiblist = [3,5,8,13,21,34]
print(type(fiblist))
print(fiblist)

#Create a list with the list and range functions
another_list = list(range(0,10,2))
print(another_list)

#Length of list (number of elements)
print(len(fiblist))

#Third element
print(fiblist[2])                

#First two elements
print(fiblist[:2])          

#Everything from the third element to the end
print(fiblist[2:])

#Add (sum) all elements in the list.
print(sum(fiblist))

#Append another element, this time the number 55 to the list called fiblist.
fiblist.append(55)

print(fiblist)

#Remove last element and print it.
print(fiblist.pop())

#List after pop() function call
print(fiblist)

#Glue two lists together to form a new list.
print(fiblist + [11,13,15])

#Make four copies of the list
print(fiblist * 4)

#Insert the value 2 at index 1
fiblist.insert(1,2)
print(fiblist)

#Sort the list in descending order
fiblist.sort(reverse=True)
print(fiblist)

<class 'list'>
[3, 5, 8, 13, 21, 34]
[0, 2, 4, 6, 8]
6
8
[3, 5]
[8, 13, 21, 34]
84
[3, 5, 8, 13, 21, 34, 55]
55
[3, 5, 8, 13, 21, 34]
[3, 5, 8, 13, 21, 34, 11, 13, 15]
[3, 5, 8, 13, 21, 34, 3, 5, 8, 13, 21, 34, 3, 5, 8, 13, 21, 34, 3, 5, 8, 13, 21, 34]
[3, 2, 5, 8, 13, 21, 34]
[34, 21, 13, 8, 5, 3, 2]
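
Slicing also accepts negative indices and a step, which helps when the length of a list is not known in advance. A short sketch on a fresh list (the names `nums` and `descending` are illustrative):

```python
nums = [3, 5, 8, 13, 21, 34]

print(nums[-1])      # last element: 34
print(nums[-3:])     # final three elements: [13, 21, 34]
print(nums[::2])     # every second element: [3, 8, 21]
print(nums[::-1])    # reversed copy: [34, 21, 13, 8, 5, 3]

# sorted() returns a new list; list.sort() and list.reverse() modify in place.
descending = sorted(nums, reverse=True)
nums.reverse()
print(descending == nums)   # True, because nums started in ascending order
```

Reversing and sorting in descending order only give the same result when the list starts out in ascending order.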


A list can also contain strings:

In [7]:
word_list = ['news','sports','weather']
print(word_list)
print(type(word_list))

#Is the word "weather" in the list?
print("weather" in word_list)

#Same as above, but nicer to read:
print("Is the word weather in the list?", "weather" in word_list)

print("Third element")
#access third element
print(word_list[2])               

#show first two elements
print(word_list[:2])       

#show everything from the third element onward (here, a single element)
print(word_list[2:])             

['news', 'sports', 'weather']
<class 'list'>
True
Is the word weather in the list? True
Third element
weather
['news', 'sports']
['weather']


More advanced manipulation of lists

In [8]:
#Append a new element to the end of the list
word_list.append("events")            

separator = " "
print(separator.join(word_list))

#No one reads the news, so remove it.
word_list.remove("news")           
print(word_list)



news sports weather events
['sports', 'weather', 'events']


Working with dictionaries - Similar to an associative array in PHP

In [9]:
states = {}
states['MA'] = "Massachusetts"
states['ME'] = "Maine"
states['MI'] = "Michigan"
states['MO'] = 'Missouri'
states['MS'] = "Mississippi"
states['MT'] = "Montana"
states['IL'] = "Illinois"
states['WI'] = "Wisconsin"

print(states)
#Return keys
print(states.keys())          
#Return values
print(states.values())          
#Return total number of key:value pairs
print(len(states))              

#Retrieve the value for the following key:
print(states.get('MT'))
print("WI" in states)

#Make a dictionary from two lists
keys = ['cloudy','sunny','rain','snow']
values = [9,3,2,7]
sky = dict(zip(keys, values))
print(sky)
print(sorted(sky))     # sort based on keys

#Other sorting
from operator import itemgetter
#Sort by key
print(sorted(sky.items(), key=itemgetter(0)))  
#Sort by value
print(sorted(sky.items(), key=itemgetter(1)))   

{'MA': 'Massachusetts', 'ME': 'Maine', 'MI': 'Michigan', 'MO': 'Missouri', 'MS': 'Mississippi', 'MT': 'Montana', 'IL': 'Illinois', 'WI': 'Wisconsin'}
dict_keys(['MA', 'ME', 'MI', 'MO', 'MS', 'MT', 'IL', 'WI'])
dict_values(['Massachusetts', 'Maine', 'Michigan', 'Missouri', 'Mississippi', 'Montana', 'Illinois', 'Wisconsin'])
8
Montana
True
{'cloudy': 9, 'sunny': 3, 'rain': 2, 'snow': 7}
['cloudy', 'rain', 'snow', 'sunny']
[('cloudy', 9), ('rain', 2), ('snow', 7), ('sunny', 3)]
[('rain', 2), ('sunny', 3), ('snow', 7), ('cloudy', 9)]
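
Two common follow-ups are looping over key/value pairs and handling missing keys. A minimal sketch using the same `sky` data; the key `'fog'` is a made-up example of a key that is not present:

```python
sky = {'cloudy': 9, 'sunny': 3, 'rain': 2, 'snow': 7}

# Loop over key/value pairs in insertion order.
for condition, days in sky.items():
    print(condition, days)

# get() returns None, or a supplied default, instead of raising KeyError.
print(sky.get('fog'))        # None
print(sky.get('fog', 0))     # 0

# Key with the largest value.
most_common = max(sky, key=sky.get)
print(most_common)           # cloudy

# Remove a key and return its value.
removed = sky.pop('rain')
print(removed, len(sky))     # 2 3
```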


## 1.3 Control and Flow Statements: Conditionals and Loops

Various forms of control flow exist, similar to other languages.  Basic `if` syntax follows, along with list comprehensions (a compact form of the `for` loop) and a `while` loop.  Notice the lack of curly braces and the use of indentation.

In [10]:
test_num = 34

if test_num % 2 == 0:
    print("test_num =", test_num, "is even")
else:
    print("test_num =", test_num, "is odd")

if test_num > 0:
    print("test_num =", test_num, "is positive")
elif test_num < 0:
    print("test_num =", test_num, "is negative")
else:
    print("test_num =", test_num, "is not positive or negative")

test_num = 34 is even
test_num = 34 is positive


In [11]:
#Recall word_list was defined above.

search_words = ['weather','cat video','squirrel','sports']
#For simple filtering, a list comprehension is usually more concise than a for loop.
#The loop below works, but prints a result for every search word:
#for i in search_words:
#    print("Is the word", i, "in the list?", i in word_list)
 
#Instead, to find search words in the list:
print("Results using search for word")
print([word for word in search_words if word in word_list])

#Or cast to a set and use intersection
word_set = set(word_list)
search_set = set(search_words)
print("Using intersection:")
print(word_set.intersection(search_set))

#Get length of each word (sets are unordered, so the order of the lengths may vary)
length_list = [len(word) for word in word_set]
print(length_list)

#Recall sky variable defined above, this is similar.
#Here's a different way to define a similar dictionary.  Looks a lot like a JavaScript object.
colors = {'Red': 3, 'Orange': 2, 'Yellow': 4, 'Green': 10}
color_names = [k for (k,v) in colors.items()]
print(color_names)

Results using search for word
['weather', 'sports']
Using intersection:
{'weather', 'sports'}
[7, 6, 6]
['Red', 'Orange', 'Yellow', 'Green']
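
For comparison, the list comprehension used above is equivalent to an explicit `for` loop that appends matching items. The sketch below redefines the lists so it runs on its own; the names `found_loop` and `found_comp` are illustrative, and `enumerate` is a built-in that yields an index alongside each item.

```python
word_list = ['sports', 'weather', 'events']
search_words = ['weather', 'cat video', 'squirrel', 'sports']

# Explicit loop: build the result one item at a time.
found_loop = []
for word in search_words:
    if word in word_list:
        found_loop.append(word)

# Equivalent list comprehension.
found_comp = [word for word in search_words if word in word_list]

print(found_loop == found_comp)   # True

# enumerate() supplies a running index when the position matters.
for index, word in enumerate(found_loop):
    print(index, word)
```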


In [12]:
# using while loop

ranged_list = list(range(-20,20))
print(ranged_list)

i = 0
while ranged_list[i] < 1:
    i = i + 1

print("First number greater than zero:", ranged_list[i])


[-20, -19, -18, -17, -16, -15, -14, -13, -12, -11, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
First number greater than zero: 1
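
The loop above relies on the list containing at least one value greater than zero; otherwise the index would run past the end and raise an `IndexError`. One possible guard, sketched here with an illustrative list named `values`, is to check the index in the loop condition, or to use `next()` with a default:

```python
values = [-3, -2, -1, 0]      # illustrative data with no positive numbers

i = 0
# Stop at the end of the list as well as at the first positive value.
while i < len(values) and values[i] <= 0:
    i += 1

if i < len(values):
    print("First positive number:", values[i])
else:
    print("No positive number found")

# next() with a generator expression and a default does the same search.
first_positive = next((v for v in values if v > 0), None)
print(first_positive)         # None
```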


## 1.4 User-Defined Functions

You can create your own functions in Python.  Named functions are defined with `def`; short anonymous (unnamed) functions are created with the `lambda` keyword, as in the first example set.

In [13]:
line_repeat = lambda c: "".join([c]*79)

print(line_repeat("+"))

line_repeat_num = lambda c,n: "".join([c]*n)

print(line_repeat_num("*",51))

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
***************************************************
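
Lambdas are most often passed directly to another function rather than stored in a variable. As a sketch, here is a named equivalent of `line_repeat_num` (called `repeat_char` here, a name chosen for illustration) and a lambda used as a sort key:

```python
def repeat_char(c, n=79):
    """Return the string c repeated n times."""
    return c * n

print(repeat_char("-", 10))   # ----------

# A lambda as the key function: sort words by length, then alphabetically.
words = ['weather', 'news', 'sports', 'events']
by_length = sorted(words, key=lambda w: (len(w), w))
print(by_length)              # ['news', 'events', 'sports', 'weather']
```

A `def` gives the function a name that shows up in error messages, which can make debugging easier than with an assigned lambda.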


In [14]:
import math

precip_list = [0.0,0.1,-2,math.nan,0.34,0.54,0.2,0,-0.2]

#Remove precipitation values that are not a number or are below zero
#Note the use of a default value for an argument
def remove_below_zero(precip_list, sort_nums=False):    
    final_list = []
    for item in precip_list:
        if not math.isnan(item) and item >= 0.0:
            final_list.append(item)
            
    if sort_nums:
        final_list.sort()
    return final_list



print(remove_below_zero(precip_list))    

[0.0, 0.1, 0.34, 0.54, 0.2, 0]
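
The same filter can be written as a list comprehension, and the optional `sort_nums` argument can be passed by keyword. This is a sketch that assumes the input contains only numbers; the name `remove_below_zero_v2` is hypothetical:

```python
import math

def remove_below_zero_v2(values, sort_nums=False):
    """Keep values that are numbers (not NaN) and not negative."""
    kept = [v for v in values if not math.isnan(v) and v >= 0.0]
    return sorted(kept) if sort_nums else kept

precip_list = [0.0, 0.1, -2, math.nan, 0.34, 0.54, 0.2, 0, -0.2]

print(remove_below_zero_v2(precip_list))
# [0.0, 0.1, 0.34, 0.54, 0.2, 0]
print(remove_below_zero_v2(precip_list, sort_nums=True))
# [0.0, 0, 0.1, 0.2, 0.34, 0.54]
```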


## 1.5 Working with Files

A common data mining task is to work with files.  Libraries offer higher-level tools for this, such as the read_csv() function in pandas, but Python's built-in open() function can also read and write files directly.

In [15]:
states = {}
states['MA'] = "Massachusetts"
states['ME'] = "Maine"
states['MI'] = "Michigan"
states['MO'] = 'Missouri'
states['MS'] = "Mississippi"
states['MT'] = "Montana"
states['IL'] = "Illinois"
states['WI'] = "Wisconsin"

with open('states.txt', 'w') as f:
    f.write('\n'.join('{0},{1}'.format(k,v) for k,v in states.items()))
    
with open('states.txt', 'r') as f:
    for line in f:
        #Strip the trailing newline before splitting on the comma
        fields = line.strip().split(sep=',')
        print('State=',fields[1],'(',fields[0],')')

State= Massachusetts ( MA )
State= Maine ( ME )
State= Michigan ( MI )
State= Missouri ( MO )
State= Mississippi ( MS )
State= Montana ( MT )
State= Illinois ( IL )
State= Wisconsin ( WI )
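
For comma-separated data, the standard library's `csv` module handles the splitting and line endings. A minimal sketch that writes to a temporary directory so it does not touch existing files; the two-state dictionary is a shortened version of the one above, and the file name `states.csv` is an assumption made for the example:

```python
import csv
import os
import tempfile

states = {'MI': 'Michigan', 'WI': 'Wisconsin'}

path = os.path.join(tempfile.mkdtemp(), 'states.csv')

# newline='' lets the csv module control line endings itself.
with open(path, 'w', newline='') as f:
    writer = csv.writer(f)
    for abbrev, name in states.items():
        writer.writerow([abbrev, name])

rows = []
with open(path, newline='') as f:
    for abbrev, name in csv.reader(f):
        rows.append((abbrev, name))
        print(f"State= {name} ( {abbrev} )")
```

Unlike a plain `split(',')`, `csv.reader` also copes with quoted fields that themselves contain commas.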
