# Programming structures and Python basics


Spring 2018 - Prof. Dantong Yu. The script was created by Foster Provost.

Teaching Assistant: Dantong Yu


***

This notebook shows examples of Python code, including built-in functions, packages and programming structures useful for Data Science and Business Analytics. This notebook is a modified version of one created by [Rob Moakler](https://github.com/rmoakler/learning-data-science/blob/master/Spring%202016/Hands-on/Module%201%20-%20Python%20and%20IPython%20Notebooks/IPython%20Notebook%20Tour.ipynb).  This should be review for you all!  At the bottom there are some pointers to a few resources.

## Python code

### 1. Variables, operations and data types
Variables are used to store data. 
The data can be of a variety of types: 

- Integer numbers
- Floating-point numbers (decimals)
- Strings

Let's create three variables, one of each type, with 3 different names:

In [3]:
some_integer = 5
some_float = 7.12
some_string = "Student"

We can print out these variables. Remember we need to run the previous cell first!

In [4]:
print(some_integer)
print(some_float)
print(some_string)

5
7.12
Student
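
If you are ever unsure which type a variable holds, the built-in `type()` function will tell you. A minimal sketch, reusing the same three example values:

```python
# Same example values as in the cells above.
some_integer = 5
some_float = 7.12
some_string = "Student"

# type() reports the data type of the value a variable currently holds.
print(type(some_integer))  # <class 'int'>
print(type(some_float))    # <class 'float'>
print(type(some_string))   # <class 'str'>
```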


What if I want to print some text and then some numbers? One easy way is to build a single string, keeping in mind that joining pieces with `+` only works when every piece is a **string**. 

If you have data that is not a string (like an integer or float), you can **convert** it to a string: 

In [5]:
print("My integer is " + str(some_integer) + ".")
print("My float converted into integer is " + str(int(some_float)) + ".")

My integer is 5.
My float converted into integer is 7.
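
Converting with `str()` is not the only option. As a sketch of two common alternatives (the variable names `message` and `rounded` are made up for this illustration), `print()` can take several arguments, and f-strings can embed values directly in the text:

```python
some_integer = 5
some_float = 7.12

# print() accepts several arguments of any type and separates them with a space.
print("My integer is", some_integer)

# An f-string embeds the value directly in the text (Python 3.6 or later).
message = f"My integer is {some_integer}."
print(message)

# A format specifier controls how a number is displayed, here one decimal place.
rounded = f"My float rounded to one decimal is {some_float:.1f}."
print(rounded)
```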


What else can we do with our variables? We can do basic math: **operations**.

In [6]:
print("sum " + str(some_integer + some_float))
print("multiplication " + str(some_integer * some_float))
print("quotient " + str(some_integer / some_float))
print("power " + str(10**some_integer))

sum 12.120000000000001
multiplication 35.6
quotient 0.7022471910112359
power 100000
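
You may have noticed the odd trailing digits in the sum. Floats are stored in binary, so some decimal values cannot be represented exactly. A short sketch of how `round()` tidies the result, along with two more operators that were not shown above (floor division and modulo; the number 17 and the name `raw_sum` are just examples):

```python
some_integer = 5
some_float = 7.12

# Floor division drops the fractional part; modulo gives the remainder.
print(17 // some_integer)  # 3
print(17 % some_integer)   # 2

# The raw sum carries a tiny binary rounding error.
raw_sum = some_integer + some_float
print(raw_sum)

# round() trims the result to a chosen number of decimal places.
print(round(raw_sum, 2))   # 12.12
```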


We can store this as a new variable and print it:

In [7]:
my_sum = some_integer + some_float
print("Sum variable: " + str(my_sum))

Sum variable: 12.120000000000001


There are also other **data structures**:

- Lists (sometimes referred to as "arrays", but look up the difference)
- Dictionaries
- Sets


In [8]:
some_list = [0,0,1,2,3,3,4.5,7.6]
some_dictionary = {'student1': '(929)-000-0000', 'student2': '(917)-000-0000', 'student3': '(470)-000-0000'}
some_set = set( [1,2,4,4,5,5] )

print("This is a list:  " + str(some_list))
print("This is a dictionary:  " + str(some_dictionary))
print("This is a set:  " + str(some_set))

This is a list:  [0, 0, 1, 2, 3, 3, 4.5, 7.6]
This is a dictionary:  {'student1': '(929)-000-0000', 'student2': '(917)-000-0000', 'student3': '(470)-000-0000'}
This is a set:  {1, 2, 4, 5}
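
Notice that the set printed only four values: a set keeps each value once. A small sketch of how to count elements and test membership (the name `unique_values` is introduced just for this example):

```python
some_list = [0, 0, 1, 2, 3, 3, 4.5, 7.6]
some_set = set([1, 2, 4, 4, 5, 5])

# len() counts elements; the set is shorter because duplicates were dropped.
print(len(some_list))  # 8
print(len(some_set))   # 4

# "in" tests membership for lists, sets and dictionary keys.
print(4 in some_set)   # True
print(3 in some_set)   # False

# Converting a list to a set is a quick way to get its unique values.
unique_values = set(some_list)
print(len(unique_values))  # 6
```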


How can we use  **individual** elements? 

In Python (and generally by computer science convention), we count elements of a _list_ starting from zero! The first item sits at index 0, so index 2, used in the next cell, holds the third item:


In [10]:
print(some_list[2])

1
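
A few more ways to index, sketched on a fresh copy of the same list: negative indices count from the end, and a slice returns a sub-list.

```python
some_list = [0, 0, 1, 2, 3, 3, 4.5, 7.6]

print(some_list[0])    # first item: 0
print(some_list[2])    # third item: 1
print(some_list[-1])   # last item: 7.6

# A slice [start:stop] returns the items from start up to, but not including, stop.
print(some_list[2:5])  # [1, 2, 3]
```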


Adding things to the end of the list is "appending" them:

In [12]:
some_list.append(500)
print(some_list)

[0, 0, 1, 2, 3, 3, 4.5, 7.6, 500]
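
Keep in mind that `append` changes the list in place, so running the same notebook cell twice appends the value twice. A minimal sketch of that effect, and of `remove()` as one way to undo it:

```python
some_list = [0, 0, 1, 2, 3, 3, 4.5, 7.6]

some_list.append(500)
print(len(some_list))  # 9

# Running the same append a second time adds a second copy.
some_list.append(500)
print(some_list[-2:])  # [500, 500]

# remove() deletes the first matching value, undoing one of the appends.
some_list.remove(500)
print(some_list[-2:])  # [7.6, 500]
```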


How can we retrieve an element (**VALUE**) of a _dictionary_ ?  Use its **"KEY"** !! 

In [14]:
print(some_dictionary['student1'])

(929)-000-0000
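
Dictionaries can also be extended and looped over. The sketch below uses a shortened copy of the dictionary; the key `'student4'` and its phone number are made-up placeholders.

```python
some_dictionary = {'student1': '(929)-000-0000', 'student2': '(917)-000-0000'}

# Assigning to a new key adds an entry; assigning to an existing key overwrites it.
some_dictionary['student4'] = '(212)-000-0000'  # made-up placeholder number

# get() returns a default instead of raising KeyError when the key is missing.
print(some_dictionary.get('student9', 'no number on file'))

# items() yields each key together with its value.
for name, phone in some_dictionary.items():
    print(name + ": " + phone)
```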


### 2. Create functions

Functions allow us to execute predefined operations and to define our own operations that will be available later.  They encapsulate procedures. If we know we are likely to execute some operation many times, we may want to save time, to avoid repeated code, and often to clarify what we are doing.  To do this, we would define a function.  

(*If you haven't thought this through before, consider the drawback of repeated code: what if later you realize that you need to fix something in that code block?  You'd have to go through and fix it everywhere.*) 

For example, consider having to calculate the area of a circle.

In [15]:
def area_of_a_circle(radius):
    area = 3.1416 * radius * radius
    return area

In [16]:

circle_area = area_of_a_circle(5)
print("Area of a circle with radius 5 is: " + str(circle_area))


Area of a circle with radius 5 is: 78.54


Can you see what is going on here? The function, which I helpfully named `area_of_a_circle`, takes one **argument** that we call `radius`. It uses this radius to compute the area and then *returns* it. Now, whenever I want the area of some circle, I simply call `area_of_a_circle()` with the radius between the parentheses.
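
This is also where the "fix it in one place" benefit shows up. As a sketch, suppose we later want a more precise value of pi: only the function body changes. The name `area_of_a_circle_precise` is hypothetical, chosen so it does not overwrite the original function.

```python
import math

def area_of_a_circle_precise(radius):
    """Hypothetical variant of area_of_a_circle that uses math.pi."""
    return math.pi * radius ** 2

# Rounded to two decimals, the result matches the earlier output.
print(round(area_of_a_circle_precise(5), 2))  # 78.54
```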

### 3. Loops / iterations

For data analysis we do a lot of repetitive things. This doesn't mean we need to do a ton of copy and pasting, though. We can use **loops** to make this easy. As a very simple example, what if we wanted to square each number from 1 to 5?

In [17]:
for number in [1, 2, 3, 4, 5]:
    print(number * number)

1
4
9
16
25
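
Typing out the list by hand does not scale. A minimal sketch of the same loop using the built-in `range()`, plus a list comprehension that collects the results (the name `squares` is introduced for this example):

```python
# range(1, 6) produces 1, 2, 3, 4, 5; the stop value is excluded.
for number in range(1, 6):
    print(number * number)

# A list comprehension builds the list of results in one expression.
squares = [number * number for number in range(1, 6)]
print(squares)  # [1, 4, 9, 16, 25]
```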


Let's use the function we defined earlier. Remember this is a function that can only be used in **this notebook** 

( unless we write a **"script"** file, but we'll see that later... ):

In [18]:
for number in [1, 2, 3, 4, 5]:
    print("Area of circle with radius " + str(number) + " is: " + str(area_of_a_circle(number)))

Area of circle with radius 1 is: 3.1416
Area of circle with radius 2 is: 12.5664
Area of circle with radius 3 is: 28.2744
Area of circle with radius 4 is: 50.2656
Area of circle with radius 5 is: 78.54


### 4. Conditionals and comparisons

Sometimes we need to check something before deciding what to do next. For example,

In [19]:
def is_best_prof(name):
    if name == "Foster":
        return True
    else:
        return False

In [20]:
print(is_best_prof("Foster"))

True


In [21]:
print(is_best_prof("John"))

False


In [22]:
my_prof = "Foster"
if is_best_prof(my_prof):
    print("You're going to have a great semester!")
else:
    print("Well, Good Luck!")

You're going to have a great semester!


You see in that last one how we have a conditional in the cell, and then call a function that has a conditional inside it?  

Let's put a whole bunch of these things together:

In [23]:
my_profs = ["John", "Paul", "George", "Ringo"]
one_best = False
for prof in my_profs:
    if is_best_prof(prof):
        one_best = True
if one_best:
    print("You're going to have a great semester!")
else:
    print("Well, make the best of it!")

Well, make the best of it!
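
The "loop and set a flag" pattern above is common enough that Python has a built-in for it. A sketch using `any()`, with `is_best_prof` redefined in condensed form so the block runs on its own:

```python
def is_best_prof(name):
    # Condensed version of the function above: the comparison itself is True or False.
    return name == "Foster"

my_profs = ["John", "Paul", "George", "Ringo"]

# any() is True as soon as one element satisfies the condition.
one_best = any(is_best_prof(prof) for prof in my_profs)
print(one_best)  # False
```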


As we can see here, we made a **comparison** of names using the equality operator (`==`).  Remember ... it's `==`, not just `=`! A single `=` assigns a value to a variable.

Other comparisons:

- strictly less than  < 
- less than or equal  <=
- strictly greater than  >
- greater than or equal  >=
- not equal  !=
- object identity  "is"
- negated object identity "is not"
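
The difference between `==` and `is` trips up many beginners: `==` asks whether two values are equal, while `is` asks whether two names point to the very same object. A short sketch (all variable names here are made up for the illustration):

```python
first_list = [1, 2, 3]
second_list = [1, 2, 3]
same_list = first_list

print(first_list == second_list)  # True: same contents
print(first_list is second_list)  # False: two separate objects
print(first_list is same_list)    # True: two names for one object

# "is" is most often used to check for None.
result = None
print(result is None)  # True
```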

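What if we want to check more than one condition? 
We can combine comparisons with logical operators:

- `and`: true only when both conditions are true
- `or`: true when at least one condition is true

The symbols `&` and `|` are not synonyms for these keywords. They are bitwise operators that bind more tightly than comparisons, so they only give the expected result when each comparison is wrapped in parentheses. Prefer `and` and `or` for plain conditions.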
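
To see how the keyword and symbol forms can diverge, here is a small sketch. The value `age = 10` is chosen on purpose so that the unparenthesized `&` version gives the wrong answer.

```python
age = 10

# The keyword form reads naturally and needs no parentheses.
print(age >= 20 and age <= 40)    # False

# The symbol form works only when each comparison is parenthesized.
print((age >= 20) & (age <= 40))  # False

# Without parentheses this parses as age >= (20 & age) <= 40.
# Here 20 & 10 is 0, so the chain 10 >= 0 <= 40 is True: the wrong answer.
print(age >= 20 & age <= 40)      # True
```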

Let's see if you can guess my age with this function!!


In [24]:
def is_my_age(age_argument):
    if age_argument < 20:
        return "Of course not!"
    elif (age_argument >= 20) and (age_argument <= 40):
        return "Maybe.."
    elif age_argument > 40:
        return "Don't even think about it!"

In [25]:
print(is_my_age(10))

Of course not!


In [26]:
print(is_my_age(23))

Maybe..


In [27]:
print(is_my_age(80))

Don't even think about it!
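
Python also lets you chain comparisons the way you would write them in math. The sketch below is one possible rewrite of the function (named `is_my_age_chained` here so it does not replace the original); it also uses a final `else` so that every input gets an answer.

```python
def is_my_age_chained(age_argument):
    if age_argument < 20:
        return "Of course not!"
    elif 20 <= age_argument <= 40:
        # Chained comparison: 20 <= x and x <= 40 in one expression.
        return "Maybe.."
    else:
        # Everything that reaches this point is greater than 40.
        return "Don't even think about it!"

print(is_my_age_chained(10))
print(is_my_age_chained(23))
print(is_my_age_chained(80))
```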


## Help, help, and more help!

- [Codecademy's Python Course](https://www.codecademy.com/learn/python). Working through this class will give you a _great_ foundation for Python.
- [Dive Into Python](http://www.diveintopython.net/toc/index.html) online book. Working your way from chapter 1 through chapter 5 would put you in a very nice place!
- [Python for Data Analysis](https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython-ebook/dp/B009NLMB8Q/ref=mt_kindle?_encoding=UTF8&me=) was the book that Prof. Foster suggested to me (Maria) when I was taking this course. Take a look at the chapters Preliminaries, Introductory Examples (e.g. "Counting Time Zones with pandas"), IPython (pages 46 to 62) and, especially, pandas, one of the main Python packages for data analysis.  We will work with pandas in class.


If you are ever stuck, just remember: it is normal. This is actually how professional programmers work every day. Google is your best friend, and websites such as Stack Overflow have an answer to almost any programming question!
