# A quick tour of Python to understand basic Programming Structures


Spring 2017 - Prof. Foster Provost

Teaching Assistant: Maria L Zamora Maass


***

This notebook shows examples of Python built-in functions, packages and programming structures useful for Data Science and Business Analytics. It is a modified version of a notebook by [Rob Moakler](https://github.com/rmoakler/learning-data-science/blob/master/Spring%202016/Hands-on/Module%201%20-%20Python%20and%20IPython%20Notebooks/IPython%20Notebook%20Tour.ipynb).

IPython notebooks are made up of cells. There are two basic types of cells in an IPython notebook: text cells for comments, and code cells for commands. You can edit a cell by double-clicking on it. To run a cell (which also returns a text cell to display mode), press the "Play" (▶) button; you can stop a running cell with the "square" (◼︎) button. All the tasks for cells can be found in the toolbar.



## Text cells

To write text in a cell, select the cell and use the toolbar to change its type from "Code" to "Markdown". You can then write and format text:

- A hash (number sign) \# at the start of a line makes a title.
- Single \*asterisks\* or \_underscores\_ emphasize things: _example_.
- Double **asterisks** make things bold.
- Square brackets [ ] are used for links and images.
- Also, HTML code is allowed. Some resources can be found in [HTML w3schools](http://www.w3schools.com/html/html_examples.asp) <p style="color:red;">This is a text with HTML code.</p>

- You can also write math with $\LaTeX$, a typesetting system for scientific documents (https://www.latex-project.org/), by wrapping it in dollar signs: $x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. If you don't know how to write a symbol, you can look it up on [Detexify](http://detexify.kirelabs.org/classify.html).





If you are ever stuck, just Google "Markdown syntax": Markdown is the name of the formatting language used in text cells.

## Code cells

Now, we will see a code cell (this is the default format of any cell). Here, we simply type any Python command and then click "Play". When we play a cell, the code in it is executed and it returns what we are asking for. Variables and functions created in a cell are "remembered" for as long as we keep this window open (session running). Code cells will always start with "`In [ ]:`".

For example, in the following cell, I want it to remember that the VARIABLE called "x" is just the sum of two given numbers and I also want it to print the sum. Please select it and press the ▶ button to run it.

In [32]:
x = 5 + 5
print "The value of x is " + str(x) + "."

The value of x is 10.


Instead of having to manually go and click the "Play" button every time, you can also run a cell with your keyboard. Just press **Ctrl + Enter** or **Shift + Enter**. Experiment with both of those to see what the difference is.

We typically read IPython notebooks from **top to bottom**. This means that if a cell relies on a variable or function that was created earlier in the notebook, you must run the corresponding cell first to make that information available in later cells (_we cannot use "x" in another cell unless we have already run this one_).

_Note_: The number in the "In [#]:" label increases by one every time you run a code cell.

## Python commands 

Before we go any further with all the cool stuff IPython notebooks can do, let's talk about Python for a bit.

### 1. Variables and data types
Variables are used to store data. This data can be of a variety of types: integers, floating-point (decimal) numbers, lists, strings, etc. Let's take a look at some of these:

In [33]:
some_integer = 5
some_float = 7.1
some_list = [1, 2, 3, 4]
some_string = "Rob"

We can print out these variables.

In [34]:
print some_integer
print some_float
print some_list
print some_string

5
7.1
[1, 2, 3, 4]
Rob


What if I want to print some text and then some numbers? One easy way is to join the pieces into a single string with `+`. Joining only works between strings, so if you have data that is not a string (like an integer or float), you first convert it to a string:

In [35]:
print "My integer was " + str(some_integer) + "."

My integer was 5.


Now, why was I able to type "`print some_integer`" without the `str()` part a few cells ago but can't do it now? On its own, `print` converts whatever it is given to text, so the simple statement works. In the longer statement the problem is the `+`: Python will not join a string and a number, and it raises a `TypeError` before anything is printed. It is good practice to convert values with `str()` whenever you build a message to print.
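
To make the difference concrete, here is a small sketch. The variable names (`count`, `message`, `formatted`) are made up for illustration, and `print` is written with parentheses so the cell should run in both Python 2 (used in this notebook) and Python 3. It shows the error you get when joining a string and an integer, the `str()` fix, and the string method `format()` as one alternative.

```python
count = 5  # hypothetical value for illustration

# Joining a string and an integer with + raises a TypeError ...
try:
    message = "My integer was " + count + "."
except TypeError:
    message = "Python refused to join a string and an integer"
print(message)

# ... so convert the integer to a string first
message = "My integer was " + str(count) + "."
print(message)

# format() fills in the {} placeholder and does the conversion for us
formatted = "My integer was {}.".format(count)
print(formatted)
```

Both approaches build the same sentence; which one to use is mostly a matter of taste.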

That's cool, but what else can we do with our variables? We can do basic math,

In [36]:
some_integer + some_float

12.1

We can store this as a new variable to use later,

In [37]:
my_sum = some_integer + some_float

Nothing was printed out! This is because I told Python to store the result in `my_sum` instead of displaying it. If you want to see output, you should always explicitly print.

In [38]:
print my_sum

12.1


What about that list we had? What does that mean? A list is exactly what it sounds like. It's a way to keep a collection of things in order. We can check to see how long our list is,

In [39]:
print len(some_list)

4


This looks good. Our list contained the numbers 1 through 4. What if we want a particular item from the list? How do we look at just the first item? To do a lookup, we use square brackets. Notice that when we created the list originally, we also used square brackets!

In [40]:
print some_list[1]

2


That's the second item, not the first! In Python (and many other programming languages), counting starts at zero! To get the first item we should look in the 0th space,

In [41]:
print some_list[0]

1
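
Here is a short sketch of how positions line up with items. The list `letters` is made up for illustration. Because counting starts at zero, the last valid position is the length of the list minus one; Python also accepts negative positions, which count from the end.

```python
letters = ["a", "b", "c", "d"]       # hypothetical list for illustration

first = letters[0]                   # position 0 holds the first item
last = letters[len(letters) - 1]     # the last valid position is length minus one
also_last = letters[-1]              # negative positions count from the end

print(first)
print(last)
print(also_last)
```

Asking for a position that does not exist, such as `letters[4]` here, raises an `IndexError`.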


Adding things to the list is as easy as appending them,

In [42]:
some_list.append(5)

In [43]:
print some_list

[1, 2, 3, 4, 5]


Play around with different data types and see what you can do.

In [44]:
# Play around here!
# By the way, the pound (hash) symbol here is used to indicate a comment in code.

### 2. Functions

We have already used these twice! Functions allow us to do predefined operations. Functions are usually some sensible English word ending in open-and-close parentheses. One example is the `str()` function to convert numbers to strings,

In [45]:
str(5.124)

'5.124'

We also used the `append()` function to add stuff to a list.

If we knew we had to do some operation many times, and wanted to save a bit of time, we could define our own function. For example, consider having to calculate the area of a circle.

In [46]:
def area_of_a_circle(radius):
    area = 3.14 * radius * radius
    return area

In [47]:
circle_area = area_of_a_circle(5)
print circle_area

78.5


Can you see what is going on here? My function that I helpfully named `"area_of_a_circle"` takes one *argument* that we will call radius. It then uses this radius to get the area and then *returns* it. Now, whenever I want to get the area of some circle, I simply call `area_of_a_circle()` and place the radius in the middle of the parentheses.
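
As a variation, here is a sketch of the same idea using the constant `math.pi` from Python's standard `math` module instead of typing 3.14 by hand. The function name `precise_circle_area` is made up so that it does not overwrite the function above.

```python
import math

def precise_circle_area(radius):
    # math.pi is a more precise value of pi than 3.14
    area = math.pi * radius * radius
    return area

# Once defined, the function can be called as often as we like
small_area = precise_circle_area(1)
large_area = precise_circle_area(5)
print(small_area)
print(large_area)
```

The result for a radius of 5 is slightly larger than the 78.5 we got earlier, because 3.14 is a rounded-down value of pi.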

Python has **many** functions, but we will be writing our own very often.

### 3. Loops

We will be doing a lot of repetitive things in Python. This doesn't mean we need to do a ton of copying and pasting, though. We can use **loops** to make this easy. For example, if we wanted to square each number from 1 to 5,

In [48]:
for number in [1, 2, 3, 4, 5]:
    print number * number

1
4
9
16
25


The range function makes this even easier,

In [49]:
for number in range(5):
    print number * number

0
1
4
9
16


Not exactly the same... the `range` function starts from 0 and stops one short of the number you give it. We can fix this by telling it to start at 1 and stop before 6:

In [50]:
for number in range(1, 6):
    print number * number

1
4
9
16
25
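
If you want to check which numbers a `range` will produce before looping over it, you can turn it into a list. This is a small sketch with made-up variable names; the `list()` call is needed in Python 3 and is harmless in Python 2. The optional third argument to `range` is the step size.

```python
zero_to_four = list(range(5))       # start defaults to 0, the stop value is left out
one_to_five = list(range(1, 6))     # explicit start, the stop value is still left out
evens = list(range(0, 10, 2))       # third argument: count in steps of 2

print(zero_to_four)
print(one_to_five)
print(evens)
```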


We aren't limited to this. Let's bring in a couple more lists:

In [51]:
names = ["Robert", "John", "Sarah"]
ages = [26, 31, 29]

for i in range(len(names)):
    print str(names[i]) + " is " + str(ages[i]) + " years old."

Robert is 26 years old.
John is 31 years old.
Sarah is 29 years old.
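
An alternative to looping over positions is the built-in `zip()` function, which pairs up the items of two lists so that we can loop over both at once. This is only a sketch of a different technique, not the approach used above; the list `sentences` is made up to collect the results.

```python
names = ["Robert", "John", "Sarah"]
ages = [26, 31, 29]

sentences = []
# zip() hands us one name and the matching age on each pass through the loop
for name, age in zip(names, ages):
    sentence = name + " is " + str(age) + " years old."
    sentences.append(sentence)
    print(sentence)
```

This prints the same three lines without having to keep track of the position `i`.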


### 4. Conditionals

Sometimes we want to check something before deciding what to do next. For example,

In [52]:
def is_best_prof(name):
    if name == "Robert":
        return "Yes!"
    else:
        return "No!"

In [53]:
print is_best_prof("Robert")

Yes!


In [54]:
print is_best_prof("John")

No!
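
When there are more than two possibilities, `elif` (short for "else if") lets us check further conditions in order. The function `describe_age` and its age cut-offs are made up purely for illustration.

```python
def describe_age(age):
    # Conditions are checked from top to bottom; the first one that is true wins
    if age < 18:
        return "minor"
    elif age < 65:
        return "adult"
    else:
        return "senior"

young = describe_age(12)
middle = describe_age(31)
older = describe_age(70)
print(young)
print(middle)
print(older)
```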


### 5. Packages

Python has a ton of packages that make doing complicated stuff very easy. We won't discuss how to install packages, or give a detailed list of what packages exist, but we will give a brief description of how they are used. An easy way to remember why packages are useful is: "**Python packages give us access to MANY functions!**"

In this class we will use four packages very frequently: `pandas`, `sklearn`, `matplotlib`, and `numpy`:

- **`pandas`** is a data manipulation package. It lets you store data in data frames. More on this next class.
- **`sklearn`** is a machine learning and data science package. It lets you do fairly complicated machine learning tasks, such as running regressions and building classification models, with only a few lines of code!
- **`matplotlib`** lets you make nice looking plots.
- **`numpy`** (pronounced num-pie) is used for doing "math stuff" such as complex math operations (e.g., square roots, exponents, logs) and gives you matrix operation abilities.

If it isn't clear yet why these are useful, don't worry. As we use them throughout the semester, their usefulness will become apparent.

To make the contents of a package useful, you need to import it:

In [55]:
import pandas
import sklearn
import matplotlib
import numpy

Sometimes you will want to use short names for packages. This has just become the norm now, so we will often be doing it so that we fit in with all the professional programmers.

In [56]:
import pandas as pd
import numpy as np

We can now use some package-specific things. For example, numpy has a function called `sqrt()` which will give us the square root of a number. Since it is part of numpy, we need to tell Python that's where it is by using a dot.

In [57]:
np.sqrt(25)

5.0

You may have noticed that earlier, when we added stuff to our list, we used `.append()`. This is very similar! Here, we told Python that numpy had a function called `sqrt()` that we would like to use. Earlier, we told Python that our list (and all lists!) had a function called `append()` that we would like to use.
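
The two uses of the dot can be put side by side. This sketch uses Python's standard `math` module as a stand-in for numpy so that it runs without installing anything; `math.sqrt()` is a different function from `np.sqrt()`, but the dot works the same way. The variable names are made up.

```python
import math

# A dot after a package name reaches a function that lives inside the package
root = math.sqrt(25)

# A dot after a variable reaches a function that belongs to that kind of object
numbers = [1, 2, 3]
numbers.append(4)

print(root)
print(numbers)
```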

That's all we will say about packages for now. Soon, we will be using packages every class. With practice, you will understand why they are so great!

### 5.1. Auto complete for packages

One of the most useful things about IPython notebook is its tab completion. 

Try this: click just after `sqrt(` in the cell below and press `Shift + Tab` four times, slowly.

In [None]:
np.sqrt(

I find this amazingly useful. I think of this as "the more confused I am, the more times I should press Shift+Tab". Nothing bad will happen if you tab complete 12 times.

Okay, let's try tab completion for function names! Just hit `Tab` when typing below to get suggestions.

In [None]:
np.sq

This is super useful when you forget the names of everything!

## 6. Help, help, and more help!

- [Codecademy's Python Course](https://www.codecademy.com/learn/python). Working through this class will give you a _great_ foundation for Python.
- [Dive Into Python](http://www.diveintopython.net/toc/index.html) online book. Working your way from chapter 1 through chapter 5 would put you in a great place!
- [Python for Data Analysis](https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython-ebook/dp/B009NLMB8Q/ref=mt_kindle?_encoding=UTF8&me=) is the book that Prof. Foster recommended to me when I was a student. Take a look at the chapters Preliminaries, Introductory Examples (e.g. "Counting Time Zones with pandas"), IPython (pages 46 to 62) and, especially, pandas.


If you are ever stuck, just remember: it is normal. This is actually how professional programmers work every day. Google is your best friend, and websites such as Stack Overflow (stackoverflow.com) have an answer to almost any programming question!


## 7. Hands-on

To master your newfound knowledge of Python, you should try these hands-on examples. Your homework assignments will be in a similar format to this section.

1\. Create a list of 5 fruits (make sure to include an apple).

2\. Go through each fruit and check if it is an apple. If it is, print out "I found it!". If it's not an apple, do nothing.

3\. Add two new fruits to your list.

4\. Create a new empty list. Go through your list of fruits, and for each one, add an entry to the new list that tells us how many letters are in that fruit's name.

5\. Make a function called `half_squared` that takes a list and returns a new list where each element of the original is squared and then divided in half.

In [None]:
def half_squared(input_list):
    output_list = [] # What should we do?
    
    return output_list