# Introduction to Python for Data Science

1. [Welcome!](#welcome)
2. [What is Python?](#whatispython)
3. [Variables](#variables)
4. [Operators](#operators)
5. [Naming and Assignment](#naming)
6. [Control structures](#if)
7. [Basic functions](#func)
8. [Review Problems](#hwk)

## Welcome! <a id='welcome'></a>

This is a 4-week (8-hour) course that will introduce you to the basics of handling, manipulating, and modeling data with Python. This notebook is a review of some Python language essentials. We'll quickly review this material during the first class, but taking a look at it before then will help you out, especially if you haven't used Python before.

The environment you're in right now is called a Jupyter notebook. [Project Jupyter](http://jupyter.org/) is an interactive environment that data scientists use for collaborating and communicating the results of projects. Each cell in the notebook can contain either text or code (often Python, but R, Julia, and lots of other languages are supported). This allows you to combine snippets of code with explanations and documentation.

Each cell in a notebook can be executed independently, but object and function declarations persist across cells. For example, I can define a variable in one cell...

In [4]:
my_variable = 10

... and then access that variable in a later cell:

In [5]:
print(my_variable)

10


We'll be using Jupyter notebooks extensively in this class. I'll give a more detailed introduction during the first class, but for now, the most important thing is to understand how to run code in the notebook.

As I mentioned above, there are two fundamental types of cells in a notebook - text (i.e. markdown) and code. When you click on a code cell, you should see a cursor appear in the cell that allows you to edit the code in that cell. A cell can have multiple lines - to begin a new line, press `Enter`. When you want to run the cell's code, press `Shift`+`Enter`.

Try changing the values of the numbers that are added together in the cell below, and observe how the output changes:

In [6]:
a = 11
b = 15
print(a+b)

26


You can also edit the text in markdown cells. To display the editable, raw markdown in a text cell, double-click the cell. You can now put your cursor in the cell and edit it directly. When you're done editing, press `Shift`+`Enter` to render the cell into a more readable format.

Try editing the text cell below so that it shows your name:

Hello, my name is Nick!

To change whether a cell contains text or code, use the drop-down in the toolbar. When you're in a code cell, it will look like this:

![](images/code_cell.png)

and when you're in a text cell, it will look like this:

![](images/markdown_cell.png)

Now that you know how to navigate the notebook, let's review some basic Python.

## What is Python? <a id='whatispython'></a>

Python is an open-source programming language that is extremely popular in the data science and web development communities. The roots of its current popularity in data science and scientific computing have an interesting history, but suffice it to say that it's darn near impossible to be a practicing data scientist these days without at least being familiar with Python.

The guiding principles behind the design of the Python language specification are described in "The Zen of Python", which you can find [here](https://www.python.org/dev/peps/pep-0020/) or by executing:

In [8]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


To boil this down a bit, Python syntax should be easy to learn, and well-written Python code should be easy to read. Code that follows these norms is called *Pythonic*. We'll touch a bit more on what it means to write Pythonic code in class.
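As a small illustration of the idea, the sketch below computes the same list of squares two ways. The variable names (`numbers`, `squares_loop`, `squares_pythonic`) are made up for this example, and which version reads better is partly a matter of taste, but most Python programmers would call the second one more Pythonic.

```python
numbers = [1, 2, 3, 4]

# Index-based loop: it works, but the bookkeeping obscures the intent.
squares_loop = []
for i in range(len(numbers)):
    squares_loop.append(numbers[i] ** 2)

# List comprehension: says directly "the square of each number".
squares_pythonic = [n ** 2 for n in numbers]

# Both approaches produce the same list.
print(squares_loop == squares_pythonic)  # True
```

Don't worry if the syntax is unfamiliar; the point is only that the second version states *what* is being computed with less ceremony.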

### What's the deal with whitespace?

A distinctive feature of Python is that whitespace is significant: indentation marks where a block of code begins and ends. Many other programming languages use braces or `begin`/`end` keywords for this. For example, in JavaScript, you write a `for` loop like this:

```
var count;
for (count = 0; count < 10; count++) {
    console.log(count);
    console.log("<br />");
}
```

The curly braces here define the code executed in each iteration of the for loop. Similarly, in Ruby, you write a `for` loop like this:

```
for count in 0..9
   puts "#{count}"
end
```

In this snippet, the code executed in each iteration of the `for` loop is whatever comes between the first line and the `end` keyword.

In Python, `for` loops look a bit different:

In [13]:
print('Entering the for loop:\n')
for count in range(10):
    print(count)
    print('Still in the for loop.')

print("\nNow I'm done with the for loop.")

Entering the for loop:

0
Still in the for loop.
1
Still in the for loop.
2
Still in the for loop.
3
Still in the for loop.
4
Still in the for loop.
5
Still in the for loop.
6
Still in the for loop.
7
Still in the for loop.
8
Still in the for loop.
9
Still in the for loop.

Now I'm done with the for loop.


Note that there is no explicit symbol or keyword that defines the code executed during each iteration - it's the indentation that defines the block. When you define a function or class, or write a control structure like a `for` loop or `if` statement, you must indent the lines that follow (4 spaces is customary). Each subsequent line at that same level of indentation is considered part of the block. You only leave the block when you return to the previous level of indentation.
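The same rule applies when blocks are nested inside one another. Here is a minimal sketch, using a hypothetical function named `describe_numbers` that is not part of the course material, showing a function that contains a `for` loop that in turn contains an `if` statement. Each step deeper adds one more level of indentation, and each step back out removes one.

```python
def describe_numbers(limit):
    """Label each integer below `limit` as even or odd."""
    labels = []                       # inside the function body (1 level)
    for number in range(limit):
        if number % 2 == 0:           # inside the for loop (2 levels)
            labels.append('even')     # inside the if block (3 levels)
        else:
            labels.append('odd')
    return labels                     # back at 1 level: the loop has ended


# Back at the left margin: the function definition has ended.
print(describe_numbers(4))  # ['even', 'odd', 'even', 'odd']
```

If you try indenting the `return` line by four more spaces, it becomes part of the `for` loop and the function returns after the first iteration. Mistakes like that are a common source of bugs when you're starting out, so it's worth paying attention to how your editor indents.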