# Intro to Python
This notebook will serve as a brief Python primer arranged in Q&A fashion.

**Current Questions:**
- How do I get Python on my computer?
- What am I looking at now? What's this coding environment?
- I'm here to learn Python and come from SAS. What should I know?
    - Objects
    - Imports
- Where can I find out about the text conventions you're using?
- What's that stuff on the bottom of each page?


## How do I get Python on my computer?
Python comes preinstalled on most Linux and Mac computers, but that installation only contains the standard library. Data analysis in Python requires several packages that have to be installed separately. The folks over at Continuum Analytics maintain the Anaconda Python distribution, which bundles all the necessary packages (and more) for data analysis.

<div class="resources">
Download Anaconda Python Here: http://continuum.io/downloads
<br><br>
Instructions for installation are here: http://docs.continuum.io/anaconda/install
</div>

After installation, you can start the Notebook environment from the Launcher that should now be in your Applications folder.

## What am I looking at now? What's this coding environment?
You're looking at an [IPython Notebook](http://ipython.org/notebook.html) (you may also see it called a [Jupyter](https://jupyter.org/) Notebook). The notebook environment allows us to combine code execution, rich text, mathematics, plots and rich media all in one place. Everything you're reading now is rich text formatted using Markdown. But if I want to switch over to Python, I can just add a code cell below:

In [1]:
# Comments in Python start with the pound symbol.
# Let's do some math!
# Remember PEMDAS?

-5 + 4 * 8 / 4

3.0

We'll usually use code cells to display output from code or plots, but we can also add media, HTML, and anything else we want.

In [2]:
from IPython.display import HTML
HTML('<iframe width="420" height="315" src="https://www.youtube.com/embed/aU4pyiB-kq0" frameborder="0" allowfullscreen></iframe>')

<div class="resources">
For the full docs on IPython Notebook: http://ipython.org/ipython-doc/stable/notebook/index.html
<br><br>
Tutorials: http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Notebook/Index.ipynb
</div>

## I'm here to learn Python and come from SAS. What should I know?
There are a few things that will require you to *think* about programming differently. 
### Objects
One twist is that we'll be doing a lot of object-oriented programming in Python. Coming from SAS, the only thing I thought of as an *object* in the traditional sense was the dataset. I could read it in, apply procedures to it, and export it to a file (another object).

But what if the procedures I applied to the dataset were also objects that I could interact with? For example, what if instead of running `PROC LOGISTIC`, I interacted with a *Logistic Regression Object*? I can call functions (known as methods) on this object to do things like fit it to data, predict new cases, or see its attributes (its coefficients, for example).

This will become more clear after going through some examples, but hopefully it's at least got you thinking. 

**Toy Object Example**  
The standard way to instantiate (create) an object is like this:

`aNewCar = Car()`

We now have an object called `aNewCar` that's an instance of the `Car` class. Let's assume that we can do something with this car like drive it, and our object has a `drive()` method. The standard way to call a method on an object is:

`aNewCar.drive(mph=60)`

I've also supplied an argument for the method, `mph=60`, telling it how fast I want it to drive. Not all methods have arguments, so you may often see:

`aNewCar.stop()`

Objects also have attributes, or data. Displaying an attribute uses a similar syntax, but attributes don't take parentheses. Assuming our car has a `wheels` attribute, we could see its value by doing:

`aNewCar.wheels`

And this would return the number of wheels on our car (hopefully 4).
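
To make the toy example concrete, here is a minimal sketch of what such a class might look like. Only the names `Car`, `drive()`, `stop()`, and `wheels` come from the text above; the `speed` attribute and the returned messages are assumptions added purely for illustration.

```python
class Car(object):
    """Hypothetical toy class used only to illustrate the syntax above."""

    def __init__(self):
        # Attributes: data stored on each individual car object.
        self.wheels = 4
        self.speed = 0  # assumed attribute, not mentioned in the text

    def drive(self, mph):
        # A method that takes an argument.
        self.speed = mph
        return "Driving at {0} mph".format(mph)

    def stop(self):
        # A method that takes no arguments.
        self.speed = 0
        return "Stopped"


aNewCar = Car()                   # instantiate (create) the object
message = aNewCar.drive(mph=60)   # call a method with an argument
wheel_count = aNewCar.wheels      # read an attribute: no parentheses
aNewCar.stop()                    # call a method without arguments
```

Note that `Car` (the class) is the blueprint, while `aNewCar` is one particular car built from it. Creating a second object with `Car()` would give a separate car with its own `speed`.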



<div class="resources">
For a good general tutorial of classes: http://interactivepython.org/runestone/static/thinkcspy/Classes/classesintro.html
</div>

### Imports
This is also one of the more challenging aspects of transitioning to Python: importing modules and packages. In code you'll typically see three main ways of importing a module. 

**1. Importing an entire module**  
You may see an import like the following:

`import StringIO`

This will import the entire `StringIO` module and create a reference to it. The documentation is a good place to see what classes and functions are available within that module. If we want to create an object of the `StringIO` class (confusingly from the `StringIO` module) we would do this after importing:

`myString = StringIO.StringIO()`

This looks like calling a method on an object, but it's a bit different: we reach into the module to get the class, then call the class to create a new object. `myString` is now that object, and we can call methods on it.

**2. Importing an entire module and changing the reference name**  
You'll see this with the `pandas` package each time we import it. The statement will say:

`import pandas as pd`

This does the same as [1] but instead of referencing everything from the module by having to write `pandas.object`, we can now write `pd.object`. You may think that just makes us lazy, but when you type it a lot it saves a bunch of time. Plus everyone does it.

**3. Importing specific objects from modules**  
You'll see this when we start using the scikit-learn library. We almost never want to import the entire thing when we're only going to use single classes from it. For example, if we want to do a logistic regression, we'd type:

`from sklearn.linear_model import LogisticRegression`

There are a few things that may be confusing about this. First, `sklearn` is itself organized into modules, and the `LogisticRegression` class falls under the `linear_model` module. You'll also notice there are no parentheses here since we're just importing, not creating an object.

Most importantly, we've imported the object itself directly into our namespace. Thus, we **do not** have to type something like `logit = sklearn.linear_model.LogisticRegression()`, but instead:

`logit = LogisticRegression()`

By comparison, we could have done the same thing with our StringIO object from example one and done the following:

`from StringIO import StringIO`  
`myString = StringIO()`

But this gets confusing and isn't really necessary since `StringIO` is not that large of a module.
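
As a side-by-side comparison, the sketch below applies all three import styles to a single module. It uses `collections` from the standard library as a stand-in (an assumption made here so that one snippet can show every style at once), and the variable names are purely illustrative.

```python
# Style 1: import the entire module.
import collections

# Style 2: import the entire module under a shorter reference name.
import collections as col

# Style 3: import one specific class directly into our namespace.
from collections import OrderedDict

# Each style reaches the same class; only the name we type differs.
first = collections.OrderedDict()
second = col.OrderedDict()
third = OrderedDict()

# All three names point at the very same class object.
same_class = collections.OrderedDict is col.OrderedDict is OrderedDict
```

Whichever style is used, the resulting objects behave identically. The choice mostly affects how much you type and how obvious it is where a name came from.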



<div class="resources">
For more on importing: https://grahamwideman.wikispaces.com/Python-+import+statement
</div>

<div class='pynote'>
<b>Python Note</b>: It is customary to import all of the functions, classes, and modules you will need at the beginning of a program. Sometimes when doing analysis you won't realize you need a module until halfway through -- that's okay, go back to the top and add the import statement to your import cell.
<br><br>
With that said, <b>I will be importing modules as needed in the code</b> to increase flow and modularity of the examples. Additionally, I hope this reduces confusion among newcomers who might open the notebook file and get immediately intimidated by several lines of code with various import formats.
</div>

## Woah, that green box is cool! Where can I find out about the text conventions you're using?
Right over [here](Conventions.ipynb).

## What's that stuff on the bottom of each page?
You'll usually see two things:
1. A printout of the Python version number, as well as the version numbers of any packages I use. This will be helpful if something doesn't work because of version differences.
2. A function that loads the CSS that styles each page. Looks good, right?

---

In [36]:
# Housekeeping
!python -V

Python 2.7.10 :: Anaconda 2.2.0 (x86_64)


In [3]:
# This cell imports the styling for this notebook. You can safely ignore it.

from IPython.display import HTML

def css_styling():
    # Read the stylesheet, closing the file once we're done with it.
    with open("../_styles/custom.css", "r") as css_file:
        styles = css_file.read()
    return HTML(styles)

css_styling()