# Brief Python introduction

## About Python
Python is an interpreted, object-oriented language with a rich set of built-in data structures and a simple syntax. In this tutorial, we will cover some basics to get you up and running with Python. The majority of the course will focus on scientific computing in Python (NumPy, SciPy, Matplotlib).

Please note that we will be using Python 3.6.X. Support for Python 2 is ending soon, so it's not worth relearning the idiosyncratic differences between the two.

## Installation

We are assuming previous Python knowledge, so you should already have a working installation. However, some installation methods are preferred over others. The most convenient by far is to install the cross-platform Anaconda distribution (https://www.anaconda.com/what-is-anaconda/). Personally, I prefer Miniconda to save some disk space.

I recommend that Windows users in data science use Anaconda/Miniconda. There are many intricacies with setting paths in Windows that the Anaconda installation automatically takes care of.

Mac users should take care not to use the default system Python when developing. You can install newer versions of Python using Anaconda or the Homebrew utility (https://brew.sh/). More advanced users can use Homebrew to install `pyenv` (https://github.com/pyenv/pyenv), which also takes care of version management.

Linux users can also install Anaconda or Miniconda, or get `python3` through the system package manager (for example `apt-get` on Debian and Ubuntu). I also recommend pyenv or Anaconda to easily manage Python versions.

## Environments

### Python console

Assuming that Python is installed on your computer, you can invoke the Python interpreter by running `python` from the command line. There it is possible to issue Python commands one at a time and see their output immediately. For example, you could do something like the following:
    
    >>> 1 + 2
    3
    >>>
    
### Python scripts

You can also write some lines of Python code in a text file, say `source.py`, and run that *script* by running `python source.py` from the command line. Some of the homework assignments will be in this format.
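
As a minimal sketch of what such a script might contain (the function `greet` and its contents are made up for illustration, not taken from the homework), `source.py` could look like this:

```python
# source.py: a tiny example script (contents are illustrative only)

def greet(name):
    """Return a greeting for the given name."""
    return "Hello, " + name + "!"


# This guard runs the code below only when the file is executed
# directly (python source.py), not when it is imported as a module.
if __name__ == "__main__":
    print(greet("world"))
```

Running `python source.py` should then print the greeting once.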

### Jupyter notebooks

In our labs, instead of the command line interpreter invoked by `python`, we will be using a versatile interactive shell for running Python commands and scripts known as Jupyter Notebook (previously known as IPython Notebook). It works much like the command line `python` interpreter in that it reads code (written into *cells*) and prints output as soon as the commands are executed, but it also allows us to embed graphics, write Markdown documentation (like this!), go back to already executed code and re-run it and so on.

For now, the important thing to know is that Jupyter Notebook is started from the command line with `jupyter notebook`, that commands are added to cells which can be executed with *SHIFT+ENTER*, and that you can add new cells anywhere within each notebook (press *ESC+a* or *ESC+b* to add a cell above or below the current cell, respectively).

All of the exercises in this course will be written in Jupyter Notebook, so it's very important that it is installed and working properly. It can be installed using either pip or conda:

`pip install -U jupyter`

`conda install jupyter`

### Virtualenvs

While we don't have time to cover virtualenvs (virtual environments) in this course, it is good practice to use them to keep each project's dependencies well organized. virtualenv started as a third-party project; since Python 3.3, similar functionality ships with the standard library as the `venv` module. See https://docs.python.org/3/library/venv.html for more information. Essentially, activating a virtualenv adjusts your environment variables (notably `PATH`) so that commands such as `python` and `pip` point to a specific Python installation. Note that Anaconda and pyenv have their own methods for handling virtualenvs, so you should use their tools if you're on their platform.

Anaconda: `conda create --name VIRTUALENV_NAME`

Pyenv (requires the `pyenv-virtualenv` plugin): `pyenv virtualenv VIRTUALENV_NAME`

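Virtualenvs can then be activated using their `activate` script. Anaconda and pyenv handle it in the following way:

Anaconda: `conda activate VIRTUALENV_NAME` (older versions of conda use `activate VIRTUALENV_NAME` on Windows and `source activate VIRTUALENV_NAME` elsewhere)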

pyenv: `pyenv activate VIRTUALENV_NAME`

Finally, to exit your virtual environment, run `deactivate` on the command line (`conda deactivate` for Anaconda, `pyenv deactivate` for pyenv).
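
If you are unsure which interpreter is currently active, one way to check from within Python is sketched below using the standard `sys` module. The comparison of `sys.prefix` and `sys.base_prefix` is an assumption that only holds for environments created with `venv`; conda environments are separate installations, so the two values are the same there.

```python
import sys

# Path of the Python interpreter that is running this code.
print(sys.executable)

# In a venv, sys.prefix is the environment directory and
# sys.base_prefix is the installation it was created from.
in_venv = sys.prefix != sys.base_prefix
print("Running inside a venv:", in_venv)
```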


There are some tricks to getting the jupyter notebook to cleanly work with separate virtualenvs (you can always install it in each virtualenv, but that kind of defeats the purpose of a kernel). 

With your virtualenv activated (and the `ipykernel` package installed in it), you can register it as a notebook kernel by running

`python -m ipykernel install --user --name=VIRTUALENV_NAME`

Contact me or look at the specific virtualenv pages for more information on setup (it's a very personal process!)

## Operators

Basic arithmetic and logical operations work as you may expect. Try writing a simple arithmetic expression, as you would on a calculator.

In [1]:
# write a few basic operations here
result = None
print(result)

None


In [2]:
a = 5 + 3
print(a)

8
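
A few more arithmetic operators are worth trying. The short sketch below uses arbitrary numbers to show how division behaves in Python 3:

```python
print(7 / 2)    # true division always returns a float: 3.5
print(7 // 2)   # floor division: 3
print(7 % 2)    # remainder: 1
print(2 ** 10)  # exponentiation: 1024
```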


Instead of symbols like `&&` and `||`, Python uses keywords like `and` and `or` to manipulate boolean expressions: 

In [3]:
a = True
b = False

In [4]:
print(a or b)
print(a and b)

True
False
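
Negation uses the keyword `not`, and comparisons produce booleans that combine in the same way. A small sketch with arbitrary values:

```python
a = True
b = False

print(not a)            # False
print(a and not b)      # True
print(1 < 2 and 2 < 3)  # True
print(1 < 2 < 3)        # comparisons can be chained: True
print(3 == 3.0)         # equality compares values, not types: True
```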


Variables do not have to be declared, and a name can later be rebound to a value of a different type, so:

In [5]:
welcome_message = 'First variable holding a string'
print(welcome_message.lower())

first variable holding a string


In [6]:
a = 5
a = 'hi'
print(a)

hi
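
To see what happens to the type when a name is reassigned, the built-in `type` function can be used, as in this small sketch:

```python
a = 5
print(type(a))   # <class 'int'>

a = 'hi'
print(type(a))   # <class 'str'>
```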


## Data Structures

### Lists

A list is an ordered, mutable sequence of items that can be accessed by index, like an array:

In [7]:
list1 = []
print(list1)

[]


There is nothing stopping you from mixing types in lists, and items in lists can be other lists, for instance:

In [8]:
list1 = [1, 2, 3, 3]
list2 = [list1, list1]
print(list2)

[[1, 2, 3, 3], [1, 2, 3, 3]]
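
The sketch below illustrates mixed types, indexing, and slicing, using made-up values. It also shows a point that is easy to miss in the example above: a nested list stores a reference to the inner list, not a copy. The names `inner` and `outer` are chosen for illustration only.

```python
mixed = [1, 'two', 3.0, [4, 5]]  # different types in one list
print(mixed[0])     # first item: 1
print(mixed[-1])    # last item: [4, 5]
print(mixed[1:3])   # items at index 1 and 2: ['two', 3.0]

# outer stores references to inner, not copies, so a change to
# inner is visible through both entries of outer.
inner = [1, 2]
outer = [inner, inner]
inner.append(3)
print(outer)        # [[1, 2, 3], [1, 2, 3]]
```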


### Tuples

Tuples are similar to lists, but they are *immutable*. That is, once a tuple is created, you cannot change its contents. (We won't really encounter tuples in this course.)
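
A minimal sketch of what immutability means in practice, using a made-up tuple named `point`:

```python
point = (3, 4)
print(point[0])    # indexing works as with lists: 3

try:
    point[0] = 10  # tuples do not support item assignment
except TypeError as err:
    print("Cannot modify a tuple:", err)

x, y = point       # tuples can be unpacked into variables
print(x + y)       # 7
```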

### Sets

Sets are unordered collections of unique items. That is, a set cannot contain duplicates!

In [9]:
set1 = set(list1)
print(set1)

{1, 2, 3}
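
Sets also support fast membership tests and the usual set algebra. The following sketch uses two made-up sets:

```python
evens = {0, 2, 4, 6}
small = {0, 1, 2, 3}

print(2 in evens)      # membership test: True
print(evens | small)   # union of both sets
print(evens & small)   # intersection: items in both sets
print(evens - small)   # difference: items only in evens

evens.add(2)           # adding an existing item changes nothing
print(len(evens))      # 4
```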


### Dictionaries

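Finally, we have dictionaries, which are collections of key-value pairs. Keys must be hashable, which in practice means immutable values such as strings, numbers, or tuples. Do not rely on the order of the items: insertion order is only guaranteed from Python 3.7 onward.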

In [10]:
dict1 = {'one': 1, 'two': 2}
print(dict1['one'])

1
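
Some common dictionary operations are sketched below with made-up data (the `prices` and `grid` dictionaries are purely illustrative):

```python
prices = {'apple': 3, 'pear': 5}
prices['plum'] = 2               # add a new key-value pair
prices['apple'] = 4              # overwrite an existing value

print(prices.get('kiwi'))        # get() returns None for a missing key
print('pear' in prices)          # membership tests check keys: True

for name, price in prices.items():   # iterate over key-value pairs
    print(name, price)

grid = {(0, 0): 'origin'}        # a tuple used as a key
print(grid[(0, 0)])              # origin
```

Accessing a missing key with square brackets, as in `prices['kiwi']`, raises a `KeyError` instead of returning `None`.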


## Control Statements

Python has a simple indentation-based syntax and powerful iteration tools for looping and program control.

Note that indentation matters. There are no braces like in C or Java, so you must indent statements to be on the same level (using some number of spaces or, less frequently, tabs per level).

In [11]:
x = []
for i in range(10):
    x.append(i)
    print(i)
print(x)

0
1
2
3
4
5
6
7
8
9
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
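
Conditionals and `while` loops follow the same indentation rules. The sketch below, with arbitrary values, combines `if`/`elif`/`else`, `while`, and the built-in `enumerate`:

```python
for i in range(5):
    if i % 2 == 0:
        print(i, "is even")
    elif i == 3:
        print(i, "is three")
    else:
        print(i, "is odd")

# while repeats as long as its condition is true
n = 3
while n > 0:
    print(n)
    n -= 1

# enumerate yields (index, item) pairs
for index, letter in enumerate(['a', 'b', 'c']):
    print(index, letter)
```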


### Functions

Functions are defined with the `def` keyword, and their body is indented just as with control statements.

In [12]:
def counter(x):
    out = []
    for i in range(x):
        out.append(i)
    return out

In [13]:
print(counter(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
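
Functions can also take default and keyword arguments. The following is a sketch of a variation on `counter`; the name `count_range` and its parameters are hypothetical and chosen only for this example:

```python
def count_range(stop, start=0, step=1):
    """Return the integers from start up to (not including) stop."""
    return list(range(start, stop, step))


print(count_range(10))            # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(count_range(10, start=5))   # [5, 6, 7, 8, 9]
print(count_range(10, step=3))    # [0, 3, 6, 9]
```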


## Classes and Modules

We won't use object-oriented programming much, but it is easy and straightforward to declare classes in Python. We will explore these further when discussing scikit-learn.
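
As a minimal sketch of a class declaration (the class `Tally` is invented for this example and is not part of any library):

```python
class Tally:
    """A tiny class that counts how many times it was incremented."""

    def __init__(self, start=0):
        # __init__ runs when a new object is created
        self.value = start

    def increment(self):
        # methods receive the object itself as their first argument
        self.value += 1
        return self.value


t = Tally()
t.increment()
t.increment()
print(t.value)   # 2
```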

We will be using modules much more often. While Python is extremely flexible, it is inefficient to keep reinventing the wheel for common functionality. Python makes it very easy to import and publish modules that provide it. In fact, NumPy and SciPy (introduced later), which add MATLAB-like functionality to Python, are simply giant modules.

You can import a whole module, or only specific parts of it.

In [14]:
import time
time.time()

1517260491.7674265

In [15]:
from time import time
time()

1517260491.7818406
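
Two further import forms are sketched below using the standard `math` module: importing under an alias with `as`, and importing several names at once. The alias `m` is arbitrary.

```python
import math as m             # refer to the module by a shorter alias
from math import pi, sqrt    # import several names at once

print(m.floor(2.7))   # 2
print(sqrt(16))       # 4.0
print(pi)
```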

The simplest module you can create is one in which you save some functions to a text file, then import them as needed. A detailed description is available here: https://docs.python.org/3/tutorial/modules.html
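
The sketch below walks through that idea end to end. The module name `mytools` and the function `double` are hypothetical, and the file is written to a temporary directory only so that the example is self-contained; in practice you would create the file in a text editor next to your notebook or script.

```python
import importlib
import os
import sys
import tempfile

# Write a one-function module to a temporary directory.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "mytools.py"), "w") as f:
    f.write("def double(x):\n    return 2 * x\n")

# Python looks for modules in the directories listed in sys.path.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import mytools

print(mytools.double(21))   # 42
```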
