# Slack
You need to fill in the [form](https://docs.google.com/forms/d/1OmT8ODmVBNgl0eOmZT51JMTHUSA_eNrHTcDRnmNDMgQ) to get invited.

Slack url: https://rt-portal.slack.com/

# Online tutorials
https://www.codecademy.com/learn/python

http://pythontutor.ru/lessons/inout_and_arithmetic_operations/

# Python Cheat Sheet
http://www.datasciencefree.com/python.pdf

# Appendix: Python Language Essentials

In [None]:
from __future__ import division
#from numpy.random import randn
import numpy as np
import os
import matplotlib.pyplot as plt
np.random.seed(12345)
plt.rc('figure', figsize=(10, 6))
from pandas import *
import pandas
np.set_printoptions(precision=4)


## The Python interpreter

In [None]:
conda create -n py27 -y python=2.7 jupyter 
source activate py27 
conda install nb_conda

In [None]:
import sys
print (sys.version)
print (sys.version_info)

In [None]:
!python

The >>> you see is the prompt where you’ll type expressions. To exit the Python
interpreter and return to the command prompt, you can either type exit() or press Ctrl-D.

```
$ python
Python 2.7.2 (default, Oct  4 2011, 20:06:09)
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 5
>>> print a
5
```

Running Python programs is as simple as calling python with a .py file as its first
argument. Suppose we had created hello_world.py with these contents:

print 'Hello world'

In [None]:
%%writefile hello_world.py
print 'Hello world'

This can be run from the terminal as python hello_world.py, or from within IPython using the %run magic:

In [None]:
%run hello_world.py

## The Basics

### Language Semantics

#### Indentation, not braces

Python uses whitespace (tabs or spaces) to structure code instead of using braces as in
many other languages like R, C++, Java, and Perl. Take the for loop from a
quicksort algorithm:

In [None]:
for x in array:
    if x < pivot:
        less.append(x)
    else:
        greater.append(x)

A colon denotes the start of an indented code block after which all of the code must be
indented by the same amount until the end of the block. In another language, you might
instead have something like:

In [None]:
for x in array {
        if x < pivot {
            less.append(x)
        } else {
            greater.append(x)
        }
    }

One major reason that whitespace matters is that it results in most Python code looking
cosmetically similar, which means less cognitive dissonance when you read a piece of
code that you didn’t write yourself (or wrote in a hurry a year ago!). In a language
without significant whitespace, you might stumble on some differently formatted code
like:

In [None]:
for x in array
    {
      if x < pivot
      {
        less.append(x)
      }
      else
      {
        greater.append(x)
      }
    }

As you can see by now, Python statements also do not need to be terminated by
semicolons. Semicolons can be used, however, to separate multiple statements on a single
line:

In [None]:
a = 5; b = 6; c = 7

#### Everything is an object

An important characteristic of the Python language is the consistency of its object
model. Every number, string, data structure, function, class, module, and so on exists
in the Python interpreter in its own “box” which is referred to as a Python object. Each
object has an associated type (for example, string or function) and internal data. In
practice this makes the language very flexible, as even functions can be treated just like
any other object.
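
To make the point about functions being ordinary objects concrete, here is a minimal sketch. The names add_one, apply_twice, and operations are invented for illustration; the syntax is chosen to run on Python 3 as well as 2.7.

```python
def add_one(x):
    return x + 1

def apply_twice(func, value):
    # func is an ordinary object that happens to be callable
    return func(func(value))

# A function can be stored in a data structure like any other value
operations = {'increment': add_one}

print(type(add_one).__name__)        # the function object has a type of its own
print(apply_twice(add_one, 5))       # passing a function as an argument
print(operations['increment'](10))   # looking a function up and calling it
```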

#### Comments

Any text preceded by the hash mark (pound sign) # is ignored by the Python interpreter.
This is often used to add comments to code. At times you may also want to exclude
certain blocks of code without deleting them. An easy solution is to comment out the
code:

In [None]:
results = []
for line in file_handle:
    # keep the empty lines for now
    # if len(line) == 0:
    #   continue
    results.append(line.replace('foo', 'bar'))

#### Function and object method calls

Functions are called using parentheses and passing zero or more arguments, optionally
assigning the returned value to a variable:

In [None]:
result = f(x, y, z)
g()

Almost every object in Python has attached functions, known as methods, that have
access to the object’s internal contents. They can be called using the syntax:

In [None]:
obj.some_method(x, y, z)

Functions can take both positional and keyword arguments:

In [None]:
result = f(a, b, c, d=5, e='foo')
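
Since f is not defined in this section, the sketch below uses a hypothetical function, describe, with the same signature shape to show how positional and keyword arguments interact. It simply returns its arguments so the binding is visible.

```python
def describe(a, b, c, d=5, e='foo'):
    # a, b, c are required positional arguments; d and e fall back to defaults
    return (a, b, c, d, e)

r1 = describe(1, 2, 3)                 # defaults used for d and e
r2 = describe(1, 2, 3, e='bar')        # override only e
r3 = describe(1, 2, 3, e='bar', d=10)  # keyword arguments may come in any order

print(r1)
print(r2)
print(r3)
```

Keyword arguments must follow the positional ones in a call, but their order among themselves does not matter.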

#### Variables and pass-by-reference

When assigning a variable (or name) in Python, you are creating a reference to the object
on the right hand side of the equals sign. In practical terms, consider a list of integers:

In [None]:
a = [1, 2, 3]

Suppose we assign a to a new variable b :

In [None]:
b = a

In some languages, this assignment would cause the data [1, 2, 3] to be copied. In
Python, a and b actually now refer to the same object, the original list [1, 2, 3] (see
Figure A-1 for a mockup). You can prove this to yourself by appending an element to
a and then examining b :

In [None]:
a.append(4)
b

<img src="imgs/img1.png" alt="Drawing" style="width: 400px;" >
Figure A-1. Two references for the same object

When you pass objects as arguments to a function, you are only passing references; no
copying occurs. Thus, Python is said to pass by reference, whereas some other languages
support both pass by value (creating copies) and pass by reference. This means that a
function can mutate the internals of its arguments. Suppose we had the following
function:

In [None]:
def append_element(some_list, element):
    some_list.append(element)

Then given what’s been said, this should not come as a surprise:

In [None]:
data = [1, 2, 3]

append_element(data, 4)

data  # now [1, 2, 3, 4]
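
One subtlety is worth a short sketch: mutating an argument is visible to the caller, but assigning a new object to the parameter name inside the function is not. The function names below are hypothetical and exist only to contrast the two cases.

```python
def append_in_place(some_list, element):
    # Mutates the object the caller passed in
    some_list.append(element)

def rebind_locally(some_list, element):
    # Builds a new list and rebinds the local name only;
    # the caller's list is left untouched
    some_list = some_list + [element]
    return some_list

data = [1, 2, 3]
append_in_place(data, 4)
new_data = rebind_locally(data, 5)

print(data)      # [1, 2, 3, 4]
print(new_data)  # [1, 2, 3, 4, 5]
```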

#### Dynamic references, strong types

In contrast with many compiled languages, such as Java and C++, object references in
Python have no type associated with them. There is no problem with the following:

In [None]:
a = 5
type(a)
a = 'foo'
type(a)

Variables are names for objects within a particular namespace; the type information is
stored in the object itself. Some observers might hastily conclude that Python is not a
“typed language”. This is not true; consider this example:

In [None]:
'5' + 5

In some languages, such as Visual Basic, the string '5' might get implicitly converted
(or cast) to an integer, thus yielding 10. Yet in other languages, such as JavaScript,
the integer 5 might be cast to a string, yielding the concatenated string '55' . In this
regard Python is considered a strongly typed language, which means that every object
has a specific type (or class), and implicit conversions will occur only in certain obvious
circumstances, such as the following:

In [None]:
a = 4.5
b = 2
# String formatting, to be visited later
print 'a is %s, b is %s' % (type(a), type(b))
a / b

Knowing the type of an object is important, and it’s useful to be able to write functions
that can handle many different kinds of input. You can check that an object is an
instance of a particular type using the isinstance function:

In [None]:
a = 5
isinstance(a, int)

isinstance can accept a tuple of types if you want to check that an object’s type is
among those present in the tuple:

In [None]:
a = 5; b = 4.5
isinstance(a, (int, float))
isinstance(b, (int, float))

#### Attributes and methods

Objects in Python typically have both attributes, other Python objects stored “inside”
the object, and methods, functions associated with an object which can have access to
the object’s internal data. Both of them are accessed via the syntax obj.attribute_name :

In [None]:
a = 'foo'

In [None]:
a. #click <Tab>

```
a.capitalize  a.format      a.isupper     a.rindex      a.strip
a.center      a.index       a.join        a.rjust       a.swapcase
a.count       a.isalnum     a.ljust       a.rpartition  a.title
a.decode      a.isalpha     a.lower       a.rsplit      a.translate
a.encode      a.isdigit     a.lstrip      a.rstrip      a.upper
a.endswith    a.islower     a.partition   a.split       a.zfill
a.expandtabs  a.isspace     a.replace     a.splitlines
a.find        a.istitle     a.rfind       a.startswith
```

Attributes and methods can also be accessed by name using the getattr function:

In [None]:
getattr(a, 'split')
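
As a small illustration, getattr returns the bound method itself, which can then be called, and it accepts an optional third argument that is returned when the attribute does not exist. The attribute name 'no_such_attribute' is made up for the example.

```python
a = 'foo'

# getattr(a, 'upper') is the same object you get from a.upper
upper_method = getattr(a, 'upper')
print(upper_method())  # FOO

# With a default, a missing attribute does not raise AttributeError
missing = getattr(a, 'no_such_attribute', None)
print(missing is None)
```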

#### "Duck" typing

Often you may not care about the type of an object but rather only whether it has certain
methods or behavior. For example, you can verify that an object is iterable if it
implements the iterator protocol. For many objects, this means it has an __iter__ “magic
method”, though an alternative and better way to check is to try using the iter function:

In [None]:
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError: # not iterable
        return False

This function would return True for strings as well as most Python collection types:

In [None]:
print isiterable('a string')
print isiterable([1, 2, 3])
print isiterable(5)

A place where I use this functionality all the time is to write functions that can accept
multiple kinds of input. A common case is writing a function that can accept any kind
of sequence (list, tuple, ndarray) or even an iterator. You can first check if the object is
a list (or a NumPy array) and, if it is not, convert it to be one:
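
A minimal sketch of that check-and-convert pattern follows. It repeats the isiterable function from above so that it runs on its own; the wrapper name ensure_list is an assumption made for illustration.

```python
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:  # not iterable
        return False

def ensure_list(x):
    # Convert any non-list iterable to a list; leave everything else alone
    if not isinstance(x, list) and isiterable(x):
        x = list(x)
    return x

print(ensure_list((1, 2, 3)))  # a tuple becomes a list
print(ensure_list('ab'))       # a string becomes a list of characters
print(ensure_list(5))          # a non-iterable is returned unchanged
```

Note that strings are iterable, so they are split into characters; a real function may want to treat them as a special case.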

#### Imports

In Python a module is simply a .py file containing function and variable definitions
along with such things imported from other .py files. Suppose that we had the following
module:

In [None]:
# some_module.py
PI = 3.14159

def f(x):
    return x + 2

def g(a, b):
    return a + b

If we wanted to access the variables and functions defined in some_module.py , from
another file in the same directory we could do:

In [None]:
import some_module
result = some_module.f(5)
pi = some_module.PI

Or equivalently:

In [None]:
from some_module import f, g, PI
result = g(5, PI)

By using the as keyword you can give imports different variable names:

In [None]:
import some_module as sm
from some_module import PI as pi, g as gf

r1 = sm.f(pi)
r2 = gf(6, pi)
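
Because some_module.py is a hypothetical file, the cells above only run if you create it first. The same three import forms can be tried immediately with the standard library's math module; the aliases m, PI_VALUE, and square_root are arbitrary names chosen for this sketch.

```python
import math                      # access names through the module
from math import pi, sqrt        # import selected names directly
import math as m                 # give the module a shorter alias
from math import pi as PI_VALUE, sqrt as square_root  # alias individual names

r1 = math.sqrt(16)
r2 = sqrt(16)
r3 = m.floor(pi)
r4 = square_root(PI_VALUE * PI_VALUE)

print(r1)
print(r2)
print(r3)
print(r4)
```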

#### Binary operators and comparisons

Most of the binary math operations and comparisons are as you might expect:

In [None]:
print 5 - 7
print 12 + 21.5
print 5 <= 2

See Table A-1 for all of the available binary operators.

| Operation | Description |
|---|---|
| `a + b` | Add a and b |
| `a - b` | Subtract b from a |
| `a * b` | Multiply a by b |
| `a / b` | Divide a by b |
| `a // b` | Floor-divide a by b, dropping any fractional remainder |
| `a ** b` | Raise a to the b power |
| `a & b` | True if both a and b are True. For integers, take the bitwise AND. |
| `a \| b` | True if either a or b is True. For integers, take the bitwise OR. |
| `a ^ b` | For booleans, True if a or b is True, but not both. For integers, take the bitwise EXCLUSIVE-OR. |
| `a == b` | True if a equals b |
| `a != b` | True if a is not equal to b |
| `a <= b`, `a < b` | True if a is less than or equal to (less than) b |
| `a > b`, `a >= b` | True if a is greater than (greater than or equal to) b |
| `a is b` | True if a and b reference the same Python object |
| `a is not b` | True if a and b reference different Python objects |

To check if two references refer to the same object, use the is keyword. is not is also
perfectly valid if you want to check that two objects are not the same:

In [None]:
a = [1, 2, 3]
b = a
# Note, the list function always creates a new list
c = list(a)
print a is b
print a is not c

Note this is not the same thing as comparing with == , because in this case we have:

In [None]:
print a == c

A very common use of is and is not is to check if a variable is None , since there is only
one instance of None :

In [None]:
a = None
a is None

#### Strictness versus laziness

When using any programming language, it’s important to understand when expressions
are evaluated. Consider the simple expression:

In [None]:
a = b = c = 5
d = a + b * c

In Python, once these statements are evaluated, the calculation is immediately (or
strictly) carried out, setting the value of d to 30. In another programming paradigm,
such as in a pure functional programming language like Haskell, the value of d might
not be evaluated until it is actually used elsewhere. The idea of deferring computations
in this way is commonly known as lazy evaluation. Python, on the other hand, is a very
strict (or eager) language. Nearly all of the time, computations and expressions are
evaluated immediately. Even in the above simple expression, the result of b * c is
computed as a separate step before adding it to a .

There are Python techniques, especially using iterators and generators, which can be
used to achieve laziness. When performing very expensive computations which are only
necessary some of the time, this can be an important technique in data-intensive
applications.
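
The difference between eager and lazy evaluation can be observed directly by counting how often a function is called. In this sketch, expensive_square and the calls list are hypothetical stand-ins for a costly computation; a list comprehension does all the work up front, while a generator expression does none until a value is requested.

```python
calls = []

def expensive_square(x):
    calls.append(x)  # record each call so we can see when work happens
    return x * x

# Eager: the list comprehension computes every element immediately
eager = [expensive_square(x) for x in range(3)]
calls_after_eager = len(calls)

# Lazy: creating the generator expression calls nothing yet
lazy = (expensive_square(x) for x in range(3, 6))
calls_after_creating_lazy = len(calls)

# Only the first element is computed when we ask for it
first = next(lazy)
calls_after_first = len(calls)

print(calls_after_eager)          # 3
print(calls_after_creating_lazy)  # 3
print(first)                      # 9
print(calls_after_first)          # 4
```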

#### Mutable and immutable objects

Most objects in Python are mutable, such as lists, dicts, NumPy arrays, or most
user-defined types (classes). This means that the object or values that they contain can be
modified.

In [None]:
a_list = ['foo', 2, [4, 5]]
a_list[2] = (3, 4)
a_list

Others, like strings and tuples, are immutable:

In [None]:
a_tuple = (3, 5, (4, 5))
a_tuple[1] = 'four'

Remember that just because you can mutate an object does not mean that you always
should. Such actions are known in programming as side effects. For example, when
writing a function, any side effects should be explicitly communicated to the user in
the function’s documentation or comments. If possible, I recommend trying to avoid
side effects and favor immutability, even though there may be mutable objects involved.
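
The built-ins already offer both styles, which makes for a compact illustration: sorted returns a new list and leaves its argument alone, whereas list.sort reorders the list in place and returns None. The variable names here are arbitrary.

```python
values = [3, 1, 2]

# No side effect: sorted builds and returns a new list
ordered = sorted(values)
values_before_sort = list(values)  # snapshot to show values is unchanged

# Side effect: sort reorders the existing list and returns None
result = values.sort()

print(ordered)             # [1, 2, 3]
print(values_before_sort)  # [3, 1, 2]
print(result)              # None
print(values)              # [1, 2, 3]
```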

### Scalar Types

Python has a small set of built-in types for handling numerical data, strings, boolean
( True or False ) values, and dates and time. See Table A-2 for a list of the main scalar
types. Date and time handling will be discussed separately as these are provided by the
datetime module in the standard library.

Table A-2. Standard Python Scalar Types

| Type | Description |
|---|---|
| `None` | The Python “null” value (only one instance of the None object exists) |
| `str` | String type. ASCII-valued only in Python 2.x and Unicode in Python 3 |
| `unicode` | Unicode string type |
| `float` | Double-precision (64-bit) floating point number. Note there is no separate double type. |
| `bool` | A True or False value |
| `int` | Signed integer with maximum value determined by the platform. |
| `long` | Arbitrary precision signed integer. Large int values are automatically converted to long. |

#### Numeric types

The primary Python types for numbers are int and float . The size of the integer which
can be stored as an int is dependent on your platform (whether 32 or 64-bit), but Python
will transparently convert a very large integer to long , which can store arbitrarily large
integers.

In [None]:
ival = 17239871
ival ** 6

Floating point numbers are represented with the Python float type. Under the hood
each one is a double-precision (64 bits) value. They can also be expressed using
scientific notation:

In [None]:
fval = 7.243
fval2 = 6.78e-5

In Python 3, integer division not resulting in a whole number will always yield a floating
point number:

In [None]:
3 / 2

In Python 2.7 and below (which some readers will likely be using), you can enable this
behavior by default by putting the following cryptic-looking statement at the top of
your module:

In [None]:
from __future__ import division

Without this in place, you can always explicitly convert the denominator into a floating
point number:

In [None]:
3 / float(2)

To get C-style integer division (which drops the fractional part if the result is not a
whole number), use the floor division operator // :

In [None]:
3 // 2

Complex numbers are written using j for the imaginary part:

In [None]:
cval = 1 + 2j
cval * (1 - 2j)

#### Strings

Many people use Python for its powerful and flexible built-in string processing
capabilities. You can write string literals using either single quotes ' or double quotes " :

In [None]:
a = 'one way of writing a string'
b = "another way"

For multiline strings with line breaks, you can use triple quotes, either ''' or """ :

In [None]:
c = """
This is a longer string that
spans multiple lines
"""

Python strings are immutable; you cannot modify a string without creating a new string:

In [None]:
a = 'this is a string'
a[10] = 'f'
b = a.replace('string', 'longer string')
b

Many Python objects can be converted to a string using the str function:

In [None]:
a = 5.6
s = str(a)
s

Strings are a sequence of characters and therefore can be treated like other sequences,
such as lists and tuples:

In [None]:
s = 'python'
list(s)
s[:3]

The backslash character \ is an escape character, meaning that it is used to specify
special characters like newline \n or unicode characters. To write a string literal with
backslashes, you need to escape them:

In [None]:
s = '12\\34'
print s

If you have a string with a lot of backslashes and no special characters, you might find
this a bit annoying. Fortunately you can preface the leading quote of the string with r
which means that the characters should be interpreted as is:

In [None]:
s = r'this\has\no\special\characters'
s

Adding two strings together concatenates them and produces a new string:

In [None]:
a = 'this is the first half '
b = 'and this is the second half'
a + b

String templating or formatting is another important topic. The number of ways to do
so has expanded with the advent of Python 3; here I will briefly describe the mechanics
of one of the main interfaces. A % followed by one or more format characters marks
a target for inserting a value into the string (this is quite similar to the printf function
in C). As an example, consider this string:

In [None]:
template = '%.2f %s are worth $%d'

In this string, %s means to format an argument as a string, %.2f a number with 2 decimal
places, and %d an integer. To substitute arguments for these format parameters, use the
binary operator % with a tuple of values:

In [None]:
template % (4.5560, 'Argentine Pesos', 1)

String formatting is a broad topic; there are multiple methods and numerous options
and tweaks available to control how values are formatted in the resulting string. To
learn more, I recommend you seek out more information on the web.

I discuss general string processing as it relates to data analysis in more detail in
Chapter 7.
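
As one example of the other formatting methods mentioned above, the str.format method (available in Python 2.6+ and Python 3) can produce the same result as the % template. This is only a sketch of the equivalence; the variable names are arbitrary.

```python
# printf-style formatting with the % operator
template = '%.2f %s are worth $%d'
old_style = template % (4.5560, 'Argentine Pesos', 1)

# The same output using str.format with numbered fields
new_template = '{0:.2f} {1} are worth ${2:d}'
new_style = new_template.format(4.5560, 'Argentine Pesos', 1)

print(old_style)
print(new_style)
```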

#### Booleans

The two boolean values in Python are written as True and False . Comparisons and
other conditional expressions evaluate to either True or False . Boolean values are
combined with the and and or keywords:

In [None]:
True and True
False or True

Almost all built-in Python types and any class defining the __nonzero__ magic method
(named __bool__ in Python 3) have a True or False interpretation in an if statement:

In [None]:
a = [1, 2, 3]
if a:
    print 'I found something!'

b = []
if not b:
    print 'Empty!'

Most objects in Python have a notion of true- or falseness. For example, empty
sequences (lists, dicts, tuples, etc.) are treated as False if used in control flow (as above
with the empty list b ). You can see exactly what boolean value an object coerces to by
invoking bool on it:

In [None]:
bool([]), bool([1, 2, 3])
bool('Hello world!'), bool('')
bool(0), bool(1)
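
A user-defined class can opt in to the same behavior by defining the truth-value magic method. The Basket class below is a made-up example; it defines __bool__ for Python 3 and aliases it to __nonzero__ so the same code also works on Python 2.

```python
class Basket(object):
    def __init__(self, items):
        self.items = list(items)

    def __bool__(self):            # consulted by Python 3
        return len(self.items) > 0

    __nonzero__ = __bool__         # consulted by Python 2

empty = Basket([])
full = Basket(['apple'])

print(bool(empty))  # False
print(bool(full))   # True

if not empty:
    print('Nothing in the basket')
```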

#### Type casting

The str , bool , int and float types are also functions which can be used to cast values
to those types:

In [None]:
s = '3.14159'
fval = float(s)
type(fval)
int(fval)
bool(fval)
bool(0)

#### None

None is the Python null value type. If a function does not explicitly return a value, it
implicitly returns None .

In [None]:
a = None
a is None
b = 5
b is not None
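
The implicit return value is easy to confirm with a function that has no return statement. The function name log_message is hypothetical.

```python
def log_message(message):
    # No return statement, so the call evaluates to None
    print(message)

returned = log_message('hello')
print(returned is None)  # True
```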

None is also a common default value for optional function arguments:

In [None]:
def add_and_maybe_multiply(a, b, c=None):
    result = a + b

    if c is not None:
        result = result * c

    return result

While a technical point, it’s worth bearing in mind that in Python 2 None is not a reserved
keyword but rather a unique instance of NoneType . (Python 3 makes None a keyword,
though it remains the sole instance of NoneType .)

#### Dates and times

The built-in Python datetime module provides datetime , date , and time types. The
datetime type as you may imagine combines the information stored in date and time
and is the most commonly used:

In [None]:
from datetime import datetime, date, time
dt = datetime(2011, 10, 29, 20, 30, 21)
dt.day
dt.minute

Given a datetime instance, you can extract the equivalent date and time objects by
calling methods on the datetime of the same name:

In [None]:
dt.date()
dt.time()

The strftime method formats a datetime as a string:

In [None]:
dt.strftime('%m/%d/%Y %H:%M')


Strings can be converted (parsed) into datetime objects using the strptime function:

In [None]:
datetime.strptime('20091031', '%Y%m%d')
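
The two operations are inverses when the same format string is used and the format captures every field that is set. A short round-trip sketch, with fmt as an arbitrary variable name:

```python
from datetime import datetime

dt = datetime(2011, 10, 29, 20, 30, 21)
fmt = '%Y-%m-%d %H:%M:%S'

text = dt.strftime(fmt)                # datetime -> string
parsed = datetime.strptime(text, fmt)  # string -> datetime

print(text)          # 2011-10-29 20:30:21
print(parsed == dt)  # True
```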


See Table 10-2 for a full list of format specifications.

| Type | Description |
|---|---|
| `%Y` | 4-digit year |
| `%y` | 2-digit year |
| `%m` | 2-digit month [01, 12] |
| `%d` | 2-digit day [01, 31] |
| `%H` | Hour (24-hour clock) [00, 23] |
| `%I` | Hour (12-hour clock) [01, 12] |
| `%M` | 2-digit minute [00, 59] |
| `%S` | Second [00, 61] (seconds 60, 61 account for leap seconds) |
| `%w` | Weekday as integer [0 (Sunday), 6] |
| `%U` | Week number of the year [00, 53]. Sunday is considered the first day of the week, and days before the first Sunday of the year are “week 0”. |
| `%W` | Week number of the year [00, 53]. Monday is considered the first day of the week, and days before the first Monday of the year are “week 0”. |
| `%z` | UTC time zone offset as +HHMM or -HHMM, empty if time zone naive |
| `%F` | Shortcut for %Y-%m-%d, for example 2012-04-18 |
| `%D` | Shortcut for %m/%d/%y, for example 04/18/12 |

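The replace method returns a new datetime with the given fields replaced, for example
zeroing out the minute and second:

In [None]: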
dt.replace(minute=0, second=0)

The difference of two datetime objects produces a datetime.timedelta type:

In [None]:
dt2 = datetime(2011, 11, 15, 22, 30)
delta = dt2 - dt
delta
type(delta)

Adding a timedelta to a datetime produces a new shifted datetime :

In [None]:
dt
dt + delta
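
A timedelta can also be constructed directly and inspected through its days and seconds attributes or the total_seconds method (available since Python 2.7). The sketch below reuses the two datetimes from above; one_week_later is a name introduced for illustration.

```python
from datetime import datetime, timedelta

dt = datetime(2011, 10, 29, 20, 30, 21)
dt2 = datetime(2011, 11, 15, 22, 30)
delta = dt2 - dt

# A timedelta stores whole days plus leftover seconds
print(delta.days)             # 17
print(delta.seconds)          # 7179
print(delta.total_seconds())  # 1475979.0

# Build a timedelta explicitly and shift a datetime by it
one_week_later = dt + timedelta(days=7)
print(one_week_later)         # 2011-11-05 20:30:21
```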