# Python Basics

This guide was primarily intended as a transitional guide for R users, but should be useful as a general introduction to Python. Don't be alarmed by the constant comparison of two dramatically different languages.

The author (at this moment singular) uses Python 2.7 for the same inertial reason that anyone would. Please do not judge them as you move off the farm and into the big city, I mean learn Python 3.

We will skip Python installation and configuration for now, which are critically important for real Python use, but are less important for the hour or two we'll be spending with the language this time.

Note that ''there is no excuse for not actually reading the language documentation. Spending a few days reading the docs will very quickly pay for the months you would otherwise spend fumbling around in misery/stackexchange.'' One of the blessings of having a well-maintained language is having official docs. What are ya, counting your blessings before they hatch now?

* [Python Tutorial](https://docs.python.org/2/tutorial/index.html) To start
* [Language Reference](https://docs.python.org/2/reference/index.html) To understand
* [Standard Library](https://docs.python.org/2/library/index.html) To become powerful

Two pages from the FAQ are also useful:

* [Programming FAQ](https://docs.python.org/2/faq/programming.html) Answers to common practical questions
* [Design and History FAQ](https://docs.python.org/2/faq/design.html) Answers to common philosophical questions


## What is a Python?

Python is an interpreted programming language. An interpreted language is a "high-level" language in which a program called the interpreter reads syntactically correct human-readable text and carries it out on the spot, using a series of tricks that are already in a language the computer can understand. Interpreted languages contrast with compiled languages, whose source needs to be converted (compiled) to computer language (machine code) ahead of time before it can be run.

A Python is a text interpreter. A text "parser." To use this parser, one must know its rules.

## Opinionated Syntax

In Python, your comportment is important. I mean formatting is important. I mean formatting is syntax.

Where other languages rely on delimiters like (), {}, etc. to determine the logical structure of a program, line breaks and indentation (spaces or tabs) are also meaningful in Python.

Statements like 

In [135]:
for i in range(5):
    print("Hello again")

Hello again
Hello again
Hello again
Hello again
Hello again


are thus functionally different than

In [136]:
#for i in range(5):
#print("Hello again")

which, if uncommented, would raise an `IndentationError`: Python would tell us so and refuse to proceed.
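One hedged way to see that complaint without crashing anything is to hand the badly indented text to the builtin `compile` and catch the error. This is a minimal sketch: the names `bad_source` and `outcome` are made up for illustration, and the exact wording of the message varies between Python versions.

```python
# The same loop as above, but with the body left unindented.
bad_source = "for i in range(5):\nprint('Hello again')\n"

try:
    # compile() parses the text without running it
    compile(bad_source, "<example>", "exec")
    outcome = "compiled fine"
except IndentationError as err:
    outcome = "IndentationError: " + str(err.msg)

print(outcome)
```

The loop never runs at all: the parser rejects the whole block before any line of it is executed.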

This foray into indentation syntax is an out-of-place convulsion of information, but it is intended as an early echo of what will be a recurring theme: Python is picky about the way that things are written. 

This is for a good reason. Python is an opinionated language, and I have found few cases where I disagree with it. Its opinions are formally stored in [PEPs: Python Enhancement Proposals](https://www.python.org/dev/peps/). The most influential is [PEP 20](https://www.python.org/dev/peps/pep-0020/), the Zen of Python, which can be printed as often as we forget it as follows:

In [175]:
import this

<module 'this' from '/usr/local/Cellar/python/2.7.14/Frameworks/Python.framework/Versions/2.7/lib/python2.7/this.py'>


## Practical Syntax

Now to using this thing.

We'll indulge in the typical series of list creation operations that for some reason frequently feature military slang or Monty Python (they just said the name of the movie in the movie!) quotes.

### Assignation

Variables are created and assigned names much like you would expect, with the '=' operator

In [138]:
cool_variable = 10 # ten is cool. comments are preceded by # btw

though, as we'll discuss more in a moment, variable assignation/creation is better characterized as giving a name to an object, because everything in Python is an object.

### Types

Like any other language, Python can be used as a calculator.

In [139]:
1+1

2

In [140]:
2*2

4

In [141]:
5-3

2

In [142]:
8/3

2

That doesn't look right if I ever been in third grade. An immediate implication of everything in Python being an object is that we should be clear about what ''type'' or ''class'' an object is. An object's type determines not only what it is, but what it knows how to do. 

In [143]:
# What is an 8 anyway?
type(8)

int

Since `8` and `3` are `int`egers by default, Python 2 assumes that the output should also be an integer, so it returns `2`: the true quotient rounded down (floor division). Python 3 changed this, and `8/3` returns a `float` there. `float`s, floating point numbers, or numbers with those points as they are apparently known on my keyboard in this moment, are created by using a `.`

In [144]:
8/3.

2.6666666666666665

In [145]:
8./3 
# also works, coercion works as in R,
# going from least to most permissive

2.6666666666666665
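If the rounded-down behavior is what you actually want, it is clearer to ask for it by name. Here is a small sketch using the `//` (floor division) operator and the builtin `divmod`, both of which behave the same in Python 2.7 and Python 3. The variable names are just for illustration.

```python
floor_quotient = 8 // 3                # floor division, whatever the Python version
true_quotient = float(8) / 3           # force a float to get the "third grade" answer
quotient_and_remainder = divmod(8, 3)  # the pair (8 // 3, 8 % 3)
negative_floor = -8 // 3               # floors toward negative infinity, not toward zero

print(floor_quotient)
print(true_quotient)
print(quotient_and_remainder)
print(negative_floor)
```

The last line is the one that tends to surprise people: flooring `-2.67` gives `-3`, not `-2`.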

Let's run through the other common [builtin types](https://docs.python.org/2/library/stdtypes.html) before our typing hands get too sweaty

#### Numbers

In [146]:
type(8) # Plain integers - at least 32 bits of precision (64 on this machine, see below)

int

In [147]:
# We can see the maximum integer like so...
# (more in imports down below)
import sys
sys.maxint

9223372036854775807

In [148]:
type(sys.maxint)

int

In [149]:
# and if we go one over we get....
type(sys.maxint+1)

long

In [150]:
long(100) # Long integers that can be any value

100L

In [151]:
type(100.) # Floating point numbers 

float

In [152]:
# and complex numbers
complex(10, 1)

(10+1j)

In [153]:
# with real and imaginary parts
a_complex_number = complex(10, 1)
a_complex_number.imag # more on this syntax later too...

1.0

Because I was a tease before, let's round out the arithmetic operators

In [154]:
13 % 6 # Modular arithmetic

1

In [155]:
2 ** 8 # exponents

256

### Sequences

Two common types share perhaps an unexpected common soul, `str`ings and `list`s. We can think of `str`ings as character sequences.

In [156]:
type("cool string")

str

In [157]:
'''triple quoted strings
can span multiple lines
and generally allow you
to be too verbose'''

'triple quoted strings\ncan span multiple lines\nand generally allow you\nto be too verbose'

In [158]:
type(["a", 'list', "of", "cool", "words"])
# note that "" and '' are interchangeable

list

Strings are, by default, "byte" strings in Python 2. I hear that Python 3 does it better by making `str` unicode by default and keeping raw bytes in a separate `bytes` type. A long and tedious story short, characters need to have some means of ''encoding'' as 0/1 bits. The most versatile/common/universal standard nowadays is ''unicode'' (usually encoded as UTF-8), but back in the day of slow computers it was ''ASCII'', which represents characters with 7 bits, and so only has 2^7=128 characters.

We do unicode in Python 2 with 

In [159]:
type(u'a cool unicode string')

unicode
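To make the bytes-vs-characters distinction a bit more concrete, here is a rough sketch that encodes a unicode string to bytes and back. It assumes UTF-8 as the encoding (the most common choice, but still a choice), and the variable names are invented for the example. The `u''` prefix works in Python 2.7 and in Python 3.3+.

```python
text = u"caf\u00e9"                 # 4 characters, the last one is an accented e
encoded = text.encode("utf-8")      # unicode -> bytes
decoded = encoded.decode("utf-8")   # bytes -> unicode

print(len(text))        # number of characters
print(len(encoded))     # number of bytes: the accented e needs two of them in UTF-8
print(decoded == text)  # the round trip gives back the original
```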

String handling is one of the early and immediate standout features of Python, where you can do stuff like

In [160]:
'get together ' + 'u stringy ones'

'get together u stringy ones'

In [161]:
"  probs stole this string from the internet   ".strip()

'probs stole this string from the internet'

In [162]:
dont_forget_her_name = "Alecia"
or_her_birthday = 10
uh_oh = "oh cmon {} of course today is {}".format(dont_forget_her_name, or_her_birthday)
uh_oh.capitalize()

'Oh cmon alecia of course today is 10'
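A few more string methods that tend to earn their keep early, sketched here with made-up variable names. All of these are standard `str` methods and work the same in Python 2.7 and 3.

```python
sentence = "probs stole this string from the internet"

words = sentence.split()                          # split on whitespace into a list of words
hyphenated = "-".join(words)                      # glue a list back together with a separator
swapped = sentence.replace("stole", "borrowed")   # returns a new string, the original is untouched
has_string = "string" in sentence                 # substring test

print(words)
print(hyphenated)
print(swapped)
print(has_string)
```

Note that none of these change `sentence` itself: strings are immutable, so every method hands back a new object.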

Lists are such a common type you'll probably forget they are a type instead of just being a brackety stack of all the junk you want, but some quick hits:

Python uses 0 indexing, where 0 is the first item in a sequence. This may take some getting used to, but has some helpful properties and avoids the ambiguity of other indexing regimes. The final number in a number:number slice is exclusive - for 0:3 we get the entries from the first ''up to but not including'' the fourth (aka through the third entry).

In [163]:
gotta_count = [0,1,2,3,4,5]
gotta_count[1]

1

In [164]:
gotta_count[0:3] # we get three numbers this way

[0, 1, 2]
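Some other slicing spellings, as a quick sketch (variable names are illustrative). One of the "helpful properties" of the exclusive end is visible at the bottom: two slices cut at the same index glue back into the original with nothing doubled or dropped.

```python
gotta_count = [0, 1, 2, 3, 4, 5]

last_item = gotta_count[-1]      # negative indexes count from the end
from_three = gotta_count[3:]     # leaving out the end means "to the end"
up_to_three = gotta_count[:3]    # leaving out the start means "from the start"
every_other = gotta_count[::2]   # a third number is the step size

print(last_item)
print(from_three)
print(up_to_three)
print(every_other)
print(up_to_three + from_three == gotta_count)
```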

In [166]:
things_to_remember = ['money', 'medicine']
things_to_remember.append('to eat')
print(things_to_remember)

['money', 'medicine', 'to eat']


List comprehensions are useful for all your casual iteration needs, [see more about the syntax here](https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions)

In [167]:
a_range = range(10)
print("a range is {}".format(a_range))

odds = [number for number in a_range if number % 2 == 1]
print("and the odd numbers in it are {}".format(odds))


a range is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
and the odd numbers in it are [1, 3, 5, 7, 9]
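If the comprehension syntax looks like line noise, it may help to see it next to the plain loop it replaces. A minimal sketch, with `odds_from_loop` and `odd_squares` being names invented for the comparison:

```python
a_range = range(10)

# the comprehension from above
odds = [number for number in a_range if number % 2 == 1]

# the same thing spelled out as a loop
odds_from_loop = []
for number in a_range:
    if number % 2 == 1:
        odds_from_loop.append(number)

# the expression before `for` can transform each item too
odd_squares = [number ** 2 for number in a_range if number % 2 == 1]

print(odds == odds_from_loop)
print(odd_squares)
```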


Stuff one should also know about but we won't cover here:

* [tuples](https://docs.python.org/2/library/functions.html#tuple)
* [sets](https://docs.python.org/2/library/stdtypes.html#set)

### Dictionaries

The rough equivalent of R's lists are dictionaries. Dictionaries are mappings from some key to some value. Rather than the implicit positional mapping from an object's position in a list to its values, dictionaries allow (almost) anything to be mapped to (almost) anything.

In [168]:
dictionary_to_read = { # dictionaries are declared w/ curly braces
    # each item has a key and a value, declared with a colon
    'bigfoot': 'whoa that dude really got my heart pumpin',
    # entries are comma separated
    25: 'almost counted there once',
    'macroeconomics': 'makes ya sound drunk when ya say it even if ya arent!'
}
from pprint import pprint # prettyprint, just using it so there's \n between items
pprint(dictionary_to_read)

{25: 'almost counted there once',
 'bigfoot': 'whoa that dude really got my heart pumpin',
 'macroeconomics': 'makes ya sound drunk when ya say it even if ya arent!'}


Indexing dictionaries is as we expect

In [169]:
dictionary_to_read['bigfoot']

'whoa that dude really got my heart pumpin'

But watch out, because keys are no longer positional. Always remember! Dictionaries '''have no order''' (in Python 2, anyway) even if you think you're making stuff with an order. If you need an order, you're looking for [odicts](https://docs.python.org/2/library/collections.html#collections.OrderedDict).

In [170]:
dictionary_to_read[25]

'almost counted there once'
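When the order really does matter, a sketch with `collections.OrderedDict` looks like the following. The dictionary contents and names are invented for the example. An `OrderedDict` remembers the order in which keys were first inserted.

```python
from collections import OrderedDict

meal_plan = OrderedDict()
meal_plan['breakfast'] = 'cereal'
meal_plan['lunch'] = 'cereal again'
meal_plan['dinner'] = 'regret'

# keys come back in insertion order, not in whatever order the hash table likes
meal_order = list(meal_plan.keys())
print(meal_order)
```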

We can access different parts of the dictionary...

In [171]:
print(dictionary_to_read.keys())
print(dictionary_to_read.values())
print(dictionary_to_read.items()) # see looket that list of tuples

['macroeconomics', 25, 'bigfoot']
['makes ya sound drunk when ya say it even if ya arent!', 'almost counted there once', 'whoa that dude really got my heart pumpin']
[('macroeconomics', 'makes ya sound drunk when ya say it even if ya arent!'), (25, 'almost counted there once'), ('bigfoot', 'whoa that dude really got my heart pumpin')]
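The list of tuples from `.items()` is mostly used for looping over keys and values together. The sketch below shows that, plus two polite ways of asking about keys that might not exist, since indexing a missing key raises a `KeyError`. The dictionary and names are made up for illustration.

```python
field_notes = {'bigfoot': 'tall', 'yeti': 'cold'}

has_bigfoot = 'bigfoot' in field_notes                     # membership tests look at keys
nessie = field_notes.get('nessie')                         # .get returns None rather than raising
nessie_default = field_notes.get('nessie', 'unconfirmed')  # or a default you supply

# unpack each (key, value) tuple as we go; sorted() just makes the order predictable
for creature, note in sorted(field_notes.items()):
    print("{}: {}".format(creature, note))

print(has_bigfoot)
print(nessie)
print(nessie_default)
```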


one more important method...

In [172]:
learned_new_words = {
    "agronomics": "that's the word i meant earlier",
    "meddlesome": "you keep taking the things out of my desk"
}

dictionary_to_read.update(learned_new_words)
pprint(dictionary_to_read)

{25: 'almost counted there once',
 'agronomics': "that's the word i meant earlier",
 'bigfoot': 'whoa that dude really got my heart pumpin',
 'macroeconomics': 'makes ya sound drunk when ya say it even if ya arent!',
 'meddlesome': 'you keep taking the things out of my desk'}
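Two things about `update` that are easy to miss: it changes the dictionary in place (and returns `None`), and when a key already exists the new value quietly wins. A small sketch with invented contents:

```python
vocabulary = {'meddlesome': 'old definition', 'bigfoot': 'tall'}
corrections = {'meddlesome': 'new definition', 'agronomics': 'farm money'}

result = vocabulary.update(corrections)   # modifies vocabulary itself

print(result)                      # None, so don't write vocabulary = vocabulary.update(...)
print(vocabulary['meddlesome'])    # the value from corrections replaced the old one
print(len(vocabulary))             # two original keys plus one new key
```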


### Logic & Comparison

What is a programming language without logic?

The basic booleans

In [1]:
type(True)
type(False)

bool

All logical statements yield a boolean or sequence of booleans.

Comparison syntax is also as we expect

In [6]:
1 < 2  # less than
1 > 2  # greater than
2 >= 3 # greater than or equal to
2 <= 3 # less than or equal to
'this' == 'that' # equal to
'this' != 'that' # not equal to

True

Logical operators are words rather than symbols - the symbols (`&`, `|`, `~`) are 'bitwise' operators, which we won't cover.

In [5]:
1 < 2 and     2 <= 3 # boolean AND
1 > 2 or      2 <= 3  # boolean OR
1 < 2 and not 2 >= 3 # boolean NOT

True
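One wrinkle worth a hedged sketch: `and` and `or` don't always hand back a literal `True` or `False`. They return one of their operands and stop evaluating as soon as the answer is known (short-circuiting). Values like `0` and an empty list count as false here. The variable names are just for illustration.

```python
fallback = 0 or 'fallback'        # 0 counts as false, so `or` hands back the second operand
second = 'first' and 'second'     # 'first' counts as true, so `and` hands back the second operand
short_circuit = [] and 1 / 0      # [] counts as false, so the division is never evaluated
negated = not 'anything'          # `not` always gives a real boolean

print(fallback)
print(second)
print(short_circuit)
print(negated)
```

The `x or default` spelling shows up a lot in real code as a cheap way of supplying a fallback value.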

Comparison and logical syntax can be chained

In [2]:
1 < 2 <= 3
# is equivalent to
1 < 2 and 2 <= 3

True

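The 'is' operator is slightly special: it tests whether its arguments are the same object, not the same value. So even though all the variables below are ones, the one that is a `float` is not the same object as the `int`egers. (That the two `int` ones are the same object is a CPython habit of reusing small integers, not something to rely on.)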

In [9]:
one_one       = 1
another_one   = 1
different_one = 1.

print(one_one == another_one)
print(one_one is another_one)
print(one_one == different_one)
print(one_one is different_one)

True
True
True
False
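The difference between `==` and `is` matters most for things that can change, like lists. A minimal sketch (names invented for the example): two lists built separately are equal but not identical, while a second name for the same list is identical, and changes made through one name show up under the other.

```python
original = [1, 2, 3]
lookalike = [1, 2, 3]    # a separate list that happens to hold the same values
alias = original         # a second name for the very same list

same_value = original == lookalike
same_object = original is lookalike
alias_is_original = alias is original

alias.append(4)          # changing the list through one name...

print(same_value)
print(same_object)
print(alias_is_original)
print(original)          # ...changes what the other name sees
print(lookalike)         # the lookalike is unaffected
```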


Tests of membership are also straightforward

In [13]:
my_bank_account = ['empty space', 'identity thieves', 'self worth']
print('self worth' in my_bank_account)
print('money'  not in my_bank_account)

True
True


All objects have logical values, typically being false when they are empty or zero. 
The truth value of non-empty objects is a little tricky - since they are not "equal to" `True` they will fail an `== True` comparison, but they will count as true when used in an `if` or `while` condition

In [21]:
pocket_lint   = []
lonely_number = 0
list_of_one = ['one']

print(pocket_lint == True)
print(lonely_number == True)
print(list_of_one == True)

if pocket_lint:
    print("there was something in there after all!")
    
if list_of_one:
    print("why'd you put that in a list anyway?")


False
False
False
why'd you put that in a list anyway?
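To ask an object for its truth value directly, without writing an `if`, the builtin `bool` does the same test that `if` and `while` do. A quick sketch over a grab bag of values (the list of `things` is made up for illustration):

```python
things = [[], 0, '', None, {}, ['one'], 1, 'a', [0]]

# bool() applies the same truth test that `if` and `while` use
truth_values = [bool(thing) for thing in things]

for thing, truth in zip(things, truth_values):
    print("{!r:>8} -> {}".format(thing, truth))
```

Note the last one: `[0]` is true, because the list itself is not empty even though the thing inside it is false.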


## Objects

Everything in Python is an object. 

Every object has an '''identity, a type, and a value'''. An object's '''identity''' is not the name given to it on assignment, but its literal identity to the computer -- roughly its memory address. An object's '''type''' determines, as we have seen above, the operations or `methods` available to it: Python knows how to add numbers because addition is defined for objects that are number types. An object's '''value''' is a more nebulous concept, but we can think of it as the data that the object contains ([in some places in the documentation](https://docs.python.org/2/reference/expressions.html#value-comparisons) value is implicitly defined as the thing that is used in comparison operations). 

Objects have '''attributes''' and '''methods.''' Attributes function mostly like [an object's dictionary](https://docs.python.org/2.7/reference/datamodel.html#the-standard-type-hierarchy), and are the object's (relatively) static properties. Methods are functions that are particular to a class (unbound) or particular to an instance (bound) ([for more clarification about an instance vs. a class](https://www.codecademy.com/en/forum_questions/558cd3fc76b8fe06280002ce) and [what it means to be bound vs unbound method](http://stupidpythonideas.blogspot.com/2013/06/how-methods-work.html)). Both are accessed using `.` notation, the difference being that methods are '''called''' using the `()` operator. Perhaps confusingly, anything using the `.` notation is considered an attribute -- one can think of an object having a method as one of its attributes in that having that method is something that describes the object, or something the object knows about, but methods have the special ability of running additional code when called with `()`.
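A hedged sketch of the attribute/method distinction, reusing the complex number from earlier. `real` is a plain data attribute and `conjugate` is a method; both are standard on `complex`. The other names are invented for the example.

```python
a_complex_number = complex(10, 1)

real_part = a_complex_number.real              # a plain attribute: just data, no ()
conjugate_method = a_complex_number.conjugate  # a method is an attribute too, until we call it
is_callable = callable(conjugate_method)       # methods can be called, plain data cannot
conjugated = conjugate_method()                # calling it with () actually runs the code

print(real_part)
print(is_callable)
print(callable(real_part))
print(conjugated)
```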

In [23]:
some_list = [1,2,3,4]

# we can list the attributes of an object with dir
dir(some_list)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__delslice__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getslice__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__setslice__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

In [29]:
# or by calling the help function
help(some_list)

Help on list object:

class list(object)
 |  list() -> new empty list
 |  list(iterable) -> new list initialized from iterable's items
 |  
 |  Methods defined here:
 |  
 |  __add__(...)
 |      x.__add__(y) <==> x+y
 |  
 |  __contains__(...)
 |      x.__contains__(y) <==> y in x
 |  
 |  __delitem__(...)
 |      x.__delitem__(y) <==> del x[y]
 |  
 |  __delslice__(...)
 |      x.__delslice__(i, j) <==> del x[i:j]
 |      
 |      Use of negative indices is not supported.
 |  
 |  __eq__(...)
 |      x.__eq__(y) <==> x==y
 |  
 |  __ge__(...)
 |      x.__ge__(y) <==> x>=y
 |  
 |  __getattribute__(...)
 |      x.__getattribute__('name') <==> x.name
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __getslice__(...)
 |      x.__getslice__(i, j) <==> x[i:j]
 |      
 |      Use of negative indices is not supported.
 |  
 |  __gt__(...)
 |      x.__gt__(y) <==> x>y
 |  
 |  __iadd__(...)
 |      x.__iadd__(y) <==> x+=y
 |  
 |  __imul__(...)
 |      x.__imul__(y) <==

The `__(attribute)__` syntax means that it is a "special" attribute. Ignore that for now, [but if you're curious see here](https://www.python.org/dev/peps/pep-0008/#descriptive-naming-styles) [or here](https://shahriar.svbtle.com/underscores-in-python).

The base classes don't have very exciting attributes, but...

In [35]:
# we can get a list's length in a functional style
print(len(some_list))

# which is equivalent to accessing its __len__ attribute
print(some_list.__len__)

# but since __len__ is a method, we have to call it
print(some_list.__len__())

4
<method-wrapper '__len__' of list object at 0x10d3d8710>
4


If you are more used to functional programming, using an object's methods can be thought of as calling a function on an object and passing the object as the first argument.

In [37]:
# Rather than thinking about adding to a list like
# new_list = append(some_list, 5)

# We do
some_list.append(5)
print(some_list)

[1, 2, 3, 4, 5]
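The "object as the first argument" idea can be taken literally. As a rough sketch, the same method can be looked up on the class (`list`) and handed the instance explicitly, which is more or less what the `.` notation does for us behind the scenes. This example starts from a fresh list so it stands on its own.

```python
fresh_list = [1, 2, 3, 4]

fresh_list.append(5)          # the usual spelling: method called on the instance
list.append(fresh_list, 6)    # same method found on the class, instance passed explicitly

print(fresh_list)
```

Nobody writes the second form in practice, but it shows that a method is just a function that gets its object as the first argument. Also note that `append` changes the list in place, so running it twice appends twice.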


We will cover object creation in more detail later.

## Grouped Statements and Control Flow

A programming language is pretty useless if every statement is unrelated to other statements. Other languages will group statements using delimiters like `{}`. Python instead uses line breaks and indentation. This results in more readable code. Grouped statements are typically those inside control flow tools like `if` and `while`, though others exist like `with`.

`if` is the simplest way to make something conditional. All grouped statements start with a header line that ends in a colon (and usually contains an expression to test). The header applies to all the following lines that are indented one level beneath it

In [39]:
day_of_week = 'tuesday'

print(day_of_week == 'tuesday')

if day_of_week == 'tuesday':
    print("it is tuesday")
    
# this would throw an error because our indentation is wrong
# if day_of_week == 'tuesday':
# print('it is tuesday')

# this would always print 'it is tuesday', because it is not in the 'if's' indentation level
# thus thwarting our conditional statement
# if day_of_week == 'tuesday':
#     joy = True
# print('it is tuesday')

True
it is tuesday
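Here is a runnable version of that last commented-out case, as a small sketch. It collects messages in a list (a name invented for the example) so it is easy to see which lines ran. With the day set to something other than Tuesday, only the dedented line executes.

```python
day_of_week = 'monday'
messages = []

if day_of_week == 'tuesday':
    # indented: belongs to the if, skipped today
    messages.append('inside the if: only on tuesdays')

# dedented: not part of the if, runs every time
messages.append('after the if: every day')

print(messages)
```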


`if` statements can be combined with additional conditions: 

* `elif` - if the first thing is False, check if this one is true
* `else` - if everything is False, do this.

`elif` must come after `if`, and `else` has to come last.

In [40]:
day_of_week = 'wednesday'

if day_of_week == 'tuesday':
    print('it is tuesday')
elif day_of_week == 'wednesday':
    print('it is wednesday')
else:
    print('who knows')

it is wednesday
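One detail worth sketching: only the first branch whose condition is true gets run, even if a later condition would also have been true. In the example below (variable names are illustrative) 15 is divisible by both 3 and 5, but the `elif` is never even checked.

```python
number = 15

if number % 3 == 0:
    label = 'divisible by 3'
elif number % 5 == 0:
    # also true for 15, but skipped because the first branch already matched
    label = 'divisible by 5'
else:
    label = 'neither'

print(label)
```

So the order of the branches matters: put the most specific test first.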
