# Basics of the Python syntax

**Author:** Ties de Kok ([Personal Website](http://www.tiesdekok.com))  
**Last updated:** 18 May 2018  
**Python version:** Python 3.6  
**License:** MIT License  

**Note:** Some features (like the ToC) will only work if you run it locally, use Binder, or use nbviewer by clicking this link: 
https://nbviewer.jupyter.org/github/TiesdeKok/LearnPythonforResearch/blob/master/0_python_basics.ipynb

# *Introduction*

This notebook contains an overview of basic Python functionality that you might come across when using Python for Social Science Research.   

**Note:** this notebook is deliberately not 100% comprehensive; it only discusses the basics you need to get started.

# *Table of Contents* <a id='toc'></a>

* [Variables](#variables)   
* [Displaying something](#display)
* [Numerical operations](#num-operations)   
* [String operations](#string-operations)   
* [Data structures](#data-structures)   
* [Slicing](#slicing)   
* [Functions](#functions)   
* [Whitespace (blocks)](#whitespace)   
* [Conditionals](#conditionals)   
* [Looping](#looping)   
* [Comprehensions](#comprehensions)   
* [Catching exceptions](#catching-exceptions)   
* [Importing libraries](#importing)   
* [OS operations](#os-operations)   
* [File Input / Output](#files)   

## <span style="text-decoration: underline;">Variables</span><a id='variables'></a> [(to top)](#toc)

Basic numeric types in Python are `int` for integers and `float` for floating point numbers.  
Strings are represented by `str`; in Python 3.x a `str` is a sequence of Unicode characters.

In [3]:
a = 5
b = 3.5
c = 'A string'

In [4]:
type(a), type(b), type(c)

(int, float, str)

Converting types:

In [5]:
int(3.6), str(5)

(3, '5')
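
Note that `int()` drops the fractional part instead of rounding. The snippet below is a minimal sketch of a few related conversions; the variable names are illustrative only.

```python
# int() truncates toward zero, round() rounds to the nearest integer
truncated = int(3.6)       # 3
rounded = round(3.6)       # 4

# Strings that contain a valid number can be converted as well
from_text = float('2.5')   # 2.5
as_int = int('42')         # 42

print(truncated, rounded, from_text, as_int)
```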

Checking types:

In [6]:
type(a), type(b), type(c)

(int, float, str)

In [7]:
isinstance(a, float)

False

## <span style="text-decoration: underline;">Displaying something</span><a id='display'></a> [(to top)](#toc)

In [43]:
print('Hello')

Hello


*Note:* `print 'Hello'` does not work in Python 3

In [44]:
print('Hello ' + 'World')

Hello World


In [46]:
apples = 'apples'
print('I have', 2, apples)

I have 2 apples


## <span style="text-decoration: underline;">Numerical operations</span><a id='num-operations'></a> [(to top)](#toc)

In [8]:
2+2

4

In [9]:
3 / 4

0.75

*Note:* in Python 2, `/` between two integers performs integer division, so there you have to do:

In [10]:
3 / float(4)

0.75
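
A few other arithmetic operators come up regularly. As a small sketch (the variable names are only for illustration):

```python
quotient = 7 // 2    # floor division -> 3
remainder = 7 % 2    # modulo (remainder) -> 1
power = 2 ** 10      # exponentiation -> 1024

print(quotient, remainder, power)
```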

## <span style="text-decoration: underline;">String operations</span><a id='string-operations'></a> [(to top)](#toc)

Define strings with single, double or triple quotes (for multi-line)

In [24]:
hello = 'world'
saying = "hello world"
paragraph = """ This is
a paragraph
"""

### Variables in strings

In [25]:
'%d' % 20

'20'

In [26]:
'%.3f %.2f' % (20, 1/3)

'20.000 0.33'

A cleaner alternative is to use `.format()`:

In [16]:
'{} {}'.format(20, 1/3)

'20 0.3333333333333333'

In [15]:
'{1} {0}'.format(20, 1/3)

'0.3333333333333333 20'

In [27]:
'{:.3f} {:.2f}'.format(20, 1/3)

'20.000 0.33'

**Note:** starting from Python 3.6 you can also use so-called "f-strings":

In [15]:
year, p_version = 2018, '3.6'
f'The year {year} is pretty awesome with F-strings from Python {p_version}'

'The year 2018 is pretty awesome with F-strings from Python 3.6'
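
The format specifiers used with `.format()` should work the same way inside an f-string. A minimal sketch, where `ratio` and `formatted` are illustrative names:

```python
ratio = 1 / 3

# The part after the colon is the same format spec as in '{:.3f}'.format(...)
formatted = f'{20:.3f} {ratio:.2f}'
print(formatted)  # 20.000 0.33
```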

## <span style="text-decoration: underline;">Data structures</span><a id='data-structures'></a> [(to top)](#toc)

There are 4 basic data structures: lists (`list`), tuples (`tuple`), dictionaries (`dict`), and sets (`set`)

### Lists

Lists are enclosed in square brackets

In [19]:
pets = ['dogs', 'cat', 'bird'] 
pets.append('lizard')
pets

['dogs', 'cat', 'bird', 'lizard']

### Tuples

Tuples are enclosed in parentheses  
*Note:* Tuples are immutable: you cannot add, remove, or replace elements. In return they are typically slightly faster and use less memory than lists.

In [18]:
pets = ('dogs', 'cat', 'bird')
pets

('dogs', 'cat', 'bird')
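
To illustrate what "cannot add or remove elements" means in practice, here is a small sketch. It uses `try` / `except`, which is covered in the section on catching exceptions; `more_pets` is an illustrative name.

```python
pets = ('dogs', 'cat', 'bird')

try:
    pets[0] = 'horse'   # tuples do not support item assignment
except TypeError as e:
    print('Error:', e)

# Instead of modifying the tuple, build a new one
more_pets = pets + ('lizard',)
print(more_pets)
```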

### Dictionaries

Dictionaries are built using curly brackets  
*Note:* Dictionaries store key-value pairs. Do not rely on their order in Python 3.6; insertion order is only guaranteed from Python 3.7 onward.

In [21]:
person = {'name': 'fred', 'age': 29}
print(person['name'], person['age'])

fred 29


In [31]:
person['money'] = 50
del person['age']
person

{'money': 50, 'name': 'fred'}
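
Accessing a key that does not exist raises a `KeyError`. The sketch below shows two common ways to avoid that, using a `person` dictionary like the one above:

```python
person = {'name': 'fred', 'money': 50}

# Check whether a key exists
has_age = 'age' in person              # False

# .get() returns a default value instead of raising a KeyError
age = person.get('age', 'unknown')     # 'unknown'
name = person.get('name', 'unknown')   # 'fred'

print(has_age, age, name)
```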

### Sets

A set is like a list but it can only hold unique values.  

In [26]:
pets_1 = set(['dogs', 'cat', 'bird'])
pets_2 = set(['dogs', 'horse', 'zebra', 'zebra'])
pets_2

{'dogs', 'horse', 'zebra'}

There are many useful operations that you can perform using sets

In [28]:
pets_1.union(pets_2)

{'bird', 'cat', 'dogs', 'horse', 'zebra'}

In [29]:
pets_1.intersection(pets_2)

{'dogs'}

In [30]:
pets_1.difference(pets_2)

{'bird', 'cat'}
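
The same operations are also available as operators, and sets can be written with curly brackets directly. A short sketch (the result names are illustrative; `sorted()` is only used to print in a stable order):

```python
pets_1 = {'dogs', 'cat', 'bird'}
pets_2 = {'dogs', 'horse', 'zebra'}

union = pets_1 | pets_2        # same as pets_1.union(pets_2)
common = pets_1 & pets_2       # same as pets_1.intersection(pets_2)
only_first = pets_1 - pets_2   # same as pets_1.difference(pets_2)

print(sorted(union), sorted(common), sorted(only_first))
```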

### Combinations

Data structures can hold any Python object!

In [23]:
combo = ('apple', 'orange')
mix = {'fruit' : [combo, ('banana', 'pear')]}
mix['fruit'][0]

('apple', 'orange')

## <span style="text-decoration: underline;">Slicing</span><a id='slicing'></a> [(to top)](#toc)

If an object is ordered (such as a list or tuple) you can select elements by index or by slice  

In [32]:
pets = ['dogs', 'cat', 'bird', 'lizard']

In [33]:
favorite_pet = pets[0]
favorite_pet

'dogs'

In [34]:
reptile = pets[-1]
reptile

'lizard'

In [35]:
pets[1:3]

['cat', 'bird']

In [36]:
pets[:2]

['dogs', 'cat']

*Note:* this also works on strings:

In [37]:
fruit = 'banana'
fruit[:2]

'ba'
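
A slice can also take a third value, the step, and negative numbers count from the end. A small sketch with illustrative variable names:

```python
pets = ['dogs', 'cat', 'bird', 'lizard']

every_other = pets[::2]      # ['dogs', 'bird']
reversed_pets = pets[::-1]   # ['lizard', 'bird', 'cat', 'dogs']
last_two = pets[-2:]         # ['bird', 'lizard']

print(every_other, reversed_pets, last_two)
```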

## <span style="text-decoration: underline;">Functions</span><a id='functions'></a> [(to top)](#toc)

A Python function takes arguments as input and defines logic to process these inputs (and possibly returns something).

In [38]:
def add_5(number):
    return number + 5

The action of defining a function does not execute the code! It will only execute once you call the function:

In [39]:
add_5(10)

15

You can also add arguments with default values:

In [40]:
def add(number, add=5):
    return number + add

In [41]:
add(10)

15

In [42]:
add(10, add=3)

13

### Python also has unnamed functions for one-time use called "lambda functions"

In [2]:
pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]
pairs.sort(key=lambda pair: pair[1])
print(pairs)

[(4, 'four'), (1, 'one'), (3, 'three'), (2, 'two')]


**Note:** don't worry if these don't make sense. Just remember that they are simply functions without a name.
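
If it helps, a lambda can be read as shorthand for a small named function. The sketch below sorts the same `pairs` both ways; `second_item` is a hypothetical helper introduced only for the comparison.

```python
pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]

def second_item(pair):
    # Named equivalent of: lambda pair: pair[1]
    return pair[1]

# sorted() returns a new list and leaves `pairs` unchanged
by_named = sorted(pairs, key=second_item)
by_lambda = sorted(pairs, key=lambda pair: pair[1])

print(by_named == by_lambda)  # True
```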

## <span style="text-decoration: underline;">Whitespace (blocks)</span><a id='whitespace'></a> [(to top)](#toc)

Python requires indentation to delimit blocks of code.  
*Note:* each function has its own local scope, notice variable `a`:

In [47]:
def example():
    a = 'Layer 1'
    print(a)
    
    def layer_2():
        a = 'Layer 2'
        print(a)
        
    layer_2()

In [48]:
example()

Layer 1
Layer 2
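
One detail worth knowing: a new scope is created by a function body, not by every indented block. In the sketch below (all names are illustrative), the assignment inside the function leaves the outer `a` untouched, while a variable set inside an `if` block remains available afterwards.

```python
a = 'Outer'

def change_a():
    a = 'Inner'   # a new local variable; the outer `a` is not modified
    return a

inner_value = change_a()

if True:
    b = 'Set inside an if block'   # an if block does not create a new scope

print(a, '|', inner_value, '|', b)
```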


## <span style="text-decoration: underline;">Conditionals</span><a id='conditionals'></a> [(to top)](#toc)

In [47]:
grade = 95
if grade >= 90:
    print('A')
elif grade >= 80:
    print('B')
elif grade >= 70:
    print('C')
else:
    print('D')

A


## <span style="text-decoration: underline;">Looping</span><a id='looping'></a> [(to top)](#toc)

In [51]:
for num in range(0, 6, 2):
    print(num)

0
2
4


In [52]:
list_fruit = ['Apple', 'Banana', 'Orange']
for fruit in list_fruit:
    print(fruit)

Apple
Banana
Orange


In [53]:
for num in range(100):
    print(num)
    if num == 2:
        break

0
1
2


You can also loop with `while`, which repeats as long as its condition is true (and runs forever if the condition never becomes false):

In [50]:
count = 0
while count < 4:
    print(count)
    count += 1

0
1
2
3


Looping over tuples in a list (each tuple is unpacked into `a` and `b`):

In [49]:
tuple_in_list = [(1, 2), (3, 4)]
for a, b in tuple_in_list:
    print(a + b)

3
7


Looping over a dictionary:  
*Note:* if using Python 2.7 you need to use `.iteritems()`

In [16]:
dictionary = {'one' : 1, 'two' : 2, 'three' : 3}
for key, value in dictionary.items():
    print(key, value + 10)

one 11
two 12
three 13
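
Two built-in helpers are often used together with `for` loops: `enumerate()` to get a position counter and `zip()` to loop over two sequences in parallel. A minimal sketch, where the `prices` values are made up for illustration:

```python
list_fruit = ['Apple', 'Banana', 'Orange']
prices = [0.5, 0.25, 0.75]   # made-up values

# enumerate() yields (index, item) pairs
numbered = []
for index, fruit in enumerate(list_fruit):
    numbered.append((index, fruit))

# zip() pairs up the items of both lists
priced = []
for fruit, price in zip(list_fruit, prices):
    priced.append('{}: {}'.format(fruit, price))

print(numbered)
print(priced)
```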


## <span style="text-decoration: underline;">Comprehensions</span><a id='comprehensions'></a> [(to top)](#toc)

A comprehension makes it easier to generate a list or dictionary using a loop.  

**List comprehension:**

In [52]:
new_list = [x + 5 for x in range(0,6)]
new_list

[5, 6, 7, 8, 9, 10]

*Traditional way to achieve the same:*

In [53]:
new_list = []
for x in range(0,6):
    new_list.append(x + 5)
new_list

[5, 6, 7, 8, 9, 10]

**Dictionary comprehension:**

In [55]:
new_dict = {'num_{}'.format(x) : x + 5 for x in range(0,6)}
new_dict

{'num_0': 5, 'num_1': 6, 'num_2': 7, 'num_3': 8, 'num_4': 9, 'num_5': 10}

*Traditional way to achieve the same:*

In [56]:
new_dict = {}
for x in range(0,6):
    new_dict['num_{}'.format(x)] = x + 5
new_dict

{'num_0': 5, 'num_1': 6, 'num_2': 7, 'num_3': 8, 'num_4': 9, 'num_5': 10}
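
A comprehension can also include a condition to filter items. As a sketch, the following keeps only the even values of `x` (the result names are illustrative):

```python
# Only x = 0, 2, 4 pass the condition
even_plus_5 = [x + 5 for x in range(0, 6) if x % 2 == 0]
even_dict = {'num_{}'.format(x): x + 5 for x in range(0, 6) if x % 2 == 0}

print(even_plus_5)  # [5, 7, 9]
print(even_dict)    # {'num_0': 5, 'num_2': 7, 'num_4': 9}
```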

## <span style="text-decoration: underline;">Catching Exceptions</span><a id='catching-exceptions'></a> [(to top)](#toc)

A Python exception looks like this:

In [18]:
num_list = [1, 2, 3]
num_list.remove(4)

ValueError: list.remove(x): x not in list

You can catch exceptions using `try` and `except`:

In [19]:
try:
    num_list.remove(4)
except:
    print('ERROR!')

ERROR!


It is usually best practice to specify which exception type to catch:

In [25]:
try:
    num_list.remove(4)
except ValueError as e:
    print('Error: ', e)
except Exception as e:
    print('Other error: ', e)
finally:
    print('Done')

Error:  list.remove(x): x not in list
Done
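
A `try` statement can also have an `else` branch, which only runs when no exception was raised. The sketch below wraps this in a hypothetical helper, `safe_remove`, which is not part of Python itself:

```python
def safe_remove(values, item):
    """Remove item from values; return True if removed, False if it was absent."""
    try:
        values.remove(item)
    except ValueError:
        return False
    else:
        # Only reached when .remove() did not raise
        return True

num_list = [1, 2, 3]
removed_missing = safe_remove(num_list, 4)   # False, list unchanged
removed_present = safe_remove(num_list, 2)   # True, 2 is removed

print(removed_missing, removed_present, num_list)
```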


## <span style="text-decoration: underline;">Importing Libraries</span><a id='importing'></a> [(to top)](#toc)

In [64]:
import math
math.sin(1)

0.8414709848078965

In [65]:
import math as math_lib
math_lib.sin(1)

0.8414709848078965

In [66]:
from math import sin
sin(1)

0.8414709848078965

## <span style="text-decoration: underline;">OS operations</span><a id='os-operations'></a> [(to top)](#toc)

In [27]:
import os

### Get current working directory

In [28]:
os.getcwd()

'G:\\My Drive\\Work\\Programming\\Active\\PythonAccountingResearch'

### List files/folders in directory

In [31]:
os.listdir()[:5]

['README.md',
 'environment.yml',
 '3_visualizing_data.ipynb',
 '.gitignore',
 'exercises.ipynb']

*Note:* combine with a simple comprehension to filter by file type!

In [33]:
[file for file in os.listdir() if file.endswith('.ipynb')][:5]

['3_visualizing_data.ipynb',
 'exercises.ipynb',
 '1_opening_files.ipynb',
 '0_python_basics.ipynb',
 '4_web_scraping.ipynb']

### Change working directory

In [70]:
os.chdir(r'G:\My Drive\Work\Programming\Active\PythonAccountingResearch')

*Note:* `r'path'` indicates a raw string  
A raw string does not treat `\` as an escape character
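
To avoid typing path separators by hand, paths can be built and taken apart with `os.path`. A minimal sketch; the folder and file names are hypothetical and nothing is read from disk:

```python
import os

# Join path components using the separator of the current operating system
path = os.path.join('data', 'raw', 'survey.csv')   # hypothetical path

# Split the path into its file name and extension
file_name = os.path.basename(path)                 # 'survey.csv'
name, extension = os.path.splitext(file_name)      # 'survey', '.csv'

print(path, name, extension)
```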

## <span style="text-decoration: underline;">File Input/Output</span><a id='files'></a> [(to top)](#toc)

You can open a file with different file modes:  
`w` -> write only  
`r` -> read only  
`w+` -> read and write + completely overwrite file   
`a+` -> read and write + append at the bottom


In [71]:
with open('new_file.txt', 'w') as file:
    file.write('Content of new file. \nHi there!')

In [72]:
with open('new_file.txt', 'r') as file:
    file_content = file.read()

In [73]:
file_content

'Content of new file. \nHi there!'

In [74]:
print(file_content)

Content of new file. 
Hi there!


In [75]:
with open('new_file.txt', 'a+') as file:
    file.write('\n' + 'New line')

In [76]:
with open('new_file.txt', 'r') as file:
    print(file.read())

Content of new file. 
Hi there!
New line


*Note:* using `with` is best as it automatically closes the file
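
For larger files it can be convenient to read one line at a time instead of loading everything with `.read()`. The sketch below writes a small file into a temporary folder first, so it does not touch `new_file.txt`; the file name `example.txt` is made up.

```python
import os
import tempfile

# Create a small example file in a temporary folder
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'example.txt')

with open(path, 'w') as file:
    file.write('first\nsecond\nthird')

# Iterating over the file object yields one line at a time
lines = []
with open(path, 'r') as file:
    for line in file:
        lines.append(line.strip())   # strip() removes the trailing newline

print(lines)  # ['first', 'second', 'third']
```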