<span>
<img src="http://www.sobigdata.eu/sites/default/files/logo-SoBigData-DEFINITIVO.png" width="180px" align="right"/>
</span>
<span>
<b>Author:</b> <a href="http://about.giuliorossetti.net">Giulio Rossetti</a><br/>
<b>Python version:</b>  3.6<br/>
<b>Last update:</b> 22/01/2018
</span>

<a id='top'></a>
# *Python Basics*

This notebook gives an overview of the basic Python functionality you are likely to come across when using Python for Social Science research.

**Note:** this notebook is purposely not 100% comprehensive: it only discusses the basic things you need to get started.

## Table of Contents

1. [Hello World!](#display)
2. [Variables](#variables)
3. [Numeric Operations](#numeric)
4. [Strings](#strings)
5. [Data Structures](#ds)
6. [Slicing](#slicing)
7. [Functions](#functions)
8. [Code Blocks](#blocks)
9. [Flow Control](#flow)
10. [Comprehensions](#comp)
11. [Exceptions](#exceptions)
12. [File Input/Output](#io)
13. [Importing Libraries](#libs)
14. [Zen of Python](#zen)

<a id='display'></a>
## 1. Hello World!  ([to top](#top))

Your first - one line - program: **"Hello World"**

In [1]:
print("Hello world!")

Hello world!


<a id='variables'></a>
## 2. Variables and Types  ([to top](#top))

In programming languages **variables** are named entities used to store information.

A **variable** can reference several types of contents, for instance:
- Basic numeric **types** in Python are ``int`` for integers and ``float`` for floating point numbers.
- Strings are represented by ``str``; in Python 3.x a string is a sequence of Unicode characters.


In [2]:
a = 1
b = 0.5
c = 'Giulio'

In [3]:
type(a), type(b), type(c) # check variable type

(int, float, str)

In [4]:
isinstance(a, float) # test variable type

False

In [5]:
int(2.5), str(2), float(3) # type conversion 

(2, '2', 3.0)

In [6]:
print("Have a nice class,", c)

Have a nice class, Giulio


<a id='numeric'></a>
## 3. Numeric operations  ([to top](#top))

The Python interpreter can perform simple math:

In [7]:
5+2 # addition (subtraction works alike)

7

In [8]:
5/2 # division

2.5

In [9]:
5%2 # modulo (remainder of the division)

1

In [10]:
5//2 # integer division

2

In [10]:
5**2 # exponentiation

25
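
Integer division and modulo are two halves of the same operation: for integers, the quotient times the divisor plus the remainder gives back the original number. A minimal sketch of this relation, using the built-in `divmod` (the values of `a` and `b` are illustrative and not part of the original notebook):

```python
a, b = 17, 5  # illustrative values (assumption, not from the notebook)

# divmod returns the pair (a // b, a % b) in one call
quotient, remainder = divmod(a, b)
print(quotient, remainder)  # 3 2

# The two results recombine into the original number
print(quotient * b + remainder == a)  # True
```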

<a id='strings'></a>
## 4. String basics  ([to top](#top))

Strings are used to describe basic text objects: as such they can be manipulated and transformed.

**Note:** strings are *immutable*: any transformation generates a modified *copy* and leaves the original string unchanged.

In [11]:
s = "This is a string"

In [12]:
l = s.split(" ")
l

['This', 'is', 'a', 'string']

In [13]:
r = s.replace("a", "THE")
r

'This is THE string'

In [14]:
s.lower()

'this is a string'

In [15]:
s.upper()

'THIS IS A STRING'
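
The immutability mentioned in the note above can be checked directly. The following sketch (variable names are illustrative) shows that a string method returns a new object, while trying to change a character in place raises a `TypeError`:

```python
s = "This is a string"

upper = s.upper()  # returns a new string
print(upper)       # THIS IS A STRING
print(s)           # This is a string (the original is unchanged)

# Item assignment is not supported by str objects
try:
    s[0] = "t"
except TypeError as err:
    print("TypeError:", err)
```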

In [16]:
name = 'Giulio'
age = 33

'Hi, my name is {}: I am {}'.format(name, age)

'Hi, my name is Giulio: I am 33'

In [17]:
name = 'Giulio'
age = 33

'Hi, my name is %s: I am %d' % (name, age)

'Hi, my name is Giulio: I am 33'
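
Since Python 3.6 a third formatting option is available: formatted string literals (f-strings). As a brief sketch, the same sentence can be built by prefixing the string with `f` and placing the variables directly inside the curly brackets:

```python
name = 'Giulio'
age = 33

# Expressions inside {} are evaluated in place
greeting = f'Hi, my name is {name}: I am {age}'
print(greeting)  # Hi, my name is Giulio: I am 33

# Format specifiers work as in str.format, e.g. two decimal places
half = f'{age / 2:.2f}'
print(half)  # 16.50
```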

<a id='ds'></a>
## 5. Data structures  ([to top](#top))

**Data structures** are conceptual models that are used to *organize* and *structure* data. 

Python has 4 basic built-in data structures: lists (list), tuples (tuple), dictionaries (dict), and sets (set)

### Lists

Lists are **mutable** collections of objects (their contents can change as the program executes).

In Python a list is defined using square brackets

In [16]:
fruits = ['apple', 'banana', 'orange'] 
fruits.append('pineapple')
fruits

['apple', 'banana', 'orange', 'pineapple']
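
Besides `append`, lists support several other in-place changes. The sketch below is only an illustration of what "mutable" means in practice: it replaces, inserts, and removes elements of the same list object.

```python
fruits = ['apple', 'banana', 'orange']

fruits[0] = 'cherry'       # replace an element by index
fruits.insert(1, 'kiwi')   # insert before position 1
fruits.remove('orange')    # remove the first matching value
last = fruits.pop()        # remove and return the last element

print(fruits, last, len(fruits))  # ['cherry', 'kiwi'] banana 2
```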

### Tuples

Tuples are **immutable** objects.

In Python tuples are enclosed in parentheses<br/>
**Note**: You cannot add or remove elements from a tuple, but tuples are generally faster and consume less memory than lists

In [17]:
fruits = ('apple', 'banana', 'orange')
fruits

('apple', 'banana', 'orange')
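
A short sketch of what immutability implies for tuples: elements can be read and unpacked, but any attempt to assign to a position raises a `TypeError`. The variable names are illustrative.

```python
fruits = ('apple', 'banana', 'orange')

# Unpacking assigns each element to its own variable
first, second, third = fruits
print(first, third)  # apple orange

# Tuples reject item assignment
try:
    fruits[0] = 'kiwi'
except TypeError as err:
    print('TypeError:', err)

# A one-element tuple needs a trailing comma
single = ('apple',)
print(type(single), len(single))
```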

### Dictionaries

Dictionaries are **mutable** key-indexable objects.

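In Python dictionaries are built using curly brackets<br/>
**Note**: Dictionaries store key, value pairs and are accessed by key, not by position. Insertion order is guaranteed only from Python 3.7 (in CPython 3.6 it is an implementation detail)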

In [22]:
student = {'name': 'Albert', 
           'age': 21, 
           'department': 'Computer Science' }
print(student['name'], student['age'], student['department'])

Albert 21 Computer Science


In [23]:
student['income'] = 50
del student['age']
student

{'name': 'Albert', 'department': 'Computer Science', 'income': 50}

In [24]:
student.keys()

dict_keys(['name', 'department', 'income'])

In [25]:
'name' in student, 'age' in student

(True, False)

In [26]:
student.values()

dict_values(['Albert', 'Computer Science', 50])

In [27]:
student.items()

dict_items([('name', 'Albert'), ('department', 'Computer Science'), ('income', 50)])
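
Accessing a missing key with square brackets raises a `KeyError`. As a hedged sketch, the `get` method offers a safer lookup with an optional fallback, and `update` merges another dictionary into the current one (the `'city'` entry is a made-up example, not part of the original data):

```python
student = {'name': 'Albert', 'department': 'Computer Science', 'income': 50}

# student['age'] would raise KeyError here, since the key is absent
age = student.get('age')                # None when the key is missing
age_or_default = student.get('age', 0)  # explicit fallback value
print(age, age_or_default)              # None 0

# update() merges another dictionary, overwriting existing keys
student.update({'income': 60, 'city': 'Pisa'})  # 'city' is hypothetical
print(student['income'], student['city'])       # 60 Pisa
```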

### Sets

A set is like a list, but it is unordered and can only hold unique values

In [28]:
fruits1 = set(['apple', 'banana', 'orange'])
fruits2 = set(['kiwi', 'banana', 'melon'])

fruits1, fruits2

({'apple', 'banana', 'orange'}, {'banana', 'kiwi', 'melon'})

In [35]:
fruits1.add('apple') # sets use add(), not append(); an element already present is not duplicated
fruits1

{'apple', 'banana', 'orange'}

Many operations can be efficiently performed using sets

In [29]:
it = fruits1 & fruits2 # intersection
it

{'banana'}

In [30]:
un = fruits1 | fruits2 # union
un

{'apple', 'banana', 'kiwi', 'melon', 'orange'}

In [31]:
diff = fruits1 - fruits2 # difference
diff

{'apple', 'orange'}
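
Two further uses of sets come up often in practice: removing duplicates from a list and testing membership. The sketch below uses an illustrative `basket` list (an assumption, not from the notebook) and also shows the symmetric difference operator `^`:

```python
basket = ['apple', 'banana', 'apple', 'kiwi', 'banana']  # illustrative data

unique = set(basket)             # duplicates collapse
print(len(basket), len(unique))  # 5 3

print('kiwi' in unique)          # True

# Symmetric difference: elements found in exactly one of the two sets
fruits1 = {'apple', 'banana', 'orange'}
fruits2 = {'kiwi', 'banana', 'melon'}
only_one = sorted(fruits1 ^ fruits2)  # sorted() gives a stable, ordered list
print(only_one)  # ['apple', 'kiwi', 'melon', 'orange']
```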

### Combination

Data structures can be combined and nested

In [32]:
combo = ('apple', 'orange')
mix = {'fruit' : ['banana', 'pear'], combo: ('melon', 'kiwi')}
mix

{'fruit': ['banana', 'pear'], ('apple', 'orange'): ('melon', 'kiwi')}
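
Nested structures are accessed by chaining lookups. The sketch below reuses `mix` from the cell above and also illustrates why the tuple `combo` works as a key while a list would not: dictionary keys must be hashable, and mutable lists are not.

```python
combo = ('apple', 'orange')
mix = {'fruit': ['banana', 'pear'], combo: ('melon', 'kiwi')}

print(mix['fruit'][1])  # pear
print(mix[combo][0])    # melon

# Lists are mutable, hence unhashable: they cannot be dictionary keys
try:
    bad = {['apple', 'orange']: 'value'}
except TypeError as err:
    print('TypeError:', err)
```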

<a id='slicing'></a>
## 6. Slicing  ([to top](#top))

If an object is ordered (such as a list or tuple) you can select its elements by index

In [33]:
fruits = ['apple', 'banana', 'orange', 'pineapple', 'pear']

In [34]:
first_fruit = fruits[0]
first_fruit

'apple'

In [35]:
last_fruit = fruits[-1]
last_fruit

'pear'

In [36]:
subset = fruits[1:4] # By convention: left index included, right excluded
subset

['banana', 'orange', 'pineapple']

In [37]:
subset = fruits[:4] 
subset

['apple', 'banana', 'orange', 'pineapple']

**Note**: slicing also works on strings!

In [38]:
s = "Hello world!"
s[6:]

'world!'
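
Slices also accept an optional third value, the step, and negative indices count from the end. A brief sketch on the same data:

```python
fruits = ['apple', 'banana', 'orange', 'pineapple', 'pear']

every_second = fruits[::2]   # start to end, step 2
last_two = fruits[-2:]       # from the second-to-last element onward
reversed_copy = fruits[::-1] # negative step walks backwards

print(every_second)   # ['apple', 'orange', 'pear']
print(last_two)       # ['pineapple', 'pear']
print(reversed_copy)  # ['pear', 'pineapple', 'orange', 'banana', 'apple']

s = "Hello world!"
print(s[::-1])        # !dlrow olleH
```

Slicing always returns a new object, so `fruits` itself is left untouched.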

<a id='functions'></a>
## 7. Functions  ([to top](#top))

A function is a **named** and **reusable** snippet of code.

A function takes *arguments* as input and defines logic to process these inputs (and possibly returns something).

In [39]:
def multiply(a, b):
    return a*b

The action expressed by **multiply** will only execute once you call it:

In [40]:
multiply(2, 3)

6

Functions can also define default values for their arguments:

In [41]:
def multiply(a, b=5):
    return a*b

multiply(2)

10

In [42]:
multiply(2, b=3)

6
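
Two more patterns follow from the same rules. The sketch below uses hypothetical helper functions (`min_max` and `describe` are made up for illustration) to show a function returning several values at once and keyword arguments passed in any order:

```python
def min_max(values):
    """Return the smallest and largest element of a non-empty sequence."""
    return min(values), max(values)  # the two results are packed into a tuple

low, high = min_max([4, 1, 9])
print(low, high)  # 1 9

def describe(name, age=33, city='Pisa'):
    """Build a short description; age and city have default values."""
    return '{} ({}, {})'.format(name, age, city)

# Keyword arguments can be given in any order
print(describe('Giulio', city='Rome', age=40))  # Giulio (40, Rome)
print(describe('Giulio'))                       # Giulio (33, Pisa)
```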

<a id='blocks'></a>
## 8. Code Blocks  ([to top](#top))

In Python, blocks of code can be nested.<br/>
Names defined in an enclosing function can be read by the functions nested inside it; the opposite does not apply.

Python uses indentation to define blocks of code. By convention each indentation level is 4 spaces (tabs and spaces should not be mixed).<br/>
**Note**: functions have their own local *scope* (notice variable *a*), whereas ``if`` and loop blocks do not create a new scope:

In [43]:
def example():
    a = 'Layer 1'
    print(a)
    
    def layer_2():
        a = 'Layer 2'
        print(a)
 
    layer_2()

In [44]:
a = 3
if a > 0:
    print('a')
    if a == 3: 
        print('c')
else:
    print('b')

a
c


In [45]:
example()

Layer 1
Layer 2
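
The visibility rule can be seen from both directions. In this sketch (function and variable names are illustrative), the nested function reads a name from the enclosing one, while the same name is not reachable from outside:

```python
def outer():
    message = 'outer value'

    def inner():
        # inner can read names defined in the enclosing function
        return message.upper()

    return inner()

result = outer()
print(result)  # OUTER VALUE

# Names local to a function are not visible outside it
try:
    print(message)
except NameError as err:
    print('NameError:', err)
```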


<a id='flow'></a>
## 9. Flow Control  ([to top](#top))

Python, like all programming languages, defines primitives for flow control: **conditionals** and **cycles** (loops)

### Conditional: If-Elif-Else

A conditional statement checks **logical** conditions and runs a block of code only when they hold.

Conditions can be:
- mathematical (==, !=, <, >, <=, >=)
- logical (and, or, not, in, is)

In [52]:
age = 25
if age == 20:
    print('A')
elif age < 20:
    print('B')
elif age >= 25:
    print('C')
else:
    print('D')

C


In [57]:
sex = "M"
age = 24
if age in [20, 21, 25] and sex == 'M':
    print("Hello")
    
if age is not None:
    print(age + 6)

30


In [46]:
list(range(0, 6, 2))

[0, 2, 4]

#### None: a special "value"
``None`` is just a value that is commonly used to signify 'empty', or 'no value here'.

A variable can be tested to be ``None`` with the keyword ``is``

In [47]:
name = "Irene"
if name is not None:
    print('Hi {}'.format(name))

Hi Irene
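
Testing with ``is None`` matters when legitimate values such as ``0`` or an empty string are possible, because those also count as false in a plain ``if value:`` check. A small sketch with made-up data (the `ages` dictionary and `find_age` helper are hypothetical):

```python
def find_age(ages, name):
    # dict.get returns None when the name is missing
    return ages.get(name)

ages = {'Irene': 0, 'Albert': 21}  # illustrative data

results = []
for name in ['Irene', 'Albert', 'Carla']:
    age = find_age(ages, name)
    if age is None:          # only true for a missing entry
        results.append((name, 'unknown'))
    else:                    # 0 is a valid age and ends up here
        results.append((name, age))

print(results)  # [('Irene', 0), ('Albert', 21), ('Carla', 'unknown')]
```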


### Cycles: For loops

``for`` cycles are used to iterate over a data structure (e.g., a list). <br/>
Since the size of a given data structure is *finite*, a ``for`` cycle over it *necessarily* ends.

In [60]:
for num in range(0, 6, 2): # range(from, to, step)
    print(num)

0
2
4


In [48]:
list_fruit = ['Apple', 'Banana', 'Orange']
for i, fruit in enumerate(list_fruit):
    print(i, fruit, list_fruit[i])
    
enumerate(list_fruit) # returns a lazy iterator that yields (index, item) pairs

0 Apple Apple
1 Banana Banana
2 Orange Orange


<enumerate at 0x7f6330442ea0>

Looping over a list of tuples

In [49]:
tuple_in_list = [(1, 2), (3, 4)]
for a, b in tuple_in_list:
    print(a + b)

3
7


Looping over a dictionary

In [65]:
dictionary = {'one' : 1, 'two' : 2, 'three' : 3}
for k, v in dictionary.items():
    print(k, v)

one 1
two 2
three 3


### Cycles: While loops

``while`` cycles keep looping as long as a specific condition (called *guard*) holds. <br/>

Unlike ``for`` cycles over a finite data structure, a ``while`` cycle can run forever if its guard never becomes false.

In [None]:
count = 0
while count < 4:
    print(count)
    count += 1
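
A loop whose guard is always true is sometimes written on purpose and then left explicitly with ``break``. The following is a minimal sketch with illustrative data; it assumes that at least one element is greater than 10, otherwise the index would run past the end of the list:

```python
numbers = [3, 8, 15, 4]  # illustrative data (assumption)
index = 0

while True:                   # the guard never becomes false
    if numbers[index] > 10:
        break                 # leave the loop explicitly
    index += 1

print(index, numbers[index])  # 2 15
```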

<a id='comp'></a>
## 10. Comprehensions  ([to top](#top))

Comprehensions make it easier to generate a list or dictionary from a loop in a single expression.

### List comprehension

In [50]:
lst = [x + 2 for x in range(0,6)]
lst

[2, 3, 4, 5, 6, 7]

It is an alternative to:

In [51]:
lst = []
for x in range(0,6):
    lst.append(x + 2)
lst

[2, 3, 4, 5, 6, 7]

### Dict comprehension

In [52]:
ndt = {'num_{}'.format(x) : x + 5 for x in range(0,6)}
ndt

{'num_0': 5, 'num_1': 6, 'num_2': 7, 'num_3': 8, 'num_4': 9, 'num_5': 10}

It is an alternative to:

In [54]:
ndt = {}
for x in range(0,6):
    ndt['num_{}'.format(x)] = x + 5
ndt

{'num_0': 5, 'num_1': 6, 'num_2': 7, 'num_3': 8, 'num_4': 9, 'num_5': 10}

### Comprehension with conditions

In [55]:
lst = [x for x in range(0,6) if x%2==0]
lst

[0, 2, 4]
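
The condition above *filters* elements. To *transform* every element depending on a condition instead, a conditional expression can be placed before the ``for``. The sketch below also shows that the same syntax with curly brackets and no keys builds a set (the word list is illustrative):

```python
# Conditional expression: every element is kept and mapped to a label
labels = ['even' if x % 2 == 0 else 'odd' for x in range(0, 4)]
print(labels)  # ['even', 'odd', 'even', 'odd']

# Set comprehension: duplicates collapse automatically
lengths = {len(word) for word in ['apple', 'kiwi', 'melon', 'pear']}
print(sorted(lengths))  # [4, 5]
```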

<a id='exceptions'></a>
## 11. Exceptions  ([to top](#top))

In Python when something *wrong* happens an exception is raised:

In [56]:
num_list = [1, 2, 3]
num_list.remove(4)

ValueError: list.remove(x): x not in list

You can catch exceptions using try and except:

In [71]:
try:
    num_list.remove(4)
except:
    print('ERROR!')

ERROR!


It is usually best practice to specify the type of exception you want to catch:

In [72]:
try:
    num_list.remove(4)
except ValueError as e:
    print('Number not in the list')
except Exception as e:
    print('Generic error')
finally:
    print('Done')

Number not in the list
Done
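
Exceptions can also be raised by your own code with ``raise``, and the object bound by ``as e`` carries the error message. A short sketch, where `safe_divide` is a hypothetical helper written only for this example:

```python
def safe_divide(a, b):
    """Divide a by b, refusing a zero divisor with an explicit error."""
    if b == 0:
        raise ValueError('b must not be zero')
    return a / b

messages = []
for divisor in [2, 0]:
    try:
        value = safe_divide(1, divisor)
    except ValueError as e:
        messages.append('Caught: {}'.format(e))  # e holds the message
    else:
        messages.append('Result: {}'.format(value))  # runs only if no error

print(messages)  # ['Result: 0.5', 'Caught: b must not be zero']
```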


<a id='io'></a>
## 12. File Input/Output  ([to top](#top))

You can open a file with different file modes: <br/>
w -> write only (creates the file, or overwrites it if it already exists) <br/>
r -> read only <br/>
w+ -> read and write + completely overwrite file <br/>
a+ -> read and write + append at the bottom <br/>

In [None]:
with open('new_file.txt', 'w') as file:
    file.write('Content of new file. \nHi there!')

In [None]:
with open('new_file.txt', 'r') as file:
    file_content = file.read()
    
file_content

In [None]:
print(file_content)

In [None]:
with open('new_file.txt', 'a+') as file:
    file.write('\n' + 'New line')

In [None]:
with open('new_file.txt', 'r') as file:
    for line in file:
        print(line.rstrip('\n'))  # each line already ends with a newline; strip it to avoid blank lines
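
The write, append, and read steps can be combined in one self-contained sketch. To avoid touching real files it works inside a temporary directory from the standard `tempfile` module; the file name `example.txt` is hypothetical.

```python
import os
import tempfile

# Work in a temporary directory that is deleted at the end of the block
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.txt')  # hypothetical file name

    with open(path, 'w') as file:   # create the file and write two lines
        file.write('first\nsecond\n')

    with open(path, 'a') as file:   # append at the bottom
        file.write('third\n')

    with open(path, 'r') as file:   # read line by line, dropping newlines
        lines = [line.rstrip('\n') for line in file]

print(lines)  # ['first', 'second', 'third']
```

The ``with`` statement closes the file automatically when its block ends, even if an exception occurs.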

<a id='libs'></a>
## 13. Importing Libraries  ([to top](#top))

In Python some functionality is left outside the core language: to access it, you need to ``import`` dedicated packages.

A ``package`` is a coherent collection of ``functions`` (and other objects, such as classes and constants).

There are hundreds of thousands of different packages: if you need something, it is likely that someone has already coded and packaged it!

In [None]:
import math
math.sin(1)

In [None]:
import math as mt
mt.sin(1)

In [None]:
from math import sin
sin(1)
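
The three import styles above give access to the very same function object; they differ only in the name you type. A brief check, which also imports the constant ``pi`` from the same module:

```python
import math
import math as mt
from math import sin, pi

same_object = math.sin is mt.sin is sin
print(same_object)  # True

quarter_turn = round(sin(pi / 2), 3)
print(quarter_turn)  # 1.0
```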

To install a new package you can use a command line tool: ``pip``.

Just open a terminal (from Anaconda) and type

    pip install <package name>
  

<a id='zen'></a>
## 14. Zen of Python ([to top](#top))

When you have doubts just remember the "Zen of Python"

In [None]:
import this