## Python 3 - what to take into account

*Largely inspired by http://python-3-for-scientists.readthedocs.io/en/latest/python3_user_features.html*

### Integer division is no longer the default

In [1]:
2/3

0.6666666666666666

In [2]:
2//3

0
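
Two related details are easy to overlook: `/` returns a float even when the result is a whole number, and `//` rounds toward negative infinity rather than toward zero. A minimal sketch, with variable names chosen only for illustration:

```python
# True division always yields a float, even for a whole result
whole = 4 / 2

# Floor division rounds toward negative infinity, not toward zero
neg_floor = -7 // 2

# divmod returns the floor quotient and the remainder in one call
quotient, remainder = divmod(7, 2)

print(whole, neg_floor, quotient, remainder)
```

The last line prints `2.0 -4 3 1`.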

### Recursive search for filenames in a directory and its subdirectories

In [12]:
import os
import glob

Find all Excel files inside my GitHub repos, searching recursively through the different repositories:

In [17]:
glob.glob(os.path.join("/home/stijn_vanhoey/githubs", "**", "*.xlsx"), recursive=True)

['/home/stijn_vanhoey/githubs/inbo_alien-species-checklist/data/raw/rinse/neobiota-023-065-s001.xlsx',
 '/home/stijn_vanhoey/githubs/inbo_alien-species-checklist/data/raw/fishes/ExoticFishSpeciesFlanders.xlsx',
 '/home/stijn_vanhoey/githubs/inbo_alien-species-checklist/data/raw/rinse-annex-b/AnnexB RINSE Registry of NNS.xlsx',
 '/home/stijn_vanhoey/githubs/inbo_alien-species-checklist/data/raw/wrims/WRIMS_distributions_20151005-Belgium-extraction.xlsx',
 '/home/stijn_vanhoey/githubs/inbo_alien-species-checklist/src/grafiek_pathways/pathways_tabel.xlsx',
 '/home/stijn_vanhoey/githubs/forks/DataAnalysis/data/NH4MetingenDries.xlsx',
 '/home/stijn_vanhoey/githubs/personal_open_data_showcases/vehicle_registration_example/Registration transactions 2014_tcm466-262543.xlsx',
 '/home/stijn_vanhoey/githubs/temp/alien-species-checklist/source-datasets/rinse/neobiota-023-065-s001.xlsx',
 '/home/stijn_vanhoey/githubs/temp/alien-species-checklist/source-datasets/fishes/ExoticFishSpeciesFlanders.xlsx
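
The same recursive search can probably be written with `pathlib`, where `Path.rglob` is shorthand for a `**/` pattern. The sketch below builds a throwaway directory tree so it can run anywhere; the folder and file names (`repo_a`, `species.xlsx`, ...) are made up for the example.

```python
import tempfile
from pathlib import Path

# Build a small temporary directory tree so the example is self-contained
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "repo_a" / "data").mkdir(parents=True)
    (root / "repo_a" / "data" / "species.xlsx").touch()
    (root / "notes.xlsx").touch()
    (root / "readme.txt").touch()

    # rglob("*.xlsx") searches root and all of its subdirectories
    found = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.xlsx"))

print(found)
```

Only the two `.xlsx` files are reported; `readme.txt` is skipped.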

### Clearing lists now works as for dictionaries

Dictionaries have a `clear` method:

In [23]:
d = {'flux': 1}
d.clear()
d

{}

The same method is now available for lists:

In [24]:
d = ["bird", "plant", "mammal"]
d.clear()
d

[]
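
One reason to prefer `clear` over assigning a new empty list is that it empties the existing object in place, so every other reference to that list sees the change. A small sketch of the difference, using illustrative names:

```python
species = ["bird", "plant", "mammal"]
alias = species              # second name for the same list object
species.clear()              # empties the list in place
alias_after_clear = list(alias)

species = ["bird", "plant", "mammal"]
alias = species
species = []                 # rebinding: alias still points at the old list
alias_after_rebind = list(alias)

print(alias_after_clear)
print(alias_after_rebind)
```

After `clear()` the alias is empty as well, whereas after rebinding it still holds the three original items.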

### Print is a function

`print` is a function. It allows you to customize aspects such as which separator to use between the printed values, and whether to move to the next line after each `print` call:

In [26]:
a, b = 1, 2
print(a, b)

1 2


Setting a custom separator and line ending:

In [70]:
a, b = 1, 2
print(a, b, sep="\t,", end=";\n")

1	,2;


`print` can also write directly to a file:

In [101]:
with open('data.txt', 'w') as f:
    print(a, b, file=f)

is equivalent to e.g. (note the explicit newline, which `print` adds by default)

In [65]:
with open('data.txt', 'w') as f:
    f.write("{} {}\n".format(a, b))
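
To see exactly what `print` writes, the `file` argument can be pointed at an in-memory buffer instead of a real file. This is only a sketch: `io.StringIO` stands in for the open file, and the name `buffer` is arbitrary.

```python
import io

a, b = 1, 2
buffer = io.StringIO()                       # in-memory stand-in for a text file
print(a, b, file=buffer)                     # default sep=" " and end="\n"
print(a, b, sep=",", end="", file=buffer)    # custom separator, no newline

content = buffer.getvalue()
print(repr(content))
```

The buffer ends up holding `'1 2\n1,2'`: the first call appended a newline, the second did not.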

### Unpacking of variables

In [66]:
a, b, *rest = range(5)
print(a, b, rest, sep="\t;\t")

0	;	1	;	[2, 3, 4]


In [67]:
*rest, a, b = range(5)
print(a, b, rest, sep="\t;\t")

3	;	4	;	[0, 1, 2]


You can use this to pick out specific lines of a data file:

In [75]:
with open('data.txt', 'w') as f:
    a, b, c, *rest  = range(8)
    print(a, b, c, rest, file=f, sep="\n")

In [82]:
with open('data.txt', 'r') as f:
    first, second, *rest, last = f.readlines()
print(first, second, last)

0
 1
 [3, 4, 5, 6, 7]
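
Starred unpacking is also convenient for separating a header from the remaining rows, and it works in the target of a `for` loop. The lines below are hypothetical file content, kept in a list so the sketch runs without touching the disk:

```python
lines = ["station,flux", "A,1.2", "B,3.4", "C,5.6"]   # hypothetical file content

header, *rows = lines                 # first line versus everything else
columns = header.split(",")

# Unpacking in a for loop: first field is the name, the rest are values
records = []
for name, *values in (row.split(",") for row in rows):
    records.append((name, [float(v) for v in values]))

print(columns)
print(records)
```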



### String interpolation

In [86]:
a, b = 10, 20
'The sum of the values is {}.'.format(a+b)

'The sum of the values is 30.'

From Python 3.6 onwards, formatted string literals (f-strings) are available as well, e.g. `f'The sum of the values is {a+b}.'`.
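
A minimal sketch of the Python 3.6 formatted string literals (f-strings, PEP 498), reusing the values from the cell above. Expressions inside the braces are evaluated in place, and the usual format specifiers apply after a colon. The names `message` and `ratio` are only illustrative.

```python
a, b = 10, 20

# The expression inside the braces is evaluated when the string is created
message = f'The sum of the values is {a + b}.'

# Format specifiers work as in str.format
ratio = f'{a / b:.2f}'

print(message)
print(ratio)
```

This prints `The sum of the values is 30.` followed by `0.50`.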

### Unicode

Unicode strings are the default in Python 3. 

In [88]:
s4 = "unicode strings are great! 😍"

In [89]:
print(s4)

unicode strings are great! 😍
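
A practical consequence is the strict split between text (`str`) and binary data (`bytes`): converting between them needs an explicit encoding. A short sketch, assuming UTF-8 as the encoding:

```python
text = "unicode strings are great! 😍"
encoded = text.encode("utf-8")         # str -> bytes

# len() counts code points for str and bytes for the encoded form
print(type(text).__name__, len(text))
print(type(encoded).__name__, len(encoded))

# Decoding with the same encoding gives the original text back
round_trip = encoded.decode("utf-8")
print(round_trip == text)
```

The encoded form is longer than the string because the emoji is one code point but takes four bytes in UTF-8.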


### Error catching

In [96]:
try:
    f = open('data.txt')
except FileNotFoundError:
    print("File not found...")

In [95]:
try:
    f = open('data_2.txt')
except FileNotFoundError:
    print("File not found...")

File not found...


Other new exception types are described in the Python 3.3 release notes:
https://docs.python.org/3/whatsnew/3.3.html#pep-3151-reworking-the-os-and-io-exception-hierarchy
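
Under the reworked hierarchy, the specific exceptions are subclasses of `OSError`, so older code that catches the general error keeps working. The sketch below looks for a file inside a fresh temporary directory, so it is guaranteed to be missing; the file name is hypothetical.

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing.txt")   # hypothetical, never created
    try:
        open(missing)
    except OSError as exc:                       # parent of FileNotFoundError
        caught_type = type(exc).__name__
        is_enoent = exc.errno == errno.ENOENT

print(caught_type, is_enoent)
print(issubclass(FileNotFoundError, OSError))
print(IOError is OSError)    # IOError is kept as an alias of OSError
```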

### Function annotations

Annotate the data types that go in and out of a function:

In [98]:
def remove_spaces(x: str) -> str:
    return x.replace(' ', '')

In [99]:
remove_spaces("jan mieke")

'janmieke'

Annotations are not enforced at runtime, so for now they mainly serve as documentation for developers. Check http://mypy-lang.org/ for a static type checker that actually verifies them.
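
To illustrate that the interpreter stores annotations but does not act on them, here is a small sketch. The function `double` is made up for this example.

```python
def double(x: int) -> int:
    return x * 2

# Annotations are stored on the function object
annotations = double.__annotations__
print(annotations)

# No runtime check: a str is accepted even though int was annotated
result = double("ab")
print(result)
```

The call with a string runs without error and returns `'abab'`; a static checker would be expected to flag it before the code runs.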

### Closing...

In [102]:
# remove the dummy data file
os.remove("data.txt")