# Python - an introduction

## Disclaimer

* All trainers are volunteers.
* We are not experts, just regular Python users (for science, data processing...).
* We'll do our best, but we certainly won't have answers to all questions.

Practice, practice, practice...

## Why Python ?

* Free        (open-source)
* Transparent (open-source)
* Reliable and advanced
* Readable    (indentation)
* Very portable and modular
* Large scientific community of users
    * Lots of libraries for science
    * Scientific needs and culture

## Inspired by Matlab, but with a different approach


![Ecosystems](img/Intro_Ecosystem.png)

## A good package manager is needed !

A package manager cross-checks compatibility between libraries and identifies dependencies.
It knows what you have installed, and has access to an online repository to download additional libraries, updates...

* Package managers for *linux*: apt-get, yum...
* Package managers for *python* : pip, conda...

## A good package manager is needed !

**PIP**
* The 'standard' python package manager
* Downloads packages from *Pypi* (https://pypi.python.org/pypi), as pre-built wheels when available, otherwise as source (i.e.: not compiled) files
* Recommended for advanced users

**CONDA**
* Python package manager of the Anaconda distribution.
* Anaconda = pre-set Python distribution with all common libraries
* Downloads pre-compiled files from https://anaconda.org/ 
* Recommended for most (installed on intra)

## Python installation (using Anaconda, on linux)

Go to https://docs.anaconda.com/anaconda/ 

Download the Anaconda installer (bash script) and execute it.

Anaconda installs all common libraries.
It also installs *conda*, in case you need to add other libraries.

**Good news**:
All of this is already done on *intra*
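
Whatever the installation, you can check from Python itself which version is running and whether a library is available, before asking the package manager for it. A minimal sketch, using only the standard library; `is_installed` is a hypothetical helper name chosen for this example, and the three library names are just examples:

```python
import sys
from importlib.util import find_spec


def is_installed(name):
    """Hypothetical helper: True if a top-level module `name` can be found."""
    return find_spec(name) is not None


# Which Python is running?
print("Python version:", sys.version_info.major, sys.version_info.minor)

# Which of these common libraries can be imported here?
for name in ("numpy", "scipy", "matplotlib"):
    status = "available" if is_installed(name) else "missing"
    print(name, status)
```

If a library shows up as missing, that is when *conda* (or *pip*) comes in.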

## Python 2 or 3 ?

* **Python 2**:
    * The historic heavyweight, lots of libraries, robust, widespread...
    * Python 2.7 is the last version
* **Python 3**: 
    * The developers wanted to implement improvements in the next version of Python that would break backward compatibility => they started Python 3
    * In practice, the differences are minor (print is a function instead of a statement, character strings are encoded differently...)
    * Most libraries are now compatible with Python 3 too, and this trend will strengthen in the future
    * Python 3.6 is the latest stable version, 3.7 is coming...
    
If you are starting Python now => Use Python 3
(+ most IRFM-specific libraries are Python 3-only, e.g.: pywed)
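
To give an idea of the differences listed above, here is a short Python 3 sketch. The division behaviour is an extra, well-known difference that is not in the list; the variable names are only illustrative.

```python
# print is a function: parentheses are required, keyword options are available
print("a", "b", sep="-")

# '/' is always a true division, '//' is the floor division
# (in Python 2, 1/2 between two ints gave 0)
half = 1 / 2
floor_half = 1 // 2
print(half, floor_half)

# Text (str, unicode) and raw bytes are two distinct types
text = "caf\u00e9"            # 4 characters
raw = text.encode("utf-8")    # 5 bytes, the accented letter takes 2
back = raw.decode("utf-8")
print(type(text), type(raw), len(text), len(raw))
```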

## What can you do with Python ?


More or less everything... the question is rather *which library shall I use for my needs?*

Indeed, there are plenty of libraries already existing that probably do what you want.

Examples
---------
* **numpy** : vectorized computation (large matrices and operations on them)
* **scipy** : scientific computing (compatible with numpy, provides advanced functions like fft, minimization, interpolation...)
* **pandas** : handle large tables with heterogeneous data
* **Sympy** : symbolic maths
* **matplotlib** : publication-quality data plotting (2D), matlab-inspired
* **mayavi**, **vispy**, **paraview**... : 3D plotting, large datasets...
* **bokeh** : web-based visualization
* **scikit-learn** : machine learning
* **scikit-image**: image processing
* **warnings** : implement warnings in codes
* **os** : perform bash operations from python
* **pexpect** : handle os operations that expect answers (e.g.: ssh...)
* **Cython** : C-like optimization of Python code
* ...
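
Two of the libraries above, **warnings** and **os**, ship with Python itself, so they can be tried right away. A minimal sketch; `safe_sqrt` is a hypothetical function written for this example, not part of any library:

```python
import os
import warnings


def safe_sqrt(x):
    """Hypothetical helper: square root that warns on negative input."""
    if x < 0:
        # Emit a warning instead of stopping the program
        warnings.warn("negative input, using its absolute value")
        x = -x
    return x ** 0.5


print(safe_sqrt(-4.0))

# os: query the operating system from Python
cwd = os.getcwd()                    # current working directory
print(os.path.basename(cwd))         # last part of the path
print(len(os.listdir(cwd)), "entries in the current directory")
```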

## Python is now mature

It took time but Python can now be considered mature (number of libraries, wide community of users, harmonized good practices...).

Reference guide for good coding practices in Python (PEPs):
https://www.python.org/dev/peps/

Useful sources of info:
* Forum with most answers: https://stackoverflow.com/
* Online free video Python lessons by INRIA (themed) : https://www.youtube.com/channel/UCWpkVtH93qQ5JpSZEwONjGA
* Official documentation of each library : http://www.numpy.org/, https://matplotlib.org/ ...
* Julien Hillairet's crash course: https://github.com/jhillairet/Python_Course_For_Fusion 
* Python Data Science Handbook : https://jakevdp.github.io/PythonDataScienceHandbook/ 
* Numerical Python : https://www.apress.com/us/book/9781484205549


## How shall I work with Python?

* **IDE** : Integrated Development Environment
    * Provides an all-in-one solution: editor, console, debugger, variable tracking...
    * Matlab-like
    * Very practical but heavy
    * The main IDEs for Python are **spyder** (matlab-inspired) and **pycharm** (python-optimized)

* **Editor + IPython console**:
    * Edit your code in the editor, execute it in the console, track variables with prints, use the built-in debugger...
    * The IPython console natively provides several features of IDEs (completion, debugger...) + magic words
    * The "good old way", lightweight and more portable    

In both cases, it is highly recommended to know the **IPython console** !

## IPython console


IPython console = Python console... but better!

"I" => *interactive* shell for Python

* Tab completion
* Inline help
* Syntax highlighting
* Magic functions (fast shortcuts)
* Built-in debugger
* System shell commands
* ...

Tutorial: http://ipython.readthedocs.io/en/stable/interactive/tutorial.html 

## Getting prepared

* Connect to intra (nashira, sirrah, spica...) with your Unix account
* Open a terminal
* Load the imas module (which loads imas but also Anaconda/Python3.6):
    > module load imas
* Start an IPython console:
    > ipython

## Inputs, output

The console opens in the terminal (like *matlab -nodesktop*).
The *input prompt* is green and has a number.

Let's assign a value to variable a, and multiply a by 2:

In [2]:
a = 0

In [18]:
2*a

0

When relevant, a numbered *output prompt* appears, showing the result.
Numbers are used to get back the corresponding input / output, with the *_* syntax:

In [23]:
_i18

'2*a'

In [24]:
_18

0

## Magic words


IPython has a lot of magic words that make everyday coding more efficient:
https://ipython.org/ipython-doc/3/interactive/magics.html


In [32]:
# Timing a one-line operation
%timeit 2*3

The slowest run took 118.38 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 17.5 ns per loop
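
Magic words only exist in IPython. In a plain Python script, a comparable measurement can be sketched with the standard `timeit` module (the loop counts below are arbitrary choices for this example):

```python
import timeit

# Run the statement n_loops times and get the total time in seconds
n_loops = 100000
total = timeit.timeit("2*3", number=n_loops)
print("%.1f ns per loop" % (1e9 * total / n_loops))

# repeat() returns several measurements; the minimum is the usual one to report
timings = timeit.repeat("sum(range(100))", repeat=3, number=10000)
best = min(timings)
print("best of 3: %.2f us per loop" % (1e6 * best / 10000))
```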


In [35]:
# History of visited directories
%dhist

Directory history (kept in _dh)
0: /home/admlocal/Bureau/FormationPython


In [37]:
# Open a temporary file in your editor, then execute it as a python script when you close it
%edit

IPython will make a temporary file named: /tmp/ipython_edit_gbdp1ajr/ipython_edit_be4dg6uh.py


In [39]:
# List currently available magic functions
%lsmagic

Available line magics:
%alias  %alias_magic  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %popd  %pprint  %precision  %profile  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%perl  %%prun  %%pypy  %%python  %%python2  %%python3

## Shell commands

The "!" syntax allows you to pass commands to the underlying shell

In [2]:
# list content of current directory
!ls

01_Python_Introduction.ipynb	    Cython  Python_Course_For_Fusion
01_Python_Introduction.slides.html  img


In [3]:
# get current working directory
!pwd

/home/admlocal/Bureau/FormationPython
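
The "!" syntax only works in IPython. In a regular Python script, the same commands can be run with the standard `subprocess` module. A minimal sketch, assuming a Unix-like system where `ls` and `pwd` exist:

```python
import subprocess

# Equivalent of "!ls": run the command and capture what it prints
result = subprocess.run(
    ["ls"],
    stdout=subprocess.PIPE,
    universal_newlines=True,   # get text instead of bytes
)
files = result.stdout.splitlines()
print(files)

# Equivalent of "!pwd"
cwd_result = subprocess.run(
    ["pwd"],
    stdout=subprocess.PIPE,
    universal_newlines=True,
)
print(cwd_result.stdout.strip())
```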


**Exit the ipython console**

In [None]:
exit()

## Inline help

If you don't know what a variable is, or what a function does, or which arguments it takes, the ipython console provides several ways to get help or just generic info:

In [5]:
s = "this is a string"
a = 0
def f(x):
    """ This is my documentation """
    X = 2.*x + 1
    return X

In [8]:
# The type function
type(a)
type(s)
type(f)

# The ? syntax
a?
s?
f?

# The ?? syntax (get source code)
a??
s??
f??

# The print(<>.__doc__) command
print(a.__doc__)
print(s.__doc__)
print(f.__doc__)

# The help() function
help(a)
help(s)
help(f)

int(x=0) -> integer
int(x, base=10) -> integer

Convert a number or string to an integer, or return 0 if no arguments
are given.  If x is a number, return x.__int__().  For floating point
numbers, this truncates towards zero.

If x is not a number or if base is given, then x must be a string,
bytes, or bytearray instance representing an integer literal in the
given base.  The literal can be preceded by '+' or '-' and be surrounded
by whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.
Base 0 means to interpret the base from the string as an integer literal.
>>> int('0b100', base=0)
4
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to sys.getdefaultencoding().
errors defaults to 'strict'.
...
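
The *?* and *??* syntaxes are IPython-only. In a script, similar information can be obtained with the standard `inspect` module. A small sketch, with a function `f` redefined here so the example is self-contained:

```python
import inspect


def f(x):
    """Return 2*x + 1."""
    return 2.0 * x + 1


print(type(f).__name__)        # what kind of object is it?
print(inspect.signature(f))    # which arguments does it take?
print(inspect.getdoc(f))       # what does its documentation say?
```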

## Multiple assignments

Multiple values can be assigned to multiple variables in a one-liner:

In [9]:
a, s, f = 0, 'string', lambda x:2*x

In [10]:
print(a)
print(s)
print(f)

0
string
<function <lambda> at 0x7fd1e0818840>
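
The same mechanism is handy in a few common situations: swapping two variables, splitting a list, or getting several results back from a function. A short sketch (`min_max` is a hypothetical function written for this example):

```python
# Swap two variables without a temporary one
x, y = 1, 2
x, y = y, x
print(x, y)

# Take the first element and collect the others in a list
first, *rest = [10, 20, 30]
print(first, rest)


def min_max(values):
    """Hypothetical helper: return two results at once (as a tuple)."""
    return min(values), max(values)


lo, hi = min_max([3, 1, 2])
print(lo, hi)
```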


## Some built-in types: int, float, list, tuple, dict

* **Numbers** (conversions are automatic)
    * int (no size limit; the separate *long* type only exists in Python 2)
    * float (~15 significant digits)
    * complex (j)
    
    
* **Character strings** (iterable, immutable):
    * ''
    * ""
    * """ """


* **Iterables** (heterogeneous)
    * list  (mutable)
    * tuple (immutable)
    * dict  (key-value pairs)


* **functions**
* **classes**: methods, attributes...
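
The statements about numbers in the list above can be checked quickly. A minimal sketch, assuming Python 3 and standard double-precision floats:

```python
import sys

# int: no size limit, the value simply grows as needed
big = 2 ** 100
print(big, big > sys.maxsize)

# float: about 15 reliable decimal digits
print(sys.float_info.dig)
print(0.1 + 0.2 == 0.3)       # False: rounding shows up in the last digits

# complex: written with j, with real and imaginary parts as attributes
z = 1. + 2.j
print(z.real, z.imag, abs(3 + 4j))
```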

## Numbers

In [13]:
# int
ai = 10
# float
af = 4.
# complex
ac = 1. + 2.j

In [17]:
# Integer division
print('ai//int(af)', type(ai//int(af)), ai//int(af))
# Automatic conversion
print('ai/af      ', type(ai/af), ai/af)
print('ai + af    ', type(ai+af), ai+af)
print('ac + af    ', type(ac+af), ac+af)

ai//int(af) <class 'int'> 2
ai/af       <class 'float'> 2.5
ai + af     <class 'float'> 14.0
ac + af     <class 'complex'> (5+2j)


## Strings

'' and "" are interchangeable.
The two formats exist so you can embed one in the other.

In [23]:
s0 = "Python doesn't mix up the two types"
s1 = 'Python doesn"t mix up the two types'

Python is object-oriented, all of these (numbers, strings...) are python objects.

They have attributes and built-in methods that can be accessed via the dot '.' syntax

In [25]:
s0 = s0.replace('two','three')
print(s0)
print(s0+s1)

Python doesn't mix up the three types
Python doesn't mix up the three typesPython doesn"t mix up the two types
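
The third string syntax, triple quotes, has no example above. Here is a small sketch of it, together with the backslash escape as an alternative way to embed a quote (the variable names `s2` and `s3` are just for this example):

```python
# Triple quotes: may contain both quote types and span several lines
s2 = """A triple-quoted string can contain 'single' and "double" quotes,
and it can span several lines."""
print(s2)
print(len(s2.splitlines()))

# A backslash escapes a quote inside a string of the same quote type
s3 = 'Python doesn\'t need both quote types, but they help'
print(s3)
```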


## Play with char strings

Strings are a very useful / flexible object in Python.
They come with a lot of built-in methods to facilitate manipulation.

In [9]:
# Create a string
s0 = 'Python_Irfm_01'

# Make all upper case
print(s0.upper())

# Parse / split
l0 = s0.split('_')
print(l0)

PYTHON_IRFM_01
['Python', 'Irfm', '01']


**Exercise:** find a quick way to replace underscores ('\_') with spaces in s0

In [10]:
s1 = s0.replace('_',' ')
print(s1)

Python Irfm 01


Don't hesitate to explore all methods, the one you need is probably already there...
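
One way to explore them outside of tab completion is the built-in `dir()` function. A small sketch that lists the public string methods and searches them by name:

```python
# All public (non-underscore) methods of the str type
methods = [name for name in dir(str) if not name.startswith('_')]
print(len(methods))
print(methods[:5])

# Look for a method by a piece of its name
candidates = [name for name in methods if 'strip' in name]
print(candidates)
```

Once a candidate is found, `help(str.strip)` (or `str.strip?` in IPython) tells you what it does.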

## Iterables

Iterables are Python objects that have the '\__iter\__' attribute, which means they contain several elements and you can iterate over them (strings are iterable).

In [7]:
s0 = 'python_irfm_blablabla'
print(hasattr(s0,'__iter__'))

# Get individual elements
print(s0[0], s0[1], s0[-1])

# Make all upper case
print(s0.upper())

True
p y a
PYTHON_IRFM_BLABLABLA
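
"Iterating" in practice usually means a *for* loop. A minimal sketch on a short string (the variable names are illustrative):

```python
word = 'irfm'

# A for loop visits the elements one by one
for letter in word:
    print(letter)

# Under the hood, iter() creates an iterator and next() advances it
it = iter(word)
first = next(it)
second = next(it)
print(first, second)

# enumerate() gives the index along with each element
pairs = list(enumerate(word))
print(pairs)
```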


Some are **mutable** (i.e.: you can change their content after they have been defined), some **immutable**.

**lists** []

They are the most flexible / common type of iterable.
They are mutable and can contain heterogeneous objects.

In [26]:
la = [0, 'a', ['4',None], (3,'r')]
print(la)

[0, 'a', ['4', None], (3, 'r')]


In [27]:
la[0] = 5
la[2] = ['1']
print(la)

[5, 'a', ['1'], (3, 'r')]
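
For comparison, here is a sketch of what happens with the immutable types (tuple, string), plus a dict, which is mutable too. The dict keys and values are made up for this example:

```python
# list: mutable, can be changed in place
la = [5, 'a', ['1'], (3, 'r')]
la.append('new')

# tuple: immutable, item assignment raises a TypeError
ta = (0, 'a')
try:
    ta[0] = 5
except TypeError as err:
    print('tuple:', err)

# string: immutable too, so build a new string instead
s0 = 'python'
try:
    s0[0] = 'P'
except TypeError as err:
    print('str:', err)
s0 = 'P' + s0[1:]
print(s0)

# dict: mutable, values are accessed by key instead of by position
da = {'shot': 12345, 'name': 'test'}   # hypothetical content
da['shot'] = 54321
print(da)
```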


## Useful generic commands

I don't know what a variable is: *type*, *?*, *??*

In [29]:
# Get type of variable
type(a)

int

In [30]:
# Get help
a?

# Get source code (if function)
a??

In [31]:
# Get length of a
len(a)

TypeError: object of type 'int' has no len()
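
The error above is expected: `len()` only works on objects that have a length (strings, lists, tuples, dicts...). A short sketch of how this can be handled in a script; `safe_len` is a hypothetical helper written for this example:

```python
def safe_len(obj):
    """Hypothetical helper: length of obj, or None if it has no length."""
    try:
        return len(obj)
    except TypeError:
        return None


print(safe_len(0))            # an int has no length
print(safe_len('string'))
print(safe_len([1, 2, 3]))

# Objects with a length have the '__len__' attribute
print(hasattr(0, '__len__'), hasattr('string', '__len__'))
```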

## Importing libraries

Each library needs to be imported in the python console / script, with the *import* command.
You can then use it. The name of the library serves as a namespace (i.e.: it gives access to other functions / objects / attributes using the dot '.' syntax):

In [9]:
import numpy
a = numpy.cos(5)
print(a)

0.283662185463


For convenience, you can rename a library when you import it (give it any short name you want, but in practice everyone uses the same aliases for the most common libraries):

In [10]:
import numpy as np
a = np.cos(5)
print(a)

0.283662185463
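
The import syntax is the same for every library. Here is a sketch of the main variants using the standard `math` module, chosen only because it is always available (the alias `m` is arbitrary, not a convention):

```python
# Import the whole library, use its name as namespace
import math
print(math.cos(5))

# Import only selected names, usable without the namespace
from math import cos, pi
print(cos(pi))

# Import under a shorter alias
import math as m
print(m.sqrt(16))
```

Importing selected names is shorter to type, but keeping the namespace makes it obvious where each function comes from.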
