# Workshop on Python programming for data analysis in geosciences

Author: Martina Natali (martinanatali@cnr.it)

Full material available at:

https://github.com/martina01natali/Python_Workshops

Run this command to download everything:

`git clone https://github.com/martina01natali/Python_Workshops.git`

Follow me on GitHub! :)

## Notes on how to use Jupyter notebooks (more at the end of this notebook)

Jupyter notebooks are interactive, so you can use each cell (like this block you're reading) as an interactive Python prompt to execute commands and operations.

* run a cell: Ctrl+Enter
* run a cell and go to the next one: Shift+Enter
* new cell: B (when no cell is open, else Esc+B)

---

## Python language

* Python is an interpreted high-level programming language for general-purpose programming. 
* Conceived in the late 1980s by Guido van Rossum in the Netherlands 
* First release in 1991
* Python 2.0 released in 2000
* Python 3.0 released in 2008
   * **Not completely backward-compatible**
   * By 2018, usage statistics finally reported about 25%/75% for Python 2/Python 3 respectively
   * Python 2.7 end-of-life was postponed from 2015 to 2020 because porting many code bases to Python 3 proved difficult
   * **We will mainly focus on Python 3.x during this course** 
   * The `2to3` utility converts code from Python 2 to Python 3 (not completely effective)
* Python evolves through so-called PEPs (Python Enhancement Proposals) https://peps.python.org/
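
To make "not completely backward-compatible" concrete, here is a minimal sketch of three well-known Python 3 behaviours that differ from Python 2. The variable names are only illustrative.

```python
# print is a function in Python 3 (it was a statement in Python 2).
print("Hello", "geosciences", sep=", ")

# "/" is always true division; "//" is floor division.
true_div = 7 / 2     # Python 2 would return 3 for two integers
floor_div = 7 // 2

# Text strings are Unicode by default; bytes are a separate type.
text = "Etna"
raw = text.encode("utf-8")

print(true_div, floor_div, type(text).__name__, type(raw).__name__)
```

Code that relies on the old behaviour of `/` or on mixing text and bytes is typically what needs manual attention when porting.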

## Philosophy

* A common neologism is **pythonic**, i.e. code that uses Python idioms and follows the recommendations of the Python community
* Useful PEPs are PEP 20 and PEP 8, the style guide, which includes examples of well-written, pythonic code: https://peps.python.org/pep-0008/
* The Zen of Python (PEP 20) summarizes Python philosophy through aphorisms such as:

<pre style="font-size: 24px;">
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
</pre>
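
As a small illustration of what "pythonic" can mean in practice, the sketch below builds the same list twice: first with manual index bookkeeping, then with `enumerate`, an f-string and a list comprehension. The depth values are made up for the example.

```python
depths_m = [10.5, 22.0, 35.25]  # hypothetical sample depths in metres

# Less idiomatic: loop over indices and concatenate strings by hand.
labels_loop = []
for i in range(len(depths_m)):
    labels_loop.append("sample " + str(i) + ": " + str(depths_m[i]) + " m")

# More pythonic: iterate over the values directly.
labels = [f"sample {i}: {depth} m" for i, depth in enumerate(depths_m)]

print(labels == labels_loop)
print(labels)
```

Both versions produce the same result; the second is usually considered easier to read, in the spirit of "Readability counts".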

## Running Python scripts via command line (Linux)
To run a Python script, save your code in a file (usually with the `.py` extension) and then run

```shell
python3 script_file_name.py
```

If you want to run the script without typing the name of the interpreter, add a *shebang* line at the very beginning of the file

```python
#!/usr/bin/env python3
```

Replace `python3` with `python2` only if you really need Python 2.

If your code is compatible with both Python 2 and Python 3 (realistic code bases need some work to achieve this), you can use the generic shebang

```python
#!/usr/bin/env python
```

Of course, you also have to make the script executable if you want to call it as `./script_file_name.py`

```shell
chmod +x script_file_name.py
```
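
Putting the pieces together, a script file could look like the sketch below. The function names `build_greetings` and `main` are hypothetical, chosen only for this example; the `if __name__ == "__main__":` guard is the usual way to run code only when the file is executed as a script rather than imported.

```python
#!/usr/bin/env python3
"""Minimal script sketch; save it as script_file_name.py."""
import sys


def build_greetings(names):
    """Return one greeting per name, or a default greeting if none is given."""
    return [f"Hello {name}!" for name in (names or ["World"])]


def main(names):
    """Print the greetings for the names passed on the command line."""
    for greeting in build_greetings(names):
        print(greeting)


if __name__ == "__main__":
    # sys.argv[0] is the script name; the remaining items are the arguments.
    main(sys.argv[1:])
```

After `chmod +x`, calling `./script_file_name.py Etna Stromboli` would greet each name in turn.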

## Jupyter notebook

* You can start it from a terminal by running 

```shell
jupyter notebook
```
    or
```shell
jupyter lab
```

* A browser window opens automatically, showing a browsable view of the directory from which you started Jupyter (your home directory if you started it there)
* You can either
   * open an existing notebook by clicking on the notebook filename (.ipynb extension)
   * start a new notebook (New button in top-right) selecting the kernel from the list of the installed ones
   * open a terminal (New button)

## Using notebooks
* A notebook is composed of cells:
   * **code**: cells containing Python code which can be executed (press y outside of the cell to make it code)
   * **markdown**: cells containing text formatted with the easy Markdown language, which also accepts inline HTML (press m outside of the cell to make it markdown)
    https://guides.github.com/features/mastering-markdown/
   * **raw**: cells with unformatted text (press r outside of the cell to make it raw), which are useful if you want to exclude a specific cell from execution
* The cool feature of *code* cells is that they are executed in place, with the output shown right below the cell
   * with the mouse, press the Run button
   * with the keyboard, use Ctrl+Enter (run), Shift+Enter (run and go to the next cell) or Alt+Enter (run and insert a new cell below)
   * There are several shortcuts that make working with notebooks quicker
      * check the Keyboard Shortcuts under the Help menu

## Environment management

We are working in a Python environment: an isolated installation with its own interpreter and its own set of installed libraries.

The rules are:
* when working on a project, create an environment for that specific project, with the packages you need
* packages/libraries must be installed in the environment before you can use them: the best option is to install them all at once when creating the environment, so that the package manager (`conda`) can pick versions that are compatible with each other
* if you install packages from a specific channel (e.g. conda-forge), keep using that channel rather than the `pip` installer, since `pip` can overwrite (supersede) packages installed by conda and leave them incompatible with each other (potentially "breaking" your environment)
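
Before installing anything, it can help to confirm which environment the notebook is actually using. The following is a minimal sketch based only on the standard `sys` module; the printed paths depend on your machine.

```python
import sys

# Path of the Python interpreter running this kernel or script.
print("Interpreter:", sys.executable)

# Root folder of the active environment.
print("Environment:", sys.prefix)

# Python version as a (major, minor, micro) tuple.
version = tuple(sys.version_info[:3])
print("Python version:", ".".join(str(part) for part in version))
```

If the environment path is not the one you created for the project, the notebook is probably running with a different kernel than you expect.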

In [4]:
# to install a package
# the ! at the beginning runs the line as a shell command
# -y answers conda's confirmation prompt, which a notebook cell cannot do interactively
!conda install -c conda-forge rasterio -y

Channels:
 - conda-forge
 - defaults
Platform: linux-64
Collecting package metadata (repodata.json): \ ^C


In [5]:
# installing with pip: some packages are not available on conda-forge
# and we have to get them from another repository (PyPI, the default for pip)
pip install hydroeval

Collecting hydroeval
  Using cached hydroeval-0.1.0-py3-none-any.whl.metadata (4.8 kB)
Using cached hydroeval-0.1.0-py3-none-any.whl (22 kB)
Installing collected packages: hydroeval
Successfully installed hydroeval-0.1.0
Note: you may need to restart the kernel to use updated packages.


To restart the kernel, open the Jupyter menu and choose Kernel->Restart kernel.
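
After restarting the kernel, you may want to check that the packages are really available. The helper below is a sketch using the standard `importlib` module; `package_status` is a hypothetical name, and it assumes the import name matches the distribution name, which is not true for every package.

```python
from importlib import metadata, util


def package_status(import_name):
    """Return the installed version of a package, or None if it is missing."""
    if util.find_spec(import_name) is None:
        return None
    try:
        return metadata.version(import_name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution is registered under this name
        # (for example a standard-library module).
        return "unknown"


for name in ("rasterio", "hydroeval"):
    print(name, package_status(name))
```

A result of `None` means the package cannot be imported from the current environment.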

## The requirements file

If you want to specify all dependencies and versions used in a project you can use a **requirements file**.<br>
A requirements file lets you specify which packages and versions should be installed for the project.<br>
The `pip freeze` command shows all installed packages in requirements format.<br>
Its output can be redirected to a file (the requirements file).

In [None]:
pip freeze > requirements_geoenv.txt

Once you have a requirements file, you can replicate the same environment on another system using `pip install` with the `-r` option:

In [None]:
pip install -r requirements_geoenv.txt
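
To see what a requirements file contains, the sketch below parses lines of the form `name==version` into a dictionary. `parse_requirements` is a hypothetical helper and the sample text is made up for illustration; real `pip freeze` output can also contain other line formats (for example `name @ file://...`), which this sketch simply skips.

```python
def parse_requirements(text):
    """Map package names to pinned versions for lines like name==version."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not a simple pin
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


# Hypothetical excerpt, not the real workshop file.
sample = """\
# packages for the workshop
hydroeval==0.1.0
numpy==1.26.4
"""

print(parse_requirements(sample))
```

To inspect a real file, you could read it with `open("requirements_geoenv.txt").read()` and pass the text to the same function.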

## ...ready?

In [2]:
print("Hello World!")

Hello World!
