# Quick words
This notebook will try to cover a lot in not a lot of time, so feel free to come back to it at any time.

In [None]:
from __future__ import annotations		# Do not worry about this line for now, it will become clearer later
# BUT RUN THE CELL!

Also, I haven't forgotten the people who are already comfortable in Python: there will be some advanced topics, or things you didn't know, to keep you on your toes, like this ;)

In [None]:
import antigravity

# The basics of a Notebook
Sorry for this small interruption...
Getting back on track!

## Cell types
A notebook (actually a json file for the curious) is separated into cells.
There are two types of cells: `Markdown` cells and `code` cells. 
Markdown cells are the ones used to write text using, as you guessed, the Markdown language (a fancy name for plain text that can be given basic HTML-like formatting really easily).
This guide gives a really useful description of Markdown formatting: https://www.markdownguide.org/basic-syntax/
Code cells are used to run code (you'll see them later).
Python usually, but you can always escape to the shell.

In [None]:
print("This is a code cell")

## Escaping to shell
Let's say you want to run something that is not Python in a notebook.
First, why would you want to do this?  
- To install a dependency (in Colab), maybe?
- Or you want to check something, but you only know the command in `bash`, and coding it in Python would be annoying.
- Something else (I'm not your mum, you can do what you want)

Running something that is not Python is called `escaping`, as you escape the Python interpreter and land in your `shell`.
On Google Colab the shell is `bash`, as it runs on a Linux distribution.  
However, if you are running on macOS 10.15 or above it is `zsh`.
On Windows it is `cmd` (default) or `PowerShell`.
All the commands given here should work on any OS, but be careful when you are on your own.

So how do you escape?  
Easy: use `!` at the start of the line, and the line will be escaped. 
Yes, there's a way to do multiline escaping, but from experience you rarely need it, and it does not always behave as expected (tested in Google Colab, so take it as it is).

### Installing dependencies 
So, let's install the dependencies from the notebook, by escaping.
But instead of installing all the packages one by one let's use the convenient `requirements.txt`
#TODO Fix this bit, and make sure to only use conda (cartopy is annoying)


In [None]:
!pip3 install -r requirements.txt

Or using Conda

In [None]:
!conda install --file requirements.txt

## git
Now is a good time to mention `git` and its relationship with notebooks.  
It's complicated.

`git`, for those who don't know, is a tool to track changes in your files as you code. See it as a timeline of snapshots of your code. 
You manually take these snapshots by doing a "commit". `git` then looks at the difference (called a `diff`) between before and after the commit to know what changed.
The issue is that notebooks store a lot of metadata and boilerplate (JSON) in the file, so the `diff` can easily get messy and not that useful.
That's why I would highly recommend clearing the outputs of every cell before committing.
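
As a rough illustration of what clearing the outputs means, here is a minimal sketch that strips them from a notebook loaded as JSON. It assumes the usual notebook layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`); the function name `clear_outputs` and the tiny demo notebook are made up for this example. In practice, the "clear all outputs" entry in your notebook editor's menu does the same job.

```python
import json

def clear_outputs(notebook: dict) -> dict:
	"""Remove outputs and execution counts from every code cell (in place)."""
	for cell in notebook.get("cells", []):
		if cell.get("cell_type") == "code":
			cell["outputs"] = []
			cell["execution_count"] = None
	return notebook

# A tiny hand-written notebook, only for illustration
demo = {
	"cells": [
		{"cell_type": "markdown", "source": ["# Title"]},
		{"cell_type": "code", "source": ["1 + 1"], "outputs": [{"data": "2"}], "execution_count": 3},
	]
}

cleaned = clear_outputs(demo)
print(json.dumps(cleaned["cells"][1]))
```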


Also, don't use escaping in a notebook to run git on that same notebook. It is best to use the terminal or, worst case, another notebook that is not part of the repository you want to commit to (I had to do this once, in Google Colab).

If you did not understand this, just remember this: 
- "Clear all the outputs before committing"
- "If you are not sure with git, ask for help!"

# Notebook: when to use and not use
So, to come clean with you...
I don't really like notebooks most of the time, and I am quite fond of a good old Python script file.

But there are some instances where I think a notebook is better:
- Showcasing some library
- Doing a workshop/guided exercise (like this)
- Quickly and dirtily (is that a word?) throwing some code together to try something, like plotting. Once sorted, it should be moved to a script (if relevant).

I have seen notebooks that were uncommented (no text cells), that had to be run in a non-sequential order, or whose outputs were not cleared (which is okay if it is to show the expected output).
Or code duplicated between two notebooks, because you cannot easily import code from one notebook into another (that's why we have .py files for scripts).
And worst of all, notebooks with all the code (and I mean all) in one cell. 

# Magic

A notebook runs on IPython, which means `magics`.
Magics are commands that are specific to the IPython kernel (and thus Jupyter).
They are called by `escaping` with a `%` for line mode, and `%%` for cell mode.

In [None]:
%pwd

Or this:

In [None]:
%time 2**128

In [None]:
%%timeit
pass

The `%%timeit` is my favourite.
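
Outside IPython, the standard-library `timeit` module gives a similar measurement. A minimal sketch (the statement being timed and the number of repetitions are arbitrary choices for this example):

```python
import timeit

# Run the statement 10_000 times and return the total elapsed time in seconds
elapsed = timeit.timeit("2 ** 128", number=10_000)
print(f"{elapsed:.6f} s for 10,000 runs")
```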

And all of them work in a terminal, as long as you are running IPython.
The Python console in PyCharm uses IPython by default when it is installed.

Here are all the magic commands if you want: https://ipython.readthedocs.io/en/stable/interactive/magics.html#

# Python Basic

Okay, enough teasing with coding and notebook and git!  
We're here to code!

What better place to start than the basics?  
"What, basics again?" you are probably thinking.
Well, you all have at least some coding experience, maybe not in Python.
So I'll assume you know what a `for` loop and a variable are, and focus on some differences and quirks of Python compared to other languages - looking at you, `R` and `MATLAB`.

## Indentation
Python is extremely strict with indentation: it needs it!
Where `C` defines the scope of a function, loop, or if-statement with {}, **Python uses a colon and indentation**.
Like this:

In [None]:
for i in range(10):
	print(i)

Yes, I am using tabs and not spaces!  
I will die on that hill! 
(It's a question of accessibility: people can set their tab width to their liking and not be stuck with hard-coded spaces)

## Indexing

Like most programming languages (except `R`, `MATLAB`, and `Lua`, to name a few), `Python` uses **zero-based indexing**.
This means this:

In [None]:
lst = ["a", "b", "c", "d"]
print(f"First element {lst[0]}")
print("Last element", lst[-1])		# Negative indices count from the end of the list

The first element is at index `0`, and the last one is at `-1`.

Also, I've just shown two ways of putting a value into printed text:
The first one is called an f-string (because of the `f` prefix); it is more readable and evaluates whatever is between the {} on the fly (I really like them, and will be using them a lot).
The second simply passes several arguments to `print`, which joins them with a space. 
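
To make the difference concrete, here is a small sketch of both approaches (the variable names are only for illustration). An f-string builds a single string, whereas `print` with several arguments just writes them out separated by a space:

```python
lst = ["a", "b", "c", "d"]

# f-string: the expression inside {} is evaluated and inserted into the string
message = f"First element {lst[0]}, number of elements {len(lst)}"
print(message)

# Format specifiers also work inside the braces (here: two decimal places)
ratio = 2 / 3
formatted = f"{ratio:.2f}"
print(formatted)

# print with several arguments: each one is converted to text and separated by a space
print("Last element", lst[-1])
```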

## Dictionary

So `Python` has this cool object type called a dictionary.
For reference, here are the equivalents in other languages (to name a lot):
- **JavaScript**: Object or Map
- **Java**: HashMap, Hashtable, or TreeMap
- **C++**: std::unordered_map (for hash tables), std::map (for balanced trees)
- **C#**: Dictionary or Hashtable
- **PHP**: Associative Array
- **R**: Named List
- **MATLAB**: containers.Map

In [None]:
my_dict = {'name': 'Alice', 'age': 30, 'email': 'alice@email.com'}
my_dict

(Yes, it prints without a `print`. How? It's an IPython thing: a cell displays the value of its last line of code.)
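
A few common operations on a dictionary, as a short sketch reusing the same example data (the `phone` key is made up to show what happens with a missing entry):

```python
my_dict = {'name': 'Alice', 'age': 30, 'email': 'alice@email.com'}

# Access a value by its key
name = my_dict['name']

# .get returns a default instead of raising KeyError when the key is missing
phone = my_dict.get('phone', 'unknown')

# Add or update an entry
my_dict['age'] = 31

# Loop over key/value pairs
for key, value in my_dict.items():
	print(f"{key}: {value}")
```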

## Functions and lambda

Python has two main types of functions: __standard__ and __lambda__.

### Standard function
Standard (user-defined) functions are defined using the keyword `def`.
If they return something, they use the `return` keyword (which is also where the function stops).

In [None]:
def sqrt(x):
	return x ** 0.5			# ** is the python way of doing powers.

sqrt(3)

### Lambda function
Lambda functions, or anonymous functions, are defined inline and contain only a single expression.
They are concise and can be useful for simple operations, often used as arguments to higher-order functions like map, filter, and sorted.

In [None]:
sqrt = lambda x: x ** 0.5

sqrt(2)
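
Where lambdas really earn their keep is as arguments to higher-order functions such as `sorted`, `map`, and `filter`. A short sketch (the example data is made up):

```python
words = ["banana", "kiwi", "apple", "fig"]

# sorted: the lambda returns the value to sort by (here the word length)
by_length = sorted(words, key=lambda word: len(word))

# map: apply the lambda to every element
squares = list(map(lambda x: x ** 2, [1, 2, 3]))

# filter: keep only the elements for which the lambda returns True
evens = list(filter(lambda x: x % 2 == 0, range(10)))

print(by_length, squares, evens)
```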

### Generators
For advanced people: do you know about generator functions?  
A generator function uses the `yield` keyword: instead of returning everything at once, it hands out a new value each time the next one is requested.  
Can you guess what this code does?

In [None]:
def generate_squares(n):
	for i in range(1, n + 1):
		yield i * i

square_generator = generate_squares(5)

for square in square_generator:
	print(square)
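
If you want to see what happens step by step, the sketch below pulls values out one at a time with `next()`. It redefines the same function so that it runs on its own; the variable names are only for illustration.

```python
def generate_squares(n):
	for i in range(1, n + 1):
		yield i * i

gen = generate_squares(3)

# Each call to next() resumes the function until the next yield
first = next(gen)
second = next(gen)

# Whatever is left can still be consumed, here by building a list
remaining = list(gen)

# The generator is now exhausted, so a second pass yields nothing
leftover = list(gen)

print(first, second, remaining, leftover)
```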

## Other object types

If you are familiar with another language you may want a specific object type. You can create your own, but I'd advise having a look first at `collections` in Python's standard library (no need to install): https://docs.python.org/3/library/collections.html

And if you want to do some weird things with a list or similar: https://docs.python.org/3/library/itertools.html
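
To give a flavour of both modules, here is a small sketch with a handful of their tools. The selection is mine, not an exhaustive tour, and the example data is made up.

```python
from collections import Counter, defaultdict, namedtuple
from itertools import chain, product

# Counter: count how many times each element appears
counts = Counter("mississippi")

# defaultdict: missing keys get a default value (here an empty list)
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
	groups[word[0]].append(word)

# namedtuple: a lightweight record with named fields
Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)

# itertools: chain joins iterables, product gives every combination
chained = list(chain([1, 2], [3]))
pairs = list(product("ab", [0, 1]))

print(counts["s"], dict(groups), p.x, chained, pairs)
```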

## Paths and OS

This is the point where Windows is the annoying one...
For instance this folder structure:
├── tier1  
│      ├── images  
│      │      └── image_01.png  

macOS and Linux will give the path as: `tier1/images/image_01.png`  
Whereas Windows is: `tier1\images\image_01.png` 

But there's a catch: `\` is the escape character in Python strings.
It is used for a new line `\n` or a tab `\t` and so on.  
So if you want to actually type `\` you need to escape it, with a `\`.  
Yep, you read that correctly... this means the above path in Windows becomes: `tier1\\images\\image_01.png`  
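
A short sketch of the escaping described above. Raw strings (the `r` prefix) are a second, standard way to write backslashes that the text does not mention; the path `tier1\tier2` is made up to show what goes wrong without escaping.

```python
# Doubling the backslash gives one literal backslash each time
escaped = "tier1\\images\\image_01.png"

# A raw string (prefix r) leaves backslashes untouched
raw = r"tier1\images\image_01.png"

print(escaped)
print(escaped == raw)

# Without escaping, \t silently becomes a tab character
broken = "tier1\tier2"
print(len(broken), len(r"tier1\tier2"))
```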

This is great, don't you think? Well, if you are only using one OS, kind of, but how can you be sure this will always be the case?  
What is the solution? Try one path, catch the error if it does not work, then use the other? Nah, too complicated!

The solution is to use a package. Actually, two good ones are in the standard library: `os` and `pathlib`.  
Here we'll only use `os` but you are welcome to try `pathlib`.

For instance (and you do not have to care about the `/` in the string):

In [None]:
import os

os.path.abspath("tier1/images/image_01.png")		# This gives the absolute path

You can even join paths together:

In [None]:
root_dir = "tier1"
image_dir = "images"
image_name = "image_01.png"

os.path.join(root_dir, image_dir, image_name)
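
For those who want to try `pathlib`, here is a rough equivalent of the join above. The "pure" path classes are only used to show how each OS family would write the same path, whatever machine you run this on.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# The / operator joins path components with the right separator for the OS
image_path = Path("tier1") / "images" / "image_01.png"
print(image_path)
print(image_path.name, image_path.suffix, image_path.parent)

# The "pure" classes show how each OS family would write the same path
posix_style = PurePosixPath("tier1", "images", "image_01.png")
windows_style = PureWindowsPath("tier1", "images", "image_01.png")
print(posix_style)
print(windows_style)
```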

## Import from another file

### `__name__` and `"__main__"`
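
A minimal sketch of the idiom named in the heading, assuming a hypothetical file called `my_module.py`. When a file is run directly, Python sets `__name__` to `"__main__"`; when it is imported, `__name__` is the module's name instead. The guard at the bottom keeps the demo code from running on import.

```python
# Contents of a hypothetical file called my_module.py

def sqrt(x):
	return x ** 0.5

def main():
	print(sqrt(9))

# True only when the file is run directly (python my_module.py),
# not when another file does `import my_module`
if __name__ == "__main__":
	main()
```

Another file could then do `from my_module import sqrt` and reuse the function without triggering `main()`.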

# PEP8 (pylint)

# Useful things
A collection of random and useful things for you when you code.

## Type hints
Python does not require you to declare the type of your variables (contrary to `C` for instance); types are checked at runtime.

However, as a human being, you might find it helpful to be reminded of what type a certain variable is.  
It can also make sure you don't do type changing or type mixing unless you are certain that's what you want (those can be costly operations for Python).  
Most helpful of all, it helps people, and you (after a 2-week break), understand what your code is supposed to do.

**Introducing type hints/hinting!**  
What is this?  
Well, you just say what type each variable is. It won't help Python, as it will still check the variable type at runtime (for now), but if you are using an IDE (with the correct options) it will probably scream (nicely) at you.

Let's take the previous sqrt function and add type hints.

In [None]:
def sqrt(x: float) -> float:
	return x ** 0.5

The function takes __any__ number (so a float) and returns a square root (so a float).
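
To show that hints really are only hints, here is a small sketch with a made-up function `double`. Python happily runs it with the "wrong" type; only an IDE or a type checker would complain.

```python
def double(x: int) -> int:
	return x * 2

# The hint says int, but Python does not enforce it at runtime
result = double("ab")
print(result)

# The hints are stored on the function, where IDEs and type checkers can read them
hints = double.__annotations__
print(hints)
```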

See here for references:
https://docs.python.org/3/library/typing.html

**I will be using them as we go along.**
Note that if you are using python<3.10, you may need to run the first code cell in this notebook to make some of the newer syntax available.
Better to be safe: I always add it to the top of each of my .py files.

**For advanced people**: Another benefit of type hints: if you ever want to compile your code through Cython to have a C version, typed variables are what lets it generate fast C code (in Cython's own syntax you would also change the `def` into `cdef`), so you are halfway there.

## Zen of Python

There is this thing in Python that I think you ought to read at least once (it will improve your coding).  
It is called the Zen of Python; here it is:

In [None]:
import this

## Docstrings
You have a function and don't know what it does or how to use it. What to do?  
Call the help function, of course!

In [None]:
help(print)

Now you have created a function and you want it to have a help message. You can do that with what is called a docstring.  
You write it as a multiline string (with these `"""`) just after the function declaration. 

In [None]:
def sqrt(x: float) -> float:
	"""
	Function to calculate the square root of a given number.
	
	Parameters
	----------
	x : float
		The number for which the square root will be calculated.

	Returns
	-------
	float
		The square root of the given number.
		
	Examples
	--------
	>>> sqrt(4)
	2.0
	"""
	return x ** 0.5		

help(sqrt)

Note: I've used the [Numpy style](https://numpydoc.readthedocs.io/en/latest/format.html) of formatting the docstring by preference.
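
A side benefit of the `Examples` section: the standard-library `doctest` module can run the `>>>` lines and check the results. A minimal sketch, with a shortened copy of the function so that it runs on its own:

```python
import doctest

def sqrt(x: float) -> float:
	"""
	Function to calculate the square root of a given number.

	Examples
	--------
	>>> sqrt(4)
	2.0
	"""
	return x ** 0.5

# The docstring is stored on the function itself
print(sqrt.__doc__)

# doctest runs the >>> examples and reports only the ones that fail
doctest.run_docstring_examples(sqrt, {"sqrt": sqrt}, name="sqrt")
```

If the example and the real output disagree, `doctest` prints a report; silence means the example passed.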

## tqdm
What is [tqdm](https://github.com/tqdm/tqdm)? It is a really useful library for making progress bars. This is extremely valuable to know whether your code is running, or whether it is running too slowly.  
It is not in the standard library, so you would need to install it in any project; it is already installed for this one.  

Here is an example.

In [None]:
from tqdm.auto import tqdm		# tqdm.auto is to allow correct behaviour in notebook
from time import sleep

for i in tqdm(range(1_000)):			# Yes, you can use _ to format number and it will still work.
	sleep(0.01)

Another useful one, although a bit less straightforward, is rich: https://github.com/Textualize/rich

## Error handling

If you want to handle errors, you can do that with `try` and `except` blocks.  
Like this:

In [None]:
def divide(a: float, b: float) -> float | None:
	try:
		result = a / b
	except ZeroDivisionError:		 # block catches the division-by-zero exception
		print("Cannot divide by zero!")
		return None		# will return None if division by zero
	else:		# block executes if no exception occurs
		return result
	finally:	# block executes no matter what, whether or not an exception is raised
		print("Division operation attempted.")

# Test the function
print(divide(10, 2))
print(divide(10, 0))
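
Two more things worth knowing, sketched with made-up helper functions (`parse_number` and `checked_sqrt` are hypothetical names): one `except` can catch several exception types, and you can raise your own errors with `raise`.

```python
def parse_number(text):
	"""Hypothetical helper: convert text to a float, or return None if it cannot be done."""
	try:
		return float(text)
	except (ValueError, TypeError) as error:		# several exception types in one block
		print(f"Could not convert {text!r}: {error}")
		return None

def checked_sqrt(x: float) -> float:
	"""Hypothetical helper: raise our own error for invalid input."""
	if x < 0:
		raise ValueError("x must be non-negative")
	return x ** 0.5

print(parse_number("3.5"))
print(parse_number("abc"))
print(parse_number(None))
```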

# IDE

- numpy
	- array creations (load from file)
	- slicing and indexing
	- operations (broadcasting, matrix) 
		- diff from matlab
	- (adv) dtype (1.24)
	- (adv) cupy, numba
- scikit
	- linear regression
- matplotlib
	- seaborn colour

- pandas / polars
	- creating df
	- load csv
	- data manipulation (dropna, lambda, iloc, loc)
	- (adv) mutability of df
	- saving, pickle
- xarray / dask
	- lazy loading compute
	- chunk
	- multi-dim
	- dataset vs data-array
	- data selection/slicing
- more plots
	- seaborn (pandas)
	- cmocean
- cartopy
	- projections
	- coastlines etc