<a href="https://colab.research.google.com/github/janbertoo/2022_ML_Earth_Env_Sci/blob/main/Copy_of_Week_1_Basics_of_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Welcome to "Machine Learning for Earth and Environmental Sciences"

For this week's lab, our goal is to (re)familiarize ourselves with the basics of Python, focusing on libraries that are especially convenient for manipulating geoscientific datasets, implementing machine learning algorithms, and visualizing results. Our main learning objectives today are to get (re)acquainted with several Python libraries:

1.   [The Python Standard Library](https://docs.python.org/3/library/) and its [`math`](https://docs.python.org/3/library/math.html) module for basic arithmetic
2.   [Numpy](https://numpy.org/doc/stable/index.html) for scientific computing 
3.   [Matplotlib](https://matplotlib.org/) for visualization
4.   [Pandas](https://pandas.pydata.org/) for manipulating tabular data
5.   [Xarray](https://docs.xarray.dev/en/stable/) for manipulating multidimensional gridded data
6.   [Cartopy](https://scitools.org.uk/cartopy/docs/latest/) for making maps
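Before starting, it can help to check which of these libraries are available in your environment. The snippet below is a minimal sketch, not part of the lab notebooks: the `libraries` list and the `availability` dictionary are illustrative names, and the check only reports whether each top-level module can be found, without importing it.

```python
import importlib.util

# Import names of the libraries listed above (math ships with Python itself)
libraries = ["math", "numpy", "matplotlib", "pandas", "xarray", "cartopy"]

# find_spec returns None when a top-level module cannot be found
availability = {
    name: importlib.util.find_spec(name) is not None for name in libraries
}

for name, is_available in availability.items():
    status = "available" if is_available else "missing"
    print(f"{name:<12}{status}")
```

If a library is reported as missing, it usually needs to be installed in the current environment before the corresponding notebook will run.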

Today's tutorial adapts excellent online resources to efficiently introduce the basics of Python, in particular [An Introduction to Earth and Environmental Data Science](https://earth-env-data-science.github.io/intro.html) by Ryan Abernathey, Kerry Key, and Tim Crone [(License)](https://creativecommons.org/licenses/by-sa/4.0/).

There are many excellent tutorials to get started with Python, such as:

*   [Programming with Python](https://swcarpentry.github.io/python-novice-inflammation/) by © Software Carpentry and © Data Carpentry [(License)](https://creativecommons.org/licenses/by/4.0/), if you need a tutorial that focuses on the fundamentals and goes at a slower pace than this tutorial.
*   The [Python Basics](https://energy4climate.pages.in2p3.fr/public/education/machine_learning_for_climate_and_energy/notebooks/1_tutorial_introduction.html) page from [Machine Learning for Climate and Energy](https://energy4climate.pages.in2p3.fr/public/education/machine_learning_for_climate_and_energy/chapters/frontmatter.html) by Bruno Deremble and Alexis Tantet [(License)](https://creativecommons.org/licenses/by-sa/4.0/). This page is appropriate if you are looking for a quick tutorial covering the libraries to get started with machine learning for the environmental sciences. 


If you are struggling with some of the exercises, do not hesitate to:


*   Search the Internet directly, or consult [Stack Overflow](https://stackoverflow.com/)
*   Ask your neighbor(s), the teacher, or the TA for help
*   Debug your program, e.g. by following [this tutorial](https://swcarpentry.github.io/python-novice-inflammation/11-debugging/index.html)
*   Use assertions, e.g. by following [this tutorial](https://swcarpentry.github.io/python-novice-inflammation/10-defensive/index.html)
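As a minimal sketch of the last suggestion, the example below uses assertions to catch invalid input early. The function `celsius_to_kelvin` is a hypothetical helper written for illustration; it does not come from the lab notebooks or the linked tutorial.

```python
def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin."""
    # Precondition: nothing can be colder than absolute zero (-273.15 °C)
    assert temp_c >= -273.15, "temperature is below absolute zero"
    temp_k = temp_c + 273.15
    # Postcondition: a valid kelvin temperature is never negative
    assert temp_k >= 0, "kelvin temperature must be non-negative"
    return temp_k


# A valid input passes both checks silently
freezing_k = celsius_to_kelvin(0.0)
print(freezing_k)
```

Calling `celsius_to_kelvin(-300.0)` would instead stop with an `AssertionError` carrying the first message, which points directly at the faulty input.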



# 1. Python Fundamentals

*   This notebook assumes that you are using Python 3, whose documentation can be found [at this link](https://docs.python.org/3/).
*   Python is an [interpreted](https://en.wikipedia.org/wiki/Interpreter_(computing)) language, which means that it directly executes the instructions you give it, without requiring a separate compilation step.
*   Python is so widely used that you can usually find answers to your questions via a direct Internet search, often on [Stack Overflow](https://stackoverflow.com/).
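To illustrate the first two points, here is a minimal sketch that can be pasted into a notebook cell and run as-is, with no compilation step. The variable names and the Earth radius value are illustrative sample inputs, not taken from the lab notebooks.

```python
import math
import sys

# Confirm that the interpreter running this cell is Python 3
print(f"Running Python {sys.version_info.major}.{sys.version_info.minor}")

# Each statement is executed as soon as the interpreter reaches it
radius_km = 6371.0  # approximate mean Earth radius, used as sample input
circumference_km = 2 * math.pi * radius_km
print(f"Circumference: {circumference_km:.0f} km")
```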



---

Go to notebook [`S1_1_Python_fundamentals`](https://colab.research.google.com/github/tbeucler/2022_ML_Earth_Env_Sci/blob/main/Lab_Notebooks/S1_1_Python_Fundamentals.ipynb)

---

# 2. Numpy for Scientific Computing


---

Go to notebook [`S1_2_Numpy`](https://colab.research.google.com/github/tbeucler/2022_ML_Earth_Env_Sci/blob/main/Lab_Notebooks/S1_2_Numpy.ipynb)

---


# 3. Matplotlib for Visualization


---

Go to notebook [`S1_3_Matplotlib`](https://colab.research.google.com/github/tbeucler/2022_ML_Earth_Env_Sci/blob/main/Lab_Notebooks/S1_3_Matplotlib.ipynb)

---

# 4. Pandas for Manipulating Tabular Data


---

Go to notebook [`S1_4_Pandas`](https://colab.research.google.com/github/tbeucler/2022_ML_Earth_Env_Sci/blob/main/Lab_Notebooks/S1_4_Pandas.ipynb)

---

# 5. Xarray for Multidimensional Gridded Data


---

Go to notebook [`S1_5_Xarray`](https://colab.research.google.com/github/tbeucler/2022_ML_Earth_Env_Sci/blob/main/Lab_Notebooks/S1_5_Xarray.ipynb)

---

# 6. Cartopy for Making Maps


---

Go to notebook [`S1_6_Cartopy`](https://colab.research.google.com/github/tbeucler/2022_ML_Earth_Env_Sci/blob/main/Lab_Notebooks/S1_6_Cartopy.ipynb)

---

# Incredible!! 😃
You've reached the end of week 1's lab. If you're done early, consider:


*   Trying out the notebook's bonus exercises
*   Checking out the [official Seaborn documentation and tutorial](https://seaborn.pydata.org/tutorial.html) and/or another online tutorial (e.g., [this one](https://www.geeksforgeeks.org/python-seaborn-tutorial/?ref=lbp)) for statistical graphics
*   Helping students around you if applicable
*   Giving feedback on how to improve this notebook (typos, hints, exercises that may be improved/removed/added, etc.) by messaging the teacher and TA(s) on Moodle
*   Formulating your final project for this course

**Final Project**

The final project’s goal is to answer a well-defined scientific question by applying one of the ML algorithms introduced in class to an environmental dataset of your choice (e.g., related to your Master's thesis or your PhD research).

*   Can you think of a large environmental dataset linked to a scientific question you are passionate about?
*   If so, how could you format the dataset to make it easier to manipulate in Python?
*   If not, consider browsing the [list of benchmark datasets](http://mldata.pangeo.io/index.html) maintained by [Pangeo](https://pangeo.io/)






