# A gentle introduction to ML applied to Climate change

With this "course", I aim to give you a brief but rich introduction to leveraging ML for climate-related projects.

### What I want you to learn
My aim with this brief course is to provide you with a general understanding of:
1. (Very briefly) What is ML?
2. How to think in ML terms.
3. How to leverage ML for climate-related projects.
4. The implications of using ML and its impact on the environment.

### What I will not teach you
This course is not designed to be an exhaustive resource on ML, data science, data wrangling, or climate topics. I will, however, try to point you in the right direction so you can dig deeper into certain topics, and I'll provide useful resources along the way.


#### Notes on writing and tools
The tone and language for this course will be light and informal with a whiff of humor, sarcasm, and fun.

For the sake of simplicity, I will use Python (and maybe teach you a little).

## Introduction
Machine Learning (ML) has been around for a while now, and it's a broad term that covers a wide range of algorithms, models, and techniques.


Yet, in a very simple and informal fashion, we can think of it as the use of data (sometimes historic, sometimes real-time, sometimes synthetic) to predict or forecast a phenomenon, process, or event of interest.
See the [ML Wikipedia site](https://en.wikipedia.org/wiki/Machine_learning) for a more formal definition.

The ability to predict something has value in itself, and entire businesses are built around it. But if you ask me... there's a whole world to be explored when you go from predicting to automating.

For example, I can predict a driver's level of drowsiness and the likelihood of unintended lane changes to automate the process of keeping a car in its current lane.

Similarly, in the climate space, we could automate the charging of a solar-powered battery in anticipation of cloudy weather. By using weather predictions, we can make sure the battery is already charged by the time solar power becomes unavailable.
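To make the battery idea a bit more concrete, here's a minimal sketch of "prediction turned into automation". Everything in it is made up for illustration: the function name `should_charge_now`, the two thresholds, and the assumption that some weather model hands us a predicted cloud cover between 0 and 1. A real system would need to be a lot more careful than this.

```python
def should_charge_now(predicted_cloud_cover, battery_level,
                      cloud_threshold=0.7, target_level=0.9):
    """Decide whether to top up the battery ahead of cloudy weather.

    predicted_cloud_cover: forecast for tomorrow, 0.0 (clear) to 1.0 (overcast).
    battery_level: current charge, 0.0 (empty) to 1.0 (full).
    Both thresholds are arbitrary values picked for this example.
    """
    cloudy_ahead = predicted_cloud_cover >= cloud_threshold
    needs_charge = battery_level < target_level
    return cloudy_ahead and needs_charge


# The prediction is the input; the decision is the automation.
print(should_charge_now(predicted_cloud_cover=0.85, battery_level=0.40))
print(should_charge_now(predicted_cloud_cover=0.10, battery_level=0.40))
```

The interesting part is not the `if` logic (it's trivial) but where `predicted_cloud_cover` comes from: that's the piece ML would provide.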


### Thought experiment
Think of another climate-focused prediction/automation.
* What would be interesting to predict?
  * Why would it be interesting?
  * What would be the value of it?
* Are you solving a problem? If so, which one?
* Could you automate a process with this approach/prediction?


Write your answer below. Make it as realistic or as crazy as you'd like.



*   How seasons are going to change and how this is going to affect crop production and then food security
*   The most vulnerable areas to climate change



# Informal introduction to elements of ML


For any ML project we have an absolutely necessary element: data.
Data can come in various forms, particularly when it is represented in a computer.

Different forms or formats of data could be SQL tables, CSV files, songs, images, videos, binary files, etc. We must learn how to use and handle the types of data we want to use for analysis or prediction. More on this later.
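As a small taste of what "handling data" looks like, here's a sketch that reads a CSV using only Python's standard library. The column names and values are invented for this example, and the "file" lives in a string so you can run it anywhere.

```python
import csv
import io

# A tiny, made-up CSV "file" kept in a string so the example runs anywhere.
raw_csv = """date,temperature_c,wind_kmh
2024-06-20,20.1,12.0
2024-06-21,21.3,8.0
2024-06-22,19.8,15.5
"""

# csv.DictReader turns each row into a dict keyed by the header names.
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Everything read from a CSV is text, so convert numbers explicitly.
temperatures = [float(row["temperature_c"]) for row in rows]

print(len(rows))
print(temperatures)
```

Images, audio, and SQL tables each need their own tools, but the pattern is the same: read the raw format, then turn it into numbers a model can use.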

Let's get back to defining the basic necessary elements of ML. Besides data, we need a model... A model is a (simplified) representation of the process or phenomenon we are interested in predicting.

Let's say we are interested in predicting tomorrow's temperature. What kinds of data would we want to use here? Here's a list of probably useful data points:
1. Yesterday's and today's temperature
2. Satellite imagery (from which we can detect clouds or rain)
3. Wind records
4. Date
5. Many more...

(Small note on language: from this point onwards I'll have to use a more "mathy" way of writing. If you don't like math, I'm sorry, but this is necessary. I promise I'll try to keep it as interesting and simple as possible... maybe.)

## Problem framing
The "data points" above, or better said, the features above, will be the input for our model, and temperature will be our output. If we go by the standard nomenclature, we can call features one to four $x_1 , ..., x_4$.

Since these 4 features describe one single observation, we can group them into a vector: $𝐱 = [ x_1, ..., x_4 ]$.

The previous expression only says that we can gather a series of features into a vector that describes one data point. And if we have several data points, meaning several observations of the same features... then we can call our dataset 𝐗 (capital, bold X).
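In Python, one way to picture this is a plain list for 𝐱 and a list of lists for 𝐗. The numbers below are made up, and I'm simplifying two of the features so they fit in a single number each: a cloud cover fraction stands in for satellite imagery, and the day of the year stands in for the date.

```python
# One observation x = [x_1, x_2, x_3, x_4]. All values are made up.
# x_1: today's temperature (°C)
# x_2: cloud cover fraction, standing in for satellite imagery
# x_3: wind speed (km/h)
# x_4: day of the year, standing in for the date
x = [20.1, 0.30, 12.0, 172]

# The dataset X stacks several observations, one per row.
X = [
    [20.1, 0.30, 12.0, 172],
    [21.3, 0.10, 8.0, 173],
    [19.8, 0.75, 15.5, 174],
]

n_observations = len(X)   # how many data points we have
n_features = len(X[0])    # how many features describe each one
print(n_observations, n_features)
```

In practice you'd likely reach for a library such as NumPy or pandas for this, but the idea is the same: rows are observations, columns are features.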

Ok, that was easy... now what if we want to say that we have a model that uses 𝐗 to predict tomorrow's temperature? Let's call our model $f$ and the temperature $y$; then it's easy to say $y = f(𝐗)$.

Ta-da! We are done with the mathy stuff. We just defined an expression to say "I have all these features I named $x_1 ... x_4$ that I will use through my model $f$ to predict tomorrow's temperature $y$".
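To show that $f$ really can be *anything* that maps features to a prediction, here's a deliberately naive sketch: a "persistence" model that predicts tomorrow will be exactly as warm as today. It assumes (my choice, not a rule) that the first feature in each row is today's temperature, and the data is made up.

```python
def f(X):
    """Naive 'persistence' model: tomorrow will be as warm as today.

    Assumes the first feature of every row is today's temperature (°C)
    and ignores every other feature.
    """
    return [row[0] for row in X]


# Made-up dataset: [today's temp, cloud cover, wind speed, day of year]
X = [
    [20.1, 0.30, 12.0, 172],
    [21.3, 0.10, 8.0, 173],
    [19.8, 0.75, 15.5, 174],
]

y = f(X)  # one predicted temperature per observation
print(y)
```

No learning happens here, so this isn't ML yet. It is, however, a handy baseline: a fancier model should at least beat it.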

The general case for the example above is that we have a label (temperature) to predict, and we have past data where that label is already known... we call that supervised learning. That is, I "know" what the model "should" say when I use certain data as input, since I have the answer. And since that is the case, I can correct the model whenever I know its answer was wrong. I will talk more about this later; for now, all you need to know is that when we have a specific, known label we want to predict, we call it supervised learning. Makes sense?
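Here's a small sketch of what "correcting the model when its answer was wrong" can look like. It's one possible approach (gradient descent on a one-parameter model $y = w \cdot x$), not the only one, and the data, the starting guess, and the learning rate are all made up for illustration.

```python
# Made-up training data: today's temperature (input) and the temperature
# actually observed the next day (the label, i.e. "the answer").
today = [10.0, 20.0, 30.0]
tomorrow = [11.0, 22.0, 33.0]

w = 0.0                # our model is y = w * x; start with a bad guess
learning_rate = 0.001  # how strongly each correction nudges w

for step in range(50):
    # How wrong is the model on each example?
    errors = [w * x - y for x, y in zip(today, tomorrow)]
    # Gradient of the mean squared error with respect to w.
    gradient = sum(2 * e * x for e, x in zip(errors, today)) / len(today)
    # Correct the model: move w in the direction that shrinks the error.
    w -= learning_rate * gradient

print(round(w, 3))
```

The loop only works because we have the answers in `tomorrow`: without labels there would be no error to compute and nothing to correct.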


## ML Taxonomy
Taxonomy is just a fancy word to say "different flavors/categories/subcategories of..."

To introduce you to the basic elements of ML, we must first divide it according to the different kinds of problem definitions.

1. Supervised Learning: we have the answer.
2. Unsupervised Learning: we do not have the answer.
3. Reinforcement Learning: we want an agent to learn the best possible behaviour to perform a task.


(You can easily find a more comprehensive taxonomy, but this one will suffice for our purposes.)
