## Training Introduction 

In this lesson, we will:

**Understand the objectives and approach**
1. How does this fit with Digital Earth?
2. What we will and won't learn.

**Learn about Python and Interactive Python**
1. Standard Interpreter
2. IPython Kernel 

**Take a look at an IPython Notebook**
- Break down the key parts
- Take a look at the Machine Learning Notebook and its particulars







## Digital Earth for a Resilient Caribbean Project

This project is meant to introduce Earth Observation technologies and data products
for the purpose of informing interventions around housing and infrastructure. 

The project has three components: Capacity Building, Operational Support, and Knowledge Exchange.

The project is conducting the following:
1. Geospatial Capacity and Needs Assessments (completed in St. Lucia, in progress in Dominica)
2. Support for the development of geospatial policy 
3. Support for data collection and development 
4. Training in Earth Observation and related technologies

This training falls within Component 1 of the project and supports points 3 and 4.

Several other trainings are being prepared, including Remote Sensing, LiDAR/Point Clouds,
Machine Learning, and Drone Mapping.

## Dominica Rooftop Classification 

[Github Link](https://github.com/GFDRR/caribbean-rooftop-classification/tree/master)

This project was based on a Machine Learning algorithm. This algorithm:

1. Extracts building footprints from RGB imagery (aerials, satellite, drone). 
2. Extracts roof type based on geometry. 
3. Performs change detection (useful for damage assessment and development control). 

Motivations for this:
- Keeping the building footprints basemap up to date for planning, DRM, and statistics purposes. 
- Quick analysis of disaster impacts using local data. 

Ideal scenario: regularly collected, georeferenced drone imagery is processed using this algorithm and the results are added to the national databases.


## What is Python?

A high-level programming language.

**Interpreted, not compiled.**
- An interpreter runs your code directly and shows the results
- A compiler converts your code into machine code and gives you an executable, which produces the results
- You can also package your Python programs into an executable or
  run your script via the command line (you rarely write whole programs in the interactive interpreter)



<table style="font-size:20px">
    <tr>
        <th>What we will cover</th>
        <th>Important but not covered</th>
    </tr>  
    <tr>
        <td>Strings, numbers, lists, dictionaries,
            conditionals, iterators</td>
        <td>Functions, Classes, Complex types and operations </td>        
    </tr>
    <tr>
        <td>Jupyter Notebook and Server</td>
        <td>Version Control (git)</td>        
    </tr>
      <tr>
        <td>Data Science and Visualization</td>
        <td>In-depth Machine Learning (next training)</td>        
    </tr>
    <tr>
        <td>Local and Cloud Notebooks</td>
        <td>Parallelization and Cluster Computing</td>        
    </tr>
</table>

**Python Hello World**

```Python 
print('hello world!')
```

**Rust Hello World**
```Rust
fn main() {
    println!("Hello, world!");
}
```

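**Strongly typed, with no type declarations needed for variables**
- Compiled language (Rust) - static, strong typing checked at compile time (the compiler can infer many types), immutable variables by default
- Interpreted but strongly typed (Python) - the type comes from the value assigned at runtime and there is no implicit conversion between unrelated types such as strings and numbers; some types are mutable (lists, sets, dictionaries) and some are not (int, float, complex, strings)
- Interpreted and weakly typed (JavaScript) - types come from values at runtime and are converted implicitly.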
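The Python behaviour described above can be seen in a few lines. This is a minimal sketch; the helper `add_strict` and the sample values are made up for illustration.

```python
def add_strict(a, b):
    """Return a + b, or the error message if Python refuses to mix the types."""
    try:
        return a + b
    except TypeError as err:
        return f"TypeError: {err}"


# Strong typing: a string and a number are never combined implicitly
mixed = add_strict("1", 1)       # str + int raises TypeError
explicit = int("1") + 1          # an explicit conversion is required

# Mutability: a list changes in place, a string does not
islands = ["Dominica", "St. Lucia"]
islands.append("Grenada")        # the same list object now holds three items

name = "dominica"
capitalised = name.capitalize()  # returns a new string, name is unchanged

print(mixed)
print(explicit, islands, name, capitalised)
```

In JavaScript the expression `"1" + 1` would quietly produce the string `"11"`; Python stops with an error instead, which tends to surface mistakes earlier.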



**Python (no type declared; `input()` always returns a string)**
``` Python
your_name = input('Enter your name: ')
print(your_name)
```



**Rust (Type will always be string)**
``` Rust
use std::io; 

fn main() {
    println!("Enter your name: ");    

    let mut your_name = String::new(); //vars immutable by default, mut makes them mutable, String type defined

    io::stdin() 
        .read_line(&mut your_name) //& denotes reference that allows access to the var 
        .expect("Failed to read line");

    println!("You entered: {your_name}"); //{} placeholder

}
```

**Mature and large ecosystem**
- Strong data analysis and GIS capabilities (arcpy, gdal/ogr (fiona, etc.), pandas/geopandas, scipy, numpy)
- Significant machine learning capabilities (scikit-learn, tensorflow, pytorch)
- Mostly Free and Open Source Software
- Much more accessible to beginners thanks to software like Anaconda, Colaboratory, etc.
- One of the most popular programming languages today

## Python and Interactive Python (IPython)


What if I want to run a series of calculations, or search for particular filenames in a directory with hundreds of files?

This is a simple example:

I have downloaded the past year of earthquake events for the Caribbean from USGS, divided into one file for every day from January 1st to November 26th. What if we wanted to find the filenames that correspond to November?

See: find_dates.py
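For readers without the file to hand, a script of this kind might look roughly like the sketch below. It is not the actual `find_dates.py`: the function name `find_files_in_range` and the file names such as `earthquakes_caribbean_2023-11-01.csv` are assumptions, chosen only so that the date is the third underscore-separated field of the name.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def find_files_in_range(directory, start_date, end_date):
    """Return the stems of files whose embedded date lies between the two dates (inclusive)."""
    matches = []
    for file in sorted(Path(directory).glob('*')):
        parts = file.stem.split('_')
        if len(parts) < 3:
            continue  # not in the assumed name_region_date layout
        try:
            date = datetime.strptime(parts[2], '%Y-%m-%d')
        except ValueError:
            continue  # the third field is not a date
        if start_date <= date <= end_date:
            matches.append(file.stem)
    return matches


# Demonstration on a throwaway folder with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for day in ('2023-10-31', '2023-11-01', '2023-11-26', '2023-11-27'):
        (Path(tmp) / f'earthquakes_caribbean_{day}.csv').touch()
    (Path(tmp) / 'readme.txt').touch()  # skipped: no date in the name

    found = find_files_in_range(tmp, datetime(2023, 11, 1), datetime(2023, 11, 26))

print(found)
```

Saved as a `.py` file, a script like this is run from the command line with `python` followed by the script name; the results appear in the terminal and are gone once it is closed.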

**IPython Interpreter**

Now let's try the same in the IPython interpreter.

Start the IPython shell using Anaconda.

Code is neatly divided into blocks, and sessions and variables can be saved (%save with a line range and %load for sessions, %store and %store -r for variables).



## Analysis Notebooks 

Building on IPython, you have the Analysis Notebook. This completely eliminates the need for the command line and allows the use of Markdown and HTML/CSS for documenting code. This is the tool we will be using for the rest of the training.

There are several examples of notebooks for various real-world applications, such as:

[Notebooks for Geospatial Machine Learning](https://github.com/deepVector/geospatial-machine-learning)

[Kaggle ML/AI Notebooks](https://www.kaggle.com/code?searchQuery=spatial&language=Python)

[NASA OpenSARLab Notebooks](https://github.com/ASFOpenSARlab/opensarlab-notebooks)

[ESRI Notebooks for teaching ArcPy](https://github.com/Esri/arcgis-python-api/tree/master/labs)

The following slides show the same example in notebook format. Each cell is a different part of the algorithm; you can break it down any way that makes sense.




In [1]:
from pathlib import Path
from datetime import datetime

In [2]:
directory = 'C:\\Users\\mikef\\OneDrive\\Desktop\\Digital Earth Caribbean\\Dominica\\Training\\Training Materials\\intro\\earthquake_example'

# Generator objects are single use and cannot be saved with %store
dirobj = Path(directory).glob('*')

In [3]:
start_date = datetime(2023, 11, 1)
end_date = datetime(2023, 11, 26)  

%store start_date
%store end_date

Stored 'start_date' (datetime)
Stored 'end_date' (datetime)


In [None]:
for file in dirobj:
    
    date_str = file.stem.split('_')[2]
    date = datetime.strptime(date_str, '%Y-%m-%d')

    if start_date <= date <= end_date:
        print(file.stem)
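The note about generators being single use matters when a cell is re-run. The sketch below illustrates it on a throwaway folder with made-up file names (`a.csv`, `b.csv`, `c.csv`); it is an illustration, not part of the original notebook.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    for name in ('a.csv', 'b.csv', 'c.csv'):
        (Path(tmp) / name).touch()

    dirobj = Path(tmp).glob('*')             # a lazy, single-use iterator
    first_pass = [f.name for f in dirobj]    # the first loop consumes it
    second_pass = [f.name for f in dirobj]   # already exhausted: empty list

    # Keeping the results in a list makes them reusable
    files = sorted(Path(tmp).glob('*'))
    reuse_one = len(files)
    reuse_two = len(files)

print(sorted(first_pass), second_pass, reuse_one, reuse_two)
```

If the loop cell prints nothing the second time it is run, re-running the cell that creates `dirobj` (or storing the results in a list) is the likely fix.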

### General Notes


Let's go back to the Python interpreter. Let's say I want to find all of the files in a directory whose filenames match a certain pattern. To do this we'll use the pathlib module and its glob method (pathlib has been part of the standard library since Python 3.4).


For the example - browse to the directory and locate files based on a pattern 

Directory:
'C:\\Users\\mikef\\OneDrive\\Desktop\\Digital Earth Caribbean\\Dominica\\Training\\Training Materials\\intro\\earthquake_example'
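When the date is written into the file name, a glob pattern alone can pick out one month. The sketch below assumes hypothetical names such as `earthquakes_caribbean_2023-11-01.csv` and uses a temporary folder in place of the directory above.

```python
import tempfile
from pathlib import Path

# Made-up file names following an assumed name_region_date.csv layout
sample_names = [
    'earthquakes_caribbean_2023-10-31.csv',
    'earthquakes_caribbean_2023-11-01.csv',
    'earthquakes_caribbean_2023-11-26.csv',
]

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in sample_names:
        (folder / name).touch()

    # * matches any run of characters, so this keeps only the November 2023 files
    november = sorted(f.name for f in folder.glob('*_2023-11-*.csv'))

print(november)
```

A pattern is enough for a whole month; an arbitrary date range needs the date parsed out of the name and compared, as in the notebook example.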


When we do this in the interpreter, once we find the output we want, how do we persist it? As soon as you close the interpreter, your output will be lost (your commands, though, will be persisted for a while, depending on the configuration of the interpreter).



We can write the commands to a script that can be passed to the interpreter to run. This will give you whatever the contents of the directory are when the script is run, and you can output to a log file to persist it. But that's complicated: you are now creating multiple files, plus the documentation needed to understand the algorithm (either through comments or a separate readme), and comments restrict you to text only.
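As a rough illustration of the log file approach, a script could write its results like this. The helper `write_log`, the log file name, and the sample results are all hypothetical.

```python
import tempfile
from datetime import datetime
from pathlib import Path


def write_log(lines, log_path):
    """Write one result per line to log_path, under a timestamp header."""
    log_path = Path(log_path)
    with log_path.open('w', encoding='utf-8') as log:
        log.write(f"# run at {datetime.now().isoformat(timespec='seconds')}\n")
        for line in lines:
            log.write(f"{line}\n")
    return log_path


# Made-up results standing in for the file names found by the script
results = ['earthquakes_caribbean_2023-11-01', 'earthquakes_caribbean_2023-11-26']

with tempfile.TemporaryDirectory() as tmp:
    path = write_log(results, Path(tmp) / 'november_files.log')
    saved = path.read_text(encoding='utf-8').splitlines()

print(saved[1:])
```

Even in this small form there are now two artefacts to keep track of, the script and its log, with any explanation living in a third place.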

_________
IPYTHON
What if we can do this all in one and persist both the outputs and code? This is where IPython comes in. 

Open the command line and start the IPython interpreter.

This is the IPython interpreter. It builds on the standard Python installation to create a "kernel". It contains many of the capabilities of an IDE (tab completion and formatting) in addition to the ability to persist code, state (variable values), and outputs.

Use the %save command in IPython to save your code to a Python file: %save filename.py 1-4
Use the %store command to store variables: %store varname
and to restore them: %store -r varname


This is an improvement on the interpreter, and offers a much nicer interface for exploring datasets and sketching out functions.

I note that many of these features are present in an IDE, and in fact IPython has the %edit command, which lets you open a block in an external editor.

The IPython.embed() function can be called from inside a Python script to open an IPython shell at that point, so you can explore the variables in that part of the script.

___________________
ANALYSIS NOTEBOOKS

Building on the IPython interpreter, you have the Analysis Notebook. It abstracts away the command line in favour of a web interface, allowing notebooks to be run locally in your browser or in the cloud. We will be using this tool for the rest of the training. It allows the use of all standard Python libraries, packages, and IPython features. It has a very wide set of capabilities, and we only have time to cover what is essential to run the machine learning notebooks for the next training as well as for GIS and data science. At the end of this training you should feel comfortable enough to open and run an analysis notebook.

Let's look at the example as a notebook. 

Now I can share the notebook with code, outputs, and comments in markdown, with tables, diagrams, images, links, etc... 

I hope you can agree that this is a much more useful way of sharing certain types of Python projects. 

Although this is perfect for data exploration, sketching out algorithms and testing, it is not a complete solution for building complex software packages (although it is one tool that can assist with this). 

For spatial analysis, this approach works well. It is similar to using a model builder GUI but allows more powerful documentation of each step.

Now we have some idea of what Python is and why Analysis Notebooks exist. 



