# Getting started

## Introduction

### Aims

Develop code with a focus on **reusability** and **reproducibility**.

Why?
- Reused code is less likely to be incorrect
- If somebody can reuse your code, that's the best
  contribution you can make.
- Coding skills are universally valuable

### Software stack

- git for version control:
    - Central repository: https://github.com/see-mof/geostat
    - Send me your GitHub account names and I will give you write access.
- conda for package management
    - Scientific software can be messy to install (dependency hell)
    - Conda helps alleviate this and keeps your system clean
- Python for scientific code
    - Jupyter notebooks to present results
    - Code that is used more than once should go in a package.
    

## Cloning the repository

- To clone the repository:
    ````
    $ git clone git@github.com:see-mof/geostat.git
    ````
- Suggestion: start by working each in your own folder, ``gpm`` or ``dardar``

## Setting up conda



### Installing conda
- If you don't have conda, go to the [conda docs](https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html) and follow the installation instructions.
    
### Conda environments

  - Conda environments are a way to separate software dependencies
    for different projects.
  - We will use a conda environment to manage the software dependencies 
    that we need.
  - Conda environments can be exported to a `.yml` file and recreated from it.


### Conda environments

- The repository contains an environment file at ``conda/geostat.yml``.
- Create a new environment ``geostat`` with these packages:
  ````
  $ conda env create -n geostat -f conda/geostat.yml
  ````
    - This will take about ``2GB`` of space.
- To activate the environment:
  ````
  $ conda activate geostat
  ````
  - **Note**: This needs to be redone every time you start a new terminal
  

### Recap
- You should now have all software installed on your computer to get started
- To run this notebook, activate the ``geostat`` conda environment and start
  a jupyter notebook:
      
````
    $ conda activate geostat
    $ jupyter notebook
````
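
Since activation has to be repeated in every new terminal, it can help to confirm from inside the notebook which environment the kernel is running in. The following is a minimal sketch, assuming conda's activation has set the `CONDA_DEFAULT_ENV` environment variable; the helper name `active_conda_env` is hypothetical and not part of the repository.

```python
import os


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None."""
    # Assumption: conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV")


env = active_conda_env()
if env != "geostat":
    print(f"Active environment is {env!r}; expected 'geostat'.")
else:
    print("The geostat environment is active.")
```

If the first message appears, close the notebook, run `conda activate geostat` in the terminal and start the notebook again.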

## Accessing satellite data
- GOES data is freely available
- GPM: Need account on [https://disc.gsfc.nasa.gov/](https://disc.gsfc.nasa.gov/)
- CloudSat/DARDAR: Need account on [https://www.icare.univ-lille.fr/](https://www.icare.univ-lille.fr/)

## Finding data
- Use [NASA earthdata search](https://search.earthdata.nasa.gov/projects?p=!C1383813815-GES_DISC&pg[1][m]=download&q=gprof%20gmi&sb[0]=-71.15625%2C-26.01994%2C-33.60938%2C1.60844&m=-5.697616105023798!-89.7890625!3!1!0!0%2C2&tl=1595484979!4!!) to find GPM overpasses over Brazil
    - This produces a list of files to download; a time range of 1 year is probably a good starting point.
- CloudSat data is available from ``Dendrite``
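
The search link above encodes the region of interest in its `sb[0]` query parameter. The sketch below pulls those four numbers out with the standard library so that the same box can be reused later, for example when subsetting data. It assumes the values are ordered west, south, east, north in degrees, which is consistent with the numbers in the link but not stated anywhere in these notes. `bounding_box` is a hypothetical helper, and the URL is shortened to the relevant parameters.

```python
from urllib.parse import parse_qs, urlsplit

# Shortened version of the search link, keeping only the query and spatial parameters.
SEARCH_URL = (
    "https://search.earthdata.nasa.gov/projects"
    "?q=gprof%20gmi"
    "&sb[0]=-71.15625%2C-26.01994%2C-33.60938%2C1.60844"
)


def bounding_box(url, key="sb[0]"):
    """Return the bounding box stored in a search URL as four floats."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs decodes %2C to "," and returns a list of values per key.
    values = query[key][0].split(",")
    if len(values) != 4:
        raise ValueError(f"Expected 4 coordinates, got {len(values)}")
    return tuple(float(v) for v in values)


# Assumed order: west, south, east, north (degrees).
west, south, east, north = bounding_box(SEARCH_URL)
print(f"lon: {west} to {east}, lat: {south} to {north}")
```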

## Reading and processing data
- Downloading and processing satellite data is unfortunately quite messy
- Packages we will use:
  - **pansat**: Download and open data, developed by PhDs at our department
  - **satpy**: Open and process data
  - **xarray**: Array package built on **numpy** which simplifies handling of geophysical data.

### Installing **pansat**

- Make sure the ``geostat`` conda environment is activated

````
git clone https://github.com/see-mof/pansat
cd pansat
pip install -e .
````
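
After installing, a quick check that the interpreter can find the packages may save some debugging later. This is a minimal sketch using only the standard library. It assumes the import names match the package names listed above (`pansat`, `satpy`, `xarray`), and `missing_packages` is a hypothetical helper.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(["pansat", "satpy", "xarray"])
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All packages found.")
```

If a package is reported as missing, the most likely cause is that the ``geostat`` environment was not active when the notebook or the install command was started.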

- Setting up pansat:
  - Pansat manages your download accounts for you (They are stored encrypted in your HOME folder)
  - For NASA GES DISC:
    ````
    $ pansat add "GES DISC" <username>
    ````
  - For ICARE:
    ````
    $ pansat add "ICARE" <username>
    ````
