<div>
<table style="width: 100%">
	<tr>
		<td>
		<table style="width: 100%">
			<tr>
                <td ><center><font size="5"><b>Module 49</b></font></center>
                <center><font size="6">Digital Innovations for Water Challenges</font></center></td>
			</tr>
			<tr>
                <td><center><font size="14">Notebook 1.a</font></center></td>
			</tr>
			<tr>
                <td><center><font size="6"><b>CHIRPS data download</b></font></center></td>
			</tr>
		</table>
		</td>
		<td><center><img src='images\ihe-delft-institute_unesco_fc-lr.jpg'></img></center></td>
	</tr>
</table>
</div>

# Table of contents
1. [Learning objectives](#learningobs)
2. [Introduction](#introduction)
3. [Download data from ftp](#ftp)
4. [Download data using OPeNDAP](#opendap)

# 1. Learning objectives<a name="learningobs"></a>

- Download data from ftp
- Unzip files
- Practice building a loop
- Practice string formatting
- Download data for a subset region using OPeNDAP

# 2. Introduction<a name="introduction"></a>
In this notebook you will learn to download CHIRPS data for a region of interest using two methods:
- as tiff files from the UCSB ftp site: https://data.chc.ucsb.edu/products/CHIRPS-2.0/
- as .nc files from the NOAA CoastWatch website using the OPeNDAP (https://www.opendap.org/) framework:
https://coastwatch.pfeg.noaa.gov/erddap/griddap/chirps20GlobalDailyP05.html

You will need the following Python packages:  
>wget (needs to be installed) <br> gzip (part of the Python standard library)

In [None]:
#load modules
import wget
import gzip
import datetime
import os
from pathlib import Path

>*Reminder*: if you get a ModuleNotFoundError, you can install the missing package from within the notebook by running:  
>!conda install *packagename*  
>or  
>!pip install *packagename*
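
Before importing, you can also check which packages are available with `importlib.util.find_spec` from the standard library. The sketch below is only an illustration; the variable names `required` and `missing` are ours.

```python
import importlib.util

# Packages used in this notebook; gzip ships with Python, wget does not
required = ["wget", "gzip"]

# find_spec returns None when a top-level module cannot be found
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print(f"Missing packages: {', '.join(missing)}")
else:
    print("All required packages are available")
```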

# 3. Download CHIRPS from the UCSB ftp site <a name="ftp"></a>
From the CHIRPS homepage you can navigate to the ftp site where the data are hosted:
https://data.chc.ucsb.edu/products/CHIRPS-2.0/ <br>
From there you can browse the folders to reach the specific files you are interested in. We will learn how to download these files using wget.

### 3.a. First, create the folder where you want to place the files and navigate to it

In [None]:
Path(r"./CHIRPS_tiff").mkdir(parents=True, exist_ok=True)
os.chdir(r"./CHIRPS_tiff")

Note: If you want to check the directory you are currently in you can use the following command:
>os.getcwd()
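
The same check is available through `pathlib`, which also lets you build file paths without changing the working directory. A small sketch, in which the file name is only an example:

```python
import os
from pathlib import Path

# Path.cwd() points to the same location as os.getcwd(), as a Path object
current_dir = Path.cwd()
print(current_dir)

# Paths can be joined with "/" instead of changing directory
example_file = current_dir / "chirps-v2.0.2022.01.01.tif"
print(example_file.name)
```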

### 3.b. Next identify the path to the file you want to download and use wget to download the data
Navigate the ftp site to find the file you want to download. In the cell below we have chosen the daily data for the entire globe for 01/01/2022.

In [None]:
link_to_file_1 = r'https://data.chc.ucsb.edu/products/CHIRPS-2.0/global_daily/tifs/p05/2022/chirps-v2.0.2022.01.01.tif.gz'
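
The file name follows a regular pattern (a year folder, then `chirps-v2.0.YYYY.MM.DD.tif.gz`), so the link can also be built from a date. The helper `chirps_daily_url` below is a hypothetical name introduced for illustration. The pattern is inferred from the single link above and is assumed to hold for other dates.

```python
import datetime

BASE_URL = "https://data.chc.ucsb.edu/products/CHIRPS-2.0/global_daily/tifs/p05"

def chirps_daily_url(day):
    """Build the link to the global daily tif for a given date (assumed pattern)."""
    # %Y, %m and %d give the zero-padded year, month and day
    return f"{BASE_URL}/{day.year}/chirps-v2.0.{day:%Y.%m.%d}.tif.gz"

print(chirps_daily_url(datetime.date(2022, 1, 1)))
```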

To obtain the file using wget, simply run the command below:

In [None]:
fout = wget.download(link_to_file_1)

### 3.c. You now have a zipped file - let's unzip it using gzip

In [None]:
outfilename = fout[:-3]  # strip the '.gz' extension
# Both files are closed automatically when the with block exits
with gzip.open(fout, 'rb') as zf, open(outfilename, 'wb') as out_file:
    out_file.write(zf.read())
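
The cell above reads the whole archive into memory before writing it. For large files this can be avoided by copying in chunks with `shutil.copyfileobj`. The sketch below is one possible variant, demonstrated on a small temporary file rather than a CHIRPS download; `gunzip_file` is a hypothetical helper name.

```python
import gzip
import shutil
import tempfile
from pathlib import Path

def gunzip_file(gz_path):
    """Decompress gz_path next to itself and return the path of the new file."""
    gz_path = Path(gz_path)
    out_path = gz_path.with_suffix("")  # drops the trailing '.gz'
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)  # copies in chunks
    return out_path

# Demonstration on a temporary file
tmp_dir = Path(tempfile.mkdtemp())
demo_gz = tmp_dir / "demo.tif.gz"
with gzip.open(demo_gz, "wb") as f:
    f.write(b"example bytes")

demo_out = gunzip_file(demo_gz)
print(demo_out.name)
```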

### 3.d. Exercise: 
Write code to download all CHIRPS monthly data for Africa for 2020.
<br>Hint: you can use f-string formatting in a loop (https://docs.python.org/3/tutorial/inputoutput.html)
<br>For example: 
```python
for i in [1,2]:   
    print(f'The value of i is {i}')
```
will return: <br>
>The value of i is 1 <br>
>The value of i is 2

You can then format this output to match your needs. For example, if you look at the format of the file names, you will see that the month is a zero-padded integer, i.e. 01 for January instead of 1.

```python
for i in [1,2]:   
    print(f'The value of i is {i:02d}')
```
will return: <br>
>The value of i is 01 <br>
>The value of i is 02

>You can find some more examples here: https://docs.python.org/3/library/string.html#format-string-syntax
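
A few more format specifications that are handy when building file names, shown as a short sketch. Date objects accept `strftime` codes directly inside an f-string.

```python
import datetime

month = 3
day = datetime.date(2020, 3, 7)

padded = f"{month:02d}"    # zero-padded to width 2
wide = f"{month:04d}"      # zero-padded to width 4
stamp = f"{day:%Y.%m.%d}"  # strftime codes inside an f-string

print(padded, wide, stamp)
```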

In [None]:
#Run this cell as an example of string formatting
for i in [1,2]:
    print(f'The value of i is {i:02d}')

In [None]:
#Complete code in this cell to download the year 2020
for i in range(1, 13):
    link_to_file = f'https://data.chc.ucsb.edu/products/CHIRPS-2.0/africa_monthly/tifs/chirps-v2.0.2020.{COMPLETE HERE}.tif.gz'
    # to be removed in the notebook given to students:
    # link_to_file = f'https://data.chc.ucsb.edu/products/CHIRPS-2.0/africa_monthly/tifs/chirps-v2.0.2020.{i:02d}.tif.gz'
    fout = wget.download(link_to_file)
    print(f' Download for month {i} complete')
    outfilename = fout[:-3]  # strip the '.gz' extension
    # Both files are closed automatically when the with block exits
    with gzip.open(fout, 'rb') as zf, open(outfilename, 'wb') as out_file:
        out_file.write(zf.read())

# 4. Download CHIRPS from the NOAA server using OPeNDAP <a name="opendap"></a>
In section 3 we downloaded data from an ftp site. While we could select the months or days of interest, in terms of spatial extent we could only choose pre-defined extents such as the whole globe or Africa.
Using the OPeNDAP framework, we can also restrict the request to a specific area, so that only the data for that area are downloaded.

### 4.a. Create a new folder for the .nc data

In [None]:
Path(r"../CHIRPS_nc").mkdir(parents=True, exist_ok=True)
os.chdir(r"../CHIRPS_nc")

### 4.b. Define the time period and geographic area you wish to download

In [15]:
start_date = datetime.datetime(2020,1,1)
end_date = datetime.datetime(2021,12,1)

bounding_box = [-10, 0, 5, 15] #bounding_box = [latmin, latmax, lonmin, lonmax]

In [16]:
year_st = start_date.year
m_st = start_date.month
d_st = start_date.day
year_end = end_date.year
m_end = end_date.month
d_end = end_date.day

lat_1 = bounding_box[0]
lat_2 = bounding_box[1]
lon_1 = bounding_box[2]
lon_2 = bounding_box[3]
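
The same values can be obtained more compactly with tuple unpacking, and a quick sanity check helps catch swapped coordinates before a request is sent. This is an optional sketch; the limits used in the checks (latitude within ±90, longitude within ±180) are assumptions about a sensible request, not requirements taken from the server documentation.

```python
bounding_box = [-10, 0, 5, 15]  # [latmin, latmax, lonmin, lonmax]

# Unpack the four values in one statement
lat_min, lat_max, lon_min, lon_max = bounding_box

# Basic sanity checks (assumed limits)
assert -90 <= lat_min < lat_max <= 90, "latitudes out of order or out of range"
assert -180 <= lon_min < lon_max <= 180, "longitudes out of order or out of range"

print(lat_min, lat_max, lon_min, lon_max)
```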

In [17]:
path_to_nc = f'https://coastwatch.pfeg.noaa.gov/erddap/griddap/chirps20GlobalMonthlyP05.nc?precip%5B({year_st}-{m_st}-{d_st}T00:00:00Z):1:({year_end}-{m_end}-{d_end}T00:00:00Z)%5D%5B({lat_1}):1:({lat_2})%5D%5B({lon_1}):1:({lon_2})%5D'
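
The request can also be assembled by a small function, which makes the three bracketed constraints (time, latitude, longitude) easier to see. In this sketch `build_griddap_url` is a hypothetical helper. It writes zero-padded ISO 8601 dates via `strftime` and reuses the `%5B`/`%5D` encoding of `[` and `]` from the cell above. Whether the server treats padded and unpadded dates identically is not checked here.

```python
import datetime

BASE = "https://coastwatch.pfeg.noaa.gov/erddap/griddap/chirps20GlobalMonthlyP05.nc"

def build_griddap_url(start, end, lat_min, lat_max, lon_min, lon_max, variable="precip"):
    """Assemble a subset request: variable[(time)][(latitude)][(longitude)]."""
    t0 = start.strftime("%Y-%m-%dT00:00:00Z")
    t1 = end.strftime("%Y-%m-%dT00:00:00Z")
    constraints = [
        f"({t0}):1:({t1})",            # time range, stride 1
        f"({lat_min}):1:({lat_max})",  # latitude range, stride 1
        f"({lon_min}):1:({lon_max})",  # longitude range, stride 1
    ]
    # %5B and %5D are the URL-encoded forms of [ and ]
    return f"{BASE}?{variable}" + "".join(f"%5B{c}%5D" for c in constraints)

url = build_griddap_url(
    datetime.datetime(2020, 1, 1), datetime.datetime(2021, 12, 1), -10, 0, 5, 15
)
print(url)
```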

In [18]:
wget.download(path_to_nc)

-1 / unknown

'chirps20GlobalMonthlyP05_ea7a_a7e6_216e.nc'

## How does it work?  
Go to the website: https://coastwatch.pfeg.noaa.gov/erddap/griddap/chirps20GlobalDailyP05.html   
Explore the options and see if you can understand how we got to the creation of path_to_nc.
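
One way to make the structure of `path_to_nc` easier to read is to decode the percent-encoded brackets with `urllib.parse.unquote` and to split the query from the dataset address. The URL in this sketch is the one produced by the cells above for the example dates and bounding box.

```python
from urllib.parse import unquote

path_to_nc = (
    "https://coastwatch.pfeg.noaa.gov/erddap/griddap/chirps20GlobalMonthlyP05.nc"
    "?precip%5B(2020-1-1T00:00:00Z):1:(2021-12-1T00:00:00Z)%5D"
    "%5B(-10):1:(0)%5D%5B(5):1:(15)%5D"
)

# Split the dataset address from the query, then decode %5B -> [ and %5D -> ]
dataset, query = path_to_nc.split("?", 1)
decoded_query = unquote(query)

print(dataset)
print(decoded_query)
```

The decoded query reads as the variable name followed by one bracket per dimension, in the order time, latitude, longitude, each written as `(start):stride:(stop)`.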


## Exercise
Starting from the page https://coastwatch.pfeg.noaa.gov/erddap/griddap/index.html?page=1&itemsPerPage=1000,
try to write code similar to that presented above to download the daily instead of the monthly CHIRPS data.