### Load Modules

In [1]:
import sys, os
from tqdm import tqdm
import requests
import geopandas as gp  # used below as gp.read_file

# project root = parent of the current working directory
path = os.path.dirname(os.path.abspath(os.getcwd()))
sys.path.insert(0, path)
print('Please make sure that the path below points to your project working directory: \n')
print(path)

data_path = os.path.join(path, 'data')

from src.data.utils import *

Please make sure that the path below points to your project working directory: 

c:\Users\ucn\Education\20211\InnoLab\gather-project


### Configurations

Three configurations should be specified in this notebook to create the image dataset:

* **mapbox_token** : *string* :

    The user access token for Mapbox APIs and services. If you don't have an access token from [Mapbox](https://www.mapbox.com/), you can get one by creating an account. After you create your Mapbox account, go to the [account page](https://account.mapbox.com/) to list or create tokens. Note that each Mapbox API has [rate limits](https://docs.mapbox.com/api/overview/#rate-limits) that cap the number of requests you can make against an endpoint.
* **tile_size** : *float* :

    The edge length of each square tile in kilometers. It is used to create the grid of tiles over the area of interest and also sets the spatial resolution of the gathered images. A larger edge length yields fewer tiles, each covering more ground per pixel (lower resolution); a smaller edge length yields more tiles at higher resolution.
* **image_pixel** : *integer* :

    Pixel count of one edge of the image that will be returned for each tile, e.g. 256.

In [2]:
mapbox_token = 'pk.xxx'
tile_size = .5 # in kilometers
image_pixel = 256 # 256 --> 256x256 images
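
To make the trade-off between `tile_size` and `image_pixel` concrete, the ground distance covered by one pixel can be estimated as the tile edge divided by the pixel count. The helper below is a hypothetical illustration, not part of `src.data.utils`, and it assumes the returned image spans exactly one tile edge.

```python
def ground_resolution_m(tile_size_km: float, image_pixel: int) -> float:
    """Approximate ground distance covered by one pixel, in meters.

    Assumes the image edge spans exactly one tile edge (an idealization).
    """
    return tile_size_km * 1000 / image_pixel


# Compare a few tile sizes at a fixed 256-pixel image edge.
for size_km in (0.25, 0.5, 1.0):
    res = ground_resolution_m(size_km, 256)
    print(f"{size_km} km tile, 256 px: {res:.2f} m per pixel")
```

With the values used in this notebook (`tile_size = .5`, `image_pixel = 256`), the estimate is roughly 1.95 m per pixel.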

### Read Antananarivo Administrative level 2 boundaries

Madagascar administrative level 0 (country), 1 (region), 2 (district), 3 (commune), and 4 (fokontany) boundary and line shapefiles, geodatabase, and gazetteer are available at [The Humanitarian Data Exchange](https://data.humdata.org/dataset/madagascar-administrative-level-0-4-boundaries). This notebook uses the administrative level 2 shapefile.



In [3]:
mgd_boundary = gp.read_file(os.path.join(data_path, 'geodata', 'mdg_adm_2.zip'))
mgd_boundary.head(2)

| | ADM0_PCODE | ADM0_EN | ADM1_PCODE | ADM1_EN | ADM1_TYPE | ADM2_PCODE | ADM2_EN | ADM2_TYPE | PROV_CODE | OLD_PROVIN | PROV_TYPE | NOTES | SOURCE | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | MG | Madagascar | MG11 | Analamanga | Region | MG11101001A | 1er Arrondissement | District | 1 | Antananarivo | Old Provinces/Faritany dissolved in 2007 | Previous district name is Antananarivo Renivoh... | Note that Communes (admin 3) have become the D... | POLYGON ((47.50556 -18.89146, 47.50563 -18.891... |
| 1 | MG | Madagascar | MG11 | Analamanga | Region | MG11101002A | 2e Arrondissement | District | 1 | Antananarivo | Old Provinces/Faritany dissolved in 2007 | Previous district name is Antananarivo Renivoh... | Note that Communes (admin 3) have become the D... | POLYGON ((47.55842 -18.91178, 47.55857 -18.911... |


**Filtering with the Antananarivo area codes**

* Antananarivo Province is covered by the administrative level 1 P-codes `MG11`, `MG12`, `MG13` and `MG14`, which are the Analamanga, Vakinankaratra, Itasy and Bongolava regions of Madagascar.
* The `Antananarivo Renivohitra` district, part of the Analamanga region, is covered by the administrative level 2 P-codes starting with `MG1110100`.

In [4]:
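# keep the regions of Antananarivo Province (level 1 P-codes starting with MG1)
antananarivo_province = mgd_boundary[mgd_boundary.ADM1_PCODE.str.startswith('MG1')]
# keep the districts of Antananarivo Renivohitra (level 2 P-codes starting with MG1110100)
antananarivo_renivohitra = antananarivo_province[antananarivo_province.ADM2_PCODE.str.startswith('MG1110100')]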
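
The two-step prefix filter can be tried on a handful of records without loading the shapefile. This is a minimal sketch in plain Python: `filter_by_prefix` is a hypothetical helper, and apart from `MG11101001A` and `MG11101002A` the level 2 P-codes below are made-up placeholders.

```python
def filter_by_prefix(records, column, prefix):
    """Keep the records whose value in `column` starts with `prefix`."""
    return [r for r in records if r[column].startswith(prefix)]


# Toy records; only the first two ADM2_PCODE values come from the real data.
records = [
    {"ADM1_PCODE": "MG11", "ADM2_PCODE": "MG11101001A"},
    {"ADM1_PCODE": "MG11", "ADM2_PCODE": "MG11101002A"},
    {"ADM1_PCODE": "MG11", "ADM2_PCODE": "MG11102XXX"},  # placeholder
    {"ADM1_PCODE": "MG12", "ADM2_PCODE": "MG12201XXX"},  # placeholder
    {"ADM1_PCODE": "MG21", "ADM2_PCODE": "MG21301XXX"},  # placeholder
]

# Step 1: province level (level 1 P-codes starting with MG1).
province = filter_by_prefix(records, "ADM1_PCODE", "MG1")
# Step 2: district level (level 2 P-codes starting with MG1110100).
renivohitra = filter_by_prefix(province, "ADM2_PCODE", "MG1110100")

print(len(province), "province records,", len(renivohitra), "district records")
```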

### Get grid of tiles for Antananarivo Renivohitra

In [5]:
tiles, map_ = get_grid(antananarivo_renivohitra, tile_size, visualize = True, mapbox_token = mapbox_token)
map_

Creating grid of tiles...
Done! Square grid with 548 tiles is created!
Visualizing the grid...
Enjoy your interactive map! You can click on tile to display the tile number.
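
For intuition about what a square grid over an area involves, here is a rough, self-contained sketch. It is not the implementation of `get_grid` from `src.data.utils`: the function name `square_grid`, the 111.32 km-per-degree constant and the example bounding box are all assumptions, and the sketch tiles a plain bounding box without clipping to the district polygons.

```python
import math

KM_PER_DEG_LAT = 111.32  # rough average length of one degree of latitude (assumption)


def square_grid(min_lon, min_lat, max_lon, max_lat, tile_size_km):
    """Tile a lon/lat bounding box with cells about tile_size_km on each edge.

    Returns a list of (min_lon, min_lat, max_lon, max_lat) tuples.
    """
    mid_lat = (min_lat + max_lat) / 2
    # Convert the tile edge from kilometers to degrees.
    dlat = tile_size_km / KM_PER_DEG_LAT
    dlon = tile_size_km / (KM_PER_DEG_LAT * math.cos(math.radians(mid_lat)))
    n_cols = math.ceil((max_lon - min_lon) / dlon)
    n_rows = math.ceil((max_lat - min_lat) / dlat)
    tiles = []
    for row in range(n_rows):
        for col in range(n_cols):
            x0 = min_lon + col * dlon
            y0 = min_lat + row * dlat
            tiles.append((x0, y0, x0 + dlon, y0 + dlat))
    return tiles


# Hypothetical bounding box, chosen only for illustration.
example_tiles = square_grid(47.45, -18.98, 47.60, -18.85, 0.5)
print(len(example_tiles), "tiles in the example bounding box")
```

Because the sketch covers the whole bounding box rather than only the district area, its tile count should not be expected to match the 548 tiles reported above.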


### Get images and masks for the tiles using Mapbox Static API

Executing the next code block loops through each tile in the grid and saves one image PNG and one mask PNG per tile to the `images` and `masks` folders, respectively, in the data directory.

In [6]:
tile_bounds = tiles.loc[:,'geometry'].bounds.reset_index(drop = True)
tile_bounds = tile_bounds.apply(lambda x: x.astype(str)) 
tile_bounds['bounds'] = tile_bounds.agg(','.join, axis=1)

data_len = len(tile_bounds)
image_size = str(image_pixel) + 'x' + str(image_pixel)

print('Collecting data for ' + str(data_len) + ' tiles with the size of ' + image_size + '...')

for i in tqdm(range(data_len)):

    # zero-padded file name, e.g. image_001.png for 548 tiles
    file_name = 'image_' + str(i+1).zfill(len(str(data_len))) + '.png'

    # get mask from the custom Mapbox style
    url_mask = 'https://api.mapbox.com/styles/v1/utkucanozturk/ckziwvqvd002114nydr1hmjmm/static/[' + tile_bounds.loc[i, 'bounds'] + ']/' + image_size + '?access_token=' + str(mapbox_token) + '&attribution=false&logo=false'
    response = requests.get(url_mask)
    response.raise_for_status()  # stop instead of saving an error response as a PNG

    # write mask
    with open(os.path.join(data_path, 'masks', file_name), 'wb') as output:
        output.write(response.content)

    # get satellite image
    url_satellite = 'https://api.mapbox.com/styles/v1/mapbox/satellite-v9/static/[' + tile_bounds.loc[i, 'bounds'] + ']/' + image_size + '?access_token=' + str(mapbox_token) + '&attribution=false&logo=false'
    response = requests.get(url_satellite)
    response.raise_for_status()

    # write image
    with open(os.path.join(data_path, 'images', file_name), 'wb') as output:
        output.write(response.content)

print('Done! Check the data folder for the images!')


  0%|          | 0/548 [00:00<?, ?it/s]

Collecting data for 548 tiles with the size of 256x256...


100%|██████████| 548/548 [07:39<00:00,  1.19it/s]

Done! Check the data folder for the images!
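
Because Mapbox APIs are rate limited, a long loop over many tiles may occasionally receive an HTTP 429 (Too Many Requests) response. One possible way to handle this is to retry with an increasing delay. The sketch below is an assumption-laden illustration: `fetch_with_retry` and its parameters are hypothetical names, not part of this notebook or of `requests`, and the request itself is passed in as a callable so that the example runs without network access.

```python
import time


def fetch_with_retry(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-429 status or attempts run out.

    fetch must return a (status_code, content) tuple.
    The delay doubles after every rate-limited attempt.
    """
    for attempt in range(max_attempts):
        status, content = fetch()
        if status != 429:
            return status, content
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return status, content


# Demo with a fake fetch that is rate limited twice before succeeding.
fake_responses = iter([(429, b""), (429, b""), (200, b"fake-png-bytes")])
waits = []
status, content = fetch_with_retry(lambda: next(fake_responses), sleep=waits.append)
print(status, waits)
```

In the download loop, `fetch` could wrap `requests.get(url_mask)` and return `(response.status_code, response.content)`.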



