<a href="https://colab.research.google.com/github/Vizzuality/soils-revealed-data/blob/master/prepare_boundaries_for_mapbox.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Prepare data for the Soils Revealed project

https://github.com/Vizzuality/soils-revealed-data

`Edward P. Morris (vizzuality.)`

## Description
This notebook transforms vector boundaries into MapBox tiles format (MBTILES) using tippecanoe and uploads the resulting tiles to MapBox.

```
MIT License

Copyright (c) 2020 Vizzuality

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

# Setup

## Linux dependencies

In [1]:
%%bash
# Install AWS CLI (for MapBox uploads)
apt install --no-install-recommends -y -q awscli

Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  docutils-common python3-botocore python3-certifi python3-chardet
  python3-colorama python3-dateutil python3-docutils python3-idna
  python3-jmespath python3-pkg-resources python3-pyasn1 python3-requests
  python3-roman python3-rsa python3-s3transfer python3-six python3-urllib3
  python3-yaml sgml-base xml-core
Suggested packages:
  docutils-doc fonts-linuxlibertine | ttf-linux-libertine texlive-lang-french
  texlive-latex-base texlive-latex-recommended python3-setuptools
  python3-cryptography python3-openssl python3-socks sgml-base-doc debhelper
Recommended packages:
  python3-pil python3-pygments
The following NEW packages will be installed:
  awscli docutils-common python3-botocore python3-certifi python3-chardet
  python3-colorama python3-dateutil python3-docutils python3-idna
  python3-jmespath python3-pkg-resources python3-pyasn1 python3-reques





In [2]:
%%bash
# Install tippecanoe (for MapBox mbtiles)
apt install --no-install-recommends -q -y build-essential libsqlite3-dev zlib1g-dev
make
make install
add-apt-repository -y ppa:ubuntu-toolchain-r/test
apt update -q -y
apt install --no-install-recommends -q -y g++-5
export CXX=g++-5

git clone https://github.com/mapbox/tippecanoe.git
cd tippecanoe
make -j
make install

Reading package lists...
Building dependency tree...
Reading state information...
build-essential is already the newest version (12.4ubuntu1).
zlib1g-dev is already the newest version (1:1.2.11.dfsg-0ubuntu2).
zlib1g-dev set to manually installed.
The following additional packages will be installed:
  libsqlite3-0
Suggested packages:
  sqlite3-doc
The following NEW packages will be installed:
  libsqlite3-dev
The following packages will be upgraded:
  libsqlite3-0
1 upgraded, 1 newly installed, 0 to remove and 24 not upgraded.
Need to get 1,131 kB of archives.
After this operation, 2,136 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libsqlite3-0 amd64 3.22.0-1ubuntu0.3 [498 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libsqlite3-dev amd64 3.22.0-1ubuntu0.3 [632 kB]
Fetched 1,131 kB in 1s (975 kB/s)
(Reading database ... 147715 files and directories currently installed.)
Preparing to unpack .../libsqlit



make: *** No targets specified and no makefile found.  Stop.
make: *** No rule to make target 'install'.  Stop.




Cloning into 'tippecanoe'...
mbtiles.cpp: In function ‘void tilestats(const std::map<std::__cxx11::basic_string<char>, layermap_entry>&, size_t, json_writer&)’:
  bool first = true;
       ^
mbtiles.cpp: In function ‘void mbtiles_write_metadata(sqlite3*, const char*, const char*, int, int, double, double, double, double, double, double, int, const char*, const std::map<std::__cxx11::basic_string<char>, layermap_entry>&, bool, const char*, bool, const std::map<std::__cxx11::basic_string<char>, std::__cxx11::basic_string<char> >&, const string&, const string&)’:
    bool first = true;
         ^


In [3]:
!tippecanoe -h

tippecanoe: invalid option -- 'h'
Usage: tippecanoe [options] [file.json ...]
  Output tileset
         --output=output.mbtiles [--output-to-directory=...] [--force]
         [--allow-existing]
  Tileset description and attribution
         [--name=...] [--attribution=...] [--description=...]
  Input files and layer names
         [--layer=...] [--named-layer=...]
  Parallel processing of input
         [--read-parallel]
  Projection of input
         [--projection=...]
  Zoom levels
         [--maximum-zoom=...] [--minimum-zoom=...]
         [--extend-zooms-if-still-dropping] [--one-tile=...]
  Tile resolution
         [--full-detail=...] [--low-detail=...] [--minimum-detail=...]
  Filtering feature attributes
         [--exclude=...] [--include=...] [--exclude-all]
  Modifying feature attributes
         [--attribute-type=...] [--attribute-description=...]
         [--accumulate-attribute=...] [--empty-csv-columns-are-null]
         [--convert-stringified-ids-to-numbers]
         [--

## Python packages

In [4]:
%%bash
# Install mapbox python package
pip install mapbox

Collecting mapbox
  Downloading https://files.pythonhosted.org/packages/5b/de/8dbc8e6615e2b09159efe698116d652e2d6867c71f5ae86745ff3e495ec5/mapbox-0.18.0-py2.py3-none-any.whl
Collecting polyline>=1.3.1
  Downloading https://files.pythonhosted.org/packages/0c/4a/67edcfd960ff64221782531c867d862acc6a4e85b382a291bcb820dcde72/polyline-1.4.0-py2.py3-none-any.whl
Collecting iso3166
  Downloading https://files.pythonhosted.org/packages/a0/42/15d2ef2211ddb26deb810a21b084ee6f3d1bc7248e884dcabb5edc04b649/iso3166-1.0.1-py2.py3-none-any.whl
Installing collected packages: polyline, iso3166, mapbox
Successfully installed iso3166-1.0.1 mapbox-0.18.0 polyline-1.4.0


In [5]:
!pip list

Package                  Version        
------------------------ ---------------
absl-py                  0.9.0          
alabaster                0.7.12         
albumentations           0.1.12         
altair                   4.1.0          
asgiref                  3.2.7          
astor                    0.8.1          
astropy                  4.0.1.post1    
astunparse               1.6.3          
atari-py                 0.2.6          
atomicwrites             1.3.0          
attrs                    19.3.0         
audioread                2.1.8          
autograd                 1.3            
awscli                   1.14.44        
Babel                    2.8.0          
backcall                 0.1.0          
beautifulsoup4           4.6.3          
bleach                   3.1.4          
blis                     0.4.1          
bokeh                    1.4.0          
boto                     2.49.0         
boto3                    1.12.39        
botocore        

## Authorisation

### Google cloud storage

Either use user authorisation or a service account, save credentials to your drive or upload.

In [None]:
# For auth WITHOUT service account
#from google.colab import auth
#auth.authenticate_user()

# https://cloud.google.com/resource-manager/docs/creating-managing-projects
#project_id = "soc-platform"
#!gcloud config set project {project_id}

In [8]:
# Mount drive
from google.colab import drive
drive.mount('/content/drive')

Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive


In [None]:
# Copy GC credentials to home (place in your GDrive, and connect Drive)
!cp "/content/drive/My Drive/soc-platform-6a9bf204638c.json" "/root/.soc-platform-6a9bf204638c.json"

In [10]:
# Auth WITH service account
!gcloud auth activate-service-account \
  edward-morris-vizzuality@soc-platform.iam.gserviceaccount.com  \
          --key-file=/root/.soc-platform-6a9bf204638c.json --project="soc-platform"


Activated service account credentials for: [edward-morris-vizzuality@soc-platform.iam.gserviceaccount.com]


In [11]:
# Test GC auth
!gsutil ls "gs://vizz-data-transfer"

gs://vizz-data-transfer/SOC_maps/


### MapBox

Create a JSON file and add it to your drive or upload:
```
{"MB_USER:"user-name", "MB_TOKEN":"token"}
```

In [None]:
# Copy GC credentials to home (place in your GDrive, and connect Drive)
!cp "/content/drive/My Drive/copernicus-forests-mapbox.json" "/root/.copernicus-forests-mapbox.json"

In [17]:
# Set up Mapbox (S3) credentials as environmental variables
import json
import os

# Set user and token as environment variables
c = json.loads(open("/root/.copernicus-forests-mapbox.json").read())
os.environ['MB_USER'] = c['MB_USER']
os.environ['MB_TOKEN'] = c['MB_TOKEN']

# Make call to mapbox api and save return to file
!curl -X POST https://api.mapbox.com/uploads/v1/${MB_USER}/credentials?access_token=${MB_TOKEN} > credentials.json
r = json.loads(open("credentials.json").read())
#print(r)

# Set credentials as environ variables
os.environ['MB_BUCKET'] = r['bucket']
os.environ['MB_KEY'] = r['key']
os.environ['AWS_ACCESS_KEY_ID'] = r['accessKeyId']
os.environ['AWS_SECRET_ACCESS_KEY'] = r['secretAccessKey']
os.environ['AWS_SESSION_TOKEN'] = r['sessionToken']

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   857  100   857    0     0   2158      0 --:--:-- --:--:-- --:--:--  2158


# Utils

## copy_gcs

In [None]:
import os
import subprocess

def copy_gcs(source_list, dest_list, opts=""):
  """
  Use gsutil to copy each corresponding item in source_list
  to dest_list
  """
  for s, d  in zip(source_list, dest_list):
    cmd = f"gsutil -m cp -r {opts} {s} {d}"
    print(f"Processing: {cmd}")
    r = subprocess.call(cmd, shell=True)
    if r == 0:
        print("Task created")
    else:
        print("Task failed")
  print("Finished copy")

## upload_to_mapbox

In [None]:
# Upload task for mapbox
import os
from mapbox import Uploader

def upload_to_mapbox(file_path, tileset_name):
  """
  Given a local file path and a MapBox tileset name
  push to MapBox AWS S3 staging and create MapBox upload task
  """
  username = os.getenv("MB_USER")
  my_token = os.getenv("MB_TOKEN")
  u = Uploader(access_token=my_token)  # handles authentication
  tileset = f"{username}.{tileset_name}"  # name your tileset
  job = u.upload(open(file_path, 'rb'), tileset)  # upload happens here
  # job = u.create(url, tileset, name=tileset_name)  # starts the tiling job
  status = job.status_code
  print(status)

## create_mbtiles

In [None]:
import os
import subprocess

def create_mbtiles(source_path, dest_path, layer_name, opts="-zg --drop-densest-as-needed --extend-zooms-if-still-dropping --force --read-parallel"):
  """
  Use tippecanoe to to create a MBTILE at dest_path from source_path.
  layer_name is used for the name of the layer in the MBTILE.
  Regex file path (/*.geojson) is supported for source_path.
  """
  cmd = f"tippecanoe -o {dest_path} -l {layer_name} {opts} {source_path}"
  print(f"Processing: {cmd}")
  r = subprocess.call(cmd, shell=True)
  if r == 0:
      print("Task created")
  else:
      print("Task failed")
  print("Finished processing")

# Process data

## Create MBTILES

In [27]:
layer_name = "SWE_biovar_species"
source_path = "'/content/drive/My Drive/copernicus-forests/SWE_zonal_biovar_ISEA-3-HEXAGON_grid.geojson'"
dest_path = "'/content/drive/My Drive/copernicus-forests/SWE-bv-spp.mbtiles'"
create_mbtiles(source_path, dest_path, layer_name, opts="-zg --drop-densest-as-needed --extend-zooms-if-still-dropping --force --read-parallel")

Processing: tippecanoe -o '/content/drive/My Drive/copernicus-forests/SWE-bv-spp.mbtiles' -l SWE_biovar_species -zg --drop-densest-as-needed --extend-zooms-if-still-dropping --force --read-parallel '/content/drive/My Drive/copernicus-forests/SWE_zonal_biovar_ISEA-3-HEXAGON_grid.geojson'
Task created
Finished processing


## Upload to MapBox

In [None]:
# Add to Mapbox
import glob 
import os
path = '/content/drive/My Drive/copernicus-forests/'
files = [f for f in glob.glob(path + "**/*.mbtiles", recursive=True)]
print(files)
for f in files:
  print(f)
  upload_to_mapbox(f, os.path.splitext(os.path.basename(f))[0])

['/content/drive/My Drive/copernicus-forests/SWE-bv-spp.mbtiles']
/content/drive/My Drive/copernicus-forests/SWE-bv-spp.mbtiles
201
