[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/wri/cities-urbanshift/blob/main/geospatial-layers/scripts/administrative-boundaries/administrative_boundaries_updates.ipynb)

# Context

## Objective

This notebook provides an exploratory analysis of the different versions of the administrative boundaries of the UrbanShift cities and generates a `geojson` file compiling the most relevant versions.

Two main data sources are used to collect the administrative boundaries of the 23 UrbanShift cities:
- **The global geoBoundaries database**: It provides boundaries at various administrative levels. A JavaScript library (`cities360`) has been developed to extract these boundaries based on the city name.
- **City-specific boundaries**: The boundaries collected from the geoBoundaries database may differ from the actual administrative limits. The extracted boundaries were shared with the city coordinators, and some of them need to be replaced by new boundaries collected in various formats.

Our goal is to examine the updated boundaries, compare them with those extracted from the geoBoundaries database, and build a `geojson` file that compiles these geospatial coordinates for use in the subsequent analyses.

The table below gives the complete list of administrative boundaries, highlighting those that need to be updated and describing the format and content of the updates:

| Country | City name |  Need for <br/> specific updates | Updates <br/> data sources | Updates format | Updates storage | Integration status | Comments |
|---- | ---- | ---- |---- |---- |---- |---- | ---- |
| Argentina | Mendoza |  `Yes` | [AMBADATA](https://www.ambadata.gob.ar/mapa) [UNICIPIO](https://www.mendoza.gov.ar/unicipio/que-es-unicipio/) | `NA`| `NA` | `Waiting for updates` | Boundaries seem to be wrong. <br/> Waiting for validation from the Ministry of Environment  |
| Argentina | Mar Del Plata |   `No` | `NA` | `NA` | `NA` | `NA` |`NA`|
| Argentina | Buenos Aires |   `Yes` | [AMBADATA](https://www.ambadata.gob.ar/mapa) [UNICIPIO](https://www.mendoza.gov.ar/unicipio/que-es-unicipio/) | `NA`| `NA` | `Waiting for updates` | Boundaries seem to be wrong. <br/> Waiting for validation from the Ministry of Environment  |
| Argentina | Ushuaia |   `No` | `NA` | `NA` | `NA` | `NA` |`NA`|
| Argentina | Salta |   `No` | `NA` | `NA` | `NA` | `NA` |`NA`|
| Brazil | Teresina |   `Yes` | [Teresina Planning Deep](https://semplan.pmt.pi.gov.br/ride-teresina/) | `SHP` | [GCP storage](https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/BRA-Teresina.geojson)  | `Updates integrated` |Pau D'Arco Municipality was added by law to the Teresina IDR in 2017 <br/> Corrects limits sent by Bruno. |
| Brazil | Florianopolois |   `Yes` | [Florianopolis Metropolitan Agency (SUDERF)](https://www.scc.sc.gov.br/index.php/suderf/a-superintendencia)  | `SHP` | [GCP storage](https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/BRA-Florianopolois.geojson) | `Updates integrated` |By law, the limits sent are from the expanded MR. <br/> The GEF-7 projects focus only on the core MR. <br/> Corrects limits sent by Bruno. |
| Brazil | Belem |  `Yes` |  [IBGE2020](https://www.ibge.gov.br/geociencias/organizacao-do-territorio/malhas-territoriais/15774-malhas.html?=&t=downloads)  | `SHP` | [GCP storage](https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/BRA-Belem.geojson) | `Updates integrated` | The layer sent had only the land limits. Some cities have significant water domains. <br/> Corrects limits sent by Bruno. |
| Costa Rica | San Jose |  `Yes` |  [MOPT(2017)](https://www.mopt.go.cr/wps/portal/Home/informacionrelevante/planificacion/mapasRVN/!ut/p/z1/04_Sj9CPykssy0xPLMnMz0vMAfIjo8ziPQPcDQy9TQx8DMKMXQ0cnXycfN1Mzd0tQs30w8EKDHAARwP9KGT97oZuhgaOgY4GwcGuRsYGRiZQ_XgURBGyPwqsBN1epyAjJ2MDA3d_I6wKUJxYkBsaYZDpqAgAP_F8dg!!/dz/d5/L2dBISEvZ0FBIS9nQSEh/) [MIVAH(2020)](https://geoexplora-mivah.opendata.arcgis.com/datasets/MIVAH::limite-del-plan-gam-1982/explore?location=9.930180%2C-83.991104%2C10.00)  | `SHP` | [GCP storage](https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/CRI-San_Jose.geojson) | `Updates integrated` | The geoBoundaries layer corresponds to the full extent of the cantons (municipalities) that form the GAM. <br/>  However, the GAM Plan splits some cantons and limits the GAM to a smaller area. <br/> The correct limit is officially distributed by the Ministries of Transportation and Housing.  <br/>  The inner polygon corresponds to the Urban Macrozone of the GAM.  <br/>  This needs to be validated at a later stage, once there is a GEF-7 project manager in Costa Rica.|
| Rwanda | Kigali |  `No` | `NA`   | `NA` | `NA` | `NA` | `NA`|
| Sierra Leone | Freetown |  `No` |  `NA`  | `NA` | `NA` |`NA` |`NA` |
| Morocco | Marrakech |  `No` |  `NA`  | `NA` |`NA`  | `NA` | `NA`|
| India | Chennai |  `No` |  `NA`  | `NA` | `NA` | `Waiting for updates`  | Final confirmation would be after the submission of the project to GEF in September. |
| India | Pune |  `No` | `NA`   | `NA` | `NA` | `Waiting for updates` | Final confirmation would be after the submission of the project to GEF in September. |
| India | Surat |  `No` |  `NA`  |`NA`  | `NA` | `Waiting for updates` | Final confirmation would be after the submission of the project to GEF in September. |
| Indonesia | Jakarta |  `Yes` | [GIS BPDB](http://gis.bpbd.jakarta.go.id/layers/geonode%3Adki_kota#more)   | `geojson` | [GCP storage](https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/IDN-Jakarta.geojson) | `Updates integrated` | |
| Indonesia | Semarang |  `Yes` | [Open Street Map](https://openstreetmap.id/en/data-semarang/)   | `geojson` | [GCP storage](https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/IDN-Semarang.geojson) |  `Updates integrated`| |
| Indonesia | Palembang |  `Yes` |  [bappedalitbang palembang](http://bappedalitbang.palembang.go.id/peta-batas-administrasi-kota-palembang.html) | `jpg` | [GCP storage](https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/IDN-Palembang.jpg) | `Need geospatial format` | |
| Indonesia | Balikpapan |  `Yes` |  [peta adminitrasi balikpapan](http://web.balikpapan.go.id/uploaded/peta/petaadminitrasibalikpapan.jpg) | `jpg` |  [GCP storage](https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/IDN-Balikpapan.jpg)| `Need geospatial format`  | According to the map legend, the dotted line marks the city boundary. |
| Indonesia | Bitung |  `Yes` |  [GIS PETA](http://103.12.84.58/gis/peta/slum) | `unknown` |  |  `Need access to data` | |
| China | Chengdu | `No`  |  `NA` | `NA` |`NA`  | `NA` | |
| China | Chongqing | `No`  |  `NA` | `NA` |`NA`  | `NA` | |
| China | Ningbo | `Yes`  | OpenStreetMap instead of Google Maps | `geojson`  | [GCP storage](https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/CHN-Ningbo.geojson) |   `Updates integrated` | The geoBoundaries extent for Ningbo is wrong. <br/> It extends into Taizhou City, which lies outside the boundary. <br/>  Google Maps has the correct city boundary.|

## Administrative boundaries data schema

To keep the administrative boundaries data coherent, we propose a standardized data model for storing the boundary information. The data will be stored as a `geojson` file combining both the geographical coordinates and the properties of the 23 cities and their different versions:


| Field | Type |  Description | Examples |
| ---- |---- | ---- | ---- |
| country_iso3 | `string`| Three-letter ISO code of the country (ISO 3166-1 alpha-3) | `ARG` `BRA`... |
| city_name | `string` | The city name as used for extracting data from the `cities360` library, in lower case with underscores instead of spaces. | `bitung` `san_jose` |
| city_name_viz | `string` | The city name formatted for visualization (first letter in upper case, spaces kept) | `Bitung` `San Jose` |
| boundary_data_source | `string` | The name of the data source used for extracting the boundary coordinates. | `geoBoundaries` `city_specific` |
| boundary_use | `string` | Since different boundary versions may exist for one city depending on the data source, this field indicates which one to use in the analysis | `false` `true` |
| boundary_id | `string` | A Universally Unique Identifier (UUID), randomly generated for each boundary | `12345678-1234-5678-1234-567812345678`|
| geometry | `geom` | The geographical coordinates of the boundary (CRS: EPSG:4326). Allowed geometry types are: `POLYGON`, `MULTIPOLYGON` | |
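
As a minimal sketch of how the non-geometry properties of one record could be shaped to this schema, the snippet below derives `city_name` from `city_name_viz` following the rule in the table (lower case, underscores) and draws a random `boundary_id` with the standard `uuid` module. The helper names `to_city_name` and `make_boundary_properties` are hypothetical and belong neither to this notebook nor to `cities360`; `boundary_use` is written as text only because the table types it as `string`.

```python
import uuid

# Data source labels listed in the schema table
ALLOWED_SOURCES = {"geoBoundaries", "city_specific"}


def to_city_name(city_name_viz: str) -> str:
    """Lower-case the display name and replace spaces with underscores."""
    return "_".join(city_name_viz.strip().lower().split())


def make_boundary_properties(country_iso3, city_name_viz, boundary_data_source, boundary_use):
    """Build the property dictionary of one boundary (hypothetical helper)."""
    if len(country_iso3) != 3 or not country_iso3.isalpha():
        raise ValueError("country_iso3 must be a three-letter code")
    if boundary_data_source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown data source: {boundary_data_source}")
    return {
        "country_iso3": country_iso3.upper(),
        "city_name": to_city_name(city_name_viz),
        "city_name_viz": city_name_viz,
        "boundary_data_source": boundary_data_source,
        "boundary_use": "true" if boundary_use else "false",
        "boundary_id": str(uuid.uuid4()),  # random UUID (version 4)
    }


props = make_boundary_properties("CRI", "San Jose", "city_specific", True)
print(props)
```

The geometry itself is left out on purpose: it would be attached separately as a `POLYGON` or `MULTIPOLYGON` in EPSG:4326.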



The raw input data collected from `cities360` and from the city updates are stored in two distinct folders in a Google Cloud Storage bucket under the path `urbanshift/administrative_boundaries/raw/`:

- **source_specific_updates/**: This folder contains the files collected from the city coordinators. The files have been converted to `geojson` format, with one file per city. The file names correspond to the city names, following the existing codes.
- **source_lib_cities360/**: This folder contains the administrative boundaries as obtained from the `cities360` library (using the geoBoundaries database). It consists of a single file combining the different administrative boundaries.
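
The cells further down build file URLs by concatenating this path with a city code. The sketch below captures that convention, assuming the public `https://storage.googleapis.com/` endpoint and the `<ISO3>-<City_Name>` file codes that appear in the storage links of the table above. The function name `city_update_url` is hypothetical.

```python
# Public endpoint and bucket path as they appear in the storage links above
BASE_URL = "https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/"


def city_update_url(city_code: str, extension: str = "geojson") -> str:
    """Return the URL of a city-specific update file (hypothetical helper)."""
    return f"{BASE_URL}source_specific_updates/{city_code}.{extension}"


print(city_update_url("IDN-Jakarta"))
print(city_update_url("IDN-Palembang", extension="jpg"))
```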

In [1]:
#! pip install leafmap #geopandas leafmap

In [332]:
import geopandas as gpd
import plotly.express as px
import folium
import leafmap
import IPython
import uuid
from google.cloud import storage
from google.colab import auth
auth.authenticate_user()

# GeoBoundaries data

In [366]:
# load data from Google Cloud Storage
boundaries_raw_cities360 = gpd.read_file('https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_lib_cities360/administrative_boundaries_cities360.geojson')
#boundaries_raw_cities360.head()

In [367]:
# map data with the new schema
boundaries_mapped = (boundaries_raw_cities360
      .filter(['country_iso', 'name','id','geometry'])
      .rename(columns={'country_iso': 'country_iso3', 
                       'name': 'city_name_viz',
                       'id': 'city_name'})
      .assign(boundary_data_source = 'cities360')
)

In [368]:
display(boundaries_mapped)

Unnamed: 0,country_iso3,city_name_viz,city_name,geometry,boundary_data_source
0,ARG,Mendoza,ARG-Mendoza,"POLYGON ((-70.09376 -33.05128, -70.09369 -33.0...",cities360
1,ARG,Mar del Plata,ARG-Mar_del_Plata,GEOMETRYCOLLECTION (LINESTRING (-57.52693 -37....,cities360
2,ARG,Ushuaia,ARG-Ushuaia,"MULTIPOLYGON (((-64.35062 -54.84401, -64.35014...",cities360
3,ARG,Salta,ARG-Salta,"POLYGON ((-65.53171 -25.02690, -65.53166 -25.0...",cities360
4,ARG,Buenos Aires,ARG-Buenos_Aires,"MULTIPOLYGON (((-58.36618 -34.59744, -58.36609...",cities360
5,BRA,Teresina,BRA-Teresina,"POLYGON ((-42.59900 -5.35000, -42.60100 -5.251...",cities360
6,BRA,Florianopolois,BRA-Florianopolois,"MULTIPOLYGON (((-48.58167 -27.76205, -48.57442...",cities360
7,BRA,Belem,BRA-Belem,"MULTIPOLYGON (((-48.54139 -1.35451, -48.53229 ...",cities360
8,CRI,San Jose,CRI-San_Jose,"POLYGON ((-83.76411 9.60486, -83.76250 9.60384...",cities360
9,RWA,Kigali,RWA-Kigali,"POLYGON ((29.97953 -1.88664, 29.98450 -1.89487...",cities360


In [369]:
m = leafmap.Map(zoom=1)
m.add_gdf(boundaries_mapped, layer_name="Cities 360 Boundaries", fill_colors=['red'])
m

Output hidden; open in https://colab.research.google.com to view.

# Specific boundaries to integrate

## Indonesia

### Jakarta

#### Comparison

In [370]:
#-----------------------------------------------------
# Read data
#-----------------------------------------------------

# specify city name
city_name = 'IDN-Jakarta'
# define path
data_path ='https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/'+city_name+'.geojson'
# get updated boundaries
boundaries_IDN_jakarta = gpd.read_file(data_path)
boundaries_IDN_jakarta.head()

Unnamed: 0,id,KAB_NAME,Kota,geometry
0,dki_kota.1,JAKARTA BARAT,Jakarta Barat,"MULTIPOLYGON (((106.73745 -6.22378, 106.73747 ..."
1,dki_kota.2,JAKARTA PUSAT,Jakarta Pusat,"MULTIPOLYGON (((106.82274 -6.20264, 106.82261 ..."
2,dki_kota.3,JAKARTA SELATAN,Jakarta Selatan,"MULTIPOLYGON (((106.84413 -6.31748, 106.84412 ..."
3,dki_kota.4,JAKARTA TIMUR,Jakarta Timur,"MULTIPOLYGON (((106.84123 -6.32690, 106.84112 ..."
4,dki_kota.5,JAKARTA UTARA,Jakarta Utara,"MULTIPOLYGON (((106.87774 -6.09487, 106.87773 ..."


In [371]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_IDN_jakarta["centroid_x"] = boundaries_IDN_jakarta.centroid.x
boundaries_IDN_jakarta["centroid_y"] = boundaries_IDN_jakarta.centroid.y
# define centroid for map view
map_center_x = boundaries_IDN_jakarta.centroid_x.mean()
map_center_y = boundaries_IDN_jakarta.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped[boundaries_mapped.city_name == city_name], 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m.add_gdf(boundaries_IDN_jakarta, 
          layer_name="Updated version", 
          fill_colors=['green'])
m


Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
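
The warning above arises because centroids are computed on longitude/latitude coordinates. For merely centering a map view, the midpoint of the bounding box is sufficient and involves no centroid computation; with GeoPandas the bounding box is available from `GeoDataFrame.total_bounds`. The sketch below shows the same idea on a plain GeoJSON-like geometry using only the standard library. The names `iter_positions` and `bbox_center` and the sample coordinates are illustrative and are not taken from the Jakarta file.

```python
def iter_positions(coords):
    """Yield (lon, lat) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def bbox_center(geometry):
    """Return [lat, lon] of the bounding-box midpoint (the order used for `center` above)."""
    lons, lats = zip(*iter_positions(geometry["coordinates"]))
    return [(min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2]


# Illustrative square, not real boundary data
sample = {
    "type": "Polygon",
    "coordinates": [[[106.0, -7.0], [108.0, -7.0], [108.0, -5.0], [106.0, -5.0], [106.0, -7.0]]],
}
center = bbox_center(sample)
print(center)  # [-6.0, 107.0]
```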





#### Integration

In [372]:
# Since Jakarta is composed of 5 sub-boundaries, we need to unify the different polygons

# create a union field
boundaries_IDN_jakarta['city_name'] = city_name
# unify the polygons
boundaries_IDN_jakarta_union = boundaries_IDN_jakarta.dissolve(by='city_name').reset_index()

In [373]:
# map data with the new schema
boundaries_IDN_jakarta_mapped = (boundaries_IDN_jakarta_union
      .filter(['geometry'])
      .assign(country_iso3 = 'IDN',
              city_name_viz = 'Jakarta',
              city_name = city_name,
              boundary_data_source = 'city_specific')
)


In [374]:
# append to the existing dataframe (DataFrame.append was removed in pandas 2.0)
import pandas as pd
boundaries_mapped = pd.concat([boundaries_mapped, boundaries_IDN_jakarta_mapped], ignore_index=True)
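
After this append, `boundaries_mapped` holds two rows for Jakarta, one per data source. The `boundary_use` field of the schema is meant to flag which row the analysis should use. Here is a minimal sketch on plain dictionaries, assuming the city-specific version is preferred whenever it exists (a rule this notebook does not state explicitly). The helper `flag_boundary_use` and the sample rows are hypothetical.

```python
def flag_boundary_use(rows, preferred="city_specific"):
    """Return copies of the rows with a boolean `boundary_use` flag added."""
    # Cities for which the preferred data source is available
    cities_with_preferred = {
        row["city_name"] for row in rows if row["boundary_data_source"] == preferred
    }
    flagged = []
    for row in rows:
        if row["city_name"] in cities_with_preferred:
            use = row["boundary_data_source"] == preferred
        else:
            use = True  # only one version is known for this city
        flagged.append({**row, "boundary_use": use})
    return flagged


rows = [
    {"city_name": "IDN-Jakarta", "boundary_data_source": "cities360"},
    {"city_name": "IDN-Jakarta", "boundary_data_source": "city_specific"},
    {"city_name": "RWA-Kigali", "boundary_data_source": "cities360"},
]
for row in flag_boundary_use(rows):
    print(row)
```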

### Semarang


#### Comparison

In [375]:
#-----------------------------------------------------
# Read data
#-----------------------------------------------------

# specify city name
city_name = 'IDN-Semarang'
# define path
data_path ='https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/'+city_name+'.geojson'
# get updated boundaries
boundaries_IDN_semarang = gpd.read_file(data_path)
boundaries_IDN_semarang.head()

Unnamed: 0,boundary,name,type,geometry
0,administrative,Semarang,boundary,"MULTIPOLYGON (((110.42534 -6.94929, 110.42651 ..."


In [376]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_IDN_semarang["centroid_x"] = boundaries_IDN_semarang.centroid.x
boundaries_IDN_semarang["centroid_y"] = boundaries_IDN_semarang.centroid.y
# define centroid for map view
map_center_x = boundaries_IDN_semarang.centroid_x.mean()
map_center_y = boundaries_IDN_semarang.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped[boundaries_mapped.city_name == city_name], 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m.add_gdf(boundaries_IDN_semarang, 
          layer_name="Updated version", 
          fill_colors=['green'])
m


Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.





#### Integration

In [377]:
# map data with the new schema
boundaries_IDN_semarang_mapped = (boundaries_IDN_semarang
      .filter(['geometry'])
      .assign(country_iso3 = 'IDN',
              city_name_viz = 'Semarang',
              city_name = city_name,
              boundary_data_source = 'city_specific')
)


In [378]:
# append to the existing dataframe (DataFrame.append was removed in pandas 2.0)
import pandas as pd
boundaries_mapped = pd.concat([boundaries_mapped, boundaries_IDN_semarang_mapped], ignore_index=True)

### Palembang

#### Comparison


The administrative boundaries of Palembang were collected in `jpg` format. We therefore compare them visually with the existing boundaries collected from the geoBoundaries database. Updating the boundaries requires data in a geospatial format.

In [379]:
#-----------------------------------------------------
# Read data from geoBoundary
#-----------------------------------------------------

# specify city name
city_name = 'IDN-Palembang'

boundaries_mapped_IDN_palembang = boundaries_mapped[boundaries_mapped.city_name == city_name]

In [380]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_mapped_IDN_palembang["centroid_x"] = boundaries_mapped_IDN_palembang.centroid.x
boundaries_mapped_IDN_palembang["centroid_y"] = boundaries_mapped_IDN_palembang.centroid.y
# define centroid for map view
map_center_x = boundaries_mapped_IDN_palembang.centroid_x.mean()
map_center_y = boundaries_mapped_IDN_palembang.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped_IDN_palembang, 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m


Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy




In [295]:

# define path to collect new boundaries
data_path ='https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/'+city_name+'.jpg'
# display image
IPython.display.Image(data_path, width = 1000)

Output hidden; open in https://colab.research.google.com to view.

### Balikpapan

#### Comparison

The administrative boundaries of Balikpapan were collected in `jpg` format. We therefore compare them visually with the existing boundaries collected from the geoBoundaries database. Updating the boundaries requires data in a geospatial format.

In [296]:
#-----------------------------------------------------
# Read data from geoBoundary
#-----------------------------------------------------

# specify city name
city_name = 'IDN-Balikpapan'

boundaries_mapped_IDN_balikpapan = boundaries_mapped[boundaries_mapped.city_name == city_name]

In [297]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_mapped_IDN_balikpapan["centroid_x"] = boundaries_mapped_IDN_balikpapan.centroid.x
boundaries_mapped_IDN_balikpapan["centroid_y"] = boundaries_mapped_IDN_balikpapan.centroid.y
# define centroid for map view
map_center_x = boundaries_mapped_IDN_balikpapan.centroid_x.mean()
map_center_y = boundaries_mapped_IDN_balikpapan.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped_IDN_balikpapan, 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m


Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.




A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy




In [221]:
# define path to collect new boundaries
data_path ='https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/'+city_name+'.jpg'
# display image
IPython.display.Image(data_path, width = 1000)

Output hidden; open in https://colab.research.google.com to view.

### Bitung

#### Comparison

We could not access the Bitung data through the specified link: http://103.12.84.58/gis/peta/slu

We only show here the boundaries as collected from the geoBoundaries database.

In [381]:
#-----------------------------------------------------
# Read data from geoBoundary
#-----------------------------------------------------

# specify city name
city_name = 'IDN-Bitung'

boundaries_mapped_IDN_bitung = boundaries_mapped[boundaries_mapped.city_name == city_name]

In [299]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_mapped_IDN_bitung["centroid_x"] = boundaries_mapped_IDN_bitung.centroid.x
boundaries_mapped_IDN_bitung["centroid_y"] = boundaries_mapped_IDN_bitung.centroid.y
# define centroid for map view
map_center_x = boundaries_mapped_IDN_bitung.centroid_x.mean()
map_center_y = boundaries_mapped_IDN_bitung.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped_IDN_bitung, 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m

Output hidden; open in https://colab.research.google.com to view.

## China

### Ningbo

To obtain a version of the Ningbo administrative boundaries close to the one shown in Google Maps, we used OpenStreetMap in two manual steps:

- Get OSM identifier of the boundary: https://nominatim.openstreetmap.org/ui/search.html?q=Ningbo+
- Get the geojson file of the boundary: http://polygons.openstreetmap.fr/index.py?id=3478607

The extracted `geojson` file is stored in Google Cloud Storage.
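
The two manual steps can be expressed as URL construction. The sketch below only builds the links shown above with the standard library and does not call either service. `nominatim_search_url` and `osm_polygon_page_url` are hypothetical names, and `3478607` is the OSM identifier quoted in the second step.

```python
from urllib.parse import urlencode


def nominatim_search_url(query: str) -> str:
    """Search page used in step 1 to look up the OSM identifier of a boundary."""
    return "https://nominatim.openstreetmap.org/ui/search.html?" + urlencode({"q": query})


def osm_polygon_page_url(osm_id: int) -> str:
    """Page used in step 2 to retrieve the boundary polygon for an OSM identifier."""
    return "http://polygons.openstreetmap.fr/index.py?" + urlencode({"id": osm_id})


print(nominatim_search_url("Ningbo"))
print(osm_polygon_page_url(3478607))
```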

#### Comparison

In [382]:
#-----------------------------------------------------
# Read data
#-----------------------------------------------------

# specify city name
city_name = 'CHN-Ningbo'
# define path
data_path ='https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/'+city_name+'.geojson'
# get updated boundaries
boundaries_CHN_Ningbo = gpd.read_file(data_path)
boundaries_CHN_Ningbo.head()

Unnamed: 0,geometry
0,GEOMETRYCOLLECTION (MULTIPOLYGON (((120.87536 ...
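
The Ningbo file loads as a `GEOMETRYCOLLECTION`, whereas the schema defined earlier allows only `POLYGON` and `MULTIPOLYGON`. One possible way to bring such a geometry in line, sketched here on a GeoJSON-like dictionary with the standard library only, is to collect the polygonal members into a single `MultiPolygon` and drop the rest. The helper name `collection_to_multipolygon` and the sample geometry are illustrative. Whether non-polygon members may safely be discarded is an assumption to check against the actual file.

```python
def collection_to_multipolygon(geometry):
    """Gather all polygons of a (possibly nested) geometry into one MultiPolygon."""
    polygons = []

    def collect(geom):
        kind = geom["type"]
        if kind == "Polygon":
            polygons.append(geom["coordinates"])
        elif kind == "MultiPolygon":
            polygons.extend(geom["coordinates"])
        elif kind == "GeometryCollection":
            for member in geom["geometries"]:
                collect(member)
        # points and lines are ignored (assumption: they carry no boundary area)

    collect(geometry)
    if not polygons:
        raise ValueError("no polygonal geometry found")
    return {"type": "MultiPolygon", "coordinates": polygons}


# Illustrative input, not the real Ningbo geometry
square = [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]
sample = {
    "type": "GeometryCollection",
    "geometries": [
        {"type": "LineString", "coordinates": [[0, 0], [1, 1]]},
        {"type": "MultiPolygon", "coordinates": [square]},
    ],
}
result = collection_to_multipolygon(sample)
print(result["type"], len(result["coordinates"]))  # MultiPolygon 1
```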


In [383]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_CHN_Ningbo["centroid_x"] = boundaries_CHN_Ningbo.centroid.x
boundaries_CHN_Ningbo["centroid_y"] = boundaries_CHN_Ningbo.centroid.y
# define centroid for map view
map_center_x = boundaries_CHN_Ningbo.centroid_x.mean()
map_center_y = boundaries_CHN_Ningbo.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped[boundaries_mapped.city_name == city_name], 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m.add_gdf(boundaries_CHN_Ningbo, 
          layer_name="Updated version", 
          fill_colors=['green'])
m

Output hidden; open in https://colab.research.google.com to view.

#### Integration

In [384]:
# map data with the new schema
boundaries_CHN_Ningbo_mapped = (boundaries_CHN_Ningbo
      .filter(['geometry'])
      .assign(country_iso3 = 'CHN',
              city_name_viz = 'Ningbo',
              city_name = city_name,
              boundary_data_source = 'city_specific')
)

In [385]:
# append to the existing dataframe (DataFrame.append was removed in pandas 2.0)
import pandas as pd
boundaries_mapped = pd.concat([boundaries_mapped, boundaries_CHN_Ningbo_mapped], ignore_index=True)

## Brazil

### Teresina

#### Comparison

In [386]:
#-----------------------------------------------------
# Read data
#-----------------------------------------------------

# specify city name
city_name = 'BRA-Teresina'
# define path
data_path ='https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/'+city_name+'.geojson'
# get updated boundaries
boundaries_BRA_Teresina = gpd.read_file(data_path)
boundaries_BRA_Teresina.head()

Unnamed: 0,CD_MUN,NM_MUN,SIGLA_UF,AREA_KM2,geometry
0,2200400,Altos,PI,957.232,"POLYGON ((-42.39159 -5.16794, -42.39269 -5.169..."
1,2201606,Beneditinos,PI,937.098,"POLYGON ((-42.50650 -5.30194, -42.50587 -5.302..."
2,2202737,Coivaras,PI,484.46,"POLYGON ((-42.15223 -5.06586, -42.15249 -5.066..."
3,2203255,Curralinhos,PI,345.811,"POLYGON ((-42.78020 -5.52192, -42.77937 -5.522..."
4,2203305,Demerval Lobão,PI,216.807,"POLYGON ((-42.60092 -5.26894, -42.60035 -5.285..."


In [387]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_BRA_Teresina["centroid_x"] = boundaries_BRA_Teresina.centroid.x
boundaries_BRA_Teresina["centroid_y"] = boundaries_BRA_Teresina.centroid.y
# define centroid for map view
map_center_x = boundaries_BRA_Teresina.centroid_x.mean()
map_center_y = boundaries_BRA_Teresina.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped[boundaries_mapped.city_name == city_name], 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m.add_gdf(boundaries_BRA_Teresina, 
          layer_name="Updated version", 
          fill_colors=['green'])
m


Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.





#### Integration

In [388]:
# Since Teresina is composed of 15 sub-boundaries, we need to unify the different polygons

# create a union field
boundaries_BRA_Teresina['city_name'] = city_name
# unify the polygons
boundaries_BRA_Teresina_union = boundaries_BRA_Teresina.dissolve(by='city_name').reset_index()

In [389]:
# map data with the new schema
boundaries_BRA_Teresina_mapped = (boundaries_BRA_Teresina_union
      .filter(['geometry'])
      .assign(country_iso3 = 'BRA',
              city_name_viz = 'Teresina',
              city_name = city_name,
              boundary_data_source = 'city_specific')
)

In [390]:
# append to the existing dataframe (DataFrame.append was removed in pandas 2.0)
import pandas as pd
boundaries_mapped = pd.concat([boundaries_mapped, boundaries_BRA_Teresina_mapped], ignore_index=True)

### Belem

#### Comparison

In [391]:
#-----------------------------------------------------
# Read data
#-----------------------------------------------------

# specify city name
city_name = 'BRA-Belem'
# define path
data_path ='https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/'+city_name+'.geojson'
# get updated boundaries
boundaries_BRA_Belem = gpd.read_file(data_path)
boundaries_BRA_Belem.head()

Unnamed: 0,CD_MUN,NM_MUN,SIGLA_UF,AREA_KM2,geometry
0,1500800,Ananindeua,PA,190.581,"POLYGON ((-48.33466 -1.23983, -48.33379 -1.243..."
1,1501402,Belém,PA,1059.466,"POLYGON ((-48.35304 -1.22103, -48.35754 -1.224..."
2,1501501,Benevides,PA,187.826,"POLYGON ((-48.33104 -1.25695, -48.33038 -1.257..."
3,1502400,Castanhal,PA,1029.3,"POLYGON ((-47.92076 -1.04243, -47.91624 -1.043..."
4,1504422,Marituba,PA,103.214,"POLYGON ((-48.32070 -1.32335, -48.31987 -1.324..."


In [392]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_BRA_Belem["centroid_x"] = boundaries_BRA_Belem.centroid.x
boundaries_BRA_Belem["centroid_y"] = boundaries_BRA_Belem.centroid.y
# define centroid for map view
map_center_x = boundaries_BRA_Belem.centroid_x.mean()
map_center_y = boundaries_BRA_Belem.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped[boundaries_mapped.city_name == city_name], 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m.add_gdf(boundaries_BRA_Belem, 
          layer_name="Updated version", 
          fill_colors=['green'])
m


Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.





#### Integration

In [393]:
# Since Belem is composed of 7 sub-boundaries, we need to unify the different polygons

# create a union field
boundaries_BRA_Belem['city_name'] = city_name
# unify the polygons
boundaries_BRA_Belem_union = boundaries_BRA_Belem.dissolve(by='city_name').reset_index()

In [394]:
# map data with the new schema
boundaries_BRA_Belem_mapped = (boundaries_BRA_Belem_union
      .filter(['geometry'])
      .assign(country_iso3 = 'BRA',
              city_name_viz = 'Belem',
              city_name = city_name,
              boundary_data_source = 'city_specific')
)

In [395]:
# append to the existing dataframe (DataFrame.append was removed in pandas 2.0)
import pandas as pd
boundaries_mapped = pd.concat([boundaries_mapped, boundaries_BRA_Belem_mapped], ignore_index=True)

### Florianopolois

#### Comparison

In [396]:
#-----------------------------------------------------
# Read data
#-----------------------------------------------------

# specify city name
city_name = 'BRA-Florianopolois'
# define path
data_path ='https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/'+city_name+'.geojson'
# get updated boundaries
boundaries_BRA_Florianopolois = gpd.read_file(data_path)
boundaries_BRA_Florianopolois.head()

Unnamed: 0,CD_MUN,NM_MUN,SIGLA_UF,AREA_KM2,geometry
0,4200606,Águas Mornas,SC,326.66,"MULTIPOLYGON (((-48.84483 -27.62630, -48.84495..."
1,4201208,Antônio Carlos,SC,234.422,"MULTIPOLYGON (((-48.83235 -27.39824, -48.83209..."
2,4202305,Biguaçu,SC,365.755,"MULTIPOLYGON (((-48.65210 -27.32174, -48.65189..."
3,4205407,Florianópolis,SC,674.844,"MULTIPOLYGON (((-48.44448 -27.85370, -48.44458..."
4,4206009,Governador Celso Ramos,SC,127.556,"MULTIPOLYGON (((-48.52561 -27.32985, -48.52559..."


In [397]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_BRA_Florianopolois["centroid_x"] = boundaries_BRA_Florianopolois.centroid.x
boundaries_BRA_Florianopolois["centroid_y"] = boundaries_BRA_Florianopolois.centroid.y
# define centroid for map view
map_center_x = boundaries_BRA_Florianopolois.centroid_x.mean()
map_center_y = boundaries_BRA_Florianopolois.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped[boundaries_mapped.city_name == city_name], 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m.add_gdf(boundaries_BRA_Florianopolois, 
          layer_name="Updated version", 
          fill_colors=['green'])
m


#### Integration

In [398]:
# Since Florianopolis is composed of 9 sub-boundaries, we need to unify the different polygons

# create a union field
boundaries_BRA_Florianopolois['city_name'] = city_name
# unify the polygons
boundaries_BRA_Florianopolois_union = boundaries_BRA_Florianopolois.dissolve(by='city_name').reset_index()

In [399]:
# map data to the new schema
boundaries_BRA_Florianopolois_mapped = (boundaries_BRA_Florianopolois_union
      .filter(['geometry'])
      .assign(country_iso3 = 'BRA',
              city_name_viz = 'Florianopolois',
              city_name = city_name,
              boundary_data_source = 'city_specific')
)

In [400]:
# append to the existing dataframe
# (DataFrame.append was removed in pandas 2.0; pd.concat is the replacement)
import pandas as pd
boundaries_mapped = pd.concat([boundaries_mapped, boundaries_BRA_Florianopolois_mapped], ignore_index=True)

## Costa Rica

### San Jose

#### Comparison

In [401]:

#-----------------------------------------------------
# Read data
#-----------------------------------------------------

# specify city name
city_name = 'CRI-San_Jose'
# define path
data_path ='https://storage.googleapis.com/urbanshift/administrative_boundaries/raw/source_specific_updates/'+city_name+'.geojson'
# get updated boundaries
boundaries_CRI_San_Jose = gpd.read_file(data_path)
# reproject to WGS84 (EPSG:4326) so the layer aligns with the other boundaries
boundaries_CRI_San_Jose = boundaries_CRI_San_Jose.to_crs("EPSG:4326")

In [402]:
#-----------------------------------------------------
# Plot comparison map
#-----------------------------------------------------

# add centroid coordinates to center map view
boundaries_CRI_San_Jose["centroid_x"] = boundaries_CRI_San_Jose.centroid.x
boundaries_CRI_San_Jose["centroid_y"] = boundaries_CRI_San_Jose.centroid.y
# define centroid for map view
map_center_x = boundaries_CRI_San_Jose.centroid_x.mean()
map_center_y = boundaries_CRI_San_Jose.centroid_y.mean()

# define map parameter
center = [map_center_y,map_center_x]
zoom = 6

# plot
m = leafmap.Map(center=center, zoom=zoom)
m.add_gdf(boundaries_mapped[boundaries_mapped.city_name == city_name], 
          layer_name="Cities360 version", 
          fill_colors=['red'])
m.add_gdf(boundaries_CRI_San_Jose, 
          layer_name="Updated version", 
          fill_colors=['green'])
m


Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.
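
The warning above is raised because `.centroid` is computed on latitude/longitude coordinates. When the centroid is only needed to centre the map view, one possible workaround is to take the midpoint of the layer's bounding box, which involves no centroid calculation at all. The sketch below assumes bounds in the `(minx, miny, maxx, maxy)` order returned by `GeoDataFrame.total_bounds`. The helper name `map_center_from_bounds` and the sample numbers are hypothetical and not part of the original notebook.

```python
def map_center_from_bounds(bounds):
    """Return [lat, lon] at the middle of a (minx, miny, maxx, maxy) bounding box."""
    minx, miny, maxx, maxy = bounds
    center_lon = (minx + maxx) / 2
    center_lat = (miny + maxy) / 2
    # the map center is passed as [latitude, longitude], as in the cells above
    return [center_lat, center_lon]


# illustrative bounds only, not taken from the data
example_bounds = (-84.5, 9.5, -83.5, 10.5)
center = map_center_from_bounds(example_bounds)
print(center)
```

In the notebook, this could be called as `map_center_from_bounds(boundaries_CRI_San_Jose.total_bounds)` in place of the two `centroid_x`/`centroid_y` columns.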







#### Integration

In [403]:
# map data to the new schema
boundaries_CRI_San_Jose_mapped = (boundaries_CRI_San_Jose
      .filter(['geometry'])
      .assign(country_iso3 = 'CRI',
              city_name_viz = 'San Jose',
              city_name = city_name,
              boundary_data_source = 'city_specific')
)

In [404]:
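# append to the existing dataframe
# (DataFrame.append was removed in pandas 2.0; pd.concat is the replacement)
import pandas as pd
boundaries_mapped = pd.concat([boundaries_mapped, boundaries_CRI_San_Jose_mapped], ignore_index=True)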
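
The mapping cells repeat the same four attributes for every city. Assuming the `city_name` identifiers keep following the `ISO3-City_Name` pattern seen in this notebook (for example `CRI-San_Jose`), the country code and display name could be derived rather than typed by hand. `schema_attributes` is a hypothetical helper, not part of the original notebook.

```python
def schema_attributes(city_name, boundary_data_source="city_specific"):
    """Derive the integrated-schema attributes from an 'ISO3-City_Name' identifier."""
    # split only on the first hyphen so hyphenated city names stay intact
    country_iso3, _, city_part = city_name.partition("-")
    return {
        "country_iso3": country_iso3,
        "city_name_viz": city_part.replace("_", " "),
        "city_name": city_name,
        "boundary_data_source": boundary_data_source,
    }


print(schema_attributes("CRI-San_Jose"))
```

The returned dictionary could then be unpacked into the existing call, as in `.assign(**schema_attributes(city_name))`.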

In [408]:
display(boundaries_mapped)

|  | country_iso3 | city_name_viz | city_name | geometry | boundary_data_source | boundary_use |
|---- | ---- | ---- | ---- | ---- | ---- | ---- |
| 0 | ARG | Mendoza | ARG-Mendoza | POLYGON ((-70.09376 -33.05128, -70.09369 -33.0... | cities360 | True |
| 1 | ARG | Mar del Plata | ARG-Mar_del_Plata | GEOMETRYCOLLECTION (LINESTRING (-57.52693 -37.... | cities360 | True |
| 2 | ARG | Ushuaia | ARG-Ushuaia | MULTIPOLYGON (((-64.35062 -54.84401, -64.35014... | cities360 | True |
| 3 | ARG | Salta | ARG-Salta | POLYGON ((-65.53171 -25.02690, -65.53166 -25.0... | cities360 | True |
| 4 | ARG | Buenos Aires | ARG-Buenos_Aires | MULTIPOLYGON (((-58.36618 -34.59744, -58.36609... | cities360 | True |
| 5 | BRA | Teresina | BRA-Teresina | POLYGON ((-42.59900 -5.35000, -42.60100 -5.251... | cities360 | True |
| 6 | BRA | Florianopolois | BRA-Florianopolois | MULTIPOLYGON (((-48.58167 -27.76205, -48.57442... | cities360 | True |
| 7 | BRA | Belem | BRA-Belem | MULTIPOLYGON (((-48.54139 -1.35451, -48.53229 ... | cities360 | True |
| 8 | CRI | San Jose | CRI-San_Jose | POLYGON ((-83.76411 9.60486, -83.76250 9.60384... | cities360 | True |
| 9 | RWA | Kigali | RWA-Kigali | POLYGON ((29.97953 -1.88664, 29.98450 -1.89487... | cities360 | True |


# Build final geojson file

## Add unique identifier

In [415]:
boundaries_mapped['boundary_id'] = [uuid.uuid4().hex for _ in range(len(boundaries_mapped.index))]
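
`uuid.uuid4()` produces random identifiers, so every rerun of the cell above assigns new `boundary_id` values. If identifiers that stay stable across runs were preferred, one option (an assumption, not what the notebook does) is a name-based `uuid.uuid5` derived from the city name and data source. The namespace string and the helper `stable_boundary_id` below are hypothetical.

```python
import uuid

# hypothetical namespace; any fixed UUID works as long as it never changes
BOUNDARY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "urbanshift/administrative_boundaries")


def stable_boundary_id(city_name, boundary_data_source):
    """Return a 32-character hex id that is identical on every run for the same inputs."""
    key = f"{city_name}|{boundary_data_source}"
    return uuid.uuid5(BOUNDARY_NAMESPACE, key).hex


id_a = stable_boundary_id("BRA-Belem", "cities360")
id_b = stable_boundary_id("BRA-Belem", "city_specific")
# the two versions of the same city receive different ids
print(id_a != id_b)
```

The trade-off is that name-based ids only stay unique while each `(city_name, boundary_data_source)` pair appears once in the table.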

## Update boundary use status

In [417]:
def update_boundaries(administrative_boundaries_gdf,
                      cities_list_to_update):
  # initialise the column so that the cities360 data source is used by default
  # (the flag is stored as the strings 'TRUE'/'FALSE' and the input is modified in place)
  administrative_boundaries_gdf['boundary_use'] = 'TRUE'
  # set boundary_use to 'FALSE' for the cities360 boundaries of the cities in the update list,
  # so that only their city-specific boundaries remain flagged 'TRUE'
  to_update = administrative_boundaries_gdf['city_name'].isin(cities_list_to_update)
  from_cities360 = administrative_boundaries_gdf['boundary_data_source'] == 'cities360'
  administrative_boundaries_gdf.loc[to_update & from_cities360, 'boundary_use'] = 'FALSE'

  return administrative_boundaries_gdf

In [418]:
boundaries_mapped_updates = update_boundaries(administrative_boundaries_gdf = boundaries_mapped,
                                      cities_list_to_update = ['IDN-Jakarta',
                                                               'IDN-Semarang',
                                                               'CHN-Ningbo',
                                                               'BRA-Teresina',
                                                               'BRA-Belem', 
                                                               'CRI-San_Jose',
                                                               'BRA-Florianopolois'
                                                               ])

In [419]:
# verify that the number of boundaries in use matches the 23 UrbanShift cities
len(boundaries_mapped_updates[boundaries_mapped_updates.boundary_use == 'TRUE'])

23
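
The count above confirms the total, but not that each city has exactly one active boundary. A minimal sketch of such a check on plain records is shown below. The sample rows and the helper `cities_without_single_active_boundary` are illustrative only, and the `'TRUE'`/`'FALSE'` strings follow the convention used in `update_boundaries`.

```python
from collections import Counter


def cities_without_single_active_boundary(records):
    """Return {city_name: active_count} for cities whose active count is not exactly 1."""
    active = Counter()
    for row in records:
        # adding 0 still registers the city, so cities with no active row are reported
        active[row["city_name"]] += 1 if row["boundary_use"] == "TRUE" else 0
    return {city: n for city, n in active.items() if n != 1}


# illustrative rows, not the real data
sample = [
    {"city_name": "BRA-Belem", "boundary_data_source": "cities360", "boundary_use": "FALSE"},
    {"city_name": "BRA-Belem", "boundary_data_source": "city_specific", "boundary_use": "TRUE"},
    {"city_name": "RWA-Kigali", "boundary_data_source": "cities360", "boundary_use": "TRUE"},
]
print(cities_without_single_active_boundary(sample))
```

With the notebook's data, the records could be obtained with something like `boundaries_mapped_updates.drop(columns='geometry').to_dict('records')`; an empty result would mean every city has exactly one boundary in use.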

## Display final data

In [420]:
# show the final GeoDataFrame
display(boundaries_mapped_updates)

|  | country_iso3 | city_name_viz | city_name | geometry | boundary_data_source | boundary_use | boundary_id |
|---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| 0 | ARG | Mendoza | ARG-Mendoza | POLYGON ((-70.09376 -33.05128, -70.09369 -33.0... | cities360 | True | 8072476448b14a78a4e99a9668bc9204 |
| 1 | ARG | Mar del Plata | ARG-Mar_del_Plata | GEOMETRYCOLLECTION (LINESTRING (-57.52693 -37.... | cities360 | True | fb8fc9fc86bc48d5b988eeaaf08cf92a |
| 2 | ARG | Ushuaia | ARG-Ushuaia | MULTIPOLYGON (((-64.35062 -54.84401, -64.35014... | cities360 | True | bac857a3501243898f90328af77812f7 |
| 3 | ARG | Salta | ARG-Salta | POLYGON ((-65.53171 -25.02690, -65.53166 -25.0... | cities360 | True | 65cebcd50836402db7e91042c4a67f6c |
| 4 | ARG | Buenos Aires | ARG-Buenos_Aires | MULTIPOLYGON (((-58.36618 -34.59744, -58.36609... | cities360 | True | e769cb65c48841b4b1a616d9d011f904 |
| 5 | BRA | Teresina | BRA-Teresina | POLYGON ((-42.59900 -5.35000, -42.60100 -5.251... | cities360 | False | 4f574ea91be54950af0082d237007ab4 |
| 6 | BRA | Florianopolois | BRA-Florianopolois | MULTIPOLYGON (((-48.58167 -27.76205, -48.57442... | cities360 | False | 9b4a5f503f1b4da394f1f2895434a4dc |
| 7 | BRA | Belem | BRA-Belem | MULTIPOLYGON (((-48.54139 -1.35451, -48.53229 ... | cities360 | False | a06729cc563d4c2881ea5edd6063fc8d |
| 8 | CRI | San Jose | CRI-San_Jose | POLYGON ((-83.76411 9.60486, -83.76250 9.60384... | cities360 | False | 6425c323735a4a24af845e432a6e0182 |
| 9 | RWA | Kigali | RWA-Kigali | POLYGON ((29.97953 -1.88664, 29.98450 -1.89487... | cities360 | True | bdc6b3c0ce5c47db8afacf1b83df7e40 |


In [421]:
# plot map

m = leafmap.Map(zoom=1)
m.add_gdf(boundaries_mapped_updates[boundaries_mapped_updates.boundary_use == 'TRUE'], 
          layer_name="Updated Boundaries", 
          fill_colors=['green'])
m


## Store final data

In [422]:
# convert to a GeoJSON string
boundaries_mapped_updates_geojson = boundaries_mapped_updates.to_json()
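
`GeoDataFrame.to_json()` returns a GeoJSON `FeatureCollection` as a string. Before uploading, it can be worth parsing that string back and checking its shape. The sketch below uses only the standard library; `summarize_geojson` and the tiny hand-written sample collection are hypothetical and stand in for the real output.

```python
import json


def summarize_geojson(geojson_text):
    """Parse a GeoJSON string and return (feature_count, sorted property names)."""
    collection = json.loads(geojson_text)
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    features = collection.get("features", [])
    property_names = set()
    for feature in features:
        # "properties" may be null in valid GeoJSON, hence the fallback to {}
        property_names.update((feature.get("properties") or {}).keys())
    return len(features), sorted(property_names)


# hand-written sample standing in for boundaries_mapped_updates_geojson
sample_text = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"city_name": "RWA-Kigali", "boundary_use": "TRUE"},
        "geometry": {"type": "Point", "coordinates": [30.06, -1.95]},
    }],
})
print(summarize_geojson(sample_text))
```

Applied to `boundaries_mapped_updates_geojson`, the feature count should match the number of rows in `boundaries_mapped_updates`.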

In [423]:
# instantiate a Google Cloud Storage client and specify the required bucket and file
storage_client = storage.Client("wri-gee")
bucket = storage_client.get_bucket('urbanshift')
# define path for writing data
blobName = 'administrative_boundaries/integrated/admin_boundaries_integrated.geojson'

In [424]:
# Create a new blob and upload the file's content.
blob = bucket.blob(blobName)
blob.upload_from_string(boundaries_mapped_updates_geojson)