<a href="https://colab.research.google.com/github/CarlosMendez1997Col/GeoDatabases-And-Cloud-Computing-For-Water-Resources-Management/blob/main/1-Creation%20Geodatabase/Download_and_Geoprocessing_Databases_in_Google_Colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Download and Geoprocessing Databases in Google Colab

---

> Water Resources Management using PostgreSQL and PgAdmin4

> Area of Interest (South America)

> Developed by MSc Carlos Mendez

Main tables and datasets used:

1. South America Countries and Boundary [Url Data](https://international.ipums.org/international/gis.shtml)

2. First Level Administrative Units (FLAU) [Url Data](https://www.geoboundaries.org/globalDownloads.html)

3. Second Level Administrative Units (SLAU) [Url Data](https://www.geoboundaries.org/globalDownloads.html)

4. HydroSHEDS [Url Data](https://www.hydrosheds.org/products/hydrosheds)

5. HydroBASINS (Level 1 to 12) [Url Data](https://www.hydrosheds.org/products/hydrobasins)

6. HydroRIVERS [Url Data](https://www.hydrosheds.org/products/hydrorivers)

7. HydroLAKES [Url Data](https://www.hydrosheds.org/products/hydrolakes)

8. Global Lakes and Wetlands Database (GLWD) [Url Data](https://www.hydrosheds.org/products/glwd)

9. HydroWASTE [Url Data](https://www.hydrosheds.org/products/hydrowaste)

10. Global River Classification (GloRiC) [Url Data](https://www.hydrosheds.org/products/gloric)

11. LakeTEMP [Url Data](https://www.hydrosheds.org/products/laketemp)

12. Global Power Plant Database (GPPD) [Url Data](https://datasets.wri.org/datasets/global-power-plant-database)

## Install and import ArcGIS API for Python

In [2]:
# To install a missing library, uncomment the matching line below and run the cell
#!pip install arcgis
#!pip install geopandas
#!pip install rasterio
#!pip install shapely

## Import libraries and packages

In [2]:
import numpy as np
import pandas as pd
import geopandas as gpd
import rasterio
import xarray as xr
import matplotlib.pyplot as plt
import math
import zipfile
import os
import time
from datetime import datetime as dt
from osgeo import gdal, ogr, osr
from shapely.geometry import box

In [3]:
import arcgis
from arcgis.features import FeatureLayer, FeatureLayerCollection
from arcgis.geometry import SpatialReference

from google.colab import output
output.enable_custom_widget_manager()

# connect to GIS
from arcgis.gis import GIS

## Connect and Log In to an ArcGIS Account

In [6]:
# Prompt user to provide username and password
import getpass
username = input('Enter username: ')
password = getpass.getpass("Enter your password: ")
gis = GIS("https://udistritalfjc.maps.arcgis.com/home", username, password)

Enter username: Camendezv_UDFJC
Enter your password: ··········


## Import and extract Databases in your local computer



### Connect to Google Drive

In [4]:
import os
os.makedirs('/content', exist_ok=True) # Create the parent directory if it doesn't exist

from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [5]:
# Set Directory or WorkSpace
%cd /content/drive/MyDrive/Geodatabase

/content/drive/MyDrive/Geodatabase


### 1. South America Countries and Boundary (SACB)

In [9]:
!wget https://international.ipums.org/international/resources/gis/IPUMSI_world_release2024.zip

--2025-09-11 00:26:33--  https://international.ipums.org/international/resources/gis/IPUMSI_world_release2024.zip
Resolving international.ipums.org (international.ipums.org)... 128.101.163.176
Connecting to international.ipums.org (international.ipums.org)|128.101.163.176|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 61705330 (59M) [application/zip]
Saving to: ‘IPUMSI_world_release2024.zip’


2025-09-11 00:26:36 (21.6 MB/s) - ‘IPUMSI_world_release2024.zip’ saved [61705330/61705330]



In [10]:
!unzip "/content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.zip" -d "/content/drive/MyDrive/Geodatabase"

Archive:  /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.zip
 extracting: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.CPG  
  inflating: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.dbf  
  inflating: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.prj  
  inflating: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.sbn  
  inflating: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.sbx  
  inflating: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.shp  
  inflating: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.shp.xml  
  inflating: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.shx  
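
The notebook already imports `zipfile`, so the `!unzip` step can also be done in Python, which keeps the workflow usable outside Colab. The following is a minimal sketch; the helper name `extract_archive` is hypothetical and not part of the original notebook.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

In this notebook it could be called as `extract_archive('/content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.zip', '/content/drive/MyDrive/Geodatabase')`; the returned list plays the role of the `unzip` listing above.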


In [11]:
SACB = gpd.read_file('IPUMSI_world_release2024.shp')
SACB.head()

Unnamed: 0,OBJECTID,CNTRY_NAME,CNTRY_CODE,BPL_CODE,geometry
0,1,Algeria,12,13010.0,"MULTIPOLYGON (((-2.05592 35.0737, -2.05675 35...."
1,2,Angola,24,12010.0,"MULTIPOLYGON (((12.7976 -4.41685, 12.79875 -4...."
2,3,In dispute South Sudan/Sudan,9999,99999.0,"POLYGON ((28.08408 9.34722, 28.03889 9.34722, ..."
3,4,Benin,204,15010.0,"MULTIPOLYGON (((1.93753 6.30122, 1.93422 6.299..."
4,5,Botswana,72,14010.0,"POLYGON ((25.16312 -17.77816, 25.16383 -17.778..."


In [12]:
SACB.drop(['OBJECTID','CNTRY_CODE','BPL_CODE'], axis=1, inplace=True)
SACB.rename(columns={'CNTRY_NAME': 'Country'}, inplace=True)
SACB.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 285 entries, 0 to 284
Data columns (total 2 columns):
 #   Column    Non-Null Count  Dtype   
---  ------    --------------  -----   
 0   Country   285 non-null    object  
 1   geometry  285 non-null    geometry
dtypes: geometry(1), object(1)
memory usage: 4.6+ KB


In [13]:
SA_countries =  ['Argentina', 'Bolivia', 'Brazil', 'Chile', 'Colombia', 'Ecuador', 'Guyana', 'French Guiana', 'Paraguay', 'Peru', 'Suriname', 'Uruguay', 'Venezuela']
SACB_SA = SACB[SACB['Country'].isin(SA_countries)]
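
`isin` silently skips any name in `SA_countries` that is spelled differently in the dataset, so it can help to list the requested names that found no match. The following is a minimal sketch; `unmatched_names` is a hypothetical helper, and the sample values are illustrative, not taken from the IPUMS attribute table.

```python
def unmatched_names(wanted, available):
    """Return the names in `wanted` that do not appear in `available`, in order."""
    available_set = set(available)
    return [name for name in wanted if name not in available_set]


# Illustrative values only (not the real IPUMS attribute table)
sample_available = ['Argentina', 'Brazil', 'Chile']
sample_wanted = ['Argentina', 'Brazil', 'Uruguay']
print(unmatched_names(sample_wanted, sample_available))  # prints ['Uruguay']
```

In the notebook this would be called as `unmatched_names(SA_countries, SACB['Country'])`; an empty list means every requested name was found.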

In [14]:
# If you want to verify the selected SA countries (the list above has 13 entries)
#SACB_SA.head(14)

In [15]:
# If you want to display and visualize the selected SA countries
#SACB_SA.plot(column='Country', figsize=(16,8))

In [16]:
# Export data to Google Drive (.shp)
output_path_SACB = '/content/drive/MyDrive/Geodatabase/SA_Countries.shp'
SACB_SA.to_file(output_path_SACB)

In [17]:
# Delete the original files (.zip and extracted shapefile parts) to free up Drive storage
!rm '/content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.zip'

shapefile_prefix = '/content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024'

# Extensions of the extracted shapefile parts
extensions = ['.CPG', '.dbf', '.prj', '.sbn', '.sbx', '.shp', '.shp.xml', '.shx']

for ext in extensions:
    file_path = shapefile_prefix + ext
    if os.path.exists(file_path):
        os.remove(file_path)
        print(f"Deleted: {file_path}")
    else:
        print(f"File not found: {file_path}")

Deleted: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.CPG
Deleted: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.dbf
Deleted: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.prj
Deleted: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.sbn
Deleted: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.sbx
Deleted: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.shp
Deleted: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.shp.xml
Deleted: /content/drive/MyDrive/Geodatabase/IPUMSI_world_release2024.shx
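
The same cleanup loop is needed for every dataset in this notebook, so it can be wrapped in one function. The following is a minimal sketch; `remove_shapefile_parts` is a hypothetical helper name, and it returns the paths instead of printing them so the caller decides what to report.

```python
import os


def remove_shapefile_parts(prefix, extensions):
    """Delete prefix + ext for each extension; return (deleted, missing) path lists."""
    deleted, missing = [], []
    for ext in extensions:
        file_path = prefix + ext
        if os.path.exists(file_path):
            os.remove(file_path)
            deleted.append(file_path)
        else:
            missing.append(file_path)
    return deleted, missing
```

For the files above, the call would be `remove_shapefile_parts(shapefile_prefix, extensions)`.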


### 2. First Level Administrative Units (FLAU)

In [18]:
!wget https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM1.zip

--2025-09-11 00:28:07--  https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM1.zip
Resolving github.com (github.com)... 140.82.116.4
Connecting to github.com (github.com)|140.82.116.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://media.githubusercontent.com/media/wmgeolab/geoBoundaries/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM1.zip [following]
--2025-09-11 00:28:07--  https://media.githubusercontent.com/media/wmgeolab/geoBoundaries/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM1.zip
Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 103470246 (99M) [application/zip]
Saving to: ‘geoBoundariesCGAZ_ADM1.zip’


2025-09-11 00:28:10 (39.9 MB/s) - ‘geoBoundariesCGAZ_ADM1.zip’ saved [1

In [19]:
!unzip "/content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.zip" -d "/content/drive/MyDrive/Geodatabase"

Archive:  /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.zip
  inflating: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.shp  
  inflating: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.shx  
  inflating: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.dbf  
  inflating: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.prj  


In [20]:
FLAU = gpd.read_file('geoBoundariesCGAZ_ADM1.shp')
FLAU.head()

Unnamed: 0,shapeName,shapeID,shapeGroup,shapeType,geometry
0,Kandahar,12653393B40111500734429,AFG,ADM1,"POLYGON ((65.24153 32.2863, 65.72553 32.48037,..."
1,Zabul,12653393B56617740339660,AFG,ADM1,"POLYGON ((67.60666 31.44378, 67.60882 31.44909..."
2,Uruzgan,12653393B46006342616872,AFG,ADM1,"POLYGON ((66.27519 32.4255, 65.72553 32.48037,..."
3,Daykundi,12653393B78791504725813,AFG,ADM1,"POLYGON ((66.76157 33.25547, 66.38975 33.30701..."
4,Ghanzi,12653393B29313712249365,AFG,ADM1,"POLYGON ((68.06945 32.04564, 67.86439 32.14395..."


In [21]:
FLAU.drop(['shapeID','shapeGroup', 'shapeType'], axis=1, inplace=True)
FLAU.rename(columns={'shapeName': 'Department'}, inplace=True)
FLAU.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 3224 entries, 0 to 3223
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   Department  3224 non-null   object  
 1   geometry    3224 non-null   geometry
dtypes: geometry(1), object(1)
memory usage: 50.5+ KB


In [22]:
# Check CRSs
print(FLAU.crs)
print(SACB_SA.crs)

EPSG:4326
EPSG:4326


In [23]:
FLAU_Intersect = FLAU.overlay(SACB_SA, how='intersection')



In [24]:
# If you want to check the Departments or States of SA countries
#FLAU_Intersect.head(30)

In [25]:
# If you want to display and visualize the data
#FLAU_Intersect.plot(column='Department', figsize=(16,8))

In [26]:
# Export data to Google Drive (.shp)
output_path_FLAU = '/content/drive/MyDrive/Geodatabase/SA_FLAU.shp'
FLAU_Intersect.to_file(output_path_FLAU)

In [27]:
# Delete the zip and the extracted shapefile parts to free up Drive storage

!rm '/content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.zip'

shapefile_prefix = '/content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1'

# Extensions of the extracted shapefile parts
extensions = ['.dbf', '.prj', '.shp', '.shx']

for ext in extensions:
    file_path = shapefile_prefix + ext
    if os.path.exists(file_path):
        os.remove(file_path)
        print(f"Deleted: {file_path}")
    else:
        print(f"File not found: {file_path}")

Deleted: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.dbf
Deleted: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.prj
Deleted: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.shp
Deleted: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM1.shx


### 3. Second Level Administrative Units (SLAU)

In [28]:
!wget https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM2.zip

--2025-09-11 00:30:47--  https://github.com/wmgeolab/geoBoundaries/raw/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM2.zip
Resolving github.com (github.com)... 140.82.116.4
Connecting to github.com (github.com)|140.82.116.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://media.githubusercontent.com/media/wmgeolab/geoBoundaries/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM2.zip [following]
--2025-09-11 00:30:47--  https://media.githubusercontent.com/media/wmgeolab/geoBoundaries/main/releaseData/CGAZ/geoBoundariesCGAZ_ADM2.zip
Resolving media.githubusercontent.com (media.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.111.133, ...
Connecting to media.githubusercontent.com (media.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 155911064 (149M) [application/zip]
Saving to: ‘geoBoundariesCGAZ_ADM2.zip’


2025-09-11 00:30:51 (46.3 MB/s) - ‘geoBoundariesCGAZ_ADM2.zip’ saved [

In [29]:
!unzip "/content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.zip" -d "/content/drive/MyDrive/Geodatabase"

Archive:  /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.zip
  inflating: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.shp  
  inflating: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.shx  
  inflating: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.dbf  
  inflating: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.prj  


In [30]:
SLAU = gpd.read_file('geoBoundariesCGAZ_ADM2.shp')
SLAU.head()

Unnamed: 0,shapeName,shapeID,shapeGroup,shapeType,geometry
0,Deh Bala,17698898B67359070524975,AFG,ADM2,"POLYGON ((70.51142 33.9492, 70.49439 33.94035,..."
1,Gulran,17698898B98443198567384,AFG,ADM2,"POLYGON ((62.0076 35.44597, 62.00902 35.39081,..."
2,Koshk,17698898B82675281335003,AFG,ADM2,"POLYGON ((61.99554 34.74465, 62.00493 34.75342..."
3,Chaparhar,17698898B74585757664988,AFG,ADM2,"POLYGON ((70.41933 34.23071, 70.41025 34.21431..."
4,Koshki Kohna,17698898B84066352785355,AFG,ADM2,"POLYGON ((62.41693 34.67412, 62.42883 34.73035..."


In [31]:
SLAU.drop(['shapeID','shapeGroup', 'shapeType'], axis=1, inplace=True)
SLAU.rename(columns={'shapeName': 'Municipality'}, inplace=True)
SLAU.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 49349 entries, 0 to 49348
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype   
---  ------        --------------  -----   
 0   Municipality  49312 non-null  object  
 1   geometry      49349 non-null  geometry
dtypes: geometry(1), object(1)
memory usage: 771.2+ KB


In [32]:
SLAU_Intersect = SLAU.overlay(SACB_SA, how='intersection')



In [33]:
# If you want to check the Municipalities of SA countries
#SLAU_Intersect.head(30)

In [34]:
# If you want to display and visualize the data
#SLAU_Intersect.plot(column='Municipality', figsize=(16,8))

In [35]:
# Export data to Google Drive (.shp)
output_path_SLAU = '/content/drive/MyDrive/Geodatabase/SA_SLAU.shp'
SLAU_Intersect.to_file(output_path_SLAU)
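
The ESRI Shapefile format limits attribute field names to 10 characters, so a column such as `Municipality` (12 characters) is shortened on export. The following is a minimal sketch for spotting affected columns before writing; `long_field_names` is a hypothetical helper, not part of GeoPandas, and it assumes a plain 10-character cut (the writing library may adjust names further to keep them unique).

```python
SHAPEFILE_FIELD_LIMIT = 10  # maximum field-name length in a shapefile attribute table


def long_field_names(columns, limit=SHAPEFILE_FIELD_LIMIT, geometry_column='geometry'):
    """Map each over-long column name to its first `limit` characters."""
    return {
        name: name[:limit]
        for name in columns
        if name != geometry_column and len(name) > limit
    }


print(long_field_names(['Municipality', 'Country', 'geometry']))
# prints {'Municipality': 'Municipali'}
```

Renaming the column to a shorter name with `rename(columns=...)` before calling `to_file` avoids the truncation; writing to a format without this limit, such as GeoPackage, is another option.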



In [36]:
# Delete the zip and the extracted shapefile parts to free up Drive storage

!rm '/content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.zip'

shapefile_prefix = '/content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2'

# Extensions of the extracted shapefile parts
extensions = ['.dbf', '.prj', '.shp', '.shx']

for ext in extensions:
    file_path = shapefile_prefix + ext
    if os.path.exists(file_path):
        os.remove(file_path)
        print(f"Deleted: {file_path}")
    else:
        print(f"File not found: {file_path}")

Deleted: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.dbf
Deleted: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.prj
Deleted: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.shp
Deleted: /content/drive/MyDrive/Geodatabase/geoBoundariesCGAZ_ADM2.shx


### 4. HydroSHEDS

Converting these rasters to vector features (points, polylines, and polygons) is too demanding in processing capacity and data volume to run here, so that conversion is performed with the ArcGIS API for Python.


#### Void Filled DEM

In [37]:
!wget https://data.hydrosheds.org/file/hydrosheds-v1-dem/hyd_sa_dem_30s.zip

--2025-09-11 00:35:54--  https://data.hydrosheds.org/file/hydrosheds-v1-dem/hyd_sa_dem_30s.zip
Resolving data.hydrosheds.org (data.hydrosheds.org)... 172.67.158.28, 104.21.14.61, 2606:4700:3036::ac43:9e1c, ...
Connecting to data.hydrosheds.org (data.hydrosheds.org)|172.67.158.28|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 31834815 (30M) [application/zip]
Saving to: ‘hyd_sa_dem_30s.zip’


2025-09-11 00:35:55 (41.8 MB/s) - ‘hyd_sa_dem_30s.zip’ saved [31834815/31834815]



In [38]:
!unzip "/content/drive/MyDrive/Geodatabase/hyd_sa_dem_30s.zip" -d "/content/drive/MyDrive/Geodatabase"

Archive:  /content/drive/MyDrive/Geodatabase/hyd_sa_dem_30s.zip
 extracting: /content/drive/MyDrive/Geodatabase/hyd_sa_dem_30s.tif  
 extracting: /content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf  


In [41]:
!rm '/content/drive/MyDrive/Geodatabase/hyd_sa_dem_30s.zip'
!rm '/content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf'

path_hydroSHEDS_DEM = '/content/drive/MyDrive/Geodatabase/hyd_sa_dem_30s.tif'
with rasterio.open(path_hydroSHEDS_DEM) as src:
  hydroSHEDS_DEM_SA = src.read(1)

In [None]:
## If you want to plot or visualize data, uncomment the plt lines below.
# Uses `src` from the previous cell
prof_hydroSHEDS_DEM = src.profile
print("Void Filled DEM Profile:", prof_hydroSHEDS_DEM)
#plt.imshow(hydroSHEDS_DEM_SA, cmap='gray')
#plt.title('Void Filled DEM')
#plt.colorbar(label='Value')
#plt.show()

#### Flow Direction

In [42]:
!wget https://data.hydrosheds.org/file/hydrosheds-v1-dir/hyd_sa_dir_30s.zip

--2025-09-11 00:42:59--  https://data.hydrosheds.org/file/hydrosheds-v1-dir/hyd_sa_dir_30s.zip
Resolving data.hydrosheds.org (data.hydrosheds.org)... 104.21.14.61, 172.67.158.28, 2606:4700:3036::ac43:9e1c, ...
Connecting to data.hydrosheds.org (data.hydrosheds.org)|104.21.14.61|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9349529 (8.9M) [application/zip]
Saving to: ‘hyd_sa_dir_30s.zip’


2025-09-11 00:43:00 (30.7 MB/s) - ‘hyd_sa_dir_30s.zip’ saved [9349529/9349529]



In [43]:
!unzip "/content/drive/MyDrive/Geodatabase/hyd_sa_dir_30s.zip" -d "/content/drive/MyDrive/Geodatabase"

Archive:  /content/drive/MyDrive/Geodatabase/hyd_sa_dir_30s.zip
 extracting: /content/drive/MyDrive/Geodatabase/hyd_sa_dir_30s.tif  
 extracting: /content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf  


In [44]:
!rm '/content/drive/MyDrive/Geodatabase/hyd_sa_dir_30s.zip'
!rm '/content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf'

path_hydroSHEDS_Dir = '/content/drive/MyDrive/Geodatabase/hyd_sa_dir_30s.tif'
with rasterio.open(path_hydroSHEDS_Dir) as src:
  hydroSHEDS_Dir_SA = src.read(1)

In [51]:
## If you want to plot or visualize data, uncomment the plt lines below.
# Uses `src` from the previous cell
prof_hydroSHEDS_Dir = src.profile
print("Flow Direction Profile:", prof_hydroSHEDS_Dir)
#fig = plt.figure()
#fig.patch.set_facecolor('none')
#plt.imshow(hydroSHEDS_Dir_SA, cmap='Spectral')
#plt.title('Flow Direction Profile')
#plt.colorbar(label='Value')
#plt.show()

#### Flow Accumulation

In [52]:
!wget https://data.hydrosheds.org/file/hydrosheds-v1-acc/hyd_sa_acc_30s.zip

--2025-09-11 00:54:34--  https://data.hydrosheds.org/file/hydrosheds-v1-acc/hyd_sa_acc_30s.zip
Resolving data.hydrosheds.org (data.hydrosheds.org)... 104.21.14.61, 172.67.158.28, 2606:4700:3036::6815:e3d, ...
Connecting to data.hydrosheds.org (data.hydrosheds.org)|104.21.14.61|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 19947429 (19M) [application/zip]
Saving to: ‘hyd_sa_acc_30s.zip’


2025-09-11 00:54:35 (38.5 MB/s) - ‘hyd_sa_acc_30s.zip’ saved [19947429/19947429]



In [53]:
!unzip "/content/drive/MyDrive/Geodatabase/hyd_sa_acc_30s.zip" -d "/content/drive/MyDrive/Geodatabase"

Archive:  /content/drive/MyDrive/Geodatabase/hyd_sa_acc_30s.zip
 extracting: /content/drive/MyDrive/Geodatabase/hyd_sa_acc_30s.tif  
 extracting: /content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf  


In [54]:
!rm '/content/drive/MyDrive/Geodatabase/hyd_sa_acc_30s.zip'
!rm '/content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf'

path_hydroSHEDS_Flow = '/content/drive/MyDrive/Geodatabase/hyd_sa_acc_30s.tif'
with rasterio.open(path_hydroSHEDS_Flow) as src:
  hydroSHEDS_Flow_SA = src.read(1)

In [65]:
## If you want to plot or visualize data, uncomment the plt lines below.
# Uses `src` from the previous cell
prof_hydroSHEDS_Flow = src.profile
print("Flow Accumulation Profile:", prof_hydroSHEDS_Flow)
#fig = plt.figure()
#plt.imshow(hydroSHEDS_Flow_SA, cmap='Spectral')
#plt.title('Flow Accumulation Profile')
#plt.colorbar(label='Value')
#plt.show()

#### Flow Length Upstream

In [66]:
!wget https://data.hydrosheds.org/file/hydrosheds-v1-lup/hyd_sa_lup_15s.zip

--2025-09-11 02:13:21--  https://data.hydrosheds.org/file/hydrosheds-v1-lup/hyd_sa_lup_15s.zip
Resolving data.hydrosheds.org (data.hydrosheds.org)... 172.67.158.28, 104.21.14.61, 2606:4700:3036::6815:e3d, ...
Connecting to data.hydrosheds.org (data.hydrosheds.org)|172.67.158.28|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 100988898 (96M) [application/zip]
Saving to: ‘hyd_sa_lup_15s.zip’


2025-09-11 02:13:25 (28.5 MB/s) - ‘hyd_sa_lup_15s.zip’ saved [100988898/100988898]



In [67]:
!unzip "/content/drive/MyDrive/Geodatabase/hyd_sa_lup_15s.zip" -d "/content/drive/MyDrive/Geodatabase"

Archive:  /content/drive/MyDrive/Geodatabase/hyd_sa_lup_15s.zip
 extracting: /content/drive/MyDrive/Geodatabase/hyd_sa_lup_15s.tif  
 extracting: /content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf  


In [5]:
!rm '/content/drive/MyDrive/Geodatabase/hyd_sa_lup_15s.zip'
!rm '/content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf'

path_hydroSHEDS_Lup = '/content/drive/MyDrive/Geodatabase/hyd_sa_lup_15s.tif'
with rasterio.open(path_hydroSHEDS_Lup) as src:
  hydroSHEDS_Lup_SA = src.read(1)

rm: cannot remove '/content/drive/MyDrive/Geodatabase/hyd_sa_lup_15s.zip': No such file or directory
rm: cannot remove '/content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf': No such file or directory


In [1]:
## If you want to plot or visualize data, uncomment the plt lines below.
# Uses `src` from the previous cell
prof_hydroSHEDS_Lup = src.profile
print("Flow Length Upstream Profile:", prof_hydroSHEDS_Lup)
#fig = plt.figure()
#plt.imshow(hydroSHEDS_Lup_SA, cmap='viridis')
#plt.title('Flow Length Upstream Profile')
#plt.colorbar(label='Value')
#plt.show()

#### Flow Length Downstream

In [8]:
!wget https://data.hydrosheds.org/file/hydrosheds-v1-ldn/hyd_sa_ldn_15s.zip

--2025-09-11 02:24:07--  https://data.hydrosheds.org/file/hydrosheds-v1-ldn/hyd_sa_ldn_15s.zip
Resolving data.hydrosheds.org (data.hydrosheds.org)... 104.21.14.61, 172.67.158.28, 2606:4700:3036::6815:e3d, ...
Connecting to data.hydrosheds.org (data.hydrosheds.org)|104.21.14.61|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 241146941 (230M) [application/zip]
Saving to: ‘hyd_sa_ldn_15s.zip’


2025-09-11 02:24:13 (37.4 MB/s) - ‘hyd_sa_ldn_15s.zip’ saved [241146941/241146941]



In [9]:
!unzip "/content/drive/MyDrive/Geodatabase/hyd_sa_ldn_15s.zip" -d "/content/drive/MyDrive/Geodatabase"

Archive:  /content/drive/MyDrive/Geodatabase/hyd_sa_ldn_15s.zip
 extracting: /content/drive/MyDrive/Geodatabase/hyd_sa_ldn_15s.tif  
 extracting: /content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf  


In [6]:
!rm '/content/drive/MyDrive/Geodatabase/hyd_sa_ldn_15s.zip'
!rm '/content/drive/MyDrive/Geodatabase/HydroSHEDS_TechDoc_v1_4.pdf'

path_hydroSHEDS_Ldn = '/content/drive/MyDrive/Geodatabase/hyd_sa_ldn_15s.tif'
with rasterio.open(path_hydroSHEDS_Ldn) as src:
  hydroSHEDS_Ldn_SA = src.read(1)

In [8]:
## If you want to plot or visualize data, uncomment the plt lines below.
# Uses `src` from the previous cell
prof_hydroSHEDS_Ldn = src.profile
print("Flow Length Downstream Profile:", prof_hydroSHEDS_Ldn)
#fig = plt.figure()
#plt.imshow(hydroSHEDS_Ldn_SA, cmap='magma')
#plt.title('Flow Length Downstream Profile')
#plt.colorbar(label='Value')
#plt.show()

Flow Length Downstream Profile: {'driver': 'GTiff', 'dtype': 'uint32', 'nodata': 4294967295.0, 'width': 14640, 'height': 17040, 'count': 1, 'crs': CRS.from_wkt('GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'), 'transform': Affine(0.00416666666666667, 0.0, -93.0,
       0.0, -0.00416666666666667, 15.000000000000057), 'tiled': []}
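
The five HydroSHEDS downloads in this section share one URL pattern, so the URLs and GeoTIFF names can be generated from a small table instead of being typed per product. The following is a minimal sketch; `hydrosheds_names` and `HYDROSHEDS_PRODUCTS` are hypothetical names, and the pattern is inferred only from the five URLs used here, so other regions or resolutions are unverified assumptions.

```python
# Product code -> resolution, exactly as used by the downloads in this section
HYDROSHEDS_PRODUCTS = {
    'dem': '30s',  # Void Filled DEM
    'dir': '30s',  # Flow Direction
    'acc': '30s',  # Flow Accumulation
    'lup': '15s',  # Flow Length Upstream
    'ldn': '15s',  # Flow Length Downstream
}

BASE_URL = 'https://data.hydrosheds.org/file'


def hydrosheds_names(code, resolution, region='sa'):
    """Return (zip URL, GeoTIFF file name) for one HydroSHEDS v1 raster product."""
    stem = f'hyd_{region}_{code}_{resolution}'
    url = f'{BASE_URL}/hydrosheds-v1-{code}/{stem}.zip'
    return url, f'{stem}.tif'


for code, resolution in HYDROSHEDS_PRODUCTS.items():
    url, tif_name = hydrosheds_names(code, resolution)
    print(url, '->', tif_name)
```

Deriving the `.tif` name from the same table as the URL also guards against pairing a variable with the wrong raster file.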


### 5. HydroBASINS (Level 1 to 12)

In [9]:
!wget https://data.hydrosheds.org/file/hydrobasins/standard/hybas_sa_lev01-12_v1c.zip

--2025-09-11 03:01:48--  https://data.hydrosheds.org/file/hydrobasins/standard/hybas_sa_lev01-12_v1c.zip
Resolving data.hydrosheds.org (data.hydrosheds.org)... 104.21.14.61, 172.67.158.28, 2606:4700:3036::ac43:9e1c, ...
Connecting to data.hydrosheds.org (data.hydrosheds.org)|104.21.14.61|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 334160720 (319M) [application/zip]
Saving to: ‘hybas_sa_lev01-12_v1c.zip’


2025-09-11 03:01:55 (46.1 MB/s) - ‘hybas_sa_lev01-12_v1c.zip’ saved [334160720/334160720]



In [10]:
!unzip "/content/drive/MyDrive/Geodatabase/hybas_sa_lev01-12_v1c.zip" -d "/content/drive/MyDrive/Geodatabase"

Archive:  /content/drive/MyDrive/Geodatabase/hybas_sa_lev01-12_v1c.zip
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.dbf  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.prj  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.sbn  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.sbx  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.shp  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.shp.xml  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.shx  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev02_v1c.dbf  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev02_v1c.prj  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev02_v1c.sbn  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev02_v1c.sbx  
  inflating: /content/drive/MyDrive/Geodatabase/hybas_sa_lev02_v1c.shp  
  inflating: /content/drive/MyDrive/Geodatabase/h

In [12]:
hydroBASINS_SA_lev1 = gpd.read_file('hybas_sa_lev01_v1c.shp')
hydroBASINS_SA_lev1

Unnamed: 0,HYBAS_ID,NEXT_DOWN,NEXT_SINK,MAIN_BAS,DIST_SINK,DIST_MAIN,SUB_AREA,UP_AREA,PFAF_ID,ENDO,COAST,ORDER,SORT,geometry
0,6010000010,0,6010000010,6010000010,0.0,0.0,17853507.4,17853507.0,6,0,1,0,1,"MULTIPOLYGON (((-78.99722 9.45417, -79.00478 9..."


In [14]:
hydroBASINS_SA_lev1.drop(['NEXT_DOWN','NEXT_SINK','MAIN_BAS','DIST_SINK','DIST_MAIN','SUB_AREA','UP_AREA','PFAF_ID','ENDO','COAST','ORDER','SORT'], axis=1, inplace=True)
hydroBASINS_SA_lev1.rename(columns={'HYBAS_ID': 'HYBAS_ID_1'}, inplace=True)
hydroBASINS_SA_lev1.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 2 columns):
 #   Column      Non-Null Count  Dtype   
---  ------      --------------  -----   
 0   HYBAS_ID_1  1 non-null      int64   
 1   geometry    1 non-null      geometry
dtypes: geometry(1), int64(1)
memory usage: 148.0 bytes


In [15]:
# Export data to Google Drive (.shp)
output_path_basin1 = '/content/drive/MyDrive/Geodatabase/SA_HydroBASINS_1.shp'
hydroBASINS_SA_lev1.to_file(output_path_basin1)

In [16]:
shapefile_prefix = '/content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c'

# List of common shapefile extensions
extensions = ['.dbf', '.prj', '.sbn', '.sbx', '.shp', '.shp.xml', '.shx']

for ext in extensions:
    file_path = shapefile_prefix + ext
    if os.path.exists(file_path):
        os.remove(file_path)
        print(f"Deleted: {file_path}")
    else:
        print(f"File not found: {file_path}")

Deleted: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.dbf
Deleted: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.prj
Deleted: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.sbn
Deleted: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.sbx
Deleted: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.shp
Deleted: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.shp.xml
Deleted: /content/drive/MyDrive/Geodatabase/hybas_sa_lev01_v1c.shx


In [None]:
!rm '/content/drive/MyDrive/Geodatabase/hybas_sa_lev01-12_v1c.zip'
!rm '/content/drive/MyDrive/Geodatabase/HydroBASINS_TechDoc_v1c.pdf'
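
Only level 1 is processed above, while the archive covers levels 1 to 12. The file names in the unzip listing (`hybas_sa_lev01_v1c`, `hybas_sa_lev02_v1c`) suggest a zero-padded pattern, so the per-level names can be generated in a loop. The following is a minimal sketch under that assumption; `hydrobasins_names` is a hypothetical helper, and the column and output names simply extend the level-1 naming used above.

```python
def hydrobasins_names(level):
    """Return (input shapefile, ID column, output shapefile) for one HydroBASINS level."""
    if not 1 <= level <= 12:
        raise ValueError('HydroBASINS levels run from 1 to 12')
    source = f'hybas_sa_lev{level:02d}_v1c.shp'   # zero-padded level, as in the archive
    id_column = f'HYBAS_ID_{level}'               # extends the HYBAS_ID_1 naming
    output = f'SA_HydroBASINS_{level}.shp'        # extends the SA_HydroBASINS_1.shp naming
    return source, id_column, output


for level in range(1, 13):
    print(hydrobasins_names(level))
```

Each tuple could feed the same read, drop, rename, and export steps shown for level 1. Note that `HYBAS_ID_10` to `HYBAS_ID_12` are 11 characters long, one more than the shapefile field-name limit of 10, so a shorter prefix would be needed for those levels.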

### 6. HydroRIVERS

In [None]:
!wget https://data.hydrosheds.org/file/HydroRIVERS/HydroRIVERS_v10_sa_shp.zip

### 7. HydroLAKES

In [None]:
!wget https://data.hydrosheds.org/file/hydrolakes/HydroLAKES_polys_v10_shp.zip

### 8. Global Lakes and Wetlands Database (GLWD)

In [None]:
!wget https://figshare.com/ndownloader/files/54001748
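
This figshare link ends in a numeric ID rather than a file name, so `wget` saves the download under that ID by default. The following is a minimal Python sketch that fetches a URL to an explicitly named file; `download_as` is a hypothetical helper, and any target name passed to it is the caller's choice, since the archive's real name and format are not stated here. Some servers may also require extra request headers that this sketch does not set.

```python
import shutil
import urllib.request
from pathlib import Path


def download_as(url, target_path):
    """Stream url to target_path and return the number of bytes written."""
    target = Path(target_path)
    target.parent.mkdir(parents=True, exist_ok=True)  # create the folder if needed
    with urllib.request.urlopen(url) as response, open(target, 'wb') as out_file:
        shutil.copyfileobj(response, out_file)
    return target.stat().st_size
```

A call such as `download_as('https://figshare.com/ndownloader/files/54001748', '/content/drive/MyDrive/Geodatabase/GLWD_download')` would store the file under a recognizable name; the name `GLWD_download` is an illustrative assumption.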

### 9. HydroWASTE

In [None]:
!wget https://figshare.com/ndownloader/files/31910714

### 10. Global River Classification (GloRiC)

In [None]:
!wget https://data.hydrosheds.org/file/hydrosheds-associated/gloric/GloRiC_v10_shapefile.zip

### 11. LakeTEMP

In [None]:
!wget https://figshare.com/ndownloader/files/46397785

### 12. Global Power Plant Database (GPPD)

In [None]:
!wget https://datasets.wri.org/private-admin/dataset/53623dfd-3df6-4f15-a091-67457cdb571f/resource/66bcdacc-3d0e-46ad-9271-a5a76b1853d2/download/globalpowerplantdatabasev130.zip