# Bulk Download of Google 2.5D Buildings Data (Outside Earth Engine)

Google's Open Buildings dataset offers 2.5D information such as building height, count, and presence. Earth Engine can query this dataset, but it becomes impractical for national or multi-country downloads because of export file sizes and task quotas.

This notebook provides a simple Python-based downloader using Google's official public URLs. It supports downloading `.tif` tiles for entire countries and any available reference year (between 2016 and 2023).

The original source can be found here: https://data.humdata.org/dataset/google-open-buildings-temporal

---

# 1. User configuration

In [None]:
# ==== USER INPUT SECTION ====
from pathlib import Path

# ISO3 codes of the countries to download (see the ISO3 Country Codes CSV file for the full list)
selected_countries = ["KEN", "BRA"]

# Reference year as a string, e.g. "2016", "2020", or "2023"; a matching URL file must exist in url_folder
target_year = "2023"

# Folder containing the .txt files with tile URLs (provided in the urls folder)
url_folder = Path(r'D:\VSG\GEE DATA DOWNLOAD\google-open-buildings-temporal\urls')

# Folder where the downloaded .tif files will be saved; any writable path works
output_base = Path(r'D:\VSG\GEE DATA DOWNLOAD\google-open-buildings-temporal\tifs')
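Before starting a long download, it can help to confirm that a URL list exists for every requested country and year. The sketch below assumes the `<ISO3>_<year>.txt` naming that the download loop in this notebook relies on; `find_missing_url_files` is a hypothetical helper introduced here for illustration.

```python
from pathlib import Path


def find_missing_url_files(url_folder: Path, countries: list[str], year: str) -> list[str]:
    """Return the expected URL list names that are absent from url_folder."""
    # Assumption: one text file per country and year, named <ISO3>_<year>.txt
    expected = [f"{iso3}_{year}.txt" for iso3 in countries]
    return [name for name in expected if not (url_folder / name).is_file()]
```

Calling `find_missing_url_files(url_folder, selected_countries, target_year)` returns an empty list when every URL file is in place.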

# 2. Import Dependencies 

In [None]:
from urllib.request import urlopen
from urllib.error import URLError, HTTPError
from tqdm.notebook import tqdm

# 3. Download Logic

In [None]:
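import shutil


def download_file(url: str, dest_path: Path, timeout: float = 60.0) -> bool:
    """
    Download a single .tif file from a URL unless it already exists.

    Returns True if the file is present afterwards, False if the download failed.
    """
    if dest_path.exists():
        return True  # Skip if already downloaded
    # Write to a temporary name so an interrupted download is never mistaken for a finished tile
    tmp_path = dest_path.with_name(dest_path.name + ".part")
    try:
        with urlopen(url, timeout=timeout) as response, open(tmp_path, "wb") as out_file:
            shutil.copyfileobj(response, out_file)  # Stream in chunks instead of loading the tile into memory
        tmp_path.replace(dest_path)
        return True
    except OSError as e:  # Includes HTTPError, URLError, timeouts, and disk errors
        tmp_path.unlink(missing_ok=True)
        print(f"⚠️  Failed: {dest_path.name} ({e})")
        return False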
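Transient network errors are common in bulk downloads, so a retry step may be worth adding. The following is a minimal sketch of a retry wrapper with exponential backoff. `download_with_retries`, `attempts`, `base_delay`, and `sleep` are hypothetical names introduced here, and the wrapper assumes that the wrapped function (for example `download_file` above) leaves a file at `dest_path` only when the download succeeded.

```python
import time
from pathlib import Path
from typing import Callable


def download_with_retries(
    download: Callable[[str, Path], object],
    url: str,
    dest_path: Path,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call download(url, dest_path) until dest_path exists or attempts run out."""
    for attempt in range(attempts):
        download(url, dest_path)
        if dest_path.exists():
            return True
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** attempt)
    return False
```

In the main loop, `download_with_retries(download_file, url, output_path)` could stand in for the direct `download_file` call. Success is judged by the file's existence rather than a return value, so the wrapper does not depend on what the wrapped function returns.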


# 4. Process countries (main download loop)

In [None]:
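failed = 0    # Tiles that are still missing after the download attempt
skipped = []  # Countries without a URL list for the chosen year

for iso3 in tqdm(selected_countries, desc=f"Downloading for year {target_year}"):
    txt_file = url_folder / f"{iso3}_{target_year}.txt"
    if not txt_file.exists():
        print(f"⛔ Missing URL file: {txt_file.name}")
        skipped.append(iso3)
        continue

    # Create the output folder only once the URL list is known to exist
    output_dir = output_base / f"{iso3}_{target_year}"
    output_dir.mkdir(parents=True, exist_ok=True)

    with open(txt_file, "r") as f:
        urls = [line.strip() for line in f if line.strip()]  # Ignore blank lines

    for url in tqdm(urls, desc=f"{iso3}", leave=False):
        filename = url.split("/")[-1]
        output_path = output_dir / filename
        download_file(url, output_path)
        if not output_path.exists():
            failed += 1

if failed or skipped:
    print(f"⚠️  Finished with {failed} failed tile(s); countries skipped for missing URL files: {skipped or 'none'}")
else:
    print("✅ All downloads completed.")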


---

## ✅ Output Summary

Downloaded `.tif` files are saved in subfolders of the `output_base` directory you defined.
Each country and year gets its own folder, e.g., `KEN_2023/`.
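To check what actually landed on disk, the folder layout described above can be summarised with the standard library alone. This is only a sketch: `summarise_tiles` is a hypothetical helper, and it assumes every tile file name ends in `.tif`.

```python
from pathlib import Path


def summarise_tiles(output_base: Path) -> dict[str, tuple[int, int]]:
    """Map each subfolder name to (tile count, total size in bytes)."""
    summary = {}
    for folder in sorted(p for p in output_base.iterdir() if p.is_dir()):
        # Assumption: tiles are the files matching *.tif directly inside the folder
        tiles = list(folder.glob("*.tif"))
        summary[folder.name] = (len(tiles), sum(t.stat().st_size for t in tiles))
    return summary
```

Printing `summarise_tiles(output_base)` gives a quick per-country overview before moving on to analysis.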

You can now:
- Load the tiles using `rasterio` or `rioxarray`
- Perform zonal stats, visualizations, or composite mosaicking
- Share tiles offline with collaborators

If any tiles failed to download, you can re-run this notebook; existing files will be skipped.
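Before re-running, it may be useful to see which tiles are still missing. The sketch below compares one URL list against its output folder, assuming the file name is the last path segment of each URL, as in the download loop; `list_missing_tiles` is a hypothetical helper introduced for illustration.

```python
from pathlib import Path


def list_missing_tiles(url_file: Path, output_dir: Path) -> list[str]:
    """Return the URLs in url_file whose tile is not present in output_dir."""
    # Blank lines in the URL list are ignored
    urls = [line.strip() for line in url_file.read_text().splitlines() if line.strip()]
    # Assumption: the tile file name is the last path segment of the URL
    return [url for url in urls if not (output_dir / url.rsplit("/", 1)[-1]).is_file()]
```

For example, `list_missing_tiles(url_folder / "KEN_2023.txt", output_base / "KEN_2023")` would list the Kenya tiles that a re-run still needs to fetch.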

---
