<a href="https://colab.research.google.com/github/danielsparing/colab-duckdb-spatial-cookbook/blob/main/notebooks/install_gdal_conda.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Install GDAL command line tools in Colab with Parquet support

If we need Parquet support in [ogr2ogr](https://gdal.org/en/stable/programs/ogr2ogr.html) or another GDAL command-line tool, we can install the required packages with `conda`. More precisely, we use Miniforge, a minimal installer that is preconfigured for the `conda-forge` channel.

The `libgdal-arrow-parquet` extension package that we need is not available via `apt`, but [it can be installed](https://gdal.org/en/stable/download.html#conda) from conda-forge. So let's first install Miniforge [^1] (and update `$PATH` [^2]):

[^1]: The curl download link comes from [conda-forge](https://conda-forge.org/download) and their installation [instructions](https://github.com/conda-forge/miniforge?tab=readme-ov-file#install) on GitHub.

[^2]: We edit environment variables in a Python cell rather than a shell cell so that the change persists across cells in the notebook. A `!` shell command runs in its own subprocess, so an `export` there is lost as soon as that command finishes.

In [1]:
import os

In [2]:
!curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
!bash Miniforge3-$(uname)-$(uname -m).sh -b -p /usr/local/miniforge

os.environ["PATH"] = "/usr/local/miniforge/bin:" + os.environ["PATH"]

# if we don't set this environment variable, `ogr2ogr` will emit PROJ-related warnings
os.environ["PROJ_LIB"] = "/usr/local/miniforge/share/proj"

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 89.3M  100 89.3M    0     0  45.8M      0  0:00:01  0:00:01 --:--:-- 78.3M
PREFIX=/usr/local/miniforge
Unpacking payload ...
Extracting _libgcc_mutex-0.1-conda_forge.tar.bz2
Extracting ca-certificates-2025.4.26-hbd8a1cb_0.conda
Extracting ld_impl_linux-64-2.43-h712a8e2_4.conda
Extracting libgomp-15.1.0-h767d61c_2.conda
Extracting pybind11-abi-4-hd8ed1ab_3.tar.bz2
Extracting python_abi-3.12-7_cp312.conda
Extracting tzdata-2025b-h78e105d_0.conda
Extracting _openmp_mutex-4.5-2_gnu.tar.bz2
Extracting libgcc-15.1.0-h767d61c_2.conda
Extracting c-ares-1.34.5-hb9d3cd8_0.conda
Extracting libexpat-2.7.0-h5888daf_0.conda
Extracting libffi-3.4.6-h2dba641_1.conda
Extracting libgcc-ng-
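
The note above about setting environment variables from Python can be checked with a small sketch. It assumes nothing about Colab itself: it only shows that a value written to `os.environ` is inherited by every child process started afterwards, which is how later `!` commands pick up the new `PATH`. The variable name `DEMO_VAR` and the helper `child_sees` are hypothetical, chosen for illustration.

```python
import os
import subprocess
import sys


def child_sees(name: str) -> str:
    """Return the value of environment variable `name` as seen by a fresh child process."""
    result = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.environ.get({name!r}, ''))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# DEMO_VAR is a made-up name used only for this demonstration.
os.environ["DEMO_VAR"] = "set-from-python"

# The child process inherits the modified environment of the notebook kernel.
print(child_sees("DEMO_VAR"))
```

By contrast, a value exported inside a single `!` command only exists in that command's subprocess and is not visible to later cells.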

Now we can add Arrow/Parquet support:

In [3]:
!conda install libgdal-arrow-parquet -q -y

Channels:
 - conda-forge
Platform: linux-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: /usr/local/miniforge

  added / updated specs:
    - libgdal-arrow-parquet


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    aws-c-auth-0.9.0           |      h3318fae_10         108 KB  conda-forge
    aws-c-cal-0.9.1            |       h5e3027f_0          50 KB  conda-forge
    aws-c-common-0.12.3        |       hb9d3cd8_0         231 KB  conda-forge
    aws-c-compression-0.3.1    |       hafb2847_5          21 KB  conda-forge
    aws-c-event-stream-0.5.4   |      haaa725d_10          56 KB  conda-forge
    aws-c-http-0.10.1          |       hd7992d4_3         218 KB  conda-forge
    aws-c-io-0.19.1            |       h7b43961_3         175 KB  conda-forge
    aws-c-mqtt-0.13.0          |       h

And that's it:

In [4]:
!ogr2ogr --formats | grep parquet
# Returns:
#   Parquet -vector- (rw+v): (Geo)Parquet (*.parquet)

  Parquet -vector- (rw+v): (Geo)Parquet (*.parquet)
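
If the check needs to happen from Python rather than by reading the `grep` output, the same information can be parsed from `ogr2ogr --formats`. The following is a minimal sketch, assuming the driver lines keep the layout shown above (short name, then ` -vector-` or similar); the helper name `has_driver` is hypothetical and not part of GDAL.

```python
import shutil
import subprocess


def has_driver(formats_output: str, driver: str) -> bool:
    """Return True if `driver` is listed as a short name in `ogr2ogr --formats` output."""
    for line in formats_output.splitlines():
        # Driver lines look like: "  Parquet -vector- (rw+v): (Geo)Parquet (*.parquet)"
        name = line.strip().split(" -", 1)[0]
        if name.lower() == driver.lower():
            return True
    return False


# Sample line copied from the cell output above.
sample = "  Parquet -vector- (rw+v): (Geo)Parquet (*.parquet)"
print(has_driver(sample, "Parquet"))

# Only query the real tool when it is actually on PATH.
if shutil.which("ogr2ogr"):
    out = subprocess.run(
        ["ogr2ogr", "--formats"], capture_output=True, text=True
    ).stdout
    print(has_driver(out, "Parquet"))
```

Matching on the short name instead of searching for the substring "parquet" avoids false positives from other drivers whose descriptions happen to mention the format.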
