How to get data into your Binder


This example demonstrates a few ways to get data into your Binder.

Small public data

The simplest approach for small data files that are public is to add them directly to your GitHub repository. This way they are directly baked into the environment and versioned together with your code.

This works well for files up to roughly 10 MB.

An example of this is data/gapminder_all.csv
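Because the file is part of the repository, code running in the Binder can read it through a relative path. The following is a minimal sketch using only the Python standard library. It assumes the code runs from the repository root and that the CSV file has a header row; the function name `load_rows` is illustrative and not part of this repository.

```python
import csv
from pathlib import Path


def load_rows(path):
    """Read a CSV file with a header row into a list of dicts."""
    # newline="" lets the csv module handle line endings itself.
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Example call, assuming the working directory is the repository root:
# rows = load_rows("data/gapminder_all.csv")
```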

Medium public files

For medium-sized files, from a few tens of megabytes to a few hundred megabytes, you can add a special file named postBuild to your repository. This lets you fetch the data when the container image is built. Doing so increases the image size, but users don't have to download the dataset each time they start the Binder, and the data is guaranteed to stay the same even if the source later becomes unavailable.

More details on the postBuild file.

How to do it

Go to your GitHub repository and create a file called postBuild. In your postBuild add the following line:

wget -q -O bikes-2016.csv ""

This will download a dataset measuring cycling and walking activity in the city of Zurich in the year 2016.

Other methods for fetching data files work just as well. We used wget only because it is a well-known tool.
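As one illustration of another method, the download could be done in Python instead of wget. This is a sketch rather than the approach this repository uses. It assumes postBuild is an executable script whose first line selects the Python interpreter; the function name `download` and the URL in the comment are placeholders.

```python
#!/usr/bin/env python3
"""Sketch of a postBuild script that fetches a data file at build time."""
import shutil
import urllib.request


def download(url, dest):
    """Copy the content at url into the local file dest."""
    # Copy in chunks so the whole file is never held in memory at once.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)


# Placeholder call; substitute the real dataset URL before use:
# download("https://example.org/bikes-2016.csv", "bikes-2016.csv")
```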

Large public files

For large files, it is practical neither to place them in your GitHub repository nor to include them directly in the container image.

Note: We can't use technical measures to stop you from including very large files in your image. However, large images take longer to launch and take up storage space that the operators have to pay for. Please be considerate.

The best option for large files is to use a library specific to the data format that streams the data as you use it. An alternative is to download each file on demand as part of your code, so that network traffic is created only when it is really needed.
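The on-demand alternative can be sketched as a helper that downloads a file the first time it is requested and reuses the local copy afterwards. This is an assumption-laden illustration, not code from this repository: the name `ensure_local`, the `data-cache` directory, and the rule of taking the file name from the last segment of the URL path are all hypothetical choices.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def ensure_local(url, cache_dir="data-cache"):
    """Return a local path for url, downloading it only on first use."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    # Assumes the URL path ends in a usable file name.
    target = cache / Path(urlparse(url).path).name
    if not target.exists():
        partial = target.with_name(target.name + ".part")
        # Write to a temporary name first so an interrupted download
        # never leaves a truncated file under the final name.
        with urllib.request.urlopen(url) as response, partial.open("wb") as out:
            shutil.copyfileobj(response, out)
        partial.replace(target)
    return target
```

Code that needs the data calls `ensure_local(url)` immediately before opening the file, so a session that never touches the data never triggers the download.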

There are a few restrictions on outgoing traffic from your Binder, imposed by the team operating mybinder.org. Currently only HTTP and Git connections are allowed. This typically comes up when people want to fetch data from FTP sites. For security reasons, FTP will never be allowed on mybinder.org.

Note: to start a discussion of opening additional ports create a new issue on the repository.

How to do it

This depends on your data format and on the libraries that support accessing it over a network. Sentinel2.ipynb shows an example of accessing Sentinel 2 images that are several gigabytes in size.
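For a line-oriented text format such as CSV, streaming can be sketched with the Python standard library alone. This example is independent of Sentinel2.ipynb and only illustrates the general idea of processing data as it arrives instead of saving it first. The function names are hypothetical, and the sketch assumes the remote file is UTF-8 encoded and served over HTTP.

```python
import csv
import urllib.request


def stream_rows(url, encoding="utf-8"):
    """Yield CSV rows from url one at a time without saving the file."""
    with urllib.request.urlopen(url) as response:
        # Decode each line lazily as it is read from the response.
        lines = (raw.decode(encoding) for raw in response)
        yield from csv.reader(lines)


def count_rows(url):
    """Example consumer: count rows while holding only one in memory."""
    return sum(1 for _ in stream_rows(url))
```

Binary formats such as satellite imagery need a library that understands the file layout, so treat this only as a pattern for simple text data.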

Private files

There is currently no way to access non-public files from mybinder.org.

For security reasons you should consider all information in a Binder as public. This means:

  • there should be no secrets (passwords, tokens, keys, etc.) in your GitHub repository
  • you should not type passwords into a running Binder on mybinder.org
  • you should not upload your private SSH key or API token to a running Binder

To support access to private files, you will have to create a local deployment of BinderHub, where you can decide on the security trade-offs yourself.

Contributing and other examples

If you know of other examples for the large files section, please contribute them to this repository. If you find a mistake in this repository, feel free to open an issue or contribute a fix directly.

