Diversity of ISCB Honorees
This is a data analysis repository for the study at https://greenelab.github.io/iscb-diversity-manuscript/.
Datasets are stored in the
This repository uses Git LFS to store large / binary datasets.
Make sure to have Git LFS installed locally before cloning the repository,
if you'd like to download the datasets.
You can also download datasets directly from the GitHub website by clicking "Raw".
The source code saves large files using XZ compression (denoted by an
Since not all users are familiar with XZ compression,
we have also created gzip exports of all XZ-compressed files
These files are placed alongside their XZ source in the
The source code pipelines use XZ compression because gzip encodes a timestamp in its header, which makes output files non-deterministic.
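The timestamp issue can be illustrated with a minimal sketch using only the Python standard library. This is not code from the repository: the `data` bytes are placeholder content, and the `mtime` values simply stand in for two runs at different times (the `mtime` keyword of `gzip.compress` requires Python 3.8 or later).

```python
import gzip
import lzma

# Placeholder bytes standing in for a dataset (not real repository data).
data = b"column_a,column_b\nvalue_1,value_2\n" * 1000

# Simulate two pipeline runs at different times by setting the gzip
# header timestamp explicitly.
gz_first = gzip.compress(data, mtime=1)
gz_second = gzip.compress(data, mtime=2)

# The .xz container stores no timestamp, so repeated runs give identical bytes.
xz_first = lzma.compress(data)
xz_second = lzma.compress(data)

print("gzip outputs identical:", gz_first == gz_second)
print("xz outputs identical:", xz_first == xz_second)

# Both formats still decompress to the same content.
assert gzip.decompress(gz_first) == gzip.decompress(gz_second) == data
assert lzma.decompress(xz_first) == data
```

The gzip outputs differ only in the header timestamp bytes, which is enough for version control to treat a regenerated file as changed.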
This repository has a corresponding Docker image with the required dependencies.
environment for the Docker image specification.
Note that the following Docker commands have a
--mount argument to give the Docker container access to files in this repository.
Therefore, any changes to the repository content created while running the Docker container will persist in this directory after the container is stopped.
The Docker image is automatically built and published by a GitHub Action. Even though this repository is public, GitHub requires authentication to download from its package registry. Therefore, you will need a GitHub account to pull the image.
Use the following steps to authenticate your local Docker client with your GitHub account.
Go to https://github.com/settings/tokens and create a new personal access token, selecting only the
You can name the token anything, for example "docker login read-only token".
Then run the following command, substituting your username and token from above:
docker login --username USERNAME --password TOKEN docker.pkg.github.com
For interactive development in Python notebooks, run the following command:
# This command must be run with the repository root as your working directory.
# Requires docker version >= 17.06.
docker run \
  --name iscb-diversity \
  --detach --rm \
  --env JUPYTER_TOKEN=ksbegpqzrurktbkikyo \
  --publish 8899:8888 \
  --mount type=bind,source="$(pwd)",target=/user/jupyter \
  docker.pkg.github.com/greenelab/iscb-diversity/iscb-diversity
Then navigate to the following URL in your browser: http://localhost:8899?token=ksbegpqzrurktbkikyo
You should see a Jupyter Notebook landing page where you can open, edit, and run any of the notebooks.
When you are done, shut down the Jupyter Notebook server and remove the Docker container by running
docker stop iscb-diversity in a new terminal.
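If you want a different host port or token, the relationship between the `--publish` and `--env JUPYTER_TOKEN` arguments and the browser URL can be sketched as follows. The helper `jupyter_docker_command` is hypothetical and not part of this repository; it only assembles the same arguments shown in the command above and does not run Docker. `shlex.join` requires Python 3.8 or later.

```python
import shlex
from pathlib import Path


def jupyter_docker_command(repo_root, host_port=8899, token="ksbegpqzrurktbkikyo"):
    """Hypothetical helper: build the docker run arguments and matching URL."""
    argv = [
        "docker", "run",
        "--name", "iscb-diversity",
        "--detach", "--rm",
        # The token set here must match the token in the browser URL.
        "--env", f"JUPYTER_TOKEN={token}",
        # host_port is the port on your machine; 8888 is the container port.
        "--publish", f"{host_port}:8888",
        "--mount", f"type=bind,source={repo_root},target=/user/jupyter",
        "docker.pkg.github.com/greenelab/iscb-diversity/iscb-diversity",
    ]
    url = f"http://localhost:{host_port}?token={token}"
    return argv, url


# Example values chosen for illustration only.
argv, url = jupyter_docker_command(Path.cwd(), host_port=9000, token="my-secret-token")
print(shlex.join(argv))
print(url)
```

The printed command can be pasted into a terminal from the repository root, and the printed URL is the one to open once the container is running.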
Similarly, for the R notebooks:
# This command must be run with the repository root as your working directory.
# Requires docker version >= 17.06.
docker run \
  --name iscb-diversity-r \
  --detach --rm \
  --publish 8787:8787 \
  --env DISABLE_AUTH=true \
  --mount type=bind,source="$(pwd)",target=/home/rstudio/repo \
  docker.pkg.github.com/greenelab/iscb-diversity/iscb-diversity-r
Navigate to http://localhost:8787 and you should be logged into RStudio as the rstudio user.
When you are done, shut down the RStudio server and remove the Docker container by running
docker stop iscb-diversity-r.
python utils/prepare_docs.py --nbviewer --readme
utils/prepare_docs.py to change the template for
The entire repository is released under the CC BY 4.0 License available in
All code files and snippets are additionally released under the BSD 3-Clause License available in