
Sotoki (Stack Overflow to Kiwix) is an openZIM scraper to create offline versions of Stack Exchange websites such as Stack Overflow.

It is based on Stack Exchange's Data Dumps hosted by The Internet Archive.

License: GPL v3


Sotoki works off a domain that you must provide: the domain name of the Stack Exchange website you want to scrape. Run sotoki --list-all to get a list of available domains.
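Since Sotoki expects a bare domain name rather than a full URL, a small helper can reduce whatever you copied from the browser to that form. The following is a minimal sketch, not part of Sotoki: `normalize_domain` is a hypothetical name, and it only strips the scheme, path and letter case. It does not check that the result is one of the domains Sotoki lists.

```python
from urllib.parse import urlsplit


def normalize_domain(value: str) -> str:
    """Reduce a URL or host string to a bare, lowercase domain name.

    Hypothetical helper for illustration; it is not part of Sotoki.
    """
    value = value.strip()
    # urlsplit only fills in .hostname when a "//" separator is present,
    # so add one for inputs such as "stackoverflow.com/questions".
    if "//" not in value:
        value = "//" + value
    host = urlsplit(value).hostname
    if not host:
        raise ValueError(f"no domain found in {value!r}")
    return host


if __name__ == "__main__":
    print(normalize_domain("https://Sports.StackExchange.com/questions"))
```

The returned string is what you would then pass to Sotoki as the domain.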


docker run -v my_dir:/output sotoki --help


sotoki is Python 3 software. If you are not using the Docker image, you are advised to install it in a virtual environment to avoid installing its dependencies system-wide.

python3 -m venv ./env  # creates a virtual python environment in ./env folder
./env/bin/pip install -U pip  # upgrade pip (package manager). recommended
./env/bin/pip install -U sotoki  # install/upgrade sotoki inside virtualenv

# direct access to in-virtualenv sotoki binary, without shell-attachment
./env/bin/sotoki --help
# alias or link it for convenience
sudo ln -s "$(pwd)/env/bin/sotoki" /usr/local/bin/

# alternatively, attach virtualenv to shell
source env/bin/activate
sotoki --help
deactivate  # unloads virtualenv from shell
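The same setup can be scripted from Python with the standard library's `venv` module, which may be handy in automation. This is a rough sketch under stated assumptions: `venv_executable` and `install_sotoki` are hypothetical helper names, and the `Scripts` layout for Windows reflects how `venv` lays out environments there, not anything Sotoki documents.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_executable(env_dir, name, os_name=os.name):
    """Return the expected path of an executable inside a virtualenv.

    Hypothetical helper: POSIX environments use bin/, Windows uses Scripts/.
    """
    env = Path(env_dir)
    if os_name == "nt":
        return env / "Scripts" / f"{name}.exe"
    return env / "bin" / name


def install_sotoki(env_dir="env"):
    """Create a virtualenv and install sotoki into it (needs network access).

    Mirrors the shell commands above; returns the path to the sotoki binary.
    """
    venv.create(env_dir, with_pip=True)  # same as: python3 -m venv ./env
    pip = str(venv_executable(env_dir, "pip"))
    subprocess.run([pip, "install", "-U", "sotoki"], check=True)
    return venv_executable(env_dir, "sotoki")
```

Calling `install_sotoki("env")` would create `./env` and return the path to the in-virtualenv `sotoki` binary, which can then be run directly without activating the environment.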


Anybody is welcome to improve Sotoki.

To run Sotoki off the git repository, you'll need to download a few external dependencies that we pack in Python releases. Just run python src/sotoki/

See requirements.txt for the list of python dependencies.


You don't have to make your own ZIM files of Stack Exchange's Web sites. Updated ZIM files are built on a regular basis for all of them. Look at to download them.