# Publishing

Publishing your results can mean several things:
- writing a manuscript and submitting the results to a journal for (double-blind) peer review
- creating a data publication, ideally submitted with your text manuscript, for data transparency & reproducibility
- publishing your git repository, making it available for others to find and read your notebooks

We provide a short step-by-step guide on how to publish Jupyter notebooks together with the generated visuals and output as a data publication.

## Publish data

### Make a release version

When you are finished with your work, e.g. before submitting your manuscript for the first round of review, create a git release for your notebook repository and give it a version number:
```bash
git tag -a v1.0.0 -m "Major release before submitting to Journal"
git push --tags
```

```{hint}
This adds a marker to your Git repository that can be easily found and referenced at any later stage. If you submit a minor or major revision at a later date, add another version tag to describe your progress.
```
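If you tag several revisions over time, it helps to derive the next tag from the previous one in a consistent way. The following is a minimal sketch, assuming your tags follow the `vMAJOR.MINOR.PATCH` pattern used above; the helper name `bump_version` is hypothetical and not part of git or of this repository.

```python
import re


def bump_version(tag: str, part: str) -> str:
    """Return the next version tag, given the current tag and the part to raise."""
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"Not a vMAJOR.MINOR.PATCH tag: {tag!r}")
    major, minor, patch = (int(number) for number in match.groups())
    if part == "major":
        return f"v{major + 1}.0.0"  # lower parts are reset to zero
    if part == "minor":
        return f"v{major}.{minor + 1}.0"
    if part == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown part: {part!r}")


print(bump_version("v1.0.0", "minor"))  # v1.1.0
print(bump_version("v1.0.0", "major"))  # v2.0.0
```

The returned string can then be passed to `git tag -a` in the same way as the first release tag.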

After pushing your tag to GitHub or GitLab, you can (and should!) create a **Release** from it, to which you can attach data and other output. Releases can be [cited through, for example, Zenodo or ioerDATA](#ioerdata).

```{figure} ../resources/release.png
:name: gitlab-release

A release from our [GitLab repository](https://gitlab.hrz.tu-chemnitz.de/ioer/fdz/jupyter-book-nfdi4biodiversity/-/releases), based on the <code>v0.6.5</code> version tag of the training materials.
```


### Create HTML versions of all your notebooks

This step is optional but recommended, because reviewers may not have JupyterLab available to open your `*.ipynb` notebooks. Converting notebooks to HTML archives the code together with the generated visuals. You can convert a notebook directly in Jupyter with the command below:

In [1]:
!jupyter nbconvert --to html_toc \
    --output-dir=../out/ ./205_publish.ipynb \
    --template=../nbconvert.tpl \
    --ExtractOutputPreprocessor.enabled=False

[NbConvertApp] Converting notebook ./205_publish.ipynb to html_toc
[NbConvertApp] Writing 306783 bytes to ../out/205_publish.html


```{admonition} Add conversion command at the end of every notebook
:class: hint
It is a good idea to add this command as the last cell of every notebook, so that the HTML version is regenerated each time the notebook is run.
```
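If you prefer to convert all notebooks in one pass, the command can also be assembled in Python. This is a minimal sketch, not the workflow used in this book: the function names, the plain `html` exporter and the example file names are assumptions, and the template options shown above are left out. With `dry_run=True` the commands are only printed, not executed.

```python
import subprocess
import tempfile
from pathlib import Path


def nbconvert_command(notebook: Path, out_dir: Path, to: str = "html") -> list[str]:
    """Build the nbconvert call for one notebook as an argument list."""
    return ["jupyter", "nbconvert", "--to", to,
            f"--output-dir={out_dir}", str(notebook)]


def convert_all(notebook_dir: Path, out_dir: Path, dry_run: bool = False) -> list[list[str]]:
    """Convert every *.ipynb file in notebook_dir; return the commands used."""
    commands = [nbconvert_command(notebook, out_dir)
                for notebook in sorted(notebook_dir.glob("*.ipynb"))]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)  # stop at the first failure
    return commands


# Demonstration with two empty placeholder notebooks (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    notebooks = Path(tmp)
    (notebooks / "01_intro.ipynb").touch()
    (notebooks / "02_analysis.ipynb").touch()
    for command in convert_all(notebooks, Path("../out"), dry_run=True):
        print(" ".join(command))
```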

```{admonition} Attach the static HTML files to your publication
:class: note
These HTML versions of the notebooks are well suited as Supplementary Materials (SM) for your publication. They act as a portable archive that contains your documentation, code and output graphics as they were at the time of publication. Because this is the most important information, attach the files directly to your paper. This also lets reviewers inspect your workflow without running the notebooks themselves.

In addition to these HTML files, the original notebook files (<kbd>\*.ipynb</kbd>) and accompanying data should be made available in a proper data publication, which we show below.
```

### Create a zip file with all your output data

Once you have exported all notebook HTML files and figures, create a ZIP file that includes all your data, notebooks, HTML files and figures. Again, you can create this ZIP file directly in Jupyter, named after the latest git version tag that we created above.

Remove any previous releases.

- `!rm ../out/*.zip` cleans up the archives of any previous release. The `!` tells Jupyter to run the line as a shell command instead of Python.

In [2]:
!rm ../out/*.zip

rm: cannot remove '../out/*.zip': No such file or directory
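
The message above is harmless: it only means that there was no earlier archive to delete. If you would rather avoid it, the clean-up can be written in Python. This is a minimal sketch; the function name `remove_old_releases` is hypothetical, and the demonstration runs in a temporary directory so that no real files are touched.

```python
import tempfile
from pathlib import Path


def remove_old_releases(out_dir: Path) -> list[str]:
    """Delete all *.zip files in out_dir and return their names."""
    removed = []
    for archive in sorted(out_dir.glob("*.zip")):
        archive.unlink()
        removed.append(archive.name)
    return removed  # empty list if nothing matched, no error is raised


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "release_v0.9.0.zip").touch()
    print(remove_old_releases(out))  # ['release_v0.9.0.zip']
    print(remove_old_releases(out))  # []
```

In the notebook, you would call the function with the output folder, for example `remove_old_releases(Path("../out"))`.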


Create a new release `*.zip`.

- `git config --system --add safe.directory '*'` marks every directory as safe, so that git does not refuse to run in a repository whose files are owned by a different user
- `RELEASE_VERSION=$(git describe --tags --abbrev=0)` gets the latest version tag from git
- `7z a -tzip -mx=9 out/release_$RELEASE_VERSION.zip` compresses the selected files, at the highest compression level, into a ZIP archive with the version in its name
- With `py/* out/* resources/* notebooks/*.ipynb` (etc.) we explicitly select the folders and files that we want to include in the release. Note that we explicitly _include_ the `00_data/` directory, which is not committed to the git repository itself (due to the `.gitignore` file)
- At the end, we exclude a number of temporary files that we do not need to archive (`-x!py/__pycache__ -x!py/modules/__pycache__` etc.)
- Finally, we suppress the output log by redirecting it to `/dev/null`

In [3]:
%%time
!cd .. && git config --system --add safe.directory '*' \
    && RELEASE_VERSION=$(git describe --tags --abbrev=0) \
    && 7z a -tzip -mx=9 out/release_$RELEASE_VERSION.zip \
    py/* out/* resources/* *.bib notebooks/*.ipynb 00_data/* \
    *.md *.yml *.ipynb nbconvert.tpl conf.json pyproject.toml \
    -x!py/__pycache__ -x!py/modules/__pycache__ -x!py/modules/.ipynb_checkpoints \
    -y > /dev/null

CPU times: user 461 ms, sys: 83.2 ms, total: 544 ms
Wall time: 1min 4s


```{admonition} <code>%%time</code>
:class: hint
Above, we use the IPython `%%time` cell magic to measure the execution time of the cell. See [Built-in magic commands](https://ipython.readthedocs.io/en/stable/interactive/magics.html).
```
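
If `7z` is not installed, a similar archive can be built with the Python standard library. The following is a minimal sketch under stated assumptions, not a drop-in replacement for the cell above: the function name `create_release_zip`, the glob patterns and the placeholder files are hypothetical, and the version in the archive name is written by hand instead of being read from git.

```python
import tempfile
import zipfile
from pathlib import Path

# Folder names that should never end up in the archive.
EXCLUDED_DIRS = {"__pycache__", ".ipynb_checkpoints"}


def create_release_zip(base_dir: Path, archive: Path, patterns: list[str]) -> list[str]:
    """Zip all files under base_dir that match patterns; return the archived names."""
    selected = set()
    for pattern in patterns:
        for path in base_dir.glob(pattern):
            # Skip directories and the archive itself.
            if not path.is_file() or path.resolve() == archive.resolve():
                continue
            relative = path.relative_to(base_dir)
            if EXCLUDED_DIRS.isdisjoint(relative.parts):
                selected.add(relative)
    names = sorted(path.as_posix() for path in selected)
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED,
                         compresslevel=9) as zf:
        for name in names:
            zf.write(base_dir / name, arcname=name)
    return names


# Demonstration on a small placeholder repository layout.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for name in ["py/modules/tools.py", "py/__pycache__/tools.pyc",
                 "00_data/input.csv", "notebooks/205_publish.ipynb", "README.md"]:
        file = base / name
        file.parent.mkdir(parents=True, exist_ok=True)
        file.write_text("placeholder")
    (base / "out").mkdir()
    archived = create_release_zip(
        base, base / "out" / "release_v1.0.0.zip",
        ["py/**/*", "00_data/**/*", "notebooks/*.ipynb", "*.md"])
    print(archived)
```

Listing the patterns explicitly mirrors the `7z` call: only what you name is archived, so data folders ignored by git can be included on purpose.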

In [4]:
!RELEASE_VERSION=$(git describe --tags --abbrev=0) \
    && ls -alh ../out/release_$RELEASE_VERSION.zip

-rw-r--r-- 1 root 1002 150M Apr  4 06:23 ../out/release_v1.2.0.zip
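
Besides the file size, it can be worth checking that the archive really contains the folders you expect, in particular data folders that are not tracked by git. This is a minimal sketch using the standard `zipfile` module; the function name `missing_top_level` and the folder names in the demonstration are assumptions.

```python
import tempfile
import zipfile
from pathlib import Path


def missing_top_level(archive: Path, expected: set[str]) -> set[str]:
    """Return the expected top-level names that are absent from the archive."""
    with zipfile.ZipFile(archive) as zf:
        present = {name.split("/", 1)[0] for name in zf.namelist()}
    return expected - present


# Demonstration with a small placeholder archive.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "release_v1.0.0.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("notebooks/205_publish.ipynb", "{}")
        zf.writestr("py/modules/tools.py", "")
    print(missing_top_level(archive, {"notebooks", "py", "00_data"}))  # {'00_data'}
```

An empty set means that every expected folder is present.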


```{figure} ../resources/download.png
:name: download-figure

In the Explorer on the left, right-click the file and select <kbd>download</kbd>. Archive this replication package with your data repository of choice.
```

### List the directory file tree

Before uploading data to a repository, it is useful to print a file tree of your current working directory. This helps others to understand how your files were organised at the time of execution. For example, a data file may be missing from the git repository because it sits in a folder that is excluded by the `.gitignore` file. Unless you are transparent about where such files were located and how they were named at build time, it would be impossible to reproduce your work.

There are several ways to do this. For example, you could create a file tree using a Jupyter cell and the bash command `!tree --prune -I "_build|tmp"`. This would output a tree of files, but exclude the `_build` and `tmp` directories, which only contain temporary files.

As an alternative, we wrote a Python function that does something similar, but with more formatting options.

In [4]:
import sys
from pathlib import Path

module_path = str(Path.cwd().parents[0] / "py")
if module_path not in sys.path:
    sys.path.append(module_path)
from modules import tools

In [5]:
ignore_files_folders = ["_build", "tmp", "et-book"]
ignore_match = ["*.gdbtabl*", "*a0000000*"]
tools.tree(
    Path.cwd().parents[0],
    ignore_files_folders=ignore_files_folders, ignore_match=ignore_match)

- `Path.cwd().parents[0]` specifies the origin directory for the tree, which is the base path of our repository
- `ignore_files_folders` is a list of full folder or file names that should not be listed
- `ignore_match` is a list of wildcard patterns that can be used to exclude a wider range of files, such as most of the proprietary ESRI files in `*.gdb` folders.
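
To illustrate how the two ignore lists work together, here is a minimal sketch of a tree function. It is not the implementation of `tools.tree`, whose code is not shown here; the name `simple_tree` and the placeholder files are hypothetical.

```python
import tempfile
from fnmatch import fnmatch
from pathlib import Path


def simple_tree(root: Path, ignore_files_folders=(), ignore_match=(), prefix=""):
    """Return the file tree below root as a list of text lines."""
    entries = sorted(
        (path for path in root.iterdir()
         if path.name not in ignore_files_folders               # exact names
         and not any(fnmatch(path.name, pattern) for pattern in ignore_match)),
        key=lambda path: (path.is_file(), path.name.lower()),  # folders first
    )
    lines = []
    for index, entry in enumerate(entries):
        last = index == len(entries) - 1
        lines.append(f"{prefix}{'└── ' if last else '├── '}{entry.name}")
        if entry.is_dir():
            child_prefix = prefix + ("    " if last else "│   ")
            lines.extend(simple_tree(entry, ignore_files_folders,
                                     ignore_match, child_prefix))
    return lines


# Demonstration on a small placeholder layout.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    for name in ["notebooks/205_publish.ipynb", "tmp/cache.bin",
                 "data.gdbtable", "README.md"]:
        file = base / name
        file.parent.mkdir(parents=True, exist_ok=True)
        file.touch()
    print("\n".join(simple_tree(base, ["_build", "tmp"], ["*.gdbtabl*"])))
```

Exact names are checked first and wildcard patterns second, so a single pattern can hide a whole family of files without listing each one.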

```{admonition} Use this file list in the <kbd>README.md</kbd> of your ioerDATA upload below
:class: hint
In most scientific data repositories you will be asked to provide a list of files and descriptions. The directory tree returned by the above command can be used as a starting point.
```

### ioerDATA

With the release ZIP file and the file list at hand, you are ready to upload your data to a data repository and create a DOI, so that it can be properly archived, cited and referenced.

ioerDATA is one such repository. It is available to all IOER collaborators at [https://data.fdz.ioer.de](https://data.fdz.ioer.de).

```{admonition} See the ioerDATA documentation
:class: hint
If you are an IOER colleague, have a look at the (internal) ioerDATA documentation at [https://docs.fdz.ioer.info/documentation/ioerdata/](https://docs.fdz.ioer.info/documentation/ioerdata/).
```

Other data repositories include [Zenodo](https://zenodo.org/).

## Publishing code

In addition to a data repository, you can (and should!) make your git repository available through, for example, GitLab or GitHub. If you are in the middle of peer review, you may want to temporarily remove or redact any names.

```{admonition} Using GitHub Pages
:class: hint
You can configure GitHub to publish your HTML-converted notebooks to GitHub Pages at github.io. See the [Quickstart for GitHub Pages](https://docs.github.com/en/pages/quickstart).
```

✨ Then, spread the love! 💖 Share your notebook links with others on social media 📢, in communities 🤝, and beyond! 🚀