Minor faq improvements (#251)
* Back to top links in faq.rst
* Back to top links for dockstore-cli-faq.rst
* Make GHA FAQ links more consistent
* Fix unintended deletion of fullstop
* Don't use alpine
* De-CWL-ize, overhaul big ref files, entrypoints, security
* Move environment question to CLI FAQ
* Improve FAQs on branches/releases not showing up
* Add glossary entry for parent image
* Fix formatting on XKCD reference
* Ubuntu 22.04 instead of 18.04
* Add "how do I use workflows" to general FAQ
* Stage/Staging typo fixed
* Remove regex question thanks to dockstore/dockstore#5354

---------

Co-authored-by: Ash O'Farrell <aofarrel@ucsc.edu>
denis-yuen and aofarrel committed Jun 7, 2023
1 parent c386665 commit f0565cc
Showing 5 changed files with 45 additions and 46 deletions.
4 changes: 4 additions & 0 deletions docs/_attic/glossary_entries.py
@@ -506,6 +506,10 @@
institute="",
pronunciation='')

ParentImage = GlossEntry("parent image",
definition="A [Docker image] which acts as the base upon which another Docker image is built. For example, including ``FROM ubuntu:22.04`` in a [Dockerfile] means that the resulting image will include everything inside ubuntu:22.04, plus any changes made by other commands in the Dockerfile. Parent images are sometimes called base images, but strictly speaking a base image is different (see further reading).",
furtherreading="https://docs.docker.com/glossary/#parent-image")

Preemptible = GlossEntry("preemptible",
acronym_full="",
definition="A type of [GCP] [VM] which may have its running jobs interrupted at any given time, and will be shut down if running for more than 24 hours. A preemptible machine is significantly cheaper than a standard VM, at the cost of possibly stopping before your computational work is finished. You can use preemptible machines when running workflows on GCP backends to save on compute costs.",
1 change: 1 addition & 0 deletions docs/_attic/glossary_entries_list_dynamic.txt

11 changes: 11 additions & 0 deletions docs/dictionary.rst

58 changes: 24 additions & 34 deletions docs/faq.rst
@@ -13,19 +13,9 @@ For questions relating to the Dockstore CLI, please see :doc:`Dockstore CLI FAQ
General Dockstore Questions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

What environment do you test tools in?
--------------------------------------

Typically, we test running tools in Ubuntu Linux 16.04 LTS on VMs in
`OpenStack <https://www.openstack.org/>`__ with 8 vCPUs and 96 GB of RAM
and above. We've also begun testing on Ubuntu 18.04 LTS and so far it's
been successful. If you are only listing and editing tools, we have
achieved success with much lower system requirements. However, launching
tools will have higher system requirements dependent on the specific
tool. Consult a tool's README or CWL/WDL description when in doubt.

:ref:`(back to top) <topFAQ>`

How do I use workflows that I find on Dockstore?
------------------------------------------------
If you want to run the workflow locally, you can :doc:`use the Dockstore CLI to download and run the workflow </launch-with/launch>`. Alternatively, you can run the workflow using a cloud platform by clicking the "Launch With" button on the right side of the workflow entry.

.. _what-is-a-verified-tool-or-workflow:

@@ -134,8 +124,8 @@ You will find a variety of citation styles and ways to export it at
Integration with GitHub
^^^^^^^^^^^^^^^^^^^^^^^

What is the difference between logging in with GitHub or logging in with Google?
--------------------------------------------------------------------------------
What is the difference between logging in with GitHub versus logging in with Google?
-------------------------------------------------------------------------------------

The intent here is that you should be able to login with either login
method and still conveniently get into the same Dockstore account. With
@@ -324,43 +314,42 @@ Do you have tips on creating Dockerfiles?
client/server issues, it is also not compatible with CWL)
- do not depend on changes to ``hostname`` or ``/etc/hosts``, Docker
will interfere with this
- use a well-known and secure :ref:`dict parent image` such as official `debian <https://hub.docker.com/_/debian>`__ or `Python <https://hub.docker.com/_/python>`__ images
- try to keep your Docker images small
- however, do not use alpine images, or other images that lack bash, as your parent image unless you will be installing bash in the Dockerfile


:ref:`(back to top) <topFAQ>`

How should I handle large reference files when designing workflows and Dockerfiles?
-----------------------------------------------------------------------------------
Generally speaking, you can choose to either "package" reference files in your Docker image, or you can treat them as "inputs" so they can be staged outside and mounted into the running container. We generally recommend having them serve as inputs.

Do you have tips on creating CWL files?
:ref:`(back to top) <topFAQ>`

Do you have tips on creating workflows?
---------------------------------------

When writing CWL tools and workflows, there are a few common workarounds
that can be used to deal with the restrictions that CWL places on the
use of docker. These include:
When writing workflows, there are a few common workarounds
that can be used to deal with the restrictions that workflow languages such as CWL and WDL place on the
use of Docker. These include:

* cwltool (which we use to run tools) is restrictive and locks down much of ``/`` as read only, use the current working directory or $TMPDIR for file writes

* You can also use `Docker volumes <https://docs.docker.com/engine/reference/builder/#/volume>`__ in your Dockerfile to specify additional writeable directories

* Do not rely on the hostname inside a container, Docker dynamically generates this when starting containers
* Do not rely on the hostname inside a container; Docker dynamically generates this when starting containers

* Do not rely on an entrypoint inside a container; workflow executors tend to override custom entrypoints with /bin/bash

Additionally:

- you need to "collect" output from your tools/workflows inside docker
and drop them into the current working directory in order for CWL to
"find" them and pull them back outside of the container
- in order for the workflow executor to "find" your outputs and pull them back outside the container, you may want to "collect" output from your tools/workflows inside Docker and drop them into the current working directory
- related to this, it's often times easiest to write a simple wrapper
script that maps the command line arguments specified by CWL to
however your tool expects to be parameterized. This script can handle
moving output to the current working directory and renaming if need
be
- genomics workflows work with large data files, this can have a few
ramifications:

- do not "package" large data reference files in your Docker image.
Instead, treat them as "inputs" so they can be staged outside and
mounted into the running container
- the ``$TMPDIR`` variable can be used as a scratch space inside
your container. Make sure your host running Docker has sufficient
scratch space for processing your genomics data.
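
The wrapper-script tip above can be sketched roughly as follows. This is only an illustration in Python, not something Dockstore ships: the names ``collect_outputs`` and ``main``, and the convention that the first argument is a scratch directory and the rest is the tool's command line, are assumptions made for this example.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def collect_outputs(scratch_dir, dest_dir, pattern="*"):
    """Move regular files matching `pattern` from scratch_dir into dest_dir.

    Returns the list of new paths so the caller can report them.
    """
    scratch, dest = Path(scratch_dir), Path(dest_dir)
    moved = []
    for src in sorted(scratch.glob(pattern)):
        if src.is_file():  # subdirectories are left where they are
            target = dest / src.name
            shutil.move(str(src), str(target))
            moved.append(target)
    return moved


def main(argv):
    # Hypothetical calling convention: SCRATCH_DIR followed by the tool command.
    if len(argv) < 2:
        print("usage: wrapper.py SCRATCH_DIR TOOL [ARGS...]", file=sys.stderr)
        return 2
    scratch_dir, tool_cmd = argv[0], argv[1:]
    Path(scratch_dir).mkdir(parents=True, exist_ok=True)
    # Run the real tool; check=True raises if it exits with a non-zero status.
    subprocess.run(tool_cmd, check=True)
    # Drop the results into the current working directory, where the
    # workflow executor is expected to look for them.
    for path in collect_outputs(scratch_dir, Path.cwd()):
        print(path)
    return 0
```

A workflow step could then call something like ``main(["scratch", "mytool", "--out", "scratch/result.txt"])``, where ``mytool`` stands in for whatever the container actually runs.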

:ref:`(back to top) <topFAQ>`

@@ -384,11 +373,12 @@ Any last tips on using Dockstore?
---------------------------------

- the Dockstore CLI uses ``./datastore`` in the working directory for
temp files so if you're processing large files make sure this
temp files, so if you're processing large files, make sure this
partition hosting the current directory is large.
- you can use a single Docker image with multiple tools, each of them
registered via a different CWL
- you can use a Git repository with multiple CWL files
- you can also use a single Docker image with multiple workflows
- you can use a Git repository with multiple workflow files
- related to the two above, you can use non-standard file paths if you
customize your registrations in the Version tab of Dockstore
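
For the first tip, one way to check beforehand that the partition hosting the current directory (and therefore ``./datastore``) has enough room is sketched below. The helper names and the idea of passing a required size in GiB are assumptions for this example; how much space a given workflow needs depends on its inputs.

```python
import shutil
from pathlib import Path


def free_gib(path="."):
    """Return the free space, in GiB, on the partition that holds `path`."""
    usage = shutil.disk_usage(Path(path).resolve())
    return usage.free / 1024**3


def has_room_for(required_gib, path="."):
    """True if the partition holding `path` has at least `required_gib` GiB free."""
    return free_gib(path) >= required_gib
```

For instance, ``has_room_for(200)`` could be checked in the directory where the Dockstore CLI will be launched before staging large inputs.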

@@ -124,15 +124,17 @@ Navigate to the ``/my-workflows`` (or ``/my-tools``) page.
Expand the GitHub account that the repository belongs to on the left hand side. Then click on the bottom where it says ``Apps Logs``.

.. image:: /assets/images/docs/github-app-logs-button.png
:width: 30 %
:width: 40 %

Once loaded, the following window will be displayed.

.. image:: /assets/images/docs/github-app-logs-window.png
:width: 80 %

Here you can view all the GitHub app events Dockstore is aware of and whether they failed or were successful. If there was a failure, you can expand that row and view the error message as shown below.

.. image:: /assets/images/docs/github-app-logs-error-message.png
:width: 80 %

In the case shown above, the error message is from parsing the following .dockstore.yml file.

@@ -173,19 +175,10 @@ A new separate workflow/tool will be registered if the original name isn't inclu
How can I convert my entire existing workflow/tool at once?
-------------------------------------------------------------
Currently you cannot convert all existing branches/versions at once. You must add a .dockstore.yml to each branch in order for the GitHub app
automatically detect and sync changes with the corresponding version on Dockstore.
automatically detect and sync changes with the corresponding version on Dockstore. A side effect of this is that, unless you edit tagged commits to include .dockstore.yml files, old releases created prior to you creating a .dockstore.yml will not be selectable in Dockstore.

If you have a .dockstore.yml file in your master or develop branches on GitHub, any new branches you create from these as your template
will have a .dockstore.yml.

:ref:`(back to top) <topGHAFAQ>`


Why are only some branches appearing on my workflow/tool?
----------------------------------------------------------
The Dockstore GitHub App is currently unable to parse branches that use special characters besides numerical digits, non-leading dashes, forward slashes, periods, and underscores. "Special characters" includes alphabetical characters with accents, tildes, circumflexes, umlauts, or non-English letters such as ß and ø. These limitations are stricter than what GitHub itself allows. As a result, if you have a GitHub branch named something like `Ó-Fearghail`, `branch-with-{curly-braces}`, or `Robert');-DROP-TABLE-Students;`, that branch will not appear on Dockstore. If you check the Dockstore GitHub App logs, you'll see these branches throw an error such as `Reference refs/heads/branch-with-{curly-braces} is not of the valid form`.

However, even if you have branches with unsupported names, other branches with names like `main` and `develop` will continue to update on Dockstore as normal. The public view of your published entry will not show any errors -- it will simply not show the branches with unsupported names.
will have a .dockstore.yml, as will future releases based on master or develop.

:ref:`(back to top) <topGHAFAQ>`
