From 73343f0eddcbd06c77e057bfb09608c651115cc4 Mon Sep 17 00:00:00 2001
From: Manuel Holtgrewe
Date: Thu, 13 Jun 2024 08:13:07 +0200
Subject: [PATCH] feat: adding developer documentation (#16)

---
 src/dev_install.rst   | 535 ++++++++++++++++++++++++++++++++++++++++++
 src/dev_notes.rst     | 207 ++++++++++++++++
 src/dev_style.rst     |  50 ++++
 src/index.rst         |  29 ++-
 src/misc_glossary.rst |  13 +
 5 files changed, 833 insertions(+), 1 deletion(-)
 create mode 100644 src/dev_install.rst
 create mode 100644 src/dev_notes.rst
 create mode 100644 src/dev_style.rst
 create mode 100644 src/misc_glossary.rst

diff --git a/src/dev_install.rst b/src/dev_install.rst
new file mode 100644
index 0000000..04a95c1
--- /dev/null
+++ b/src/dev_install.rst
@@ -0,0 +1,535 @@
+.. _dev_install:
+
+======================
+Developer Installation
+======================
+
+
+The VarFish installation for developers should be set up differently from the installation for production use.
+
+The reason is that the production installation runs entirely inside a Docker environment.
+All containers are assigned to a Docker network that the host cannot access by default, except for the reverse proxy that provides access to the VarFish web interface.
+
+The developer installation is intended to omit the full VarFish database so that it is lightweight and fits on a laptop.
+We advise installing the services directly on the host rather than running them in Docker containers.
+
+Please find the instructions for the Windows installation at the end of the page.
+
+
+.. _dev_install_postgres:
+
+----------------
+Install Postgres
+----------------
+
+Follow the instructions for your operating system to install `Postgres `__.
+Make sure that the version is 12 (11, 13 and 14 also work).
+Ubuntu 20.04 already ships PostgreSQL 12.
+On older Ubuntu versions, install it with:
+
+.. code-block:: bash
+
+    sudo apt install postgresql-12
+
+
+Adapt the Postgres configuration file; for Postgres 14 this would be:
+
+.. code-block:: bash
+
+    sudo sed -i \
+        -e 's/.*max_locks_per_transaction.*/max_locks_per_transaction = 1024 # min 10/' \
+        /etc/postgresql/14/main/postgresql.conf
+
+
+.. _dev_install_redis:
+
+-------------
+Install Redis
+-------------
+
+`Redis `_ is the broker that Celery uses to manage the queues.
+Follow the instructions for your operating system to install Redis.
+For Ubuntu, this would be:
+
+.. code-block:: bash
+
+    sudo apt install redis-server
+
+
+.. _dev_install_python_pipenv:
+
+---------------------
+Install Python Pipenv
+---------------------
+
+We use `pipenv `__ for managing dependencies.
+The advantage over plain ``pip`` is that the versions of transitive dependencies ("dependencies of dependencies") are also tracked in a ``Pipfile.lock`` file.
+This allows for better reproducibility.
+
+Also, note that VarFish is developed using Python 3.10+ only.
+To install Python 3.10+, you can use `pyenv `__.
+If you already have Python 3.10 (check with ``python --version``), you can skip this step.
+
+.. code-block:: bash
+
+    git clone https://github.com/pyenv/pyenv.git ~/.pyenv
+    echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
+    echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
+    echo -e 'if command -v pyenv 1>/dev/null 2>&1; then\n  eval "$(pyenv init -)"\nfi' >> ~/.bashrc
+    exec $SHELL
+    pyenv install 3.10
+    pyenv global 3.10
+
+Now, install the latest version of pip and pipenv:
+
+.. code-block:: bash
+
+    pip install --upgrade pip pipenv
+
+
+.. _dev_install_clone_git:
+
+--------------------
+Clone git repository
+--------------------
+
+Clone the VarFish Server repository and switch into the checkout.
+
+.. code-block:: bash
+
+    git clone --recursive https://github.com/varfish-org/varfish-server
+    cd varfish-server
+
+
+.. _dev_install_frontend_deps:
+
+-----------------------------
+Install Frontend Dependencies
+-----------------------------
+
+
+Execute the ``utils/install_frontend_os_dependencies.sh`` script to install OS package dependencies of Node/TypeScript packages.
+Essentially, this installs a current version of Node.js.
+The script was written for Ubuntu; you will have to adjust it for other operating systems.
+
+.. code-block:: bash
+
+    sudo bash utils/install_frontend_os_dependencies.sh
+
+Now, you can install the Node/TypeScript dependencies as follows:
+
+.. code-block:: bash
+
+    ## go into frontend directory
+    cd frontend
+    ## install Node/TypeScript dependencies
+    make deps
+
+.. _dev_prepare_frontend:
+
+-------------------------
+(Optional) Build Frontend
+-------------------------
+
+Execute the following command to build the frontend.
+This is not required because, during development, the Vite server creates the necessary files on the fly.
+
+.. code-block:: bash
+
+    ## go into frontend directory
+    cd frontend
+    ## build frontend
+    make serve
+
+
+.. _dev_serve_frontend:
+
+--------------
+Serve Frontend
+--------------
+
+You can now start the Vite server to serve the Vite/TypeScript-based frontend.
+Note that the frontend is not usable on its own because it is embedded into pages served by the backend.
+
+.. code-block:: bash
+
+    ## go into frontend directory
+    cd frontend
+    ## start server
+    make serve
+
+For the remainder of the installation steps, use a new terminal and keep the frontend server running.
+
+
+.. _dev_install_backend_deps:
+
+----------------------------
+Install Backend Dependencies
+----------------------------
+
+Execute the ``utils/install_backend_os_dependencies.sh`` script to install OS package dependencies of Python packages.
+The script was written for Ubuntu; you will have to adjust it for other operating systems.
+
+.. code-block:: bash
+
+    sudo bash utils/install_backend_os_dependencies.sh
+
+Now, you can install the Python dependencies as follows:
+
+.. code-block:: bash
+
+    ## go into backend directory
+    cd backend
+    ## setup pipenv environment
+    make deps
+
+Afterwards, you can either enter the Pipenv environment or directly run helper ``make`` commands.
+
+.. code-block:: bash
+
+    ## go into backend directory
+    cd backend
+    ## start pipenv shell
+    pipenv shell
+    ## OR
+    make lint
+    make format
+    make test
+
+
+.. _dev_install_setup_db:
+
+--------------
+Setup Database
+--------------
+
+Use the script provided in ``utils/`` to set up the database.
+When prompted, name the database ``varfish`` and create a new user (name: ``varfish``, password: ``varfish``).
+
+.. code-block:: bash
+
+    bash utils/setup_database.sh
+
+
+.. _dev_prepare_backend:
+
+---------------
+Prepare Backend
+---------------
+
+Next, create a ``backend/.env`` file with the following content.
+
+.. code-block:: bash
+
+    export DATABASE_URL="postgres://varfish:varfish@127.0.0.1/varfish"
+    export CELERY_BROKER_URL=redis://localhost:6379/0
+    export PROJECTROLES_ADMIN_OWNER=root
+    export DJANGO_SETTINGS_MODULE=config.settings.local
+
+To create the tables in the VarFish database, run the ``migrate`` command.
+This step can take a few minutes.
+
+.. code-block:: bash
+
+    ## go into backend directory
+    cd backend
+    ## run migrations
+    make migrate
+
+Once done, create a superuser for your VarFish instance.
+By default, the VarFish root user is named ``root`` (the setting can be changed in the ``.env`` file with the ``PROJECTROLES_ADMIN_OWNER`` variable).
+
+.. code-block:: bash
+
+    cd backend
+    pipenv run python manage.py createsuperuser
+
+Finally, download the icon sets for VarFish and make scripts, stylesheets and icons available.
+
+.. code-block:: bash
+
+    make geticons
+    make collectstatic
+
+
+.. _dev_database_import:
+
+---------------
+Database Import
+---------------
+
+First, download the pre-built database files that we provide and unpack them.
+Please make sure that you have enough space available.
+The packed file consumes 31 GB.
+When unpacked, it consumes an additional 188 GB.
+
+.. code-block:: bash
+
+    cd /plenty/space
+    wget https://file-public.bihealth.org/transient/varfish/varfish-server-background-db-20201006.tar.gz{,.sha256}
+    sha256sum -c varfish-server-background-db-20201006.tar.gz.sha256
+    tar xzvf varfish-server-background-db-20201006.tar.gz
+
+We recommend excluding the large databases: frequency tables, extra annotations and dbSNP.
+Also, keep in mind that importing the whole database takes more than 24 hours, depending on the speed of your disk.
+
+This is a list of the possible imports, sorted by size:
+
+=================== ==== ================== =============================
+Component           Size Exclude            Function
+=================== ==== ================== =============================
+gnomAD_genomes      80G  highly recommended frequency annotation
+extra-annos         50G  highly recommended diverse
+dbSNP               32G  highly recommended SNP annotation
+thousand_genomes    6.5G highly recommended frequency annotation
+gnomAD_exomes       6.0G highly recommended frequency annotation
+knowngeneaa         4.5G highly recommended alignment annotation
+clinvar             3.3G highly recommended pathogenicity classification
+ExAC                1.9G highly recommended frequency annotation
+dbVar               573M recommended        SNP annotation
+gnomAD_SV           250M recommended        SV frequency annotation
+ncbi_gene           151M                    gene annotation
+ensembl_regulatory  77M                     frequency annotation
+DGV                 43M                     SV annotation
+hpo                 22M                     phenotype information
+hgnc                15M                     gene annotation
+gnomAD_constraints  13M                     frequency annotation
+mgi                 10M                     mouse gene annotation
+ensembltorefseq     8.3M                    identifier mapping
+hgmd_public         5.0M                    gene annotation
+ExAC_constraints    4.6M                    frequency annotation
+refseqtoensembl     2.0M                    identifier mapping
+ensembltogenesymbol 1.6M                    identifier mapping
+ensembl_genes       1.2M                    gene annotation
+HelixMTdb           1.2M                    MT frequency annotation
+refseqtogenesymbol  1.1M                    identifier mapping
+refseq_genes        804K                    gene annotation
+mim2gene            764K                    phenotype information
+MITOMAP             660K                    MT frequency annotation
+kegg                632K                    pathway annotation
+mtDB                336K                    MT frequency annotation
+tads_hesc           108K                    domain annotation
+tads_imr90          108K                    domain annotation
+vista               104K                    orthologous region annotation
+acmg                16K                     disease gene annotation
+=================== ==== ================== =============================
+
+You can find the ``import_versions.tsv`` file in the root folder of the package.
+This file determines which components (called ``table_group`` and represented as folders in the package) are imported when the import command is issued.
+To exclude a table, simply comment out (``#``) or delete its line.
+Excluding tables that are not required for development can reduce time and space consumption.
+The GRCh38 tables can be excluded as well.
+
+A space-saving version of the file would look like this:
+
+.. code-block::
+
+    build table_group version
+    GRCh37 acmg v2.0
+    #GRCh37 clinvar 20200929
+    #GRCh37 dbSNP b151
+    #GRCh37 dbVar latest
+    GRCh37 DGV 2016
+    GRCh37 ensembl_genes r96
+    GRCh37 ensembl_regulatory latest
+    GRCh37 ensembltogenesymbol latest
+    GRCh37 ensembltorefseq latest
+    GRCh37 ExAC_constraints r0.3.1
+    #GRCh37 ExAC r1
+    #GRCh37 extra-annos 20200704
+    GRCh37 gnomAD_constraints v2.1.1
+    #GRCh37 gnomAD_exomes r2.1
+    #GRCh37 gnomAD_genomes r2.1
+    #GRCh37 gnomAD_SV v2
+    GRCh37 HelixMTdb 20190926
+    GRCh37 hgmd_public ensembl_r75
+    GRCh37 hgnc latest
+    GRCh37 hpo latest
+    GRCh37 kegg april2011
+    #GRCh37 knowngeneaa latest
+    GRCh37 mgi latest
+    GRCh37 mim2gene latest
+    GRCh37 MITOMAP 20200116
+    GRCh37 mtDB latest
+    GRCh37 ncbi_gene latest
+    GRCh37 refseq_genes r105
+    GRCh37 refseqtoensembl latest
+    GRCh37 refseqtogenesymbol latest
+    GRCh37 tads_hesc dixon2012
+    GRCh37 tads_imr90 dixon2012
+    #GRCh37 thousand_genomes phase3
+    GRCh37 vista latest
+    #GRCh38 clinvar 20200929
+    #GRCh38 dbVar latest
+    #GRCh38 DGV 2016
+
+To perform the import, issue:
+
+.. code-block:: bash
+
+    cd backend
+    pipenv run python manage.py import_tables \
+        --tables-path /plenty/space/varfish-server-background-db-20201006
+
+Running the import a second time automatically skips tables that are already
+imported. To re-import tables, add the ``--force`` parameter to the command:
+
+.. code-block:: bash
+
+    cd backend
+    pipenv run python manage.py import_tables \
+        --tables-path varfish-db-downloader --force
+
+
+.. _dev_run_server_celery:
+
+---------------------
+Run Server and Celery
+---------------------
+
+Now, open two terminals and start the VarFish server and the Celery worker.
+
+.. code-block:: bash
+
+    ## in terminal 1
+    make serve
+    ## in a separate terminal 2
+    make celery
+
+Continue the tutorial in a new terminal.
+
+
+.. _dev_install_backing_services:
+
+---------------------------
+Install Annotation Services
+---------------------------
+
+VarFish uses a number of internal annotation services that you need to install as well.
+The instructions below will provide you with a development subset that contains information on all genes but variant information for the genes BRCA1 and TGDS only.
+
+First, install Docker and Docker Compose `following the official manual `__.
+
+Then, install the ``s5cmd`` tool for downloading data later on.
+
+.. code-block:: bash
+
+    wget -O /tmp/s5cmd_2.1.0_Linux-64bit.tar.gz \
+        https://github.com/peak/s5cmd/releases/download/v2.1.0/s5cmd_2.1.0_Linux-64bit.tar.gz
+    tar -C /tmp -xf /tmp/s5cmd_2.1.0_Linux-64bit.tar.gz
+    sudo cp /tmp/s5cmd /usr/local/bin/
+
+Next, follow the `instructions in the varfish-docker-compose-ng README `__.
+
+.. code-block:: bash
+
+    ## clone
+    git clone https://github.com/varfish-org/varfish-docker-compose-ng.git
+
+    ## go into directory
+    cd varfish-docker-compose-ng
+
+    ## create volumes directories
+    mkdir -p .dev/volumes/{minio,varfish-static}/data
+    ## create secrets
+    mkdir -p .dev/secrets
+    echo password >.dev/secrets/db-password
+    echo postgresql://varfish:password@postgres/varfish >.dev/secrets/db-url
+    echo minio-root-password >.dev/secrets/minio-root-password
+    echo minio-varfish-password >.dev/secrets/minio-varfish-password
+    ## ensure that pwgen is installed first
+    pwgen
+    ## generate a 100 character secret
+    pwgen 100 1 >.dev/secrets/varfish-server-django-secret-key
+    ## copy environment file
+    cp env.tpl .env
+    ## copy docker-compose override file
+    cp docker-compose.override.yml-dev docker-compose.override.yml
+
+    ## setup some configuration
+    mkdir -p .dev/config/nginx
+    cp utils/nginx/nginx.conf .dev/config/nginx
+
+    ## download dev data
+    bash download-data.sh
+
+Now you can bring up the backing services using:
+
+.. code-block:: bash
+
+    docker compose up
+
+
+.. _dev_try_it_out:
+
+----------
+Try It Out
+----------
+
+You now have the system services Postgres and Redis running.
+You also have the frontend Vite development server, the backend Django server, and the Celery worker running.
+You can now try out VarFish by going to `localhost:8080 `__ and logging in with the superuser account you created above.
+
+
+.. _dev_install_windows:
+
+----------------------
+Installation (Windows)
+----------------------
+
+This setup was performed on a recent version of Windows 10 with Windows Subsystem for Linux version 2 (WSL2).
+
+
+.. _dev_install_windows_wsl2:
+
+Installation WSL2
+=================
+
+Follow `this tutorial <https://www.omgubuntu.co.uk/how-to-install-wsl2-on-windows-10>`__ to install WSL2.
+
+- The process is somewhat convoluted; start with ``wsl.exe --install``.
+- Then install the latest LTS Ubuntu (22.04) from the Microsoft Store.
+- Once complete, you will probably end up with WSL 1 (one!), which you can convert to version 2 (two!) with ``wsl --set-version Ubuntu-22.04 2`` or similar.
+- WSL2 has some advantages, including running a full Linux kernel, but its I/O to the NTFS Windows mount is even slower.
+- Everything that you do will be inside the WSL image.
+
+
+.. _dev_install_docker_desktop:
+
+Installation Docker Desktop
+===========================
+
+Follow the `Install Docker Desktop `__ instructions.
+Then, ensure that the Docker Engine is running.
+
+
+.. _dev_install_windows_os_deps:
+
+Install OS Dependencies
+=======================
+
+.. code-block:: bash
+
+    ## install dependencies
+    sudo apt install libsasl2-dev python3-dev libldap2-dev libssl-dev gcc make rsync
+    ## install postgres and redis
+    sudo apt install postgresql postgresql-server-dev-14 postgresql-client redis
+    ## start postgres, must be done after each WSL2 start
+    sudo service postgresql start
+    sudo service postgresql status
+    ## start redis, must be done after each WSL2 start
+    sudo service redis-server start
+    sudo service redis-server status
+    ## update postgres configuration and restart, only do this once
+    sudo sed -i -e 's/.*max_locks_per_transaction.*/max_locks_per_transaction = 1024 # min 10/' /etc/postgresql/14/main/postgresql.conf
+    sudo service postgresql restart
+
+Create a Postgres user ``varfish`` with password ``varfish`` and a database.
+
+.. code-block::
+
+    sudo -u postgres createuser -s -r -d varfish -P
+    [enter varfish as password]
+    sudo -u postgres createdb --owner=varfish varfish
+
+From here on, you can follow the instructions for the Linux installation, starting at :ref:`dev_install_python_pipenv`.
diff --git a/src/dev_notes.rst b/src/dev_notes.rst
new file mode 100644
index 0000000..553a033
--- /dev/null
+++ b/src/dev_notes.rst
@@ -0,0 +1,207 @@
+.. _dev_notes:
+
+=================
+Development Notes
+=================
+
+
+.. _dev_notes_sodar_core:
+
+-----------------------
+Working With SODAR Core
+-----------------------
+
+VarFish is based on the SODAR Core framework, which has its own `developer manual `_.
+It is worth reading its development instructions.
+The following lists the most important topics:
+
+- `Models `_
+- `Rules `_
+- `Views `_
+- `Templates `_
+  - `Icons `_
+- `Forms `_
+
+
+.. _dev_notes_tests:
+
+-------------
+Running Tests
+-------------
+
+Running the VarFish test suite is easy, but can take a long time to finish (more than 10 minutes).
+
+.. code-block:: bash
+
+    cd backend
+    make test
+    ## OR
+    cd frontend
+    make test
+    ## or from root
+    make test
+
+You can exclude time-consuming UI tests in the backend:
+
+.. code-block:: bash
+
+    cd backend
+    make test-noselenium
+
+If you are working on only a few tests, it is better to run them directly.
+To select a test, give the dotted module path to the test file, followed by the class name and the test method, all separated by dots:
+
+.. code-block:: bash
+
+    cd backend
+    pipenv run python manage.py test -v2 --settings=config.settings.test \
+        variants.tests.test_ui.TestVariantsCaseFilterView.test_variant_filter_case_multi_bookmark_one_variant
+
+This runs a single UI test of the case filter view in the variants app.
+
+To speed up your tests, you can use the ``--keepdb`` parameter.
+This will only run the migrations on the first test run.
+
+
+.. _dev_notes_style_linting:
+
+---------------
+Style & Linting
+---------------
+
+We use `black `__ for formatting Python code, `flake8 `__ for linting, and `isort `__ for sorting imports.
+To ensure that your Python code follows all restrictions and passes CI, use
+
+.. code-block:: bash
+
+    cd backend
+    ## run lint
+    make lint
+    ## run formatting
+    make format
+
+We use `prettier `__ for JavaScript formatting and `eslint `__ for linting the code.
+Similarly, you can use the following for the JavaScript/Vue code:
+
+.. code-block:: bash
+
+    cd frontend
+    ## run lint
+    make lint
+    ## run formatting
+    make format
+
+Or, all together (from the checkout root):
+
+.. code-block:: bash
+
+    ## run lint
+    make lint
+    ## run formatting
+    make format
+
+
+.. _dev_storybook:
+
+---------
+Storybook
+---------
+
+We use `Storybook.js `__ to develop Vue components in isolation.
+You can launch the Storybook server by calling:
+
+.. code-block:: bash
+
+    cd frontend
+    ## ensure dependencies are installed
+    make deps
+    ## run server
+    make storybook
+
+
+.. _dev_git:
+
+----------------
+Working With Git
+----------------
+
+This section briefly describes the workflow for contributing to VarFish.
+This is not a git tutorial and we expect basic knowledge.
+We recommend `gitready `_ for any questions regarding git.
+We use `git rebase `_ a lot.
+
+In general, we recommend working with ``git gui`` and ``gitk``.
+
+The first thing to do is to create a fork of our GitHub repository in your GitHub space.
+To do so, go to the `VarFish repository `_ and click on the ``Fork`` button in the top right.
+
+.. _dev_git_main:
+
+Update Main
+===========
+
+Also refer to `Pull with rebase on gitready `__
+
+.. code-block:: bash
+
+    git pull --rebase
+
+.. _dev_git_working_branch:
+
+Create Working Branch
+=====================
+
+Always create your working branch from the latest main branch.
+Use the ticket number and description as name, following the format ``-``, e.g.
+
+.. code-block:: bash
+
+    git checkout -b 123-adding-useful-feature
+
+
+.. _dev_git_commit_msg:
+
+Write A Sensible Commit Message
+===============================
+
+Each line of a commit message should be at most 72 characters long.
+As the first line represents the commit, it should summarize everything the commit does.
+Leave a blank line and then add three lines of GitHub directives that reference the issue.
+
+.. code-block::
+
+    Fixed serious bug that prevented user from doing x.
+
+    Closes: #123
+    Related-Issue: #123
+    Projected-Results-Impact: none
+
+
+.. _dev_git_single_commit_pr:
+
+Single Commit in PR
+===================
+
+Our GitHub repositories are configured to enforce squash commits.
+That is, all commits in a PR will be squashed into one.
+
+
+.. _dev_git_semantic_prs:
+
+Semantic Pull Requests
+======================
+
+We use semantic pull requests / `ConventionalCommits.org `__, enforced by this `GitHub Action `__.
+
+Use one of the following prefixes to get an entry in the README:
+
+- ``fix:`` - bug fix, bump patch version
+- ``feat:`` - feature, bump minor version
+
+The following do not create entries in the README:
+
+- ``ci:`` - continuous integration change
+- ``docs:`` - documentation
+- ``chore:`` - misc chore
+
+To force the latter to create an entry in the README, add ``Release-As: THE.NEXT.VERSION`` in the squash commit message.
diff --git a/src/dev_style.rst b/src/dev_style.rst
new file mode 100644
index 0000000..e6d6203
--- /dev/null
+++ b/src/dev_style.rst
@@ -0,0 +1,50 @@
+.. _dev_style:
+
+================
+Style Guidelines
+================
+
+This document contains a short summary of the coding style guidelines used in VarFish.
+
+
+.. _dev_style_rst:
+
+----------------
+reStructuredText
+----------------
+
+- follow the current state
+- put two blank lines before each heading
+- put each sentence on its own line (it is a semantic unit and should appear as such in revision control)
+- add labels to each section consisting of ``_${file_name}_${section_short}``
+- use double-underscore links to prevent collisions
+
+
+.. _dev_style_py:
+
+------
+Python
+------
+
+- see linting configuration in ``varfish-server/backend``
+- black code style, line length 100
+- use isort
+- use flake8
+
+
+.. _dev_style_ts:
+
+----------
+TypeScript
+----------
+
+- see ESLint configuration in ``varfish-server/frontend``
+
+.. _dev_style_rs:
+
+----
+Rust
+----
+
+- see `The Rust Style Guide `__
+- adhere to all stable clippy hints
diff --git a/src/index.rst b/src/index.rst
index fba94ab..d40e5e3 100644
--- a/src/index.rst
+++ b/src/index.rst
@@ -3,9 +3,30 @@
 VarFish Development Docs
 ========================
 
+This is the developer-aimed documentation for VarFish.
+The documentation aimed at end-users and operators/admins can be found at `varfish-server.rtd.io `__.
+
+The first section "Developer Documentation" provides hands-on documentation for developers.
+
+-----
+
 .. toctree::
     :maxdepth: 1
-    :caption: Documents
+    :caption: Developer Documentation
+
+    dev_install
+    dev_style
+    dev_notes
+
+-----
+
+The section "Overview Documents" provides high-level overview documents.
+
+-----
+
+.. toctree::
+    :maxdepth: 1
+    :caption: Overview Documents
 
     doc_architecture
     doc_dataflows
@@ -13,6 +34,12 @@ VarFish Development Docs
     doc_repooverview
     doc_datasources
 
+-----
+
+And finally, some miscellaneous documents.
+
+-----
+
 .. toctree::
     :maxdepth: 1
     :caption: Misc
diff --git a/src/misc_glossary.rst b/src/misc_glossary.rst
new file mode 100644
index 0000000..0251101
--- /dev/null
+++ b/src/misc_glossary.rst
@@ -0,0 +1,13 @@
+.. _misc_glossary:
+
+========
+Glossary
+========
+
+.. glossary::
+
+    Public Domain (Author Email)
+        License as public domain confirmed by author via email.
+
+    N/A (Emailed Author)
+        License not available, emailed author about it.