CONTRIBUTING.md

Contributing to the Kive Project

If you like this project and want to make it better, help out. You could report a bug, or pitch in with some development work.

Bug Reports and Enhancement Requests

Please create issue descriptions on GitHub. Be as specific as possible. Which version are you using? What did you do? What did you expect to happen? Are you planning to submit your own fix in a pull request?

Development

You will need to follow all the installation instructions in the INSTALL file, then open the source code in a Python IDE. You will also need to install some packages to run the tests.

pip install -r requirements-dev.txt

If you want to see what's currently being worked on, check out the active tasks in our milestones.

Performance Testing

It can be useful to track where time is spent when running a pipeline or a set of tests. Python comes with a profiler module:

python -m cProfile -s cumtime manage.py test --settings=kive.test_settings >timing.txt

Another option is to install the gprof2dot package with pip. Then you can generate a call graph with timing information:

python -m cProfile -o timing.dat manage.py test --settings=kive.test_settings \
&& (echo strip ; echo "sort cumtime" ; echo "stats 500") | python -m pstats timing.dat >timing.txt \
&& gprof2dot -f pstats timing.dat -o timing.dot
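
The same kind of report can also be produced from inside Python with the standard cProfile and pstats modules. The following is a minimal sketch, not part of Kive: busy_work and top_functions are hypothetical names, and a trivial function is profiled in place of the test suite.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical stand-in for the code you want to profile."""
    return sum(i * i for i in range(n))


def top_functions(profile, limit=10):
    """Return the pstats report, sorted by cumulative time, as text."""
    stream = io.StringIO()
    stats = pstats.Stats(profile, stream=stream)
    stats.strip_dirs().sort_stats("cumtime").print_stats(limit)
    return stream.getvalue()


profiler = cProfile.Profile()
profiler.enable()
busy_work(100_000)
profiler.disable()

report = top_functions(profiler)
print(report)
```

To read a saved profile instead, pass its file name to pstats.Stats, for example pstats.Stats("timing.dat").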

Deploying a Release

See the vagrant scripts for examples of how to start a production server. Once you have set up your production server, this is how to deploy a new release:

  1. Make sure the code works in your development environment. Run all the JavaScript tests and all the Django unit tests, or check that they ran successfully in the latest Travis CI build.

     cd /path/to/git/Kive
     npm run test:travis
     cd kive
     ./manage.py test --settings kive.settings_test_pg
    
  2. Check if the kiveapi package needs to update its version number by looking for new commits in the /api folder.

  3. Check that all the issues in the current milestone are closed.

  4. Check that all your code is committed to git, configure your settings file with a STATIC_ROOT of /path/to/git/Kive/static_root, and then build and package your static files.

     cd /path/to/git/Kive
     npm install
     cd kive
     ./manage.py collectstatic -c
     cd ..
     tar -czvf static_root.tar.gz static_root
     rm -rf static_root
    
  5. Create a release on GitHub. Use "vX.Y" as the tag, where X.Y matches the version on the milestone. If you have to redo a release, you can create additional releases with tags vX.Y.1, vX.Y.2, and so on. Mark the release as pre-release until you finish deploying it.

  6. Attach the static_root.tar.gz file to the release on GitHub. Attach it as a binary below the description, not as an attachment in the description.

  7. Check on the site that there are no active runs (as an administrator, go to the Runs page under the User portal, and click the lock to give yourself the ability to view all runs), then kill the fleet.

     ssh user@server
     ps aux|grep runfleet
     sudo kill -int <pid for runfleet>
    
  8. Stop the web server and scheduled jobs.

     sudo systemctl stop httpd
     sudo systemctl stop kive_purge.timer
     sudo systemctl stop kive_purge_synch.timer
    
  9. (Optional, but skip at your own peril!) Make a complete backup of the Kive installation. We use a tool called barman to back up PostgreSQL.

     sudo su barman
     barman backup main
     ./manage.py dumpdata --indent=4 > db_backup.json
    

    This covers everything stored in the database, including models and records automatically generated by Django itself. These files should be kept unchanged, as the records inside may be highly interdependent and any changes may cause system-wide problems.

    That doesn't cover everything in the system, however, as files tracked by Kive are stored on the filesystem, in the directory specified by MEDIA_ROOT in kive/settings.py. To preserve these files, make an exact copy of the following subdirectories:

    • CodeResources
    • Datasets
    • Logs
    • Sandboxes

    Do not restructure or rename anything in these folders: the file paths are stored in the database, so it's important to not let the files' actual locations become desynchronized from the stored locations.

    We wrote some rsync scripts to do the backup (not provided in the repo):

     sudo su kivefleet
     cd ~/bin
     ./backup_coderesources && ./backup_datasets && ./backup_logs && ./backup_sandboxes
    

    If you ever need to restore this backup, see "Restoring the system after something's gone wrong".

  10. Get the code from GitHub onto the server.

     ssh user@server
     cd /usr/local/share/Kive/kive
     sudo chgrp -R kive .. # Do this if other users also deploy.
     git fetch
     git checkout tags/vX.Y
    
  11. Check if you need to set any new environment variables by running diff kive/settings_default.py kive/settings.py. Then copy settings_default.py over settings.py.

  12. Upgrade dependencies if requirements.txt has changed. This assumes that you have activated a virtual environment.

      ssh user@server
      cd /usr/local/share/Kive
      sudo /opt/venv_kive/bin/pip install -r requirements.txt
    
  13. Migrate the database as described in the Creating Database Tables section of INSTALL.md, and deploy the static files:

     ssh user@server
     cd /usr/local/share/Kive/kive
     ./manage.py migrate
     echo $KIVE_STATIC_ROOT
     cd /path/to/static/..
     sudo rm -Rf static
     sudo wget https://github.com/cfe-lab/Kive/releases/download/vX.Y/static_root.tar.gz -O static_root.tar.gz
     sudo tar -xzvf static_root.tar.gz
     sudo mv static_root static
     sudo rm static_root.tar.gz
    
  14. Launch the fleet. This assumes that kiveuser is the user account used to run the fleet. The -l option to su runs that user's login scripts, which should activate any needed virtual environment:

     sudo su -l kiveuser
     cd /usr/local/share/Kive/kive
     ./manage.py runfleet </dev/null &
    
  15. Start the web server and scheduled jobs.

    sudo systemctl start httpd
    sudo systemctl start kive_purge.timer
    sudo systemctl start kive_purge_synch.timer
    
  16. Update the Kive API library if needed.

    cd /usr/local/share/Kive/api
    cat setup.py  # look at the new version number
    pip show kiveapi  # compare with the version installed in Python 2
    sudo python setup.py install  # if needed
    pip3 show kiveapi  # compare with the version installed in Python 3
    sudo python3 setup.py install  # if needed
    
  17. When the release is stable, remove the pre-release flag from the release. Check that it's included on the Zenodo page. If you included more than one tag in the same release, the new tags have not triggered Zenodo versions. Edit the release on GitHub, copy the description text, download the static_root archive, update the release, then click the Delete button. Then create a new release with the same description and static_root, and that will trigger a Zenodo version.

  18. Close the milestone for this release, create one for the next release, and decide which issues you will include in that milestone.
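
The packaging in step 4 and the unpacking in step 13 can be sketched in Python with the standard tarfile module. This is an illustrative sketch only, not a script from the Kive repository: package_static and deploy_static are hypothetical helpers, and the demonstration runs against a temporary directory rather than a real static_root.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def package_static(static_root, archive_path):
    """Pack static_root into a gzipped tar, like `tar -czf` in step 4."""
    static_root = Path(static_root)
    with tarfile.open(str(archive_path), "w:gz") as archive:
        # arcname keeps only the top-level folder name inside the archive.
        archive.add(str(static_root), arcname=static_root.name)
    return Path(archive_path)


def deploy_static(archive_path, parent_dir, target_name="static"):
    """Unpack the archive in parent_dir and rename it, like step 13."""
    parent_dir = Path(parent_dir)
    target = parent_dir / target_name
    if target.exists():
        shutil.rmtree(target)  # same effect as `rm -Rf static`
    with tarfile.open(str(archive_path), "r:gz") as archive:
        archive.extractall(str(parent_dir))
    (parent_dir / "static_root").rename(target)
    return target


# Demonstration with throwaway directories (all paths are made up).
with tempfile.TemporaryDirectory() as work:
    work = Path(work)
    source = work / "build" / "static_root"
    (source / "css").mkdir(parents=True)
    (source / "css" / "site.css").write_text("body { margin: 0; }")

    archive = package_static(source, work / "static_root.tar.gz")
    server_dir = work / "server"
    server_dir.mkdir()
    deployed = deploy_static(archive, server_dir)
    deployed_files = sorted(p.name for p in deployed.rglob("*") if p.is_file())
    print(deployed_files)
```

Only extract archives you built yourself or downloaded from a release you trust, since extractall writes whatever paths the archive contains.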

Restoring the system after something's gone wrong

If something goes wrong with the system, you can restore it using the backups created in step 9 of the "Deploying a Release" section.

To do this, you must first return the system to the version it was at when the backup was made.
If you can use reverse migrations, then that's preferable, but depending on the specific migrations, this may fail. If so, then you can drop and re-create the database as per the instructions in INSTALL.md. From there, you can migrate forwards back to the state the system was in at the time of the backup.

Then, you can restore the data. First, flush the database:

./manage.py flush

(respond "yes" when it asks you whether to proceed or not). This will remove everything from the database, including records automatically generated by Kive or Django; we want to get rid of these as they may clash with the data in the backup. If your database backup file was called db_backup.json, you can then call

./manage.py loaddata db_backup.json

Lastly, clear out the four data subdirectories (CodeResources, Datasets, Logs, and Sandboxes), and replace them with your backed-up versions.
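
This last step can be sketched as follows. It is a hypothetical helper, not a script shipped with Kive: restore_media, backup_root, and media_root are names chosen for illustration, and the backup is assumed to contain the four subdirectories directly beneath it.

```python
import shutil
import tempfile
from pathlib import Path

DATA_SUBDIRS = ("CodeResources", "Datasets", "Logs", "Sandboxes")


def restore_media(backup_root, media_root):
    """Replace each data subdirectory in media_root with its backup copy."""
    backup_root, media_root = Path(backup_root), Path(media_root)
    # Check that every backup folder is present before deleting anything.
    missing = [name for name in DATA_SUBDIRS
               if not (backup_root / name).is_dir()]
    if missing:
        raise FileNotFoundError(f"backup is missing: {', '.join(missing)}")
    for name in DATA_SUBDIRS:
        target = media_root / name
        if target.exists():
            shutil.rmtree(target)  # clear out the current contents
        shutil.copytree(backup_root / name, target)


# Demonstration with throwaway directories (all paths are made up).
with tempfile.TemporaryDirectory() as work:
    backup, media = Path(work, "backup"), Path(work, "media")
    for name in DATA_SUBDIRS:
        (backup / name).mkdir(parents=True)
        (media / name).mkdir(parents=True)
    (backup / "Datasets" / "good.csv").write_text("a,b\n1,2\n")
    (media / "Datasets" / "stale.csv").write_text("old")

    restore_media(backup, media)
    restored = sorted(p.name for p in (media / "Datasets").iterdir())
    print(restored)
```

Note that shutil.copytree copies permissions and timestamps but not file ownership, so a tool such as rsync may still be the better choice on a production server.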

Unit tests

To run all the unit tests, run ./manage.py test. Note that running the full test suite can take around half an hour.

The front-end tests are run separately. Run node tests-server.node.js in the root directory, which should open your browser. In the directory view that appears, run both SpecRunner.html and SpecRunner_SystemJS.html.

Faster unit tests

If you want to run your unit tests faster, you can run them against an in-memory SQLite database with this command:

./manage.py test --settings kive.settings_test

This also reduces the amount of console output produced by the testing.

That still takes several minutes to run, so you may want to run a subset of the fastest tests: the mock tests. These tests don't access a database, so they are extremely fast. You can run them all with this command:

./manage.py test --settings kive.settings_mocked
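
To illustrate why tests of this kind are fast, here is a generic sketch using the standard unittest.mock module. It does not use Kive's own models or mocking helpers; count_active_runs and the run_source object are hypothetical.

```python
import unittest
from unittest import mock


def count_active_runs(run_source):
    """Hypothetical function under test: counts runs that are not finished."""
    return sum(1 for run in run_source.all() if not run.is_finished)


class CountActiveRunsTest(unittest.TestCase):
    def test_counts_only_unfinished_runs(self):
        # The mock stands in for a database-backed query object.
        run_source = mock.Mock()
        run_source.all.return_value = [
            mock.Mock(is_finished=True),
            mock.Mock(is_finished=False),
            mock.Mock(is_finished=False),
        ]

        self.assertEqual(count_active_runs(run_source), 2)
        run_source.all.assert_called_once_with()


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountActiveRunsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the query object is replaced by an in-memory stand-in, no database has to be created or torn down for the test.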

Tests run against a SQLite database may behave slightly differently from tests run against PostgreSQL, so you should occasionally run the tests with the default settings. Alternatively, to run the tests with all the default settings but with reduced console output:

./manage.py test --settings kive.settings_test_pg

All of these options disable some system tests that cover the parts of the Pipeline execution code that use Slurm. To enable the entire suite of tests:

./manage.py test --settings kive.settings_test_pg_slurm

These tests should be run before making a new release.

See the Django documentation for details on running specific tests.

If you want to time your unit tests to see which ones are slowest, install the unittest-xml-reporting package.

sudo pip install unittest-xml-reporting

Then add these two lines to settings.py:

TEST_RUNNER = 'xmlrunner.extra.djangotestrunner.XMLTestRunner'
TEST_OUTPUT_DIR = '/path/to/git/Kive/utils'

Finally, run the unit tests and the script to summarize them.

./manage.py test --settings kive.settings_test
./slow_test_report.py
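
For reference, the kind of summary such a script produces can be sketched as below. This is not the repository's slow_test_report.py; it is a hypothetical reader that assumes the runner writes JUnit-style XML files whose testcase elements carry classname, name, and time attributes.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def slowest_tests(report_dir, limit=5):
    """Return (seconds, test id) pairs for the slowest tests in report_dir."""
    timings = []
    for report in Path(report_dir).glob("*.xml"):
        for case in ET.parse(report).getroot().iter("testcase"):
            test_id = f"{case.get('classname')}.{case.get('name')}"
            timings.append((float(case.get("time", 0)), test_id))
    return sorted(timings, reverse=True)[:limit]


# Demonstration with a made-up report file.
SAMPLE = """<testsuite name="demo" tests="3">
  <testcase classname="demo.FastTest" name="test_quick" time="0.012"/>
  <testcase classname="demo.SlowTest" name="test_pipeline" time="8.500"/>
  <testcase classname="demo.SlowTest" name="test_sandbox" time="2.250"/>
</testsuite>"""

with tempfile.TemporaryDirectory() as report_dir:
    Path(report_dir, "TEST-demo.xml").write_text(SAMPLE)
    slowest = slowest_tests(report_dir, limit=2)

for seconds, test_id in slowest:
    print(f"{seconds:8.3f}s  {test_id}")
```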

Updating test fixtures

Fixtures are a Django feature that allows test data to be persistently stored during development, so that it does not have to be regenerated every time a unit test is run. This is especially convenient for unit tests that involve actually running a pipeline, which can take a long time.

The fixtures which are used in our unit tests are created with the custom command ./manage.py update_test_fixtures. The code that is executed for this command can be found in portal/management/commands/update_test_fixtures.py, and the fixture files are in portal/fixtures. If this code or any functions it calls are modified, the fixtures will need to be re-created by running update_test_fixtures again.

Updating TypeScript and Sass files

Kive uses Webpack to bundle JavaScript files. These bundles are generated on install, and should not be committed to the repository.

Be sure to adhere to Kive's TypeScript style by running grunt tslint on your contributions.

Run both the Webpack and Sass watchers simultaneously with npm run watch:all.

To debug TypeScript code in the browser, change the devtool setting in webpack.config.js.

Updating embedded icon files

Some icons are stored inside JavaScript files as base64-encoded strings that describe PNG images.

Find the original icon in raw_assets in the project root. Make your modifications and then run grunt pngicons to compile them into the JavaScript files. This command is also run on install.

* Recommended: If pngquant is available on your system, Grunt will use it to compress the icons.
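
As background on the encoding involved, the sketch below shows how PNG bytes become a base64 string and back. It is not the grunt pngicons task; the png_to_data_uri helper and the data-URI wrapping are assumptions made for illustration, and the sample bytes are only a PNG signature followed by filler rather than a complete image.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def png_to_data_uri(png_bytes):
    """Encode PNG bytes as a base64 data URI string."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def data_uri_to_png(data_uri):
    """Decode the data URI back to the original bytes."""
    _header, encoded = data_uri.split(",", 1)
    return base64.b64decode(encoded)


sample = PNG_SIGNATURE + b"example payload"
uri = png_to_data_uri(sample)
print(uri[:30])
```

Base64 output is roughly a third larger than the input bytes, which is one reason compressing the icons first is worthwhile.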