e-mission/e-mission-eval-private-data

This repository contains IPython notebooks for evaluating the e-mission platform. The notebooks reuse code from the e-mission-server codebase, so the server code must be installed and available when running them.

Running

  1. Install and set up the e-mission server by following the instructions at https://github.com/e-mission/e-mission-server

  2. Set the home environment variable

    $ export EMISSION_SERVER_HOME=<path_to_emission_server_repo>
    

    To verify, check the environment variables using

     $ env
    

    and ensure EMISSION_SERVER_HOME is present in the list and set to the path you exported above.

  3. If you have not set up the evaluation system before, set it up

    $ source setup.sh
    
  4. If you have already set it up, activate it

    $ source activate.sh
    
  5. Change to the folder with the visualizations of interest and copy the config over. The <eval_folder> mentioned below can be any folder containing notebooks and/or .py files for visualisation or other purposes. For example, TRB_label_assist is one such folder.

    $ cd <eval_folder>
    $ cp -r ../conf .

  6. Start the notebook server

    $ ../bin/em-jupyter-notebook.sh
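
The check in step 2 can also be scripted, which avoids scanning the full `env` output by eye. The following is a minimal sketch, assuming a POSIX-compatible shell. The function name `check_server_home` is hypothetical and is not part of this repository.

```shell
# Hypothetical helper (not part of this repository): checks that
# EMISSION_SERVER_HOME is set and points to an existing directory.
check_server_home() {
    if [ -z "${EMISSION_SERVER_HOME:-}" ]; then
        echo "EMISSION_SERVER_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$EMISSION_SERVER_HOME" ]; then
        echo "EMISSION_SERVER_HOME is not a directory: $EMISSION_SERVER_HOME" >&2
        return 1
    fi
    # Print the value so the path can be compared with the one exported in step 2.
    echo "EMISSION_SERVER_HOME=$EMISSION_SERVER_HOME"
}
```

After defining the function in the current shell, run `check_server_home` before `source setup.sh` or `source activate.sh`. A non-zero exit status means the variable still needs to be exported.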

Loading data

Cleaning up

After completing the analysis, tear down the evaluation system

$ source teardown.sh

Checking in notebooks

Note that all notebooks checked in here are completely public. All results included in them can be viewed by anybody, even malicious users. Therefore, you need to split your analysis into two groups:

  • aggregate only: results are not specific to a single user. The scripts in such notebooks should not include uuids, and should use the aggregate timeseries instead of the default timeseries.
    • example: number of walking and biking trips over all users in the control group
  • individual analyses: results are specific to a single user. The scripts in such notebooks can include uuids, and potentially even user emails or tokens.
    • example: variation in walking and biking trips over time for user uuid1

Notebooks that include aggregate analyses can be checked in with outputs included. This is because it is hard to tease out the contributions by individuals to the aggregate statistics, and so the chances of leaking information are low. However, notebooks that include individual analyses should be checked in after deleting all outputs (Kernel -> Restart and clear output).

                          Aggregate results    Individual results
with outputs              Y                    N
after clearing outputs    Y                    Y
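
Before checking in individual-analysis notebooks, it may help to confirm that no outputs remain. The sketch below lists notebooks that still contain stored cell outputs by searching for the `"output_type"` key, which each saved output carries in the notebook JSON (nbformat 4). It assumes a POSIX-compatible shell with `find` and `grep`. The function name `notebooks_with_outputs` is hypothetical and is not part of this repository.

```shell
# Hypothetical helper (not part of this repository): prints the path of every
# notebook under the given directory (default: current directory) that still
# contains stored cell outputs.
notebooks_with_outputs() {
    dir="${1:-.}"
    # Skip Jupyter checkpoint copies; grep -l prints only matching file names.
    # "|| true" keeps the exit status at 0 when no notebook has outputs.
    find "$dir" -name '*.ipynb' ! -path '*/.ipynb_checkpoints/*' \
        -exec grep -l '"output_type"' {} + || true
}
```

Running `notebooks_with_outputs <eval_folder>` should print nothing for a folder of individual analyses that is ready to be checked in. Any notebook it lists can be cleared in the notebook UI (Kernel -> Restart and clear output). Recent versions of nbconvert also offer `jupyter nbconvert --clear-output --inplace <notebook>`, assuming nbconvert is available in the environment.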
