
User data management

Patrick GENDRE edited this page Apr 30, 2021 · 6 revisions

DRAFT (WORK IN PROGRESS)

User data in e-mission:

See the e-mission architecture.
On the server, the data is stored in a MongoDB instance, in the following collections:

  • Stage_uuids is a table with user names and identifiers (uuid)
  • Stage_profiles describes some technical data attached to each uuid: the phone OS, the phone's token (used for calling the server), and the duration between each sync
  • Stage_usercache contains the users' phone tracking data not yet processed by the pipeline (sometimes useful for debugging)
  • Stage_timeseries: the raw data
  • Stage_timeseries-error: in practice there is little data here, such as geofence errors
  • Stage_pipeline-state: for each user and for each pipeline stage, 3 fields: the timestamp of the current run (curr_run_ts: only set while the stage is running, so it should usually be null), the last time the stage was run (last_ts_run), and the last datetime of the processed data (last_processed_ts)
  • Stage_analysis_timeseries: the processed data (see the description of the data model, in French)
  • a Stage_surveys collection was added for processing LimeSurvey data in our "traceur" fabmob version in 2019, but it is not used in the current tracemob version deployed for the Agremob project in La Rochelle
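
As an illustration of how the three Stage_pipeline-state fields can be read, here is a minimal Python sketch. It assumes the fields hold Unix timestamps in seconds (or null); the function name and the example document are hypothetical and are not part of e-mission.

```python
from datetime import datetime, timezone
from typing import Optional


def describe_pipeline_state(state: dict) -> str:
    """Summarize one pipeline-state document in a single line.

    Assumption: the three fields are epoch seconds, or None when unset.
    """
    def fmt(ts: Optional[float]) -> str:
        if ts is None:
            return "never"
        return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()

    # curr_run_ts is only set while the stage is running
    status = "idle" if state.get("curr_run_ts") is None else "running"
    return (
        f"{status}; last run: {fmt(state.get('last_ts_run'))}; "
        f"data processed up to: {fmt(state.get('last_processed_ts'))}"
    )


# Made-up document, for illustration only
example = {
    "curr_run_ts": None,
    "last_ts_run": 1619776800.0,
    "last_processed_ts": 1619773200.0,
}
summary = describe_pipeline_state(example)
print(summary)
```

A stage that stays in the "running" status for a long time may indicate a pipeline run that was interrupted, which is worth checking before exporting or merging data.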

How to export one user's data

This could be used for a GDPR request. It would give the user ALL of his/her data, not only the trip timelines (which the app already provides).

In the e-mission-server code base there is a bin directory with several useful back-office scripts for managing the application, in particular in the debug subdirectory.
The extract_timeline_for_day_range_and_user.py script generates two gzip-compressed files: one with the actual JSON data (timeseries and analysis_timeseries), the other with the pipeline state data, for the user identified by his/her token (the email field in the uuids collection) or by his/her uuid.
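
To check what an export contains before handing it to the user, the data file can be opened with the Python standard library alone. The sketch below assumes the file holds a JSON array of entries, each with a metadata.key field. The helper names, the file name and the sample entries are hypothetical; the demo writes a tiny made-up file and reads it back.

```python
import gzip
import json
import os
import tempfile
from collections import Counter


def load_export(path):
    """Read a gzip-compressed JSON file and return the decoded content."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def count_by_key(entries):
    """Count entries per metadata.key (assumed layout of each entry)."""
    return Counter(e.get("metadata", {}).get("key", "unknown") for e in entries)


# Round-trip demo on a made-up export (illustrative keys, empty data)
sample = [
    {"metadata": {"key": "background/location"}, "data": {}},
    {"metadata": {"key": "background/location"}, "data": {}},
    {"metadata": {"key": "analysis/cleaned_trip"}, "data": {}},
]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample_export.gz")
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(sample, f)
    counts = count_by_key(load_export(path))

print(counts)
```

On a real export, calling count_by_key(load_export(...)) with the path of the generated data file gives a quick overview of how many raw and processed entries were extracted.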

How to delete one user's data

Use purge_user.py with the -p option (--pipeline-purge)

How to import one user's data

The same data exported from the database can be imported, too, with the load_multi_timeline_for_range.py script.

How to merge two data sets

In the tracemob app, the user doesn't have a true "account" with user / password credentials. When he/she first uses the app, an automatically generated token is associated with his/her data. If he/she uninstalls the app, or uses a new phone, a new token will be generated when he/she installs the tracemob app again. Also, if he/she doesn't use the app for 30 days, the user data will be deleted from the server.
So in the tracemob project, it may happen that a beta-tester in La Rochelle asks to merge data collected by the app under a previous token into the current data. Here is the procedure to follow:

  • check that the former data is strictly before the current data, to avoid any complication: i.e. the last time series entry of the former data is before the first time series entry of the current data
  • find the uuids for the former and current tokens
  • add only the former data, in the timeseries_analysis collection, directly with a MongoDB query (below)
  • finally, purge the former token/user data

db.getCollection('Stage_timeseries').updateMany({user_id:LUUID("formeruuid")},{$set: {user_id:LUUID("currentuuid")}})
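
The first check of the procedure (former data strictly before current data) can be sketched in Python once the entries of both users have been loaded, for example from two exports. This is a minimal sketch: it assumes each entry carries a metadata.write_ts timestamp in epoch seconds, and the function names and sample values are hypothetical.

```python
def time_range(entries):
    """Return (first, last) write timestamp of a non-empty list of entries."""
    stamps = [e["metadata"]["write_ts"] for e in entries]
    return min(stamps), max(stamps)


def is_strictly_before(former_entries, current_entries):
    """True if every former entry is older than every current entry."""
    if not former_entries or not current_entries:
        # Nothing to compare, so the two data sets cannot overlap
        return True
    _, former_last = time_range(former_entries)
    current_first, _ = time_range(current_entries)
    return former_last < current_first


# Made-up entries, for illustration only
former = [{"metadata": {"write_ts": 100.0}}, {"metadata": {"write_ts": 200.0}}]
current = [{"metadata": {"write_ts": 300.0}}, {"metadata": {"write_ts": 400.0}}]
overlapping = [{"metadata": {"write_ts": 150.0}}]

print(is_strictly_before(former, current))
print(is_strictly_before(former, overlapping))
```

If the check fails, the two data sets overlap in time and the simple user_id update above is probably not safe to run as is.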