This report is identical to the "sites" report (number of sites logged in), with one difference. The "sites" report lets users switch the graph between "area" and "line" mode; this report loads in "line" mode and hides the option to switch. Area mode is cumulative (segments are stacked on top of each other), which makes no sense for the fractions this report displays. (E.g., with a segmentation into three browsers, each at 0.5 success, the cumulative graph would show 1.5 = 150% success, which is nonsensical.) Closes #15
The report is also converted to display the mean number of sites logged in instead of the median (closes #29). The reasoning, from #29:

Report #1 is the median number of sites a user logs into with Persona. As part of migrating to CouchDB as the backend (#27), finding the median of the data series becomes a significantly harder technical problem. (Doing it in a map/reduce framework requires a quick-select algorithm, and there doesn't seem to be a good way to do that in CouchDB.)

Alternatively, the median value for each day could be precalculated when data arrives and then stored in the database. However, this would require either a new database (cumbersome) or a change to the data format and code of the current one (very undesirable).

Calculating the mean of the dataset, however, is much easier. The median is the more sensible value to look at (it is less sensitive to outliers), but it was previously agreed that this entire report is not hugely meaningful. The median value by itself doesn't really say anything; the only way we'd use it is to watch the number and hope it trends up. For that purpose the mean is just about as good: we can watch its trend in the same way.
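To illustrate why the mean fits a map/reduce framework, here is a minimal sketch in Python (not the actual CouchDB view code). The field names "day" and "sites_logged_in" and the sample documents are made up for illustration. The point is that partial (sum, count) pairs can be combined in any grouping, which is not true of medians.

```python
def map_doc(doc):
    # Emit a (key, value) pair: the day, and a partial (sum, count).
    # "day" and "sites_logged_in" are hypothetical field names.
    return doc["day"], (doc["sites_logged_in"], 1)


def combine(a, b):
    # Merge two partial results. The operation is associative and
    # commutative, so it does not matter how the framework groups inputs.
    return (a[0] + b[0], a[1] + b[1])


def mean_per_day(docs):
    partials = {}
    for doc in docs:
        day, pair = map_doc(doc)
        partials[day] = combine(partials.get(day, (0, 0)), pair)
    # Only the final step divides; intermediate steps carry sum and count.
    return {day: total / count for day, (total, count) in partials.items()}


# Made-up sample data, for illustration only.
docs = [
    {"day": "day-1", "sites_logged_in": 1},
    {"day": "day-1", "sites_logged_in": 2},
    {"day": "day-1", "sites_logged_in": 6},
    {"day": "day-2", "sites_logged_in": 4},
]
means = mean_per_day(docs)
```

A median has no equivalent of combine(): the median of two groups cannot be derived from the medians of each group alone, which is why it would need a selection algorithm over the full data.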
Segments in the "sites logged in" and "sign-in attempts" reports were not shown again after being toggled off and then back on. The cause: while updating the displayed segments, the series in the report object were overwritten with the "filtered" series (the ones with the toggled-off segment removed), so the removed data was lost permanently. To fix this, the full (unfiltered) series are stored in a temporary variable before the graph is updated and restored afterwards. This regression was introduced in fa809ba.
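The save-and-restore fix can be sketched roughly as follows. This is an illustration in Python, not the project's actual code: the Report class, draw() and update_displayed_segments() are hypothetical names, and draw() merely stands in for the real graph update.

```python
class Report:
    def __init__(self, series):
        # Mapping of segment name -> list of data points.
        self.series = series


def draw(report):
    # Stand-in for the graph update: returns the segment names it would draw.
    return sorted(report.series)


def update_displayed_segments(report, hidden):
    # Keep a reference to the full (unfiltered) series before filtering.
    full_series = report.series
    try:
        report.series = {
            name: points
            for name, points in full_series.items()
            if name not in hidden
        }
        return draw(report)
    finally:
        # Restore the full series so a hidden segment can be shown again.
        report.series = full_series


# Made-up sample data, for illustration only.
report = Report({"Firefox": [1, 2], "Chrome": [3, 4]})
after_toggle_off = update_displayed_segments(report, {"Chrome"})
after_toggle_on = update_displayed_segments(report, set())
```

Without the restore step, the second call would only ever see the "Firefox" series, which matches the behavior described above.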
"Known" means the ones listed in the config file. Fixes the way we populate the database to be consistent with how data aggregation used to happen.
Segmentation by OS
Prior process: On every request, the data was downloaded, the report was prepared, and the result was sent back to the user.

Now: Data is downloaded and stored in CouchDB separately from the server. (This is done manually for now, using the script server/bin/update.) The server sets up views in CouchDB that map/reduce the data into a form ready for the report. On user request, the data is retrieved from the appropriate view and sent back.

Benefit: Data download and report calculation only need to happen once, rather than on every request.

Scope: This commit only switches over the "new user flow over time" report.

This is the first major part of implementing #10.
This means:
- The fake data server generates data using that format.
- The data downloaded from the server is expected to be in that format too.

The difference: KPIggybank's format wraps the payload of KPI data in an object that has an ID and holds the data itself under the "value" field. (This is how the data is extracted from CouchDB.)
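As a rough sketch of what the wrapping implies for consumers of the data: only the "value" field name comes from the description above. The "id" key name, the payload contents and the unwrap helpers are assumptions made for illustration.

```python
def unwrap(record):
    # The KPI payload sits under the "value" field of the wrapper object.
    return record["value"]


def unwrap_all(records):
    # Strip the wrapper from every record, keeping only the payloads.
    return [unwrap(record) for record in records]


# Made-up records; "id" and "example_field" are hypothetical names.
wrapped = [
    {"id": "a1", "value": {"example_field": 2}},
    {"id": "b2", "value": {"example_field": 5}},
]
payloads = unwrap_all(wrapped)
```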
To make the new user flow over time report (336dfa1) less noisy.