# Post-deployment QA: Wikitext no-JS instrumentation

[Task](https://phabricator.wikimedia.org/T281409)

In [23]:
shhh <- function(expr) suppressPackageStartupMessages(suppressWarnings(suppressMessages(expr)))
shhh({
    library(tidyverse); library(wmfdata)
})

# Post-Deployment QA

## Confirm that events are being logged in VisualEditorFeatureUse (VEFU) as expected

In [19]:
# collect May VEFU events
query <-
"
SELECT
    TO_DATE(dt) as `date`,
    event.editingSessionID AS session_id,
    event.integration AS integration,
    event.editor_interface AS interface,
    event.platform AS platform,
    event.user_id AS user_id,
    event.feature AS feature,
    event.action AS action,
    wiki AS wiki,
    COUNT(*) AS num_events
FROM event.visualeditorfeatureuse
WHERE
    YEAR = 2021
    AND MONTH = 05
    AND useragent.is_bot = false 
GROUP BY
    TO_DATE(dt),
    event.editingSessionID,
    event.integration,
    event.editor_interface,
    event.feature,
    event.action,
    event.user_id,
    event.platform,
    wiki
"



In [20]:
wikitext_js_edits <-  wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



In [61]:
# convert date
wikitext_js_edits$date <- as.Date(wikitext_js_edits$date)

## By Date

In [34]:
# Daily events
wikitext_js_edits_bydate <- wikitext_js_edits %>%
    filter(action == 'source-has-js') %>%
    group_by(date) %>%
    summarize(n_events = sum(num_events),
             n_sessions = n_distinct(session_id))

wikitext_js_edits_bydate

`summarise()` ungrouping output (override with `.groups` argument)



date,n_events,n_sessions
<chr>,<int>,<int>
2021-05-12,82,82
2021-05-13,12327,12237


We have recorded a total of 12,409 events and 12,319 sessions by users without JS to date (12 May to 13 May 2021). We started recording events on 12 May 2021, following the deployment of the fix to the instrumentation. This QA was run towards the end of 13 May 2021, before a full day of events had been recorded.
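As a quick cross-check of the totals, the daily rows can be summed directly. This is a minimal sketch in base R with the counts copied from the table above; the `daily`, `total_events`, and `total_sessions` names are illustrative and not part of the original notebook. Summing daily session counts assumes no session spans both days, which is consistent with the 12,319 distinct sessions reported in the by-feature table below.

```r
# Daily counts copied from the table above.
daily <- data.frame(
  date       = as.Date(c("2021-05-12", "2021-05-13")),
  n_events   = c(82L, 12327L),
  n_sessions = c(82L, 12237L)
)

# Totals across both days.
total_events   <- sum(daily$n_events)    # 12409
total_sessions <- sum(daily$n_sessions)  # 12319 (assumes no session spans both days)

format(c(events = total_events, sessions = total_sessions), big.mark = ",")
```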

## Check that events are only recorded for the `mwSave` feature

In [31]:
wikitext_js_edits_byfeature <- wikitext_js_edits %>%
    filter(action == 'source-has-js') %>%
    group_by(feature) %>%
    summarize(n_events = sum(num_events),
             n_sessions = n_distinct(session_id))

wikitext_js_edits_byfeature

`summarise()` ungrouping output (override with `.groups` argument)



feature,n_events,n_sessions
<chr>,<int>,<int>
mwSave,12409,12319


Confirmed that events are only recorded for the mwSave feature.
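This check could also be made explicit so that it fails loudly if another feature ever appears. The sketch below is base R and uses the single summary row from the table above; `by_feature` and `unexpected` are illustrative names, and in the notebook the same assertion could presumably be run against `wikitext_js_edits_byfeature`.

```r
# Feature-level summary copied from the table above.
by_feature <- data.frame(
  feature    = "mwSave",
  n_events   = 12409L,
  n_sessions = 12319L,
  stringsAsFactors = FALSE
)

# Any feature other than mwSave is unexpected for source-has-js events.
unexpected <- setdiff(unique(by_feature$feature), "mwSave")

# Stop with an error if an unexpected feature is present.
stopifnot(length(unexpected) == 0)
```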

## All mwSave Actions during same time period

In [64]:
mwsave_events <- wikitext_js_edits %>%
    filter(feature == 'mwSave',
          date == '2021-05-13') %>% # first near-complete day of no-JS events
    group_by(action) %>%
    summarize(n_events = sum(num_events),
             n_sessions = n_distinct(session_id))

mwsave_events

`summarise()` ungrouping output (override with `.groups` argument)



action,n_events,n_sessions
<chr>,<int>,<int>
checkbox-wpMinoredit,2350,2216
checkbox-wpReviewEdit,4,4
checkbox-wpWatchthis,435,347
dialog-abort,1385,1071
dialog-approve,378,240
dialog-preview,230,118
dialog-report,34,33
dialog-resolve,308,83
dialog-review,588,491
dialog-save,12081,10687


There were 10,687 sessions with dialog-save events and 12,237 sessions with no-JS save events recorded on 13 May. The two should be about equally common, so this appears correct: there are about 14% more no-JS save sessions.
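The comparison can be reproduced from the 13 May counts in the tables above. A minimal sketch in base R; the vector names are illustrative. It computes the relative difference for both sessions and events, since the two measures differ noticeably.

```r
# Counts for 13 May 2021, copied from the tables above.
dialog_save   <- c(events = 12081, sessions = 10687)
source_has_js <- c(events = 12327, sessions = 12237)

# Relative difference of source-has-js over dialog-save, in percent.
pct_more <- round((source_has_js / dialog_save - 1) * 100, 1)
pct_more
```

By sessions the difference is about 14.5%; by raw event counts it is about 2%.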

## By Platform and Interface

In [57]:
wikitext_js_edits_byplatform <- wikitext_js_edits %>%
    filter(action == 'source-has-js') %>%
    group_by(platform, interface) %>%
    summarize(n_events = sum(num_events))

wikitext_js_edits_byplatform

`summarise()` regrouping output by 'platform' (override with `.groups` argument)



platform,interface,n_events
<chr>,<chr>,<int>
desktop,wikitext,12409


Confirmed that we only record events for wikitext. No events have been recorded on mobile.

I also checked dialog-save events by platform to get an idea of the level of mobile save events we normally see.

## Compare Dialog-Save and Source-has-js Events by platform

In [72]:
dialog_save_byfeature <- wikitext_js_edits %>%
    filter(action %in% c('source-has-js', 'dialog-save'),
          date == '2021-05-13') %>%
    group_by( platform, action) %>%
    summarize(n_events = sum(num_events)) 

dialog_save_byfeature

`summarise()` regrouping output by 'platform' (override with `.groups` argument)



platform,action,n_events
<chr>,<chr>,<int>
desktop,dialog-save,2916
desktop,source-has-js,12327
phone,dialog-save,9165


There were 9,165 dialog-save events on mobile (phone) during the same period, while no source-has-js events were recorded there. We need to confirm whether this is an instrumentation error or reflects real user behavior.
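To put the mobile figure in context, the platform split of dialog-save events on 13 May can be computed from the table above. A minimal sketch in base R; the variable names are illustrative and not part of the original notebook.

```r
# dialog-save events on 13 May 2021 by platform, copied from the table above.
dialog_save <- c(desktop = 2916, phone = 9165)

# Share of dialog-save events per platform, in percent.
share <- round(100 * dialog_save / sum(dialog_save), 1)
share
```

Roughly three quarters of dialog-save events that day came from the phone platform.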

## Wiki

In [45]:
wikitext_js_edits_bywiki <- wikitext_js_edits %>%
    filter(action == 'dialog-save') %>%
    group_by(wiki) %>%
    summarize(n_events = sum(num_events),
             n_sessions = n_distinct(session_id)) %>%
    arrange(desc(n_events))

head(wikitext_js_edits_bywiki, 10)

`summarise()` ungrouping output (override with `.groups` argument)



wiki,n_events,n_sessions
<chr>,<int>,<int>
enwiki,55911,49936
fawiki,17294,15177
eswiki,15217,13504
frwiki,9717,8747
arwiki,9692,6123
itwiki,7888,7138
ruwiki,7807,7031
jawiki,7768,7121
ptwiki,5854,5035
dewiki,5112,4708


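We've recorded events for 233 wiki projects. The larger wikis (English, German, and French) have the most total events logged, as expected. Note that the query above filters on `dialog-save`, so the table shows the top wikis by dialog-save events for May 2021 to date rather than by `source-has-js` events. A look at the percentage of edits made without JS on each wiki (which will be done in the analysis) will provide more information on how frequently non-JS edits occur.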

## By User 

In [41]:
wikitext_js_edits_users <- wikitext_js_edits %>%
    filter(action == 'source-has-js',
          user_id != 0) %>% # remove anon users
    group_by(wiki) %>%
    summarize(n_users = n_distinct(user_id)) %>%
    arrange(desc(n_users))

head(wikitext_js_edits_users, 10)

`summarise()` ungrouping output (override with `.groups` argument)



wiki,n_users
<chr>,<int>
enwiki,1781
dewiki,466
commonswiki,319
frwiki,297
ruwiki,244
jawiki,212
eswiki,205
zhwiki,195
itwiki,179
nlwiki,97


Confirmed that events are recorded with the associated user_id. The wikis with the most events generally correspond to the wikis with the most users, with some variation.

## By Logged In status

In [50]:
wikitext_js_edits_users <- wikitext_js_edits %>%
    filter(action == 'dialog-save') %>%
    mutate(is_anon = ifelse(user_id == 0, "logged-out", "logged-in")) %>%
    group_by(is_anon) %>%
    summarize(n_sessions = n_distinct(session_id),
            n_users = n_distinct(user_id))

head(wikitext_js_edits_users, 10)

`summarise()` ungrouping output (override with `.groups` argument)



is_anon,n_sessions,n_users
<chr>,<int>,<int>
logged-in,116363,28415
logged-out,58590,1


The majority (85% of sessions) are recorded by logged-in users. Around 15% of all sessions with edits completed without JS are logged-out. For comparison, 66% of all sessions with dialog-save events are by logged-in users; the table above shows this dialog-save breakdown.

Note: Only one user is identified for logged-out because all logged-out users are given a user_id of 0. This is expected.
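The 66% comparison figure follows from the session counts in the table above, which come from the `dialog-save` filter. A minimal sketch in base R; the `sessions` and `share` names are illustrative.

```r
# dialog-save sessions by logged-in status, copied from the table above.
sessions <- c(`logged-in` = 116363, `logged-out` = 58590)

# Share of sessions per status, in percent.
share <- round(100 * sessions / sum(sessions), 1)
share
```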