# Post deployment QA: Wikitext no JS instrumentation

[Task](https://phabricator.wikimedia.org/T281409)

In [5]:
shhh <- function(expr) suppressPackageStartupMessages(suppressWarnings(suppressMessages(expr)))
shhh({
    library(tidyverse); library(wmfdata)
})

# Post-Deployment QA

## Confirm that events are being logged in VEFU as expected

In [31]:
# collect May VEFU events
query <-
"
SELECT
    TO_DATE(dt) as `date`,
    event.editingSessionID AS session_id,
    event.integration AS integration,
    event.editor_interface AS interface,
    event.platform AS platform,
    event.user_id AS user_id,
    event.feature AS feature,
    event.action AS action,
    event.is_oversample,
    wiki AS wiki,
    COUNT(*) AS num_events
FROM event.visualeditorfeatureuse
WHERE
    YEAR = 2021
    AND MONTH = 05
    AND useragent.is_bot = false 
GROUP BY
    TO_DATE(dt),
    event.editingSessionID,
    event.integration,
    event.editor_interface,
    event.is_oversample,
    event.feature,
    event.action,
    event.user_id,
    event.platform,
    wiki
"



In [32]:
wikitext_js_edits <-  wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



In [33]:
# convert date
wikitext_js_edits$date <- as.Date(wikitext_js_edits$date)

## By Date

In [10]:
# Daily events
wikitext_js_edits_bydate <- wikitext_js_edits %>%
    filter(action == 'source-has-js') %>%
    group_by(date) %>%
    summarize(n_events = sum(num_events),
             n_sessions = n_distinct(session_id))

wikitext_js_edits_bydate

`summarise()` ungrouping output (override with `.groups` argument)



date,n_events,n_sessions
<date>,<int>,<int>
2021-05-12,82,82
2021-05-13,15411,15306
2021-05-14,14663,14588
2021-05-15,14944,14869
2021-05-16,15703,15630
2021-05-17,15305,15191
2021-05-18,15651,15549
2021-05-19,15384,15252
2021-05-20,15471,15380
2021-05-21,15347,15257


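We have recorded a total of 25,492 events and 25,336 sessions by users with JS to date (12 May to 14 May 2021). We started recording events on 12 May 2021, following the deployment of the fix to the instrumentation, so 12 May is only a partial day. This QA was done towards the end of 13 May 2021, when we had not yet recorded a full day of events. The rows shown above run through 21 May 2021, so the output appears to come from a later rerun, and its counts are higher than the figures quoted in the text.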
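One way to make the partial-day caveat explicit is to flag days whose event count falls well below the typical daily level. The sketch below uses base R on the first four daily counts from the table above. The helper name `flag_partial_days` and the 50% threshold are illustrative assumptions, not part of the original analysis.

```r
# Daily source-has-js counts copied from the first rows of the table above.
daily <- data.frame(
  date     = as.Date(c("2021-05-12", "2021-05-13", "2021-05-14", "2021-05-15")),
  n_events = c(82L, 15411L, 14663L, 14944L)
)

# Hypothetical helper: flag days with fewer events than
# `threshold` times the median daily count.
flag_partial_days <- function(df, threshold = 0.5) {
  df$is_partial <- df$n_events < threshold * median(df$n_events)
  df
}

daily_flagged <- flag_partial_days(daily)

# Dates that look like partial days (here only the deployment day, 12 May).
daily_flagged[daily_flagged$is_partial, "date"]
```

Flagged days can then be excluded before computing daily averages, so the deployment day does not drag the baseline down.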

## Check that events are only recorded for the `mwSave` feature

In [11]:
wikitext_js_edits_byfeature <- wikitext_js_edits %>%
    filter(action == 'source-has-js') %>%
    group_by(feature) %>%
    summarize(n_events = sum(num_events),
             n_sessions = n_distinct(session_id),
            n_users = n_distinct(user_id))

wikitext_js_edits_byfeature

`summarise()` ungrouping output (override with `.groups` argument)



feature,n_events,n_sessions,n_users
<chr>,<int>,<int>,<int>
mwSave,225697,224320,37517


Confirmed that events are only recorded for the mwSave feature.

## All mwSave Actions during same time period

In [12]:
mwsave_events <- wikitext_js_edits %>%
    filter(feature == 'mwSave',
          date == '2021-05-13' ) %>% # first full day of no js events
    group_by(action) %>%
    summarize(n_events = sum(num_events),
             n_sessions = n_distinct(session_id))

mwsave_events

`summarise()` ungrouping output (override with `.groups` argument)



action,n_events,n_sessions
<chr>,<int>,<int>
checkbox-wpMinoredit,2949,2781
checkbox-wpReviewEdit,5,5
checkbox-wpWatchthis,568,456
dialog-abort,1768,1370
dialog-approve,512,316
dialog-preview,301,155
dialog-report,57,53
dialog-resolve,380,102
dialog-review,812,673
dialog-save,15258,13531


There were 10,687 dialog-save events and 12,237 JS save (`source-has-js`) events recorded on 13 May. The two should be about equally common, so this appears correct: there are about 14% more JS save events.
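The percentage quoted above can be reproduced with a short calculation. This is a minimal sketch using the two counts as stated in the text; the variable names are illustrative.

```r
# Event counts for 13 May as quoted in the text above.
dialog_save_events <- 10687
js_save_events     <- 12237

# Relative excess of JS save events over dialog-save events, in percent.
pct_more_js <- (js_save_events - dialog_save_events) / dialog_save_events * 100
round(pct_more_js, 1)  # 14.5
```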

## By Platform and Interface

In [13]:
wikitext_js_edits_byplatform <- wikitext_js_edits %>%
    filter(action == 'source-has-js') %>%
    group_by(platform, interface) %>%
    summarize(n_events = sum(num_events))

wikitext_js_edits_byplatform

`summarise()` regrouping output by 'platform' (override with `.groups` argument)



platform,interface,n_events
<chr>,<chr>,<int>
desktop,wikitext,225697


Confirmed that we only record events for wikitext. No events have been recorded on mobile.

I also checked how many dialog-save events were recorded on mobile, to get an idea of the normal level of mobile save events.

## Compare Dialog-Save and Source-has-js Events by platform

In [14]:
dialog_save_byfeature <- wikitext_js_edits %>%
    filter(action %in% c('source-has-js', 'dialog-save'),
          date == '2021-05-13') %>%
    group_by( platform, action) %>%
    summarize(n_events = sum(num_events)) 

dialog_save_byfeature

`summarise()` regrouping output by 'platform' (override with `.groups` argument)



platform,action,n_events
<chr>,<chr>,<int>
desktop,dialog-save,3665
desktop,source-has-js,15411
phone,dialog-save,11593


There were 9,165 dialog-save events on mobile during the same period. We need to confirm whether this is an instrumentation error or reflects real user behavior.

Update: Confirmed that this is expected. WikiEditor logs its events only as desktop. See [T249944](https://phabricator.wikimedia.org/T249944) for a discussion of the issue.

## Wiki

In [15]:
wikitext_js_edits_bywiki <- wikitext_js_edits %>%
    filter(action == 'dialog-save') %>%
    group_by(wiki) %>%
    summarize(n_events = sum(num_events),
             n_sessions = n_distinct(session_id)) %>%
    arrange(desc(n_events))

head(wikitext_js_edits_bywiki)

`summarise()` ungrouping output (override with `.groups` argument)



wiki,n_events,n_sessions
<chr>,<int>,<int>
enwiki,119230,106909
eswiki,32454,28936
fawiki,27368,23868
arwiki,20893,12905
frwiki,19896,17970
itwiki,16700,15113


We've recorded events for 233 wiki projects. The larger wikis (English, German, and French) have the most total events logged, as expected. Note that the cell above filters on `dialog-save`, so the table shown ranks wikis by dialog-save events rather than `source-has-js` events. Comparing the percentage of edits by JS vs. non-JS users for each project (to be done in the analysis) will show how frequently non-JS edits occur.

## By User 

In [16]:
wikitext_js_edits_users <- wikitext_js_edits %>%
    filter(action == 'source-has-js',
          user_id != 0) %>% # remove anon users
    group_by(wiki) %>%
    summarize(n_users = n_distinct(user_id)) %>%
    arrange(desc(n_users))

head(wikitext_js_edits_users, 10)

`summarise()` ungrouping output (override with `.groups` argument)



wiki,n_users
<chr>,<int>
enwiki,14093
dewiki,2714
commonswiki,2368
frwiki,1873
jawiki,1686
eswiki,1526
ruwiki,1458
zhwiki,1257
itwiki,1019
ptwiki,639


Confirmed that events are recorded with the associated user_id. The wikis with the most events broadly correspond to the wikis with the most users, with some variation.

## By Logged In status

In [17]:
wikitext_js_edits_users <- wikitext_js_edits %>%
    filter(action == 'dialog-save') %>%
    mutate(is_anon = ifelse(user_id == 0, "logged-out", "logged-in")) %>%
    group_by(is_anon) %>%
    summarize(n_sessions = n_distinct(session_id),
            n_users = n_distinct(user_id))

head(wikitext_js_edits_users, 10)

`summarise()` ungrouping output (override with `.groups` argument)



is_anon,n_sessions,n_users
<chr>,<int>,<int>
logged-in,238202,51690
logged-out,123450,1


The majority of sessions (66%) are by logged-in users; around 34% of all sessions with edits completed with JS are logged-out. This is comparable to the trend seen for dialog-save events.

Note: Only one user is identified as logged-out because all logged-out users are given a user_id of 0. This is expected.
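The session shares can be recomputed directly from the table above. A minimal base R sketch, with the session counts copied from that table and `share_pct` as an illustrative name:

```r
# Session counts by logged-in status, copied from the table above.
sessions <- c("logged-in" = 238202, "logged-out" = 123450)

# Share of sessions in each group, in percent.
share_pct <- round(prop.table(sessions) * 100, 1)
share_pct
```

This gives roughly 65.9% logged-in and 34.1% logged-out sessions.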

## By Integration

In [18]:
wikitext_js_edits_byintegration <- wikitext_js_edits %>%
    filter(action == 'source-has-js') %>%
    group_by(integration) %>%
    summarize(n_sessions = n_distinct(session_id),
            n_users = n_distinct(user_id))

wikitext_js_edits_byintegration

`summarise()` ungrouping output (override with `.groups` argument)



integration,n_sessions,n_users
<chr>,<int>,<int>
page,224320,37517


All events are recorded for the page integration (i.e. no discussiontool, flow, or app events).

# Join with EditAttemptStep to confirm edits are saved

In [21]:

query <-
"
--- collect all saved wikitext edits
WITH edits_saved AS(
SELECT DISTINCT
    event.editing_session_id AS session_id
FROM event.editattemptstep
WHERE 
    YEAR = 2021
    AND month = 05
    AND useragent.is_bot = false 
    AND event.action = 'saveSuccess'
    AND event.editor_interface = 'wikitext'
)
SELECT
    TO_DATE(dt) as `date`,
    event.editingSessionID AS session_id,
    event.user_id AS user_id,
    wiki AS wiki,
    IF(edits_saved.session_id is NULL, 'not_saved', 'saved') AS is_saved,
    COUNT(*) AS num_events
FROM event.visualeditorfeatureuse vefu
LEFT JOIN edits_saved
ON  event.editingSessionID = edits_saved.session_id
WHERE
    YEAR = 2021
    AND MONTH = 05
    AND vefu.useragent.is_bot = false
    AND vefu.event.action = 'source-has-js'
    AND vefu.event.feature = 'mwSave'
    AND vefu.event.editor_interface = 'wikitext'
GROUP BY
    TO_DATE(dt),
    event.editingSessionID,
    event.user_id,
    wiki,
    IF(edits_saved.session_id is NULL, 'not_saved', 'saved')
"


In [22]:
wikitext_js_edits_completed <-  wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



In [None]:
# check to make sure all recorded 'source-has-js' events are marked as saved in EditAttemptStep

In [23]:
check_unsaved_events <- wikitext_js_edits_completed %>%
    filter(is_saved == 'not_saved')

check_unsaved_events

date,session_id,user_id,wiki,is_saved,num_events
<chr>,<chr>,<int>,<chr>,<chr>,<int>


Confirmed that all `source-has-js` events recorded in VEFU are recorded in EditAttemptStep as saved. This also confirms that I can use EditAttemptStep to identify the percentage of wikitext edits from users with JS enabled.
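The same coverage check can be expressed at the session level as a set difference: every session with a `source-has-js` event should also appear among the sessions with a `saveSuccess` event. The sketch below uses hypothetical session IDs for illustration only; in practice the two vectors would come from the VEFU and EditAttemptStep query results.

```r
# Hypothetical session IDs, for illustration only.
vefu_sessions      <- c("a1", "b2", "c3")        # sessions with a source-has-js event
eas_saved_sessions <- c("a1", "b2", "c3", "d4")  # sessions with a saveSuccess event

# Sessions logged in VEFU that have no matching saveSuccess in EditAttemptStep.
unmatched <- setdiff(vefu_sessions, eas_saved_sessions)

# TRUE when every VEFU session is covered by a saved edit.
length(unmatched) == 0
```

An empty `unmatched` vector corresponds to the empty `check_unsaved_events` table above.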

# Test Query to Determine % of JS Edits 

In [29]:
# rough query to confirm approach
# will be refined later
query <-
"
WITH js_edits AS(
  SELECT
    event.editingSessionID AS session_id,
    event.user_id AS `user`,
    wiki AS wiki,
    COUNT(*) AS num_js_events
FROM event.visualeditorfeatureuse vefu
WHERE
    YEAR = 2021
    AND MONTH = 05
    AND Day >= 13
    AND useragent.is_bot = false
    AND event.action = 'source-has-js'
    AND event.feature = 'mwSave'
    AND event.editor_interface = 'wikitext'
    AND event.platform = 'desktop'
    AND event.integration = 'page'
GROUP BY
    event.editingSessionID,
    event.user_id,
    wiki
)

SELECT 
    eas.event.editing_session_id AS session_id,
    eas.event.user_id AS `user`,
    eas.wiki AS wiki,
    IF(js_edits.session_id IS NULL, 'no_js', 'js') AS js_used,
    COUNT(*) as num_events
FROM event.editattemptstep eas
LEFT JOIN js_edits
ON   event.editing_session_id  = js_edits.session_id
WHERE 
    YEAR = 2021
    AND month = 05
    AND Day >= 13
    AND useragent.is_bot = false 
    AND event.action = 'saveSuccess'
    AND event.editor_interface = 'wikitext'
    AND event.platform = 'desktop'
    AND event.integration = 'page'
GROUP BY
    eas.event.editing_session_id,
    eas.event.user_id,
    eas.wiki,
    IF(js_edits.session_id IS NULL, 'no_js', 'js')
   
  "
    

In [30]:
wikitext_all_edits_completed <-  wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



In [27]:
wikitext_all_edits_completed %>%
    group_by(js_used) %>%
    summarize(n_sessions = n_distinct(session_id),
             n_users = n_distinct(user))

`summarise()` ungrouping output (override with `.groups` argument)



js_used,n_sessions,n_users
<chr>,<int>,<int>
js,226536,37741
no_js,16784,2254
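As a rough illustration of how this output translates into a percentage, the session counts in the table above can be converted to shares. This is a sketch on the output of the unrefined test query, so the result is indicative only; the data frame and column names are illustrative.

```r
# Session counts copied from the table above (output of the rough test query).
edits <- data.frame(
  js_used    = c("js", "no_js"),
  n_sessions = c(226536, 16784)
)

# Share of saved wikitext edit sessions in each group, in percent.
edits$pct_sessions <- round(edits$n_sessions / sum(edits$n_sessions) * 100, 1)
edits
```

On these counts, about 6.9% of saved wikitext edit sessions had no matching `source-has-js` event.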


Confirmed approach will work.

Check changes to desktop and page.