# Differentiate between events emitted from the Reply Tool and the New Discussion Tool

[Task](https://phabricator.wikimedia.org/T265099)

The EditAttemptStep schema's existing `init_type` field will be used to differentiate between events emitted from the Reply Tool and the New Discussion Tool.

Events from the Reply Tool and New Discussion Tool should be logged as follows:

* Reply Tool events: `event.action = 'init'`, `event.integration = 'discussiontools'`, `event.init_type = 'page'`
* New Discussion Tool events: `event.action = 'init'`, `event.integration = 'discussiontools'`, `event.init_type = 'section'`

The change to the `init_type` field was made on 12 January 2021.
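The mapping above can be sketched in R on a small in-memory sample before running it against Hive. This is a minimal illustration, not the production logic: `sample_events` and `classify_dt_type()` are hypothetical names, and the rows are made up. The `missing` argument is used so that a missing `init_type` falls into the reply tool bucket, which is assumed to mirror how the Hive `IF()` in the query below treats a NULL comparison.

```r
library(dplyr)

# Hypothetical sample of EditAttemptStep init events (made-up rows, not real data)
sample_events <- tibble(
  action      = c("init", "init", "init"),
  integration = c("discussiontools", "discussiontools", "page"),
  init_type   = c("page", "section", "page")
)

# Keep only DiscussionTools init events and label each by tool type
classify_dt_type <- function(events) {
  events %>%
    filter(action == "init", integration == "discussiontools") %>%
    mutate(dt_type = if_else(init_type == "section",
                             "new discussion tool",
                             "reply tool",
                             missing = "reply tool"))
}

classified <- classify_dt_type(sample_events)
classified$dt_type
```

Only the two `discussiontools` rows survive the filter; the third row is dropped because its `integration` value differs.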

In [13]:
shhh <- function(expr) suppressPackageStartupMessages(suppressWarnings(suppressMessages(expr)))
shhh({
    library(magrittr); library(zeallot); library(glue); library(tidyverse); library(zoo); library(lubridate)
    library(scales)
})

In [87]:
# Collect init events by discussion tool type
query <-
"
SELECT 
  CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')) as date,
  wiki AS wiki,
  event.editing_session_id AS session_id,
  event.platform as platform,
  event.editor_interface as interface,
  event.init_mechanism as init_mechanism,
  IF(event.init_type = 'section', 'new discussion tool', 'reply tool') as dt_type,
  COUNT(*) as n_events
FROM event.editattemptstep
WHERE
  event.action = 'init'
  AND event.integration = 'discussiontools'
  AND year = 2021
  AND dt >= '2021-01-01'
GROUP BY
  CONCAT(year, '-', LPAD(month, 2, '0'), '-', LPAD(day, 2, '0')),
  wiki, 
  event.editing_session_id,
  event.init_mechanism,
  event.platform,
  event.editor_interface,
  IF(event.init_type = 'section', 'new discussion tool', 'reply tool') 
"

In [88]:
collect_init_events <- wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



## Reply Tool vs New Discussion Tool Events by Date 

In [89]:
dt_events_bytype <- collect_init_events %>%
    group_by(date, dt_type) %>%
    summarise(total_events = sum(n_events))

dt_events_bytype

`summarise()` regrouping output by 'date' (override with `.groups` argument)



| date | dt_type | total_events |
|---|---|---|
| `<chr>` | `<chr>` | `<int>` |
| 2021-01-01 | reply tool | 80 |
| 2021-01-02 | reply tool | 83 |
| 2021-01-03 | reply tool | 91 |
| 2021-01-04 | reply tool | 78 |
| 2021-01-05 | reply tool | 89 |
| 2021-01-06 | reply tool | 76 |
| 2021-01-07 | reply tool | 68 |
| 2021-01-08 | reply tool | 64 |
| 2021-01-09 | reply tool | 59 |
| 2021-01-10 | reply tool | 79 |


Both reply tool and new discussion tool events are being logged, and it is possible to differentiate them based on `init_type`. There are fewer `init_type = section` events because these come from the new discussion tool, which has not been deployed for as long as the reply tool.

A total of 24 new discussion tool events have been logged since 21 January 2021 as expected.
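Because the long-format output lists each tool on separate rows, days with no new discussion tool events simply do not appear for that tool. One way to make the comparison easier to scan is to pivot the counts so each tool gets its own column. The sketch below assumes a data frame shaped like `dt_events_bytype`; `example_bytype` and its values are hypothetical and are not taken from the query results.

```r
library(dplyr)
library(tidyr)

# Hypothetical daily counts shaped like dt_events_bytype (made-up values)
example_bytype <- tibble(
  date         = c("2021-01-11", "2021-01-12", "2021-01-12"),
  dt_type      = c("reply tool", "reply tool", "new discussion tool"),
  total_events = c(70L, 75L, 3L)
)

# One row per date, one column per tool; days without events are filled with 0
daily_wide <- example_bytype %>%
  pivot_wider(names_from = dt_type,
              values_from = total_events,
              values_fill = 0L)

# First date on which a new discussion tool event appears
# (ISO-formatted date strings sort correctly as text)
first_new_dt_date <- example_bytype %>%
  filter(dt_type == "new discussion tool") %>%
  summarise(first_date = min(date)) %>%
  pull(first_date)

daily_wide
first_new_dt_date
```

Running the same two steps on the real `dt_events_bytype` would show the first date with `init_type = section` events, which can then be compared against the deployment date.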


## Reply Tool vs New Discussion Tool Events by Platform and Editor Interface

In [90]:
dt_events_byplatform <- collect_init_events %>%
    group_by(dt_type, platform, interface) %>%
    summarise(total_events = sum(n_events))

dt_events_byplatform

`summarise()` regrouping output by 'dt_type', 'platform' (override with `.groups` argument)



| dt_type | platform | interface | total_events |
|---|---|---|---|
| `<chr>` | `<chr>` | `<chr>` | `<int>` |
| new discussion tool | desktop | visualeditor | 8 |
| new discussion tool | desktop | wikitext | 16 |
| reply tool | desktop | visualeditor | 827 |
| reply tool | desktop | wikitext | 1227 |


Events are recorded for both the `visualeditor` and `wikitext` interfaces, and only on the desktop platform, as expected.

## New Discussion Tool Events and Unique Sessions by Wiki

In [91]:
dt_events_bywiki <- collect_init_events %>%
    filter(dt_type == "new discussion tool") %>%
    group_by(dt_type, wiki) %>%
    summarise(total_events = sum(n_events),
             distinct_sessions = n_distinct(session_id))

dt_events_bywiki

`summarise()` regrouping output by 'dt_type' (override with `.groups` argument)



| dt_type | wiki | total_events | distinct_sessions |
|---|---|---|---|
| `<chr>` | `<chr>` | `<int>` | `<int>` |
| new discussion tool | cswiki | 7 | 7 |
| new discussion tool | cswikinews | 1 | 1 |
| new discussion tool | enwiki | 16 | 16 |


New discussion tool events have been recorded on enwiki, cswikinews, and cswiki.

## Reply Tool vs New Discussion Tool Events by Init Mechanism

In [54]:
dt_events_bymechanism <- collect_init_events %>%
    group_by(dt_type, init_mechanism) %>%
    summarise(total_events = sum(n_events))

dt_events_bymechanism 

`summarise()` regrouping output by 'dt_type' (override with `.groups` argument)



| dt_type | init_mechanism | total_events |
|---|---|---|
| `<chr>` | `<chr>` | `<int>` |
| new discussion tool | click | 24 |
| reply tool | click | 2048 |


All new discussion tool and reply tool events to date have been recorded with `init_mechanism = click`. Since `init_mechanism` is not needed to distinguish the two event types, this is fine. Changes will be needed to track new section events using the existing workflow, which will be done as part of [T272544](https://phabricator.wikimedia.org/T272544).

## Reply Tool vs New Discussion Tool Edit Completion Rate

Check to make sure it will be possible to calculate edit completion rate for each tool type, which is one of the key metrics for this tool.

In [71]:
query <- 
"WITH init_sessions AS (
--first find all dt and reply tool events based on init type
SELECT 
  event.editing_session_id AS session_id,
  IF(event.init_type = 'section', 'new discussion tool', 'reply tool') as dt_type,
  wiki AS wiki
FROM event.editattemptstep
WHERE
  year = 2021 
  AND dt >= '2021-01-12'  -- when instrumentation was deployed
  AND event.action = 'init'
  AND event.integration= 'discussiontools'
)

-- Find associated savesuccess events
SELECT
  eas.event.user_editcount AS edit_count,
  eas.event.user_id AS user,
  init_sessions.dt_type as dt_type,
  eas.event.editing_session_id AS session_id,
  eas.wiki AS wiki,
  COUNT(*) AS save_events
FROM event.editattemptstep eas
INNER JOIN
    init_sessions 
    ON eas.event.editing_session_id = init_sessions.session_id 
    AND eas.wiki = init_sessions.wiki
WHERE
  year = 2021 
-- events since deployment date
  AND dt >= '2021-01-12'
  AND eas.event.action = 'saveSuccess'
  AND eas.event.integration= 'discussiontools'
-- remove anonymous users
  AND eas.event.user_id != 0
GROUP BY 
  eas.event.user_id,
  init_sessions.dt_type,
  eas.event.user_editcount,
  eas.event.editing_session_id,
  eas.wiki
"

In [72]:
collect_savesuccess_events <- wmfdata::query_hive(query)

Don't forget to authenticate with Kerberos using kinit



In [99]:
dt_save_events_bytype <- collect_savesuccess_events %>%
    group_by ( dt_type)  %>%
    summarize (num_save_sessions = n_distinct(session_id),
              num_save_events = sum(save_events))

dt_save_events_bytype

`summarise()` ungrouping output (override with `.groups` argument)



| dt_type | num_save_sessions | num_save_events |
|---|---|---|
| `<chr>` | `<int>` | `<int>` |
| new discussion tool | 6 | 6 |
| reply tool | 856 | 856 |


In [100]:
new_dt_save_events_bywiki <- collect_savesuccess_events %>%
    filter(dt_type == 'new discussion tool') %>%
    group_by (wiki, dt_type)  %>%
    summarize (num_save_sessions = n_distinct(session_id))

new_dt_save_events_bywiki

`summarise()` regrouping output by 'wiki' (override with `.groups` argument)



| wiki | dt_type | num_save_sessions |
|---|---|---|
| `<chr>` | `<chr>` | `<int>` |
| cswiki | new discussion tool | 1 |
| enwiki | new discussion tool | 5 |


A total of 6 new discussion tool sessions reached `saveSuccess`. Both wikis with saves (cswiki and enwiki) are among the wikis where new discussion tool init events were logged.
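With init and `saveSuccess` sessions both available per tool type, the edit completion rate can be computed as the share of init sessions that have a matching save. The following is a rough sketch under stated assumptions: `example_init` and `example_saves` are hypothetical session-level data frames with made-up rows, keyed on `wiki` and `session_id` as in the join above. Before applying this to the real data, the two queries would need to cover the same date range and the same user population, since the save query here starts on 2021-01-12 and excludes anonymous users while the init query does neither.

```r
library(dplyr)

# Hypothetical init sessions, one row per session (made-up rows)
example_init <- tibble(
  wiki       = c("enwiki", "enwiki", "cswiki", "enwiki", "cswiki"),
  session_id = c("a", "b", "c", "d", "e"),
  dt_type    = c("new discussion tool", "new discussion tool",
                 "reply tool", "reply tool", "reply tool")
)

# Hypothetical sessions that reached saveSuccess (made-up rows)
example_saves <- tibble(
  wiki       = c("enwiki", "enwiki"),
  session_id = c("a", "d")
)

completion_rate <- example_init %>%
  distinct(wiki, session_id, dt_type) %>%
  # Flag sessions that have at least one saveSuccess event
  left_join(example_saves %>%
              distinct(wiki, session_id) %>%
              mutate(saved = TRUE),
            by = c("wiki", "session_id")) %>%
  mutate(saved = coalesce(saved, FALSE)) %>%
  group_by(dt_type) %>%
  summarise(n_init = n(),
            n_saved = sum(saved),
            completion_rate = n_saved / n_init,
            .groups = "drop")

completion_rate
```

Counting distinct sessions rather than raw events keeps a session with several init or save events from being counted more than once.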
