# New Discussion Tool Contributor Opt-Out Analysis

## Data:

We reviewed the mediawiki [user_properties table](https://www.mediawiki.org/wiki/Manual:User_properties_table) to determine the current number of new discussion tool users that have the `discussiontools-betaenable` property currently disabled.

Some notes regarding this dataset:
* This reflects all current nondefault user preferences.  User property records are added to the database when they differ from their default value.
* Data reflects the current state and does not account for users that have changed this preference multiple times.
* There are contributors that have used the new discussion tool but don't have a preference set in the user properties table. Possible reasons for this include: (1) the user  disabled the setting by selecting 'restore all default preferences' in their user preferences or (2) the user enabled discussion tools in their global preferences but not in their local preferences. 

In [4]:
import pandas as pd
import numpy as np

import datetime as dt

from wmfdata import hive, mariadb

In [5]:
HIVE_SNAPSHOT = "2021-07"
START_OF_DATA = "2021-02-18"
END_OF_DATA = "2021-08-01"

## Collect new discussion tool contributors

In [6]:
#all users that made at least 1 edit with the new discussion tool since deployment

query = """

SELECT
    event_user_id as new_dt_user,
    wiki_db as wiki,
    CASE
        WHEN min(event_user_revision_count) < 100 THEN 'under 100'
        WHEN (min(event_user_revision_count) >=100 AND min(event_user_revision_count <= 500)) THEN '100-500'
        ELSE 'over 500'
        END AS edit_count_group
FROM wmf.mediawiki_history AS mh
WHERE 
    ARRAY_CONTAINS(revision_tags, 'discussiontools-newtopic') 
    AND snapshot = '2021-07' 
-- date first deployed
    AND event_timestamp >= '2021-02-18'  
    AND event_timestamp <= '2021-07-31'  
-- only on desktop
    AND NOT array_contains(revision_tags, 'iOS')
    AND NOT array_contains(revision_tags, 'Android')
    AND NOT array_contains(revision_tags, 'Mobile Web')
     -- find all edits on talk pages 
    AND page_namespace_historical % 2 = 1
    AND event_entity = 'revision' AND 
    event_type = 'create'
    AND event_user_is_anonymous = FALSE
GROUP BY
    event_user_id,
    wiki_db
"""

In [7]:
new_dt_user = hive.run(
    query.format(
        hive_snapshot = HIVE_SNAPSHOT,
        START_OF_DATA= START_OF_DATA,
        END_OF_DATA=END_OF_DATA
    )
)

In [9]:
#Total new dt users

Total_dt_users = new_dt_user['new_dt_user'].count()

print('Total number of new discussion users for whom we will be checking beta preferences:' , Total_dt_users)

Total number of new discussion users for whom we will be checking beta preferences: 5392


## New Discussion Tool Users Current Preference Status

In [10]:
#Querying user_properties for getting the discussion tools preference set by the new dt contributors we got in the above query

query='''
SELECT 
  up_value AS preference,
  up_user AS user
FROM user_properties
WHERE up_user in ({users})
AND up_property = "discussiontools-betaenable"
'''

In [11]:
# Looping through each wiki for the list of users for each skin

wikis = new_dt_user['wiki'].unique()
up_pref=list()
for wiki in wikis:
    user_ids = new_dt_user[new_dt_user['wiki'] == wiki]["new_dt_user"]
    user_list = ','.join([str(u) for u in user_ids])
    prefs = mariadb.run(
      query.format(users=user_list),
      wiki
    )
    up_pref.append(prefs)

pref= pd.concat(up_pref)

In [12]:
# Join with edit count data from mediawiki_history
new_dt_user_pref = new_dt_user.join(pref.set_index('user'), on = 'new_dt_user', how = 'left')

In [13]:
# covert skin column to string type 
new_dt_user_pref['preference'] = new_dt_user_pref['preference'].astype(str)

In [14]:
# rename values. 
pref_aliases = {
    "b\'0\'":"explicitly disabled",
    "b\'1\'":"explicitly enabled",
    "nan": "no local preference recorded"
}

new_dt_user_pref= new_dt_user_pref.replace({"preference": pref_aliases})

## Overall Opt-In Rate

In [15]:
new_dt_pref_overall= new_dt_user_pref[['preference', 'new_dt_user']].groupby('preference').count()

new_dt_pref_overall

Unnamed: 0_level_0,new_dt_user
preference,Unnamed: 1_level_1
explicitly disabled,327
explicitly enabled,3609
no local preference recorded,1466


In [16]:
pct_user_opt_rate =(100. * new_dt_pref_overall / new_dt_pref_overall.sum()).round(2).astype(str) + '%'
pct_user_opt_rate.sort_values(by=['new_dt_user'],ascending=False)

Unnamed: 0_level_0,new_dt_user
preference,Unnamed: 1_level_1
explicitly enabled,66.81%
explicitly disabled,6.05%
no local preference recorded,27.14%


Overall, only 6.05% of all new discussion tool users currently have the 'discussiontool-betaenable' preference explicitly set as disabled in the User Properties table. 

## By Experience Level

In [17]:
# calculate total enabled and disabled
new_dt_pref_byexp = new_dt_user_pref[['preference', 'edit_count_group' ,'new_dt_user']].groupby(['edit_count_group','preference']).count()

new_dt_pref_byexp

Unnamed: 0_level_0,Unnamed: 1_level_0,new_dt_user
edit_count_group,preference,Unnamed: 2_level_1
100-500,explicitly disabled,22
100-500,explicitly enabled,483
100-500,no local preference recorded,205
over 500,explicitly disabled,259
over 500,explicitly enabled,2357
over 500,no local preference recorded,939
under 100,explicitly disabled,46
under 100,explicitly enabled,769
under 100,no local preference recorded,322


In [18]:
pct_user_opt_rate_byexp =(100. * new_dt_pref_byexp/ new_dt_pref_byexp.groupby('edit_count_group').sum()).round(2).astype(str) + '%'
pct_user_opt_rate_byexp

Unnamed: 0_level_0,Unnamed: 1_level_0,new_dt_user
edit_count_group,preference,Unnamed: 2_level_1
100-500,explicitly disabled,3.1%
100-500,explicitly enabled,68.03%
100-500,no local preference recorded,28.87%
over 500,explicitly disabled,7.29%
over 500,explicitly enabled,66.3%
over 500,no local preference recorded,26.41%
under 100,explicitly disabled,4.05%
under 100,explicitly enabled,67.63%
under 100,no local preference recorded,28.32%


## Arabic and Czech Wikipedias

In [19]:
# filter to target wikis
target_wiki = ['arwiki', 'cswiki']
new_dt_user_pref_wiki= new_dt_user_pref[new_dt_user_pref['wiki'].isin(target_wiki)]


In [20]:
# calculate total enabled and disabled
new_dt_pref_wiki = new_dt_user_pref_wiki[['preference', 'wiki' ,'new_dt_user']].groupby(['wiki','preference']).count()

new_dt_pref_wiki

Unnamed: 0_level_0,Unnamed: 1_level_0,new_dt_user
wiki,preference,Unnamed: 2_level_1
arwiki,explicitly disabled,3
arwiki,explicitly enabled,59
cswiki,explicitly disabled,1
cswiki,explicitly enabled,29


In [21]:
pct_user_opt_rate_wiki =(100. * new_dt_pref_wiki/ new_dt_pref_wiki.groupby('wiki').sum()).round(2).astype(str) + '%'
pct_user_opt_rate_wiki

Unnamed: 0_level_0,Unnamed: 1_level_0,new_dt_user
wiki,preference,Unnamed: 2_level_1
arwiki,explicitly disabled,4.84%
arwiki,explicitly enabled,95.16%
cswiki,explicitly disabled,3.33%
cswiki,explicitly enabled,96.67%
