In [1]:
import pandas as pd
from wmfdata import hive, mariadb

You can find the source for `wmfdata` at https://github.com/neilpquinn/wmfdata


The current query, shown below, appears to overcount monthly active administrators several times over.

In [2]:
maa_cur = hive.run("""
select 
    wiki as database_code,
    round(sum(monthly_active_administrators) / 12, 0) as monthly_active_administrators
from (
    select
        wiki_db as wiki,
        substr(log_timestamp, 1, 6) as month,
        count(distinct log_actor) as monthly_active_administrators
    from wmf_raw.mediawiki_logging
    where
        log_type in ("block", "protect", "delete", "rights") and
        log_timestamp >= "{start}" and
        log_timestamp < "{end}" and
        snapshot = "{snapshot}" and
        wiki_db in ("enwiki", "dewiki", "eswiki", "commonswiki")
    group by wiki_db, substr(log_timestamp, 1, 6)
) mae
group by wiki
""".format(start="201808", end="201908", snapshot="2019-07"))
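The date filter in the query above compares `log_timestamp` to six-character prefixes as plain strings. A minimal sketch of why that selects whole months, assuming timestamps are 14-character `YYYYMMDDHHMMSS` strings (the sample values and the `in_window` helper are made up for illustration):

```python
# Made-up timestamps in the assumed YYYYMMDDHHMMSS string format.
timestamps = [
    "20180731235959",  # last second of July 2018
    "20180801000000",  # first second of August 2018
    "20190115120000",  # mid-January 2019
    "20190731235959",  # last second of July 2019
    "20190801000000",  # first second of August 2019
]


def in_window(ts, start="201808", end="201908"):
    """Mirror the query's filter: start <= ts < end, compared as strings."""
    return start <= ts < end


# A longer string that begins with the prefix sorts after the prefix itself,
# so "20180801000000" >= "201808" holds and "20190801000000" < "201908" does not.
selected = [ts for ts in timestamps if in_window(ts)]
months = sorted({ts[:6] for ts in selected})

print(selected)
print(months)
```

Under that assumption the window runs from August 2018 through July 2019, which is the 12 months the outer query divides by.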

In [44]:
maa_cur = maa_cur.set_index("database_code")
maa_cur.style.format("{:.0f}")

database_code,monthly_active_administrators
commonswiki,182
dewiki,448
eswiki,189
enwiki,1562


The issue seems to be that the "delete", "protect", and "rights" log types include several actions that non-admins routinely perform, namely:
* `delete`, `delete_redir`: a user deleted a redirect by moving the page that it redirected to on top of it
* `protect`, `move_prot`: a user moved a protected page (which regular users can do for semi-protected pages)
* `rights`, `autopromote`: a user was automatically promoted into a group after reaching an edit or time threshold

Explanation of some other non-obvious events:
* `delete`, `event`: delete (that is, hide) a log entry
* `delete`, `revision`: delete (that is, hide) a specific revision. The difference between this and suppression is that suppression hides it even from administrators.
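To illustrate how these actions inflate a distinct-actor count, here is a small sketch in plain Python. The rows, actor IDs, and the `monthly_active` helper are all invented for illustration; only the log types and action names come from the lists above.

```python
from collections import defaultdict

ADMIN_LOG_TYPES = {"block", "delete", "protect", "rights"}
# Actions within those log types that regular users can trigger.
NON_ADMIN_ACTIONS = {"autopromote", "delete_redir", "move_prot"}

# Made-up log rows: (wiki, month, actor, log_type, log_action).
rows = [
    ("enwiki", "201901", 1, "delete", "delete"),        # real admin deletion
    ("enwiki", "201901", 1, "block", "block"),          # same admin, second action
    ("enwiki", "201901", 2, "delete", "delete_redir"),  # move over a redirect
    ("enwiki", "201901", 3, "protect", "move_prot"),    # move of a protected page
    ("enwiki", "201901", 4, "rights", "autopromote"),   # automatic promotion
    ("enwiki", "201901", 2, "move", "move"),            # log type not counted at all
]


def monthly_active(rows, exclude_non_admin_actions):
    """Count distinct actors per (wiki, month) among the admin log types."""
    actors = defaultdict(set)
    for wiki, month, actor, log_type, log_action in rows:
        if log_type not in ADMIN_LOG_TYPES:
            continue
        if exclude_non_admin_actions and log_action in NON_ADMIN_ACTIONS:
            continue
        actors[(wiki, month)].add(actor)
    return {key: len(value) for key, value in actors.items()}


print(monthly_active(rows, exclude_non_admin_actions=False))
print(monthly_active(rows, exclude_non_admin_actions=True))
```

In this toy data, filtering on log type alone counts four actors, while also excluding the three actions leaves only the one actor who performed real administrative actions.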

In [4]:
mariadb.run("""
select distinct log_type, log_action
from logging
where log_type in ('block', 'protect', 'delete', 'rights')
""", "enwiki")

,log_type,log_action
0,block,block
1,block,reblock
2,block,unblock
3,delete,delete
4,delete,delete_redir
5,delete,event
6,delete,flow-delete-post
7,delete,flow-delete-topic
8,delete,flow-restore-post
9,delete,flow-restore-topic


In [5]:
maa_new = hive.run("""
select 
    wiki as database_code,
    sum(monthly_active_administrators) / 12 as monthly_active_administrators
from (
    select
        wiki_db as wiki,
        substr(log_timestamp, 1, 6) as month,
        count(distinct log_actor) as monthly_active_administrators
    from wmf_raw.mediawiki_logging
    where
        log_type in ("block", "delete", "protect", "rights") and
        -- Omit the "delete_redir", "move_prot", and "autopromote" actions, which can be done by regular users
        log_action not in ("autopromote", "delete_redir", "move_prot") and
        log_timestamp >= "{start}" and
        log_timestamp < "{end}" and
        snapshot = "{snapshot}" and
        wiki_db in ("enwiki", "dewiki", "eswiki", "commonswiki")
    group by wiki_db, substr(log_timestamp, 1, 6)
) mae
group by wiki
""".format(start="201808", end="201908", snapshot="2019-07"))

Excluding those actions, the numbers are much lower:

In [45]:
maa_new = maa_new.set_index("database_code")
maa_cur.merge(maa_new, on="database_code", suffixes=("_old", "_new")).style.format("{:.0f}")

database_code,monthly_active_administrators_old,monthly_active_administrators_new
commonswiki,182,164
dewiki,448,132
eswiki,189,55
enwiki,1562,426
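As a quick check on the size of the overcount, the ratio of old to new figures can be computed directly from the table above. This is only arithmetic on the displayed (rounded) values, so the ratios are approximate:

```python
# Values copied from the comparison table above.
old = {"commonswiki": 182, "dewiki": 448, "eswiki": 189, "enwiki": 1562}
new = {"commonswiki": 164, "dewiki": 132, "eswiki": 55, "enwiki": 426}

# How many times larger the old count is than the new one, per wiki.
overcount = {wiki: round(old[wiki] / new[wiki], 2) for wiki in old}

for wiki, ratio in overcount.items():
    print(f"{wiki}: {ratio}x")
```

On these numbers the old query overcounts by roughly 3.4 to 3.7 times on dewiki, eswiki, and enwiki, but only by about 1.1 times on commonswiki.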


In [46]:
maa_sample = hive.run("""
select
    actor_name as user_name,
    coalesce(ug_group = "sysop", false) as is_admin,
    count(*) as admin_actions,
    wiki_db as wiki
from wmf_raw.mediawiki_logging log
inner join wmf_raw.mediawiki_private_actor actor
on
    log_actor = actor_id and
    log.wiki_db = actor.wiki_db and
    log.wiki_db in ("commonswiki", "dewiki", "enwiki", "eswiki") and
    log_type in ("block", "protect", "delete", "rights") and
    -- Regular users can move pages over redirects, move semi-protected pages, or be autopromoted
    log_action not in ("autopromote", "delete_redir", "move_prot") and
    log_timestamp between "201901" and "201902" and
    log.snapshot = "2019-07" and
    actor.snapshot = "2019-07"
left join wmf_raw.mediawiki_user_groups groups
on
    actor_user = ug_user and
    actor.wiki_db = groups.wiki_db and
    ug_group = "sysop" and
    groups.snapshot = "2019-07"
group by actor_name, log.wiki_db, ug_group
""")

Looking at the users captured by this definition in January 2019, the vast majority (93% on the English Wikipedia and roughly 96% or more elsewhere) are currently administrators.

In [47]:
def admins_agg(group):
    aggs = {
        "active apparent admins": len(group),
        "true admin proportion": len(group.query("is_admin")) / len(group)
    }
    
    return pd.Series(aggs, index=aggs.keys())

maa_sample.groupby("wiki").apply(admins_agg)

wiki,active apparent admins,true admin proportion
commonswiki,168.0,0.988095
dewiki,141.0,0.957447
enwiki,439.0,0.931663
eswiki,58.0,0.965517
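The proportions above can be turned back into head counts of non-admins per wiki. This sketch just multiplies the two columns of the table; the rounding step assumes the displayed proportions are precise enough to recover whole numbers:

```python
# (active apparent admins, true admin proportion), copied from the table above.
summary = {
    "commonswiki": (168, 0.988095),
    "dewiki": (141, 0.957447),
    "enwiki": (439, 0.931663),
    "eswiki": (58, 0.965517),
}

# Non-admins = total users counted * share of them who are not admins.
non_admin_counts = {
    wiki: round(total * (1 - proportion))
    for wiki, (total, proportion) in summary.items()
}

print(non_admin_counts)
print(sum(non_admin_counts.values()))
```

These are counts of user and wiki pairs, so a user active on more than one wiki would be counted once per wiki.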


The non-admins counted by our query fall into three groups:
* **Former administrators**: 28bytes, Alex Shih, Ansh666, Ariel Cetrone (WMDC), Beetstra, Boing! said Zebedee, DaB., Deor, DoRD, Enigmaman, Euryalus, Floquenbeam, Fram, GB fan, Gruznov, Kenny McFly, Kusma, Magiers, MSGJ, Nakon, Neozoon, Od Mishehu, Regiomontanus, Renamed user mou89p43twvqcvm8ut9w3, Spartaz, Voice of Clam, WJBscribe
* **Users with a limited right to change others' rights**: 
 * **Event coordinators on English Wikipedia**: Andrew Davidson, Another Believer, Delphine Dallison, Lirazelf, Wugapodes
 * **Image reviewer on Commons**: Nemo bis
* **Global maintainers**: 
 * **Stewards**: -revi, Ajraddatz, HakanIST
 * **Global deleter**: Pathoschild
 * **Global interface editing script**: MediaWiki default

In [73]:
non_admins = maa_sample.query("~is_admin").reset_index(drop=True)
non_admins.head()

,user_name,is_admin,admin_actions,wiki
0,-revi,False,3,enwiki
1,28bytes,False,3,enwiki
2,Ajraddatz,False,2,enwiki
3,Alex Shih,False,1,enwiki
4,Andrew Davidson,False,7,enwiki
