In [1]:
import wmfdata as wmf

## Non-main-namespace page views
In July 2022, we fixed a Wikipedia Preview bug ([T313596](https://phabricator.wikimedia.org/T313596)) that caused previews of non-main-namespace pages to go uncounted.

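Now that the Wikimedia sites that were linking to a lot of such pages have been updated to the new version, does the data look more correct? We would expect to see a _decrease_ in the clickthrough rate, since the rate is pageviews divided by previews and the previously uncounted previews now add to the denominator.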

Old data (June 2022, the month before the fix):


In [2]:
wmf.presto.run("""
    SELECT
        referer_host,
        device_type,
        SUM(previews) AS previews,
        SUM(pageviews) AS pageviews,
        CAST(SUM(pageviews) AS REAL) / CAST(SUM(previews) AS REAL) AS clickthrough_rate
    FROM wmf_product.wikipediapreview_stats
    WHERE
        year = 2022
        AND month = 6
        AND referer_host IN ('diff.wikimedia.org', 'wikimediafoundation.org')
    GROUP BY
        referer_host,
        device_type
    ORDER BY
        referer_host,
        device_type
""")

Unnamed: 0,referer_host,device_type,previews,pageviews,clickthrough_rate
0,diff.wikimedia.org,non-touch,2740,16,0.005839
1,diff.wikimedia.org,touch,493,298,0.604463
2,wikimediafoundation.org,non-touch,2958,27,0.009128
3,wikimediafoundation.org,touch,589,1136,1.928693


New data (February 2023):

In [2]:
wmf.presto.run("""
    SELECT
        referer_host,
        device_type,
        SUM(previews) AS previews,
        SUM(pageviews) AS pageviews,
        CAST(SUM(pageviews) AS REAL) / CAST(SUM(previews) AS REAL) AS clickthrough_rate
    FROM wmf_product.wikipediapreview_stats
    WHERE
        year = 2023
        AND month = 2
        AND referer_host IN ('diff.wikimedia.org', 'wikimediafoundation.org')
    GROUP BY
        referer_host,
        device_type
    ORDER BY
        referer_host,
        device_type
""")

Unnamed: 0,referer_host,device_type,previews,pageviews,clickthrough_rate
0,diff.wikimedia.org,non-touch,4201,20,0.004761
1,diff.wikimedia.org,touch,416,234,0.5625
2,wikimediafoundation.org,non-touch,7976,50,0.006269
3,wikimediafoundation.org,touch,1812,995,0.549117
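
The two queries above differ only in the `year` and `month` filters, so they could be generated from one template. The following is a minimal sketch of that idea, not part of the original analysis: the helper name `clickthrough_query` and the `HOSTS` constant are hypothetical, and the function only builds the SQL string, which would then be passed to `wmf.presto.run` as in the cells above.

```python
def clickthrough_query(year, month, hosts):
    """Build the clickthrough query used above for one year/month partition.

    Hypothetical helper: it only returns a SQL string and runs nothing.
    """
    # Cast to int so that only numeric values are interpolated into the SQL.
    year, month = int(year), int(month)
    # Quote each host as a SQL string literal, doubling embedded single quotes.
    host_list = ", ".join("'" + h.replace("'", "''") + "'" for h in hosts)
    return f"""
    SELECT
        referer_host,
        device_type,
        SUM(previews) AS previews,
        SUM(pageviews) AS pageviews,
        CAST(SUM(pageviews) AS REAL) / CAST(SUM(previews) AS REAL) AS clickthrough_rate
    FROM wmf_product.wikipediapreview_stats
    WHERE
        year = {year}
        AND month = {month}
        AND referer_host IN ({host_list})
    GROUP BY
        referer_host,
        device_type
    ORDER BY
        referer_host,
        device_type
    """


HOSTS = ["diff.wikimedia.org", "wikimediafoundation.org"]
old_sql = clickthrough_query(2022, 6, HOSTS)  # same filters as the "old data" cell
new_sql = clickthrough_query(2023, 2, HOSTS)  # same filters as the "new data" cell
```

Keeping the query text in one place makes it harder for the two periods to drift apart if the query is edited later.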


Unfortunately, neither the old data nor the new looks correct: the non-touch clickthrough rates are below 1% in both periods. Most of that is because the most common pageview path on non-touch devices is not instrumented in the version these sites are using ([T317171](https://phabricator.wikimedia.org/T317171)). However, we do see a substantial increase in previews at wikimediafoundation.org, which has some particularly prominent non-main-namespace links: from 3,547 to 9,788 across both device types. Its touch clickthrough rate also fell from 1.93 (more pageviews than previews) to 0.55. So yes, it seems we are seeing the effects of the fix.
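
To make the comparison explicit, here is a small sketch that recomputes the change between the two periods from the figures in the result tables above. The numbers are copied by hand from those tables; the `compare` function and the dictionary layout are illustrative assumptions, not part of the original notebook.

```python
# (previews, pageviews) per (referer_host, device_type), copied from the tables above.
old = {
    ("diff.wikimedia.org", "non-touch"): (2740, 16),
    ("diff.wikimedia.org", "touch"): (493, 298),
    ("wikimediafoundation.org", "non-touch"): (2958, 27),
    ("wikimediafoundation.org", "touch"): (589, 1136),
}
new = {
    ("diff.wikimedia.org", "non-touch"): (4201, 20),
    ("diff.wikimedia.org", "touch"): (416, 234),
    ("wikimediafoundation.org", "non-touch"): (7976, 50),
    ("wikimediafoundation.org", "touch"): (1812, 995),
}


def compare(old, new):
    """Return, per host and device type, the preview growth and both clickthrough rates."""
    rows = []
    for key in sorted(old):
        old_previews, old_pageviews = old[key]
        new_previews, new_pageviews = new[key]
        rows.append({
            "host": key[0],
            "device": key[1],
            "preview_ratio": new_previews / old_previews,  # > 1 means more previews counted
            "old_ctr": old_pageviews / old_previews,
            "new_ctr": new_pageviews / new_previews,
        })
    return rows


for row in compare(old, new):
    print(
        f"{row['host']:<24} {row['device']:<10} "
        f"previews x{row['preview_ratio']:.2f}  "
        f"CTR {row['old_ctr']:.3f} -> {row['new_ctr']:.3f}"
    )
```

With these figures, the clickthrough rate is lower in February 2023 than in June 2022 for all four host and device combinations, and previews at wikimediafoundation.org grew by roughly a factor of three on touch devices.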