# Signpost Article Linked Views

Sometimes *Signpost* articles get linked to by mainstream media (usually tech media). This short companion notebook studies what happens to their viewership when they are.

In [30]:
from pageviews import PageviewsClient
import urllib.parse
import mwapi
import arrow
import datetime


def viewcounts(article_name, start=None, end=None):
    """
    Fetches daily en.wikipedia viewcounts for an article, ordered by date.
    """
    article_name = article_name.replace(' ', '_')
    parsed_article_name = urllib.parse.quote(article_name).replace('/', '%2F')
    p = PageviewsClient().article_views("en.wikipedia",
                                        [parsed_article_name],
                                        access="all-access",
                                        # access="users",
                                        granularity="daily",
                                        start=start,
                                        end=end)
    # TODO: Fix by pre-padding with 0s if output is below 15 length.
    return [p[key][article_name] for key in sorted(p.keys())]
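
The TODO above mentions pre-padding short outputs with zeros. A minimal sketch of what that could look like follows; the helper name `pad_viewcounts` and its parameters are hypothetical, and treating missing days (`None`) as zero views is an assumption rather than something the pageviews client promises.

```python
def pad_viewcounts(counts, min_length=15, fill=0):
    """Left-pad a list of daily counts with `fill` up to `min_length` entries."""
    # Assumption: days without data may come back as None; count them as `fill`.
    cleaned = [fill if c is None else c for c in counts]
    # Number of entries to prepend (never negative).
    missing = max(0, min_length - len(cleaned))
    return [fill] * missing + cleaned


# Short series: two zeros are prepended and the None becomes 0.
padded = pad_viewcounts([12, None, 7], min_length=5)
```

Series that are already long enough pass through with only the `None` cleanup applied.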

In [15]:
import json
import plotly.plotly as py
import plotly.graph_objs as go
import plotly.tools as tls

# Set my plotly credentials.
with open('plotly_credentials.json') as f:
    data = json.load(f)['credentials']
tls.set_credentials_file(username=data['username'], api_key=data['key'])

In [41]:
example_viewcounts = viewcounts("Wikipedia:Wikipedia Signpost/2015-12-09/Op-ed", start="2015121000", end="2015123000")

trace1 = go.Scatter(
    # One date label per returned count, so x and y always have the same length.
    x = [(arrow.get("2015-12-10-00") + datetime.timedelta(days=n)).format("YYYY-MM-DD")
         for n in range(len(example_viewcounts))],
    y = example_viewcounts,
    mode = 'lines',
    name = 'article views'
)

layout = go.Layout(
    title='Typical Signpost Viewership Curve'
)

data = [trace1]

fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='example-signpost-views-curve')

This is the viewership curve for the fairly popular op-ed "[Wikidata: Knowledge from different points of view](https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-12-09/Op-ed)", tracked from pre-publication through publication to post-publication. The pattern is representative of what most *Signpost* viewership curves look like.

In [40]:
example_viewcounts = viewcounts("Wikipedia:Wikipedia Signpost/2016-01-13/News and notes", start="2016011500", end="2016013000")

trace1 = go.Scatter(
#     x = list(range(len(example_viewcounts))),
    # One date label per returned count, so x and y always have the same length.
    x = [(arrow.get("2016-01-15-00") + datetime.timedelta(days=n)).format("YYYY-MM-DD")
         for n in range(len(example_viewcounts))],
    y = example_viewcounts,
    mode = 'lines',
    name = 'article views'
)

layout = go.Layout(
    title='Not A Typical Signpost Viewership Curve'
)

data = [trace1]

fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='example-linked-signpost-views-curve')

What happened to "[Community objects to board trustee](https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2016-01-13/News_and_notes)" here? This graph is a tale of two curves: the original peak is the Wikipedia community (i.e. the *Signpost's* usual readers) picking up on the story immediately after publication; the much larger peak is the mainstream media picking up on the news. What was already an extremely popular article by *Signpost* standards absolutely exploded when the story was picked up and recirculated by several outlets linking back to the *Signpost*: [ZDNet](http://www.zdnet.fr/actualites/un-membre-du-ca-de-la-fondation-wikimedia-mis-en-cause-pour-son-passe-chez-google-39831606.htm), [The Register](http://www.theregister.co.uk/2016/01/27/trust_me_pleads_wikipedia_former_google_man/) (and [The Register again](http://www.theregister.co.uk/2016/01/28/wmf_geshuri_steps_down/?mt=1454029117421)), and, most critically and centrally, [Fortune](http://fortune.com/2016/01/26/wikipedia-board-geshuri/).
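
To make the "two curves" reading a little more concrete, here is a rough sketch of how the two peaks could be picked out of a daily series. The function name `find_peaks` is hypothetical, and the numbers are made up to resemble the shape described above; they are not real pageview data.

```python
def find_peaks(counts):
    """Return indices of local maxima in a list of daily counts."""
    peaks = []
    # Endpoints are skipped because they have only one neighbour to compare against.
    for i in range(1, len(counts) - 1):
        if counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]:
            peaks.append(i)
    return peaks


# Made-up numbers: a small early peak, a lull, then a much larger late peak.
illustrative_counts = [900, 1500, 1100, 700, 500, 450, 400, 2000, 6000, 3500, 1200]
peak_days = find_peaks(illustrative_counts)
```

On a real series the first peak may fall on the very first day of the window, in which case this simple version would miss it; widening the date range by a day on each side is one way around that.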

Let's look at a few more like this, or at least the ones we're aware of. For our purposes it's helpful to compare how and from where we were linked against how many views we got out of it.

Since the pageview API data doesn't go back very far, we first need to define a similar API caller against the older, venerable [stats.grok.se](http://stats.grok.se/) API.

In [None]:
# https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-01-21/Anniversary
# http://quarry.wmflabs.org/query/7131

In [None]:
def grok_viewcounts(article):
    raise NotImplementedError("stats.grok.se caller not written yet")
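
The caller above is still a stub. As a rough sketch of the pieces it might be built from, the block below assembles a per-month URL and turns a decoded response into a date-ordered list. It assumes the service exposes a `/json/<lang>/<YYYYMM>/<title>` endpoint that returns a `daily_views` mapping of dates to counts; that layout, the title encoding, and the helper names `grok_url` and `parse_grok_payload` are all assumptions, and nothing here actually contacts the service.

```python
import json
import urllib.parse


def grok_url(article_name, year_month, lang="en"):
    """Build a stats.grok.se-style URL for one month of one article (assumed layout)."""
    # Same title handling as viewcounts(): underscores for spaces, '/' percent-encoded.
    title = urllib.parse.quote(article_name.replace(" ", "_"), safe="")
    return "http://stats.grok.se/json/{}/{}/{}".format(lang, year_month, title)


def parse_grok_payload(payload):
    """Turn a decoded JSON payload into a date-ordered list of daily counts."""
    daily = payload["daily_views"]  # assumed key: {"YYYY-MM-DD": count, ...}
    return [daily[day] for day in sorted(daily)]


# Hand-written payload in the assumed shape; the numbers are placeholders.
sample = json.loads('{"daily_views": {"2015-01-22": 5, "2015-01-21": 3}}')
counts = parse_grok_payload(sample)
```

Keeping URL construction and payload parsing separate means the parsing can be checked offline against a saved response, whatever ends up doing the fetching.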

## Further examples

In [None]:
example_viewcounts = viewcounts("Wikipedia:Wikipedia Signpost/2016-01-13/News and notes", start="2016011500", end="2016013000")

trace1 = go.Scatter(
#     x = list(range(len(example_viewcounts))),
    # One date label per returned count, so x and y always have the same length.
    x = [(arrow.get("2016-01-15-00") + datetime.timedelta(days=n)).format("YYYY-MM-DD")
         for n in range(len(example_viewcounts))],
    y = example_viewcounts,
    mode = 'lines',
    name = 'article views'
)

layout = go.Layout(
    title='Not A Typical Signpost Viewership Curve'
)

data = [trace1]

fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='example-2-linked-signpost-views-curve')