Skip to content

Add analytics pageview module, spec, and tests#343

Closed
zzmoss wants to merge 1 commit intowikimedia:masterfrom
zzmoss:master
Closed

Add analytics pageview module, spec, and tests#343
zzmoss wants to merge 1 commit intowikimedia:masterfrom
zzmoss:master

Conversation

@zzmoss
Copy link
Copy Markdown

@zzmoss zzmoss commented Sep 18, 2015

Bug on phabricator: T107053

These would be the frontend to backend mappings we need on all wikis (basically just making the {domain} map implicitly to the {project} parameter on our backend:

  /{module:stats}/views/v1/per-article/{access}/{agent}/{article}/{granularity}/{start}/{end}
    get:
      x-request-handler:
        - get_view_stats_per_article
            request:
              uri: http://wikimedia.org/stats/views/v1/per-article/{domain}/{access}/{agent}/{article}/{granularity}/{start}/{end}

  /{module:stats}/views/v1/per-project/{access}/{agent}/{granularity}/{start}/{end}
    get:
      x-request-handler:
        - get_view_stats_per_project
            request:
              uri: http://wikimedia.org/stats/views/v1/per-project/{domain}/{access}/{agent}/{granularity}/{start}/{end}

  /{module:stats}/views/v1/top/{access}/{year}/{month}/{day}
    get:
      x-request-handler:
        - get_view_top_stats
            request:
              uri: http://wikimedia.org/stats/views/v1/top/{domain}/{access}/{year}/{month}/{day}

And for the global wikimedia.org we could just pass whatever comes after stats/* straight through, there would be no rewriting. This is where people could pass values such as "all-projects" for the {project} parameter.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to define a separate config.test.analytics.yaml, as it's not picked up anywhere, therefor tests are failing in travis. Just adding

    /{module:pageviews}:
        x-modules:
            - name: pageviews
              version: 1.0.0
              type: file

to the config.test.yaml would be enough.

@d00rman
Copy link
Copy Markdown
Contributor

d00rman commented Sep 21, 2015

I'm quite confused. This PR seems to aim at integrating the pageviews logic completely into the upstream RESTBase. Alas, you will have your own RESTBase instance set up in the analytics cluster.

If you want to make your module publicly available, I think it would better to package and publish it as a separate npm module and document it there (cf the Cassandra module and the SQLite module for examples of stand-alone RESTBase modules).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case a string is actually a enum, you could add a swagger enum to get a nice choice box in resulting docs.

@Pchelolo
Copy link
Copy Markdown
Contributor

One more confusion is that I didn't find any references to the backend analytics service, so it's not clear to me, how the data would be pulled into cassandra storage?

Also, please not that all new tests are failing on travis.

@d00rman
Copy link
Copy Markdown
Contributor

d00rman commented Sep 21, 2015

it's not clear to me, how the data would be pulled into cassandra storage?

To answer your question @Pchelolo and clarify further my previous comment, there will soon be an additional RESTBase instance in WMF production to fulfil Analitics team's needs (pageviews API); cf. Gerrit patch 231574 which adds its production configuration. The idea is that the principal RESTBase instance only exposes the public API and simply forwards all of the requests to the internal pageviews RESTBase instance. As such, the only thing that should be in this PR is the specification that exposes the public API endpoints and their connection to the pageviews RESTBase instance.

@d00rman
Copy link
Copy Markdown
Contributor

d00rman commented Sep 21, 2015

If you want to make your module publicly available, I think it would better to package and publish it as a separate npm module and document it there

Follow-up: AFAIC, I'm not against adding mods/pageviews.js (and related tests) per se. I do mind, however, putting additional example configs in the repo, as I really think this will become messy quite soon.

Regarding the public routes themselves, the v1 should be evicted, as the final URI will contain it twice. Also, as per T103811, the three routes have different scopes:

  • /{module:stats}/views/v1/per-article/{access}/{agent}/{article}/{granularity}/{start}/{end} should really be /{module:page}/title/{title}/stats/{access}/{agent}/{granularity}/{start}/{end} so that we end up with a URI like https://en.wikipedia.org/api/rest_v1/page/title/Foobar/stats/{access}/{agent}/{granularity}/{start}/{end}
  • /{module:stats}/views/v1/per-project/{access}/{agent}/{granularity}/{start}/{end} should become /{module:data}/stats/views/{access}/{agent}/{granularity}/{start}/{end}, which would effectively give clients an URI like https://en.wikipedia.org/api/rest_v1/data/stats/views/{access}/{agent}/{granularity}/{start}/{end}
  • /{module:stats}/views/v1/top/{access}/{year}/{month}/{day} should, similarly to the previous one become /{module:data}/stats/views/top/{access}/{year}/{month}/{day} and should be added only to the global, domain-independent domain spec (cf PR Mathoid: Add public and back-end endpoints #339 which proposes wikimedia.org as the global domain), so if a client wants to obtain top views for all projects, they'd compose a URI like https://wikimedia.org/data/stats/views/top/{access}/{year}/{month}/{day} (also, because this route will be available only in the global domain, I wonder where's the added value in having top in the route at all)

@d00rman
Copy link
Copy Markdown
Contributor

d00rman commented Sep 21, 2015

also let's cc @wikimedia/analytics and @wikimedia/services here :)

@d00rman
Copy link
Copy Markdown
Contributor

d00rman commented Sep 22, 2015

Closing in favour of #350

@d00rman d00rman closed this Sep 22, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

4 participants