Use Piwik to measure how Piwik app is used by the community (opt-in) #4589

Closed
hpvd opened this Issue · 32 comments

4 participants

@hpvd

just a weird idea:
track yourself / track piwik usages
would be interesting

  • to show the power of piwik e.g. for every first-time user
  • to show what is possible if integrated in the best way
  • to compare different piwik users
  • to learn what new features may be needed for tracking complex sites such as the piwik backend
  • to optimize usability
  • ...

With event tracking, which is on the way, it should be possible to get great insight into what one has used piwik for, even with as many on-page actions as piwik has.

enable via a simple checkbox in settings
-> automatic: new website "My Piwik" and start tracking immediately

  • everything should be configured/set up in the best way
  • events have to be implemented for all piwik standard actions
  • possibility to share the results with piwik developers
  • ....

what do you think ??

@mattab
Piwik Open Source Analytics member

Thanks for the suggestion!

We would love to measure Piwik usage with Piwik itself. For example measure clicks on menus, which features are used, and more. The data would be useful to know which features are popular and maybe make better product decisions.

Note: there is a setting in the config file:

; if set to > 0, a Piwik tracking code will be included in the Piwik UI footer and will track visits, pages, etc.
; data will be stored for idSite = enable_measure_piwik_usage_in_idsite
; this is useful for Piwik developers as an easy way to create data in their local Piwik
enable_measure_piwik_usage_in_idsite = 0

however it only tracks a basic pageview when Piwik is loaded...

@mattab
Piwik Open Source Analytics member

see also: #1050 Plugin: opt-in to share aggregate daily data with centralized server

@hpvd hpvd added this to the 2.x - The Great Piwik 2.x Backlog milestone
@mattab mattab removed the Major label
@mattab mattab added the Major label
@mattab mattab modified the milestone: Mid term, Short term
@mattab mattab changed the title from Use Piwik to track how Piwik is used! Share data publicly with community to Use Piwik to measure how Piwik app is used by the community (opt-in)
@mattab
Piwik Open Source Analytics member

This feature is key to our Product developments being more data driven.

note: related link

@mattab mattab modified the milestone: Short term, Mid term
@mattab
Piwik Open Source Analytics member

We get a lot of interesting feedback from the like / dislike buttons. As part of measuring how Piwik is used, we can send this feedback to the Piwik install, for example tracked as Events with the full user feedback message & anonymous metadata (eg. Piwik version)

Example of what the feedback email looks like currently:

Feature: Evolution over the period
Like: Yes
Feedback:
I like it. But it would be great if it would show only the selected period.

When I select august 2015 it shows the period beginning with september 2013 but all I wanted to see is a detailed view of august 2015.

Piwik 2.14.3
IP: a.X.y.z
URL: http://hello/index.php?module=CoreHome&action=index&idSite=1&period=month&date=2015-08-01
@quba
Piwik Open Source Analytics member

What about making it possible to track Piwik usage using a separate Piwik instance? E.g. an additional config option to specify the domain.

@mattab
Piwik Open Source Analytics member

Great idea @quba

@mattab
Piwik Open Source Analytics member

Here are my thoughts on the topic.

Piwik features to showcase with awesome real data

  • Event tracking: click on report footer icons, clicks on calendar buttons, clicks on Segment editor, Website selector, Dashboard & Widgets selector, click in Transitions, Overlay, Row Evolution, click on graph metric picker, customise dashboard (add/delete/move widgets)...
  • Content tracking: when listing entities, individual reports in the page, widgets on dashboard...
  • Custom variables: User permission level...
  • Site Search: track awesome search, track all in-report search?, ...
  • User ID (track login as User id)
  • Use Site speed to find out how fast Piwik renders pages
  • All standard Piwik metrics are very interesting for example "Session length" (visit duration), Active users, etc.

Tracking quantitative data?

How could we track Piwik version, number of reports, number of segments, number of websites, number of users, number of non-core plugins enabled, number of core plugins disabled, number of visits / actions per website, etc.? Not sure whether Events, Custom Variables, or something else is best suited.

Notes & questions

  • what else do we want to track?
  • How do we track how Administration section is used, how form elements (#562) are clicked?

Keeping usage data anonymous

  • Anonymise 3 bytes of IP
  • Filter data sent as Events etc. and remove any token_auth or URL from it?
  • do not track "input" field values
  • do not track Goal names, website names, etc.
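
As an illustration of what "anonymise N bytes of IP" means in the list above, here is a minimal Python sketch that zeroes the trailing bytes of an IPv4 address. The function name `anonymize_ipv4` is hypothetical and this is not Piwik's own implementation, only a sketch of the idea.

```python
import ipaddress


def anonymize_ipv4(ip: str, bytes_to_mask: int) -> str:
    """Zero out the last `bytes_to_mask` bytes of an IPv4 address.

    Hypothetical helper for illustration; not taken from Piwik.
    """
    if not 0 <= bytes_to_mask <= 4:
        raise ValueError("bytes_to_mask must be between 0 and 4")
    packed = ipaddress.IPv4Address(ip).packed      # 4 raw bytes
    kept = packed[: 4 - bytes_to_mask]             # leading bytes to keep
    masked = kept + bytes(bytes_to_mask)           # pad with zero bytes
    return str(ipaddress.IPv4Address(masked))
```

Masking 3 bytes keeps only the first octet (`192.168.12.34` becomes `192.0.0.0`), while masking 2 bytes keeps the first two (`192.168.0.0`).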

Release strategy

  • Add some hooks into core JS file where needed to measure Piwik
  • Create a plugin, released on Marketplace, providing this "Measure Piwik" feature
  • Having a plugin lets us improve it without having to wait for core release
  • Plugin setting: "Track data into another Piwik instance (URL)" so that the plugin can be enabled, but only send the data to a centralised Piwik
  • (optional) Could the plugin include a new Custom Type "Piwik app" instead of using "Website"?
    • (not sure if this works well in 2.X or we need to wait 3.0)
    • a "Piwik app" can then be added (instead of a Website)
    • this app could have pre-set "Goals" ("Add a new website", "Add a new user", "Install plugin"...)

For example, maybe we can spend one day or two on this and try to track as many relevant things as possible, and publish the plugin on the Marketplace?

Excited to think that we may work on this soon!

@mattab
Piwik Open Source Analytics member

Our goal is:

  • Learn how Piwik is used, popular features, un-used features
  • Learn about a sample of Piwik users: are they power users? new users? high traffic?
  • Showcase some Piwik features on the demo eg. Event, Content Tracking
  • and more possibilities with this data!
  • we are careful not to track too much and only track minimum necessary for useful analysis
@mattab mattab modified the milestone: 2.15.0, Short term
@tsteur tsteur self-assigned this
@tsteur
Piwik Open Source Analytics member

Plugin setting: "Track data into another Piwik instance (URL)" so that the plugin can be enabled, but only send the data to a centralised Piwik

We'd also need an IdSite. I think it would be nice to allow people to track to up to three Piwiks.

  • A checkbox "send usage to Piwik"
  • One setting where they can track into their "own" Piwik so they only have to enter an idSite (if #8641 was implemented we could even let them add a site with just one click and get idSite manually but not needed for MVP)
  • One setting where they can enter any URL + idSite

I think it's good to let people also track it into their own instance easily so they can as well see what we track (if they want). No matter whether we make this anonymized data public or not. It is more transparent this way and maybe also useful to some of their users.

Update: Implementation can be actually problematic since there might be different Piwik versions used!

@tsteur
Piwik Open Source Analytics member

Should we keep existing feature enable_measure_piwik_usage_in_idsite in config?

@mattab
Piwik Open Source Analytics member

Should we keep existing feature enable_measure_piwik_usage_in_idsite in config?

no let's remove

@hpvd

@tsteur
with "would be nice to allow people to track to up to three Piwiks"
you talk about tracking "three different piwiks"?
Or report tracking of one Piwik to three different places (Piwiks)?
For latter:

  • sending track data to "your own Piwik" -> to get insights into how people on your own team use Piwik, e.g. to identify education and training needs
  • sending track data to "official Piwik Team / Community" -> to get insights into how people in general use Piwik, to optimize the direction of further development of piwik
@tsteur
Piwik Open Source Analytics member

report tracking of one Piwik to three different places (piwiks)

I meant this one (each one would be optional, tracking to Piwik's Piwik would be enabled by default I think)

@hpvd

+1 / thumbs up!

@tsteur
Piwik Open Source Analytics member

It would be really nice to know which APIs are used how often as well. We could track this via PiwikTracker or with JS. We should not slow down a normal API request though. Instead we could write them into a file or database and at some point send one bulk request with all of them together. The time (at what time was which API called) etc would not be important for us so we would not need to send a superuser token or so.

@tsteur
Piwik Open Source Analytics member

Tracking API calls will let us know which APIs are used often and which are maybe more important when it comes to bugfixing or performance improvements. And as we will use the API directly more and more, we won't have to hook into all kinds of events in the API such as delete user, add user etc.

@tsteur
Piwik Open Source Analytics member

I wonder if we should let users know that they are being tracked and whether we need an opt out link? Eg imagine a super user installs this plugin but doesn't notify users that this instance is being tracked. This can be problematic especially when there are only one or two users that use Piwik. A "boss" could basically watch whether employees are working in there and what they are doing.

We could maybe add a user setting to disable tracking for a specific user? By clicking on "Username => Plugin settings" users get a chance to disable it / opt out.

BTW: Plugin is developed in https://github.com/piwik/plugin-AnonymousPiwikUsageMeasurement

@tsteur tsteur added a commit that referenced this issue
@tsteur tsteur refs #4589 removed enable_measure_piwik_usage_in_idsite feature as it…
… will be replaced by a plugin, highlight introduction in plugin settings
9372889
@tsteur
Piwik Open Source Analytics member

Here's another update

I implemented content tracking for the dashboard but prefer to rather do this with pages that are used less often and that provide us more value like Marketplace and API page. In dashboards there are rarely any interactions. I'm currently tracking an interaction for closing, refreshing, minimizing and maximizing which kinda makes sense but would prefer to track them as events. Every time a user opens the dashboard we'd send a big request containing the content impressions which can be a bit annoying and it can slow the user experience down (detection of content blocks etc.). Rendering and fetching all dashboards is already resource intensive enough. If one configured to track the data to 2 or 3 instances we even do it multiple times slowing it further down. As we do not have unique content impressions yet the impressions would be not really useful for us anyway I think. Marketplace and API would be way more valuable for us.

If we still want to track the used widgets in dashboard I recommend that we setup a daily or weekly task that tracks all used widgets via PiwikTracker. The information which widgets are used and which ones are used together in one dashboard is pretty valuable for us as we could create a plugin providing more default dashboards based on the information which plugins are often used together.

I will possibly also try to find a solution to track API calls via PiwikTracker in V1 as it would be really helpful for us and users. I wouldn't track them at the same time the API call is made, I'd rather write them somewhere to disk or DB and track them later all at once via a scheduled task.

FYI:

  • Referrer will be always removed and set to ''
  • I anonymize all URL parameters apart from a few whitelisted ones ('module', 'action', 'idSite', 'idDashboard', 'period', 'date', 'popover', 'idGoal', 'pluginName') to make sure we do not track any private or sensitive data. If a parameter is not whitelisted (like token_auth and nonce) I "anonymize" them by setting the value to XYZ.
  • I also make sure to "anonymize" idSite/idGoal/idDashboard (as it could allow to identify specific installations under circumstances), and remove all information from popover (as it can include visitorIds, URLs) apart from the handler.
  • We cannot really enable link tracking / search / JS error tracking as it might include private data. We might be able to add it later but not sure.
  • I turn URLs into hierarchical urls eg /$module/$action/?idsite=1&...
  • I track a page scoped custom variable for module, action and popover (if used)
  • I track a visit scope custom variable for Piwik version and PHP version
  • We may track 3rd party module/action names which can disclose private plugin names, controller actions, report names etc. Such names may contain private data that allows to identify a specific installation on demo.piwik.org. We should mention this in the plugin description so users are aware of it.
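
To make the URL rules in the list above concrete, here is a minimal Python sketch of whitelist-based anonymization. It is not the plugin's code. The function name `anonymize_url`, the fixed replacement value `"1"` for idSite/idGoal/idDashboard, and the assumption that the popover handler is the part before the first `:` are all illustrative guesses; only the whitelist, the `XYZ` placeholder and the `/$module/$action/?...` shape come from the comment.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Parameters whose values may be kept, as listed in the comment above.
WHITELIST = {"module", "action", "idSite", "idDashboard", "period",
             "date", "popover", "idGoal", "pluginName"}
# IDs that could identify an installation (replacement value is an assumption).
ID_PARAMS = {"idSite", "idGoal", "idDashboard"}
PLACEHOLDER = "XYZ"


def anonymize_url(url: str) -> str:
    """Return a hierarchical, anonymized form of a Piwik UI URL (sketch)."""
    params = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    module = action = ""
    cleaned = []
    for name, value in params:
        if name == "module":
            module = value
        elif name == "action":
            action = value
        elif name not in WHITELIST:
            cleaned.append((name, PLACEHOLDER))             # e.g. token_auth, nonce
        elif name in ID_PARAMS:
            cleaned.append((name, "1"))                     # assumed fixed value
        elif name == "popover":
            cleaned.append((name, value.split(":", 1)[0]))  # keep handler only (assumed format)
        else:
            cleaned.append((name, value))
    # Host and path are dropped; module and action become the path.
    return f"/{module}/{action}/?{urlencode(cleaned)}"
```

With this sketch, `http://hello/index.php?module=CoreHome&action=index&idSite=7&token_auth=abc123` would become `/CoreHome/index/?idSite=1&token_auth=XYZ`.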
@mattab
Piwik Open Source Analytics member

It would be really nice to know which APIs are used how often as well. We could track this via PiwikTracker or with JS. We should not slow down a normal API request though. Instead we could write them into a file or database and at some point send one bulk request with all of them together. The time (at what time was which API called) etc would not be important for us so we would not need to send a superuser token or so.

Awesome idea ;-)

We could maybe add a user setting to disable tracking for a specific user? By clicking on "Username => Plugin settings" users get a chance to disable it / opt out.

Yes :+1: very nice to give option to Piwik users to opt-out individually from tracking.
Maybe we could also add a Super User setting eg. "Let users disable anonymous tracking: YES (default), No" so a Super User could specifically not let users opt out (although users would by default be able to opt out)

@mattab
Piwik Open Source Analytics member

I implemented content tracking for the dashboard but prefer to rather do this with pages that are used less often and that provide us more value like Marketplace and API page. In dashboards there are rarely any interactions.

Agreed: we don't need to measure how Widgets are viewed on dashboard.

I'm currently tracking an interaction for closing, refreshing, minimizing and maximizing which kinda makes sense but would prefer to track them as events.
[...]
As we do not have unique content impressions yet the impressions would be not really useful for us anyway I think.

:+1: because when we don't use Content Tracking for widgets, we can't benefit from the CTR (= Clicks / Impressions) provided by Content tracking. So using custom events is the better choice, as you point out!

The information which widgets are used and which ones are used together in one dashboard is pretty valuable for us as we could create a plugin providing more default dashboards based on the information which plugins are often used together.

Assuming

  • we don't want to track "Widgets views"
  • we are interested in tracking "List of widgets in dashboard"

maybe we could hook on Dashboard.saveLayout action and send the data to us, only when user changes the dashboard layout (which should be a quite rare action by users)?

FYI:

Looks like you nail the "Privacy / anonymity" part. Exciting & looking forward to seeing the reports :-)

@tsteur
Piwik Open Source Analytics member

Another idea I had:

  • Send diagnostic data to Piwik, this may contain private data and is not fully anonymized. Any tracked data won't be made public (disabled by default)

This one might include private data that is not anonymized. We'd make sure to anonymize token_auths and URLs of the host. Thinking of error messages + maybe even backtraces of exceptions etc. Possibly we'd also enable JS error logging. It would go into a different site in Piwik to not mix it with anonymous data. There's a certain risk in case of hacks though and ideally data would maybe go to a different server etc.

Possibly something for V2

@mattab
Piwik Open Source Analytics member

@tsteur sounds good for V2

@tsteur
Piwik Open Source Analytics member

Maybe we could also add a Super User setting eg. "Let users disable anonymous tracking: YES (default), No" so a Super User could specifically not let users opt out (although users would by default be able to opt out)

I added it. If a super user disables it, no user will see the setting (also won't know it's enabled).

@mattab Do you maybe have an idea of how the API calls should be tracked? I guess as events? Eg category=API, name='$module', action='$method'? If there were multiple calls to the same method, instead of sending each event separately, could I aggregate the number of calls to a specific method and send it as the value?

@tsteur
Piwik Open Source Analytics member

Update:

  • Added tracking of SegmentEditor as events
    • Selecting a segment
    • Adding a new segment
    • Saving a segment
    • Deleting a segment
  • Added tracking of some Dashboard events
    • Clicking on reset dashboard
    • Clicking on change layout
    • and all the other actions
  • Tracking click on "Download" in Personal Report as download (the filename contains the used format and a (report id % 20) to have multiple downloads in the report)
  • Tracking click on external link in "All Websites Dashboard" as outlink (anonymized as well of course, eg http://example.com/multisites/1)
  • Added content tracking for Marketplace
  • I no longer send the custom variables Piwik Version and PHP version via the JavaScript tracker as it could leak eg the PHP version to an (anonymous) user
  • Instead I send a daily tracking call via PiwikTracker containing the following infos
    • Custom variable Piwik Version
    • Custom variable PHP Version
    • Custom variable Number of total users
    • Custom variable Number of total websites
    • Custom variable Number of total segments
    • All made API calls during the day as events

I'm sending this data only once a day to keep resources to a minimum. The API calls are sent aggregated so there won't be many calls and on top bulk tracking is used. Before events are sent they are written to the database where they are stored aggregated for fast inserts. I was thinking about storing them in a file but this is problematic in a multi-server environment and if the server is not configured correctly there might be a chance that one can open this information directly via URL unless we store it similar to config.ini.php which is not really efficient.
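
The aggregation described above (one event per API method, with the number of calls as the event value) can be sketched in Python roughly as follows. This is only an illustration of the idea, not the plugin's code: the function name `aggregate_api_calls` and the dictionary keys are hypothetical, and the event layout follows the category=API, name=module, action=method suggestion from earlier in this thread.

```python
from collections import Counter


def aggregate_api_calls(calls):
    """Collapse a day's (module, method) calls into one event per method.

    `calls` is an iterable of (module, method) tuples. Each returned event
    carries the call count as its value, so repeated calls to the same
    method cost a single tracking request instead of many.
    """
    counts = Counter(calls)
    return [
        {"category": "API", "name": module, "action": method, "value": count}
        for (module, method), count in sorted(counts.items())
    ]


# Example: three recorded calls become two events.
events = aggregate_api_calls([
    ("API", "getReportMetadata"),
    ("UsersManager", "getUsers"),
    ("API", "getReportMetadata"),
])
```

The resulting events could then be sent together in one bulk tracking request by a scheduled task, keeping the cost of recording a call at request time down to a counter update.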

Todo:

  • Need to write tests for PHP and JS code to make sure anonymization works correctly.
  • Need to customize the scheduled tasks event. If I use daily it could pretty much send the data from all Piwik instances to demo.piwik.org at the same time which could result in a slow tracking performance and slow demo
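
One possible way to avoid every instance reporting at the same moment, sketched in Python purely as an illustration: derive a stable per-installation offset within the day from some installation-specific value. The thread does not say how the plugin solved this; `daily_offset_seconds` and the `install_salt` input are hypothetical names.

```python
import hashlib

SECONDS_PER_DAY = 24 * 60 * 60


def daily_offset_seconds(install_salt: str) -> int:
    """Map an installation-specific string to a fixed second of the day.

    The same salt always yields the same offset, while different
    installations spread roughly evenly across the 24 hours.
    """
    digest = hashlib.sha256(install_salt.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SECONDS_PER_DAY
```

A scheduled task could then run at midnight plus this offset instead of at a fixed time shared by all installations.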

We could in general track so much more but we will have to add things over time. Eg how often is a search used in MultiSites and how often paging etc. So far I added it to parts where it doesn't really slow down Piwik and where it doesn't send tracking requests often (eg Download in personal reports, outlink in multi sites).

@mattab I haven't implemented the hook on dashboard.saveLayout yet. Not sure what to track it as, since the plain information of used widgets is only kinda useful. It would be nice to know which widgets are used together in one dashboard etc. but not sure how to do this.

@tsteur tsteur added a commit that referenced this issue
@tsteur tsteur refs #4589 removed enable_measure_piwik_usage_in_idsite feature as it…
… will be replaced by a plugin, highlight introduction in plugin settings
8b81180
@mattab
Piwik Open Source Analytics member

@mattab Do you maybe have an idea of how the API calls should be tracked? I guess as events? Eg category=API, name='$module', action='$method'? If there were multiple calls to the same method, instead of sending each event separately, could I aggregate the number of calls to a specific method and send it as the value?

Yes, all sounds good. using "event value" as "count" sounds good. How would you aggregate the API counts, ie. over a day? over an hour?

EDIT: Ok got the answer in next comment: "All made API calls during the day as events"

@mattab
Piwik Open Source Analytics member

Todo: Need to write tests for PHP and JS code to make sure anonymization works correctly.

Covered in #8953

Need to customize the scheduled tasks event. If I use daily it could pretty much send the data from all Piwik instances to demo.piwik.org at the same time which could result in a slow tracking performance and slow demo

Good point

Looking forward to seeing the data & also making it available to anonymous so everyone can see it on the demo :+1:

@tsteur
Piwik Open Source Analytics member

Just FYI: I don't really need another issue for #8953. I wanna write them before we close this issue and release it on the marketplace. Otherwise I would not feel confident about it and would rather recommend to not release it. We need to make sure things are anonymized and do not break an installation as we hook into several places of the UI.

@tsteur
Piwik Open Source Analytics member

@mattab I wrote tests and fixed a couple of issues.

One thing that is still to be done is to enable IP anonymization on demo.piwik.org

Also noticed it would be nice to track page generation time but that won't be trivial and is rather something for v2

@mattab
Piwik Open Source Analytics member

One thing that is still to be done is to enable IP anonymization on demo.piwik.org

Done, IP anon 2 bytes enabled on demo.piwik.org

@tsteur
Piwik Open Source Analytics member

@diosmosis @mattab would either of you mind having a look at the plugin as a review, and also checking whether you have some ideas in case I missed something to anonymize? https://github.com/piwik/plugin-AnonymousPiwikUsageMeasurement/

If it's about features please create new issue in the plugin repository

@mattab
Piwik Open Source Analytics member

Suggested steps to finish this important project:

  • Confirm https://demo-anonymous.piwik.org is working well
  • Make https://demo-anonymous.piwik.org publicly available
  • Review plugin's README
  • Publish plugin on Marketplace
  • Create a new FAQ How do I track and measure how my Piwik service is being used?
  • Write a little blog post to advertise the plugin and invite people to install it
@mattab
Piwik Open Source Analytics member

Moved next steps to this new issue: #9051

Nice work @tsteur - this will be a huge help in the future, when we need to understand how people use Piwik and/or find out what we can improve. It will also be super useful later for the Piwik PRO Cloud service, to learn more about how clients use it (once we enable it, not planned yet)

@mattab mattab closed this