Participation stats #3447

nilbus · 2017-04-03T04:37:02Z

This is for analyzing the results of the experiment in exercism/discussions#123.

Implements #3445; see also the preliminary discussion there.

Goal: Measure review participation quantity and quality (length) before, during, and after gamification is applied.

This branch contains DB migrations.

WIP screenshot:

Todo:

This replaces the /stats redirect to the first track. See exercism/discussions#123.

Feature flag: participation_stats

This introduces a new plotting library, Plotly.js. See discussion in exercism#3445.

The migration adds a crc32 hashing function to Postgres, so it can determine which users are in the experiment group and which are not.

This branch will close exercism#3445.

The new Track Stats header provides for a better empty state. We don't particularly want people to opt out of participation tracking. I intentionally leave unsaid that this is how you can see the statistics, so as not to reward curiosity during the experiment.

nilbus · 2017-04-09T13:46:01Z

Release notes:

Feature flag: participation_stats
Run migrations
This PR introduces application metric reporting to Hosted Graphite using the metric exercism_io.stats.experiment.query.time. Under the Heroku app's resources tab, add the Hosted Graphite add-on (under Find more add-ons), and metrics will begin reporting. Please add me as a team member on the Hosted Graphite dashboard (linked from the Heroku Resources page). In environments without HOSTEDGRAPHITE_APIKEY set, metrics reporting is a no-op.

nilbus · 2017-04-09T13:51:58Z

Forgot to mention… Hosted Graphite's free developer tier should be sufficient for us.

bernardoamc · 2017-04-10T01:23:27Z

db/migrate/201704022149_create_function_crc32.rb

+class CreateFunctionCrc32 < ActiveRecord::Migration
+  def up
+    ActiveRecord::Base.connection.execute <<~SQL
+      CREATE FUNCTION crc32(text_string text) RETURNS bigint AS $$


Should we have a check for NULL in this function just in case?

Yes, otherwise it starts an infinite loop. Fixed in 8ee996d. Thanks!
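To illustrate why a deterministic hash such as CRC32 is convenient for splitting users into groups, here is a minimal Ruby sketch. It uses Ruby's Zlib.crc32 rather than the Postgres function from the migration. The helper name, the choice of key, and the even/odd split are assumptions for illustration, not the rule this PR uses. The nil guard mirrors the NULL check discussed above.

```ruby
require 'zlib'

# Hypothetical helper, not from the PR: maps a key (for example a username)
# to a group by hashing it. The even/odd rule is an assumed split.
def experiment_group_for(key)
  return nil if key.nil? # mirror the SQL fix: NULL in, NULL out

  Zlib.crc32(key.to_s).even? ? :control : :experimental
end

puts experiment_group_for('alice').inspect
puts experiment_group_for(nil).inspect
```

Because the hash depends only on the key, the same user lands in the same group on every request without storing the assignment.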

bernardoamc · 2017-04-10T01:32:13Z

lib/exercism/metrics.rb

+    new.increment(metric_name)
+  end
+
+  def initialize(api_key: ENV['HOSTEDGRAPHITE_APIKEY'], host: 'carbon.hostedgraphite.com', port: 2003)


It seems this class simply doesn't work unless we have a valid api_key. Should we be explicit and raise when we don't have this attribute?

No, it was an intentional choice to make Metrics no-op when the API key is not present. Consider that most everyone working with this project in development will not have an API key. I don't want to force a dummy key to be provided or an account to be set up. In most cases, developers won't care to look at the metrics anyway.

Now that you mention this though, I just thought of something… instead of doing nothing, Metrics can just print to stdout. That would be even better for development. 👍

Cool, thanks for the context! <3

Added in 1e23ff2.
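The fallback described in this thread could look roughly like the sketch below. Only the constructor keywords api_key, host, and port and the increment and time methods appear in the PR excerpts. The class name SketchMetrics, the out: parameter, the stdout line format, and the "key.metric value" datagram layout are assumptions for illustration.

```ruby
require 'socket'

# A sketch, not the PR's class: reports over UDP when an API key is set,
# and prints to an IO (stdout by default) when it is not.
class SketchMetrics
  def initialize(api_key: ENV['HOSTEDGRAPHITE_APIKEY'],
                 host: 'carbon.hostedgraphite.com', port: 2003, out: $stdout)
    @api_key = api_key
    @host = host
    @port = port
    @out = out # assumed extra parameter, handy for tests
  end

  def increment(metric_name)
    report(metric_name, 1)
  end

  # Runs the block, reports its duration in milliseconds, returns its value.
  def time(metric_name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    report(metric_name, (elapsed * 1000).round)
    result
  end

  private

  def report(metric_name, value)
    if @api_key.nil? || @api_key.empty?
      @out.puts "[metrics] #{metric_name} #{value}" # development fallback
      return
    end

    socket = UDPSocket.new
    begin
      # Assumed datagram layout: "<api key>.<metric> <value>\n"
      socket.send("#{@api_key}.#{metric_name} #{value}\n", 0, @host, @port)
    ensure
      socket.close
    end
  end
end
```

With no key configured, SketchMetrics.new(api_key: nil).increment('test.count') prints a line instead of opening a socket, so developers see metrics locally without an account.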

bernardoamc · 2017-04-10T01:36:55Z

lib/exercism/participation_stats.rb

+      where('comments.created_at <  ?', end_date).
+      group('created_date').
+      order('created_date')
+    relation = filter_experiment_group(relation, experiment_group)


experiment_group is accessible anywhere in the class, we don't need to use it as a parameter here. :)

Fixed in ff9a080.

bernardoamc · 2017-04-10T01:39:39Z

lib/exercism/participation_stats.rb

+    }
+    if gamification_markers
+      result.merge!(
+        gamification_start_date: GAMIFICATION_START_DATE,


We could move this to a constant and do something like:

result = result.merge(GAMIFICATION_MARKERS) if gamification_markers.present?

Less garbage collection for the win! 🐎

Done in 334c2f6.
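A small sketch of the constant extraction suggested here, in plain Ruby. The dates and the gamification_end_date key are placeholders, not values from the PR, and a plain truthiness check stands in for ActiveSupport's present?.

```ruby
require 'date'

# Placeholder values for illustration only; the real dates live in the PR.
GAMIFICATION_MARKERS = {
  gamification_start_date: Date.new(2017, 1, 1),
  gamification_end_date: Date.new(2017, 2, 1), # assumed key
}.freeze

# Builds the result hash and merges the shared, frozen markers on request,
# so the marker hash is allocated once instead of on every call.
def results(gamification_markers:)
  result = { dates: [] }
  result = result.merge(GAMIFICATION_MARKERS) if gamification_markers
  result
end
```

Freezing the constant guards against an accidental merge! into the shared hash.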

bernardoamc · 2017-04-10T01:42:42Z

lib/exercism/participation_stats.rb

+  end
+
+  def results
+    result = {


Is the order important here? Does date[0] correspond to daily_review_count[0], which in turn lines up with the review length at index 0, and so on? If so, we could create a struct like: DailyReview = Struct.new(:date, :count, :length) and instantiate it when iterating through the results.

The order is important. The trick is that these values get converted immediately to JSON for storage in a data attribute for Plotly, and this is the format that it expects. In other circumstances, I agree that would make more sense.

See where this is used in stats.coffee:40.
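The two shapes discussed in this thread can be related with a short sketch: per-day structs on the Ruby side, converted to index-aligned arrays just before serializing. The sample rows are made up, and the daily_review_length key and the helper name are assumptions. Only date and daily_review_count are named in the thread.

```ruby
require 'json'

# The struct suggested in the review comment.
DailyReview = Struct.new(:date, :count, :length)

# Hypothetical helper: turns per-day rows into index-aligned arrays,
# the shape a plotting library typically takes for x and y series.
def to_parallel_arrays(rows)
  {
    date: rows.map(&:date),
    daily_review_count: rows.map(&:count),
    daily_review_length: rows.map(&:length), # assumed key name
  }
end

rows = [
  DailyReview.new('2017-04-01', 12, 340), # made-up sample data
  DailyReview.new('2017-04-02', 15, 410),
]

puts to_parallel_arrays(rows).to_json
```

Element i of every array describes the same day, which is why the order of the results must be preserved.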

bernardoamc · 2017-04-10T01:43:54Z

test/app/stats_test.rb

+
+  attr_reader :alice, :opted_out
+  def setup
+    super


Why do we need the super here?

DBCleaner starts a transaction in the superclass. When overriding setup, not calling super means that the transaction doesn't start, and any records created get left in the database after each test run.

I see! Maybe it would be a good case to use prepend.. not the scope of this PR though. :)
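The effect of omitting super can be shown without a test framework. In this sketch a plain base class stands in for the shared test superclass, and a flag stands in for the transaction the cleaner opens. All class names here are hypothetical.

```ruby
# Stand-in for the shared test superclass; the real one starts a
# database transaction in setup so records are rolled back afterwards.
class BaseTestCase
  attr_reader :transaction_open

  def setup
    @transaction_open = true
  end
end

# Calls super, so the base setup runs before the test's own setup.
class WithSuper < BaseTestCase
  def setup
    super
    @alice = 'alice'
  end
end

# Overrides setup without super: the base setup never runs, so in the
# real suite no transaction would start and records would leak.
class WithoutSuper < BaseTestCase
  def setup
    @alice = 'alice'
  end
end

with = WithSuper.new
with.setup
without = WithoutSuper.new
without.setup
puts with.transaction_open.inspect
puts without.transaction_open.inspect
```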

bernardoamc · 2017-04-10T01:46:52Z

test/exercism/metrics_test.rb

+class MetricsTest < MiniTest::Test
+  def test_reports_without_error
+    metrics = Metrics.new(api_key: 'test', host: 'localhost')
+    UDPSocket.any_instance.expects(:send)


Should we use UDPSocket.any_instance.expects(:send).twice here?

metrics = Metrics.new(api_key: 'test', host: 'localhost')
UDPSocket.any_instance.expects(:send).twice
metrics.time('test.time') { 1 }
metrics.increment('test.count')

Sure. I wasn't sure if that was a thing in Minitest. 👍

I think this is provided by mocha, it has a bunch of interesting expectations. :)

Right, that's what I meant. 😄 My mind was just going to "not rspec".

Changed in fc7b1bd.

kytrinyx · 2017-04-10T15:49:35Z

This looks great @nilbus. Let me know if/when you're ready to merge (I see a few unchecked boxes).

nilbus · 2017-04-11T02:36:15Z

Performance tests complete. With 50k users and 1M comments, the statistics query for each experiment group (2) takes under 200ms. I chose not to make a few other optimizations that brought the query time down to 100ms, because the complexity of pre-calculated columns that would need to be updated using triggers isn't worth the performance gain. I may revisit these though for long-term stats. With sub-second request times and no obvious existing cache mechanism, I'm also not concerned with application-level caching for this low-traffic page.

This is ready to go! (release notes)

nilbus · 2017-04-11T02:40:54Z

Thanks for the thorough review @bernardoamc! 🚀

nilbus added 4 commits April 1, 2017 09:56

Rename ExercismLib::Stats to ExercismLib::TrackStats

14b1dd7

Remove unused approvals fixtures

f169077

The related specs were deleted in:
- 8760ab6
- ae9251b

Convert stats.js to coffeescript before future expansion

73a1020

Remove unnecessary rubocop disable

6aa58fb

Metrics/MethodLength is apparently loose enough now.

nilbus force-pushed the participation-stats branch 2 times, most recently from da0c4ba to 6f1d8df Compare April 4, 2017 02:51

nilbus added 2 commits April 8, 2017 13:47

Graph review lengths

01013c7

List control first, so experimental has the brighter color.

nilbus force-pushed the participation-stats branch 3 times, most recently from 2061cb0 to 2344195 Compare April 8, 2017 20:25

nilbus added 4 commits April 8, 2017 16:29

Filter experiment results to qualified users

c748cf6

Display experiment statistics after the experiment is complete

009c5e6

Exclude the last day of stats, which is not fully recorded

87fc2e6

nilbus force-pushed the participation-stats branch from 2344195 to 87fc2e6 Compare April 8, 2017 20:31

nilbus added 2 commits April 8, 2017 17:21

Allow guests to see experiment stats after it's over

9b4f0ec

Test that /stats loads for all user types

4bcb2b2

nilbus force-pushed the participation-stats branch from be640f2 to 4bcb2b2 Compare April 9, 2017 13:32

nilbus added 2 commits April 9, 2017 09:35

Add a Metrics class for reporting to Hosted Graphite

752c8c7

Report ParticipationStats query time

9375a06

bernardoamc reviewed Apr 10, 2017

Return NULL when psql crc32 function called with NULL

8ee996d

Otherwise, crc32(NULL) starts an infinite loop. Thanks @bernardoamc

bernardoamc reviewed Apr 10, 2017

nilbus added 5 commits April 9, 2017 22:07

Print Metrics messages to stdout when HOSTEDGRAPHITE_APIKEY unset

1e23ff2

Stop passing class attributes unnecessarily

ff9a080

Thanks @bernardoamc

Extract constant GAMIFICATION_MARKERS

334c2f6

This will reduce garbage collection slightly and cleans up the code some. Thanks @bernardoamc.

Create rake data:generate:comments task

6c32956

Generating 1 million comments takes about 1 minute on my machine.

Use appropriate mocha assertion #twice

fc7b1bd

Thanks @bernardoamc.

nilbus added 2 commits April 10, 2017 22:18

Create rake data:generate:users task

e8f045d

Spread both comments and users over 4 years, with quantity increasing exponentially toward the present.
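One way to get timestamps that cluster toward the present is inverse-transform sampling from an exponential density, sketched below. This is not the rake task's code. The function name, the growth parameter, and the date range are assumptions for illustration.

```ruby
# Samples a timestamp in [start_time, end_time) whose density grows
# exponentially toward end_time. `growth` is an assumed tuning knob:
# larger values pack more samples near the end of the range.
def sample_created_at(start_time, end_time, growth: 4.0, rng: Random.new)
  u = rng.rand # uniform in [0, 1)
  # Inverse CDF of a density proportional to exp(growth * t) on [0, 1].
  fraction = Math.log(1 + u * (Math.exp(growth) - 1)) / growth
  start_time + fraction * (end_time - start_time)
end

range_start = Time.utc(2013, 4, 10)
range_end = Time.utc(2017, 4, 10)
rng = Random.new(42)
samples = Array.new(2000) { sample_created_at(range_start, range_end, rng: rng) }

first_year = samples.count { |t| t < Time.utc(2014, 4, 10) }
last_year = samples.count { |t| t >= Time.utc(2016, 4, 10) }
puts "first year: #{first_year}, last year: #{last_year}"
```

Passing a seeded Random keeps generated data reproducible between runs, which helps when comparing query timings.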

Index comments.created_at to improve stats performance

231874a

From 1300ms to 130ms.

Modify experiment dates

aaabc98

- Withdrawal date matches actual date.
- Pre gamification period matches withdrawal period duration.

kytrinyx merged commit 76aebfa into exercism:master Apr 13, 2017

nilbus mentioned this pull request Apr 15, 2017

Prevent Rikki bot from affecting behavior study #3464

Merged
