Refactor contact rollups code to library, add tests #13388

jeremydstone · 2017-02-22T22:21:33Z

This PR moves existing code out of cron jobs build_contact_rollups and sync_contact_rollups and into a new library, contact_rollups.rb. It also adds tests for the first step in the contact rollup process.

contact_rollups.rb is not new code - it is moved from the cron job code with very few changes. The major change is that I had to change it from using the Sequel database library to ActiveRecord in order to participate in the test framework and use FactoryGirl.

I have not moved the cron job code to use it yet. There is a little more work to do there but the cron code will be dropping down to just a call to the library. The cron job code is not running in production at the moment.

The test right now tests that the process works at all, picks up the right contacts, and that the analysis of ages taught (one of the most important fields) works. In future PRs I will be expanding this test to cover more of the rollup process output.

aoby

Great start!

I have a number of style comments, but the overall structure looks good.

One of the tests is failing in CircleCI too.

aoby · 2017-02-23T07:20:34Z

dashboard/test/integration/pardot_test.rb

+      expected_column_values = expected_values_hash[email]
+      expected_column_values.each do |key, value|
+        column_index = COLUMN_NAME_TO_INDEX_MAP[key]
+        # puts ("email: #{email} actual: #{row[column_index]} expected: #{value}")


remove commented-out line

aoby · 2017-02-23T07:23:38Z

dashboard/test/integration/pardot_test.rb

+    expected_values.each do |expected_values_hash|
+      email = expected_values_hash.keys.first
+      email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first)
+      results = ActiveRecord::Base.connection.execute("select roles, ages_taught from pegasus_test.contact_rollups_daily where email=#{email_sanitized}")


not a big deal, but it would be easier to read if this ActiveRecord::Base.connection.execute was aliased

def execute(sql)
  ActiveRecord::Base.connection.execute(sql)
end

Disagree. :(

@ashercodeorg out of curiosity, why not? It could be better named, such as execute_sql.

I could go either way. There is cost to the code reader to delve into the aliased function to see what it does. Since we don't do this routinely/consistently elsewhere, I am going to leave as is.
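For reference, the helper under discussion could be sketched roughly as follows. `execute_sql` is the name floated above; `FakeConnection` and `CONNECTION` are hypothetical stand-ins so the snippet runs without Rails or a database. In the real test the wrapper body would simply call ActiveRecord::Base.connection.execute(sql).

```ruby
# Hypothetical stand-in for ActiveRecord::Base.connection, so this sketch
# runs without Rails or a database. It records each statement it receives.
class FakeConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(sql)
    @statements << sql
    [[0]] # pretend every query returns a single row holding a zero count
  end
end

CONNECTION = FakeConnection.new

# Thin wrapper; in the real test the body would be
# ActiveRecord::Base.connection.execute(sql).
def execute_sql(sql)
  CONNECTION.execute(sql)
end

puts execute_sql("select count(*) from pegasus_test.contact_rollups").first[0]
```

The trade-off raised in the thread still applies: call sites get shorter, but a reader has to look up the wrapper once to see what it does.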

aoby · 2017-02-23T07:25:16Z

dashboard/test/integration/pardot_test.rb

+
+    # Should now have expected_count records in daily rollups table, and still none in main rollups table
+    assert ActiveRecord::Base.connection.execute("select count(*) from pegasus_test.contact_rollups_daily").first[0] == expected_count
+    assert ActiveRecord::Base.connection.execute("select count(*) from pegasus_test.contact_rollups").first[0] == 0


Use assert_equal. Note the expected value comes first, so:

assert_equal expected_count, ActiveRecord::Base...

Done, thx (throughout)
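The argument order matters because Minitest labels the first argument as the expected value in its failure message. A small sketch outside a test case, assuming the minitest gem bundled with Ruby is available; `Checker` and `failure_message_for` are hypothetical names used only for this illustration:

```ruby
require 'minitest'

# Minimal host object so Minitest's assertion methods can be called
# outside of a test case.
class Checker
  include Minitest::Assertions
  attr_accessor :assertions

  def initialize
    self.assertions = 0
  end
end

# Returns the failure message produced by assert_equal, or nil on success.
def failure_message_for(expected, actual)
  Checker.new.assert_equal expected, actual
  nil
rescue Minitest::Assertion => e
  e.message
end

# Pretend the query returned one row fewer than expected.
puts failure_message_for(3, 2)
```

With a bare `assert a == b`, the failure output does not include either value, which is why assert_equal is preferred here.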

aoby · 2017-02-23T07:42:34Z

dashboard/test/integration/pardot_test.rb

+      email = expected_values_hash.keys.first
+      email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first)
+      results = ActiveRecord::Base.connection.execute("select roles, ages_taught from pegasus_test.contact_rollups_daily where email=#{email_sanitized}")
+      assert results.count == 1


assert_equal 1, results.count

aoby · 2017-02-23T07:43:19Z

dashboard/test/integration/pardot_test.rb

+      expected_column_values.each do |key, value|
+        column_index = COLUMN_NAME_TO_INDEX_MAP[key]
+        # puts ("email: #{email} actual: #{row[column_index]} expected: #{value}")
+        assert row[column_index] == value


assert_equal

aoby · 2017-02-23T07:57:01Z

dashboard/test/integration/pardot_test.rb

+    # Create teacher 3 with one section and multiple students
+    @teacher3 = create(:teacher, email: "rolluptestteacher3@code.org")
+    @teacher3_section = create(:section, user: @teacher3)
+    create_sections_helper @teacher3, [[{age: 6}, {age: 10}, {age: 14}, {age: 10}]]


What is the section on line 23 for? Line 24 will create a new section inside create_sections_helper that contains all the students.

Perhaps to test a teacher with a section with no students? At least, I see that as being a useful test to have.

That would be a useful test, and it would make more sense to have a separate teacher with an empty section.

It was in there accidentally. But it is a good test to have an empty section, I added that in deliberately now.
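Judging only from the fixtures and expected values quoted in this review (student ages 6, 10, 14, 10 are expected to yield "6,10,14", and a teacher with no students is expected to yield nil), ages_taught appears to be the distinct student ages, sorted and comma-joined. A sketch under that assumption; the function below is hypothetical and is not the SQL used in contact_rollups.rb:

```ruby
# Hypothetical model of the ages_taught column, inferred from the test
# fixtures in this PR rather than from the library code itself.
def ages_taught(student_ages)
  return nil if student_ages.empty? # e.g. a teacher whose section is empty

  student_ages.uniq.sort.join(",")
end

puts ages_taught([6, 10, 14, 10])
p ages_taught([])
```

This is why an empty section is a useful fixture: it exercises the nil case separately from the deduplication case.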

aoby · 2017-02-23T08:00:55Z

dashboard/test/integration/pardot_test.rb

+
+  private
+
+  def rollups_test_helper(expected_count, expected_values)


This method name is pretty generic. Since it builds the contact rollups and asserts a bunch of values, perhaps rename something like build_and_verify_contact_rollups.

Good idea, thx.

aoby · 2017-02-23T08:02:39Z

dashboard/test/integration/pardot_test.rb

+      { "rolluptestteacher2@code.org": { "roles": "Teacher", "ages_taught": "6" }},
+      { "rolluptestteacher3@code.org": { "roles": "Teacher", "ages_taught": "6,10,14" }},
+      { "rolluptestteacher4@code.org": { "roles": "Teacher", "ages_taught": "9,10,11,14,15" }}
+    ]


aoby · 2017-02-23T08:04:24Z

lib/cdo/contact_rollups.rb

+  def self.insert_from_pegasus_forms
+    start = Time.now
+    log "Inserting contacts and IP geo data from pegasus.forms"
+    ActiveRecord::Base.connection.execute "


Same as in the test above, this repeated ActiveRecord::Base.connection.execute is a bit cumbersome and would be more readable aliased in a method.

aoby · 2017-02-23T08:06:26Z

dashboard/test/integration/pardot_test.rb

+    # Verify expected values in contacts_rollup_daily
+    expected_values.each do |expected_values_hash|
+      email = expected_values_hash.keys.first
+      email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first)


I suppose it can't hurt, but I don't think you need to sanitize values that you provide in the test... unless you're worried about your own SQL injection attack ;)

It can hurt, because it obfuscates what is happening in the test.

I was doing this on general principle. Since you both had negative reactions to it, I have pulled it out. Once we start running static security code analysis routinely it will presumably light up on this and I may have to do something.

ashercodeorg · 2017-02-23T14:29:16Z

dashboard/test/integration/pardot_test.rb

+require 'cdo/contact_rollups'
+
+PEGASUS_TEST_DB_NAME = "pegasus_#{Rails.env}"
+COLUMN_NAME_TO_INDEX_MAP = { "roles": 0, "ages_taught": 1 }.freeze


The quotes around roles and ages_taught are superfluous.

ashercodeorg · 2017-02-23T14:35:14Z

dashboard/test/integration/pardot_test.rb

+    expected_values.each do |expected_values_hash|
+      email = expected_values_hash.keys.first
+      email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first)
+      results = ActiveRecord::Base.connection.execute("select roles, ages_taught from pegasus_test.contact_rollups_daily where email=#{email_sanitized}")


Disagree. :(

ashercodeorg · 2017-02-23T14:36:15Z

dashboard/test/integration/pardot_test.rb

+
+  private
+
+  def rollups_test_helper(expected_count, expected_values)


Please provide YARD comments for helper methods.

ashercodeorg · 2017-02-23T14:51:44Z

dashboard/test/integration/pardot_test.rb

+class PardotTest < ActiveSupport::TestCase
+  def test_empty_contacts
+    # Test the rollup process with an empty database
+    rollups_test_helper 0, []


Even with the comment, I still have no idea what the expectations are. Part of this is the rollups_test_helper name, as @aoby commented on. Part of this is the method signature. Maybe

build_and_verify_contact_rollups expected_count: 0, expected_contacts: []
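Combining the naming, signature, and YARD suggestions, the helper could look roughly like the sketch below. The body is a placeholder that only returns a description string; the real helper would build the rollups and query the database. The parameter names follow the suggestion above, and everything else is an assumption.

```ruby
# Builds the contact rollups and verifies the contents of the daily table.
# (Placeholder body for this sketch: it only describes the expectations.)
#
# @param expected_count [Integer] number of rows expected in the daily table
# @param expected_contacts [Hash, Array] expected column values per contact
# @return [String] a summary of what would be verified
def build_and_verify_contact_rollups(expected_count:, expected_contacts:)
  "expecting #{expected_count} rows, verifying #{expected_contacts.size} contacts"
end

# Keyword arguments are passed without braces; with braces Ruby would
# parse the argument as a block.
puts build_and_verify_contact_rollups(expected_count: 0, expected_contacts: [])
```

Required keyword arguments make each call site state what the numbers mean, which addresses the readability concern without needing a comment.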

ashercodeorg · 2017-02-23T14:53:45Z

dashboard/test/integration/pardot_test.rb

+    # Create teacher 3 with one section and multiple students
+    @teacher3 = create(:teacher, email: "rolluptestteacher3@code.org")
+    @teacher3_section = create(:section, user: @teacher3)
+    create_sections_helper @teacher3, [[{age: 6}, {age: 10}, {age: 14}, {age: 10}]]


Perhaps to test a teacher with a section with no students? At least, I see that as being a useful test to have.

ashercodeorg · 2017-02-23T14:54:01Z

dashboard/test/integration/pardot_test.rb

+    rollups_test_helper 0, []
+  end
+
+  def test_teachers


This should be split into multiple tests, suggested by the fact that the test can fail for any number of reasons.

Because the rollup process operates on a large corpus of data, I am going to leave this as is. If I split it up, each test would be operating on a database with a small number of users (such as one). By generating a mildly interesting body of test fixtures, it demonstrates that we don't have something going on such as bad joins that will generate additional rows in the output. Things like that might not show up in the smaller data set of the individual things we are verifying later.
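The bad-join concern can be illustrated without a database. In the sketch below, the fixture data and both rollup functions are hypothetical and not taken from contact_rollups.rb. A rollup that emits one row per joined (teacher, section) pair looks correct on a one-teacher fixture and only produces extra rows once a teacher has more than one section.

```ruby
# Hypothetical joined rows: [teacher_email, section_id]
SMALL_FIXTURE = [
  ["teacher1@example.com", 1]
].freeze

LARGER_FIXTURE = [
  ["teacher1@example.com", 1],
  ["teacher2@example.com", 2],
  ["teacher2@example.com", 3] # second section for the same teacher
].freeze

# Buggy rollup: one output row per joined (teacher, section) pair.
def buggy_rollup(teacher_sections)
  teacher_sections.map { |email, _section_id| email }
end

# Intended rollup: one output row per distinct teacher.
def correct_rollup(teacher_sections)
  teacher_sections.map { |email, _section_id| email }.uniq
end

[SMALL_FIXTURE, LARGER_FIXTURE].each do |fixture|
  puts "buggy=#{buggy_rollup(fixture).size} correct=#{correct_rollup(fixture).size}"
end
```

On the small fixture both versions agree, so a row-count assertion would pass either way; only the larger fixture separates them.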

ashercodeorg · 2017-02-23T14:56:14Z

dashboard/test/integration/pardot_test.rb

+    # Verify expected values in contacts_rollup_daily
+    expected_values.each do |expected_values_hash|
+      email = expected_values_hash.keys.first
+      email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first)


It can hurt, because it obfuscates what is happening in the test.

ashercodeorg · 2017-02-23T15:07:57Z

dashboard/test/integration/pardot_test.rb

+    assert ActiveRecord::Base.connection.execute("select count(*) from pegasus_test.contact_rollups").first[0] == 0
+
+    # Verify expected values in contacts_rollup_daily
+    expected_values.each do |expected_values_hash|


That this is so unreadable suggests the expected_value format is wrong. It would be less complex and more readable to have, e.g.,

expected_values = {
  "rolluptestteacher1@code.org": { "roles": "Teacher", "ages_taught": nil },
  "rolluptestteacher2@code.org": { "roles": "Teacher", "ages_taught": "6" },
  "rolluptestteacher3@code.org": { "roles": "Teacher", "ages_taught": "6,10,14" },
  "rolluptestteacher4@code.org": { "roles": "Teacher", "ages_taught": "9,10,11,14,15" }
}

and

expected_values.each do |email, expected_email_info|
  results = ActiveRecord::Base.connection.execute(
    "select roles, ages_taught from pegasus_test.contact_rollups_daily where email='#{email}'"
  )
  assert_equal 1, results.count
  expected_email_info.each do |column, expected_column_value|
    assert_equal expected_column_value, results.first[COLUMN_NAME_TO_INDEX_MAP[column]]
  end
end

Aside: This (as written and as suggested) runs a DB query per email. It might make sense to optimize the test to grab all the results, then compare all the results. Depending, it might make sense to have expected_values and results directly comparable, e.g., make it so that the assertion assert_equal expected_values, results is sufficient.

Good idea on the simplification, done, thx. On the test run time optimization - trying to batch the 5 DB queries into 1 only saves a handful of milliseconds of test run time, not worth doing.
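For completeness, the "directly comparable" idea from the aside could be sketched as below. `ROWS` is hypothetical data standing in for what a single query over all emails might return, and `rows_to_hash` is a hypothetical helper. String keys are used here for simplicity, whereas the hash literal suggested above produces symbol keys.

```ruby
# Hypothetical result rows from one query: [email, roles, ages_taught]
ROWS = [
  ["rolluptestteacher1@code.org", "Teacher", nil],
  ["rolluptestteacher2@code.org", "Teacher", "6"]
].freeze

EXPECTED_VALUES = {
  "rolluptestteacher1@code.org" => { "roles" => "Teacher", "ages_taught" => nil },
  "rolluptestteacher2@code.org" => { "roles" => "Teacher", "ages_taught" => "6" }
}.freeze

# Reshape raw rows into the same structure as the expected values.
def rows_to_hash(rows)
  rows.map do |email, roles, ages_taught|
    [email, { "roles" => roles, "ages_taught" => ages_taught }]
  end.to_h
end

# In the real test this comparison would be:
#   assert_equal EXPECTED_VALUES, rows_to_hash(results)
puts rows_to_hash(ROWS) == EXPECTED_VALUES
```

A single comparison also catches unexpected extra rows, which the per-email loop does not check on its own.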

jeremydstone · 2017-02-24T17:09:57Z

@aoby done, thx

Move code to library and add tests

5bab80f

jeremydstone requested review from wjordan, aoby and ashercodeorg February 22, 2017 22:21

Build fix, test reliability fix

1d27875

aoby reviewed Feb 23, 2017

ashercodeorg suggested changes Feb 23, 2017

Jeremy Stone added 3 commits February 23, 2017 12:01

Try another approach to see if it passes in CircleCI

0bbf8c9

PR feedback

3005e7b

PR feedback

ac673f9

ashercodeorg approved these changes Feb 24, 2017

aoby approved these changes Feb 24, 2017

jeremydstone merged commit 133c048 into staging Feb 24, 2017

jeremydstone deleted the contact_rollup_refactor branch February 24, 2017 19:27

This was referenced Feb 24, 2017

Revert "Refactor contact rollups code to library, add tests" #13442

Merged

Unrevert contact rollup refactor #13493

Merged

