New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor contact rollups code to library, add tests #13388
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great start!
I have a number of style comments, but the overall structure looks good.
One of the tests is failing in circle too.
expected_column_values = expected_values_hash[email] | ||
expected_column_values.each do |key, value| | ||
column_index = COLUMN_NAME_TO_INDEX_MAP[key] | ||
# puts ("email: #{email} actual: #{row[column_index]} expected: #{value}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove commented-out line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thx
expected_values.each do |expected_values_hash| | ||
email = expected_values_hash.keys.first | ||
email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first) | ||
results = ActiveRecord::Base.connection.execute("select roles, ages_taught from pegasus_test.contact_rollups_daily where email=#{email_sanitized}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a big deal, but it would be easier to read if this ActiveRecord::Base.connection.execute
was aliased
def execute(sql)
ActiveRecord::Base.connection.execute(sql)
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disagree. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashercodeorg out of curiosity, why not? It could be better named, such as execute_sql
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could go either way. There is cost to the code reader to delve into the aliased function to see what it does. Since we don't do this routinely/consistently elsewhere, I am going to leave as is.
|
||
# Should now have expected_count records in daily rollups table, and still none in main rollups table | ||
assert ActiveRecord::Base.connection.execute("select count(*) from pegasus_test.contact_rollups_daily").first[0] == expected_count | ||
assert ActiveRecord::Base.connection.execute("select count(*) from pegasus_test.contact_rollups").first[0] == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use assert_equal. Note the expected value comes first, so:
assert_equal expected_count, ActiveRecord::Base...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thx (throughout)
email = expected_values_hash.keys.first | ||
email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first) | ||
results = ActiveRecord::Base.connection.execute("select roles, ages_taught from pegasus_test.contact_rollups_daily where email=#{email_sanitized}") | ||
assert results.count == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_equal 1, results.count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thx
expected_column_values.each do |key, value| | ||
column_index = COLUMN_NAME_TO_INDEX_MAP[key] | ||
# puts ("email: #{email} actual: #{row[column_index]} expected: #{value}") | ||
assert row[column_index] == value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_equal
# Create teacher 3 with one section and multiple students | ||
@teacher3 = create(:teacher, email: "rolluptestteacher3@code.org") | ||
@teacher3_section = create(:section, user: @teacher3) | ||
create_sections_helper @teacher3, [[{age: 6}, {age: 10}, {age: 14}, {age: 10}]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the section on line 23 for? Line 24 will create a new section inside create_sections_helper
that contains all the students.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps to test a teacher with a section with no students? At least, I see that as being a useful test to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be a useful test, and it would make more sense to have a separate teacher with an empty section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was in there accidentally. But it is a good test to have an empty section, I added that in deliberately now.
|
||
private | ||
|
||
def rollups_test_helper(expected_count, expected_values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method name is pretty generic. Since it builds the contact rollups and asserts a bunch of values, perhaps rename something like build_and_verify_contact_rollups
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, thx.
{ "rolluptestteacher2@code.org": { "roles": "Teacher", "ages_taught": "6" }}, | ||
{ "rolluptestteacher3@code.org": { "roles": "Teacher", "ages_taught": "6,10,14" }}, | ||
{ "rolluptestteacher4@code.org": { "roles": "Teacher", "ages_taught": "9,10,11,14,15" }} | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice :)
def self.insert_from_pegasus_forms | ||
start = Time.now | ||
log "Inserting contacts and IP geo data from pegasus.forms" | ||
ActiveRecord::Base.connection.execute " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as in the test above, this repeated ActiveRecord::Base.connection.execute
is a bit cumbersome and would be more readable aliased in a method.
# Verify expected values in contacts_rollup_daily | ||
expected_values.each do |expected_values_hash| | ||
email = expected_values_hash.keys.first | ||
email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it can't hurt, but I don't think you need to sanitize values that you provide in the test... unless you're worried about your own SQL injection attack ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can hurt, because it obfuscates what is happening in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was doing this on general principle. Since you both had negative reactions to it, I have pulled it out. Once we start running static security code analysis routinely it will presumably light up on this and I may have to do something.
require 'cdo/contact_rollups' | ||
|
||
PEGASUS_TEST_DB_NAME = "pegasus_#{Rails.env}" | ||
COLUMN_NAME_TO_INDEX_MAP = { "roles": 0, "ages_taught": 1 }.freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The quotes around roles
and ages_taught
are superfluous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thx
expected_values.each do |expected_values_hash| | ||
email = expected_values_hash.keys.first | ||
email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first) | ||
results = ActiveRecord::Base.connection.execute("select roles, ages_taught from pegasus_test.contact_rollups_daily where email=#{email_sanitized}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disagree. :(
|
||
private | ||
|
||
def rollups_test_helper(expected_count, expected_values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please provide YARD comments for helper methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thx
class PardotTest < ActiveSupport::TestCase | ||
def test_empty_contacts | ||
# Test the rollup process with an empty database | ||
rollups_test_helper 0, [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with the comment, I still have no idea what the expectations are. Part of this is the rollups_test_helper
name, as @aoby commented on. Part of this is the method signature. Maybe
build_and_verify_contact_rollups { expected_count: 0, expected_contacts: [] }
# Create teacher 3 with one section and multiple students | ||
@teacher3 = create(:teacher, email: "rolluptestteacher3@code.org") | ||
@teacher3_section = create(:section, user: @teacher3) | ||
create_sections_helper @teacher3, [[{age: 6}, {age: 10}, {age: 14}, {age: 10}]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps to test a teacher with a section with no students? At least, I see that as being a useful test to have.
rollups_test_helper 0, [] | ||
end | ||
|
||
def test_teachers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be split into multiple tests, suggested by the fact that the test can fail for any number of reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the rollup process operates on a large corpus of data, I am going to leave this as is. If I split it up, each test would be operating on a database with a small number of users (such as one). By generating a mildly interesting body of test fixtures, it demonstrates that we don't have something going on such as bad joins that will generate additional rows in the output. Things like that might not show up in the smaller data set of the individual things we are verifying later.
# Verify expected values in contacts_rollup_daily | ||
expected_values.each do |expected_values_hash| | ||
email = expected_values_hash.keys.first | ||
email_sanitized = ActiveRecord::Base.sanitize(expected_values_hash.keys.first) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can hurt, because it obfuscates what is happening in the test.
assert ActiveRecord::Base.connection.execute("select count(*) from pegasus_test.contact_rollups").first[0] == 0 | ||
|
||
# Verify expected values in contacts_rollup_daily | ||
expected_values.each do |expected_values_hash| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That this is so unreadable suggests the expected_value
format is wrong. It would be less complex and more readable to have, e.g.,
expected_values = {
"rolluptestteacher1@code.org": { "roles": "Teacher", "ages_taught": nil },
"rolluptestteacher2@code.org": { "roles": "Teacher", "ages_taught": "6" },
"rolluptestteacher3@code.org": { "roles": "Teacher", "ages_taught": "6,10,14" },
"rolluptestteacher4@code.org": { "roles": "Teacher", "ages_taught": "9,10,11,14,15" }
}
and
expected_value.each do |email, expected_email_info|
results = ActiveRecord::Base.connection.execute(
"select roles, ages_taught from pegasus_test.contact_rollups_daily where email=#{email}"
)
assert_equal 1, results.count
expected_email_info.each do |column, expected_column_value|
assert_equal expected_column_value, results.first[COLUMN_NAME_TO_INDEX_MAP[column]]
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside: This (as written and as suggested) runs a DB query per email. It might make sense to optimize the test to grab all the results, then compare all the results. Depending, it might make sense to have expected_values
and results
directly comparable, e.g., make it so that the assertion assert_equal expected_values, results
is sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea on the simplification, done, thx. On the test run time optimization - trying to batch the 5 DB queries into 1 only saves a handful of millisecond of test run time, not worth doing.
@aoby done, thx |
This PR moves existing code out of cron jobs
build_contact_rollups
andsync_contact_rollups
and into a new library,contact_rollups.rb
. It also adds tests for the first step in the contact rollup process.contact_rollups.rb
is not new code - it is moved from the cron job code with very few changes. The major change is that I had to change it from using the Sequel database library to ActiveRecord in order to participate in the test framework and use FactoryGirl.I have not moved the cron job code to use it yet. There is a little more work to do there but the cron code will be dropping down to just a call to the library. The cron job code is not running in production at the moment.
The test right now tests that the process works at all, picks up the right contacts, and that the analysis of ages taught (one of the most important fields) works. In future PRs I will be expanding this test to cover more of the rollup process output.