Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rollback and retry logic #34887

Merged
merged 16 commits into from May 26, 2020
Merged

Add rollback and retry logic #34887

merged 16 commits into from May 26, 2020

Conversation

hacodeorg
Copy link
Contributor

@hacodeorg hacodeorg commented May 19, 2020

PLC-840 This PR does 3 main things:

  1. Breaking the pipeline into 3 main parts: collecting & processing contacts, syncing new contacts to Pardot, and syncing updated contacts to Pardot. This allows each part to run independently. Failures in syncing new contacts Pardot do not affect syncing updated contacts to Pardot and vice versa.
  2. Wraping all data modification queries in transactions so they can be rolled back properly when they fail.
  3. Adding retrying with exponential backoff for downloading Pardot prospects and updating existing prospects.

Other smaller things (I must admit they could be in a separate PR):

  1. Encapsulating logging responsibility into ContactRollupsV2 object.
  2. Moving query building in ContactRollupsProcessed to its own function. It will be easier to write test for this query or test it on a mysql client.
  3. Cleaning up LogCollector class.

Testing story

  • Unit tests
  • [Ongoing] Testing on an adhoc with connection to a production-clone database.

Reviewer Checklist:

  • Tests provide adequate coverage
  • Code is well-commented
  • New features are translatable or updates will not break translations
  • Relevant documentation has been added or updated
  • User impact is well-understood and desirable
  • Pull Request is labeled appropriately
  • Follow-up work items (including potential tech debt) are tracked and linked

@hacodeorg hacodeorg force-pushed the ha/cr-retry branch 2 times, most recently from b5548c8 to 28e1331 Compare May 20, 2020 01:24
@hacodeorg hacodeorg changed the title Add retry/rollback logics Add rollback and retry logic May 20, 2020
@hacodeorg hacodeorg marked this pull request as ready for review May 20, 2020 08:07
@hacodeorg hacodeorg requested review from bencodeorg and a team May 20, 2020 08:10
# @param [Integer] batch_size number of records to save per INSERT statement.
def self.import_from_raw_table(batch_size = DEFAULT_BATCH_SIZE)
# Process the aggregated data row by row and save the results to DB in batches.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no logic changes in this file. Just move the query builder into a separate method.

@@ -23,6 +23,6 @@ def self.insert_from_processed_table
FROM contact_rollups_processed;
SQL

ActiveRecord::Base.connection.exec_query(insert_sql)
transaction {ActiveRecord::Base.connection.exec_query(insert_sql)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just for consistency with other places where we're using transactions? Or is there actually a benefit of wrapping this in a transaction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one benefit, ContactRollupsFinal will be empty if this fails, thus there is no risk of accidentally using incomplete data.


# These sync steps are independent, one could fail without affecting another.
# However, if the build step above fails, none of them should run.
sync_new_contacts_with_pardot
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we have no more sync_with_pardot flag, Pardot steps will happen by default. Is that a problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I forgot about that! In bin/cron/build_contact_rollups_v2, I will call collect_and_process_contacts instead of build_and_sync.

@@ -38,9 +38,10 @@ def time(action_name = nil)
)
record_exception(e)
end
alias_method :time_and_continue, :time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain this to me? 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is mostly for backward compatibility; I create an alias for the method instead of renaming it.

Background:
In LogCollector, time method measures time to execute a block and also capture any StandardError exceptions if they arise. It doesn't re-raise exception to avoid disrupting the flow of its caller.

Meanwhile, time! method, alias: time_and_raise!, will bubble up any errors occurs.

@@ -113,7 +115,7 @@ def self.build_prospect_query_url(id_greater_than, fields, limit, deleted)
def batch_create_prospects(email, data, eager_submit = false)
prospect = self.class.convert_to_pardot_prospect data.merge(email: email)
@new_prospects << prospect

# Creating new prospects is not a retriable action, unlike updating existing prospects.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this better?

Suggested change
# Creating new prospects is not a retriable action, unlike updating existing prospects.
# Creating new prospects is not a retriable action because it could succeed
# on the Pardot side and we just didn't receive a response. If we try again,
# it would create duplicate prospects.

@@ -33,4 +33,22 @@ def test_raise_if_response_error_no_error

assert_nil PardotHelpersTest.send(:raise_if_response_error, pardot_ok)
end

def test_try_with_exponential_backoff_one_retry
tries = 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly this is a way to track the number of tries that actually occurred in this test? Possible to name this something else (eg, tries_counter? Got confused looking back and forth to method which also has a counter called tries?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also might just be more clear to someone with more experience with blocks/retry/variable scope/etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I will change the variable name.

tries = 0
max_tries = 2
assert_raises RuntimeError do
PardotHelpersTest.try_with_exponential_backoff(max_tries, [RuntimeError]) do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this is more of a question hour than a proper PR review :) Why do we use the test class here instead of just using the class itself like we do elsewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem at all 😄 Since PardotHelpers is a module, we have to test it by mixing it into a test class. In this case, the module is mixed into PardotHelpersTest using extend. (extend vs. include)

Copy link
Contributor

@bencodeorg bencodeorg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Mostly just a bunch of questions for my learning, and would love to figure out how you're thinking about handling not accidentally syncing to Pardot before merging.

@hacodeorg hacodeorg merged commit 6c599dd into staging May 26, 2020
@hacodeorg hacodeorg deleted the ha/cr-retry branch May 26, 2020 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants