Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

contact_rollups schema changes: add updated_at column and change emai… #12822

Merged
merged 1 commit into from Jan 24, 2017

Conversation

jeremydstone
Copy link

This PR adds an updated_at field to contact_rollups. This indicates the time a change was made to any column in the contact (or when contact is created in rollup) so we are able to do a minimal nightly sync to Pardot of just the changed contacts for the day.

This also changes contact_rollups.email collation order from utf8_general_ci to utf8_bin. This makes the sort order returned by SQL when sorted by email identical to how Ruby sorts strings, which is important to make the delta comparison work.

# change the contact_rollups email column to utf8_bin collation order. This is so queries from this table sorted by
# email get sorted in the exactly the same alphabetization convention as Ruby <=> operator, which is important
# for contact sync process.
up { run 'ALTER TABLE contact_rollups MODIFY email VARCHAR(255) COLLATE utf8_bin' }
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, in order to change column collation in a Sequel migration you have to use raw SQL as I have here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using utf8_bin means we lose case-insensitivity (assuming utf8_general_ci, see comment below). Is this a concern?

# email get sorted in the exactly the same alphabetization convention as Ruby <=> operator, which is important
# for contact sync process.
up { run 'ALTER TABLE contact_rollups MODIFY email VARCHAR(255) COLLATE utf8_bin' }
down { run 'ALTER TABLE contact_rollups MODIFY email VARCHAR(255) COLLATE utf8_general_ci' }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, the existing collation is utf8 (from SHOW CREATE TABLE contact_rollups;). If so, presumably, the change is not intentional? If not, what am I missing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is currently utf8_general_ci, I am changing it to utf8_bin. These sort differently. (Backtick character has different place in the sort order between the two collations is one difference I've noticed.) Ruby sorts internally using effectively utf8_bin so I need to match that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, the utf8 was charset, not collation. This part LGTM.

@@ -0,0 +1,7 @@
Sequel.migration do
# change the contact_rollups email column to utf8_bin collation order. This is so queries from this table sorted by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/change/Change

@ashercodeorg
Copy link
Contributor

LGTM, assuming you have no concerns about losing case insensitivity. This means we'll need to be especially diligent about downcasing emails, as otherwise we could get multiple rows for the same email (with case changes).

@jeremydstone jeremydstone merged commit 160ac6b into staging Jan 24, 2017
@jeremydstone jeremydstone deleted the contact_rollups_schema_changes branch January 24, 2017 21:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants