Adding global email model #5913
src/sentry/models/useremail.py
Outdated
user = FlexibleForeignKey(settings.AUTH_USER_MODEL, related_name='emails')
email = models.EmailField(_('email address'))
global_email = FlexibleForeignKey('sentry.Email', related_name='useremails', null=True)
Probably don't need this since you can just index the email string and query on it
mattrobenolt
left a comment
We definitely need to ship the data migration after the schema migration, otherwise you'll end up with some rows that are still NULL.
We should also decide whether we care about case sensitivity here. I feel like we should always lowercase the value to prevent unintended duplicates due to case differences, e.g. Matt@example.com vs matt@example.com.
And, what is going to keep this up to date for new rows? I don't see any code that's adding new Email rows except for the migration.
for useremail in RangeQuerySetWrapperWithProgressBar(queryset):
    email, _ = Email.objects.get_or_create(email=useremail.email)
    useremail.global_email = email
    useremail.save()
Don't do this .save(). Use the UserEmail.objects.filter(id=useremail.id).update(global_email=email) pattern instead to avoid a race condition.
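To make the race concrete, here is a minimal sketch using the standard-library sqlite3 module instead of the Django ORM. The table and column names (useremail, is_verified, global_email_id) and the id value 42 are illustrative assumptions, not the actual Sentry schema. Writing back a stale in-memory copy of the whole row overwrites a concurrent change, while a single-column UPDATE leaves it alone.

```python
import sqlite3


def make_db():
    """Create an in-memory table with one unverified address (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE useremail ("
        "id INTEGER PRIMARY KEY, email TEXT, is_verified INTEGER, global_email_id INTEGER)"
    )
    conn.execute("INSERT INTO useremail VALUES (1, 'matt@example.com', 0, NULL)")
    return conn


def backfill(conn, full_row_save):
    # The backfill loads the row into memory; is_verified is 0 at this point.
    row_id, email, is_verified, _ = conn.execute(
        "SELECT id, email, is_verified, global_email_id FROM useremail WHERE id = 1"
    ).fetchone()

    # Meanwhile, another process verifies the address.
    conn.execute("UPDATE useremail SET is_verified = 1 WHERE id = ?", (row_id,))

    if full_row_save:
        # Comparable to a plain .save(): every column is written from the stale copy.
        conn.execute(
            "UPDATE useremail SET email = ?, is_verified = ?, global_email_id = ? WHERE id = ?",
            (email, is_verified, 42, row_id),
        )
    else:
        # Comparable to .filter(id=...).update(global_email=...): one column is touched.
        conn.execute("UPDATE useremail SET global_email_id = ? WHERE id = ?", (42, row_id))

    return conn.execute(
        "SELECT is_verified, global_email_id FROM useremail WHERE id = 1"
    ).fetchone()


clobbered = backfill(make_db(), full_row_save=True)
preserved = backfill(make_db(), full_row_save=False)
```

In the first run the verification flag is lost; in the second it survives alongside the new foreign key value.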
@mattrobenolt good point. I'm looking at
@mattrobenolt I could modify it such that the host is case-insensitive, but the username part of the email stays case-sensitive. Obviously, Gmail, Yahoo, and Outlook make the username case-insensitive, so we could do the same, but the email standard itself mandates case sensitivity. I'm not sure if this would be a security issue. I'll defer to whatever you decide.
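The two options discussed here could be sketched as follows. The function name normalize_email and its lowercase_local flag are hypothetical and not part of the PR; this is only meant to show the difference between lowercasing the host alone and lowercasing the whole address.

```python
def normalize_email(address, lowercase_local=False):
    """Lowercase the host part; optionally lowercase the username part too."""
    # Split on the last "@" so the host is everything after it.
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address: %r" % (address,))
    if lowercase_local:
        local = local.lower()
    return local + "@" + domain.lower()


host_only = normalize_email("Matt@Example.COM")
fully_lowered = normalize_email("Matt@Example.COM", lowercase_local=True)
```

With host-only normalization, Matt@example.com and matt@example.com remain distinct values; with full lowercasing they collapse into one.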
Yeah, internally @tkaemming brought this up. But our own behavior for normal If we have control over this, we can just apply a functional index that would handle uniqueness as case insensitive to enforce it correctly.
Can someone pull data on whether we have case-duplicated emails? If not, let's move towards case insensitive.
Let's do a few things:
I was going to do this with a functional index to enforce it on the database. Something like,
@mattrobenolt is there a good reason to not store things lowercase? It's a lot easier to be compatible if we do that.
But yeah, we can do a field as well. I think having the index enforce it makes it more explicit on the db side. I think my concerns are how we have to deal with User.email today. We have to always make sure we do a case insensitive filter, etc., which may be error prone. I don't think a field would help with this case, would it? For example, doing
An index wouldn't coerce a filter clause, but a field could. We should see if we can do that, though someone from @getsentry/platform might need to help here. I think doing the unique on lower() would be useful to ensure the guarantee, but we should do both.
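As a rough illustration of the "unique on lower()" idea, the sketch below uses the standard-library sqlite3 module (expression indexes need SQLite 3.9 or newer) rather than the Postgres setup under discussion. The table and index names are assumptions. It also shows the point made above: the index rejects case-variant duplicates, but it does not change how a plain equality filter compares values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
# Uniqueness is enforced on the lowercased value; the stored value keeps its case.
conn.execute("CREATE UNIQUE INDEX email_lower_uniq ON email (lower(email))")


def try_insert(address):
    """Return True if the row was inserted, False if the unique index rejected it."""
    try:
        conn.execute("INSERT INTO email (email) VALUES (?)", (address,))
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert("Matt@example.com")
second = try_insert("matt@example.com")

# A plain equality filter is still case-sensitive, so callers must remember lower().
exact_matches = conn.execute(
    "SELECT COUNT(*) FROM email WHERE email = ?", ("matt@example.com",)
).fetchone()[0]
lowered_matches = conn.execute(
    "SELECT COUNT(*) FROM email WHERE lower(email) = lower(?)", ("matt@example.com",)
).fetchone()[0]
```

The second insert is rejected, yet the exact filter finds nothing, which is why a case-insensitive field type is attractive in addition to the index.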
We can't assume all email servers are case insensitive. If example.com's email server is case-sensitive, then someone with the address I prefer Matt's suggestion of enforcing in the index, but storing the case-sensitive value.
I'd prefer a case insensitive text field, but given we don't have that, I'm OK with saying "we will not support case sensitive email servers".
Adding this as reference:
Therefore, it's probably safe to assume case-insensitivity.
Per talk internally, let's just make it a 'citext' field for Postgres, and assume it's case insensitive. In MySQL we might be able to use a case insensitive collation, but I'm not sure if Django lets us provide that. Ensuring correctness for SQLite would be nice, but I'm not sure we have an option.
mattrobenolt
left a comment
We still need to move the backfill to another deploy. The problem is there will be a race condition. We'll deploy the new model, run the backfill, then between running the backfill and the new code with the signal being live, we'll miss data.
src/sentry/models/email.py
Outdated
"""
__core__ = True

email = CIEmailField(_('email address'))
Don't we need unique=True here?
That's true. Added.
Removed backfill
@mattrobenolt We need to create the citext extension within the test database. Can you help with this?
Security concerns found
Migration Checklist
Generated by 🚫 danger
src/sentry/models/email.py
Outdated
can we add a date_added field here
Added
this is actually good to go at this point, right @ehfeng? (I didn't check tests)
I gotta take over and figure out what's causing citext to fail on some of the tests when I get a few spare minutes to poke at it.
src/sentry/utils/pytest/sentry.py
Outdated
this code should probably happen after fix_south is called.
It does. Unless you mean immediately after. But either way, I'm gonna see what makes things happy here. This was me just guessing.
I mean immediately after, but I'm not sure if it will actually cause issues by keeping it below. There's sort of a "things should be in an ok state at this point", and I'd just suggest bundling the "database should be ready" parts.
I'll take this over and figure out why tests are broken real quick
pushed a fix for citext extension
@dcramer not sure why it's not working, but it looks like things passed: https://travis-ci.org/getsentry/sentry/builds/273059785
be8d53c to 5120b0a
It looks like master is broken right now. I'm going to wait until it's green, rebase, and see what happens. The current failing test seems to have appeared in 09883f6
You'll need to regenerate the migration for a new id since it's out of date from current master.
Regenerated.
Adding Email model
Adding post_save triggers on UserEmail to fill in Email
Adding on_delete signal
Adding CIText to testing setup
__core__ = True

email = CIEmailField(_('email address'), unique=True)
date_added = models.DateTimeField(default=timezone.now)
If we don't need an index on this now, we'll probably need to add one later if we try to do anything with this table regarding cleanup.
I'm going to punt on this because I'm unsure when we'd need to do such a cleanup.
def delete_email(instance, **kwargs):
    if UserEmail.objects.filter(email=instance.email).exists():
I want to note here that this might be slightly inaccurate because it is not a case-insensitive lookup. You should do email__iexact. Because Email is case insensitive but UserEmail isn't, you'll get false positives.
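The mismatch can be reproduced outside Django with a small sqlite3 sketch. The useremail table here is a stand-in, and the lower() comparison only approximates what an email__iexact lookup does; the exact SQL Django emits depends on the database backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE useremail (id INTEGER PRIMARY KEY, email TEXT)")
# A user registered the address with a capital letter.
conn.execute("INSERT INTO useremail (email) VALUES ('Matt@example.com')")


def exists(where_clause, value):
    """Return True if any useremail row satisfies the given WHERE clause."""
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM useremail WHERE " + where_clause + ")", (value,)
    ).fetchone()
    return bool(row[0])


# The case-insensitive Email row may hold the lowercased spelling.
exact = exists("email = ?", "matt@example.com")
case_insensitive = exists("lower(email) = lower(?)", "matt@example.com")
```

The exact comparison reports that no UserEmail references the address even though one does, so a cleanup based on it could act on an Email row that is still in use.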
Good point: #6070

Because emails are not distinct per User, we're using the Email to store email-specific settings (email subscription state). UserEmail will represent verification that a given user "owns" an email address.
Alternatively, UserEmail.email could just be a key for Email.id, where the id value is the email string. The value for this seems minimal (UserEmail.email is never updated in place, so the chance of UserEmail.email != UserEmail.global_email.email is low), so I've done the more naive approach.