New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(wip) Part/phase 2 of adding Topic model #1339
Conversation
Automated message from Dropbox CLA bot @showell, it looks like you've already signed the Dropbox CLA. Thanks! |
3d5beb3
to
bcfcbc0
Compare
This migration simply creates an empty Topic model with no data, so it should probably require minimal babysitting.
Only write Topics for now.
ops note: After this commit is deployed, you should start to see rows in the zerver_topic table.
bcfcbc0
to
c2c378e
Compare
This function applies another migration function to all the rows in Message.
c2c378e
to
a96cd3c
Compare
This migration could potentially take hours, even days, for really large installations. We need to test it very carefully before pushing to master.
a96cd3c
to
4fb27bb
Compare
name = models.CharField(max_length=MAX_SUBJECT_LENGTH, db_index=True) # type: text_type | ||
|
||
class Meta(object): | ||
unique_together = ("recipient", "name") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want these to be unique_together in a case-insensitive way; is that something we can do in Django?
(Or do we?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting question. I'm kind of ok with "Bart" and "BART" actually being distinct topics--one is about the Simpsons character, the other is about trains. But I think there's already a built-in concept of case insensitivity to our current code. (If we allowed the Bart/BART distinction, it would be mildly error prone for users, but realistically you're often just replying to a topic or auto-completing, so the system handles it for you.)
To your first question, I don't see a good way to do it in Django. I don't think the "unique_together" does much interesting on the app side; it's more about the database. And, for a composite key like this, we might want to do some DB stuff outside of Django anyway, like build indexes. (It's unclear to me whether postgres makes constraints provide index hints, but I know that postgres doesn't guarantee anything here.)
So, I think the most pragmatic thing to do here might be to omit the Meta
class, and just update the migrations to build a unique index/constraint directly in postgres.
Posted a few comments; overall the approach looks pretty reasonable. I'd consider adding to |
Thanks for reviewing. For all the performance considerations, it might be good to have a plan to try this on a real-world system, so that we can calibrate batch sizes, sleeps, etc., and also figure out how much the round-trips on get_or_create hurt us. The good news about this particular backfill is that it's almost completely risk-free from a data integrity perspective. Since the app won't be actually using the Topic data yet, if the backfill gets aborted, or we want to re-tune parameters on our experimental instance, we can just blow away the table and start over. |
Hello @showell, a Zulip maintainer reviewed this pull request over a week ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly. Thank you for your valuable contributions to Zulip! |
Heads up @showell, we just merged some commits (latest: 510b7a0) that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the |
Hello @showell, a Zulip maintainer reviewed this pull request over a week ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly. Thank you for your valuable contributions to Zulip! |
Hello @showell, a Zulip maintainer reviewed this pull request over a week ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly. Thank you for your valuable contributions to Zulip! |
Heads up @showell, we just merged some commits that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the |
Hello @showell, a Zulip maintainer reviewed this pull request over a week ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly. Thank you for your valuable contributions to Zulip! |
Hello @showell, a Zulip maintainer reviewed this pull request over 7 days ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly. Thank you for your valuable contributions to Zulip! |
Heads up @showell, we just merged some commits that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the |
Hello @showell, a Zulip maintainer reviewed this pull request over 10 days ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly. Thank you for your valuable contributions to Zulip! |
Hello @showell, a Zulip maintainer reviewed this pull request over 10 days ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly. Thank you for your valuable contributions to Zulip! |
Closing this, since I think a 1.7 year old branch isn't going to be directly rebased at this point. Thanks @showell! |
The last four commits here comprise phase 2 of #1191, and they set us to backfill Topic from Message. This series is not yet phase 3 (where we actually link Message rows to Topic), so the only risk in this series is operational headaches. To put it another way, running this migration is unlikely to impact app users beyond down time; it just might take a really long time to do the migration.
@timabbott Please review this with a little bit of extra scrutiny, as the approach here does set the template for how we structure code in the phase 3 migration.