Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(wip) Part/phase 2 of adding Topic model #1339

Closed
wants to merge 7 commits into from

Conversation

showell
Copy link
Contributor

@showell showell commented Jul 21, 2016

The last four commits here comprise phase 2 of #1191, and they set us to backfill Topic from Message. This series is not yet phase 3 (where we actually link Message rows to Topic), so the only risk in this series is operational headaches. To put it another way, running this migration is unlikely to impact app users beyond down time; it just might take a really long time to do the migration.

@timabbott Please review this with a little bit of extra scrutiny, as the approach here does set the template for how we structure code in the phase 3 migration.

@smarx
Copy link

smarx commented Jul 21, 2016

Automated message from Dropbox CLA bot

@showell, it looks like you've already signed the Dropbox CLA. Thanks!

This migration simply creates an empty Topic model with no data,
so it should probably require minimal babysitting.
ops note: After this commit is deployed, you should start
to see rows in the zerver_topic table.
This function applies another migration function to all the
rows in Message.
This migration could potentially take hours, even days, for
really large installations.  We need to test it very carefully
before pushing to master.
name = models.CharField(max_length=MAX_SUBJECT_LENGTH, db_index=True) # type: text_type

class Meta(object):
unique_together = ("recipient", "name")
Copy link
Sponsor Member

@timabbott timabbott Jul 21, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want these to be unique_together in a case-insensitive way; is that something we can do in Django?

(Or do we?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting question. I'm kind of ok with "Bart" and "BART" actually being distinct topics--one is about the Simpsons character, the other is about trains. But I think there's already a built-in concept of case insensitivity to our current code. (If we allowed the Bart/BART distinction, it would be mildly error prone for users, but realistically you're often just replying to a topic or auto-completing, so the system handles it for you.)

To your first question, I don't see a good way to do it in Django. I don't think the "unique_together" does much interesting on the app side; it's more about the database. And, for a composite key like this, we might want to do some DB stuff outside of Django anyway, like build indexes. (It's unclear to me whether postgres makes constraints provide index hints, but I know that postgres doesn't guarantee anything here.)

So, I think the most pragmatic thing to do here might be to omit the Meta class, and just update the migrations to build a unique index/constraint directly in postgres.

@timabbott
Copy link
Sponsor Member

timabbott commented Jul 21, 2016

Posted a few comments; overall the approach looks pretty reasonable. I'd consider adding to migrate_all_messages support for a sleep=1 option and using that in the migration (sleeping 1s between batches); this can help a lot with allowing other traffic through the database in case there is transaction/lock contention (I seem to recall we had to do that for various migrations back in the day).

@showell
Copy link
Contributor Author

showell commented Jul 21, 2016

Thanks for reviewing. For all the performance considerations, it might be good to have a plan to try this on a real-world system, so that we can calibrate batch sizes, sleeps, etc., and also figure out how much the round-trips on get_or_create hurt us.

The good news about this particular backfill is that it's almost completely risk-free from a data integrity perspective. Since the app won't be actually using the Topic data yet, if the backfill gets aborted, or we want to re-tune parameters on our experimental instance, we can just blow away the table and start over.

@showell showell changed the title Part/phase 2 of adding Topic model (wip) Part/phase 2 of adding Topic model Aug 10, 2016
@zulipbot
Copy link
Member

Hello @showell, a Zulip maintainer reviewed this pull request over a week ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly.

Thank you for your valuable contributions to Zulip!

@zulipbot
Copy link
Member

Heads up @showell, we just merged some commits (latest: 510b7a0) that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the upstream/master branch and resolve your pull request's merge conflicts accordingly.

@zulipbot-test
Copy link

Hello @showell, a Zulip maintainer reviewed this pull request over a week ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly.

Thank you for your valuable contributions to Zulip!

@zulipbot
Copy link
Member

Hello @showell, a Zulip maintainer reviewed this pull request over a week ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly.

Thank you for your valuable contributions to Zulip!

@zulipbot
Copy link
Member

zulipbot commented Aug 1, 2017

Heads up @showell, we just merged some commits that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the upstream/master branch and resolve your pull request's merge conflicts accordingly.

@zulipbot-test
Copy link

Hello @showell, a Zulip maintainer reviewed this pull request over a week ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly.

Thank you for your valuable contributions to Zulip!

@zulipbot
Copy link
Member

Hello @showell, a Zulip maintainer reviewed this pull request over 7 days ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly.

Thank you for your valuable contributions to Zulip!

@zulipbot
Copy link
Member

zulipbot commented Nov 21, 2017

Heads up @showell, we just merged some commits that conflict with the changes your made in this pull request! You can review this repository's recent commits to see where the conflicts occur. Please rebase your feature branch against the upstream/master branch and resolve your pull request's merge conflicts accordingly.

@zulipbot
Copy link
Member

zulipbot commented Dec 4, 2017

Hello @showell, a Zulip maintainer reviewed this pull request over 10 days ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly.

Thank you for your valuable contributions to Zulip!

@zulipbot-test
Copy link

Hello @showell, a Zulip maintainer reviewed this pull request over 10 days ago, but you haven't updated your pull request since. Please take a look at the requested changes and update your pull request accordingly.

Thank you for your valuable contributions to Zulip!

@timabbott
Copy link
Sponsor Member

Closing this, since I think a 1.7 year old branch isn't going to be directly rebased at this point. Thanks @showell!

@timabbott timabbott closed this Mar 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants