Migrate queue to postgres #65

Merged
merged 2 commits into from Feb 7, 2018

Conversation

@djmitche
Collaborator

commented Feb 7, 2018

Pros:

  • We can do smarter things
    • for priority
    • task affinity
    • querying tasks
    • statistics
  • Lots of cheap hosting options:
    • aws aurora postgres
    • aws rds postgres
    • heroku postgres
    • google cloud SQL postgres
    • self-hosted postgres
    • self-hosted cockroachdb (we are close to key-value usage)
  • We can host in the same data center as our web nodes (reducing latency)

Cons:

  • It's not a hands-off auto-scaling solution
  • A single expensive query can bring the system down

There is an initial attempt at outlining what the database schema would look like here:
https://public.etherpad-mozilla.org/p/jonasfj-queue-with-postgres

@djmitche

Collaborator

commented Jun 1, 2017

If we do our heavy queries on a read-only replica, then we can avoid bringing down production.
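
A minimal sketch of that routing idea, assuming two connection strings (one for the primary, one for a read-only replica). The names `Endpoints`, `is_read_only`, and `pick_endpoint` are hypothetical and not part of any existing queue code, and the read-only check is deliberately crude (it only looks at the first keyword, so it would not catch things like `SELECT ... FOR UPDATE`):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoints:
    primary: str  # read-write connection string (hypothetical)
    replica: str  # read-only replica connection string (hypothetical)


def is_read_only(sql: str) -> bool:
    """Crude check: a statement counts as read-only if it starts with SELECT."""
    words = sql.lstrip().split(None, 1)
    return bool(words) and words[0].lower() == "select"


def pick_endpoint(sql: str, endpoints: Endpoints, heavy: bool = False) -> str:
    """Send heavy read-only queries to the replica, everything else to the primary."""
    if heavy:
        if not is_read_only(sql):
            # A replica cannot accept writes, so refuse instead of guessing.
            raise ValueError("heavy queries must be read-only")
        return endpoints.replica
    return endpoints.primary


endpoints = Endpoints(
    primary="postgres://primary.example/queue",  # placeholder host
    replica="postgres://replica.example/queue",  # placeholder host
)
# A statistics-style query is marked heavy and goes to the replica.
target = pick_endpoint("SELECT count(*) FROM tasks", endpoints, heavy=True)
```

Marking a query as heavy is left to the caller here; the replica may also lag slightly behind the primary, which is usually acceptable for statistics but not for claiming work.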

@jonasfj

Author

commented Jun 1, 2017

> If we do our heavy queries on a read-only replica, then we can avoid bringing down production.

Indeed, the nice thing with azure table and similar services is that you can't make queries that won't scale. It's usually pretty obvious that a full-table scan doesn't scale. With postgres there is a risk that performance issues go unnoticed until the system is already running at scale.
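
One way to catch such queries earlier is to inspect the plan that `EXPLAIN (FORMAT JSON)` reports and flag sequential scans. The sketch below is only an illustration: `find_seq_scans` is a hypothetical helper, the `tasks` and `workers` tables are made up, and the sample plan is hand-written rather than captured from a real database.

```python
def find_seq_scans(explain_output):
    """Return the relation names of all 'Seq Scan' nodes in an EXPLAIN JSON plan.

    `explain_output` is the parsed JSON document: a list whose entries each
    hold a plan tree under the "Plan" key, with children under "Plans".
    """
    found = []

    def walk(node):
        if node.get("Node Type") == "Seq Scan":
            found.append(node.get("Relation Name", "?"))
        for child in node.get("Plans", []):
            walk(child)

    for entry in explain_output:
        walk(entry["Plan"])
    return found


# Hand-written sample plan (illustrative only): a join where one side is
# read via an index and the other via a full-table scan.
sample_plan = [
    {
        "Plan": {
            "Node Type": "Nested Loop",
            "Plans": [
                {"Node Type": "Seq Scan", "Relation Name": "tasks"},
                {"Node Type": "Index Scan", "Relation Name": "workers"},
            ],
        }
    }
]

flagged = find_seq_scans(sample_plan)
```

A sequential scan is not always a problem (on a small table it can be the right plan), so a check like this is a prompt for review rather than a hard failure.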

@djmitche

Collaborator

commented Jun 2, 2017

Given our team's knowledge of postgres at scale, I'd say that is not so much a risk as a certainty.

@djmitche

Collaborator

commented Sep 29, 2017

I'm in favor of this, but I think at least two members of the team should go to some postgres admin training. @selenamarie may be able to recommend something? Ideally we'd have a good knowledge of things like sharding, replicas, explaining queries, when to use views and triggers, and so on. And probably a half dozen things with which I don't even have a passing acquaintance.

I think a proposal for this should include some specifics about how the data would be organized in Postgres, and also how we'd migrate to it.

@djmitche

Collaborator

commented Oct 10, 2017

This quarter:

  • Learn us some Postgres for great good
  • Design the implementation
    • Schemas
    • Deployment / operations model (read-only replicas? Expiration? etc.)
  • Find hosting (maybe Heroku, maybe not)
  • Plan transition (we'll need to be in both Azure and Pg for a little while..)

Toward the first bullet point, I emailed Selena.

@djmitche

Collaborator

commented Feb 5, 2018

I don't think there's any dissent that this is something we should do, although we haven't committed to when to do it. So I've marked it as "decided".

@djmitche djmitche merged commit 0dfa314 into master Feb 7, 2018

@owlishDeveloper owlishDeveloper deleted the rfc65 branch Feb 9, 2019

3 participants