Migrate queue to postgres#65

Merged: djmitche merged 2 commits into master from rfc65 on Feb 7, 2018

@djmitche
Contributor

djmitche commented Feb 7, 2018

Pros:

  • We can do smarter things
    • for priority
    • task affinity
    • querying tasks
    • statistics
  • Lots of cheap hosting options:
    • aws aurora postgres
    • aws rds postgres
    • heroku postgres
    • google cloud SQL postgres
    • self-hosted postgres
    • self-hosted cockroachdb (we are close to key-value usage)
  • We can host in the same data center as our web nodes (reducing latency)
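
As an illustration of the "smarter things for priority" point above, here is a minimal sketch of claiming the highest-priority pending task with a single ordered query. Everything in it is an assumption made for illustration: the `tasks` table, its columns, and the `claim_next_task` helper are hypothetical and are not taken from the proposed schema, and SQLite from the Python standard library stands in for Postgres so the example runs anywhere. In Postgres, concurrent workers would probably also need row locking (for example `SELECT ... FOR UPDATE SKIP LOCKED`), which this sketch does not cover.

```python
import sqlite3


def claim_next_task(conn):
    """Claim the highest-priority pending task (oldest first on ties).

    Returns the claimed task id, or None if nothing is pending.
    """
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT task_id FROM tasks WHERE state = 'pending' "
            "ORDER BY priority DESC, created ASC LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE tasks SET state = 'running' WHERE task_id = ?", (row[0],)
        )
        return row[0]


# Hypothetical table layout, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    "task_id TEXT PRIMARY KEY, priority INTEGER, created INTEGER, state TEXT)"
)
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?, 'pending')",
    [("a", 1, 10), ("b", 5, 20), ("c", 5, 15)],
)
```

Repeated calls to `claim_next_task(conn)` hand out tasks in priority order, breaking ties by age. A key-value store would need a separately maintained index or one queue per priority level to get the same ordering.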

Cons:

  • It's not a hands-off auto-scaling solution
  • A single expensive query can bring the system down

There is an initial attempt at outlining what the database schema would look like here:
https://public.etherpad-mozilla.org/p/jonasfj-queue-with-postgres

@djmitche
Contributor

djmitche commented Jun 1, 2017

If we do our heavy queries on a read-only replica, then we can avoid bringing down production.
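
A minimal sketch of that routing idea, assuming the application tags its expensive queries explicitly. The `QueryRouter` class, the `heavy` flag, and both connection strings are hypothetical names chosen for illustration; nothing here reflects an actual deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRouter:
    """Pick a connection string for a query (hypothetical helper)."""

    primary_dsn: str
    replica_dsn: str

    def dsn_for(self, sql: str, heavy: bool = False) -> str:
        # Only plain SELECT statements flagged as heavy go to the replica.
        # Writes, and anything we cannot cheaply classify, stay on the primary.
        is_select = sql.lstrip().upper().startswith("SELECT")
        if heavy and is_select:
            return self.replica_dsn
        return self.primary_dsn


# Placeholder hosts, for illustration only.
router = QueryRouter(
    primary_dsn="postgresql://queue-primary.example/queue",
    replica_dsn="postgresql://queue-replica.example/queue",
)
```

With this, `router.dsn_for("SELECT count(*) FROM tasks", heavy=True)` returns the replica DSN, while writes and unflagged reads stay on the primary. Replicas can lag behind the primary, so this only suits queries that tolerate slightly stale data, such as statistics.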

@jonasfj
Author

jonasfj commented Jun 1, 2017

> If we do our heavy queries on a read-only replica, then we can avoid bringing down production.

Indeed, the nice thing with Azure Table and similar services is that you can't make queries that won't scale. It's usually pretty obvious that a full-table scan doesn't scale. With Postgres there is a risk that you won't find performance issues until you have already scaled.

@djmitche
Contributor

djmitche commented Jun 2, 2017

Given our team's knowledge of postgres at scale, I'd say that is not so much a risk as a certainty.

@djmitche
Contributor

I'm in favor of this, but I think at least two members of the team should go to some Postgres admin training. @selenamarie may be able to recommend something? Ideally we'd have a good knowledge of things like sharding, replicas, explaining queries, when to use views and triggers, and so on. And probably a half dozen things with which I don't even have a passing acquaintance.

I think a proposal for this should include some specifics about how the data would be organized in Postgres, and also how we'd migrate to it.

@djmitche
Contributor

This quarter:

  • Learn us some Postgres for great good
  • Design the implementation
    • Schemas
    • Deployment / operations model (read-only replicas? Expiration? etc.)
  • Find hosting (maybe Heroku, maybe not)
  • Plan transition (we'll need to be in both Azure and Pg for a little while..)

Toward the first bullet point, I emailed Selena.

@djmitche
Contributor

djmitche commented Feb 5, 2018

I don't think there's any dissent that this is something we should do, although we haven't committed to when to do it. So I've marked it as "decided".

@djmitche djmitche merged commit 0dfa314 into master Feb 7, 2018
@owlishDeveloper owlishDeveloper deleted the rfc65 branch February 9, 2019 19:46