Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exploring wal2json #20

Closed
mmd-osm opened this issue Jul 15, 2020 · 2 comments
Closed

Exploring wal2json #20

mmd-osm opened this issue Jul 15, 2020 · 2 comments

Comments

@mmd-osm
Copy link
Contributor

mmd-osm commented Jul 15, 2020

As we were having lots of discussions on the plugin topic recently, and more challenges ahead of us when it comes to packaging the plugin for different Postgresql versions, I started exploring the option to use an existing plugin, that is part of the Postgresql repository. wal2json seemed like a good option for this exercise:

This extension is supported on those platforms that PostgreSQL is. The installation steps depend on your operating system. PostgreSQL yum repository and PostgreSQL apt repository provide wal2json packages.

There's one downside I see at the moment: they still have to figure out the same issue with excessive BEGIN/COMMIT entries, we've already solved (see eulerto/wal2json#106). Apart from that, I believe this plugin would be feature complete for our requirements.

I prepared a small demo in https://github.com/mmd-osm/osmdbt/tree/json as a proof of concept. It moves the mapping that was previously done in osm-plugin to a JSON parser in osmdbt-get-log.

In the long run this might be cheaper for us, as can leave all those Postgresql plugins details to others.

By the way, https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html also lists wal2json as supported plugin, albeit in a somewhat older version. So this all seems like production ready code.

@joto
Copy link
Collaborator

joto commented Jul 16, 2020

Generating and parsing the verbose JSON seems like a huge and unnecessary overhead to me. We are moving quite a bit of data here and should avoid any overhead we can.

Lets move ahead with what we have now instead of getting sidetracked by shinier things somewhere else.

@joto joto closed this as completed Jul 16, 2020
@mmd-osm
Copy link
Contributor Author

mmd-osm commented Jul 16, 2020

Just for the record, osmdbt-get-log w/ catch-up for 1 mio changes takes about 41s on wal2json, and 17s on osm-logical. Both values are still well within our worst case volume for minutely replication (which would be around 500k changes). So yes, there's overhead (which was expected), but it's not totally out of proportion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants