Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decoding with database column type #1

Closed
kiwicopple opened this issue Sep 26, 2019 · 9 comments
Closed

Decoding with database column type #1

kiwicopple opened this issue Sep 26, 2019 · 9 comments

Comments

@kiwicopple
Copy link

Hello again, I made a comment on your other repo and I'm making steady progress towards a usable repo - thanks again for your repos, they are extremely useful for my work.

I have another noob question about the WAL and the ability to decode it with the correct DB column type. For example, the ID column in my tables are bigints, but when it gets decoded it is a string.

Here is a full example:

[debug] Received binary message: <<66, 0, 0, 0, 0, 3, 97, 225, 16, 0, 2, 54, 112, 113, 117, 141, 181, 0, 0, 3, 183>>

[debug] Decoded message: %PgoutputDecoder.Messages.Begin{commit_timestamp: #DateTime<2019-09-26 09:48:41Z>, final_lsn: {0, 56746256}, xid: 951}

[debug] Received binary message: <<85, 0, 0, 68, 8, 78, 0, 7, 116, 0, 0, 0, 1, 50, 116, 0, 0, 0, 12, 84, 101, 115, 116, 32, 112, 97, 114, 116, 110, 101, 114, 116, 0, 0, 0, 7, 49, 50, 51, 52, 53, 54, 55, 110, 110, 116, 0, 0, 0, 23, 50, 48, 49, 57, 45, 48, 57, 45, 50, 54, 32, 48, 57, 58, 52, 56, 58, 50, 49, 46, 48, 51, 55, 116, 0, 0, 0, 23, 50, 48, 49, 57, 45, 48, 57, 45, 50, 54, 32, 48, 57, 58, 52, 56, 58, 52, 49, 46, 49, 50, 53>>

[debug] Decoded message: %PgoutputDecoder.Messages.Update{changed_key_tuple_data: nil, old_tuple_data: nil, relation_id: 17416, tuple_data: {"2", "Test partner", "1234567", nil, nil, "2019-09-26 09:48:21.037", "2019-09-26 09:48:41.125"}}

As you can see the tuple_data has an value of "2" for the ID.
I will be happy to make a PR to fix this if it's possible, I just don't know exactly where to start.

@bbhoss
Copy link
Collaborator

bbhoss commented Sep 26, 2019

Hey! So I comment on this limitation in the main cainophile readme/post. The problem is that Postgres sends everything as a string/binary, as if you'd run the query

SELECT id::text FROM footable

See TupleData here for how it actually sends it. This basically makes it impossible to decode without actually being postgres. There are two options I see for it. First is to create manual mappings of known types to Elixir types, and provide a way to extend for ones we don't want to support out of the box. The other is to somehow leverage the postgres library directly as a NIF to have it do the decoding the same way a postgres server would do it. With the latter, we'd still need to translate the internal data format to Elixir types, so it's not a magic bullet. I think the former is probably the best way to do it, but I don't believe the encoding format is guaranteed anywhere, so the client library might be safer in the long run. Interested to hear what you come up with and definitely open to a pull request for either of these approaches or something I haven't considered.

@kiwicopple
Copy link
Author

OK cool, i completely missed the limitation! Let me explore first how to get this working with wal2json (since JSON is my goal anyhow), and then I will update here.

Feel free to close this, I'll still drop some notes later based on my progress

@bbhoss
Copy link
Collaborator

bbhoss commented Sep 27, 2019

Since wal2json requires compiling and installing a shared library to the Postgres server, I suspect they just have a native to json mapping internally that we could copy. Debezium has recently merged a similar effort which you could reference as well debezium/debezium#956

@kiwicopple
Copy link
Author

Hey @bbhoss . I finally had a chance to dig into this. I understand your previous comments a lot better after digging through the PG documentation that you provided. My knowledge of elixir is somewhat limited so instead I have come to a resolution where i send all of the changes (including the columns), then I do the mappings on the client (javascript in my case). This is easier for me, but also suits my use-case as it allows me to decide which columns to reformat.
I had to draw heavily on your cainophile repo to make my changes and I have put attributions in the relevant files. Let me know if you want me to improve/change the attribution.

Also i'm still using this repo and it's perfect. The fact that I don't need to use/install wal2json is a massive bonus. Thanks once again. I'll close this issue

@bbhoss
Copy link
Collaborator

bbhoss commented Oct 11, 2019

Is what you're using available publicly somewhere? I don't care about attribution but would be happy to take a look/receive improvements. It's APL2 so you're good either way.

@kiwicopple
Copy link
Author

Yes, it's publicly available here: https://github.com/supabase/realtime
Any improvements/recommendations would be appreciated, and likewise I'm happy to PR any of the changes I have made into your cainophile (I doubt you will want anything). I'll link the client library here once I deploy it

@kiwicopple
Copy link
Author

Hey, I didn't want to open a new issue for this, just wanted to let you know that I found a bunch more types that you can add to the oid database:

https://github.com/brianc/node-pg-types/blob/master/lib/builtins.js

Alternatively, you can use this:

SELECT json_object_agg(UPPER(PT.typname), PT.oid::int4 ORDER BY pt.oid)
 FROM pg_type PT
 WHERE typnamespace = (SELECT pgn.oid FROM pg_namespace pgn WHERE nspname = 'pg_catalog') -- Take only builting Postgres types with stable OID (extension types are not guaranted to be stable)
 AND typtype = 'b' -- Only basic types
 AND typelem = 0 -- Ignore aliases
 AND typisdefined -- Ignore undefined types

Hope that helps

@bbhoss
Copy link
Collaborator

bbhoss commented Jul 15, 2020

Are you still using cainophile? I want to get these PRs merged if they still work.

@kiwicopple
Copy link
Author

We no longer plug cainophile, so it's not an easy PR - here is where we adapted your OID mappings (which you will probably be able to improve): https://github.com/supabase/realtime/blob/master/server/lib/decoder/oid_database.ex

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants