Transaction metadata gets dropped by Unicode NUL character #1421

Closed
jpy-luke opened this issue May 31, 2023 · 1 comment · Fixed by #1631

jpy-luke commented May 31, 2023

OS
Your OS: Gentoo Linux

Versions
The db-sync version (eg cardano-db-sync --version):
cardano-db-sync 13.1.1.1 - linux-x86_64 - ghc-8.10
git revision ec5b22b
PostgreSQL version: 15.3

Build/Install Method
The method you use to build or install cardano-db-sync: nix build

Run method
The method you used to run cardano-db-sync (eg Nix/Docker/systemd/none): manually from CLI

Additional context
I traced this to a workaround made for #297, now at line 50 of https://github.com/input-output-hk/cardano-db-sync/blob/master/cardano-db-sync/src/Cardano/DbSync/Era/Util.hs. Evidently the workaround went to production as is.

TL;DR: PostgreSQL cannot store \0 in text-like types. JSON and Unicode text may contain \0. When it does, the JSON-formatted value is dropped altogether.
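
To illustrate the point above, here is a minimal Python sketch (not db-sync code) showing that valid JSON can carry an escaped NUL, and how a consumer could detect it before attempting an insert. The helper name `contains_nul` and the sample payload are made up for illustration.

```python
import json


def contains_nul(value):
    """Return True if any string (key or value) in a decoded JSON value holds U+0000."""
    if isinstance(value, str):
        return "\x00" in value
    if isinstance(value, dict):
        return any(contains_nul(k) or contains_nul(v) for k, v in value.items())
    if isinstance(value, list):
        return any(contains_nul(item) for item in value)
    # Numbers, booleans and null cannot contain a NUL character.
    return False


# Hypothetical metadata payload: the JSON escape \u0000 is legal JSON text.
raw = '{"msg": "abc\\u0000def"}'
decoded = json.loads(raw)

if contains_nul(decoded):
    print("payload contains U+0000; PostgreSQL would reject it as text")
```

The check walks keys as well as values, since a NUL in either position would be a problem for a text-based column.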

Problem Report

While running cardano-db-sync to initially populate the database, intermittent Warning-level log messages appear:
[db-sync-node:Warning:75] [2023-05-31 11:02:29.12 UTC] prepareTxMetadata: dropped due to a Unicode NUL character.

The warning message is misleading. The tx metadata is not dropped; only the json column of the row is recorded as null. The metadata payload is still stored in CBOR format in the bytes column.

While this behaviour may not be critical, it can surprise database consumers. Short of an actual fix, it might be a good idea to document in schema.md that the json column has this limitation and that the bytes column records the object payload in CBOR encoding.
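
For database consumers hit by this, a possible fallback is to decode the bytes column when the json column is null. The Python sketch below is only an illustration under stated assumptions: it assumes the bytes column holds a single CBOR item, it handles only definite-length integers, byte strings, text strings, arrays and maps, and it assumes map keys are hashable. The names `decode_cbor` and `metadata_value` are hypothetical, and the sample bytes are hand-built rather than taken from the chain.

```python
def decode_cbor(data: bytes, pos: int = 0):
    """Decode one CBOR item starting at data[pos]; return (value, next_pos).

    Sketch only: definite-length ints, bytes, text, arrays and maps.
    """
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        arg = info  # small argument stored directly in the initial byte
    elif info <= 27:
        size = 1 << (info - 24)  # argument follows in 1, 2, 4 or 8 bytes
        arg = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("indefinite-length or reserved encoding not handled here")

    if major == 0:  # unsigned integer
        return arg, pos
    if major == 1:  # negative integer
        return -1 - arg, pos
    if major == 2:  # byte string
        return data[pos:pos + arg], pos + arg
    if major == 3:  # text string; Python strings may contain U+0000
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 4:  # array of arg items
        items = []
        for _ in range(arg):
            item, pos = decode_cbor(data, pos)
            items.append(item)
        return items, pos
    if major == 5:  # map of arg key/value pairs (keys assumed hashable)
        result = {}
        for _ in range(arg):
            key, pos = decode_cbor(data, pos)
            val, pos = decode_cbor(data, pos)
            result[key] = val
        return result, pos
    raise ValueError(f"major type {major} not handled in this sketch")


def metadata_value(json_col, bytes_col: bytes):
    """Prefer the json column; fall back to decoding the CBOR bytes column."""
    if json_col is not None:
        return json_col
    value, _ = decode_cbor(bytes_col)
    return value


# Hand-built CBOR for {"msg": "a\x00b"}, standing in for a row whose json is null.
sample = bytes.fromhex("a1636d736763610062")
recovered = metadata_value(None, sample)
```

In real use a maintained CBOR library would be a safer choice than a hand-written decoder; the sketch only shows that the payload remains recoverable from the bytes column.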

jpy-luke added the bug label May 31, 2023

kderme commented May 31, 2023

Indeed we can improve the logs for this.
