Skip to content

BREAKING: fix json marshal unmarshal for namespace > 127 (#7810,#7838,#8091) #8599 #8602

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

harshil-goel
Copy link
Contributor

@harshil-goel harshil-goel commented Jan 10, 2023

We used to store predicate as | (pipe | signifies concatenation). We store this as a string. is 8 bytes uint64, which when marshaled to JSON bytes mess up the predicate. This is because for the namespace greater than 127, the UTF-8 encoding might take up several bytes (also if the mapping does not exist, then it replaces it with some other rune). This affects three identified places in Dgraph:

Live loader
Backup and List Backup
Http clients and Ratel
Fix:
Fix is to have a UTF-8 string when dealing with JSON. A better idea is to use UTF-8 string even for internal operations. Only when we read/write to badger we convert it into the format of the byte.
New Format: - (- is the hyphen literal)

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@mangalaman93 mangalaman93 added the slash-to-main PRs which bring slash branch on par with main. label Jan 11, 2023
@mangalaman93 mangalaman93 added this to the v23.0.0 milestone Jan 11, 2023
@harshil-goel harshil-goel changed the base branch from harshil/manifest_bug_fix to anurag/cherry-pick-drop-data January 13, 2023 12:10
@harshil-goel harshil-goel changed the title Fix bugs introduced by change in format of predicate BREAKING: fix json marshal unmarshal for namespace > 127 (#7810) #8599 Jan 13, 2023
@harshil-goel harshil-goel changed the title BREAKING: fix json marshal unmarshal for namespace > 127 (#7810) #8599 BREAKING: fix json marshal unmarshal for namespace > 127 (#7810,#7838,#8091) #8599 Jan 16, 2023
mangalaman93
mangalaman93 previously approved these changes Jan 16, 2023
@matthewmcneely
Copy link
Member

A better idea is to use UTF-8 string even for internal operations.

@harshil-goel Was this accomplished in this PR? Have all use-cases been addressed with this namespace formatting change (migration, etc)?

@mangalaman93 mangalaman93 force-pushed the anurag/cherry-pick-drop-data branch 4 times, most recently from 4e1118c to 6c16b0b Compare January 23, 2023 16:09
Base automatically changed from anurag/cherry-pick-drop-data to main January 24, 2023 10:12
We used to store predicate as <namespace>|<attribute> (pipe | signifies concatenation). We store this as a string. <namespace> is 8 bytes uint64, which when marshaled to JSON bytes mess up the predicate. This is because for the namespace greater than 127, the UTF-8 encoding might take up several bytes (also if the mapping does not exist, then it replaces it with some other rune). This affects three identified places in Dgraph:

Live loader
Backup and List Backup
Http clients and Ratel
Fix:
Fix is to have a UTF-8 string when dealing with JSON. A better idea is to use UTF-8 string even for internal operations. Only when we read/write to badger we convert it into the format of the byte.
New Format: <anmespace>-<attribute> (- is the hyphen literal)
With #7810 change, we changed the format of the predicate. We missed updating the schema and predicate. This PR fixes it.
There is an issue in ExtractNamespaceFromPredicate. The issue is the parsing was done assuming ns in <ns>-<attr> to be decimal (actually it is hexadecimal). This leads to the following issues.

A predicate a-name, it was skipped.
A predicate 11-name was parsed as namespace 11, actually it is namespace 17 (0x11).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
slash-to-main PRs which bring slash branch on par with main.
Development

Successfully merging this pull request may close these issues.

5 participants