RDF parser bug with Unicode character on export. #3383

@MichelDiz

If you suspect this could be a bug, follow the template.

  • What version of Dgraph are you using?
    1.0.14, v1.0.15-rc4 and Master

  • Have you tried reproducing the issue with latest release?
    yes

  • What is the hardware spec (RAM, OS)?
    32GB, Darwin.

  • Steps to reproduce the issue (command/config used to run Dgraph).

This happened while importing a Twitter dataset produced by flock. The load was canceled:

[22:42:02-0300] Elapsed: 23m45s Txns: 41480 N-Quads: 41480000 N-Quads/s [last 5s]: 30200 Aborts: 952
[22:42:07-0300] Elapsed: 23m50s Txns: 41679 N-Quads: 41679000 N-Quads/s [last 5s]: 39800 Aborts: 952
2019/05/06 22:42:09 while parsing line "<0x273fdd> <description> \"Attention : ces jours-ci, Twitter pourra devenir instable, avec souvent des pro~po_~{po ─ ~ ®o n~poã_\\a~{o┼[po ╣y¿ po¿w4k*¿*n~p┌blèmes\\r\\nǝuuɐd uǝ ʇsǝ lı 'ʇsǝ ʎ ɐɔ\"^^<xs:string> .\n": while lexing <0x273fdd> <description> "Attention : ces jours-ci, Twitter pourra devenir instable, avec souvent des pro~po_~{po ─ ~ ®o n~poã_\a~{o┼[po ╣y¿ po¿w4k*¿*n~p┌blèmes\r\nǝuuɐd uǝ ʇsǝ lı 'ʇsǝ ʎ ɐɔ"^^<xs:string> . at line 1 column 25: Invalid escape character : 'a' in literal

To reproduce it, run a mutation like:

{
    set {
     <_:uid2> <pred> "\u0007"^^<xs:string> .
   }
}

Then export via http://localhost:8080/admin/export.
The bug shows up in the exported RDF, which contains an invalid \a escape:
<0x1> <description> "\a"^^<xs:string> .

When you try to re-import the RDF, you get a lexing error:
"while lexing <_:0x1> <description> \"\\a\"^^<xs:string> . at line 1 column 22: Invalid escape character : 'a' in literal"

To solve this in part, we should force (auto-)escaping, or recommend that users do it at the application level.
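
To illustrate the auto-escape idea, here is a minimal Python sketch of writing a literal so that only escapes defined by the N-Quads grammar are emitted. It is not Dgraph's export code: `escape_literal` and `to_nquad` are hypothetical names, and the sketch assumes the lexer accepts \uXXXX for control characters, as the mutation in the reproduction suggests.

```python
# Named escapes used here; all are defined by the N-Quads grammar.
_NAMED = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}


def escape_literal(value: str) -> str:
    """Escape a string for use inside a double-quoted N-Quads literal."""
    out = []
    for ch in value:
        if ch in _NAMED:
            out.append(_NAMED[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Remaining control characters (such as the bell, U+0007) have
            # no named escape, so write them in \uXXXX form.
            out.append("\\u%04X" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def to_nquad(subject: str, predicate: str, value: str) -> str:
    """Build one N-Quad line with an escaped xs:string literal (hypothetical helper)."""
    return '%s <%s> "%s"^^<xs:string> .' % (subject, predicate, escape_literal(value))


if __name__ == "__main__":
    print(to_nquad("<0x1>", "description", "\x07"))
```

With escaping along these lines, a value holding the bell character would be written as "\u0007" rather than "\a", so the exported line could be lexed again on import.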

If you escape the string before mutating, the export produces the desired result:

{
    set {
     <_:uid2> <pred> "\\u0007"^^<xs:string> .
   }
}

RDF exported:
<0x1> <pred> "\\u0007"^^<xs:string> .
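
The application-level workaround can be sketched in Python as follows. This is only an assumption about how a client might do it: `neutralize_controls` and `mutation_literal` are hypothetical names, not part of any Dgraph client. Control characters are replaced with visible \uXXXX text before the literal is quoted, so the stored value is that text and no longer the original character.

```python
import re

# Control characters other than tab, newline and carriage return.
_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def neutralize_controls(text: str) -> str:
    """Replace each matched control character with the visible text \\uXXXX."""
    return _CONTROLS.sub(lambda m: "\\u%04X" % ord(m.group()), text)


def mutation_literal(text: str) -> str:
    """Quote text for an N-Quad in a mutation, after neutralizing controls."""
    safe = neutralize_controls(text)
    # Escape the backslash first so the later replacements are not doubled.
    escaped = safe.replace("\\", "\\\\").replace('"', '\\"')
    escaped = escaped.replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t")
    return '"%s"^^<xs:string>' % escaped


if __name__ == "__main__":
    print(mutation_literal("\x07"))
```

For a value containing only the bell character this yields the same literal as in the escaped mutation above. The trade-off is that readers of the data must convert the text back themselves, and a value that already contained the text \u0007 can no longer be told apart.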

Labels: kind/bug (Something is broken.)
