Inferred schemas treat integers as floats, may silently alter data #2377
Comments
This isn't a Dgraph thing. Go parses JSON integers as float64, which is what is causing this issue. You can see an example here: https://play.golang.org/p/gCvBHpNpsVG Update: A way to avoid this would be to send the data in RDF format or, in JSON, as a hex-encoded string with the schema set to int upfront.
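For readers who want to see the decoding behaviour in isolation, here is a minimal sketch of one way to keep integers intact with Go's standard library: `Decoder.UseNumber` makes `encoding/json` store numbers as `json.Number` (a string) rather than float64. The helper name `decodeInt64` is made up for this example, and nothing here is meant to describe how Dgraph does or should parse mutations.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// decodeInt64 is a hypothetical helper: it decodes a JSON object and returns
// the value stored under key as an int64 without passing through float64.
func decodeInt64(data, key string) (int64, error) {
	dec := json.NewDecoder(strings.NewReader(data))
	dec.UseNumber() // keep numbers as json.Number instead of float64

	m := make(map[string]interface{})
	if err := dec.Decode(&m); err != nil {
		return 0, err
	}
	n, ok := m[key].(json.Number)
	if !ok {
		return 0, fmt.Errorf("value for %q is not a JSON number", key)
	}
	// Int64 returns an error for non-integers and for values outside int64.
	return n.Int64()
}

func main() {
	v, err := decodeInt64(`{"hi": 9223372036854775296}`, "hi")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(v)
}
```

The trade-off is that every numeric value then needs an explicit `Int64()` or `Float64()` call, so the caller has to decide which type it expects.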
I've been trying out hex-encoded strings, but I haven't figured out how to encode them properly yet: "0x123" is rejected with "invalid syntax". Strings of digits without 0x in front ("123") are treated as decimal integers. Strings with hex letters ("1a") are rejected with "invalid syntax".
I can confirm that N-Quads writes round-trip correctly! That makes it look like Go's JSON library can write values it can't correctly read.
Yup. Here's a demo. You miiight want to choose a different JSON parser, or issue warnings to users when reading values that may not have round-tripped correctly. https://play.golang.org/p/kut6IgUn0r3

    package main

    import (
    	"encoding/json"
    	"fmt"
    	"log"
    	"strconv"
    )

    func main() {
    	fmt.Println("vim-go")
    	var val int64
    	val = 9223372036854775296
    	fmt.Println("Wrote int64 " + strconv.FormatInt(val, 10))
    	data := []byte(`{"hi": ` + strconv.FormatInt(val, 10) + `}`)
    	// Decoding into interface{} stores every JSON number as float64.
    	m := make(map[string]interface{})
    	if err := json.Unmarshal(data, &m); err != nil {
    		log.Fatal(err)
    	}
    	for _, v := range m {
    		// fmt.Printf("Key: %s. ", k)
    		switch v.(type) {
    		case int:
    			fmt.Println("Read int: ", v)
    		case float64:
    			fmt.Println("Read float64 ", v)
    			f, _ := v.(float64)
    			// The float64 has already been rounded, so this is not val.
    			fmt.Println("As int64 ", int64(f))
    		default:
    			fmt.Println("Type unknown")
    		}
    	}
    }
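The suggestion above about warning users could be sketched roughly as follows. This is only an illustration of the idea, not anything Dgraph implements: `survivesFloat64` is a hypothetical name, and it simply checks whether an int64 comes back unchanged after a trip through float64.

```go
package main

import "fmt"

// survivesFloat64 reports whether v is unchanged after being converted to
// float64 and back to int64. (Hypothetical helper for illustration.)
func survivesFloat64(v int64) bool {
	f := float64(v)
	// float64(v) can round up to 2^63, which does not fit in int64.
	// Converting such a value back is implementation-dependent in Go,
	// so treat it as a failed round trip.
	if f >= 1<<63 {
		return false
	}
	return int64(f) == v
}

func main() {
	for _, v := range []int64{0, 9007199254740992, 9007199254740993, 9223372036854775296} {
		if !survivesFloat64(v) {
			fmt.Printf("warning: %d is not exactly representable as float64\n", v)
		}
	}
}
```

A check like this only helps when the original integer is still available, for example on the write path before the value is handed to a float-based decoder.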
Following up on #2378.
Since at least 1.0.2 and through 1.0.5-dev 5b93fb4, Dgraph will infer the type of new predicates with integer values as float, which means that unless users specify a schema up front, they could write 0 and read back 0.0. In languages with aggressive numeric type coercion, this may work fine, until users attempt to write a number which is not cleanly representable as a float. For instance, if one writes 9007199254740993 to a predicate without a schema, then attempts to read that value back, Dgraph will return 9007199254740992 instead. Write 27670116110564327426, and 2.7670116110564327E19 comes back, 426 less than the value written.

You can reproduce this with Jepsen 56dce4d5b875bc2eec841564f865b72168c91938 by running