Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JObject roundtrip issue #39

Closed
oliverjanik opened this issue Apr 1, 2016 · 13 comments
Closed

JObject roundtrip issue #39

oliverjanik opened this issue Apr 1, 2016 · 13 comments

Comments

@oliverjanik
Copy link

As promised here's a test that fails JObject roundtrip failing test. Just drop it into the GitHubIssues.cs

        [Test]
        public void issue_39()
        {
            var dateTime = DateTime.Parse("2016-09-03T10:30:20Z");
            var obj = new JObject(new JProperty("timestamp", dateTime));

            var table = R.Db(DbName).Table(TableName);
            table.Delete().Run(conn);

            var result = table.Insert(obj).RunResult(conn);
            var id = result.GeneratedKeys[0];

            var check = table.Get(id).RunAtom<JObject>(conn);
            check.Dump();

            var dt = (DateTime)check["timestamp"];
            dt.Should().Be(dateTime);
        }

There's the output of check.Dump():

{
  "id": "e605a378-d8f9-4d0d-b910-e4707c0ce6b0",
  "timestamp": {
    "$reql_type$": "TIME",
    "epoch_time": 1472898620,
    "timezone": "+10:00"
  }
}
@bchavez
Copy link
Owner

bchavez commented Apr 1, 2016

Hey Oliver,

Thanks for taking the time to report this. I appreciate it.

I think what's happening here is when you specify RunAtom<JObject> the driver passes JObject to Newtonsoft. This causes Newtonsoft totally bypasses our converters because it doesn't need them for converting a JObject. So, check remains a JObject pseudo type. And 💥 goes the exception. Using a JObject is like manual mode. There's a blurb about raw types in the documentation Raw Format Options wiki page here. I updated a note about it under Extra C# Features:JObject Support wiki page too.

If you want to pull this DateTime out of the JObject pseudo type, you'll need to do something like this:

var dt = check["timestamp"].ToObject<DateTime>(Net.Converter.Serializer);

Let me know if this helps.

PS. I'm going on off-grid tomorrow morning. So I won't be back for a couple of weeks. Also, I won't be around the Internet very much during my time off.

@oliverjanik
Copy link
Author

It's good that you have that documented, but it doesn't help with my use-case. RethinkDB is a JSON store. All I'm doing is providing a thin HTTP wrapper around it. PUT saves json GET retrieves it back. As it stands I haven't found a good way to achieve non-destructive round-trip. Particularly when DateTimes are involved.

I think the driver should attempt and interpret the datums as native types if possible. If it comes accross $reql_type$: "TIME" it will convert that into a DateTimeOffset. The users of the library should be blissfully unaware of the underlying data model.

@oliverjanik
Copy link
Author

Looks like I'll need to to a post-processing pass on the JObjects. I can't have POCOs, unfortunately.

The problem stems from impedance mismatch between JSON and native REQL representation. JSON does not have advanced types like DateTime.

What happens here is you get value like 2016-09-03T10:30:20Z on the wire in JSON. JSON.NET interprets this as DateTime object for some reason then Reql driver saves it as native TIME type.

On the way back ReqlDriver makes no interpretation $reql_type$ and I get what's above.

I'm wondering what the solution is here? Should Reql Driver have a mode where it interprets $reql_time$ or should it have hooks for custom serialization? Or both?

@bchavez
Copy link
Owner

bchavez commented Apr 29, 2016

Hi @oliverjanik,

So I've been sort of avoiding this particular issue because of the complexities involved. But I guess it's time to discuss the music. 🎹 😸

What happens here is you get value like 2016-09-03T10:30:20Z on the wire in JSON. JSON.NET interprets this as DateTime object for some reason then Reql driver saves it as native TIME type.

In particular, I need some context, from what perspective?

App -> Driver -> Wire -> RethinkDB
or
RethinkDB -> Wire -> Driver -> App?

  • On the way out: The driver to RethinkDB: the driver will serialize any DateTime or DateTimeOffset to a $reql_type$:TIME. So by the time it hits the wire, the actual time is already in $reql_type$:TIME.
  • On the way back: From RethinkDB to the Driver: the driver picks up $reql_type$:TIME JSON over the wire.

In both cases, 2016-09-03T10:30:20Z is $reql_type$:TIME over the wire to and from RethinkDB server.

Now, as you've described, the problem arises when you have requested RunHelper<JObject> where T is JObject. Currently, we immediately take the wire JSON and feed it to Newtonsoft by saying something like JsonConvert.DeseralizeObject<JObject>(wire_json, UseOurSeralizer).

And here's where it gets tricky. Since JObject is a Newtonsoft special token type, I am not sure our UseOurSeralizer with $reql_type$ converters are considered during deserialization. My gut feeling here is I think Newtonsoft does not fully parses the response. It's almost like Newtonsoft lazy-loads the wire JSON only up to the first JObject token. And by this time, we've already returned JObject to the user. So how would our $reql_type$ converters ever get kicked into gear when Newtonsoft only parses the very first token of the response (not the entirety of the response)?

So far, the only solutions I can really think of are:

Solution A
One solution is to have a special test inside all the Run* methods to test for T is JToken, then run the entirety of the response through a 2nd stage converter (like Solution B). Just feels ugly and hackish, and doesn't feel right. 😢

Solution B
Give the user a conversion utility helper? Originally back in November, we had a more proactive converter. See this commit. whereby all the $reql_type$ tokens were replaced using JSON Path. We could possibly bring this back and expose it to the user. Again, just feels ugly, hackish and doesn't feel right. 😢

So I could be completely wrong since it's so very late here... but I think this is fundamentally what is happening here and I don't think there's an easy answer for this.

@oliverjanik
Copy link
Author

Sorry, I wasn't clear enough. The flow of data is like this:

Client -> Wire -> MyApi -> Driver -> RethinkDB

In MyApi I receive data using WebAPI controller typed as JObject. This is where the conversion from 2016-09-03T10:30:20Z to DateTime happens for some reasons. I don't think this is good behaviour but it is what it is.

I pass this JObject without any modifications to the Driver which converts DateTime to $reql_type$:TIME. This is correct behaviour.

On the way back I have this:

RethinkDB -> Driver -> MyApi -> Wire -> Client

I grab my object from the Driver by id typed as JObject and I don't get the DateTime back. This is where I think the interpretation should happen so I would get DateTime back.

Then I return this JObject as response from my controller.

Result:

The net result is that from Client's perspective they don't get what they stored within my system.

@oliverjanik
Copy link
Author

oliverjanik commented Apr 29, 2016

I don't think the driver should treat JObject any different to other objects. That is implementation detail.

I do believe you need a 2 stage parser. If I ask for IDictionary<string, object> and one of the values is $reql_type$:TIME, what do I get back? I would expect DateTime.

@bchavez
Copy link
Owner

bchavez commented Apr 30, 2016

Unfortunately, after reviewing some Newtonsoft source it looks like we would need a second stage parser specifically for JObject to resolve this impedance mismatch. The JsonReader is non-cached, forward-only parser. 🌵 🔥

The only reasonable alternative it seems is to bring back our JSON Path converter. Use JSON Path to iterate over the JObject and pluck out all the $reql_type$ pseudo types and replace them appropriately.

I'll need a few days to deal with this since this is a major breaking change and would mean we'd need to support raw options again.

@bchavez
Copy link
Owner

bchavez commented Apr 30, 2016

Also, @oliverjanik , after exploring your specific issue more from

Client -> Wire -> MyApi

You can configure Newtonsoft to parse JSON date / times as DateTimeOffsets with the following

var json = @"{
                ""name"":""hello kitty"",
                ""dob"":""2016-09-03T10:30:20Z""
              }";

var settings = new JsonSerializerSettings
    {
        DateParseHandling = DateParseHandling.DateTimeOffset
    };

var j = JsonConvert.DeserializeObject<JObject>(json, settings);
// now, internally j["dob"] should be DateTimeOffset, no longer DateTime.

@oliverjanik
Copy link
Author

Thanks for the DateParseHandling tip.

If I set it to DateParseHandling.None my timestamp should stay a string and that would avoid this roundtrip issue. It would prevent me from running time functions on the database though.

bchavez added a commit that referenced this issue May 3, 2016
…ow, by default, get passed though pseudo type conversion before being passed back to the user.
bchavez added a commit that referenced this issue May 3, 2016
Fixes #39. Breaking Change: JObject and JArray (JToken derivatives) now, by default, get passed through pseudo type conversion before being passed back to the user.
@bchavez
Copy link
Owner

bchavez commented May 11, 2016

Hi @oliverjanik , 2.3.1-beta-2 should have this fix now. All JToken return types should have their underlying pseudo types converted by default. If you still want to retain old raw type behavior, you'll need to specify .Run(conn, new { time_format="raw"}). Let me know how it goes. Thanks!

https://www.nuget.org/packages/RethinkDb.Driver/2.3.1-beta-2

-Brian

@oliverjanik
Copy link
Author

Thanks heaps, you're a legend.

@oliverjanik
Copy link
Author

@bchavez I've just upgraded to 2.3.3 stable and I still see this:

{
  "_createdBy": "oliver",
  "_storedAt": {
    "$reql_type$": "TIME",
    "epoch_time": 1464077642.148,
    "timezone": "+00:00"
  },
  "id": "03233075-1c5b-4df5-ba2e-50de71f58949",
  "stamp": "2016-05-24T17:32:03Z",
  "test": "hello!"
}

Is there anything I need to configure?

@oliverjanik
Copy link
Author

Alright I had RunAtomAsync<object>(m.Conn) in the code and I changed it to RunAtomAsync<JObject>(m.Conn) and it looks ok apart from #64

The abstraction leaks. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants