Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify run helpers? #43

Open
oliverjanik opened this issue Apr 20, 2016 · 13 comments
Open

Simplify run helpers? #43

oliverjanik opened this issue Apr 20, 2016 · 13 comments
Milestone

Comments

@oliverjanik
Copy link

One one the first things that confused me was why are there so many Run* methods.

Let's see (excluding async versions):

  1. object Run()
  2. object Run<T>()
  3. void RunNoReply()
  4. Cursor<T> RunCursor<T>()
  5. T RunAtom<T>()
  6. T RunResult<T>()
  7. Result RunResult()
  8. Cursor<Change<T>> RunChanges<T>()
  9. IEnumerable<GroupedResult<TKey, TItem>> RunGrouping<TKey, TItem>()

Why is (2) needed? If i give it a and it returns object it's not at useful as it could be. Here's suggestion:

  1. object Run()
  2. T Run<T>()
  3. Cursor<T> RunCursor<T>()

If you need Result you'd just do Run<Result>(), if you need Atom you'd do Run<int>.

What do you think?

@bchavez
Copy link
Owner

bchavez commented Apr 20, 2016

Hey Oliver,

To answer your question (from documentation) the differences are subtle but valuable differences. Each vary in strictness in what they can deserialize:

Wiki: Run Helpers

.RunAtom

When the response type for a query is a SUCCESS_ATOM the driver is expected to return an object of T. Typically, queries that return singleRowSelection are atom responses.

Hero result = R.Db("marvel").Table("heros")
              .Get("iron_man")
              .Run<Hero>(conn);

var result = R.Db("marvel").Table("heros")
              .Get("iron_man")
              .RunAtom<Hero>(conn);

Emphasis on parsing SUCCESS_ATOM responses only.


.RunResult<T> (Generic)

.RunResult<T>() helps deserialize T object atom (SUCCESS_ATOM) OR finished sequences List<T> (SUCCESS_SEQUENCE) List<T> server responses.

Either SUCCESS_ATOM or SUCCESS_SEQUENCE.


.RunResult (Non Generic) DML

.RunResult() helps with DML (data manipulation language) type of result queries:

var result = R.Db("marvel").Table("heros")
              .Insert({"name":"Iron Man"})
              .RunResult(conn);
/*
{
    "deleted": 0,
    "errors": 0,
    "inserted": 1,
    "replaced": 0,
    "skipped": 0,
    "unchanged": 0
}
*/

As for .Run<T>, per documentation:

Wiki: Dynamic Language Runtime (DLR) Integration

/* long result is 4 of type long */
long result = R.Table("foobar").Get("a")["Baz"].Run(conn);

// Give the compiler and run-time more type information with run<T>()
/* int result is 4 of type int */
int  result = R.Table("foobar").Get("a")["Baz"].Run<int>(conn);

Notice the result declarations long and int and their respective Run and Run<T> calls. The underlying deserializer, Newtonsoft.Json determined the underlying type (without T) is a long. Given T as int, the deserializer can make a more specific deserialization of the result.


Thanks,
Brian

@bchavez
Copy link
Owner

bchavez commented Apr 20, 2016

Ultimately, though, I like the way run helpers cover all response types without involving DLR. I was using the driver in my projects and I ran into situations where I wanted .Run to do more. I prefer having the .RunMethod(conn) call to be as simplified as possible.

Imagine:

var result = 
      R.Expr(games).Group("player")
       .Run<GroupedResult<string, Game>>(conn);
       // Eh. Double Generic. Not so pretty nor elegant ...

versus:

var result = 
    R.Expr(games).Group("player")
     .RunGrouping<string, Game>(conn);
     // legit

If you have better naming suggestions for the run helpers we could discuss those suggestions.

@oliverjanik
Copy link
Author

oliverjanik commented Apr 20, 2016

Thanks Brian, I have read that documentation. I'm playing devil's advocate here.

As the user of the library I don't quite see why there is a difference between Run<T>, RunResult<T>, RunAtom<T> or why I should even care. They all return json that I want serialized back to <T>.

  1. The object Run<T> is the most confusing. If I ask for a an int why does it return dynamic? In what situation would this be useful?
  2. The T RunResult<T> covers other ones, doesn't it? if result is SUCCESS_ATOM then RunResult<T> does exact same thing as RunAtom<T>, correct?
  3. And RunResult<Result>() does the same thing as RunResult(), right?

@bchavez
Copy link
Owner

bchavez commented Apr 20, 2016

Hi @oliverjanik ,

No worries. 👍

Regarding 1

We briefly had a technical discussion about this in the early days of the driver. I believe the discussion started here: #4 (comment)

IIRC, T .Run<T> becomes an issue with Cursors. While it's easy for us to return T as int or long or Person. T is freely provided by the compiler and can be used in the C# language by generic parameter. What is not so easy (nor free) is determining T in .Run<Cursor<T>>, where T is the cursor's item type. We need T in .Run<Cursor<T>> when deserializing Cursors. To do that, we would need to use System.Reflection to probe for T in Run<Cursor<T>>, and as we know, System.Reflection is slow and gets ugly fast. Newtonsoft has no idea how to construct Cursor<T> from JSON.

So, dynamic .Run<T> will get us: T as int or long or Person or Cursor<T> ✨ free without much hassle. We have T immediately available as a generic argument. Additionally, dynamic .Run<T> behavior is consistent with the Java driver and all the other official drivers where return type could be T or Cursor<T>.

Regarding 2

The T RunResult covers other ones, doesn't it? if result is SUCCESS_ATOM then RunResult does exact same think as RunAtom, correct?

RunResult<T> handles both SUCCESS_ATOM and SUCCESS_SEQUENCE responses. But .RunAtom only handles SUCCESS_ATOM. So, .RunAtom could be useful to someone out there that only wants to ensure query SUCCESS_ATOM responses. There may be a use case and would like to leave this door open for them.

Regarding 3

And RunResult<Result>() does the same thing as RunResult(), right?

Yes, I think so. In terms of aesthetics and style I think RunResult() is a lot better than RunResult<Result>(). 😎


As the user of the library I don't quite see why there is a difference between Run, RunResult, RunAtom or why I should even care. They all return json that I want serialized back to .

You make a good point. I would agree that the protocol details have bled into run helpers (such as needing to know what your response type is: SUCCESS_ATOM, SUCCESS_SEQUENCE, SUCCESS_PARTIAL). We're giving a full range of performance options albeit at a slightly cost of complexity of needing to know which run helper to use correctly. We could be open to suggestions on better naming of the run helpers but removing some I don't think is a good idea.

Thoughts?

@oliverjanik
Copy link
Author

oliverjanik commented May 18, 2016

@bchavez

I'm running into a problem with Run helpers. I have a query like this:

r.db.table().getAll()

Which returns Cursor, so I use RunCursor<>. Now when I modify the query like this:

r.db.table().getAll().orderBy()

It doesn't return cursor, it returns Atom.

Which helper do I use to cover both cases? The orderby expression is appended on the fly.

Edit:
I just found this: http://stackoverflow.com/questions/25375082/when-will-rethinkdb-return-a-cursor
It seems it's entirely up to server to return whatever response type it likes and drivers need to cope.

Should RunCursor support all response types including non streaming ones and just provide in memory shim for iteration?

@bchavez
Copy link
Owner

bchavez commented May 18, 2016

Hey @oliverjanik, depends on the response type. I'd need to see the protocol traces for each. There are two types of cursor responses: cursor partial and cursor complete. If it's a complete cursor response then .RunResult<T> run helper can parse both atom and cursor complete. However, _I think_ it would be unsafe to use a run helper in this respect because a table().getall() could return a partial cursor eventually causing .RunResult<T> to throw.

Also, if you need any more help, I'm available on slack. http://slack.rethinkdb.com @bchavez

@bchavez
Copy link
Owner

bchavez commented Apr 25, 2018

One thing about run helpers that bothered me was the sort of ambiguous method overload RunResult method:

  1. T RunResult<T>()
  2. Result RunResult()

So, I've been thinking about renaming the 7. Result RunResult() to:

  1. Result RunWrite().

The Go driver uses RunWrite and I think the name is better suited for what it actually does, write to the database and get a write Result response. :)

This will probably make it into 2.4 driver release with an [Obsolete] on Result RunResult().

☀️ 🌙 "I've been counting down the days and the nights..."

@bchavez bchavez added this to the 2.4 milestone Apr 25, 2018
@oliverjanik
Copy link
Author

oliverjanik commented Aug 22, 2019

Hi @bchavez This has bit me again.

I have situation where a query gets built dynamically based on various parameters.

This depending actual query that gets issued I can get one of 3 responses:

  1. SUCCESS_SEQUENCE
  2. SUCCESS_PARTIAL
  3. SUCCESS_ATOM with an array

From my point of view I just need to return an array out of my API.

Is there a single Run Helper method that covers the 3 scenarios?

RunCursor does not handle SUCCESS_ATOM, RunResult<List> does not handle SUCCESS_PARTIAL.

would it be possible for RunCursor to support SUCCESS_ATOM if the result is an array?

@bchavez
Copy link
Owner

bchavez commented Aug 22, 2019

Hi @oliverjanik,

Pretty interesting problem and scenario.

My initial thoughts are not to make contract changes to .RunCursor() as it relates to the protocol responses being read from the wire.

However, I think there is an escape hatch you can utilize. Specifically, the undocumented .RunUnsafeAsync() method. See #141 for more info.

Basically, send your dynamic query down .RunUnsafeAsync(conn) and handle the raw Response object you get back manually.

Normally, the Run Helpers, like .RunResult() take a Response object and check the kind of response via Response.IsAtom or Response.IsSequence and do something specific with the response type. See below:

if( res.IsAtom )
{
try
{
if( typeof(T).IsJToken() )
{
if( res.Data[0].Type == JTokenType.Null ) return (T)(object)null;
var fmt = FormatOptions.FromOptArgs(query.GlobalOptions);
Converter.ConvertPseudoTypes(res.Data[0], fmt);
return (T)(object)res.Data[0]; //ugh ugly. find a better way to do this.
}
return res.Data[0].ToObject<T>(Converter.Serializer);
}
catch ( IndexOutOfRangeException ex )
{
throw new ReqlDriverError("Atom response was empty!", ex);
}
}

public virtual bool IsAtom => this.Type == ResponseType.SUCCESS_ATOM;
public virtual bool IsSequence => this.Type == ResponseType.SUCCESS_SEQUENCE;
public virtual bool IsPartial => this.Type == ResponseType.SUCCESS_PARTIAL;

The Response.IsAtom, Response.IsSequence, and Response.IsPartial help determine the kind of response to process SUCCESS_ATOM, SUCCESS_SEQUENCE, or SUCCESS_PARTIAL respectively. Then the Run Helpers sometimes apply transformations over the JSON data (ie: Converter.ConvertPseudoTypes). Finally, the Run Helpers returns the appropriate type for consumption.

A possible solution to your problem would probably involve having a run helper extension method that uses .RunUnsafeAsync under the hood that processes the Response object manually and can handle all 3 response types for your specific use case. Be mindful of disposing of any new Cursor objects you create manually or you could run the risk creating a memory leak.

Let me know if you think this approach could help.

Thanks,
Brian

🚶‍♂️ 🚶‍♀️ Missing Persons - Walking in L.A.

@bchavez
Copy link
Owner

bchavez commented Aug 22, 2019

Also, another approach could be to use some ReQL primitives that force the response object to always be SUCCESS_SEQUENCE or SUCCESS_PARTIAL, but never SUCCESS_ATOM. For example, .CoerceTo could help or something like r.expr([]).prepend(r.expr({foo:'bar'})) which forces the {foo:'bar'} object literal of SUCCESS_ATOM into an array of a SUCCESS_SEQUENCE response type.

@oliverjanik
Copy link
Author

My solution at the moment is to coerce everything to "array". I'm wondering if I lose out on benefits of actual cursors then. Or maybe it might strain the DB too much when requesting large dataset.

I'm a simple man when I issue Select clause against SQL Server I always get a dataset I can iterate. I don't care if it's cursor or list or atom or whatever.

With RethinkDB it seems to decide on the fly what kind of response it gives. Makes things rather non-trivial when using the driver. I'm not sure how predictable the response type is, given same input.

I'll look into the RunUnsafeAsync, that's my last resort.

@bchavez
Copy link
Owner

bchavez commented Aug 23, 2019

Thanks Oliver. I appreciate your perspective. Let me know if you get an implementation using .RunUnsafeAsync() working.

Perhaps we can look at adding a new run helper like .RunEnumerableAsync(), or .RunReadAsync() that always returns something that can be enumerated over regardless of the response type.

I've been planning to implement IAsyncEnumerable when C# 8 and .NET Standard 2.1 hit, so that's probably around the timeframe we can start looking at what we can do to improve the situation w.r.t dynamic queries.

Goes without saying, much of what we do in programming has some kind of trade-off. Also, let's not forget the spirit of the run helpers is to avoid the performance hit from the DLR. Many of these helper methods are tools to help you get closer to the wire. As such, protocol details like this, unfortunately, bubble up through the API.

@walterX12
Copy link

Perhaps we can look at adding a new run helper like .RunEnumerableAsync(), or .RunReadAsync() that always returns something that can be enumerated over regardless of the response type.

I am voting for this feature/methods. It would simplify driver use cases for "end-programmer", because IEnumerable output is expected outcome from the "SELECT-like" queries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants