r.debug() #3012

Open
wojons opened this issue Sep 5, 2014 · 6 comments

Comments

@wojons
Contributor

wojons commented Sep 5, 2014

This would be something added to the API. It could be appended to just about any other method in the API, and debug would be ignored in the chaining process. Let's take an example. You're using r.http, but you want to know how long just the r.http part took to run. Maybe you also want to know the status code that came back and how many retries it had to make.

r.http('http://example.com').debug(function(debug) {})

This way debug has its own context to work in, separate from everything else, so if the user still wants to run a filter on the data coming out of http, it would look like this.

r.http('http://example.com').debug(function(debug) {}).filter({'status': 'active'}).pluck('username')

Now, inside the debug function, maybe you want to log that data to a table or somewhere else if it was a super slow call. (I may be missing a branch or something in the command, but the idea should be there.)

r.http('http://example.com').debug(function(debug) {
    // log only the slow calls (latency above 100)
    return r.db('log').table('http_log').insert(debug.filter(r.row("latency").gt(100)))
}).filter({'status': 'active'}).pluck('username')
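
To make the intended semantics concrete, here is a minimal plain-JavaScript sketch. It is not ReQL and not an existing RethinkDB API: the `Chain` class, its `debug` method, and the `step`/`latencyMs` fields are hypothetical names chosen for illustration. The only point is that the callback receives runtime data about the previous step while the value flows on unchanged.

```javascript
// Hypothetical stand-in for a query chain; not part of any RethinkDB driver.
class Chain {
  constructor(value, meta) {
    this.value = value; // result of the previous step
    this.meta = meta;   // runtime data about the previous step
  }

  // Run one step and record how long it took.
  static run(name, fn) {
    const start = Date.now();
    const value = fn();
    return new Chain(value, { step: name, latencyMs: Date.now() - start });
  }

  // Hand the runtime data to the callback, then pass the chain through untouched.
  debug(callback) {
    callback(this.meta);
    return this;
  }

  filter(predicate) {
    const start = Date.now();
    const value = this.value.filter(predicate);
    return new Chain(value, { step: 'filter', latencyMs: Date.now() - start });
  }
}

const seen = [];
const result = Chain
  .run('fetch', () => [{ status: 'active' }, { status: 'banned' }])
  .debug((info) => seen.push(info)) // side channel only
  .filter((doc) => doc.status === 'active')
  .value;
```

Here `seen` ends up holding the runtime data for the fetch step only, and `result` is the same as if the `debug` call had been left out.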

There may also be a better name, maybe inspect rather than debug; it depends. A query should be able to have as many debug calls as needed, and things should also be able to be wrapped in debug:

r.debug(r.table('my_table'))

I am not exactly sure what should be in every debug. For example, a filter debug would report how many docs were sent into the filter and how many made it out. I think there should also be a way to do some sort of debug-and-merge, for when you want to get the debug data into the normal RethinkDB pipeline.

r.table().filter().debug(function(doc){}, true).pluck()
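
One possible reading of the filter case, sketched in plain JavaScript rather than ReQL. `filterWithDebug`, the `docsIn`/`docsOut` field names, and the shape returned when the merge flag is `true` are all assumptions; the proposal above does not pin any of them down.

```javascript
// Hypothetical helper: filter docs, report counts, and optionally merge the
// debug data into the value that continues down the pipeline.
function filterWithDebug(docs, predicate, callback, merge = false) {
  const kept = docs.filter(predicate);
  const debug = { docsIn: docs.length, docsOut: kept.length };
  callback(debug);
  // Assumed merge semantics: wrap the result together with its debug data.
  return merge ? { result: kept, debug } : kept;
}

const docs = [{ age: 20 }, { age: 35 }, { age: 50 }];
const isOver30 = (doc) => doc.age > 30;

const plain = filterWithDebug(docs, isOver30, () => {});
const merged = filterWithDebug(docs, isOver30, () => {}, true);
```

Without the flag the pipeline sees only the filtered docs; with it, the next step would receive both the docs and the counts.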
coffeemug added this to the subsequent milestone Sep 5, 2014
@coffeemug
Contributor

Related to #329 (and possibly a dup). I'll try to prioritize this.

@neumino
Member

neumino commented Sep 5, 2014

It's just a workaround, but one way to know what's going on in your query is to throw errors.

You can just sneak this snippet into your query:

.do(function(result) {
   return r.error(result.coerceTo("STRING")) // or result.typeOf()/result.info()/result.count()/etc.
})

@wojons
Contributor Author

wojons commented Sep 5, 2014

@coffeemug

This is not so much a logging feature as a way to get debug information that you may need to make better choices.

@neumino

Can you explain this a little more to me?

@neumino
Member

neumino commented Sep 5, 2014

@wojons -- so what happened a few times for me was that an error was thrown because a document was malformed.

When the error is thrown in the middle of a nested query, it can be a bit annoying to filter results and see where the error is. So in the case a field is missing, I just throw an error with the whole document using r.error.

That way I get the primary key of the offending document, and what the document is. And I can properly fix the document.

What can also happen is that you get a type error (like "expected X and got Y").
If you don't immediately see the document that has type Y, you can just throw Y to see what it is and if it's what you were expecting.
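
The same trick, sketched in plain JavaScript so it can be run without a server. `emailDomain` and the sample documents are made up for illustration; in ReQL the throw would be the `r.error(...)` call from the earlier snippet.

```javascript
// Throw the whole offending document instead of a generic message,
// so the error itself tells you which document to fix.
function emailDomain(doc) {
  if (typeof doc.email !== 'string') {
    throw new Error('malformed document: ' + JSON.stringify(doc));
  }
  return doc.email.split('@')[1];
}

const users = [
  { id: 1, email: 'a@example.com' },
  { id: 2 }, // missing the email field
];

let message = null;
try {
  users.map(emailDomain);
} catch (err) {
  message = err.message; // carries the primary key and the full document
}
```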

@danielmewes
Member

I think the important part about @wojons' r.debug() proposal is that you get access to additional runtime data, such as the latency of the term it's applied to.

If you don't need that, you can emulate debug() through something like this:

r.http('http://example.com').do(function(x) {
  return r.or(
    r.and(r.db('log').table('http_log').insert({...}),
      false),
    x)}).filter({'status': 'active'}).pluck('username')

The use of or and and makes sure that the insert is first executed, but then the function inside the do just returns its argument x, which here is made the second argument to or.

However, in contrast to @wojons' original example, you cannot log the latency, because you don't get access to it inside the do.
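
The evaluation order that makes this work can be seen with JavaScript's own `&&` and `||`. This is only an analogy for `r.and` and `r.or`, not ReQL itself, and `logInsert` is a hypothetical stand-in for the insert.

```javascript
const log = [];

// Hypothetical stand-in for the insert; returns a truthy write summary.
function logInsert(entry) {
  log.push(entry);
  return { inserted: 1 };
}

// (insert && false) yields false once the insert has run,
// so the || falls through to x.
function passThrough(x) {
  return (logInsert({ seen: x }) && false) || x;
}

const out = passThrough(['alice', 'bob']);
```

After this runs, `log` holds one entry and `out` is the original array, which is the pass-through behaviour the `do` block relies on.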

@wojons
Contributor Author

wojons commented Sep 6, 2014

@danielmewes

You have the right idea. When the query is being evaluated, it would see that there is a debug and start counting things. Filters would count the total number of docs that enter and the docs that leave, maybe with some stats (average, min, max) for processing each doc. pluck would show how many docs did not have any of the fields being plucked. Math operations like add and subtract might report the two original numbers before the operation, or whether an int-to-double conversion was needed on division. The list goes on, but it helps you know whether you can clean up your data and what is going on behind the scenes. I am sure there are lots of users who just write queries and have no idea what's going on. Unless you're a database enthusiast, how all of this works could go right over your head.

This is also useful since RethinkDB does not really have a query planner or an explain yet. It would provide that for at least one operation at a time.
