Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.fail() doesn't support complex payload #23

Closed
mitar opened this issue Jul 30, 2014 · 27 comments
Closed

.fail() doesn't support complex payload #23

mitar opened this issue Jul 30, 2014 · 27 comments

Comments

@mitar
Copy link
Collaborator

mitar commented Jul 30, 2014

While you can pass arbitrary data to .done(), you cannot pass it to .fail(). Allowing only string message is problematic: I would like to store whole stacktrace, for example. Wouldn't it be useful to have at least a standardized way to store a stacktrace along the message (array of lines)? Or allow storing arbitrary data in the same way as .done()?

@vsivsi
Copy link
Owner

vsivsi commented Jul 31, 2014

I think this is a good idea, and I just started to implement it, but now I remember why I didn't do it this way to begin with...

A given job can retry any number of times, and so unlike job.done() there isn't a single result to store. There could be a very large number of failed "results". This could of course be accommodated by pushing them onto a "log like" array in the job doc.

But this concerns me because it wouldn't take too many retries (with e.g. deep stack traces) to start encroaching on the 16MB size limit of a single MongoDB document.

And what about a job that fails 10 times and then succeeds? Should all of those failures be kept?

@vsivsi
Copy link
Owner

vsivsi commented Jul 31, 2014

I've implemented this as described above on the new_fail_value branch. Check it out. For the reasons stated above, I'm not super happy with it, but there it is.

@mitar
Copy link
Collaborator Author

mitar commented Jul 31, 2014

I must say that I would prefer if this would be an additional field to the log. So I would propose, that we have: .fail(message, [failureData], ...) and then a log entry with message is added, with field failure set to failureData if set, otherwise to message (objectified). The issue currently is that if you look at the log, it is hard to get that extra stack trace from failures – how do you know which one corresponds to which log entry? With adding a field failure you would exactly have access both to data and know which log entries correspond to failures (currently you do not know that either).

@mitar
Copy link
Collaborator Author

mitar commented Jul 31, 2014

I would not really worry about 16 MB limit. I think you would have to retry thousand times to start worrying about that. But then you have some other issue. And for periodic jobs you clone job documents anyway, no?

@vsivsi
Copy link
Owner

vsivsi commented Jul 31, 2014

Next you're going to ask that job.log() have a complex payload too! 😄

Like I said on the commit comment, I need to think about this. The goal should be to harmonize job.done() and job.fail() as much as possible...

@mitar
Copy link
Collaborator Author

mitar commented Jul 31, 2014

Good idea for job.log. :-) But yes, why not? :-)

There is an inherit difference between done and fail. Done has a result which is often consumed by other processes. While fail mostly has results only for users to debug. (And secondary for other processes, which can be stored above in my proposed failure). We could argue that .log and .fail are similar. And those two might be harmonized. (So allowing both a message and additional payload.)

And yes, having log field for additional payload (default to {}) would also help to know that some log entries were made by logging.

@mitar
Copy link
Collaborator Author

mitar commented Jul 31, 2014

So more I think more I would argue that fail and log should be harmonized (having message + additional payload stored and some way to know what type of log entry it is (maybe based on a field, maybe some additional type field with string constant). done() is very different.

@mitar
Copy link
Collaborator Author

mitar commented Jul 31, 2014

message: "Job Failed with #{"Fatal" if options.fatal} Error: #{err.value if err.value? and typeof err.value is 'string'}."

produces strange message when error is not fatal:

Job Failed with undefined Error: 

I suggest you do #{if options.fatal then "Fatal " else ''} or something similar.

@vsivsi
Copy link
Owner

vsivsi commented Feb 8, 2015

Fixed!

@vsivsi
Copy link
Owner

vsivsi commented Feb 9, 2015

I've also added functionality to job.fail() so that it always adds the runId to the failure object before writing it to the job document. This addresses the issue of not knowing which failures correspond to which runs (although it could previously be reconstructed from the order in the failure array and the order of failures in the log, these are now more tightly connected).

Also, per your request, job.log() now has a data option that will store an arbitrary object in the log entry. This is not used for failures however.

These changes are currently on their own branch, but I plan to merge them into dev soon.
https://github.com/vsivsi/meteor-job-collection/tree/new_log_failure_data

@mitar
Copy link
Collaborator Author

mitar commented Feb 9, 2015

Hm, could you make also job.fail() accept data payload?

@vsivsi
Copy link
Owner

vsivsi commented Feb 9, 2015

Isn't the fail object a data payload?

@mitar
Copy link
Collaborator Author

mitar commented Feb 9, 2015

Ah, now I see. You store failure data payload under failures, and then you store runId there so that it can be matched with the log. Hm, OK. But this makes it complicated to display the log for a particular job. I do not understand why failures is not just a counter/number, and you store the err object as data inside log array? Then all log entries would have the same structure, time, message, and optional data payload.

@vsivsi
Copy link
Owner

vsivsi commented Feb 9, 2015

At this point I'd really prefer not to break the document model. And changing failures to a counter would do that. job.fail() could write the failure object to both places (failures and log.data) but that seems really wasteful. My intent for the log array was always to be something that could easily be made human readable. Whereas the failures data need not be. Now with the optional log.data field, job.log calls can support custom extensions. The job.fail() failure objects can already do that.

Basically, I think it's suboptimal to force someone to filter through all of the log entries to find the failures.

I think all of this confusion was seeded by my early decision to automatically write special "success" and "failure" log entries during job.done() and job.fail() calls. I probably mostly did this for myself as a convenience for not having to write a separate job.log() for these events, although doing it in one call/update is more efficient. Note I didn't do it for job.progress() however, because there may be a lot of progress calls for a long running job!

Part of me wants to remove the "automatic" job.done() and job.fail() log entries, simply because they are an undocumented "surprise feature" that is not part of the document model. They don't hurt anything until someone relies on them being a certain way, and then things can break. Which is why I still think of the log as a human readable thing, mostly.

@mitar
Copy link
Collaborator Author

mitar commented Feb 9, 2015

job.fail() could write the failure object to both places (failures and log.data) but that seems really wasteful.

I agree.

My intent for the log array was always to be something that could easily be made human readable. Whereas the failures data need not be.

That's why there is message and data. ;-)

At this point I'd really prefer not to break the document model. And changing failures to a counter would do that.

I know that. But this why it is the last chance to break things before the 1.0 release. ;-)

@mitar
Copy link
Collaborator Author

mitar commented Feb 9, 2015

Anyway, I know how to wrap things for myself here around it. I just find it a bit cumbersome. I see that this should be simple one log for everything, and then you can have inside various types of log messages. Automatic message, custom message, failures, successes. And each log message could have both human readable and machine readable part.

But this is only my view. It is your decision. It is good enough for me how it is now.

@vsivsi
Copy link
Owner

vsivsi commented Feb 9, 2015

I personally like having the result/failures data objects be top level attributes in the job document. Storing them in the log adds a level of redirection, and I'd prefer not to sift through a log that may contain a lot of things to find these really important data. I'm going to keep the "automatic" log entries for save/rerun/fail/done for now. but leave them undocumented.

@mitar
Copy link
Collaborator Author

mitar commented Feb 9, 2015

OK.

@mitar
Copy link
Collaborator Author

mitar commented Feb 9, 2015

Probably at some point it will be necessary to support customization of messages for those automatic log messages. At least translation to other languages comes to mind.

@vsivsi
Copy link
Owner

vsivsi commented Feb 9, 2015

Yet another reason to remove them!

@mitar
Copy link
Collaborator Author

mitar commented Feb 9, 2015

What about then create _logJobFailure noop method, which does not do anything, but it is called automatically on failure. And _logJobDone which is the same. And start? Or what other cases you have when you log?

And then I can extend the class and make it log in the way it is currently logging, or with a translated string. Or whatever? So then log would be by itself always empty. Only failures would be populated automatically. And one can then easily extend.

Or maybe it would be even better to create generic hooks. _onJobFailure, _onJobDone, etc.

@vsivsi
Copy link
Owner

vsivsi commented Feb 10, 2015

I'm looking at this now.

@vsivsi
Copy link
Owner

vsivsi commented Feb 11, 2015

@mitar Your wish is my command. e19b2c3

Feedback welcomed.

@vsivsi vsivsi closed this as completed Feb 11, 2015
@vsivsi vsivsi reopened this Feb 11, 2015
@mitar
Copy link
Collaborator Author

mitar commented Feb 11, 2015

That looks amazing. Even better than I imagined!

@mitar
Copy link
Collaborator Author

mitar commented Feb 11, 2015

And this also plays well with stuff from new_log_failure_data branch, no?

@mitar
Copy link
Collaborator Author

mitar commented Feb 11, 2015

I have nothing more to add. :-) If you will ever be around Berkeley, tell me and I am buying you a lunch. :-)

@vsivsi
Copy link
Owner

vsivsi commented Mar 24, 2015

Merged into master.

@vsivsi vsivsi closed this as completed Mar 24, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants