What are the guarantees regarding single (non-duplicate) job execution? #31

Closed
imlm opened this issue Apr 23, 2015 · 6 comments

Comments

@imlm

imlm commented Apr 23, 2015

I am aware that the current implementation of the Pipelines API is based on App Engine Task Queues, which made me wonder whether the same warning about possible duplicate task execution applies to job executions. Are there any guarantees that a job will be executed only once (not counting job retries)? Or should a job always be idempotent?

@sadovnychyi
Contributor

https://github.com/GoogleCloudPlatform/appengine-pipelines/wiki/Python#what-is-a-pipeline

All pipelines must be idempotent. This means that running the same pipeline with the same inputs more than once will yield the same results and the same side-effects. The library does not enforce the idempotence requirement on pipelines, it is up to developers to do it themselves. However, the library provides a few pieces (like stable pipeline IDs) which make it easy to achieve idempotence for side-effects.
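
A minimal sketch of how a stable ID can make a side effect idempotent. The class, method, and the in-memory map are hypothetical stand-ins (not part of the Pipelines library); a real job would record the marker in durable storage, ideally in the same transaction as the effect itself.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical example: an external side effect keyed on a stable job ID. */
class IdempotentSideEffect {
  // Job IDs whose effect has already been applied (stands in for a datastore table).
  private final ConcurrentMap<String, Long> applied = new ConcurrentHashMap<>();
  private final AtomicLong balance = new AtomicLong();

  /** Applies the credit at most once per job ID; returns true only for the first call. */
  boolean creditOnce(String stableJobId, long amount) {
    // putIfAbsent is atomic: only the first caller for a given key sees null.
    if (applied.putIfAbsent(stableJobId, amount) != null) {
      return false; // duplicate execution of the same job, skip the effect
    }
    balance.addAndGet(amount);
    return true;
  }

  long balance() {
    return balance.get();
  }

  public static void main(String[] args) {
    IdempotentSideEffect ledger = new IdempotentSideEffect();
    System.out.println(ledger.creditOnce("job-42", 100)); // true
    System.out.println(ledger.creditOnce("job-42", 100)); // false
    System.out.println(ledger.balance());                 // 100
  }
}
```

Running the same job twice with the same ID changes the balance only once, which is the property the wiki asks developers to provide for side effects.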

@imlm
Author

imlm commented Apr 23, 2015

Oh, that was my bad. I made sure I had read the Java wiki because that's the version I am using, but never considered the Python one. 👍
Thanks!

@aozarov
Contributor

aozarov commented Apr 23, 2015

Right, pipeline code is expected to be idempotent in both cases. I am not
certain about the Python implementation, but the Java implementation is
slightly more lenient in the "yield the same results and the same
side-effects" part. Basically, it is true that the same Job can run more
than once, but after a run the job state is changed conditionally and
atomically (CAS). Such a change (including adding a child job) is
dropped if the job state was already modified. This means that as long as
your jobs do not change external state/data, you should be OK and should not
experience any data clobbering even if the input changes between runs.

Arie.
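
The conditional, atomic state change described above can be sketched roughly as follows. This is an in-memory illustration only, not the library's code: the class and method names are hypothetical, the state names are borrowed from the snippets quoted later in this thread, and a synchronized method stands in for whatever transaction the real backend uses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory job record whose run result is committed with a state check. */
class CasJobRecord {
  enum State { WAITING_TO_RUN, RETRY, WAITING_TO_FINALIZE }

  private State state = State.WAITING_TO_RUN;
  private final List<String> children = new ArrayList<>();

  /**
   * Commits the outcome of one run (here: a single child job) only if the job is
   * still in a runnable state. A second run of the same job finds the state already
   * changed, so its update is dropped as a whole.
   */
  synchronized boolean commitRun(String childJob) {
    if (state != State.WAITING_TO_RUN && state != State.RETRY) {
      return false; // another run already committed; drop this run's changes
    }
    children.add(childJob);
    state = State.WAITING_TO_FINALIZE;
    return true;
  }

  synchronized State state() {
    return state;
  }

  synchronized List<String> children() {
    return new ArrayList<>(children);
  }

  public static void main(String[] args) {
    CasJobRecord job = new CasJobRecord();
    System.out.println(job.commitRun("child-from-run-1")); // true
    System.out.println(job.commitRun("child-from-run-2")); // false
    System.out.println(job.children());                    // [child-from-run-1]
  }
}
```

Both runs executed the job body, but only the first one left a trace in the job graph. Anything a run did outside this record (an external write, an email) is not covered by the check, which is why external side effects still need to be idempotent.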


@imlm
Author

imlm commented Apr 24, 2015

Hi @aozarov ,

Thank you very much for further expanding on this. I assume this is the code snippet where CAS happens.

jobRecord.setState(State.WAITING_TO_FINALIZE);
jobRecord.setChildGraphGuid(currentRunGUID);
updateSpec.getFinalTransaction().includeJob(jobRecord);
updateSpec.getFinalTransaction().includeBarrier(finalizeBarrier);
backEnd.saveWithJobStateCheck(
    updateSpec, jobRecord.getQueueSettings(), jobKey, State.WAITING_TO_RUN, State.RETRY);

What intrigues me is that there is also a CAS-like check before the job runs, but it does not actually change the job state:

if (!backEnd.saveWithJobStateCheck(
    tempSpec, jobRecord.getQueueSettings(), jobKey, State.WAITING_TO_RUN, State.RETRY)) {
    logger.info("Ignoring runJob request for job " + jobRecord + " which is not in a"
        + " WAITING_TO_RUN or a RETRY state");
    return;
}

Could we not take advantage of this check and introduce another State (e.g., RUNNING) to ensure the run-only-once effect? I might be too naive here, since if it were this simple it would have been done this way already, but it doesn't hurt to ask.

@aozarov
Contributor

aozarov commented Apr 24, 2015

Let's say you changed it to RUNNING and then failed without a chance to change the state back.
If that happens, it would be hard to distinguish a failed run from a very long-running one (considering that taskqueue requests can run for up to 10 minutes and backend requests for up to 24 hours).
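
A small sketch of that ambiguity, under the assumption that an observer can only read the stored state and the time the run was claimed. All names here are hypothetical and the timeout check is just one possible heuristic, not something the library does.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical illustration: a crashed run and a slow run leave identical records. */
class RunningStateAmbiguity {
  enum State { WAITING_TO_RUN, RUNNING }

  /** Everything an observer can read from storage about a claimed job. */
  static final class Snapshot {
    final State state;
    final Instant claimedAt;

    Snapshot(State state, Instant claimedAt) {
      this.state = state;
      this.claimedAt = claimedAt;
    }
  }

  /** A worker that died right after claiming the job never resets the state. */
  static Snapshot crashedWorker(Instant claimedAt) {
    return new Snapshot(State.RUNNING, claimedAt);
  }

  /** A worker that is still busy has written exactly the same record. */
  static Snapshot slowWorker(Instant claimedAt) {
    return new Snapshot(State.RUNNING, claimedAt);
  }

  /** Timeout heuristic: treat a RUNNING job as abandoned once it is older than the timeout. */
  static boolean looksAbandoned(Snapshot s, Instant now, Duration timeout) {
    return s.state == State.RUNNING
        && Duration.between(s.claimedAt, now).compareTo(timeout) > 0;
  }

  public static void main(String[] args) {
    Instant claimed = Instant.parse("2015-04-24T00:00:00Z");
    Instant now = claimed.plus(Duration.ofMinutes(15));
    Snapshot crashed = crashedWorker(claimed);
    Snapshot slow = slowWorker(claimed);

    // Short timeout: the live run is also judged abandoned and would be run again.
    System.out.println(looksAbandoned(crashed, now, Duration.ofMinutes(10))); // true
    System.out.println(looksAbandoned(slow, now, Duration.ofMinutes(10)));    // true

    // Long timeout: the crashed run stays stuck in RUNNING until the timeout passes.
    System.out.println(looksAbandoned(crashed, now, Duration.ofHours(24)));   // false
  }
}
```

Whatever timeout is picked, the heuristic returns the same answer for both workers, so it either re-runs live jobs (duplicates again) or leaves dead ones stuck for a long time.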

@imlm
Author

imlm commented Apr 25, 2015

Indeed, either way thanks again! I'm closing this as it has been answered already.

@imlm imlm closed this as completed Apr 25, 2015