This repository has been archived by the owner on Jul 21, 2021. It is now read-only.

use Jobs when scheduling work #12

Closed
goern opened this issue Apr 28, 2018 · 5 comments
Labels
component/analyser enhancement New feature or request

Comments


goern commented Apr 28, 2018

If we turn the workloads that some components schedule into Jobs rather than plain Pods, OpenShift will take care of cleaning them up.

The cleanup-job could be removed...
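The difference can be sketched as a manifest: a Job wraps the pod template in a completion-tracked object that the platform can garbage-collect, whereas a bare Pod has no such lifecycle. This is an illustrative sketch only; the name, image, and helper function are hypothetical, not taken from the actual analyser deployment.

```python
def make_analyser_job(name: str, image: str) -> dict:
    """Build a batch/v1 Job manifest as a plain dict (hypothetical sketch).

    The same container that used to run as a bare Pod is wrapped in a Job,
    so OpenShift tracks completion and can clean the finished object up.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    # Jobs require a restart policy of Never or OnFailure.
                    "restartPolicy": "Never",
                    "containers": [{"name": "analyser", "image": image}],
                }
            },
        },
    }

job = make_analyser_job("analyser-1234", "quay.io/example/analyser:latest")
```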

@goern goern added the enhancement New feature or request label Apr 28, 2018

fridex commented Apr 28, 2018

> The cleanup-job could be removed...

Is there a mechanism to stop a Job after a few tries? Since we do not have full control over the supplied parameters, can we mark a Job that cannot succeed as failed - e.g. when the given image cannot be pulled?
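The standard batch/v1 Job spec does bound retries: `backoffLimit` caps the number of pod retries before the Job is marked Failed, and `activeDeadlineSeconds` puts a wall-clock limit on the whole Job. A minimal sketch, with illustrative values (the function name and defaults are assumptions, not from this repository):

```python
def retry_limited_job_spec(backoff_limit: int = 3,
                           deadline_seconds: int = 3600) -> dict:
    """Job spec fragment that bounds retries (hypothetical sketch).

    With these fields set, Kubernetes/OpenShift marks the Job as Failed
    after `backoff_limit` pod retries or after the deadline elapses,
    whichever comes first.
    """
    return {
        "backoffLimit": backoff_limit,              # retries before Failed
        "activeDeadlineSeconds": deadline_seconds,  # hard time cap
        "template": {"spec": {"restartPolicy": "Never"}},
    }

spec = retry_limited_job_spec()
```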


goern commented Apr 28, 2018 via email


goern commented Apr 29, 2018

@fridex can you list the conditions that must be met to delete an old job? In general the job strives to succeed; if the job cannot pull an image, that might be an application-level error, but not a job-level error.


fridex commented Apr 29, 2018

> @fridex can you list the conditions that must be met to delete an old job?

The current implementation waits 7 days; after that, the given pod object is deleted from OpenShift.
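The retention check described here could be sketched as a simple age comparison; the function name is hypothetical, only the 7-day window comes from the comment above:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention window from the current cleanup implementation.
RETENTION = timedelta(days=7)

def should_delete(created_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True when a pod object is past the 7-day retention window."""
    now = now or datetime.now(timezone.utc)
    return now - created_at > RETENTION
```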

> In general the job strives to succeed; if the job cannot pull an image, that might be an application-level error, but not a job-level error.

Ack, we can definitely do that. We will need to slightly redesign how we report status on the API endpoint. Something like:

1.) check if the given analysis has an entry in the graph database
2.) check if the given pod has an OpenShift status and, if so, what the current status is

In this case we could report more detailed information about analysis status:

  1. does not exist
  2. it was scheduled, queued
  3. it is running
  4. it failed (e.g. image pull)
  5. it succeeded, waiting for sync to the graph database
  6. synced to the graph database

This way we can move to using Jobs exclusively.

As of now, we just report the OpenShift pod status.

Just a detail: checks 1.) and 2.) can be done in parallel.
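The proposed status resolution could be sketched as combining the two (parallelizable) checks into one detailed state; the function and the exact phase names are illustrative assumptions, only the six states come from the list above:

```python
from typing import Optional

# Mapping from the pod phase reported by OpenShift/Kubernetes to the
# detailed analysis status proposed above (names are illustrative).
_PHASE_TO_STATUS = {
    "Pending": "scheduled, queued",
    "Running": "running",
    "Failed": "failed (e.g. image pull)",
    "Succeeded": "succeeded, waiting for sync to the graph database",
}

def resolve_status(in_graph_db: bool, phase: Optional[str]) -> str:
    """Combine the graph-database check and the OpenShift status check.

    Both inputs can be gathered in parallel; the graph-database entry
    wins because it means the analysis already completed and synced.
    """
    if in_graph_db:
        return "synced to the graph database"
    if phase is None:
        return "does not exist"
    return _PHASE_TO_STATUS.get(phase, "unknown")
```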


fridex commented Jun 19, 2019

This is already done, let's close this issue.

@fridex fridex closed this as completed Jun 19, 2019