Is it possible to control the (start) order of parallel jobs? #4688

Closed
wlad opened this issue Sep 6, 2020 · 9 comments

@wlad

wlad commented Sep 6, 2020

Missing Information

We have a workflow as shown below. As you can see, all test jobs can run in parallel. But since we are on the free (open-source) tier, only 4 of them start at a time and run in parallel; the rest are queued. That means that if the longest-running job ends up last in the queue, the overall execution time will be much longer than it would be if the longest-running job started first. So among all parallel test jobs, I want to make sure that the QUERYSERVICE-tests-2 job starts first, because it is the longest-running job. Is this possible?
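
To make the cost of a bad start order concrete, here is a minimal sketch that simulates a queue with 4 concurrent slots. The durations are made-up minutes, not measurements from this project, and the model assumes jobs start strictly in list order, which is a simplification of whatever the real scheduler does.

```python
import heapq


def makespan(durations, slots=4):
    """Wall-clock time when jobs start in list order on `slots` executors."""
    # Each heap entry is the time at which one executor becomes free.
    free_at = [0] * slots
    heapq.heapify(free_at)
    finish = 0
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest free executor takes the job
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


# Hypothetical durations in minutes; 20 stands in for QUERYSERVICE-tests-2.
short_jobs = [5] * 7
longest = 20

print(makespan([longest] + short_jobs))  # longest job starts first -> 20
print(makespan(short_jobs + [longest]))  # longest job queued last  -> 25
```

With these assumed numbers, the same set of jobs takes 20 minutes when the longest job starts first and 25 minutes when it is queued last.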

image

Already Looked

I searched the docs: https://circleci.com/docs/2.0/configuration-reference/

Potential Locations

here: https://circleci.com/docs/2.0/configuration-reference/#jobs-1
or here: https://circleci.com/docs/2.0/sample-config/#concurrent-workflow
or here: https://circleci.com/docs/2.0/workflows/#workflows-configuration-examples

extra

The workflow link above will probably be dead in the future, so here is a link to the project's pipeline.

@rosieyohannan
Contributor

rosieyohannan commented Sep 7, 2020

Hey @wlad I have asked our pipelines engineers and the best thing to do here is to change the shape of the workflows graph. As all test jobs are concurrent, there is no "order" to change.

One way would be to have the longest job run concurrently with the shortest job, and have all the other test jobs run concurrently using requires: shortest-job. Does that help?
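
The reshaped graph can be sketched as a small Python helper that builds the entries of a workflow's jobs list: the longest and shortest jobs get no prerequisites, and every other job requires the shortest one. Only QUERYSERVICE-tests-2 comes from this thread; the other job names and all durations are hypothetical, and the output would still have to be written into config.yml by hand.

```python
import json


def reshape_jobs(durations):
    """Build a workflow `jobs` list where only the longest and the
    shortest job are free to start immediately."""
    longest = max(durations, key=durations.get)
    shortest = min(durations, key=durations.get)
    jobs = []
    for name in durations:
        if name in (longest, shortest):
            jobs.append(name)  # no prerequisites: starts right away
        else:
            # all remaining jobs wait until the shortest job has finished
            jobs.append({name: {"requires": [shortest]}})
    return jobs


# Hypothetical job names and durations (minutes), except QUERYSERVICE-tests-2.
durations = {
    "QUERYSERVICE-tests-2": 20,
    "tests-short": 2,
    "tests-a": 5,
    "tests-b": 6,
}
print(json.dumps(reshape_jobs(durations), indent=2))
```

The list mirrors the shape of the jobs section in a workflow definition, where a plain name is a job without prerequisites and a mapping carries its requires list.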

@wlad
Author

wlad commented Sep 7, 2020

Hi @rosieyohannan, thanks for the input. Great idea. I'll try it out and report back.

@wlad
Author

wlad commented Sep 7, 2020

The idea is very good

  1. the shortest and longest test jobs start first
    image

  2. the other test jobs start while the longest is running (some are queued)
    image

  3. unfortunately there is an issue w/ attaching workspace in the last job
    image

    Concurrent upstream jobs persisted the same file(s) into the workspace:
    - projects/ehrbase/tests/results/ADHOC-QUERY-1/log.html
    - projects/ehrbase/tests/results/ADHOC-QUERY-1/output.xml
    - projects/ehrbase/tests/results/ADHOC-QUERY-1/report.html
    ...
    Concurrent upstream jobs persisted the same file(s)
    

I know what this issue is about and could probably work around it. Instead, I think I'll try a different approach. For that I'd like to know: is it possible to get a particular job's state? Then I could create a custom step that waits for a job to be in the RUNNING state.

@rosieyohannan
Contributor

Hey @wlad! For this I would suggest taking a look at the API v2 reference guide, specifically the workflow endpoints. You should be able to get hold of the state that way: https://circleci.com/docs/api/v2/#circleci-api-workflow. Hope this helps :-D
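
As a rough illustration of that suggestion, the sketch below polls the API v2 endpoint that lists a workflow's jobs and waits until a named job reports a given status. It is not the swissknife implementation. It assumes a personal API token is available, ignores pagination of the job list, and the function names and the `fetch` parameter are invented for this example.

```python
import json
import time
import urllib.request

API = "https://circleci.com/api/v2"


def fetch_jobs(workflow_id, token):
    """Return the job items of one workflow from the v2 API."""
    req = urllib.request.Request(
        f"{API}/workflow/{workflow_id}/job",
        headers={"Circle-Token": token},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["items"]


def wait_for_status(job_name, wanted, fetch, timeout=600, interval=10,
                    sleep=time.sleep):
    """Poll `fetch()` until `job_name` has a status in `wanted`.

    Returns True once the status is seen, False if `timeout` seconds
    of waiting pass without seeing it.
    """
    waited = 0
    while waited <= timeout:
        for job in fetch():
            if job.get("name") == job_name and job.get("status") in wanted:
                return True
        sleep(interval)
        waited += interval
    return False
```

Inside a job, this could be called as `wait_for_status("QUERYSERVICE-tests-2", {"running"}, lambda: fetch_jobs(os.environ["CIRCLE_WORKFLOW_ID"], token))`, where `CIRCLE_WORKFLOW_ID` is the built-in environment variable and `token` comes from a project environment variable of your choosing. Note that the job doing the waiting occupies one of the concurrent slots itself.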

@roopakv
Contributor

roopakv commented Sep 8, 2020

@wlad the wait for job command in swissknife lets you wait for a job in the same workflow. This might be a quick fix for you.

@wlad
Author

wlad commented Sep 8, 2020

@roopakv Thank you. It's a good starting point for me. wait_for_job waits for SUCCESS status of a job. What I need to wait for is the RUNNING status.

I guess it's just this line I have to change 😄 🥰
image

@roopakv
Contributor

roopakv commented Sep 8, 2020

@wlad yup feel free to open a PR to accept an additional state.

@wlad
Author

wlad commented Oct 7, 2020

@roopakv unfortunately this did not work 😢 I modified your wait_for_job command to check for RUNNING (instead of SUCCESS) status. But as you can see below, if the job I'd like to wait for is not among the first 4 concurrently started jobs (in other words, if the job ends up in the queue), I end up waiting for the timeout set in this command instead of for that job.

image

image

I think which 4 jobs start first is simply determined by which jobs finish their "Spin up environment" step first. So there is no chance to impact this behaviour with the wait_for_job command.

At least I've learned how to write and use inline orbs 🤣

@roopakv
Contributor

roopakv commented Oct 7, 2020

@wlad I still think it is possible. Please create an issue on roopakv/orbs and I'll take a look.

We do something similar on one of our repos. But yes it did require quite a bit of finagling :)
