Skip to get the batch application status when creating batch #4680

Closed
turboFei wants to merge 1 commit into apache:master from turboFei:post_batches

@turboFei
Member

@turboFei turboFei commented Apr 10, 2023

Why are the changes needed?

To prevent the POST /batches operation from getting stuck while fetching the batch application status.

For example, the YARN client may keep failing over because of a ResourceManager issue.

Also, it is not necessary to get the batch application status when creating a batch.
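
The idea above can be sketched roughly as follows. This is a minimal illustration, not Kyuubi's actual code: `Batch`, `create_batch`, `get_batch`, and the `submit` / `lookup_app_state` callbacks are hypothetical names, and the separate status query is an assumption about how a client would obtain the application state later.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Batch:
    batch_id: str
    # Hypothetical field: stays None until a later status query fills it in.
    app_state: Optional[str] = None


def create_batch(submit: Callable[[str], None]) -> Batch:
    """Hypothetical handler for POST /batches.

    Submits the batch and returns immediately, without asking the
    cluster manager for the application status.
    """
    batch_id = str(uuid.uuid4())
    submit(batch_id)
    return Batch(batch_id=batch_id)


def get_batch(batch_id: str, lookup_app_state: Callable[[str], str]) -> Batch:
    """Hypothetical status query: the only place the possibly slow lookup runs."""
    return Batch(batch_id=batch_id, app_state=lookup_app_state(batch_id))
```

With this split, a slow or failing cluster manager can delay only the status query, not the creation request, and the caller still receives the batchId it needs for later polling.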

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run tests locally before making a pull request

@turboFei turboFei changed the title from "Fast return batch info when post batches" to "Skip to get the batch application status when creating batch" Apr 10, 2023
@turboFei turboFei self-assigned this Apr 10, 2023
@turboFei turboFei added this to the v1.7.1 milestone Apr 10, 2023
@turboFei turboFei requested review from pan3793 and ulysses-you April 10, 2023 08:19
@ulysses-you
Contributor

How can we return the app id if we do not get the application status when creating a batch? Do you mean we do not need to return app-related info when creating a batch?

@turboFei
Member Author

How can we return the app id if we do not get the application status when creating a batch? Do you mean we do not need to return app-related info when creating a batch?

Yes, the key information is batchId.

@ulysses-you
Contributor

cc @pan3793, not sure if this breaks your customers.

@turboFei
Member Author

I think the response to a create-batch request should usually be returned within milliseconds.

So, it is impossible to get the application status in most cases.

@pan3793
Member

pan3793 commented Apr 10, 2023

it is impossible to get the application status in most cases.

What's the current result in your production case? NOT_FOUND or UNKNOWN?

@turboFei
Member Author

image

It is none, I think.

@pan3793
Member

pan3793 commented Apr 10, 2023

We have an UNKNOWN state; why should we return null/None instead?

@pan3793
Member

pan3793 commented Apr 10, 2023

I think we need to re-evaluate #3922, it's similar to #4469

@turboFei
Member Author

cc @zwangsheng as well

@pan3793
Member

pan3793 commented Apr 11, 2023

My idea is to revert #3922 and always propagate the app status to the client.

During the submitting phase, ApplicationManager returns UNKNOWN instead of NOT_FOUND when the elapsed time < kyuubi.engine.submit.timeout, the same as we did in #4469 for K8s.
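
The rule proposed above can be sketched as below. This is only an illustration under stated assumptions, not the ApplicationManager implementation: `AppState`, `resolve_app_state`, and its parameters are hypothetical names, and `submit_timeout` stands in for the value of kyuubi.engine.submit.timeout expressed in seconds.

```python
import time
from enum import Enum
from typing import Optional


class AppState(Enum):
    # Hypothetical subset of states, just enough for this sketch.
    UNKNOWN = "UNKNOWN"
    NOT_FOUND = "NOT_FOUND"
    RUNNING = "RUNNING"


def resolve_app_state(
    found_state: Optional[AppState],
    submit_time: float,
    submit_timeout: float,
    now: Optional[float] = None,
) -> AppState:
    """Pick the state to report to the client.

    found_state is None when the cluster manager has no record of the app.
    submit_time, submit_timeout, and now are all in seconds.
    """
    if found_state is not None:
        # The cluster manager knows the app, so report its real state.
        return found_state
    current = time.time() if now is None else now
    elapsed = current - submit_time
    # Still inside the submit window: the app may simply not be visible yet.
    if elapsed < submit_timeout:
        return AppState.UNKNOWN
    # The window has passed and the app never showed up.
    return AppState.NOT_FOUND
```

For example, with a 30-second timeout, an app that is missing 10 seconds after submission would be reported as UNKNOWN, while one still missing after 30 seconds or more would be reported as NOT_FOUND.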

@zwangsheng
Contributor

I'm +1 for returning the UNKNOWN state after spark-submit when we cannot find the application in the YARN cluster.

And for using the timeout to check whether the submission succeeded or not.

UNKNOWN sounds more reasonable when we have submitted but can't find the app.

@turboFei turboFei closed this Apr 20, 2023