Possible hang in ExpressionCompiler #6435

Open
shuai-xu opened this issue Dec 24, 2020 · 6 comments
Comments

@shuai-xu

Every day, some queries on our cluster fail with the error 'Encountered too many errors talking to a worker node. The node may have crashed or be under too much load. This is probably a transient issue, so please retry your query in a few minutes.', as shown in the following picture.
image
After some debugging, we found that the cause is that updateTask on a Presto worker may hang with the stack shown below:
image
Interestingly, the SQL is usually very long, because it can have many values in the WHERE expression, such as 'where uid in (3960028287,3960049741,3960083260,3960036528,3960068650,3960047338,3821119567,3959815116,3958678845,3959575514,3960047568,3960050616,3959660325,3959908369,3959315091,3821945432,3813742325,3960084009,3960076400,3960055354,3960027444,3960071500,3812040673,3960084010,3960093838,3960071701,3958768994,3820060031,3960038250,3960073050,3960073040,3960060636,3960077227,3809719971,3960051276,3960062392,3960066987,3960073418,3960076602,3960056687,3960053531,3960067000,3960055150,3960061434,3960105593,3960052858,3959998120,3959776363,3960076627,3960060102,3960042850,3960083120,3960028182,3960054857,3960046814,3960003996,3960072696,3960036604,3959581864,3960057289,3960068430,3960062348,3960038568,3960100114,3959947372,3960110483,3960060739,3959874691,3959894879,3959221535,3958701935,3958712551,3960080185,3959789333,3960080353,3819389303,3960082694,3959550987,3959916804,3825976317,3960065054,3960076348,3959510197,3960076274))'.
We have set ReservedCodeCacheSize to 2G, and the code cache appears to have enough free space when the task hangs.
How can we avoid this problem?
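To check whether the length of the IN list is what matters, it may help to generate queries of this shape with a controllable number of values. The following is only a minimal sketch for building such a reproduction query: the table name `events`, the class name, and the starting value are hypothetical placeholders and are not taken from the cluster in this report.

```java
import java.util.StringJoiner;

public class LongInListQuery {
    // Builds "SELECT count(*) FROM <table> WHERE <column> IN (v1, v2, ...)"
    // with `count` consecutive values starting at `firstValue`.
    static String build(String table, String column, long firstValue, int count) {
        StringJoiner values = new StringJoiner(", ", "(", ")");
        for (int i = 0; i < count; i++) {
            values.add(Long.toString(firstValue + i));
        }
        return "SELECT count(*) FROM " + table + " WHERE " + column + " IN " + values;
    }

    public static void main(String[] args) {
        // Number of IN-list values; pass a different count as the first argument.
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 100;
        // "events" and "uid" are placeholders; substitute a real table and column.
        System.out.println(build("events", "uid", 3960000000L, count));
    }
}
```

Running the generated query with increasing counts would show whether the hang correlates with the list length, although that alone would not identify the root cause.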
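For anyone who wants to confirm the code cache observation, here is a minimal sketch of reading code cache occupancy through the standard `java.lang.management` API. It inspects the JVM it runs in, so for a worker the same memory pool MXBeans would have to be read over JMX instead. The class name is hypothetical, and the pool-name filter assumes a HotSpot JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

public class CodeCacheUsage {
    // Returns used bytes for each code-cache memory pool of the current JVM.
    static Map<String, Long> usedBytesByPool() {
        Map<String, Long> result = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: HotSpot naming. JDK 8 exposes a single "Code Cache" pool;
            // JDKs with a segmented code cache expose several "CodeHeap '...'" pools.
            if (name.contains("Code Cache") || name.contains("CodeHeap")) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) { // getUsage() returns null for an invalid pool
                    result.put(name, usage.getUsed());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        usedBytesByPool().forEach((name, used) ->
                System.out.println(name + ": " + (used / 1024) + " KiB used"));
    }
}
```

Comparing the used values against the configured ReservedCodeCacheSize at the moment of a hang would show whether the cache is actually close to full.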

@findepi changed the title from "Task may hang on PlanNode.accept" to "Possible hang in ExpressionCompiler" on Dec 27, 2020
@findepi
Member

findepi commented Dec 27, 2020

Seems related to #6405

@sopel39
Member

sopel39 commented Dec 28, 2020

cc @dain

@shuai-xu
Author

> Seems related to #6405

Yes, I think these two issues may be related. It seems Presto may have problems handling long queries.

@sopel39
Member

sopel39 commented Dec 28, 2020

Does it happen only with long IN lists?

@shuai-xu
Author

> Does it happen only with long IN lists?

No, some queries are long because they have many joins.

@zhanglistar

zhanglistar commented Jan 18, 2021

Any updates? @sopel39 @dain
