Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Target Hitting Recursion Limit During Pants Setup (with workaround) #11201

Closed
rcuza opened this issue Nov 18, 2020 · 3 comments · Fixed by #11202 or #11271
Closed

Target Hitting Recursion Limit During Pants Setup (with workaround) #11201

rcuza opened this issue Nov 18, 2020 · 3 comments · Fixed by #11202 or #11271
Assignees
Labels

Comments

@rcuza
Copy link
Contributor

rcuza commented Nov 18, 2020

Description of Problem

We’re in the process of migrating from 1.25 to 2.1.0., and hit an issue trying to run a test on specific target. The target is large and results in a max recursion limit exceeded.

I tried hacking on sys.setrecursionlimit and found for our use case 1021 was the min that would allow the test to succeed.

We can try breaking that target up, but the app it is testing is kind of a monolith so i don’t know how successful that would be.

Can you make a runtime limit in pants to handle?

This error happens in the pants setup before our pytest is run.

Workaround

In one of our plugin's register.py we added sys.setrecursionlimit(1021) and this resolved our problem.

@stuhood stuhood self-assigned this Nov 18, 2020
stuhood added a commit that referenced this issue Nov 18, 2020
### Problem

Our existing cycle detection (and reporting) algorithm is implemented recursively in Python, and can hit recursion depth limits for larger graphs.

### Solution

Rather than porting to an iterative algorithm (which is complex when you need to actually report the cycle path), or adjusting recursion limits (which would work fine, but which exist as a reminder that stack frames are not cheap in Python), port our existing algorithm to Rust. 

### Result

Fixes #11201.

[ci skip-build-wheels]
stuhood added a commit to stuhood/pants that referenced this issue Nov 18, 2020
Our existing cycle detection (and reporting) algorithm is implemented recursively in Python, and can hit recursion depth limits for larger graphs.

Rather than porting to an iterative algorithm (which is complex when you need to actually report the cycle path), or adjusting recursion limits (which would work fine, but which exist as a reminder that stack frames are not cheap in Python), port our existing algorithm to Rust.

Fixes pantsbuild#11201.

[ci skip-build-wheels]
stuhood added a commit to stuhood/pants that referenced this issue Nov 18, 2020
Our existing cycle detection (and reporting) algorithm is implemented recursively in Python, and can hit recursion depth limits for larger graphs.

Rather than porting to an iterative algorithm (which is complex when you need to actually report the cycle path), or adjusting recursion limits (which would work fine, but which exist as a reminder that stack frames are not cheap in Python), port our existing algorithm to Rust.

Fixes pantsbuild#11201.

[ci skip-build-wheels]
stuhood added a commit that referenced this issue Nov 19, 2020
### Problem

Our existing cycle detection (and reporting) algorithm is implemented recursively in Python, and can hit recursion depth limits for larger graphs.

### Solution

Rather than porting to an iterative algorithm (which is complex when you need to actually report the cycle path), or adjusting recursion limits (which would work fine, but which exist as a reminder that stack frames are not cheap in Python), port our existing algorithm to Rust.

Fixes #11201.

[ci skip-build-wheels]
stuhood added a commit that referenced this issue Nov 19, 2020
### Problem

Our existing cycle detection (and reporting) algorithm is implemented recursively in Python, and can hit recursion depth limits for larger graphs.

### Solution

Rather than porting to an iterative algorithm (which is complex when you need to actually report the cycle path), or adjusting recursion limits (which would work fine, but which exist as a reminder that stack frames are not cheap in Python), port our existing algorithm to Rust.

Fixes #11201.

[ci skip-build-wheels]
@stuhood stuhood reopened this Nov 30, 2020
@stuhood
Copy link
Sponsor Member

stuhood commented Dec 2, 2020

Although the original issue was resolved, the fix from #11202 seems to be able to peg CPUs when running on a target with a sufficiently large number of files (44 files with 1500 deps each in a graph of more than 200k edges).

We have not yet been able to confirm that the issue is an infinite loop via a profile: profiles appear to show multiple threads individually making progress. And diving in to microbenchmark (of just the code ported in #11202), it appears that we have some super-linear behavior due to lock contention (both the GIL and possibly also our interning):

>>> took 0.40951013565063477 with 1 threads
>>> took 2.203711986541748 with 2 threads
>>> took 11.812678098678589 with 4 threads
>>> took 24.522939205169678 with 8 threads

This is a good microbenchmark to examine both GIL acquisition and interning, so I'd like to see how much we can improve this by reducing contention here (which should pay dividends elsewhere in the codebase). If reducing lock contention isn't sufficient, we can look into adjusting the TransitiveTargets code to do more structure sharing (as it did before #10409) to avoid redundant work.

stuhood added a commit that referenced this issue Dec 3, 2020
### Problem

Profiles of high-contention situations (including #11201) showed a non-trivial amount of time waiting on the `Interns` lock.

### Solution

In two commits, remove the need to separately intern python types, and simplify our GIL/`Interns` locking strategy by implementing it internally to the `Interns` struct.

### Result

Simpler code, and a slight (~5%) speedup for `./pants dependencies --transitive ::`. #11201 remains open: likely we will want to make structural changes to `TransitiveTargets` to avoid the redundant/contending computation.
@stuhood
Copy link
Sponsor Member

stuhood commented Dec 4, 2020

Status update: I believe that I have a holistic fix for this, as well as for a few other TransitiveTarget performance concerns. I need to nail down one issue with the @rule graph first, but am hopeful that I can have it reviewable/cherry-pickable tomorrow.

@stuhood
Copy link
Sponsor Member

stuhood commented Dec 8, 2020

Unfortunately the holistic solution (#11270) has a blocker (#11269). For now I'm going to revert the first fix for this ticket, and fix it via #11271.

Sorry for the big mess! Graph cycles are the bane of my existence =)

stuhood added a commit that referenced this issue Dec 8, 2020
… overridden. (#11271)

### Problem

Pants uses a default Python recursion limit that makes some build graph shapes fail to cycle detect: see #11201.

### Solution

Increase the default recursion limit, and allow it to be overridden via an environment variable.

Although increasing the recursion limit is not always the best approach to solving a problem, having the capability to do so is frequently useful as a workaround. We also know that the default value is too low for otherwise reasonable graph shapes.

### Result

Fixes #11201.
stuhood added a commit that referenced this issue Dec 8, 2020
#11202 introduced a lot of contention on the GIL that was only obvious when larger numbers of targets were being built. #11270 is a holistic fix for that issue, but it is currently blocked on #11269.

In the meantime, we will land #11271 to dodge the original issue in #11201 to get us back to a medium-slow-but-working state, and then follow up on #11269 and #11270 to get us to the best place.

This reverts commit fabcb45.

[ci skip-build-wheels]
stuhood added a commit to stuhood/pants that referenced this issue Dec 8, 2020
… overridden. (pantsbuild#11271)

### Problem

Pants uses a default Python recursion limit that makes some build graph shapes fail to cycle detect: see pantsbuild#11201.

### Solution

Increase the default recursion limit, and allow it to be overridden via an environment variable.

Although increasing the recursion limit is not always the best approach to solving a problem, having the capability to do so is frequently useful as a workaround. We also know that the default value is too low for otherwise reasonable graph shapes.

### Result

Fixes pantsbuild#11201.
stuhood added a commit to stuhood/pants that referenced this issue Dec 8, 2020
…sbuild#11272)

In the meantime, we will land pantsbuild#11271 to dodge the original issue in pantsbuild#11201 to get us back to a medium-slow-but-working state, and then follow up on pantsbuild#11269 and pantsbuild#11270 to get us to the best place.

This reverts commit fabcb45.

[ci skip-build-wheels]
stuhood added a commit to stuhood/pants that referenced this issue Dec 8, 2020
… overridden. (pantsbuild#11271)

### Problem

Pants uses a default Python recursion limit that makes some build graph shapes fail to cycle detect: see pantsbuild#11201.

### Solution

Increase the default recursion limit, and allow it to be overridden via an environment variable.

Although increasing the recursion limit is not always the best approach to solving a problem, having the capability to do so is frequently useful as a workaround. We also know that the default value is too low for otherwise reasonable graph shapes.

### Result

Fixes pantsbuild#11201.
stuhood added a commit to stuhood/pants that referenced this issue Dec 8, 2020
…sbuild#11272)

In the meantime, we will land pantsbuild#11271 to dodge the original issue in pantsbuild#11201 to get us back to a medium-slow-but-working state, and then follow up on pantsbuild#11269 and pantsbuild#11270 to get us to the best place.

This reverts commit fabcb45.

[ci skip-build-wheels]
stuhood added a commit to stuhood/pants that referenced this issue Dec 9, 2020
…sbuild#11272)

In the meantime, we will land pantsbuild#11271 to dodge the original issue in pantsbuild#11201 to get us back to a medium-slow-but-working state, and then follow up on pantsbuild#11269 and pantsbuild#11270 to get us to the best place.

This reverts commit fabcb45.

[ci skip-build-wheels]
stuhood added a commit to stuhood/pants that referenced this issue Dec 9, 2020
… overridden. (pantsbuild#11271)

### Problem

Pants uses a default Python recursion limit that makes some build graph shapes fail to cycle detect: see pantsbuild#11201.

### Solution

Increase the default recursion limit, and allow it to be overridden via an environment variable.

Although increasing the recursion limit is not always the best approach to solving a problem, having the capability to do so is frequently useful as a workaround. We also know that the default value is too low for otherwise reasonable graph shapes.

### Result

Fixes pantsbuild#11201.
stuhood added a commit to stuhood/pants that referenced this issue Dec 9, 2020
… overridden. (pantsbuild#11271)

### Problem

Pants uses a default Python recursion limit that makes some build graph shapes fail to cycle detect: see pantsbuild#11201.

### Solution

Increase the default recursion limit, and allow it to be overridden via an environment variable.

Although increasing the recursion limit is not always the best approach to solving a problem, having the capability to do so is frequently useful as a workaround. We also know that the default value is too low for otherwise reasonable graph shapes.

### Result

Fixes pantsbuild#11201.
stuhood added a commit to stuhood/pants that referenced this issue Dec 9, 2020
…sbuild#11272)

In the meantime, we will land pantsbuild#11271 to dodge the original issue in pantsbuild#11201 to get us back to a medium-slow-but-working state, and then follow up on pantsbuild#11269 and pantsbuild#11270 to get us to the best place.

This reverts commit fabcb45.

[ci skip-build-wheels]
stuhood added a commit that referenced this issue Dec 9, 2020
… overridden. (cherrypick of #11271) (#11276)

### Problem

Pants uses a default Python recursion limit that makes some build graph shapes fail to cycle detect: see #11201.

### Solution

Increase the default recursion limit, and allow it to be overridden via an environment variable.

Although increasing the recursion limit is not always the best approach to solving a problem, having the capability to do so is frequently useful as a workaround. We also know that the default value is too low for otherwise reasonable graph shapes.

### Result

Fixes #11201.

[ci skip-rust]
[ci skip-build-wheels]
stuhood added a commit that referenced this issue Dec 9, 2020
…11272) (#11277)

#11202 introduced a lot of contention on the GIL that was only obvious when larger numbers of targets were being built. #11270 is a holistic fix for that issue, but it is currently blocked on #11269.

In the meantime, we will land #11271 to dodge the original issue in #11201 to get us back to a medium-slow-but-working state, and then follow up on #11269 and #11270 to get us to the best place.

This reverts commit fabcb45.

[ci skip-build-wheels]
stuhood added a commit that referenced this issue Dec 9, 2020
…11272) (#11275)

#11202 introduced a lot of contention on the GIL that was only obvious when larger numbers of targets were being built. #11270 is a holistic fix for that issue, but it is currently blocked on #11269.

In the meantime, we will land #11271 to dodge the original issue in #11201 to get us back to a medium-slow-but-working state, and then follow up on #11269 and #11270 to get us to the best place.

This reverts commit fabcb45.

[ci skip-build-wheels]
stuhood added a commit that referenced this issue Dec 9, 2020
… overridden. (cherrypick of #11271) (#11274)

### Problem

Pants uses a default Python recursion limit that makes some build graph shapes fail to cycle detect: see #11201.

### Solution

Increase the default recursion limit, and allow it to be overridden via an environment variable.

Although increasing the recursion limit is not always the best approach to solving a problem, having the capability to do so is frequently useful as a workaround. We also know that the default value is too low for otherwise reasonable graph shapes.

### Result

Fixes #11201.

[ci skip-rust]
[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
2 participants