Skip to content

Conversation

@9999years
Copy link
Contributor

Closes #419

Fixes this error, very common on macOS where the default file descriptor soft limit is 1024:

Internal error (stage: calculate_output_values_failed): collecting output ProjectRelativePathBuf("buck-out/v2/gen/root/aaaaaaaaaaaaaaaa/puppy/__puppy__/doggy"): Too many open files (os error 24)

I expect this will still need some work, let me know if there's a better place to put these functions.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 29, 2025
@facebook-github-bot
Copy link
Contributor

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. (Because this pull request was imported automatically, there will not be any future comments.)

memoffset = "0.6.4"
multimap = "0.8.2"
nix = "0.22"
nix = { version = "0.26", features = ["fs", "resource", "signal"] }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed to bump this for access to nix::sys::resource (added in 0.23) and resource::RLIM_INFINITY (added in 0.26).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this to match what we use internally, you should be able to drop it if you rebase on master

@9999years 9999years marked this pull request as ready for review April 29, 2025 19:22
Closes facebook#419

Fixes this error, very common on macOS where the default file descriptor soft limit is 1024:

    Internal error (stage: calculate_output_values_failed): collecting output ProjectRelativePathBuf("buck-out/v2/gen/root/aaaaaaaaaaaaaaaa/puppy/__puppy__/doggy"): Too many open files (os error 24)
@JakobDegen
Copy link
Contributor

Ugh, sorry, I realize I kind of reviewed this in the wrong order. This won't work as written, I think - the problem is that not all subprocesses buck2 spawns go through the forkserver (like, for example, the forkserver itself) and so that's going to get this all confused. Concretely, what I expect will happen is that you run this in the client, where you increase the soft limit and then set the static, but then the client starts the daemon without going through the forkserver, and so the daemon won't know that the client changed the default value and so won't know to revert it. Though actually, the daemon knowing to revert it wouldn't help, because it'll spawn the forkserver, and the forkserver is where you have to actually set this static to get it right.

The thing I think you should do is this:

  1. Don't change the limits in the client. The client doesn't open a significant number of files anyway, so it doesn't matter there. Instead, run this when starting up the daemon process, probably around here.
  2. Pass the information about whether to change soft/hard limits (and what to) to the forkserver via an explicit command line arg. There's a function called maybe_launch_forkserver in which you can do this. Consult the static when launching it, but then on the forkserver side don't bother with a static - just store those values like any other data

That should correctly target the actual processes you want

@9999years
Copy link
Contributor Author

Thanks, I was curious about that. I'll see about updating the maybe_launch_forkserver function. (FWIW, anecdotally this patch does solve the file descriptor limits for local builds.)

@9999years
Copy link
Contributor Author

I'm not entirely sure what the implementation strategy you're suggesting is. Here's my understanding, could you help me fill in the gaps? (Sorry this comment is a bit long.)

Though actually, the daemon knowing to revert it wouldn't help, because it'll spawn the forkserver, and the forkserver is where you have to actually set this static to get it right.

Does this mean "the daemon doesn't need to adjust the limits" or "the daemon needs to pass the limits to the forkserver"?

Don't change the limits in the client. The client doesn't open a significant number of files anyway, so it doesn't matter there.

Interpretation: Don't set the static in app/buck2/bin/buck2.rs.

Instead, run this when starting up the daemon process, probably around [buck2_daemon::daemon::DaemonCommand::run].

I think this means I should set the static / adjust the limits in DaemonCommand::run, correct?

Pass the information about whether to change soft/hard limits (and what to) to the forkserver via an explicit command line arg

buck2_common::init::ResourceControlConfig looks like an appropriate place to add these limits — it already contains similar limits like memory_max. Would it be reasonable to fill in default values for the file descriptor limits in DaemonCommand::run if they're not already set in ResourceControlConfig?

Consult the static when launching [the forkserver], but then on the forkserver side don't bother with a static - just store those values like any other data

I'm not sure how to get the data from buck2::commands::ForkserverCommand::exec to the places where std::process::Commands are created — can you clarify? Would I be adding parameters to buck2_forkserver::run::process_group::ProcessCommand::new? I also noticed there's buck2_util::process::background_command, which probably also needs to reset the limits; would that need additional parameters as well?

Notes for me

The forkserver CLI args are defined in buck2::commands::forkserver::ForkserverCommand.

CLI in general is defined in buck2::Opt (app/buck2/src/lib.rs) and buck2::CommandKind.

Here's the call path between the two functions you pointed at:

  • buck2_daemon::daemon::DaemonCommand::run
  • calls buck2_server::daemon::server::BuckdServer::run
  • calls buck2_server::daemon::state::DaemonState::new
  • calls buck2_server::daemon::state::DaemonState::init_data
  • calls buck2_server::daemon::forkserver::maybe_launch_forkserver (sets CLI args as strings)

@JakobDegen
Copy link
Contributor

See the comments in #986 before anything else :)

Does this mean "the daemon doesn't need to adjust the limits" or "the daemon needs to pass the limits to the forkserver"?

If you set the static with the information about what the limits should be reset to in the daemon, then that doesn't solve your problem, because the forkserver will read a different "instance" of that static (so, more or less the second)

I think this means I should set the static / adjust the limits in DaemonCommand::run, correct?

Correct

buck2_common::init::ResourceControlConfig looks like an appropriate place to add these limits

Yeah, maybe. I guess I probably would have put it into a different type that gets passed to some of the same places as ResourceControlConfig, but putting it in there directly seems ok too

Would it be reasonable to fill in default values for the file descriptor limits in DaemonCommand::run if they're not already set in ResourceControlConfig?

Mmm, this is backwards I think. DaemonCommand::run happens before the resource control config is computed. But tbh, I'm realizing now that you don't really need to put this into a static for what you're planning to do, and actually passing it explicitly would (unsurprisingly) be more bugproof; if you make a type to hold the (rlim_t, rlim_t), construct that in DaemonCommand::run when you initially set the rlimits, add it to the BuckdServerInitPreferences, add it to the DaemonStateData from there, and that's available pretty much anywhere you could want it - either when constructing the forkserver, or in the resource config if you still want to go that route

I'm not sure how to get the data from buck2::commands::ForkserverCommand::exec to the places where std::process::Commands are created — can you clarify? Would I be adding parameters to buck2_forkserver::run::process_group::ProcessCommand::new?

The command in question is the one here - you have access to a &UnixForkserverService there, which is initialized here and that's called directly from the function you linked

I also noticed there's buck2_util::process::background_command, which probably also needs to reset the limits

I would guess that you probably do not need to touch this - if you look at what this is actually used for, it's not used for any arbitrary user controlled stuff, and so just inheriting the higher limits on those processes seems fine

@facebook-github-bot
Copy link
Contributor

@JakobDegen merged this pull request in 053f772.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Many download targets leads to intermittent failures due to file descriptor limits

3 participants