Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove fork-related code from Rust #7894

Merged
merged 4 commits into from Jun 20, 2019

Conversation

Projects
None yet
5 participants
@gshuflin
Copy link
Contributor

commented Jun 18, 2019

cf. #7821

@gshuflin gshuflin force-pushed the gshuflin:remove_resettable branch from 6c688fd to a0f2e90 Jun 18, 2019

@Eric-Arellano
Copy link
Contributor

left a comment

This looks great! Holding off on approval only because I'm not enough of an authority on Rust to confidently approve.

Thanks for doing this!

Show resolved Hide resolved src/rust/engine/src/context.rs
@illicitonion
Copy link
Contributor

left a comment

Welcome to the project! This looks great, thanks!

CI is showing that the rust code hasn't been formatted with rustfmt; can you run build-support/bin/check_rust_formatting.sh -f and commit the result?

I recommend running build-support/bin/setup.sh one time in your git checkout, which will install git hooks which will do things like check formatting before you commit :)

.unwrap(),
remote_store_rpc_retries,
futures_timer_thread2.handle(),
let local_store_dir = local_store_dir.clone();

This comment has been minimized.

Copy link
@illicitonion

illicitonion Jun 19, 2019

Contributor

This clone is no longer necessary, because we're now moving the value into a FnOnce closure not an Fn closure.

@@ -378,13 +378,11 @@ pub extern "C" fn scheduler_metrics(
pub extern "C" fn scheduler_fork_context(

This comment has been minimized.

Copy link
@illicitonion

illicitonion Jun 19, 2019

Contributor

Definitely a question for a separate PR, but I believe this is now unused, and it (and the python half which calls this) can also be deleted

@gshuflin gshuflin changed the title Remove resettable Remove fork-related code from Rust Jun 19, 2019

@gshuflin gshuflin force-pushed the gshuflin:remove_resettable branch 2 times, most recently from 57ff713 to 4e1de00 Jun 19, 2019

@Eric-Arellano

This comment has been minimized.

Copy link
Contributor

commented Jun 20, 2019

This is still failing due to error cannot allocate memory https://travis-ci.org/pantsbuild/pants/jobs/547962452#L2810.

I originally thought this was us running out of memory because of V2 using too much of it, but I don't think that's the case anymore. This failure has happened twice now for this PR, and hasn't shown up on any other builds.

@illicitonion any thoughts on what might be causing it? Greg wasn't able to reproduce locally via ./pants --no-v1 --v2 test tests/python/pants_test/engine:fs.

--

Also, has a merge conflict now.

@blorente
Copy link
Contributor

left a comment

Welcome to the project!

I don't know why the out of memory error happens, but to reproduce it locally you might want to try setting this value to something lower: https://github.com/pantsbuild/pants/blob/master/src/rust/engine/fs/src/store.rs#L24

// Allow for some overhead for bookkeeping threads (if any).
process_execution_parallelism + 2,
store.clone(),
futures_timer_thread2.clone(), // TODO remove clone once the store and http are not resettables

This comment has been minimized.

Copy link
@blorente

blorente Jun 20, 2019

Contributor

Can we remove this clone?

@illicitonion illicitonion force-pushed the gshuflin:remove_resettable branch from 4e1de00 to 0f7b2aa Jun 20, 2019

@illicitonion

This comment has been minimized.

Copy link
Contributor

commented Jun 20, 2019

This is still failing due to error cannot allocate memory https://travis-ci.org/pantsbuild/pants/jobs/547962452#L2810.

I originally thought this was us running out of memory because of V2 using too much of it, but I don't think that's the case anymore. This failure has happened twice now for this PR, and hasn't shown up on any other builds.

@illicitonion any thoughts on what might be causing it? Greg wasn't able to reproduce locally via ./pants --no-v1 --v2 test tests/python/pants_test/engine:fs.

I do have thoughts! So, a while ago, we made the store attempt to allocate a ridiculous amount of virtual memory. This is needed because it mmaps a potentially very large file from disk, and we want to allow the file tot be very large. We make a store per pants process, or per scheduler we create in unit tests. @stuhood ended up making us re-use schedulers across the test-cases in a test-class, because he couldn't find a convenient and reliable way to force python to gc away the schedulers (which have stores) after each test-case, and so we ended up collecting a bunch of these, and they blew up memory usage like crazy.

We currently have two separate LMDB stores, one for files (which may be large), and one for metadata about directories (which should be tiny), but we allocate the same giant amount of virtual memory to both of them. We can probably trivially drop the allocation to the one for directories by a few orders of magnitude, which I will try doing now... This is still kind of hacking around the problem, but... shrug

The reason this PR will be making us more likely to max out memory is that before Stores were only lazily initialized when they were actually needed (because that's how Resettables worked), whereas now we're eagerly initialising them every time we make a scheduler; so some schedulers we made before which didn't take a bunch of virtual memory, now will.

Also, it's not just this PR; I saw this on another of my PRs today too.

--

Also, has a merge conflict now.

I've just pushed a rebase, because that merge conflict was totally my fault, and totally foreseeable by me! Sorry for the hassle!

@illicitonion illicitonion force-pushed the gshuflin:remove_resettable branch from 0f7b2aa to 9dd8569 Jun 20, 2019

fmt

@illicitonion illicitonion force-pushed the gshuflin:remove_resettable branch from 9dd8569 to e2739d3 Jun 20, 2019

@illicitonion illicitonion merged commit f7d45dc into pantsbuild:master Jun 20, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@gshuflin gshuflin deleted the gshuflin:remove_resettable branch Jun 20, 2019

@@ -79,11 +77,10 @@ impl Core {
let mut remote_store_servers = remote_store_servers;
remote_store_servers.shuffle(&mut rand::thread_rng());

let runtime = Resettable::new(|| {
let runtime =
Arc::new(RwLock::new(Runtime::new().unwrap_or_else(|e| {

This comment has been minimized.

Copy link
@stuhood

stuhood Jun 21, 2019

Member

Runtime is Send and Sync afaict, so the RwLock should no longer be necessary...? I wonder whether it ever was.

This comment has been minimized.

Copy link
@stuhood

stuhood Jun 21, 2019

Member

...oh, sorry. Except we need mutable access to it for spawn and block_on. Nevermind.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.