Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider available memory and address space for parallel execution #2418

Merged
merged 3 commits into from
Mar 30, 2023

Conversation

pmatilai
Copy link
Member

@pmatilai pmatilai commented Mar 8, 2023

See commits for details, but short summary: add optional proc/thread argument to %{getncpus} macro which consider the available memory and address space, based on newly added tunables for tasksize.

The goal here is to avoid build failures due to gross overallocation on constrained systems, and to allow packagers to easily adjust for gigantic builds.

Fixes: #804

@pmatilai pmatilai added the RFE label Mar 8, 2023
@pmatilai pmatilai added this to the 4.19.0 milestone Mar 8, 2023
@pmatilai pmatilai force-pushed the memcpus-pr branch 3 times, most recently from 3353d6a to e4d0e62 Compare March 8, 2023 08:58
@pmatilai
Copy link
Member Author

pmatilai commented Mar 8, 2023

In my original version this also added a %getmem macro with similar arguments. Left it out to keep things simple and minimal, but would be trivial to add back if people think that's useful.

@pmatilai pmatilai self-assigned this Mar 16, 2023
@Conan-Kudo
Copy link
Member

%getmem could be useful for making %limit_build work better in Fedora, so I think it'd be worth having.

@pmatilai
Copy link
Member Author

Well, the point of this PR is to make %limit_build and the like redundant entirely.

@Conan-Kudo
Copy link
Member

That's even better. 👍🏾

rpmio/macro.c Outdated Show resolved Hide resolved
rpmio/macro.c Show resolved Hide resolved
{
unsigned long mem = getmem_total();
/*
* Conservative estimates for thread use on 32bit systems where address
Copy link
Contributor

@dmnks dmnks Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious (no nitpick here), where do these estimates come from? They don't seem totally arbitrary so I suppose there's some technical reasoning behind them?

Copy link
Member Author

@pmatilai pmatilai Mar 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The total 32bit address space is 4GB. On a native 32bit Linux system, the kernel eats 1GB out of that, leaving 3GB for the process. On a 64bit system a 32bit process has nearly all of the 4GB available to it. While running out of physical memory can be handled by virtual memory (aka swap), the address space is a hard limit and rpm can't handle running into it. Which is why the estimates are very conservative.

Just realized the patch uses a misleading 'vmem' name when it's actually address space it's talking about.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@dmnks
Copy link
Contributor

dmnks commented Mar 22, 2023

Other than the above, the overall changeset looks sane to me.

"total" equals calling with no arguments, "proc" and "thread" consider
further constraints, what is implemented here is heuristics based on
available physical memory and address-space and %_smp_tasksize_proc /
%_smp_tasksize_thread tunables.

Change the previous %getncpus related tests to use %getconfdir instead,
they are testing unexpected arguments behavior for this type of macro,
not %getncpus itself. Add a test for the actual functionality: if nproc
is available, test that our total matches with that, and that defining
tasksize to total memory only allocates one thread. Optimally we'd
test separately for 32bit address space limitations but that gets tough
when we have no idea where this will be executed.
Take advantage of the new %{getncpus:thread} functionality when
calculating the number of threads to use for io stream compression
(when not in parallel region)
…anagement#804)

Take advantage of the new %{getncpus:proc/thread} functionality when
calculating the number of processes/threads to use during build.

The goal here is to avoid gross overallocation of processes/threads
on constrained systems. In particular threads on 32bit systems where
address space is limited, but also to allow packagers to easily tune for
gigantic build jobs such as webkit that may overwhelm otherwise adequate
systems.

Fixes: rpm-software-management#804, RhBug:1118734
@dmnks
Copy link
Contributor

dmnks commented Mar 27, 2023

OK, I can confirm the new push fixes the proc calculation, thanks!

@pmatilai pmatilai removed this from the 4.19.0 milestone Mar 28, 2023
@pmatilai
Copy link
Member Author

Okay, since there are no further comments...

@pmatilai pmatilai merged commit a213101 into rpm-software-management:master Mar 30, 2023
@pmatilai pmatilai deleted the memcpus-pr branch March 30, 2023 10:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

RFE: rpm global resource manager
3 participants