Expose heap size information to the user #3275

Merged
WojciechMazur merged 9 commits into scala-native:main on May 12, 2023

Conversation

Contributor

@Abdullahsab3 Abdullahsab3 commented May 6, 2023

This PR exposes heap size information to the end user of Scala Native. Concretely, the initial and maximum heap sizes can be read via scalanative.runtime.GC.getInitHeapSize() and scalanative.runtime.GC.getMaxHeapSize(), both of which return the size in bytes. This is similar to Runtime.getRuntime().maxMemory() in the Java library.

Motivation

In some cases, the user may be interested in knowing the initial and maximum heap size the garbage collector is using at runtime. For example, if the user sets the initial and maximum heap size using environment variables, they may want to do a sanity check in their program to make sure that the variables are set correctly. This is useful in some (automated) benchmarking infrastructures, where the heap size is often limited, and it may come in handy to check that the heap size settings are correctly set and well understood by the Scala Native program.

Implementation

Immix and Commix

For the immix and commix garbage collectors, the initial heap size is obtained with the Settings_MinHeapSize() function. As for the maximum heap size, if the user has set one using the corresponding environment variable, then that value is returned. Otherwise, the total amount of physical memory available on the machine is returned.
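The fallback rule described above (environment variable if set, otherwise total physical memory) can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: parse_size, physical_memory_bytes and max_heap_size are hypothetical names, the variable name GC_MAXIMUM_HEAP_SIZE is borrowed from later in this thread and may not be what immix and commix actually read, the k/m/g suffix handling is assumed, and the physical memory query uses sysconf(_SC_PHYS_PAGES), which is available on Linux but is not required by POSIX.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: parses "512", "64k", "512M" or "2g" into bytes.
 * Returns 0 for NULL, empty, malformed or overflowing input. */
static size_t parse_size(const char *text) {
    if (text == NULL || !isdigit((unsigned char)*text)) {
        return 0;
    }
    char *end = NULL;
    errno = 0;
    unsigned long long value = strtoull(text, &end, 10);
    if (errno == ERANGE) {
        return 0;
    }
    unsigned shift = 0;
    switch (tolower((unsigned char)*end)) {
    case '\0':
        break; /* plain byte count, no suffix */
    case 'k':
        shift = 10;
        end++;
        break;
    case 'm':
        shift = 20;
        end++;
        break;
    case 'g':
        shift = 30;
        end++;
        break;
    default:
        return 0; /* unknown suffix */
    }
    if (*end != '\0' || value > (SIZE_MAX >> shift)) {
        return 0; /* trailing garbage or overflow */
    }
    return (size_t)(value << shift);
}

/* Total physical memory in bytes, or 0 if it cannot be determined.
 * Assumption: _SC_PHYS_PAGES is supported (true on Linux). */
static size_t physical_memory_bytes(void) {
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);
    if (pages <= 0 || page_size <= 0) {
        return 0;
    }
    return (size_t)pages * (size_t)page_size;
}

/* Maximum heap size: the user's setting when present and valid,
 * otherwise the machine's physical memory. */
size_t max_heap_size(void) {
    size_t from_env = parse_size(getenv("GC_MAXIMUM_HEAP_SIZE"));
    return from_env != 0 ? from_env : physical_memory_bytes();
}
```

In this sketch a malformed setting is treated the same as an unset one; a real implementation might prefer to report the error instead.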

None

For the none gc, if the user did not change the heap size settings, the initial heap size will be 0, and the maximum heap size will be the total amount of physical memory.

Boehm

Unfortunately, I was not able to get the initial heap size using the public interface of the Boehm GC. I was able to locate it in the BDWGC codebase (line 928 of bdwgc/misc.c), but I could not find any public interface that exposes it.
As for the maximum heap size, the GC_get_prof_stats function may be used. However, when I tested on my machine (Ubuntu 22.04) with a manually set maximum heap size, GC_prof_stats_s.heapsize_full did not seem to change, which makes me think that I probably misunderstood its usage and that it is perhaps not the maximum heap size.

@WojciechMazur
Contributor

Thank you for this contribution! It would be great if you could provide a unit test, just to ensure it links correctly and returns sane results (checking that the returned size satisfies _ > 0 && _ < 32GB should probably be good enough).
You could also provide an implementation of Runtime.getRuntime().maxMemory() using these new functions, but that could also be introduced later in a follow-up PR.
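The suggested bounds check can be written down as a small predicate. The sketch below is in C purely for illustration (the actual unit test in this PR is a Scala test); is_sane_heap_size and MAX_SANE_HEAP_BYTES are hypothetical names, and 32GB is interpreted here as 32 GiB, which is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed upper bound: "32GB" read as 32 GiB. */
#define MAX_SANE_HEAP_BYTES (32ULL * 1024 * 1024 * 1024)

/* Hypothetical predicate mirroring `_ > 0 && _ < 32GB`. */
static bool is_sane_heap_size(uint64_t bytes) {
    return bytes > 0 && bytes < MAX_SANE_HEAP_BYTES;
}
```

Note that a strict lower bound of zero only fits GCs that report a non-zero value; where an initial heap size of 0 is a legitimate answer, the check would need to allow it.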

@Abdullahsab3 Abdullahsab3 marked this pull request as ready for review May 8, 2023 04:42
@Abdullahsab3
Contributor Author

Thanks for the feedback! I added a unit test to check whether the returned results are indeed within sound bounds.

I am still unsure about how to get the initial and maximum heap sizes for the boehm gc. If anyone has suggestions, please feel free to share them. I also tried 'guarding' the results that are returned from boehm gc when environment variables are used (i.e., if the user has set environment variables, return those instead of the retrieved results from boehm gc). However, Parsing.h does not seem to be accessible from gc.c in boehm. I keep getting linking errors (hence the commented code).
The error I am getting is undefined reference to Parse_Env_Or_Default.
Perhaps I am doing something wrong?

I will be opening a follow up PR soon to integrate the new functionalities with javalib.

@WojciechMazur
Contributor

Parsing.h is not available because the contents of the shared directory are not included for Boehm; you can adjust it here to match the other GCs

private[scalanative] case object Boehm
extends GC("boehm", Seq("gc"), Seq.empty)

I've taken a glance at Boehm GC, and we can probably start with the environment variables GC_MAXIMUM_HEAP_SIZE and GC_INITIAL_HEAP_SIZE. The runtime values might be unpredictable, because Boehm GC exposes GC_set_max_heap_size but does not define an accessor for that value.

@Abdullahsab3
Contributor Author

Thanks a lot, it makes sense 😅.

Hm, some checks seem to fail. Is this maybe from another commit not related to this PR? Failing tests seem to be from runtimes with immix and commix on Linux, though those were not modified in the latest commit to this PR.

For Boehm, if no environment variables are set, the initial heap size will be 0 and the maximum heap size will be GC_prof_stats_s.heapsize_full. I do not think there is a better way at this moment to retrieve those values from the boehm gc interface. If environment variables are set, then those can be returned.

@WojciechMazur
Contributor

> Hm, some checks seem to fail. Is this maybe from another commit not related to this PR?

I think these are GitHub infrastructure problems; I'll restart the checks later.

@Abdullahsab3 Abdullahsab3 changed the title [WIP] Expose heap size information to the user Expose heap size information to the user May 10, 2023
@WojciechMazur WojciechMazur merged commit 89f8ed5 into scala-native:main May 12, 2023
69 checks passed
WojciechMazur pushed a commit that referenced this pull request May 23, 2023
* [WIP] first iteration of exposing heap size info

* Exposed heap size information that the GC utilizes to the user

* updated getMinHeapSize name to getInitHeapSize name

* fixed clangfmt

* fixed gone bracket

* added unit test for the heap size information retrieval

* fixed scalafmt :)

* fixed linking boehm with shared. updated boehm to guard heap size information

(cherry picked from commit 89f8ed5)
WojciechMazur pushed a commit that referenced this pull request Jun 5, 2023
(cherry picked from commit 89f8ed5)