Make the NEWOBJ API narrower #7393
In the commit "Remove RB_RVARGC_EC_NEWOBJ_OF and RB_EC_NEWOBJ_OF", "teh" seems to be a typo.
I'm not a huge fan of passing in the EC to RVARGC_NEWOBJ_OF, since the EC is just an implementation detail of the current GC. In the future we should think about how we can get rid of passing the EC into the GC.
This is to fix a weird bindgen behavior on Matt's branch: #7393
The background to pass the
However, now C-compiler specific TLS feature (
So if you measured at least microbenchmarks the following:
and if we don't have a difference, I agree to avoid passing the

For me, this kind of branch should be avoided to reduce branches.
I got around to looking at these benchmarks recently, and to summarise the situation so far:
I refactored the NEWOBJ APIs to reduce complexity at the call sites. This has resulted in an API that requires us to either pass an
@peterzhu2118 had a concern that requiring call sites to pass the
@ko1 was concerned that not passing the
Additionally, @ko1 was concerned that the extra branch point required now in the
I built a micro-benchmark, designed to produce many
This is the benchmark:
These are the branches I used, and how they compare:
results
Benchmarking was done on Fedora 37 on a Ryzen 3600, running at a fixed 3.6GHz, with each process restricted to a single CPU core and with ASLR disabled, to try and reduce variance.
We can see from these results that
So I would like to merge the patches that remove the
One thing I don't have a good answer for at this time is why
We can even see this by looking at the generated code for the
In addition, in
So given that it generates more code, and that extra code is fetching the
I don't think this is anything to worry about, given that removing the
We should probably try this benchmark with more than 1 thread live in the system. I'd guess we'll see stalls when there are multiple threads?
@tenderlove It looks like you're right. I think restricting the benchmark too far was skewing the results. I changed the benchmark script to this:
And ran all the benchmarks again, outside of the
So, as expected, this time removing the explicit
If you're happy with this, then I'll tidy up the commits and merge the refactor with the macro.
Sounds good to me!
We can just make newobj_of take a ractor
so that now shape can happily include gc.h
NEWOBJ_OF is now our canonical newobj macro. It takes an optional ec
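To make the "optional ec" idea concrete, here is a minimal, self-contained sketch in C. It is not the code from this PR: every `demo_`/`DEMO_` name is hypothetical, `demo_get_ec()` merely stands in for `GET_EC()`, and the "allocator" just reports which context it was given. It only illustrates the shape of a single entry point that accepts an explicit context or falls back to a lookup when handed `NULL`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an execution context; not Ruby's real type. */
typedef struct demo_ec {
    int id;
} demo_ec_t;

static demo_ec_t demo_current_ec = { 1 };
static int demo_lookups = 0; /* counts how often the fallback lookup runs */

/* Stand-in for GET_EC(): finds the current execution context. */
static demo_ec_t *
demo_get_ec(void)
{
    demo_lookups++;
    return &demo_current_ec;
}

/* Stand-in allocator: returns the id of the context it was handed. */
static int
demo_newobj_of(demo_ec_t *ec, size_t size)
{
    (void)size; /* a real allocator would pick a slot size from this */
    return ec->id;
}

/* One entry point; this NULL check is the extra branch discussed above. */
static inline int
demo_newobj_with_optional_ec(demo_ec_t *ec, size_t size)
{
    if (ec == NULL) ec = demo_get_ec();
    assert(ec != NULL);
    return demo_newobj_of(ec, size);
}

#define DEMO_NEWOBJ_OF(ec, size) demo_newobj_with_optional_ec((ec), (size))
```

A caller that already holds a context passes it and skips the lookup; a caller that does not passes `NULL` and pays for one lookup plus the branch.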
The socket extension's rubysocket.h pulls in the "private" include/gc.h, which now depends on vm_core.h. vm_core.h pulls in id.h. When tool/update-deps generates the dependencies for the makefiles, it generates the line for id.h to be based on VPATH, which is configured in the extconf.rb for each of the extensions. By default VPATH does not include the actual source directory of the current Ruby, so the dependency fails to resolve and linking fails. We need to append the topdir and top_srcdir to VPATH to have the dependency picked up correctly (and I believe we need both of these to cope with in-tree and out-of-tree builds). I copied this from the approach taken in https://github.com/ruby/ruby/blob/master/ext/objspace/extconf.rb#L3
Sorry, why does it introduce stalls?
Since the introduction of variable width allocation with `RVARGC` there are a lot of different `*NEWOBJ*` macros. Currently there are:

- `RB_RVARGC_NEWOBJ_OF`
- `RB_RVARGC_EC_NEWOBJ_OF`
- `RB_NEWOBJ_OF`, an alias of `RB_RVARGC_NEWOBJ_OF`
- `RB_EC_NEWOBJ_OF`, an alias of `RB_RVARGC_EC_NEWOBJ_OF`
- `NEWOBJ_OF`, an alias of `RB_RVARGC_NEWOBJ_OF`
- `RVARGC_NEWOBJ_OF`, an alias of `RB_RVARGC_NEWOBJ_OF`

This PR merges `RB_RVARGC_NEWOBJ_OF`, `RB_RVARGC_EC_NEWOBJ_OF`, `RVARGC_NEWOBJ_OF`, `RB_NEWOBJ_OF` and `NEWOBJ_OF` into a single macro that takes an execution context as an argument (which can be `NULL`; if so, the current execution context is looked up with `GET_EC()`).

The resulting single macro has been named `NEWOBJ_OF` to reflect that it should now be the only way of creating new objects.

Both `RB_NEWOBJ_OF` and `NEWOBJ_OF` have separate implementations that are part of the public API exposed in `include/ruby/internal/newobj.h` so they are available to extension authors. These have not been modified in any way.

Note that implementing this PR required `gc.h` to include `vm_core.h`. This was not possible because `vm_core.h` included `shape.h`, which included `gc.h`, creating a circular dependency.

To address this I have removed the `shape_list`, `next_shape_id` and `root_shape` members from `rb_vm_struct` and instead promoted them to their own type: `rb_shape_tree_t`. This is defined as a global object accessible using `rb_shape_tree_ptr` and the `GET_SHAPE_TREE()` function macro.

In practice, this means that `GET_VM()->shape_list` etc. are now replaced with `GET_SHAPE_TREE()->shape_list`.
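The shape tree change described above can be pictured with a small standalone sketch. This is an illustration under assumptions, not the code in this PR: the member names `shape_list`, `next_shape_id` and `root_shape` come from the description, but their types, the fixed-size backing array, and every `demo_`/`DEMO_` identifier are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shape types; the real definitions are not shown in this PR text. */
typedef uint32_t demo_shape_id_t;
typedef struct demo_shape {
    demo_shape_id_t id;
} demo_shape_t;

/* The three members that used to live on the VM struct, grouped on their own. */
typedef struct demo_shape_tree {
    demo_shape_t *shape_list;
    demo_shape_t *root_shape;
    demo_shape_id_t next_shape_id;
} demo_shape_tree_t;

#define DEMO_MAX_SHAPES 8

static demo_shape_t demo_shapes[DEMO_MAX_SHAPES];

/* A single global tree; slot 0 is treated as the root shape. */
static demo_shape_tree_t demo_shape_tree = { demo_shapes, &demo_shapes[0], 1 };
static demo_shape_tree_t *demo_shape_tree_ptr = &demo_shape_tree;

/* Accessor macro, mirroring the role of GET_SHAPE_TREE(). */
#define DEMO_GET_SHAPE_TREE() (demo_shape_tree_ptr)

/* Hands out the next free shape, or NULL once the backing array is full. */
static demo_shape_t *
demo_shape_alloc(void)
{
    demo_shape_tree_t *tree = DEMO_GET_SHAPE_TREE();
    assert(tree != NULL);
    if (tree->next_shape_id >= DEMO_MAX_SHAPES) return NULL;
    demo_shape_id_t id = tree->next_shape_id++;
    tree->shape_list[id].id = id;
    return &tree->shape_list[id];
}
```

Because the accessor no longer goes through the VM struct, a header that only needs shapes does not have to see the VM definition, which is the kind of decoupling that lets the include cycle be broken.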