Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid generic call in most cases for getproperty #50523

Merged
merged 9 commits into from
Jul 25, 2023

Conversation

gbaraldi
Copy link
Member

More generic than #50444.

Should we keep the boundscheck on the getfield since we got the index from the runtime itself?

@gbaraldi gbaraldi requested a review from vtjnash July 12, 2023 17:25
src/codegen.cpp Outdated Show resolved Hide resolved
src/codegen.cpp Outdated
@@ -3839,9 +3851,17 @@ static bool emit_builtin_call(jl_codectx_t &ctx, jl_cgval_t *ret, jl_value_t *f,
*ret = emit_getfield_knownidx(ctx, obj, 0, utt, order);
return true;
}
else {
Value *index = ctx.builder.CreateCall(prepare_call(jlfieldindex_func),
{emit_typeof(ctx, obj, false, false), boxed(ctx, fld), ConstantInt::get(getInt32Ty(ctx.builder.getContext()), 0)});
Copy link
Member

@NHDaly NHDaly Jul 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

boxed(ctx, fld)

Does LLVM then unbox the fld value, before this actually reaches real code? This PR is meant to remove an allocation here right?

Can you also leave a comment explaining that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

boxed checks if the object needs a box before being passed into julia, if it was an Int or something without a type tag. In this case fld is known to be a Symbol so it's a no op. The boxing that used to happen here is because we used the generic jlcall convention.
I can add a comment if you think it's useful though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think anything where the reason for the core isn't obvious or straightforward should have a comment. 👍 Thanks

Copy link
Member

@NHDaly NHDaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending my requests for comments and a test. Thanks for fixing this, @gbaraldi!! We really appreciate it

Copy link
Sponsor Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems odd to delete most of the optimization for single-argument functions while adding this

@gbaraldi
Copy link
Member Author

gbaraldi commented Jul 13, 2023

if (nfields == 1) {
So we already have that optimization in emit_getfield_unknownidx, we just weren't hitting it before, the generic one lowers to the same thing as before bar not doing the literal value for the symbol compare, but I'm not sure how much we are saving there?
I can just add it back if we want.

@KristofferC KristofferC added backport 1.9 Change should be backported to release-1.9 backport 1.10 Change should be backported to the 1.10 release labels Jul 14, 2023
@Keno
Copy link
Member

Keno commented Jul 14, 2023

Agree with @vtjnash. This will be slower for the 1-arg case. I can see this being useful for types with interesting homogeneities, where unknownindex actually does something interesting. However, for the general case, I would much prefer a variant of getfield that just takes the object pointer and type tag separately so that we can avoid boxing, but also don't need to do double the number of calls emitted for unknown getfield.

@gbaraldi
Copy link
Member Author

I will readd the special case we had, I guess saving the field_index call (which doesn't need boxing often anyway) doesn't cost us anything.

KristofferC added a commit that referenced this pull request Jul 24, 2023
Backported PRs:
- [x] #50411 <!-- Fix weird dispatch of * with zero arguments -->
- [x] #50202 <!-- Remove dynamic dispatch from _wait/wait2 -->
- [x] #50064 <!-- Fix numbered prompt with input only with comment -->
- [x] #50026 <!-- Store heapsnapshot files in tempdir() instead of
current directory -->
- [x] #50402 <!-- Add CPU feature helper function -->
- [x] #50387 <!-- update newpages pointer after actually sweeping pages
-->
- [x] #50424 <!-- avoid potential type-instability in _replace_(str,
...) -->
- [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some
cases -->
- [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized -->
- [x] #50466 <!-- relax assertion involving pg->nold to reflect that it
may be a bit in… -->
- [x] #50490 <!-- Fix compat annotation for italic printstyled -->
- [x] #50488 <!-- fix typo in `Base.isassigned` with `Tridiagonal` -->
- [x] #50476 <!-- Profile: Add specifying dir for `take_heap_snapshot`
and handling if current dir is unwritable -->
- [x] #50461 <!-- fix typo in the --gcthreads argument description -->
- [x] #50528 <!-- ssair: Correctly handle stmt insertion at end of basic
block -->
- [x] #50533 <!-- ensure internal_obj_base_ptr checks whether objects
past freelist pointer are in freelist -->
- [x] #49322 <!-- improve cat design / performance -->
- [x] #50540 <!-- gc: remove over-eager assertion -->
- [x] #50542 <!-- gf: remove unnecessary assert cycle==depth -->
- [x] #50559 <!-- Expand kwcall lowering positional default check to
vararg -->
- [x] #50058 <!-- Add unwrapping mechanism for triangular mul and solves
-->
- [x] #50551 <!-- typeintersect: also record chained `innervars` -->
- [x] #50552 <!-- read(io, Char): fix read with too many leading ones
-->
- [x] #50541 <!-- precompile: ensure globals are not accidentally
created where disallowed -->
- [x] #50576 <!-- use atomic compare exchange when setting the GC
mark-bit -->
- [x] #50578 <!-- gf: make method overwrite/delete an error during
precompile -->
- [x] #50516 <!-- Fix visibility of assert on GCC12/13 -->
- [x] #50597 <!-- Fix memory corruption if task is launched inside
finalizer -->
- [x] #50591 <!-- build: fix various makefile bugs -->
- [x] #50599 <!-- faster invalid object lookup in conservative gc -->
- [x] #50634 <!-- 🤖 [master] Bump the SparseArrays stdlib from b4b0e72
to 99c99b4 -->
- [x] #50639 <!-- Backport LLVM patches to fix various issues. -->
- [x] #50546 <!-- Revert storage of method instance in LineInfoNode -->
- [x] #50631 <!-- Shift DCE pass to optimize imaging mode code better
-->
- [x] #50525 <!-- only check that values are finite in `generic_lufact`
when `check=true` -->
- [x] #50587 <!-- isassigned for ranges with BigInt indices -->
- [x] #50144 <!-- Page based heap size heuristics -->


Need manual backport:
- [ ] #50595 <!-- Rename ENV variable `JULIA_USE_NEW_PARSER` ->
`JULIA_USE_FLISP_PARSER` -->



Non-merged PRs with backport label:
- [ ] #50637 <!-- Remove SparseArrays legacy code -->
- [ ] #50618 <!-- inference: continue const-prop' when concrete-eval
returns non-inlineable -->
- [ ] #50598 <!-- only limit types in stack traces in the REPL -->
- [ ] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [ ] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [ ] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [ ] #50172 <!-- print feature flags used for matching pkgimage -->
@gbaraldi
Copy link
Member Author

Can we merge this @vtjnash

@vtjnash vtjnash merged commit 207c09a into JuliaLang:master Jul 25, 2023
6 checks passed
KristofferC pushed a commit that referenced this pull request Jul 28, 2023
More generic than #50444.

Should we keep the boundscheck on the getfield since we got the index
from the runtime itself?

(cherry picked from commit 207c09a)
@quinnj
Copy link
Member

quinnj commented Aug 1, 2023

Will/can this be backported to 1.9.X as well?

kpamnany added a commit to RelationalAI/julia that referenced this pull request Aug 4, 2023
@KristofferC KristofferC mentioned this pull request Aug 10, 2023
35 tasks
KristofferC pushed a commit that referenced this pull request Aug 10, 2023
@gbaraldi gbaraldi deleted the optimize-getproperty-more branch August 14, 2023 16:15
KristofferC added a commit that referenced this pull request Aug 16, 2023
Backported PRs:
- [x] #50637 <!-- Remove SparseArrays legacy code -->
- [x] #50665 <!-- print `@time` msg into print buffer -->
- [x] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [x] #50635 <!-- `versioninfo()`: include build info and unofficial
warning -->
- [x] #50670 <!-- Make reinterpret specialize fully. -->
- [x] #50666 <!-- include `--pkgimage=no` caches for stdlibs -->
- [x] #50765 
- [x] #50764
- [x] #50768
- [x] #50767
- [x] #50618 <!-- inference: continue const-prop' when concrete-eval
returns non-inlineable -->
- [x] #50689 <!-- Attach `tanpi` docstring to method -->
- [x] #50671 <!-- Fix rdiv of complex lhs by real factorizations -->
- [x] #50598 <!-- only limit types in stack traces in the REPL -->
- [x] #50766 <!-- Don't partition alwaysinline functions -->
- [x] #50771 <!-- re-allow non-string values in ENV `get!` -->
- [x] #50682 <!-- Add fallback if we have make a weird GC decision. -->
- [x] #50781 <!-- fix `bit_map!` with aliasing -->
- [x] #50172 <!-- print feature flags used for matching pkgimage -->
- [x] #50844 <!-- Bump OpenBLAS binaries to use the new GEMM
multithreading threshold -->
- [x] #50826 <!-- Update dependency builds -->
- [x] #50845 <!-- fix #50438, use default pool for at-threads -->
- [x] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [x] #50655 <!-- fix hashing regression. -->
- [x] #50779 <!-- Minor refactor to image generation -->
- [x] #50791 <!-- Make symbols internal in jl_create_native, and only
externalize them when partitioning -->
- [x] #50724 <!-- Merge opaque closure modules with the rest of the
workqueue -->
- [x] #50738 <!-- Add alignment to constant globals -->
- [x] #50871 <!-- macOS: Don't inspect dead threadtls during exception
handling. -->

Need manual backport:

Contains multiple commits, manual intervention needed:

Non-merged PRs with backport label:
- [ ] #50850 <!-- Remove weird Rational dispatch and add pi functions to
list -->
- [ ] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [ ] #50809 <!-- Limit type-printing in MethodError -->
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #50594 <!-- Disallow non-index Integer types in isassigned -->
- [ ] #50385 <!-- Precompile pidlocks: add to NEWS and docs -->
- [ ] #49805 <!-- Limit TimeType subtraction to AbstractDateTime -->
KristofferC added a commit that referenced this pull request Aug 18, 2023
Backported PRs:
- [x] #47782 <!-- Generalize Bool parse method to AbstractString -->
- [x] #48634 <!-- Remove unused "deps" mechanism in internal sorting
keywords [NFC] -->
- [x] #49931 <!-- Lock finalizers' lists at exit -->
- [x] #50064 <!-- Fix numbered prompt with input only with comment -->
- [x] #50474 <!-- docs: Fix a `!!! note` which was miscapitalized -->
- [x] #50516 <!-- Fix visibility of assert on GCC12/13 -->
- [x] #50635 <!-- `versioninfo()`: include build info and unofficial
warning -->
- [x] #49915 <!-- Revert "Remove number / vector (#44358)" -->
- [x] #50781 <!-- fix `bit_map!` with aliasing -->
- [x] #50845 <!-- fix #50438, use default pool for at-threads -->
- [x] #49031 <!-- Update inference.md -->
- [x] #50289 <!-- Initialize prev_nold and nold in gc_reset_page -->
- [x] #50559 <!-- Expand kwcall lowering positional default check to
vararg -->
- [x] #49582 <!-- Update HISTORY.md for `DelimitedFiles` -->
- [x] #50341 <!-- invokelatest docs should say not exported before 1.9
-->
- [x] #50525 <!-- only check that values are finite in `generic_lufact`
when `check=true` -->
- [x] #50444 <!-- Optimize getfield lowering to avoid boxing in some
cases -->
- [x] #50523 <!-- Avoid generic call in most cases for getproperty -->
- [x] #50860 <!-- Add `Base.get_extension` to docs/API -->
- [x] #50164 <!-- codegen: handle dead code with unsafe_store of FCA
pointers -->
- [x] #50568 <!-- `Array(::AbstractRange)` should return an `Array` -->
- [x] #50871 <!-- macOS: Don't inspect dead threadtls during exception
handling. -->

Need manual backport:
- [ ] #48542 <!-- Add docs on task-specific buffering using
multithreading -->
- [ ] #50591 <!-- build: fix various makefile bugs -->


Non-merged PRs with backport label:
- [ ] #50842 <!-- Avoid race conditions with recursive rm -->
- [ ] #50823 <!-- Make ranges more robust with unsigned indexes. -->
- [ ] #50663 <!-- Fix Expr(:loopinfo) codegen -->
- [ ] #49716 <!-- Update varinfo() docstring signature -->
- [ ] #49713 <!-- prevent REPL from erroring in numbered mode in some
situations -->
- [ ] #49573 <!-- Implement jl_cpu_pause on PPC64 -->
- [ ] #48726 <!-- fix macro expansion of property destructuring -->
- [ ] #48642 <!-- Use gc alloc instead of alloc typed in lowering -->
- [ ] #48183 <!-- Don't use pkgimage for package if any includes fall in
tracked path for coverage or alloc tracking -->
- [ ] #48050 <!-- improve `--heap-size-hint` arg handling -->
- [ ] #47615 <!-- Allow threadsafe access to buffer of type inference
profiling trees -->
@KristofferC KristofferC removed backport 1.10 Change should be backported to the 1.10 release backport 1.9 Change should be backported to release-1.9 labels Aug 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants