-
Notifications
You must be signed in to change notification settings - Fork 602
OP_EMPTYAVHV - optimized empty ANONLIST/ANONHASH #20036
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OP_EMPTYAVHV - optimized empty ANONLIST/ANONHASH #20036
Conversation
|
Comments:
|
fb6b801 to
61dfc3f
Compare
|
On Wed, Aug 03, 2022 at 03:20:49PM -0700, Richard Leach wrote:
This commit introduces a new OP to replace cases of `OP_ANONLIST` and
`OP_ANONHASH` where there are zero elements, which is very common in
Perl code.
I like this idea. I also think it can be taken further. Here are my
observations.
…------
Name: how about OP_EMPTYAVHV.
(I don't like 'MT' - had no idea it was short for 'empty').
------
A further optimisation might be, for the case of
[my] $x = {};
to skip the padmy and sassign, and set op_targ to be the reference
directly. I.e. something along the lines of this pseudocode:
SV *rv;
SV *sv = newSV_type(SVt_PVAV // SVt_PVHV);
if (OPpTARGET_MY) {
rv = PL_curpad[op_targ];
/* coerce rv to be RV-capable, then ...*/
SvRV_set(rv, sv);
if (OPpLVAL_INTRO) /* is 'my' declaration */
save_clearsv(op_targ);
if (G_VOID)
return; /* skip extending and pushing */
}
else {
rv = newSV_type_mortal(SVt_IV);
/* .... */
SvRV_set(rv, sv);
}
XPUSHs(rv);
Since the op_targ is very likely to be an undef SVt_IV from a previous
iteration, converting it to a live RV can typically be special-cased.
------
OPf_SPECIAL: an easy way to see if/where the OPf_SPECIAL flag isn't used
in pp_anaonlist/hash() is to add a temporary assert at the start of
both functions, like
assert(PL_op->op_flags & OPf_SPECIAL);
then run it and see what breaks. In this case, it's used (at least) by the
constant-folder to fold a constant list, as in e.g. @A[1..5] where the
(1..5) gets converted into the constant array (1,2,3,4,5) and the AV is
stored directly in the CONST op, rather than a ref to it.
But in this case, only optimising the OPf_SPECIAL case looks right.
------
The OPpANONTYPE flag: for flags associated with just one op, they're
typically named as OPp[name of op]_[description of flag] So for this,
I'd suggest OPpEMPTYAVHV_IS_HV. I suggest _IS_HV rather than _TYPE,
since then whether the flag being on indicates AV or HV is built into the
name.
------
Don't test the opcode structure in ext/B/t/optree_samples.t; that tends to
require fixing up any time there's a minor change in an unrelated opcode.
It's easier and more robust to add something to t/perf/opcount.t - this
test file checks for certain opcodes being present (or absent) in the
right quantity. For example with $a[0] it checks that there are zero
alem's and one aelemfast, confirming that the optimisation is present.
--
Modern art:
"That's easy, I could have done that!"
"Ah, but you didn't!"
|
|
On Thu, Aug 04, 2022 at 04:41:13AM -0700, iabyn wrote:
On Wed, Aug 03, 2022 at 03:20:49PM -0700, Richard Leach wrote:
> This commit introduces a new OP to replace cases of `OP_ANONLIST` and
> `OP_ANONHASH` where there are zero elements, which is very common in
> Perl code.
And one last thing. Add a few entries to t/perf/benchmarks: at a minimum,
code => '$x = []'
setup => 'my $x', code => '$x = []'
code => 'my $x = []'
and ditto for {}.
Then run Porting/bench.pl on pre- and post- perls.
…--
More than any other time in history, mankind faces a crossroads. One path
leads to despair and utter hopelessness. The other, to total extinction.
Let us pray we have the wisdom to choose correctly.
-- Woody Allen
|
8f0b0aa to
9764573
Compare
|
Thanks for the helpful comments, @iabyn. Think I've actioned all of them, apart from running bench.pl, which I'll do once the pp_ function is finalised. Besides a general re-review, two specific new things I'd appreciate you taking a look at
|
9764573 to
094764e
Compare
|
On Fri, Aug 05, 2022 at 01:07:12PM -0700, Richard Leach wrote:
Thanks for the helpful comments, @iabyn. Think I've actioned all of
them, apart from running _bench.pl_, which I'll do once the pp_ function
is finalised.
Personally I add tests to benchmarks at the *start* of optimisation work,
then run Porting/bench.pl (on just the selected tests) frequently during
development. Since its very fast (takes just a couple of minutes with a
suitable -j N arg) and 100% reproducible - it's running a cachgrind
simulation of the CPU - it can quickly spot mistakes - e.g. "why has the
number of branch executions just gone up by 1 after I tweaked pp_foo()?"
It can save the results of a run in a data file; then later you can
compare several such files to see how things are going (up and down).
E.g. do a run and have a file for every significant commit.
* I wasn't sure how concise "coerce rv to be RV-capable" can safely be. The current code passes all tests, but...that doesn't mean it's okay.
I think you've unrolled too much there - it should be just a quick check
for a simple common special-case and if not, fallback to the slow
SvSetMagicSV() or whatever. No need to unroll SvSetMagicSV() itself.
Thinking about it further, the 3 common cases to expect are:
1) first iteration using the pad var:
condition: SVt_NULL
2) second+ iteration if the variable goes out of scope between
iterations; e.g.:
for (...) { my $x = {}; ... }
sub f { my $x; ...; $x = {}; .... }
condition: SVt_IV && !SvOK
3) second+ iteration if the variable doesn't go out scope; e.g.:
my $x; for (...) { ...; $x = {}; ... }
condition: SVt_IV && SvROK
Any other combo and the programmer is doing something weird - e.g. using a
variable initially as a string, then as a reference.
I think it's reasonable to short-cut just for case 2: The container is
already set up to hold an RV, it can't have any magic weirdness, and it
doesn't have an old RV value which needs freeing first.
NB: in your code, you're testing the SV type against OP_NULL rather than
SVt_NULL.
* I haven't figured out how to do without the mortal RV (`refsv` in
`pp_emptyavhv`) when doing a lexical assignment, since leaving it out
causes a leakage in _t/op/svleak.t_.
`
Well, creating a mortal in the assignment case is completely wrong
and defeats most of the point of the OPpTARGET_MY optimisation!
It's difficult to comment much more without seeing the original failing
code. But, in principle, doing
sv = newSV_type(SVt_PVAV);
creates an orphaned SV with a reference count of 1. If the code exits or
dies at that moment, it will leak. So it needs to be attached somewhere.
Turning the lexical var pointed to by op_targ into an RV and pointing it
at sv is an example of attaching it. Normally when creating an RV you
increment the ref count of the thing being pointed at, but in this case
there's no need to do so as sv already has a refcount that's "one too
high".
It's possible that you had a false positive in your t/op/svleak.t test.
The number of SVs should go up by 1 because an anon array/hash is being
created. The important thing is that it doesn't go up by *10+* on 10
iterations.
In t/perf/opcount.t it would be better to have separate tests for each of
the {}, $x={}, my $x; $x={}, my $x={} etc.
In creating the array/hash, wouldn't it be slightly more efficient to
select just the type rather than having two newSV_type() calls? i.e.
SV * const sv = MUTABLE_SV(newSV_type(
(op->op_private & OPpEMPTYAVHV_IS_HV) ?
SVt_PVHV : SVt_PVAV
));
I haven't reviewed your code in rpeep() nor in B::Deparse.
…--
Any [programming] language that doesn't occasionally surprise the
novice will pay for it by continually surprising the expert.
-- Larry Wall
|
|
Thanks for the updated comments. I'll try to do an updated push later this week. |
|
On Mon, Aug 08, 2022 at 04:04:21PM -0700, Richard Leach wrote:
FWIW, the test in _t/op/svleak.t_ which fails if the `refsv` mortal isn't created is this one:
```
{ # broken by 304474c, fixed by cefd5c7, but didn't seem to cause
# any other test failures
# base test case from ribasushi (Peter Rabbitson)
no warnings 'experimental::builtin';
use builtin 'weaken';
my $weak;
{
$weak = my $in = {};
weaken($weak);
my $out = { in => $in, in => undef }
}
ok(!$weak, "hash referenced weakened SV released");
}
```
That's mainly testing that assigning two values to the same hash key in
sequence decrements the ref count of the first value. It's not obvious to
me why that's the only test that broke.
…--
Hofstadter's Law: It always takes longer than you expect, even when you
take into account Hofstadter's Law.
|
80e57a1 to
bb3d06f
Compare
|
Ok, think I sorted out the reference counting. t/perf/benchmark differences are as follows: |
a0a635a to
dd5f28b
Compare
dd5f28b to
abd3ade
Compare
|
Rebased after #20077 merge. |
abd3ade to
395fd1a
Compare
|
Rebased after aafefcb merge. |
|
I can't test anything till Thursday. I'm comfortable with whatever you
choose. Sorry about that.
Yves
…On Sat, 17 Sept 2022, 23:15 Richard Leach, ***@***.***> wrote:
@iabyn <https://github.com/iabyn> / @demerphq
<https://github.com/demerphq> - I'll merge this within a few days unless
you'd prefer me to hold off for further review.
—
Reply to this email directly, view it on GitHub
<#20036 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAZ5R4IT6OLVF5ZD3H2HUTV6Y7H5ANCNFSM55QNVBWA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
|
Unless anyone has objections, I'd like to merge this before the next dev release in just over a week's time. |
a274920 to
525a9e8
Compare
|
You need a version bump. It looks ok to me. |
|
Thanks. lib/B/op_private.pm seems to need a bump as part of the v.5.37.5 release, so I'll hold off until that has happened. |
Apologies @richardleach please rebase and force push to your branch. This should correct the issue. |
This commit introduces a new OP to replace cases of OP_ANONLIST and
OP_ANONHASH where there are zero elements, which is very common in
Perl code.
As an example, `my $x = {}` is currently implemented like this:
...
6 <2> sassign vKS/2 ->7
4 <@> anonhash sK* ->5
3 <0> pushmark s ->4
5 <0> padsv[$x:1,2] sRM*/LVINTRO ->6
The pushmark serves no meaningful purpose when there are zero
elements and the anonhash, besides undoing the pushmark,
performs work that is unnecessary for this special case.
The peephole optimizer, which also checks for applicability of a
related TARGMY optimization, transforms this example into:
...
- <1> ex-sassign vKS/2 ->4
3 <@> emptyavhv[$x:1,2] vK*/LVINTRO,ANONHASH,TARGMY ->4
- <0> ex-pushmark s ->3
- <0> ex-padsv sRM*/LVINTRO ->-
525a9e8 to
b1b67a3
Compare
|
Thanks @toddr. Merging. |
|
On Sat, Sep 17, 2022 at 03:15:26PM -0700, Richard Leach wrote:
@iabyn / @demerphq - I'll merge this within a few days unless you'd prefer me to hold off for further review.
Sorry, I didn't get round to doing this. In short, lools good, apart from
these nits:
the op has OPf_SPECIAL flag set - this is a hangover from when the op
started out as an OP_ANONLIST/HASH. It aught really to be cleared when
converted into an OP_EMPTYAVHV
The special-case code for when the SV is SVt_IV just assumes all other
flags can be blown away. This is probably ok at the moment, but might
change. Best to put in an assert to see if any unexpected flags are set:
if (SvTYPE(rv) == SVt_IV && !SvOK(rv)) {
assert((SvFLAGS(rv) & ~(SVf_IVisUV|SVs_PADSTALE)) == SVt_IV);
...
SVs_PADSTALE mat be set but will get blown away anyway by save_clearsv()
in the OPpLVAL_INTRO case, and otherwise its a likely a bug if its set -
but harmless to unset.
It might be worth adding a couple of tests for
state $r = {};
such as checking that the second time round a loop the contents of %$r
haven't been lost.
Might be worth adding in a few deparse tests for the various
/my/state $x = ....
permutations
…--
The optimist believes that he lives in the best of all possible worlds.
As does the pessimist.
|
Thanks, I've started working on them. |
This commit introduces a new OP to replace cases of
OP_ANONLISTandOP_ANONHASHwhere there are zero elements, which is very common inPerl code.
As an example,
my $x = {}is currently implemented like this:The
pushmarkserves no meaningful purpose when there are zeroelements and the
anonhash, besides undoing thepushmark,performs work that is unnecessary for this special case.
The peephole optimizer transforms this example into:
The following toy code shows the op-related speedup:
for (0 .. 100_000_000) { {} }With blead:
Patched: