Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hints hash should not store references #37

Closed
atoomic opened this issue Dec 20, 2018 · 5 comments
Closed

Hints hash should not store references #37

atoomic opened this issue Dec 20, 2018 · 5 comments

Comments

@atoomic
Copy link

atoomic commented Dec 20, 2018

This was discussed already on RT:

But I think it worth giving this some visibility there.
It's a bad idea to store references in the Hash Hint a.k.a. %^H

A quick grep show that both the .pm and .xs code need some fixup

Changes:32:          - rewrite pragma implementation and the way %^H is used
Parameters.xs:2172:            croak("%s: internal error: $^H{'%s'} not a hashref: %"SVf, MY_PKG, HINTK_CONFIG, SVfARG(sv));
Parameters.xs:2182:            croak("%s: internal error: $^H{'%s'}{'%.*s'} not a hashref: %"SVf, MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, SVfARG(sv));
Parameters.xs:2205:        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'} not set", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_ ## NAME); \
Parameters.xs:2218:            croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'} not a coderef: %"SVf, MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_REIFY, SVfARG(sv));
Parameters.xs:2234:                    croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: expected '$', found '%.*s'", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (int)(sv_p_end - p), p);
Parameters.xs:2238:                    croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: expected idfirst, found '%.*s'", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (int)(sv_p_end - p), p);
Parameters.xs:2246:                    croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: can't use global $_ as a parameter", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT);
Parameters.xs:2252:                            croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: %"SVf" can't appear twice", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, SVfARG((*ppspec)->shift.data[i].name));
Parameters.xs:2269:                            croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'} not an arrayref: %"SVf, MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIF2, SVfARG(sv));
Parameters.xs:2274:                        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: tix [%ld] out of range [%ld]", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (long)tix, (long)(av_len(shift_types) + 1));
Parameters.xs:2278:                        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: tix [%ld] doesn't exist", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (long)tix);
Parameters.xs:2282:                        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: tix [%ld] is not an object (%"SVf")", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (long)tix, SVfARG(type));
Parameters.xs:2289:                        croak("%s: internal error: $^H{'%s'}{'%.*s'}{'%s'}: expected ' ', found '%.*s'", MY_PKG, HINTK_CONFIG, (int)kw_len, kw_ptr, HINTSK_SHIFT, (int)(sv_p_end - p), p);
lib/Function/Parameters.pm:351:    my %config = %{$^H{+HINTK_CONFIG} // {}};
lib/Function/Parameters.pm:378:    $^H{+HINTK_CONFIG} = \%config;
lib/Function/Parameters.pm:385:        delete $^H{+HINTK_CONFIG};
lib/Function/Parameters.pm:389:    my %config = %{$^H{+HINTK_CONFIG}};
lib/Function/Parameters.pm:391:    $^H{+HINTK_CONFIG} = \%config;

It appears that this is the only CPAN module to store HINTK_CONFIG there so this should be safe, to move it to a better location.

https://grep.metacpan.org/search?size=20&q=HINTK_CONFIG&qd=&qft=

Maybe a GV in Function::Parameters itself?

@mauke
Copy link
Owner

mauke commented Mar 21, 2023

The problem can be reproduced without Function::Parameters or Regexp::Grammars:

#!/usr/bin/env perl
use strict;
use warnings;

use re 'eval';
use overload ();
BEGIN {
    overload::constant qr => sub { 
        qq{(?{ "\\N{EURO SIGN}" })}
    };
}

use charnames qw(:full);
qr/a/;
__END__

Result:

Undefined subroutine &main::CODE(0x564e9441abe0) called at (eval 5) line 1.

I cribbed the use of %^H from charnames.pm. If Function::Parameters is broken, then so is the charnames pragma.
(I still need to understand WTF is going on in this code, though.)

PS: HINTK_CONFIG is an internal constant; you'd have to search for Function::Parameters/config to find real users: https://grep.metacpan.org/search?q=Function%3A%3AParameters%2Fconfig&qd=&qft=

@mauke
Copy link
Owner

mauke commented Mar 21, 2023

I've reported it as Perl/perl5#20950 for now.

@demerphq
Copy link

demerphq commented Mar 21, 2023

I dont really get what you want to do here.

BEGIN {
    overload::constant qr => sub { 
        qq{(?{ "\\N{EURO SIGN}" })}
    };
}

Is it deliberate you are using (?{ ... }) and not (??{ ... }) ?

I dont really get what you are trying to do here. You want to change all qr//'s to return a deferred pattern match the euro-sign?

It feels like there are multiple ways to do what you want to do that would work.

@mauke
Copy link
Owner

mauke commented Mar 21, 2023

@demerphq As mentioned in https://rt.cpan.org/Ticket/Display.html?id=128044, what the user was trying to do was:

use Function::Parameters;
use Regexp::Grammars;
qr{ <grammar:Sympa::Scenario> }x;

Which results in the this error:

Function::Parameters: internal error: $^H{'Function::Parameters/config'} not a hashref: HASH(0x56276feec8c0) at (eval 7) line 1.

As you can see on https://metacpan.org/release/DCONWAY/Regexp-Grammars-1.058/source/lib/Regexp/Grammars.pm, the Regexp::Grammars module is 2700 lines of code, so naturally I tried to reduce the code to something simpler first.

Regexp::Grammars does many different things. Among them:

The code for the latter looks like

        return qq{(?{
            warn "Can't match directly against a pure grammar: <grammar: $grammar_name>\n";
        })(*COMMIT)(?!)};

So at this point, we have an overloaded qr handler that returns an object that has an overloaded "" (stringification) handler that returns a regex fragment that, when executed, dynamically calls warn and always fails to match.

This is what triggers the error inside Function::Parameters. However, despite what Damian claims, you can put references in %^H. In fact, this is documented as part of the public interfaces of the core charnames pragma:

The mechanism of translation of \N{...} escapes is general and not hardwired into charnames.pm. A module can install custom translations (inside the scope which uses the module) with the following magic incantation:

sub import {
    shift;
    $^H{charnames} = \&translator;
}

Function::Parameters uses a similar mechanism. Thus, as a general rule, any code that breaks because Function::Parameters puts references in %^H also breaks when charnames is used.

That's why my reduced code uses charnames instead of Function::Parameters. All other elements were taken from Regexp::Grammars. I later realized that overloading is not necessary to reproduce the issue:

$ perl -e 'use re "eval"; use charnames ":full"; my $x = qq<(?{ "\\N{COMMA}" })>; qr/$x/'
Undefined subroutine &main::CODE(0x56439fd0a6d0) called at (eval 5) line 1.

All you need is dynamically eval'd regex code that tries to read what should be a reference from %^H. This can be the charnames handler (triggered by \N{...} in double-quotish context) or Function::Parameter's config settings for the current lexical scope (triggered by its keyword plugin and any bareword in the code, such as warn ).

@mauke
Copy link
Owner

mauke commented Mar 31, 2023

I've put a workaround into Function::Parameters. If the %^H entry is found not to be a reference, the former error is now downgraded to a warning (which can be disabled with no warnings 'Function::Parameters') and Function::Parameters disables itself in that scope (but could be re-enabled with an explicit use Function::Parameters).

BTW, the reason why this error even happens for qr{ <grammar:Sympa::Scenario> }x; (which contains no keywords) is a bit convoluted:

  • Regexp:Grammars enables use re 'eval' in the calling scope and rewrites the regex (using qr overloading) into something using embedded (?{ ... }) blocks.
  • Perl implements re 'eval' by taking the regex string at runtime, slapping a literal qr/.../ around it, and then calling eval.
  • The keyword plugin in Function::Parameters then gets called for "qr" and chokes because unlike a real eval op, the implicit eval hidden inside the regex engine doesn't capture the %^H context properly.

@mauke mauke closed this as completed Mar 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants