add a new warning against using \ in qw() #23403

Open: book wants to merge 2 commits into base: blead

book (Contributor) commented Jul 5, 2025

I've seen AI-generated code try to use qw() to build lists containing strings with embedded whitespace, using \ to "protect" the whitespace. Things like:

my @list = qw(
    foo
    bar\ baz
);

Just like occurrences of ',' and '#', I believe this should warn.
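
For reference, a minimal sketch of what the construct above evaluates to on a perl without the proposed warning. qw() splits its body on whitespace, and a backslash in front of a space is kept as a literal character, so the list has three elements rather than two:

```perl
use strict;
use warnings;

# The list from the description. The backslash does not protect the
# space: qw() still splits there and keeps the backslash on "bar".
my @list = qw(
    foo
    bar\ baz
);

printf "%d elements\n", scalar @list;   # 3 elements
print "[$_]\n" for @list;               # [foo] [bar\] [baz]
```

On a perl that includes this change, the same snippet would presumably also emit the new warning, since it runs under `use warnings`.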

Note that the warning will only be emitted when the \ is followed by actual whitespace, so code like the following (from lib/App/Cpan.pm) will not warn:

my $epic_fail_words = join '|',
        qw( Error stop(?:ping)? problems force not unsupported
                fail(?:ed)? Cannot\s+install );
  • This set of changes requires a perldelta entry, and it is included.
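
The rule described above (warn only when the backslash is directly followed by whitespace) can be illustrated with a small sketch. This is not the patch itself, which works inside the tokenizer; `qw_body_would_warn` is a hypothetical helper that applies the same condition to the raw text of a qw() body, and it deliberately ignores corner cases such as an escaped backslash (`\\`) followed by a space:

```perl
use strict;
use warnings;

# Hypothetical helper, for illustration only: true when the raw qw()
# body contains a backslash immediately followed by a whitespace character.
sub qw_body_would_warn {
    my ($body) = @_;
    return $body =~ /\\\s/ ? 1 : 0;
}

# Backslash followed by a space: the case the PR wants to flag.
print qw_body_would_warn('foo bar\ baz') ? "warn\n" : "ok\n";

# Backslash followed by "s" (a regex escape stored as text): not flagged.
print qw_body_would_warn('fail(?:ed)? Cannot\s+install') ? "warn\n" : "ok\n";
```

The first call prints "warn" and the second prints "ok", matching the two examples in the description.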

book marked this pull request as draft July 5, 2025 01:39

jkeenan (Contributor) left a comment

Have you ever seen this backslash-to-escape-whitespace-in-qw in code written by a human?

If not, then maybe we should not apply this p.r. so that we have a way to readily identify AI-generated code.

Grinnz (Contributor) commented Jul 5, 2025

This is adding a warning that the code will not do what was expected, which would not affect "identifying" the code. It seems a reasonable indication to me.

book (Contributor, Author) commented Jul 5, 2025

I saw this in AI-generated code, but it looked plausible enough that I had to double-check. ("What?! I didn't know you could do that... Oh. It turns out you can't." "You are absolutely right!")

Given the ubiquity of backslash as an escape character, it's reasonable to think that someone not fluent in Perl would try it.

The example in the description is what actually led me to show the warning only when immediately followed by whitespace.

guest20 commented Jul 5, 2025

This sounds like this warning will punish people who actually write code and run it in production just to make it slightly more convenient for people who do not write code to copy/paste generated nonsense. Is that valuable?

Nobody "fixed the language" when the markov bot on IRC produced code that almost looks right. And if that is useful, I think that at absolute minimum the request should come from an actual person who's had this problem. I don't think perl should pro-actively change in response to "umm, i saw a screen shot of a tweet on reddit that got cross-posted to my telegram group where chatGPT was wrong about programming".

Nobody reported this bug. Nobody ran this code. Nobody even wrote this code. I think the same person should fix it.

book (Contributor, Author) commented Jul 5, 2025

Who uses qw to create strings that end with \, and will be "punished" by this warning?

I think it's going to be the same person who reported the bug, ran and wrote such code. (Nobody.)

To be honest, I'm not surprised nobody reported the absence of a warning. The bug is not the absence of a warning, it's the expectation that \ will DWIM. And the code that was fixed was in that qw expression. Nothing to report, move along.

However, in my first encounter with AI-generated Perl code in a realistic Perl project, the AI spit that out.

I think this kind of "you're holding it wrong" warning will only show up for someone (or something 🤖) who makes that very understandable mistake of assuming you can protect whitespace with backslashes. And that is exactly the kind of helpful warning Perl has been dispensing for a very long time.

It could be argued that allowing \ (backslash-space) in qw would be the more valuable fix. We've spent almost 40 years without it, so that ship has definitely sailed.
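
Since qw() has no way to keep whitespace inside a word, the usual way to get such a list is an ordinary quoted list. A minimal sketch (the variable names are only illustrative):

```perl
use strict;
use warnings;

# An ordinary quoted list keeps the embedded space in one element.
my @list = ('foo', 'bar baz');

# qw() yields a plain list, so it can be mixed with quoted strings
# when only a few entries contain whitespace.
my @mixed = (qw(foo qux), 'bar baz');

print scalar(@list), "\n";    # 2
print scalar(@mixed), "\n";   # 3
```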

book force-pushed the book/qw-backslash branch 2 times, most recently from d5f5af1 to ee06f66 July 8, 2025 16:35
book marked this pull request as ready for review July 8, 2025 16:38

bulk88 (Contributor) commented Jul 8, 2025

> This sounds like this warning will punish people who actually write code and run it in production just to make it slightly more convenient for people who do not write code to copy/paste generated nonsense. Is that valuable?
>
> Nobody "fixed the language" when the markov bot on IRC produced code that almost looks right. And if that is useful, I think that at absolute minimum the request should come from an actual person who's had this problem. I don't think perl should pro-actively change in response to "umm, i saw a screen shot of a tweet on reddit that got cross-posted to my telegram group where chatGPT was wrong about programming".
>
> Nobody reported this bug. Nobody ran this code. Nobody even wrote this code. I think the same person should fix it.

You don't know what a 20-year-old college student with a sci/tech/eng degree is capable of when they try a task that is wayyy off their knowledge base. That said, I don't advocate applying Google-style fuzzing and security paranoia to the P5 interp.

I personally just use my significant other as a human guinea pig for Perl 5. We don't look at each other's screens or IDEs. If she can screw it up in Perl, then every beginner dev/sysadmin/gaming teenager can. In the old days, someone would get a paper book, or a man file that took 5-10 mins to download before the first keystroke. Nowadays it's "copy paste save run": if your code didn't instantly SEGV or throw a fatal runtime exception in DevTools, YOU ARE WINNING!

So very experienced non-AI humans from other programming languages, attempting to write "functions" in Perl, are basically AI fuzz testing. They did less than 20 seconds of reading on what Pearl is before trying to edit and compile a FOSS production-grade PP Perl 5 written OS package manager tool.

The warning sounds like it's the better choice, since P5P has NEVER EVER published a BNF grammar for Perl 5, and therefore no AI bots can write valid Perl 5 code. AI bots, after using a Bayesian filter on Perl 5, think Perl is 75% JavaScript, 20% shell scripting, 5% ISO C. And the 5% C made AI think it can use the CPP feature.

#line works great in Perl 5 remember!!!!

khwilliamson (Contributor) left a comment

Except for the nit in the pod, lgtm.

That this hasn't shown up until now indicates to me that this is unlikely to break code that wasn't already broken. So I think this should be merged.

book added 2 commits July 9, 2025 08:02
book force-pushed the book/qw-backslash branch from ee06f66 to 486d859 July 9, 2025 06:03
6 participants