gsub has trouble with \0 #759

Closed
mqnc opened this issue Oct 17, 2021 · 5 comments

@mqnc

mqnc commented Oct 17, 2021

txt1 = "\0"
print((txt1:gsub("\0", "o")))
-- lua:o jit:oo

txt2 = "\0\1\2"
print((txt2:gsub("\0\1\2", "o")))
-- lua:o jit:oooo

txt3 = "\0\1"
print((txt3:gsub("[\0]", "o")))
-- lua:o jit: malformed pattern (missing ']')

This is probably an issue with string termination in C, but it works fine in Lua.

@GitSparTV

LuaJIT is compatible with Lua 5.1, which has the same problem; \0 can only be used in patterns since Lua 5.2. Use %z instead.

@mqnc
Author

mqnc commented Oct 17, 2021

thank you!

@vanc

vanc commented Oct 18, 2021

GitSparTV wrote: "LuaJIT is compatible with Lua 5.1, which has the same problem; \0 can only be used in patterns since Lua 5.2. Use %z instead."

That doesn't seem to be the case. Lua 5.1.5 behaves the same as Lua 5.3 for the first two cases.

> txt1 = "\0"
> print((txt1:gsub("\0", "o")))
o
> txt2 = "\0\1\2"
> print((txt2:gsub("\0\1\2", "o")))
o
> txt3 = "\0\1"
> print((txt3:gsub("[\0]", "o")))
stdin:1: malformed pattern (missing ']')
stack traceback:
        [C]: in function 'gsub'
        stdin:1: in main chunk
        [C]: ?

For the third case, Lua 5.3 printed an unprintable character after the 'o', which should be the "\1".

$ lua5.3
Lua 5.3.3  Copyright (C) 1994-2016 Lua.org, PUC-Rio
> txt3 = "\0\1"
> print((txt3:gsub("[\0]", "o")))
o�
>

The "%z" does seem to work with LuaJIT. After replacing the "\0" in the patterns with "%z", LuaJIT behaves the same as Lua 5.3.
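
For reference, the three cases from the original report could be rewritten with %z roughly as follows. This is a minimal sketch assuming Lua 5.1 or LuaJIT pattern rules, where %z matches the zero byte; the variable names r1, r2 and r3 are only for illustration, and the results are checked by length and content because a console may not display the control characters.

```lua
-- Sketch assuming Lua 5.1 / LuaJIT pattern rules, where %z matches the zero byte.
-- The outer parentheses drop gsub's second return value (the match count).
local r1 = (("\0"):gsub("%z", "o"))          -- whole string replaced
local r2 = (("\0\1\2"):gsub("%z\1\2", "o"))  -- \1 and \2 are fine in a pattern
local r3 = (("\0\1"):gsub("[%z]", "o"))      -- %z also works inside a set

assert(r1 == "o")
assert(r2 == "o")
assert(r3 == "o\1")  -- the trailing \1 is kept but usually invisible on a console
print(#r1, #r2, #r3)
```

Only the zero byte needs the %z escape; the pattern string itself then contains no embedded zero, so it is no longer cut short.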

@MikePall
Member

You need to inspect the bytes of the returned string to see what's actually happening. Your console doesn't show some of these characters.

The pattern string is terminated at the first "\0", so the pattern is empty. And a gsub with an empty pattern has weird semantics.

Oh, and string.format("%q") is partially broken in Lua 5.1, but fixed in LuaJIT -- so don't use that one to inspect the bytes.

But all of this is moot, since the Lua 5.1 manual clearly says "A pattern cannot contain embedded zeros." So, don't do that.
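
To illustrate the empty-pattern semantics mentioned above: an empty pattern matches at every position, including the end of the string, so the replacement is inserted before each byte and once more after the last one. The sketch below passes an explicit empty pattern; that this is what LuaJIT effectively sees when the pattern starts with "\0" is taken from the explanation above, not verified separately here.

```lua
-- An empty pattern matches at every position, including the end of the string.
local s, n = ("abc"):gsub("", "o")
assert(s == "oaoboco" and n == 4)

-- Same idea with the strings from the report. A console typically hides
-- \0, \1 and \2, so only the inserted "o" characters are visible.
local s1 = ("\0"):gsub("", "o")
assert(s1 == "o\0o")          -- displayed as "oo"

local s2 = ("\0\1\2"):gsub("", "o")
assert(s2 == "o\0o\1o\2o")    -- displayed as "oooo"
assert(#s2 == 7)              -- 3 original bytes plus 4 inserted "o"
```

This would account for the "oo" and "oooo" shown for LuaJIT in the report: the original bytes are still in the result, just not visible.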

@GitSparTV

You can see the actual content using this function:

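-- Print every character of str followed by its byte value in parentheses.
function showbytes(str)
    local comma = false
    for i = 1, #str do
        -- Control characters may be invisible; the number in parentheses is what counts.
        io.write(comma and ", " or "", str:sub(i, i), " (", str:byte(i, i), ")")
        comma = true
    end
    io.write("\n")
    io.flush()
end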
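
Since the raw characters written by the function above can still be invisible on a console, a variant that returns only the numeric byte values may be easier to read and compare. The helper name bytelist is hypothetical and not part of the thread; this is a small sketch using only string.byte and table.concat.

```lua
-- Hypothetical helper (not from the thread): return the decimal byte values
-- of str as a space-separated string, without writing any raw characters.
function bytelist(str)
    local out = {}
    for i = 1, #str do
        out[#out + 1] = tostring(str:byte(i))
    end
    return table.concat(out, " ")
end

-- gsub returns two values; bytelist only uses the first (the result string).
print(bytelist(("\0\1"):gsub("%z", "o")))
print(bytelist(("\0"):gsub("", "o")))
```

With an explicit empty pattern, the second call makes the zero byte between the two inserted "o" characters visible as a plain 0.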
