string.gsub() doesn't accept a pattern with zero byte in a middle #8094

Open
Totktonada opened this issue Dec 25, 2022 · 2 comments
Labels
feature A new functionality

Comments

@Totktonada
Member

tarantool> string.gsub('\x00abc', '[%z\x01-\xff]', 'x')
---
- xxxx
- 4
...

-- Expected "xxxx".
tarantool> string.gsub('\x00abc', '[\x00-\xff]', 'x')
---
- error: malformed pattern (missing ']')
...

-- Expected "axxxd".
tarantool> string.gsub('ab\x00cd', 'b\x00c', 'xxx')
---
- "axxx\0cd"
- 1
...

Other functions that accept a pattern appear to be affected too; at least string.find() is.

@Totktonada added the "bug" (Something isn't working) label on Dec 25, 2022
@Buristan
Collaborator

Hi, Alexander!

As you can see from the Lua 5.1 Reference Manual for string.find():

This function does not accept string values containing embedded zeros, except as arguments to the q option.

PUC-Rio Lua has accepted embedded zeros in patterns since Lua 5.2.
So, we can implement this feature within full Lua 5.2 compatibility, if we want.

Also, see the same ticket in LuaJIT repo: LuaJIT/LuaJIT#759.
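
To make the version difference concrete, here is a minimal sketch of a runtime probe built from the third example in the report. The function name patterns_accept_zeros is hypothetical, not part of Tarantool, LuaJIT, or the Lua standard library. It assumes that a runtime following the Lua 5.1 rules stops reading the pattern at the first zero byte, as the output in the report suggests.

```lua
-- Hypothetical probe (not a Tarantool or LuaJIT API): reports whether the
-- running Lua treats a zero byte inside a pattern as an ordinary character.
local function patterns_accept_zeros()
    -- '\0' is a decimal escape, so it also parses on plain Lua 5.1.
    local result = string.gsub('ab\0cd', 'b\0c', 'xxx')
    -- If the pattern is cut at the zero byte, only 'b' is replaced
    -- and the result differs from 'axxxd'.
    return result == 'axxxd'
end

print(patterns_accept_zeros())
```

The probe should print false where patterns are cut at the zero byte and true where they are read in full, so code that must run on both kinds of runtime could branch on it.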

@Buristan removed the "bug" (Something isn't working) label on Dec 26, 2022
@Totktonada
Member Author

So, we can implement this feature within full Lua 5.2 compatibility, if we want.

We can implement it within partial Lua 5.2 compatibility as well, because the change is backward compatible.

As you can see from the Lua 5.1 Reference Manual for string.find():

This function does not accept string values containing embedded zeros, except as arguments to the q option.

The quote is about string.format(). However, you're right: the manual says the same for patterns:

A pattern cannot contain embedded zeros. Use %z instead.

So, technically speaking, this is not a bug. It is a feature request.
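
Until that is implemented, the manual's advice to use %z can be applied mechanically. The following is a minimal sketch, assuming a runtime that supports the %z class (Lua 5.1 and LuaJIT do). The helper name escape_zeros is hypothetical. It rewrites each zero byte in a pattern as %z and deliberately does not handle character-class ranges that start at zero, such as [\0-\xff], which still have to be written by hand as [%z\x01-\xff].

```lua
-- Hypothetical helper: rewrite every zero byte in a pattern as the %z class.
-- Limitation: a range inside a set that starts at zero, e.g. [\0-\xff],
-- is NOT translated correctly and must be rewritten manually.
local function escape_zeros(pattern)
    local parts = {}
    local i = 1
    while i <= #pattern do
        local ch = pattern:sub(i, i)
        if ch == '%' and i < #pattern then
            -- Keep escape pairs together; an escaped zero byte becomes %z.
            local nxt = pattern:sub(i + 1, i + 1)
            parts[#parts + 1] = (nxt == '\0') and '%z' or (ch .. nxt)
            i = i + 2
        elseif ch == '\0' then
            parts[#parts + 1] = '%z'
            i = i + 1
        else
            parts[#parts + 1] = ch
            i = i + 1
        end
    end
    return table.concat(parts)
end

print(escape_zeros('b\0c'))
```

On a runtime that follows the Lua 5.1 rules, string.gsub('ab\0cd', escape_zeros('b\0c'), 'xxx') should then give the expected "axxxd".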

@Totktonada added the "feature" (A new functionality) label on Dec 26, 2022