Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support '|' in path #189

Merged
merged 2 commits into from Mar 20, 2024
Merged

Conversation

Arkaeriit
Copy link
Contributor

Using '|' as the data separator caused crashes if
a path with a '|' char in it was in the data file. Changing that separator to a null byte ensures
that this kind of issue can never happen again as
the null byte can't be in a path.

Using '|' as the data separator caused crashes if
a path with a '|' char in it was in the data file.
Changing that separator to a null byte ensures
that this kind of issue can never happen again as
the null byte can't be in a path.
@Arkaeriit
Copy link
Contributor Author

I rebased the branch onto master.

@skywind3000
Copy link
Owner

skywind3000 commented Mar 6, 2024

The original format is compatible with z.sh, I believe a lot of users may benefit from this PR,
before it get merged there are still some problems to deal with:

  1. after the upgrading, will the old history data corrupt ?
  2. if users have no idea about this upgrading (just run a clink update *), will they find all of they history data have been lost ?

Some possible options:

  1. make the separator configurable ? (via a environment variable), when people are changing it they know what they are doing.
  2. make some header in the new format so z.lua can figure out (maybe a "\0\0" marker in the first line) if it is a new format, if not it will convert the old format into the new one ?
  3. what about a binary format ?

@Arkaeriit
Copy link
Contributor Author

Thanks for the feedback. I didn't thought about migration so I definitively need to fix that. Option 2 seems the most sensible to me so I will go with it.

@Arkaeriit Arkaeriit changed the title Change separator in data file Support '|' in path Mar 20, 2024
@Arkaeriit
Copy link
Contributor Author

Turns out changing the separator is not needed. Improving the reading function is enough to fix the bugs and crashes with pipes in filenames.

A smarter reading function lets us support '|' in
filenames whilst still using it as the database
separator.
@skywind3000
Copy link
Owner

skywind3000 commented Mar 20, 2024

I like this solution, no need to convert the old data, the second argument of string:split() function can specify maximum split items:

z.lua/z.lua

Lines 138 to 152 in 3022099

function string:split(sSeparator, nMax, bRegexp)
assert(sSeparator ~= '')
assert(nMax == nil or nMax >= 1)
local aRecord = {}
if self:len() > 0 then
local bPlain = not bRegexp
nMax = nMax or -1
local nField, nStart = 1, 1
local nFirst, nLast = self:find(sSeparator, nStart, bPlain)
while nFirst and nMax ~= 0 do
aRecord[nField] = self:sub(nStart, nFirst - 1)
nField = nField + 1
nStart = nLast + 1
nFirst, nLast = self:find(sSeparator, nStart, bPlain)
nMax = nMax - 1

is it enough to just use

local part = string.split(line, '|', 3)

??

@Arkaeriit
Copy link
Contributor Author

Using the second argument of the split function merges the not split part at the end of the input and not at the start like we want.

line = "path|with|a|lot|of|pipes|1212|9999"
line:split("|", 3)) --> {"path", "with", "a", "lot|of|pipes|1212|9999"}

@skywind3000 skywind3000 merged commit 9112f0e into skywind3000:master Mar 20, 2024
@skywind3000
Copy link
Owner

got it, thank you for this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants