-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug in printf with mix and match of % and %x$ #324
Comments
I did not understand the title of this issue. But after some trail-and-error and some investigation I believe I understand the trouble you expressed. Trouble occurs when x$ aka parameter field is used. (see n$ parameter field ). Example: Reorder parameters and repeat the 2nd:
More troubled output:
|
Yall, I bumped into a problem the other day, and after investigating, think it all converge to this #324, as @hyenias stated, the title is confusing as (to me) the problem as nothing to do with %b (or conversion specifier more generally). To me the title could be In the following examples
Examples WITHOUT mix'n'match (all results correct)
Examples WITH mix'n'match
Indeed when I say 'should' is subject to my interpretation, and it seems that is not normalised, to me indexed access don't advance parameter access in argv[], just pick argv[ndx] while sequential (positional) access do consume (advance) argv[]. There is 2 ways to fix that.
The C way
As we see here C complains about using %1$ while %s as been used already, yet this is a warning, and result is correct. A more complete example showing indexed access don't advance anything while sequential access do advance parameter access.
As we see here despite the warning indexed access access the right argv[ndx] while sequential access do avance in argv[] without bad side effect on argv[ndx]. More thorogh example on ksh side shows that mix'n'match is bugged because sequential access do increment argv[], then subsequent [ndx] are offsetted then screwed... I investigate later... |
Hi All, The patch is posted at In a nutshell the patch behave like if the printf format string is pre-processed before working replacing any sequential argv[] access % and * (with * being width/prec) with its auto mapped index equivalent %x$ *x$ The printf(3) I have says.
This is how the patch behave, yet internally this is not what is implemented, I tried to not touch any existing utility function like sffmtpos(), but instead work around its deficiencies. The same with extend(), instead of changing it I implement yet another function around it reload() to work around the type mistmatch clash, the extend() being unchanged in case it is used by other code than printf (may be scanf)... I used this script to generate the printf.sh I added in the test suite, this script can be used locally to double check the results we got, we can use it with zsh as well, and probably with other shell I don't have. Only when reviewer and testers have double check all this we could accept this patch. As it is now, the output are almost on pare with zsh, when diff occurs, they are zsh deficiencies (IMHO), but indeed this is debatable. Again this snipped is for generating src/cmd/ksh93/tests/printf.sh so it may look ugly, yet it is usable locally
This can be used like, -b stand for bugged, -f stand for fixed
A fragment of an
The Ignore the So lemme know for any trouble with that so I can peek/poke a bit more in there if needed. |
See #613 for notes and such. |
@stephane-chazelas reported: > $ printf '%1$s %1$b\n' '\\\\' > \\\\ b > $ printf '%b %1$s\n' '\\\\' > \\ > > expected: > > $ printf '%1$s %1$b\n' '\\\\' > \\\\ \\ > $ printf '%b %1$s\n' '\\\\' > \\ \\\\ @hyenias commented: > After some trial-and-error and some investigation I believe I > understand the trouble you expressed. Trouble occurs when x$ aka > parameter field is used[*]. > > Example: Reorder parameters and repeat the 2nd: > > $ printf '%3$s %2$s %1$s %2$s\n' a b c > c b a b > > More troubled output: > > $ printf '%b %1$s\n' '\\\\' > \\ > $ printf '%1$b %1$s\n' '\\\\' > \\ \\ > $ printf '%s %1$s\n' '\\\\' > \\\\ > $ printf '%s %1$s\n' 'a' > a > $ printf '%1$s %1$s\n' 'a' > a a > > [*] See n$ parameter field: https://en.wikipedia.org/wiki/Printf Analysis and fix: printf format string can handle sequential argv[] access via % and * as well as indexed argv[] access via %x$ *x$ (* used for width/prec) In a given format string it is possible to mix'n'match sequential/ indexed access, and mix'n'match the argv[] types. '%*2$s' 1 2 Indexed access for 2$ making argv[1] an int, seq access for 's' then accessing arv[0] for s, here we mix seq/ndx access. '%*2$s %s' 1 2 Same as above, the 1st format makes argv[0] a 'string' argv[1] an 'int' then the second format make argv[1] a 'string' this is a mixed type access for arv[1]. The former algo for printf-format handling builtin (say printf) is to save the builtin argv[] in a data structure called struct printf located in src/cmd/ksh93/bltins/print.c. This is located in a .c file, meaning it is private to the ksh builtin. Yet, all the code to scan a printf format string (not limited to printf, may be scanf too) is located in lib/libast/sfio/ meaning all sorts of jazz to to setup and use the private struct printf access at sfio level via function pointers. This printf data struct has a member called nextarg[] that is initially setup to the builtin argv[]. It is called nextarg, because on sequential access the % or * format access *nextarg++, meaning each sequential access, consumes an arg. This is a major design flaw, because it can't support indexed access %x$ *x$ very well. For this, on 1st indexed access, a new sfio level (not builtin) data structure is built called fp[]. It rescans the format string to find out all the indexed access, enters them in this kind of cache, recording the type/value of the argv[x]. Now another problem occurs: the type entered is what we see on 1st index 'x' occurence, so if we got '%2$d %2$f' then for occurence 2 i.e. fp[1] we enter type 'd' and fetch the value nextarg[1]. Since nextarg may have been moved forward by sequential access, it provides a bogus value. Then later access to %$2f will retrieve the cached value for fp[1], find an 'int', and ksh is unable to convert it to 'double' and all sorts of crash occur. This is a complete design flaw, may be due to initial design for sequential access only (like bash) then piled on a kludge to try an indexed access. This patch will leave all the sfio infrastructure (functions) untouched, yet will implement a workaround on each flaw. The reasons not to touch the existing functions is that the format scan may be used by scanf() and I'm not fixing scanf for now. The workaroud is based on adding more members in the struct printf data structure, basically add argv0, the inital backup of argv[] received by the builtin. nextarg is left unchanged, the former infrastructure still maintained i.e. advance; they are tied to sequential access. Yet, indexed access is the argv0[x] access with lazy conversion, i.e. type conversion on demand, i.e. late in the printf() process. In short, this fix works 'as if' the format string is pre-pocessed, replacing any non-indexed access (i.e. seq access) by an explicitly computed index access, the index generated being simply the seq index i.e. something similar to: +----++------+-++-----------+-- Seq | || | || ndx ---+ | v vv v vv v v '%s %*s %*.*s %*1$s' expanded to (ignore spaces) '%1$s %3$*2$s %6$*4$.*5$s %7$*1$s' ^ ^ ^ ^ ^ ^ ^ ^ | | | | | | | | +----+--+----+--+---+----+--+-- Computed Seq index. Note the index inversion due to the syntax of the format string printf '%*.*s' width prec data; is equivalent to printf '%3$*1$.*2$' width prec data; Here I keep all the infrastructure, yet working around bugs. src/lib/libast/sfio/sfvprintf.c: - Implement argp, argn processing so that each argv[] cached is on its right position. - Use nargs/xargs make sure we accees the correct argv[] position. - argv[] vs fp[] type mismatch we do a late argv[] evaluation (uncached). src/lib/libast/sfio/sftable.c: - sffmtpos() duplicates the format string processing done here in sfvprintf(), so we implement the same argp, argn processing so that each argv[] cached is on its right position. src/cmd/ksh93/bltins/print.c: - Add reload() function. It is called when the caller wants to solve a fp[x] cache miss, i.e., a type mismatch between the cached type/value and the desired new type. It does the same as extend(), i.e. setup type/value for a given argp and then bump nextarg, but also saves/restores nextarg. This because 'maybe' extend is used by other code (scanf?). Resolves: #324
The second part of the patch from one of the PR comments[*] didn't make it into the PR, so it failed to fix one of the originally reported reproducers. This patch applies that fix. This corrects the following reproducer: $ printf '%b %1$s\n' '\\\\' \\ to: $ printf '%b %1$s\n' '\\\\' \\ \\\\ [*] #617 (comment)
The second part of the patch from one of the PR comments[*] didn't make it into the PR, so it failed to fix one of the originally reported reproducers. This patch applies that fix. This corrects the following reproducer: $ printf '%b %1$s\n' '\\\\' \\ to: $ printf '%b %1$s\n' '\\\\' \\ \\\\ [*] #617 (comment)
The second part of the patch from one of the PR comments[*] didn't make it into the PR, so it failed to fix one of the originally reported reproducers. This patch applies that fix. This corrects the following reproducer: $ printf '%b %1$s\n' '\\\\' \\ \\ to: $ printf '%b %1$s\n' '\\\\' \\ \\\\ [*] #617 (comment) Co-authored-by: Martijn Dekker <martijn@inlv.org>
The second part of the patch from one of the PR comments[*] didn't make it into the PR, so it failed to fix one of the originally reported reproducers. This patch applies that fix. This corrects the following reproducer: $ printf '%b %1$s\n' '\\\\' \\ \\ to: $ printf '%b %1$s\n' '\\\\' \\ \\\\ [*] #617 (comment) Co-authored-by: Martijn Dekker <martijn@inlv.org>
expected:
(I don't know if it's limited to %b or if that applies to other format specifiers)
(issue already mentioned on the austin-group-l mailing list where %x$y is being discussed atm)
The text was updated successfully, but these errors were encountered: