-
Notifications
You must be signed in to change notification settings - Fork 31
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[v1.1] typeset -f <functname>: generate script using sh_deparse()
Problem reproducer: - Locate or write a script with a function definition - Source the script with '.' or 'source' - Print the function definition with 'typeset -f'; looks good - Edit the file to add or remove characters before or in the function definition, or rename or remove the file - Print the function definition with 'typeset -f' again; the output is corrupted! Cause: ksh remembers the offset[*1] and length[*2] of the function definition in the source file, and prints function definitions by simply dumping that amount of bytes from that offset in the source file. This will obviously break if the file is edited or (re)moved, so that approach is fundamentally flawed and needs to be replaced. All other shells that are capable of printing function definitions (bash, mksh, yash, zsh) regenerate the function definition from the parse tree. ksh does include code for a deparser (in deparse.c) but it is disused (so it was removed in the 1.0 branch). This commit changes 'typeset -f' to use it, as well as 'typeset -T' when printing type discipline functions. It also applies lots of tweaks and fixes to the deparser. More fixes might be needed. So, it's time to test this. [*1] functloc in struct functnod --> hoffset in struct Ufunction [*2] repurposed functline in struct functnod --> nvsize in struct Namval src/cmd/ksh93/sh/deparse.c: - Too many tweaks to mention for more correct script generation and better formatting; this is the result of intermittent work over about a year so I don't quite remember everything. - sh_deparse(): Add argument for initial indentation level. - p_keyword(): Convert the BEGIN/MIDDLE/END argument into a bitmask and add a NOTAB bit flag to avoid printing a tab after a keyword. src/cmd/ksh93/sh/string.c: sh_fmtq(): - For assignments, avoid quoting variable names containing a dot, e.g. .namespace.foo=bar should not become '.namespace.foo=bar'. - Avoid quoting the variable name in additive assignments, e.g. foo+=bar\ baz should become foo+='bar baz', not 'foo+=bar baz'. src/cmd/ksh93/include/shnodes.h, src/cmd/ksh93/sh/parse.c: - funct(): Do not save function location, offset and length. src/cmd/ksh93/bltins/typeset.c: print_namval(): - For 'typeset -f functionname', instead of retrieving the function definition from the saved position in its source file, call sh_deparse() to generate script from the parse tree. Start indentation at level 0. src/cmd/ksh93/bltins/typeset.c: sh_outtype(): - For type discipline functions output by 'typeset -T typename', instead of retrieving the function definition from the saved position in its source file, call sh_deparse() to generate script from the parse tree. Start indentation at level 1.
- Loading branch information
Showing
15 changed files
with
105 additions
and
128 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
f21ace5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW this commit appears to have fixed a rather old bug that caused truncated
typeset -f
output (att#25).Output comparison:
The resulting output may not be perfect, but it is usable for reinput to the shell (certainly an improvement over truncated output).
f21ace5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that observation. Looks like deparse.c needs a cosmetic tweak for nested function definitions.
f21ace5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For finding other deparser issues, shcomp could be a useful tool. The following patch adds
shcomp --deparse
and modifies shtests to test deparsed scripts:Below is an attached log file containing the results of
bin/shtests -c
after applying the above patch. There are quite a lot of test failures.log.txt
Additionally, comparing differences in a script after deparsing it can be done with a
diff
command like the one below:diff -u src/cmd/ksh93/tests/arrays.sh <(arch/*/bin/shcomp --deparse src/cmd/ksh93/tests/arrays.sh)
f21ace5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of 0975086, nested function definitions should be correctly indented.
f21ace5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very interesting idea, I will be trying that out. I can see at least one problem: aliases. The deparser output has all aliases resolved. But some aliases resolve to words that may be other aliases or even the same alias -- err_exit is one example of that. So the deparser output (assuming no bugs) is only guaranteed to be correct for reinput if no aliases are defined.
mksh deals with this issue by prefixing all words that are in a position where alias expansion may happen with a backslash. I think the effect of that is obnoxious when reading the output as a human, so I've no intention of following that. IMO, it is reasonable to require that no aliases are defined when reusing deparser output as input.
In the meantime I've already found a deparser bug that I don't know how to solve, so I'll create an issue.