python-wrapper: Handle strings at the beginning.
This should fix #7366 for now, using the (IMHO) pragmatic approach of
extending the sed expression to recognize strings.

However, this approach is obviously not parsing the full AST, nor does
it wrap Python itself (as pointed out by @spwhitt in #7366) but tries to
match Python strings as best as possible without getting TOO unreadable.

We also use a little bit of Nix to help generate the SED expression,
because doing the whole quote matching block over and over again would
be quite repetitious and error-prone to change. The reason why I'm using
imap here is that we need to have unique labels to avoid jumping into
the wrong branch.
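
To illustrate the unique-label point, here is a rough shell analogue of
what the Nix helper does. It is only a sketch, not the actual
implementation: the names mk_string_skipper and gen_skippers are made up,
only the two triple-quote variants are generated, and the special
end-quote handling for single-character quotes is left out.

```shell
#!/bin/sh
# Hypothetical shell stand-in for mkStringSkipper: print one sed block
# for a quote variant, using a label derived from its index.
mk_string_skipper() {
  # $1 = label number, $2 = quote sequence
  label="q$1"
  quote="$2"
  cat <<EOF
/^ *[a-z]?${quote}/ {
  /${quote}${quote}|${quote}.*${quote}/{n;br}
  :${label}; n; /${quote}/{n;br}; b${label}
}
EOF
}

# Hypothetical stand-in for concatImapStrings: number the variants so
# that every generated block gets a different label (q1, q2, ...).
gen_skippers() {
  n=1
  for quote in '"""' "'''"; do
    mk_string_skipper "$n" "$quote"
    n=$((n + 1))
  done
}

gen_skippers
```

Each generated block loops on its own label, so the loop for one quote
variant never branches into the loop of another.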

So the new expression is not only able to match continuous regions of
triple-quoted strings, but also regions with only one quote character
(even with escaped inner quotes) and empty strings.
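
As a minimal sketch of the skipping behaviour, the following reduces the
expression to a single quote variant ("""). It assumes GNU sed; the
function name insert_argv0 and the fixed name "prog" (standing in for
$(basename "$i")) are made up for the example, and every sed command is
put on its own line for readability.

```shell
#!/bin/sh
# Simplified sketch: only """ strings are skipped, and "prog" is a
# hypothetical placeholder for the real script name.
insert_argv0() {
  sed -r '1 {
    /^#!/!b
    :r
    /\\$/{
      N
      b r
    }
    /__future__|^ *(#.*)?$/{
      n
      b r
    }
    /^ *[a-z]?"""/{
      /""".*"""/{
        n
        b r
      }
      :q
      n
      /"""/{
        n
        b r
      }
      b q
    }
    /^ *[^# ]/i\
import sys; sys.argv[0] = "prog"
  }'
}

# A script that starts with a multi-line docstring followed by a
# __future__ import; the inserted line should land after both.
printf '%s\n' \
  '#!/usr/bin/env python' \
  '"""Module' \
  'docstring."""' \
  'from __future__ import print_function' \
  'print("hi")' | insert_argv0
```

With this input the import line ends up directly before print("hi"),
after the docstring and the __future__ import.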

However, what it doesn't correctly recognize is something like this:

"string1" "string2" "multi
line
string"

It is very unlikely that we'll find something like this in the wild.
Of course, we could handle it as well, but it would mean that we need to
substitute the current line into hold space until we're finished parsing
the strings, branch off to another label where we match multiline
strings of all sorts and swap hold/pattern space and finally print the
result. So to summarize: The SED expression would be 3 to 4 times bigger
than now and we gain very little from that.
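
The limitation can be reproduced with a reduced version of the
expression. This is a sketch under assumptions: it handles only """
strings, uses a triple-quoted variant of the input above, assumes GNU
sed, and uses the made-up names insert_argv0 and "prog".

```shell
#!/bin/sh
# Simplified sketch: only """ strings are skipped, and "prog" is a
# hypothetical placeholder for the real script name.
insert_argv0() {
  sed -r '1 {
    /^#!/!b
    :r
    /__future__|^ *(#.*)?$/{
      n
      b r
    }
    /^ *[a-z]?"""/{
      /""".*"""/{
        n
        b r
      }
      :q
      n
      /"""/{
        n
        b r
      }
      b q
    }
    /^ *[^# ]/i\
import sys; sys.argv[0] = "prog"
  }'
}

# The first line of the string block already contains a complete
# string, so the block is treated as finished after that line.
printf '%s\n' \
  '#!/usr/bin/env python' \
  '"""a""" """multi' \
  'line' \
  'string"""' \
  'print("hi")' | insert_argv0
```

Here the import line is inserted before "line", i.e. inside the string
that is still open, which is the case the expression does not cover.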

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
aszlig authored and Lindstroem committed Apr 16, 2015
1 parent f9b02fb commit d4460d1
Showing 2 changed files with 27 additions and 6 deletions.
7 changes: 1 addition & 6 deletions pkgs/development/python-modules/generic/wrap.sh
@@ -26,12 +26,7 @@ wrapPythonProgramsIn() {
 # dont wrap EGG-INFO scripts since they are called from python
 if echo "$i" | grep -v EGG-INFO/scripts; then
 echo "wrapping \`$i'..."
-sed -i "$i" -re '1 {
-/^#!/!b; :r
-/\\$/{N;b r}
-/__future__|^ *(#.*)?$/{n;b r}
-/^ *[^# ]/i import sys; sys.argv[0] = '"'$(basename "$i")'"'
-}'
+sed -i "$i" -re '@magicalSedExpression@'
 wrapProgram "$i" \
 --prefix PYTHONPATH ":" $program_PYTHONPATH \
 --prefix PATH ":" $program_PATH
26 changes: 26 additions & 0 deletions pkgs/top-level/python-packages.nix
@@ -50,6 +50,32 @@ let
 { deps = pkgs.makeWrapper;
 substitutions.libPrefix = python.libPrefix;
 substitutions.executable = "${python}/bin/${python.executable}";
+substitutions.magicalSedExpression = let
+# Looks weird? Of course, it's between single quoted shell strings.
+# NOTE: Order DOES matter here, so single character quotes need to be
+# at the last position.
+quoteVariants = [ "'\"'''\"'" "\"\"\"" "\"" "'\"'\"'" ]; # hey Vim: ''
+
+mkStringSkipper = labelNum: quote: let
+label = "q${toString labelNum}";
+isSingle = elem quote [ "\"" "'\"'\"'" ];
+endQuote = if isSingle then "[^\\\\]${quote}" else quote;
+in ''
+/^ *[a-z]?${quote}/ {
+/${quote}${quote}|${quote}.*${endQuote}/{n;br}
+:${label}; n; /^${quote}/{n;br}; /${endQuote}/{n;br}; b${label}
+}
+'';
+
+in ''
+1 {
+/^#!/!b; :r
+/\\$/{N;br}
+/__future__|^ *(#.*)?$/{n;br}
+${concatImapStrings mkStringSkipper quoteVariants}
+/^ *[^# ]/i import sys; sys.argv[0] = '"'$(basename "$i")'"'
+}
+'';
 }
 ../development/python-modules/generic/wrap.sh;
