private/python_ipc_system: Convert to Cygwin path when needed. #1195
It should work in theory. I did some tests using shim but not in a real cygwin-like |
Force-pushed from 46ad605 to 1df0ae0.
Nice work! Maybe we can have |
Is there a reason why you prefer opening a pipe over using Edit: Maybe also check if |
On Tuesday, July 19th, 2022 at 7:54 AM, Markus Mützel wrote:
> Is there a reason why you prefer opening a pipe over using `system`?
> Does octsympy try to be Matlab compatible? Afaict, `popen2` is an Octave-specific function.
> On POSIX platforms, `popen2` is just an Octave-wrapper for the native function of the same name. But that function doesn't exist on Windows afaict. Octave tries to emulate that by wrapping around some Windows API functions. While that should work for most cases, it might be "safer" to just use `system` instead. (Unless there is a good reason for using `popen2`.)

I was worried about the use of "system" because of #1143 (escaping strings is error-prone and shell-dependent). It's much better to call a subprocess without going through a shell. Anyway, we have to stick with "system" for now as "popen2" is not portable. I have in mind an idea of a portable "system2" function, but we can worry about that later.
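To illustrate the concern about shell escaping, here is a minimal Python sketch (not the Octave code under review, and not the proposed "system2"). It assumes a made-up file name, `tricky_name`, containing shell metacharacters, and passes it to a child process as an argument list so that no shell ever parses it. The helper name `run_without_shell` is hypothetical.

```python
import subprocess
import sys


def run_without_shell(argv):
    # Run the program given as an argument list; no shell is involved,
    # so nothing in argv needs to be quoted or escaped.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical file name containing characters a shell would interpret.
tricky_name = "tmp file; echo injected $(whoami).py"

# The child process simply writes back the argument it received.
child_code = "import sys; sys.stdout.write(sys.argv[1])"
echoed = run_without_shell([sys.executable, "-c", child_code, tricky_name])

print(echoed == tricky_name)
```

With a string-based call that goes through a shell, the same name would have to be quoted correctly for whichever shell happens to be in use, which is the error-prone part mentioned above.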
IIUC, it is true that a malevolent user might be able to execute arbitrary programs by passing specially crafted strings. But I'd say that this risk is pretty low if the string to be executed is essentially a literal. (Or the risk is basically the same with |
On Tuesday, July 19th, 2022 at 11:44 AM, Markus Mützel wrote:
> IIUC, it is true that a malevolent user might be able to execute arbitrary programs by passing specially crafted strings. But I'd say that this risk is pretty low if the string to be executed is essentially a literal. (Or the risk is basically the same with `popen2`, e.g., if binaries are replaced.)

The use of "system" in "python_env_is_cygwin_like" is probably fine since the input is a literal, but the input to "cygpath" can be anything. I think it'd be much easier to have a secure building block so we don't have to worry about how it's used in the future (the "cygpath" function could be used elsewhere to solve similar issues).

> On the other hand, if a user is malevolent, they might as well just use the `system` function directly in Octave. I don't currently see why they would try to route their commands through this package...

The risk of "system" is probably low for a local user running Octave themselves, which is the most common scenario. The problem arises when someone tries to use octsympy as a backend remotely, say to evaluate symbolic expressions on a website for educational purposes. This could potentially be combined with other vulnerabilities to get an exploit.
The input to But to get this PR back on track: Your changes look good. But at least for the sake of compatibility with Matlab, it might be better to use |
Yes, to some extent. But the popen2 IPC stuff is Octave-only, and is faster than |
I'd guess opening a pipe is only faster if there is repeated communication over the pipe. So, you don't need to spawn a new process each time. |
Force-pushed from 1df0ae0 to 76f1b39.
inst/private/cygpath.m (Outdated)
    match = regexp(out, '.*', 'match', 'dotexceptnewline');
    assert (length (match) == 1);
    posix_path = match{1};
I'm not entirely sure what these lines are used for. But in case you just want to strip a trailing newline, you could probably also use `strtrim`. Or am I missing something?
I was thinking of using `strtrim` / `deblank`, but they also remove (horizontal) whitespace (`strtrim` from both ends, `deblank` from the end). AFAIK, whitespace is pretty common in Windows paths. Do you know of any function which is similar to Python's `rstrip('\n')`?
You are right: Spaces in file names are (unfortunately) pretty common in Windows paths. But I tried to rename a file in Windows Explorer such that it ends with a space (" ") character. It looks like it is not possible to do that. That space is automatically stripped from the file name. So, using `strtrim` or `deblank` would most probably do the "correct thing". (Especially with the automatically generated file names this changeset is about.)
Using those functions might also make it a bit easier to understand what this part of the code is actually meant to be doing. (An additional comment might also help.)
Edit: Same for leading spaces.
Agreed, the current approach is too clumsy... Simplified, and added a comment as well.
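The trade-off discussed in this thread can be sketched in Python (the language of the `rstrip('\n')` idiom mentioned above; this is not the Octave code that was committed). The sketch assumes a made-up command output, `raw`, with leading and trailing spaces, and a hypothetical helper `strip_line_ending` that removes at most one trailing line terminator. It compares that with stripping all surrounding whitespace, which is roughly what `strtrim` does in Octave.

```python
def strip_line_ending(text):
    # Remove at most one trailing line terminator (LF or CRLF) and
    # leave every other character, including spaces, untouched.
    if text.endswith("\r\n"):
        return text[:-2]
    if text.endswith("\n"):
        return text[:-1]
    return text


# Hypothetical output of a path-converting command, padded with spaces.
raw = "  /cygdrive/c/Program Files/tmp file.py \n"

only_newline = strip_line_ending(raw)   # keeps the surrounding spaces
all_whitespace = raw.strip()            # drops them as well

print(repr(only_newline))
print(repr(all_whitespace))
```

Spaces inside the path survive in both variants; the two differ only at the ends of the string, which is why the simpler whitespace-stripping call was judged acceptable for the generated file names at issue here.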
inst/private/python_ipc_system.m (Outdated)
    [status, out] = system ([pyexec ' ' tmpfilename]);

    if python_env_is_cygwin_like (pyexec)
      converted_tmpfilename = cygpath (tmpfilename);
Do you still need the original Windows path of the temporary file?
If not, you could also re-use the existing variable and simplify this portion of the code a bit.
I mostly code in a functional style, avoiding unnecessary mutations, maintaining referential transparency, etc. whenever possible. I would prefer using a new variable instead of overwriting old ones. (Basically, it's the same "use `const` by default" advice provided by many ES6 coding styles.)
Sadly, we can't use a conditional expression. Otherwise, it'd be a one-liner: `converted_tmpfilename = cygpath (tmpfilename) if python_env_is_cygwin_like (pyexec) else tmpfilename`
I guess that is a question of style and preference.
My personal preference is to avoid else branches if they "do nothing".
No strong feelings here, but in this case I'd probably do as @mmuetzel. A few years of Python coding has also made me averse to `else` (unless needed of course!)
I personally find it more difficult to mentally keep track of all the places where a mutation (of binding or value) has occurred. I prefer to treat variable declarations as mathematical definitions, which cannot be changed in the middle of a proof (an example of programming in the functional style). Anyway, I've removed the else branch in this simple case as both of you find it less readable...
Force-pushed from 76f1b39 to 9558e9a.
    if ~isempty (python_env_is_cygwin_like_memo)
      r = python_env_is_cygwin_like_memo;
    elseif ispc ()
Here I would definitely do an early return. Then `if ispc ()`. For me, this makes the persistent semantics clearer.
BTW, I like `_memo` for such variables! I shall try to remember that.
> here I would definitely do an early return. Then `if ispc ()`. For me, this makes the persistent semantics clearer.

Makes sense to me; it also makes the code look more symmetric.

> BTW, I like `_memo` for such variables! I shall try to remember that.

The name is borrowed from memoization. I think I saw it somewhere as well. This is one of the reasons why pure functions (and referential transparency) are good!
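As a rough illustration of the early-return shape suggested above, here is a Python sketch of a memoized check. It is an analogy only: the module-level `_is_cygwin_like_memo` plays the role of the Octave `persistent` variable, and `_probe_environment`, `probe_calls` and `env_is_cygwin_like` are hypothetical names with a stand-in result, not the logic of `python_env_is_cygwin_like`.

```python
_is_cygwin_like_memo = None   # cached answer; None means "not computed yet"
probe_calls = 0               # counts how often the expensive check runs


def _probe_environment():
    # Stand-in for the expensive check (hypothetical; always returns False).
    global probe_calls
    probe_calls += 1
    return False


def env_is_cygwin_like():
    global _is_cygwin_like_memo
    if _is_cygwin_like_memo is not None:
        return _is_cygwin_like_memo          # early return: use the memo
    _is_cygwin_like_memo = _probe_environment()
    return _is_cygwin_like_memo


first = env_is_cygwin_like()
second = env_is_cygwin_like()
print(first, second, probe_calls)
```

Caching like this is only safe because the underlying check is assumed to be pure: repeated calls would give the same answer, so remembering the first one changes nothing but the cost.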
Thanks @alexvong1995 and @mmuetzel for the detailed review. Merge this whenever you are both happy.
Force-pushed from 9558e9a to 1e544f3.
inst/private/cygpath.m (Outdated)
    end

    %% validate output
    assert (ischar (out) && logical (regexp (out, '^[^\r\n]+[\r]?[\n]$')));
The second part of this condition might be too specific. I'm not sure if the second output argument of `system` should really end with a newline character.
That might actually be a bug in Octave. (But I didn't get around to investigating.)
Maybe better to check that the string after removal of the (potential) trailing newline is non-empty?
Edit: Thinking of it: Do you need the first part of that assertion? Isn't `system` guaranteed to return a char vector as the second output argument (potentially empty though)? Would `assert (~ isempty (out))` be better here?
> The second part of this condition might be too specific. I'm not sure if `system` should really add a new line to the end of the `out` string.
> That might actually be a bug in Octave. (But I didn't come around to investigate.)

I would expect a console program (`cygpath` in our case) to append exactly 1 newline to the output path. Otherwise, it would be unreadable to humans. (0 or 1 is plausible, > 1 would be a bug.)

> Maybe better check that the string after removal of the (potential) trailing newline is non-empty?

Indeed, it must be chars according to the documentation. We can validate the path at the end, which is actually safer.
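One way to read "validate the path at the end" is sketched below in Python (not the Octave code that was committed). The sketch assumes the command output should hold exactly one non-empty line with zero or one trailing line terminator, as argued above; the helper name `parse_single_line` is hypothetical.

```python
import re


def parse_single_line(out):
    # Drop at most one trailing line terminator (LF or CRLF), then
    # require that what remains is a single non-empty line.
    line = re.sub(r"\r?\n\Z", "", out)
    if line == "" or "\n" in line or "\r" in line:
        raise ValueError("expected exactly one non-empty line, got %r" % out)
    return line


print(parse_single_line("/cygdrive/c/tmp/tmp file.py\n"))
```

Checking the result after stripping, instead of matching the raw output against a strict pattern, accepts output with or without the final newline while still rejecting empty or multi-line output.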
Force-pushed from 1e544f3 to 679a161.
      error ('python_env_is_cygwin_like: %s exited with status %d', ...
             pyexec, status);
    end
    assert (ischar (out));
Similarly to your other recent change, this assertion might not be needed.
It's now updated.
This should fix some errors when Python is running in a Cygwin-like environment. But there could still be errors in other places. See gnu-octave#1182.
* inst/private/cygpath.m: New function.
* inst/private/python_env_is_cygwin_like.m: New function.
* inst/private/python_ipc_system.m: Use them.
Force-pushed from 679a161 to 6d1f177.
LGTM.
This should fix some errors when Python is running in a Cygwin-like
environment. But there could still be errors in other places.
See #1182.