Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't discard null return value in function with output parameters #2905

Open
vadz opened this issue May 13, 2024 · 13 comments
Open

Don't discard null return value in function with output parameters #2905

vadz opened this issue May 13, 2024 · 13 comments
Assignees

Comments

@vadz
Copy link
Member

vadz commented May 13, 2024

Consider the following example:

%module outparam

%include <std_string.i>

%{
#include <string>
%}

%apply std::string* OUTPUT { std::string* out };

%inline {
std::string hi = "Hello ";

std::string* TestOutput(bool ok, std::string* out) {
  *out = "world";
  return ok ? &hi : nullptr;
}
}

and Python code to test it:

import outparam

print(type(outparam.TestOutput(True)), type(outparam.TestOutput(False)))

I'd expect this to output

<class 'list'> <class 'list'>

but it actually outputs

<class 'list'> <class 'str'>

because SWIG_Python_AppendOutput() replaces null values instead of appending to them and so when the function returns nullptr its output parameter is returned directly, as a string, instead of returning a list.

This is pretty annoying as it means you can't use natural

first, second = outparam.TestOutput(whatever)

in Python as this gives a run-time error if the function happens to return null for some value of whatever.

This behaviour dates from a very long time ago, see 6e77a6b (fixes for swigrun example, 2005-12-07), but I don't think this was done intentionally or makes sense even if it was, and would like to fix this, but please let me know if you think it is and the current behaviour is actually expected.

@vadz vadz added the Python label May 13, 2024
@vadz
Copy link
Member Author

vadz commented May 13, 2024

Hmm, the behaviour is the same (and still wrong IMO) for Ruby too:

#!/usr/bin/env ruby
require './outparam';

puts Outparam.TestOutput(true).class, Outparam.TestOutput(false).class

outputs

Array
String

@vadz vadz added the Ruby label May 13, 2024
@vadz
Copy link
Member Author

vadz commented May 13, 2024

OTOH JS behaves correctly:

#!/usr/bin/env node

let outparam = require('outparam');

console.log(outparam.TestOutput(true));
console.log(outparam.TestOutput(false));

outputs 2 2-element arrays.

@ojwb
Copy link
Member

ojwb commented May 13, 2024

There's a similar (maybe fundamentally the same) issue in SWIG/PHP:

#2027 (comment)

It was a while ago, but ISTR the problem is essentially that NULL gets used as an initial placeholder for "no OUTPUT values accumulated yet", so can't be told apart from an actual NULL value.

@vadz
Copy link
Member Author

vadz commented May 13, 2024

Thanks, this is indeed the same problem. I'm not sure of the best way to fix it yet, but it seems weird to me that both Lib/typemaps/inoutlist.swg and Lib/typemaps/cstrings.swg use %append_output as the latter shouldn't need to append to anything.

Using a different macro, e.g. some %append_output_without_discarding_null, in the former would solve the problem, if we default it to %append_output for the languages not defining it explicitly, but I wonder if we could just change cstrings.swg to use %set_output instead. I'll try to experiment with this soon.

@ojwb
Copy link
Member

ojwb commented May 13, 2024

Thanks, this is indeed the same problem. I'm not sure of the best way to fix it yet, but it seems weird to me that both Lib/typemaps/inoutlist.swg and Lib/typemaps/cstrings.swg use %append_output as the latter shouldn't need to append to anything.

You might reasonably use the macros from cstrings.swg on a parameter of a function which has another parameter to which you've applied an OUTPUT typemap and/or has an actual return value.

@vadz
Copy link
Member Author

vadz commented May 14, 2024

Yes, you're right, of course, sorry for somehow not thinking about this. This means that my proposed solution still wouldn't work for a function like std::string* TestOutput(bool ok, char* out) (with the appropriate cstring typemap for out) returning nullptr.

The real fix would need to distinguish between Py_None assigned to resultobj by default and the same Py_None returned from SWIG_Python_NewPointerObj() if the pointer is null. The problem is that this would require a lot of changes because we'd need to record this in a local variable (probably not too difficult, but would be redundant in many cases but I'm not sure how to avoid using it unless it's really needed) and pass it to SWIG_AppendOutput(), which would require modifying all languages using UTL. Not sure if it's really worth it just for a probably relatively rare case of C functions returning (possibly null) objects... (Edit: please ignore this, see the associated PR instead).

@vadz vadz changed the title Don't discard null return value in function without output parameters Don't discard null return value in function with output parameters May 14, 2024
vadz added a commit to vadz/swig that referenced this issue May 14, 2024
Null pointers returned from functions also using output parameters or C
strings typemaps were lost because the generated code didn't distinguish
between Py_None corresponding to them and Py_None indicating the return
value of a "void" function.

This doesn't seem to be easy to solve properly because the function
return type and output parameters typemaps are independent and there is
no way to customize the behaviour of the latter depending on the former.

So we do it using a hack involving a global variable which is always 0
and is passed to SWIG_Python_AppendOutput() from non-void functions, but
is shadowed by a local variable with the same name set to 1 in the void
ones. This allows us to only discard Py_None if it corresponds to a void
function return value.

This is obviously ugly, but requires only minimal changes, doesn't
impose much extra run-time overhead and is MT-safe (unlike any solution
involving global variables).

Also add a unit test checking for this for Python and for JS, where the
same function already worked correctly (unlike in some other languages,
including at least Ruby and PHP).

Closes swig#2905.
vadz added a commit to vadz/swig that referenced this issue May 14, 2024
Null pointers returned from functions also using output parameters or C
strings typemaps were lost because the generated code didn't distinguish
between Py_None corresponding to them and Py_None indicating the return
value of a "void" function.

This doesn't seem to be easy to solve properly because the function
return type and output parameters typemaps are independent and there is
no way to customize the behaviour of the latter depending on the former.

So we do it using a hack involving a global variable which is always 0
and is passed to SWIG_Python_AppendOutput() from non-void functions, but
is shadowed by a local variable with the same name set to 1 in the void
ones. This allows us to only discard Py_None if it corresponds to a void
function return value.

This is obviously ugly, but requires only minimal changes, doesn't
impose much extra run-time overhead and is MT-safe (unlike any solution
involving changing global variables).

Also add a unit test checking for this for Python and for JS, where the
same function already worked correctly (unlike in some other languages,
including at least Ruby and PHP).

Closes swig#2905.
wsfulton pushed a commit that referenced this issue Jun 13, 2024
Null pointers returned from functions also using output parameters or C
strings typemaps were lost because the generated code didn't distinguish
between Py_None corresponding to them and Py_None indicating the return
value of a "void" function.

This doesn't seem to be easy to solve properly because the function
return type and output parameters typemaps are independent and there is
no way to customize the behaviour of the latter depending on the former.

So we do it using a hack involving a global variable which is always 0
and is passed to SWIG_Python_AppendOutput() from non-void functions, but
is shadowed by a local variable with the same name set to 1 in the void
ones. This allows us to only discard Py_None if it corresponds to a void
function return value.

This is obviously ugly, but requires only minimal changes, doesn't
impose much extra run-time overhead and is MT-safe (unlike any solution
involving changing global variables).

Also add a unit test checking for this for Python and for JS, where the
same function already worked correctly (unlike in some other languages,
including at least Ruby and PHP).

Closes #2905.
@wsfulton
Copy link
Member

Still needs fixing for scripting languages supporting argout. f3d4698...d940190 from #2907 contains the fix for Python, but still needs extending for Ruby, PHP etc. @vadz, perhaps you could add in this and create testcases? If not, I'll get to it later.

@wsfulton wsfulton reopened this Jun 15, 2024
@vadz
Copy link
Member Author

vadz commented Jun 15, 2024

I can do Ruby but I'm completely unfamiliar with PHP, so I'd rather leave this one to someone else.

@ojwb
Copy link
Member

ojwb commented Jun 16, 2024

I can probably do PHP. I actually looked at this before for PHP but couldn't see an easy fix and concluded it was a low priority corner case as Python had essentially the same problem (#2027 (comment)) but if $isvoid solves that problem it should be fairly simple to fix.

@ojwb ojwb self-assigned this Jun 16, 2024
@ojwb ojwb added the PHP label Jun 16, 2024
@ojwb
Copy link
Member

ojwb commented Jun 16, 2024

@wsfulton I looked at sorting this out for PHP but noticed your fix seems to mishandle the case where the first parameter outputs Python None - here's a reproducer:

diff --git a/Examples/test-suite/inout.i b/Examples/test-suite/inout.i
index 26425edc..7baddbb2 100644
--- a/Examples/test-suite/inout.i
+++ b/Examples/test-suite/inout.i
@@ -36,6 +36,15 @@
     a += 1;
   } 
 
+  inline void VoidOutOK(double* b, char* p) {
+    (void)p;
+    if (b) *b += 1;
+  }
+
+  inline void VoidOutBad(char* p, double* b) {
+    (void)p;
+    if (b) *b += 1;
+  }
 %}
 
 %template() std::pair<double, double>;
@@ -46,6 +55,8 @@ void AddOne1p(std::pair<double, double>* INOUT);
 void AddOne2p(std::pair<double, double>* INOUT, double* INOUT);
 void AddOne3p(double* INOUT, std::pair<double, double>* INOUT, double* INOUT);
 void AddOne1r(double& INOUT);
+void VoidOutOK(double* INOUT, char* INOUT);
+void VoidOutBad(char* INOUT, double* INOUT);
 
 %inline %{
   inline void StringNot(char** INOUT) {
diff --git a/Examples/test-suite/python/inout_runme.py b/Examples/test-suite/python/inout_runme.py
index d1f2f5ff..15b1a4b0 100644
--- a/Examples/test-suite/python/inout_runme.py
+++ b/Examples/test-suite/python/inout_runme.py
@@ -20,6 +20,15 @@ a = inout.AddOne3p(1, (1, 1), 1)
 if a != [2, (2, 2), 2]:
     raise RuntimeError
 
+a = inout.VoidOutOK(1, None)
+if a != [2, None]:
+    raise RuntimeError
+
+a = inout.VoidOutBad(None, 1)
+if a != [None, 2]:
+    print(str(a))
+    raise RuntimeError
+
 ret, out = inout.NonVoidOut(-42)
 if ret is not None:
     raise RuntimeError

I'd expect [None, 2] for inout.VoidOutBad(None, 1) (but we actually discard the None), especially since we return [2, None] if the arguments are swapped as in inout.VoidOutOK(1, None).

It seems $isvoid is not enough - we also need to check if this is the first INOUT or OUTPUT parameter we're processing, or something like that.

FWIW, I tested this with cd39cf1 and it seems @vadz approach also mishandles this case in the same way (as does SWIG 4.2.1, tested with the Debian package).

@ojwb
Copy link
Member

ojwb commented Jun 17, 2024

Having though about this more, I'm thinking even more strongly that we need to return [None, 2] for inout.VoidOutBad(None, 1) - otherwise a more realistic example which only sometimes returns NULL results in a variable length array which can't be assigned to a tuple of variables in the pythonic way:

p, b = inout.VoidOutBad(None, 1)

@vadz
Copy link
Member Author

vadz commented Jun 17, 2024

I can fix this with something like this:

commit cbc03c3f062ee4fcf6f46e8edf7c9f4f62e2870f
Author: Vadim Zeitlin <vz-swig@zeitlins.org>
Date:   2024-06-17 20:51:23 +0200

    Python: Fix handling multiple "out" parameters in void functions
    
    The previous fix didn't work correctly for the void functions with more
    than one output parameter if "None" was returned for any but the last
    one of them.
    
    Fix this by only taking "isvoid" flag into account for the first output
    parameter, but not any others.
    
    See #2905.

diff --git a/Lib/python/pyrun.swg b/Lib/python/pyrun.swg
index efe02e41f..9740e758c 100644
--- a/Lib/python/pyrun.swg
+++ b/Lib/python/pyrun.swg
@@ -117,10 +117,11 @@ SWIG_Python_SetConstant(PyObject *d, const char *name, PyObject *obj) {
 /* Append a value to the result obj */
 
 SWIGINTERN PyObject*
-SWIG_Python_AppendOutput(PyObject* result, PyObject* obj, int is_void) {
+SWIG_Python_AppendOutput(PyObject* result, PyObject* obj, int* is_void) {
   if (!result) {
     result = obj;
-  } else if (result == Py_None && is_void) {
+  } else if (result == Py_None && *is_void) {
+    *is_void = 0;
     SWIG_Py_DECREF(result);
     result = obj;
   } else {
diff --git a/Source/Modules/python.cxx b/Source/Modules/python.cxx
index 00c38173c..ad4320474 100644
--- a/Source/Modules/python.cxx
+++ b/Source/Modules/python.cxx
@@ -2769,6 +2769,11 @@ public:
 
     Wrapper_add_local(f, "resultobj", "PyObject *resultobj = 0");
 
+    String *isvoid_decl = NewString("");
+    Printf(isvoid_decl, "int isvoid = %d", Cmp(returntype, "void") ? 0 : 1);
+    Wrapper_add_local(f, "isvoid", isvoid_decl);
+    Delete(isvoid_decl);
+
     // Emit all of the local variables for holding arguments.
     emit_parameter_variables(l, f);
 
@@ -3238,9 +3243,7 @@ public:
 
     /* Substitute the cleanup code */
     Replaceall(f->code, "$cleanup", cleanup);
-
-    bool isvoid = !Cmp(returntype, "void");
-    Replaceall(f->code, "$isvoid", isvoid ? "1" : "0");
+    Replaceall(f->code, "$isvoid", "&isvoid");
 
     /* Substitute the function name */
     Replaceall(f->code, "$symname", iname);

but this breaks the new typemap_isvoid.cpptest as $isvoid is not replaced with 0 or 1 any longer.

I thought about resetting isvoid variable to 0 in the function being wrapped instead of doing it in SWIG_Python_AppendOutput(), but this seems much harder to do and still requires using a variable instead of hard-coded 0 and 1, so the test would still have to be changed.

@wsfulton Can I adjust the test or do you see some other way to do it?

@ojwb
Copy link
Member

ojwb commented Jun 18, 2024

If we go that route, I think the isvoid name should be changed (and possibly $isvoid left in as a feature by itself if it looks more widely useful).

I'm going to hold off actually implementing for PHP until we've agreed a solution to this NULL-handling issue, in case it significantly changes the mechanism.

Incidentally, while looking at this for PHP I noticed the %append_output() macro which seems like a slightly more generic solution (there's code to define %append_output() if it's not already defined, which defines it in terms of SWIG_AppendOutput(), which is itself defined in terms of SWIG_Python_AppendOutput(), etc). The %append_output() interface looks more workable for PHP where there's a standard pre-defined variable for the return value. Mentioning this here in case it helps for other target languages that need changes for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants