Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wide character conversion bugs #75

Merged
merged 1 commit into from Jan 28, 2022
Merged

Fix wide character conversion bugs #75

merged 1 commit into from Jan 28, 2022

Conversation

jamesjer
Copy link
Contributor

I tried to run the tests on an x86_64 Fedora 35 box. The "run file with filename" test segfaulted, because the attempt to open the test file for reading failed, returning a NULL, and the NULL was then passed into the python library as a FILE *. The attempt to open for reading failed because of 2 bugs. This commit fixes both.

First, the last argument to mbstowcs is not big enough. It has to cover the null terminator. Failure to make it big enough led to garbage bytes in the last position, instead of the null terminator. As the mbstowcs man page says:

In order to avoid the case 2 above, the programmer should make sure n is greater than or equal to mbstowcs(NULL,src,0)+1.

Second, _Py_wfopen wants the mode to be a wide string as well as the filename. I checked old Python versions, and this is the case as far back as Python 3.2, when _Py_wfopen was introduced.

@UnixJunkie
Copy link

Do you have an example to trigger the bug that you are claiming you correct?

@jamesjer
Copy link
Contributor Author

As I noted, the pyml test "run file with filename" segfaults with the current code. This can be replicated by creating a file named test.py with these contents:

print("Hello, world!")

Then build pymltop and do this:

$ ./pymltop
        OCaml version 4.13.1

# Py.initialize ();;
- : unit = ()
# Py.Run.load (Py.Filename "test.py") "test.py";;
Segmentation fault (core dumped)

@UnixJunkie
Copy link

In utop, this works fine:

#require "pyml";;
Py.initialize();;
Py.Run.load (Py.Filename "test.py") "test.py";;

@jamesjer
Copy link
Contributor Author

I can't explain why it works for you in utop, but the code is incorrect. The specification for mbstowcs is here: https://pubs.opengroup.org/onlinepubs/9699919799/functions/mbstowcs.html. It clearly states that an invocation mbstowcs(pwcs, s, n) modifies no more than n bytes of pwcs, which means that if s is n characters long, no null terminator will be written to pwcs. This is the reason for the sentence from the mbstowcs man page that I quoted above. Perhaps in your case, the n+1st byte is serendipitously zero. But it wasn't set to zero by mbstowcs if that function conforms to the standard.

Also, to verify that _Py_wfopen takes two wide string arguments, look at https://github.com/python/cpython/blob/main/Include/internal/pycore_fileutils.h, currently line 105. Download any release tarball from python 3.2 through 3.10, and look at Include/cpython/fileutils.h. The function prototype has not changed.

@thierry-martinez thierry-martinez merged commit 7d30940 into thierry-martinez:master Jan 28, 2022
@thierry-martinez
Copy link
Owner

Thank you very much for the fix, and all your explanations. I am very sorry for the delay: this was a very busy month! Merged.

thierry-martinez added a commit to thierry-martinez/opam-repository that referenced this pull request Mar 22, 2022
This pull-request publishes a new release for pyml (2022-03-22):

- New function `Py.Import.exec_code_module_from_string`
  (suggested by Francois Berenger, thierry-martinez/pyml#78)

- New function `Py.Module.compile` provides a better API than `Py.compile`.

- `Py.Object.t` can now be serialized (with Marshal or output_value), using Python
  pickle module

- Cross-compiling friendly architecture detection
  (suggested by @EduardoRFS, https://discuss.ocaml.org/t/a-zoo-of-values-for-system/8525/20)

- Detect macro `unix` instead of `__linux__`, to handle *BSD OSes
  (reported by Chris Pinnock, thierry-martinez/pyml#74)

- Fix bug in Windows

- Null checks for many functions raising OCaml exceptions, instead of segmentation fault
  (initial implementation by Laurent Mazare, thierry-martinez/pyml#72)

- Fix wide character conversion bugs leading to segmentation fault in Py_wfopen
  (fixed by Jerry James, thierry-martinez/pyml#75)

- `Gc.full_major ()` before unloading `libpython` in `Py.finalize`, to prevent
  segfaulting on finalizing dangling references to Python values after the library
  had been unloaded
  (reported by Denis Efremov on coccinelle mailing list)

- Fix segmentation fault when `~debug_build:true` was passed to `Py.initialize`
  (reported by Stéphane Glondu, thierry-martinez/pyml#79)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants