Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: f2py 1.24 - string scalars fail #23192

Closed
2sn opened this issue Feb 10, 2023 · 9 comments
Closed

BUG: f2py 1.24 - string scalars fail #23192

2sn opened this issue Feb 10, 2023 · 9 comments

Comments

@2sn
Copy link
Contributor

2sn commented Feb 10, 2023

Describe the issue:

Since NumPy Version 1.24 string scalars seem to fail. I have a code that defines a character variable of length 8,

character*8 &
       nameprob

which in the puf file becomes

character*8 :: nameprob

as it should. After compilation, however, when I access the variable though its module (data) or its common block (namcom), either way I get

In [6]: k.kd._kepler.data.nameprob
Out[6]: 
array([b'xxx     ', b'xxxz    ', b'        ', b'xxx_0   ', b'        ',
       b'xxxz    ', b'        ', b''], dtype='|S8')

In [9]: k.kd._kepler.namecom.nameprob
Out[9]: 
array([b'xxx     ', b'xxxz    ', b'        ', b'xxx_0   ', b'        ',
       b'xxxz    ', b'        ', b''], dtype='|S8')

an array of length 8 of character of length 8. It seems 8 has been wrongly used twice, as char length and as array length - despite the variable is a scalar. For another variable defined as character*16 there are 16 units of !S16 type. Etc.

What is returned is just extra stuff from memory.

When a dimension is specified, e.g.,

character*2 dimension(121) :: izsym

all is fine

array([b'pn', b'nt', b' h', b'he', b'li', b'be', b' b', b' c', b' n',
       b' o', b' f', b'ne', b'na', b'mg', b'al', b'si', b' p', b' s',
       b'cl', b'ar', b' k', b'ca', b'sc', b'ti', b' v', b'cr', b'mn',
       b'fe', b'co', b'ni', b'cu', b'zn', b'ga', b'ge', b'as', b'se',
       b'br', b'kr', b'rb', b'sr', b' y', b'zr', b'nb', b'mo', b'tc',
       b'ru', b'rh', b'pd', b'ag', b'cd', b'in', b'sn', b'sb', b'te',
       b' i', b'xe', b'cs', b'ba', b'la', b'ce', b'pr', b'nd', b'pm',
       b'sm', b'eu', b'gd', b'tb', b'dy', b'ho', b'er', b'tm', b'yb',
       b'lu', b'hf', b'ta', b' w', b're', b'os', b'ir', b'pt', b'au',
       b'hg', b'tl', b'pb', b'bi', b'po', b'at', b'rn', b'fr', b'ra',
       b'ac', b'th', b'pa', b' u', b'np', b'pu', b'am', b'cm', b'bk',
       b'cf', b'es', b'fm', b'md', b'no', b'lr', b'rf', b'db', b'sg',
       b'bh', b'hs', b'mt', b'ds', b'rg', b'cn', b'nh', b'fl', b'mc',
       b'lv', b'ts', b'og', b'ue'], dtype='|S2')

and I get the desired array. Only scalar character variables seem to be converted to arrays

I should add that this is an interface module that is linked to library with the actual code (so I can use all Fortran statements w/o interfering with f2py. But I think it is not relevant, and I assume a very easy-to-fix small bug. Hopefully.

In NumPy Version 1.23.5 this issue is not present.

Reproduce the code example:

(see above)

Error message:

N/A

Runtime information:

1.24.2
3.11.1 (main, Jan 21 2023, 21:33:07) [GCC 12.2.1 20221121 (Red Hat 12.2.1-4)]

[{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2',
'AVX512F',
'AVX512CD',
'AVX512_SKX'],
'not_found': ['AVX512_KNL',
'AVX512_KNM',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'SkylakeX',
'filepath': '/home/alex/Python_3.11.1/lib/python3.11/site-packages/numpy.libs/libopenblas64_p-r0-15028c96.3.21.so',
'internal_api': 'openblas',
'num_threads': 16,
'prefix': 'libopenblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.21'},
{'architecture': 'SkylakeX',
'filepath': '/home/alex/Python_3.11.1/lib/python3.11/site-packages/scipy.libs/libopenblasp-r0-41284840.3.18.so',
'internal_api': 'openblas',
'num_threads': 16,
'prefix': 'libopenblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.18'},
{'filepath': '/usr/lib64/libgomp.so.1.0.0',
'internal_api': 'openmp',
'num_threads': 16,
'prefix': 'libgomp',
'user_api': 'openmp',
'version': None},
{'architecture': 'SkylakeX',
'filepath': '/usr/lib64/libopenblaso-r0.3.21.so',
'internal_api': 'openblas',
'num_threads': 16,
'prefix': 'libopenblas',
'threading_layer': 'openmp',
'user_api': 'blas',
'version': '0.3.21'}]
None

Context for the issue:

The code expects a variable, and now returned are other parts of memory that could cause segmentation faults.
Also, the data type is not what is expected, so the python code breaks (in this case, it was used to concatenate a list of strings, but one was an array with arbitrary data from memory, and the GUI function that was passed to (to set as Window title) was not happy about some of the characters, such as \x00.

Could you please also point me where in f2py this issue may occur?

Initially I was happy when I read in the NumPy 1.24 release notes that there have been some improvements to handling of strings - to try them out - but now it seems the change also had some side effect and broke at least my string declarations.
:-(

PS - Is there a more detailed statement / example what has changes and is the new functionality of f2py for strings?

@2sn 2sn added the 00 - Bug label Feb 10, 2023
@2sn 2sn changed the title BUG: f2py 1.24 - strings fail BUG: f2py 1.24 - string scalars fail Feb 10, 2023
@2sn
Copy link
Contributor Author

2sn commented Feb 10, 2023

Even if I use "proper" F90

character(len=8) :: nameprob

I still get the same issue

In [4]: k.kd._kepler.data.nameprob
Out[4]: 
array([b'xxx     ', b'xxxz    ', b'        ', b'xxx_0   ', b'        ',
       b'xxxz    ', b'        ', b''], dtype='|S8')

PS - I'd like to take this opportunity to thank the f2py developers that now (len=xxx) can be used, in particular in conjunction with dimension!

 character(len=2), dimension(-1:nzsym) :: izsym

@2sn
Copy link
Contributor Author

2sn commented Feb 10, 2023

The problem seems to be when the .c interface is generated (XXXmodule.c). I see the line

  {"nameprob",1,{{8}},NPY_STRING, 8},

where I assume the first {{8}} makes it an array of length 8.

@2sn
Copy link
Contributor Author

2sn commented Feb 10, 2023

The issue seems to be with the call

            dm = capi_maps.getarrdims(n, var)

in f90mod_rules.py as it returns

{'dims': '8', 'size': '8', 'rank': '1'}

where vars is

{'typespec': 'character', 'charselector': {'len': '8'}, 'attrspec': []}

for my nameprob variable.

@seberg seberg added this to the 1.24.3 release milestone Feb 10, 2023
@seberg
Copy link
Member

seberg commented Feb 10, 2023

Ping @pearu and @melissawm since this should have to do with the f2py character rework.

@2sn
Copy link
Contributor Author

2sn commented Feb 10, 2023

I propose to change in capi_maps.py

        ret['dims'] = getstrlength(var)
        ret['size'] = ret['dims']
        ret['rank'] = '1'

to

        ret['size'] = getstrlength(var)
        ret['rank'] = '0'
        ret['dims'] = ''

which seems to work for my really big code ...

@seberg
Copy link
Member

seberg commented Feb 10, 2023

Cool, setting the rank to 0 does seem sensible. Would you be happy to create a PR? Maybe we can get one of the f2py folks to look at it, or we just have to hope its right :).

@2sn
Copy link
Contributor Author

2sn commented Feb 10, 2023

I will give it a try.

@2sn
Copy link
Contributor Author

2sn commented Feb 10, 2023

I never submitted to numpy. I tried to push on a new branch, but it seems this is not how you set it up

[f2py-string-scalar] ~/numpy/.git>    git push --set-upstream origin f2py-string-scalar
ERROR: Permission to numpy/numpy.git denied to 2sn.
fatal: Could not read from remote repository.

@seberg
Copy link
Member

seberg commented Feb 10, 2023

You have to create the new branch on your own fork, the rest is the same.

@seberg seberg closed this as completed in 8daec0e Feb 11, 2023
@seberg seberg closed this as completed Feb 11, 2023
charris pushed a commit to charris/numpy that referenced this issue Feb 14, 2023
in previous version, any string scalar was converted to a string array of dimension len, i.e., a definition

character(len=N) :: X
effectively became

character(len=NNN), dimension(NNN) :: X
from the point of few of the numpy (python) interface:

X.shape == (NNN,)
X.dtype == '|SNNN'

Closes numpygh-23192
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants