Segmentation fault on isfloat() #1

Closed
Waateur opened this issue Nov 5, 2014 · 4 comments

Waateur commented Nov 5, 2014

Hello,
I was using fastnumbers.isfloat() to check whether a string is a float or not. The string format was like "-0.353" most of the time. I ran the function on tons of data, and I could not spot the string that caused the segfault.
As there is a safe_float function I was wondering if this is normal or not.

Hoping this might help.
D.

Nov  5 16:17:54 (none) kernel: [7791607.024856] python[26864]: segfault at 7f4e52560050 ip 00007f4e5373ae42 sp 00007fff79343ad0 error 4 in fastnumbers.cpython-33m.so[7f4e53739000+4000]
Nov  5 16:18:30 (none) kernel: [7791642.889630] python[26905]: segfault at 7f7286bf9050 ip 00007f7287d56e42 sp 00007fffe251bba0 error 4 in fastnumbers.cpython-33m.so[7f7287d55000+4000]

SethMMorton commented

Hmm. This is very difficult to debug without knowing what input caused the segfault. Is this repeatable (given the same input you always get a segfault)? If so, is it possible for you to give me the input, or at least point me to the code you are using that generates the input? I'd like to try and reproduce on my machine to see if I can solve the issue.

BTW, the "safe" in "safe_float" refers to the fact that it does bounds checking so you won't get an overflow error, and it gives better precision for large numbers. It is not related to segfaults... I would not consider any segfault to be normal.
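
One way to narrow down which input triggers a crash like this is to record each token on disk before handing it to the function under suspicion, so that the last logged line names the culprit if the interpreter dies. The following is only a minimal sketch: `check_tokens` and the pure-Python `is_float` stand-in are hypothetical names introduced for illustration, and in practice `fastnumbers.isfloat` would be passed as `check`.

```python
import os
import tempfile


def check_tokens(tokens, check, log_path):
    """Call check(token) for each token, logging the token to disk first.

    If the process crashes inside check(), the last line of log_path
    shows the token that was being processed at the time.
    """
    results = []
    with open(log_path, "w", encoding="utf-8") as log:
        for token in tokens:
            log.write(repr(token) + "\n")
            log.flush()
            os.fsync(log.fileno())  # make sure the line survives a crash
            results.append(check(token))
    return results


def is_float(text):
    """Pure-Python stand-in for the real checker (an assumption for this sketch)."""
    try:
        float(text)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "tokens.log")
        print(check_tokens(["-0.353", "word"], is_float, log_path))
```

Logging with `repr()` keeps unusual characters visible, which matters if the offending input turns out to be non-ASCII.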

Waateur commented Nov 6, 2014

Thanks for the answer.
I'll spend some time over the weekend trying to spot what in my data caused the segfault :)

Waateur commented Nov 19, 2014

Hello,
So I ran fastnumbers.isfloat() on every word of my data and was unable to reproduce the segfault. I don't know whether the kernel logs were wrong or I met Bigfoot...
Just for the record, I was using fastnumbers to detect log probabilities in an ARPA language model.
Something like this:

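import fastnumbers

with open("myFile.arpa") as f:
    for line in f:
        fields = line.split()
        if not fields:  # skip blank lines, which have no last field
            continue
        element = fields[-1]
        if fastnumbers.isfloat(element):
            print("it's a log probability")
        else:
            print("it's a word")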

Since this is going nowhere, I'll close the issue until maybe someone reopens it someday.
Sorry for the disturbance.
@SethMMorton thank you for your time.

Waateur closed this as completed Nov 19, 2014
SethMMorton added a commit that referenced this issue Jun 4, 2015
When dealing with unicode input, the python object needs to be
converted to a bytes object before being converted to a character
array. Previously, fastnumbers was relying on the python object
remaining in memory when dealing with character arrays because a
strcpy was not performed.  Because extracting the character array from
unicode requires a temporary python object which is quickly
de-referenced, this is not a safe technique; the segfault is rare
because python garbage collects de-referenced objects only periodically,
so the character array typically remains in memory.

To solve this issue, all character arrays are now explicitly copied with
strcpy. This required modification of the conversion functions to
free the character array memory before returning from fastnumbers.

This resolves issue #2, and most likely resolves issue #1.
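
The idea in the commit message, copying the encoded characters into a buffer the caller owns rather than pointing into a temporary object, can be illustrated from Python with `ctypes`. This is only an analogy under stated assumptions: it is not the fastnumbers C code, and `owned_c_string` is a hypothetical helper. `ctypes.create_string_buffer` plays the role that an allocation followed by `strcpy` plays in C.

```python
import ctypes


def owned_c_string(text):
    """Return a ctypes buffer holding its own NUL-terminated copy of text."""
    encoded = text.encode("utf-8")  # temporary bytes object
    # create_string_buffer allocates len(encoded) + 1 bytes and copies the
    # data, so the result does not depend on `encoded` staying alive.
    return ctypes.create_string_buffer(encoded)


buf = owned_c_string("-0.353")
# The temporary bytes object is gone here, but the copy is still valid.
print(buf.value)
```

Because the buffer owns its memory, it stays usable for as long as `buf` is referenced; in C the equivalent copy has to be freed explicitly, which is the extra cleanup the commit message mentions.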

SethMMorton commented Jun 4, 2015

@Waateur I am pretty positive that I have just solved this issue with the most recent commit. By chance, was your input unicode?
