3.7 support #16

SethMMorton · 2018-07-21T07:59:37Z

This is intended to close issue #15. It ended up as a re-write of the internal structure to convert all input to a uniform data structure to make parsing simpler. The result was a net reduction of 10 files.

codecov · 2018-07-22T01:39:51Z

Codecov Report

Merging #16 into master will decrease coverage by 1.98%.
The diff coverage is 88.71%.

@@            Coverage Diff             @@
##           master      #16      +/-   ##
==========================================
- Coverage   92.47%   90.49%   -1.99%     
==========================================
  Files          17        6      -11     
  Lines         851      852       +1     
  Branches      129      134       +5     
==========================================
- Hits          787      771      -16     
- Misses         31       39       +8     
- Partials       33       42       +9

Impacted Files	Coverage Δ
src/fastnumbers.c	`90.71% <47.05%> (+4.24%)`	⬆️
src/objects.c	`86.36% <73.68%> (ø)`
src/numbers.c	`91.42% <81.81%> (ø)`
src/strings.c	`86.14% <86.14%> (ø)`
src/unicode_character.c	`92.85% <92.85%> (ø)`
src/parsing.c	`94.73% <94.73%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4dd0ac7...4adb85d. Read the comment docs.

- Update Travis-CI, Appveyor, and tox config files. - Add tests that cover certain edge cases and Py37 situations. - Autoformat the tests.

Instead of having ~1 function per file, functions have been grouped into files with a higher level concept. In general there is now a 1:1 relationship between source and header files. Along with the refactor has been improved extraction of single numeric unicode characters that works for all python versions and is significantly more performant than the previous implementation. Also, some wonderful things have happened with the string parsing: - On Python >=3.6, if an underscore is detected in a string the underscores are preemptively removed. This way no processing of underscores is required in downstream parsing. - The issue that started #15 was actually solved! Essentially the Logic of _PyUnicode_TransformDecimalAndSpaceToASCII from Python 3.6 has been reimplemented (with fewer memory allocations to boot!).

If it looks like an octal with base 0 it should be illegal, UNLESS the number would be 0. That UNLESS was missed previously.

If a unicode object is stored internally like ASCII, it can be cast to a char* array directly instead of allocating new memory. This is a significant time saver. In cases where the internal unicode data is not already stored in ASCII form, the speed of the for loop to perform the conversion has been improved. This has been done by putting the simplest conditions to test at the top of the if statement, and also by applying the register modifier to the looping variables.

When allocating memory for char* arrays or converting to float, it is possible that some sort of memory error will occur. The intention is that these events will ALWAYS raise an exception. The top-level object conversion code now checks for errno==ENOMEM, and if so returns NULL no matter if exceptions are requested or not.

When using Python's built-in string->float converters, fastnumbers used to pre-validate that the input was OK in order to avoid the python built-in converter from setting an exception. The rational was that exception setting takes a long time. There were a couple of invalid assumptions with that logic. - Setting an exception does not take that long, and it's certainly not offset by the amount of time it takes to pre-validate. - The float built-in can be made to not raise an exception, which makes pre-validation a complete waste of time. This is still done for integers because there is no way to disable exception generation for PyLong_FromString, and generating an exception actually does take a long time for that function because of an unnecessary creation of a PyUnicode object just to build the exception.

The new algorithm tries to do as little work as possible, and uses memchr to find the decimal place instead of explicit searching.

1) All sign parsing happens in one place. 2) Gratuitous use of the register keyword. 3) Use of booleans to determine input validity instead of marking string locations.

Clang on OSX doesn't seem to emit many warnings...

There are two sources of error occurring. 1) For older versions of MSVC, the conversion in PyLong_FromDouble does not seem to be robust... the same value is not returned on subsequent calls. This is "fixed" simply by marking those functions as flaky and ignoring them if they fail on windows for Python 3.7 and 3.4. 2) For older versions of MSVC the smallest negative exponent that can be safely converted is smaller than it used to be. This may have been because of a separation of float parsing and sign handling. An attempt to unify them again has been made.

- Remove extraneous includes and improve include order. - Move PyNumberType from objects.h to options.h. - Apply static keyword consistently. - typedef struct Options to Options. - Remove PyBool_from_bool.

SethMMorton · 2018-08-02T05:20:01Z

All the failures are related to #17.

SethMMorton mentioned this pull request Jul 23, 2018

Segfaults on tests for Python 3.7.0 #17

Closed

SethMMorton added 8 commits July 30, 2018 19:30

Update development infrastructure and expand tests.

c302c2c

- Update Travis-CI, Appveyor, and tox config files. - Add tests that cover certain edge cases and Py37 situations. - Autoformat the tests.

Fix int detection bug with octals on Py3.

49d9547

If it looks like an octal with base 0 it should be illegal, UNLESS the number would be 0. That UNLESS was missed previously.

Improve speed of float overflow detection.

7303c98

The new algorithm tries to do as little work as possible, and uses memchr to find the decimal place instead of explicit searching.

Parsing speedup.

5e94a43

1) All sign parsing happens in one place. 2) Gratuitous use of the register keyword. 3) Use of booleans to determine input validity instead of marking string locations.

SethMMorton force-pushed the 3.7-support branch from 9c977a0 to 5e94a43 Compare July 31, 2018 02:51

Fix compiler warnings found in CI.

96023a8

Clang on OSX doesn't seem to emit many warnings...

SethMMorton force-pushed the 3.7-support branch 8 times, most recently from 21459c2 to 51ac4b6 Compare August 1, 2018 05:47

SethMMorton force-pushed the 3.7-support branch from 51ac4b6 to d6444bb Compare August 1, 2018 05:57

Final cleanup.

4adb85d

- Remove extraneous includes and improve include order. - Move PyNumberType from objects.h to options.h. - Apply static keyword consistently. - typedef struct Options to Options. - Remove PyBool_from_bool.

SethMMorton merged commit a159a03 into master Aug 2, 2018

SethMMorton deleted the 3.7-support branch August 2, 2018 05:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

3.7 support #16

3.7 support #16

SethMMorton commented Jul 21, 2018

codecov bot commented Jul 22, 2018 •

edited

SethMMorton commented Aug 2, 2018

3.7 support #16

3.7 support #16

Conversation

SethMMorton commented Jul 21, 2018

codecov bot commented Jul 22, 2018 • edited

Codecov Report

SethMMorton commented Aug 2, 2018

codecov bot commented Jul 22, 2018 •

edited