Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.7 support #16

Merged
merged 11 commits into from Aug 2, 2018
Merged

3.7 support #16

merged 11 commits into from Aug 2, 2018

Conversation

SethMMorton
Copy link
Owner

This is intended to close issue #15. It ended up as a re-write of the internal structure to convert all input to a uniform data structure to make parsing simpler. The result was a net reduction of 10 files.

@codecov
Copy link

codecov bot commented Jul 22, 2018

Codecov Report

Merging #16 into master will decrease coverage by 1.98%.
The diff coverage is 88.71%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #16      +/-   ##
==========================================
- Coverage   92.47%   90.49%   -1.99%     
==========================================
  Files          17        6      -11     
  Lines         851      852       +1     
  Branches      129      134       +5     
==========================================
- Hits          787      771      -16     
- Misses         31       39       +8     
- Partials       33       42       +9
Impacted Files Coverage Δ
src/fastnumbers.c 90.71% <47.05%> (+4.24%) ⬆️
src/objects.c 86.36% <73.68%> (ø)
src/numbers.c 91.42% <81.81%> (ø)
src/strings.c 86.14% <86.14%> (ø)
src/unicode_character.c 92.85% <92.85%> (ø)
src/parsing.c 94.73% <94.73%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4dd0ac7...4adb85d. Read the comment docs.

 - Update Travis-CI, Appveyor, and tox config files.
 - Add tests that cover certain edge cases and Py37 situations.
 - Autoformat the tests.
Instead of having ~1 function per file, functions have been grouped into
files with a higher level concept. In general there is now a 1:1
relationship between source and header files.

Along with the refactor has been improved extraction of single numeric
unicode characters that works for all python versions and is
significantly more performant than the previous implementation.

Also, some wonderful things have happened with the string parsing:

 - On Python >=3.6, if an underscore is detected in a string the
   underscores are preemptively removed. This way no processing of
   underscores is required in downstream parsing.
 - The issue that started #15 was actually solved! Essentially the
   Logic of _PyUnicode_TransformDecimalAndSpaceToASCII from Python 3.6
   has been reimplemented (with fewer memory allocations to boot!).
If it looks like an octal with base 0 it should be illegal,
UNLESS the number would be 0. That UNLESS was missed previously.
If a unicode object is stored internally like ASCII, it can be
cast to a char* array directly instead of allocating new memory.
This is a significant time saver.

In cases where the internal unicode data is not already stored in
ASCII form, the speed of the for loop to perform the conversion has
been improved. This has been done by putting the simplest conditions
to test at the top of the if statement, and also by applying the
register modifier to the looping variables.
When allocating memory for char* arrays or converting to float,
it is possible that some sort of memory error will occur. The intention
is that these events will ALWAYS raise an exception. The top-level
object conversion code now checks for errno==ENOMEM, and if so
returns NULL no matter if exceptions are requested or not.
When using Python's built-in string->float converters, fastnumbers
used to pre-validate that the input was OK in order to avoid the
python built-in converter from setting an exception. The rational was
that exception setting takes a long time.

There were a couple of invalid assumptions with that logic.
 - Setting an exception does not take that long, and it's certainly
   not offset by the amount of time it takes to pre-validate.
 - The float built-in can be made to not raise an exception, which
   makes pre-validation a complete waste of time.

This is still done for integers because there is no way to disable
exception generation for PyLong_FromString, and generating an exception
actually does take a long time for that function because of an
unnecessary creation of a PyUnicode object just to build the exception.
The new algorithm tries to do as little work as possible, and uses
memchr to find the decimal place instead of explicit searching.
 1) All sign parsing happens in one place.
 2) Gratuitous use of the register keyword.
 3) Use of booleans to determine input validity instead of marking
    string locations.
Clang on OSX doesn't seem to emit many warnings...
@SethMMorton SethMMorton force-pushed the 3.7-support branch 8 times, most recently from 21459c2 to 51ac4b6 Compare August 1, 2018 05:47
There are two sources of error occurring.

 1) For older versions of MSVC, the conversion in PyLong_FromDouble
    does not seem to be robust... the same value is not returned on
    subsequent calls. This is "fixed" simply by marking those functions
    as flaky and ignoring them if they fail on windows for Python 3.7
    and 3.4.
 2) For older versions of MSVC the smallest negative exponent that can
    be safely converted is smaller than it used to be. This may have
    been because of a separation of float parsing and sign handling.
    An attempt to unify them again has been made.
 - Remove extraneous includes and improve include order.
 - Move PyNumberType from objects.h to options.h.
 - Apply static keyword consistently.
 - typedef struct Options to Options.
 - Remove PyBool_from_bool.
@SethMMorton
Copy link
Owner Author

All the failures are related to #17.

@SethMMorton SethMMorton merged commit a159a03 into master Aug 2, 2018
@SethMMorton SethMMorton deleted the 3.7-support branch August 2, 2018 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant