Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C Extension #15

Merged
merged 23 commits into from
Apr 5, 2020
Merged

C Extension #15

merged 23 commits into from
Apr 5, 2020

Conversation

pganssle
Copy link
Owner

@pganssle pganssle commented Mar 5, 2020

Originally this was just a stub, but I guess now it's a full-on feature branch, because I keep doing fixup commits to the implementation.

Remaining things to do:

  • repr and str
  • ZoneInfo.from_file
  • Rules applied after the last transition
  • Add test dealing with datetime subclasses.
  • Check for refleaks
  • Pickle support
  • Make the pure python implementation a fallback implementation

@pganssle pganssle force-pushed the c_extension branch 30 times, most recently from fa9035d to e895316 Compare March 6, 2020 18:04
@pganssle pganssle force-pushed the c_extension branch 2 times, most recently from 8a77ff4 to 763197e Compare April 4, 2020 17:15
@pganssle
Copy link
Owner Author

pganssle commented Apr 4, 2020

OK, I think I'm pretty happy with this. I was thinking I might squash a lot of the C implementation together into one big commit, but I think they're still mostly atomic (although it isn't until somewhat late in the game that the _czoneinfo module would be independently useful.

I think there are one or two commits that segfault, but I think debugging those individual commits isn't worth it at this point.

This does not add a lint check for clang-format because it's not obvious
how to do that. Future refinements will make `clang-format` fail with a
warning rather than an error if it's not present.
This C module doesn't do anything yet, but it's the receptacle for the
new ZoneInfo implementation.
The hack in the setup.py file is there because `gcov` generates a
*required* .gcno file at build time at the location of the object files.
In our case, these object files only ever exist in an ephemeral
temporary build directory that pip has moved copied all of our files
into, for build isolation purposes.

This allows you to copy those into some location, specified by the
GCNO_TARGET_DIR directory, at build time. Maybe in the future we either
use a different coverage generation tool or figure out a way to relocate
this .gcno file automatically, without hacking it into setup.py.

Additionally, I should note that the ideal target for this would
actually be {envtmpdir}, because that's cleared before the environment
is run each time, but unfortunately it's cleared *after* the install
command, not before, so anything we copy in there as part of the build
will get deleted. There is currently no way to insert a command before
the build occurs to establish our own version of envtmpdir (...with
blackjack ...and hookers...).
C code has a large number of hard-to-hit paths, so getting to 100% code
coverage, even 100% diff coverage, would be a pretty significant challenge.

Instead, we'll measure the code coverage in C but not enforce any particular
number.
This adds the basic caching behavior to the empty ZoneInfo stub, along
with the first tests.
Use a legacy editable installation, plus tox with gcc and gcc-9, all
side-by-side.
This should clean up the references to global caches and imports and
whatnot on deallocation of the module.
This loads all data into the struct except the _ttinfo_after value,
which may be a _TZStr.
This adds support for utcoffset(), dst() and tzname() before the final
transition (which is more complicated as it involves parsing the TZStr
and generating a _tzrule.
This rounds out the tzinfo abstract base class, making the C
implementation an actual, usable time zone.
This does not include .from_file, which is not yet implemented.
Rather than try and overwrite the value of __reduce__ on a per-instance
basis as was done in the pure Python implementation, this simply
reinterprets "from_cache" as a variable that keeps track of the source
of the instance, which is referenced in __reduce__.
This requires changing the struct a bit because we don't have a __dict__
in which to conditionally store the repr of the file object that was
used during construction.
In Version 2+ files, after the last transition in a file, a string based
on the POSIX TZ environment variable is used to determine transitions
going forward indefinitely.

This brings the C extension to feature parity with the Python module,
and now all tests enabled for the Python version are enabled for the C
extension as well.
These tests rely on the fact that in Python 3.8+, the __add__ and
__sub__ methods of datetime subclasses will preserve their subclass.

If support is added for older Pythons, the Datetime subclass will need
to be adjusted accordingly.
Subclasses of the C ZoneInfo implementation must have a __dict__, but
now they will have a per-class cache. We may improve the state of things
by converting ZoneInfo to a heap-allocated class (and we may want to
block __slots__ subclasses).
This does not have proper support for subinterpreters at the moment,
because it still has global references to the timedelta and zoneinfo
caches, but it should at least allow for the possibility of initializing
multiple copies of the module.
This seems like it would be simple, but in order to test this we need to
import the module multiple times, which reveals all kinds of problems
with test.support.import_fresh_module, and some problems with the
implementation of our tests.

Here we move the various workarounds and test support into a dedicated
test support submodule, so that they can be used in both the unit and
proper tests.

Additionally, it seems like it's not possible to pickle something where
the name refers to a different module when imported fresh (e.g. from
sys.modules). This workaround makes sure that the class we are testing
is always the one you get from `import zoneinfo`.
@pganssle pganssle merged commit 9829972 into master Apr 5, 2020
@pganssle pganssle deleted the c_extension branch May 21, 2020 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants