enhancements division

robertwb edited this page Nov 9, 2009 · 15 revisions

CEP 516 - Division Semantics

Abstract

Python and C have different semantics for division of signed integers. Python rounds the quotient towards negative infinity, while C (in the C99 standard, and by convention before that) rounds towards 0. Thus -1 % 5 yields 4 in Python and -1 in C. The question is how Cython should handle this for cdef integers (including literals). The current implementation follows C; this proposal suggests changing the semantics to follow Python.

Both the division and modulo operators would be affected, as we wish to maintain (a//b) * b + (a % b) == a.
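
As a rough illustration of the two conventions, the following pure-Python sketch emulates C's truncating behavior and checks that the identity holds under both. The helper names c_div and c_mod are made up for this example and are not part of Cython.

```python
def c_div(a, b):
    """Truncating division: round the quotient towards zero, as C99 does."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def c_mod(a, b):
    """Remainder paired with c_div; it takes the sign of the dividend."""
    return a - c_div(a, b) * b


for a, b in [(-1, 5), (7, -2), (-7, -2), (7, 2)]:
    # Each convention is self-consistent: quotient * b + remainder == a.
    assert (a // b) * b + (a % b) == a
    assert c_div(a, b) * b + c_mod(a, b) == a
    print(a, b, "Python:", (a // b, a % b), "C:", (c_div(a, b), c_mod(a, b)))
```

The results differ only when the operands have opposite signs and the division is not exact.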

We will also test for division by zero, raising a ZeroDivisionError in that case rather than aborting.

Pros

  • More natural migration from Python (both users and code)
  • Fits better with the philosophy that "Cython is compiled Python"
  • Can prevent hard-to-find bugs when incrementally moving code from Python into C space
  • Keeping different behavior for C and Python types will be harder once we have type inference

Cons

  • Slower (20-30% according to some timings)
  • Surprising to C users
  • Backwards incompatible with existing code
  • Using 100% cdef variables would no longer mean 100% "stepping into" C
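
To give a feel for where the extra cost in the first con comes from, here is a minimal pure-Python sketch of deriving Python-style results from a C-style division: one test for a zero divisor, then a sign correction. This is an assumed illustration, not the code Cython actually generates, and the names c_divmod and floor_divmod are hypothetical.

```python
def c_divmod(a, b):
    """Stand-in for C's / and %: quotient truncated towards zero."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return q, a - q * b


def floor_divmod(a, b):
    """Python-style (floor) quotient and remainder built on c_divmod."""
    if b == 0:
        # Extra check: raise instead of hitting C's undefined behavior.
        raise ZeroDivisionError("integer division or modulo by zero")
    q, r = c_divmod(a, b)
    # Extra correction: a nonzero remainder must share the divisor's sign.
    if r != 0 and (r < 0) != (b < 0):
        q -= 1
        r += b
    return q, r


print(floor_divmod(-1, 5), divmod(-1, 5))
print(floor_divmod(7, -2), divmod(7, -2))
```

The check and the correction are cheap individually, but they run on every division, which is consistent with a measurable slowdown in tight loops.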

Obtaining the original behavior

Because of speed considerations, existing code, and people who prefer or need the original semantics, it is imperative we provide a way to get the original behavior. Some alternatives:

  • Do nothing; a sophisticated user can define their own macro.
    • Pro: this would only affect C users, who can usually be expected to know how to do this
    • Con: performance is a major reason for many users to want C semantics, so this may lead to a multiplication of effort
  • Define a new operator like a %% b or a %- b.
    • Pro: explicit code semantics; similar to Python's //
    • Con: operator may become available in Python; users may expect a certain behaviour if they know the operator from a different background; if %% is used, what should truncating division be called (///?)
  • Modified operator a c(%) b or a c_op(%) b.
    • Pro: short; reuses known operator
    • Con: rather unusual syntax (function call on an operator?)
  • Emulate a method a.cmod(b)
    • Pro: Python syntax; explicit; readable; similar to infix operator
    • Con: not quite as clear as an operator (what is divided by what?)
  • Functions cython.cmod and cython.cdiv
    • Pro: similar to the above, explicit, uses real import and clear code
    • Con: rather long; prefix function instead of infix operator
  • Use a compiler directive like cython.cdivision which would affect an entire block
    • Pro: relatively constrained impact; easy to adapt existing code; one often wants the same semantics for an entire block
    • Con: may easily be overlooked in non-trivial blocks (e.g. longer loops)
  • Create a new set of types cint, clong, ... which have the different division semantics
    • Pro: explicit
    • Con: duplicates a large part of the type system to fix one use case; more to learn; semantics and reasoning may not be obvious to new users (when to use what?); doesn't immediately indicate it has anything to do with division
  • A from __python__ import idiv, imod (and letting the default be C-style)
    • Pro: explicit, mimics __future__ imports
    • Con: only works in Cython files; people might confuse this with a real import (as __future__ is really special and rather rare in code)
  • Looking at the file extension, .py files would behave as in Python, .pyx files as in C.
    • Pro: changing the file extension changes the language
    • Con: may lead to hidden (untested) bugs when switching file types or copying code while still using the same compiler; still requires special syntax to use the "other" definition

Final Decision and Status

This CEP will go into place in Cython 0.12, with a warning beforehand.

A compiler directive, cython.cdivision, and two special functions, cython.cdiv and cython.cmod, are implemented. Runtime warnings can be enabled with the compiler directive cython.cdivision_warnings.
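
The idea behind a runtime warning can be sketched in pure Python as follows. The helper cmod_checked is hypothetical, and the condition used here (warn whenever the operands have opposite signs) is an assumption about when such a warning is useful, not a statement of what Cython emits.

```python
import warnings


def cmod_checked(a, b):
    """Hypothetical helper: C-style remainder that flags operand pairs
    for which C and Python semantics can differ."""
    if (a < 0) != (b < 0):
        # Assumption: oppositely signed operands are the risky case.
        warnings.warn(
            "division with oppositely signed operands, "
            "C and Python semantics differ",
            RuntimeWarning,
            stacklevel=2,
        )
    r = abs(a) % abs(b)        # magnitude of the truncated remainder
    return -r if a < 0 else r  # C: the remainder takes the dividend's sign


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(cmod_checked(7, 3))   # same signs: no warning
    print(cmod_checked(-1, 5))  # opposite signs: one warning recorded
    print(len(caught))
```

A warning of this kind makes it easier to find the call sites in existing code whose results would change under the new semantics.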

Python division seems to incur a 30-40% speed regression (the timings below were taken on OS X with an Intel Core Duo, but should be representative of x86 at least).

attachment:time_mod.pyx

sage: import time_mod
sage: time time_mod.mod_c(2, 11, 10^8)
CPU times: user 1.68 s, sys: 0.01 s, total: 1.69 s
Wall time: 1.73 s
0
sage: time time_mod.mod_py(2, 11, 10^8)
CPU times: user 2.31 s, sys: 0.01 s, total: 2.31 s
Wall time: 2.33 s
11
sage: 2.33 / 1.73
1.34682080924855