Beginner Contributor Guide Design of SymEngine

Ralf Stephan edited this page Dec 2, 2015 · 6 revisions

SymEngine is used from different languages

(created from some gitter messages by Ondřej Čertík)

The C++ SymEngine library doesn't care about SymPy objects at all. We are just trying to implement things in a maintainable way. Currently we have settled on a Basic, Mul, Pow, ... class hierarchy and implement most functionality using the visitor pattern or single dispatch, so Basic doesn't need many methods. We are keeping open the option of doing things differently if that turns out to be faster. Either way, this shouldn't matter at all for how the library is actually used from Python, Ruby or Julia.
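To make the "Basic doesn't need many methods" point concrete, here is a minimal sketch of the visitor pattern on a toy hierarchy. The classes below only borrow the names Basic, Symbol, Integer and Add for readability; they are simplified stand-ins, not SymEngine's real classes, and `Visitor`, `StrPrinter` and `print_expr` are hypothetical names made up for this example.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; NOT SymEngine's real classes.
struct Symbol;
struct Integer;
struct Add;

// One overload per concrete class; each new feature is a new Visitor.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Symbol &) = 0;
    virtual void visit(const Integer &) = 0;
    virtual void visit(const Add &) = 0;
};

struct Basic {
    virtual ~Basic() = default;
    // The only hook the base class needs for visitor-based features.
    virtual void accept(Visitor &v) const = 0;
};

struct Symbol : Basic {
    std::string name;
    explicit Symbol(std::string n) : name(std::move(n)) {}
    void accept(Visitor &v) const override { v.visit(*this); }
};

struct Integer : Basic {
    long value;
    explicit Integer(long v) : value(v) {}
    void accept(Visitor &v) const override { v.visit(*this); }
};

struct Add : Basic {
    std::vector<std::shared_ptr<const Basic>> terms;
    explicit Add(std::vector<std::shared_ptr<const Basic>> t)
        : terms(std::move(t)) {}
    void accept(Visitor &v) const override { v.visit(*this); }
};

// Printing lives entirely in a visitor; Basic gains no print() method.
struct StrPrinter : Visitor {
    std::string out;
    void visit(const Symbol &s) override { out += s.name; }
    void visit(const Integer &i) override { out += std::to_string(i.value); }
    void visit(const Add &a) override {
        for (std::size_t k = 0; k < a.terms.size(); ++k) {
            if (k > 0) out += " + ";
            a.terms[k]->accept(*this);  // recurse into each term
        }
    }
};

// Hypothetical convenience wrapper around the visitor.
std::string print_expr(const Basic &b) {
    StrPrinter p;
    b.accept(p);
    return p.out;
}
```

In this sketch, adding another operation (say, counting symbols) would mean writing another `Visitor` subclass, without touching `Basic` or its subclasses.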

Let's talk about just Python: the wrappers are in the symengine.py project. They are implemented using Cython and are free to introduce any kind of classes (including SymPy's Expr or Sage's Expression if needed). The point of the wrappers is to make sure that things work out of the box from SymPy and Sage. The only job of the C++ SymEngine library is to ensure that its C++ API is implemented in such a way that the wrappers can be written to do what they need. For example, we could easily introduce SymPy's Expr into the wrappers by simply adding the Expr class and making all the other classes subclass it instead of Basic.

That was the reason we split off the wrappers. Now, in the (pure) C++ symengine/symengine repository, we only have to worry about speed, correctness, maintainability and a usable API, and we can concentrate on these things without worrying about, or even testing, any kind of wrappers. In the wrappers (symengine/symengine.py, or .jl, .rb), we simply use the C++ (or C) API, and the only things we care about are that the (Python) wrapper can be used from SymPy/Sage (we test that in the test suite) and that it doesn't introduce unnecessary overhead in terms of speed. The Ruby and Julia wrappers likewise care about interoperability with other libraries in those languages.

Object creation and is_canonical()

(from source and PR comments by Ondřej Čertík)

Classes like Add, Mul, Pow are initialized through their constructors using their internal representation. Add and Mul have a coeff and a dict, while Pow has a base and an exp. There are restrictions on what coeff and dict can be (for example, coeff cannot be zero in Mul, and if Mul is used inside Add, then Mul's coeff must be one, etc.). All these restrictions are checked inside the constructors using the is_canonical() method when SYMENGINE_ASSERT is enabled. That way, you don't have to worry about creating Add/Mul/Pow with wrong arguments, as it will be caught by the tests. In Release mode no checks are done, so you can construct Add/Mul/Pow very quickly. The idea is that, depending on the algorithm, you sometimes know that things are already canonical, so you simply pass them directly to Add/Mul/Pow and avoid expensive type checking and canonicalization. At the same time, you need to make sure that the tests are still run with SYMENGINE_ASSERT enabled, so that Add/Mul/Pow are never in an inconsistent state.
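The pattern described above can be sketched roughly as follows. This is a toy model, not SymEngine's code: `ToyMul`, `TOY_ASSERT` and `TOY_WITH_ASSERT` are hypothetical names standing in for Mul and the SYMENGINE_ASSERT mechanism, and of the three rules in `is_canonical()` only "coef must be nonzero" comes from the text; the other two are assumptions added for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for SYMENGINE_ASSERT: the check is compiled in
// only when the (made-up) TOY_WITH_ASSERT macro is defined, e.g. for
// Debug or test builds. Otherwise it expands to nothing.
#ifdef TOY_WITH_ASSERT
#define TOY_ASSERT(cond) assert(cond)
#else
#define TOY_ASSERT(cond) ((void)0)
#endif

class ToyMul {
public:
    using Dict = std::map<std::string, long>;  // symbol name -> exponent

    // The constructor takes the internal representation directly and
    // trusts the caller; the check costs nothing when asserts are off.
    ToyMul(long coef, Dict dict) : coef_(coef), dict_(std::move(dict)) {
        TOY_ASSERT(is_canonical(coef_, dict_));
    }

    static bool is_canonical(long coef, const Dict &dict) {
        if (coef == 0) return false;     // from the text: coef is never zero
        if (dict.empty()) return false;  // assumption: a bare number is not a Mul
        for (const auto &kv : dict) {
            if (kv.second == 0) return false;  // assumption: x**0 is dropped
        }
        return true;
    }

    long coef() const { return coef_; }
    const Dict &dict() const { return dict_; }

private:
    long coef_;
    Dict dict_;
};
```

Building the test suite with the checking macro defined and the release build without it gives the split the paragraph describes: wrong arguments are caught in tests, while release construction is just two member initializations.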

The philosophy of SymEngine is that you impose as many restrictions in is_canonical() for each class as you can (and only check them in Debug mode), so that inside the class you can assume all those things and call faster algorithms. For example, in Rational you know the value is not an integer, so you don't need to worry about that special case; at the same time, if you have an integer, you are forced to use the Integer class, thus automatically using the faster integer-only algorithms. The idea is then to use what you know about the algorithm to construct the arguments of the SymEngine classes in canonical form and call the constructor without any checks.
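As an illustration of using knowledge about the algorithm, consider negating a rational number: if n/d is already in lowest terms with d > 1, then so is -n/d, so the result can be handed straight to the constructor with no gcd computation. The sketch below uses a hypothetical `ToyRational` class and `negate` helper; the exact invariants chosen (denominator greater than one, lowest terms) are assumptions for this example, not a statement of what SymEngine's Rational checks.

```cpp
#include <cassert>
#include <cstdlib>

// Toy class for illustration; not SymEngine's real Rational.
class ToyRational {
public:
    // Direct constructor: trusts the caller. The check is an assert only,
    // so it disappears when NDEBUG is defined (typical Release builds).
    ToyRational(long num, long den) : num_(num), den_(den) {
        assert(is_canonical(num, den));
    }

    // Assumed invariants: den > 1 (so the value is never an integer)
    // and the fraction is in lowest terms.
    static bool is_canonical(long num, long den) {
        return den > 1 && gcd(num, den) == 1;
    }

    // Thanks to the invariants this needs no computation at all.
    bool is_integer() const { return false; }

    long num() const { return num_; }
    long den() const { return den_; }

private:
    // Euclid's algorithm on absolute values.
    static long gcd(long a, long b) {
        a = std::labs(a);
        b = std::labs(b);
        while (b != 0) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    long num_;
    long den_;
};

// Negation preserves canonical form, so we construct directly and skip
// any normalization step.
inline ToyRational negate(const ToyRational &r) {
    return ToyRational(-r.num(), r.den());
}
```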

For cases where you can't or don't want to bother constructing in canonical form, we provide high-level functions like add, mul, pow, rational, to which you pass arguments that are not necessarily in canonical form; these functions check and simplify. E.g. add(x, x) will check and simplify to Mul(2, x), i.e. you never have the instance Add(x, x). In the same spirit, rational(2, 1) will check and convert to Integer(2), i.e. you never have Rational(2, 1).
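The add(x, x) case can be mimicked with a very small toy model. The function below is not SymEngine's add(); `toy_add`, `ToySum` and `Kind` are hypothetical names, and the model is restricted to sums of plain symbols. It collects like terms first and only then decides which class the result would belong to, which is the "check and simplify" step the paragraph describes.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model for illustration only; not SymEngine's real add().
enum class Kind { Symbol, Mul, Add };

struct ToySum {
    Kind kind;                          // class the canonical result would have
    std::map<std::string, long> terms;  // symbol name -> integer coefficient
};

// Accepts symbols in any order, with repeats, and canonicalizes.
ToySum toy_add(const std::vector<std::string> &symbols) {
    if (symbols.empty()) {
        throw std::invalid_argument("toy_add needs at least one symbol");
    }
    std::map<std::string, long> terms;
    for (const auto &s : symbols) {
        ++terms[s];  // collect like terms: x + x becomes {x: 2}
    }
    Kind kind = Kind::Add;  // two or more distinct symbols stay a sum
    if (terms.size() == 1) {
        // A single distinct symbol is never an Add: it is either the
        // symbol itself (coefficient 1) or coefficient * symbol.
        kind = (terms.begin()->second == 1) ? Kind::Symbol : Kind::Mul;
    }
    return ToySum{kind, terms};
}
```

With this sketch, `toy_add({"x", "x"})` reports `Kind::Mul` with coefficient 2 for `x`, mirroring how add(x, x) yields Mul(2, x) rather than Add(x, x).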

Summary: always try to construct objects directly using their constructors and all the knowledge that you have for the given algorithm; that way things will be very fast. If you want slower but simpler code, you can use the add(), mul(), pow(), rational() functions, which perform general and possibly slow canonicalization first.