GSoC 2016 Application : Haskell Bindings to SymPy Engine

Siddharth edited this page Mar 24, 2016 · 6 revisions
Clone this wiki locally

Title

Create Haskell bindings for SymEngine by extending and wrapping the C FFI layer of SymEngine.

About Me

Personal Information

Name: Siddharth Bhat

Email: siddu.druid@gmail.com

University: Manipal Institute of Technology, 4th semester, Computer Science Undergraduate

IRC nick: bollu, on #math, #haskell, #diagrams, #rust at freenode

Github username: bollu

Other contact methods: Skype - druidofdclaw

Country of residence: India

Timezone: IST (GMT +5:30)

Primary language: English

Short Bio

I'm 19 years old, going to turn 20 on 10th July.

I enjoy math and programming immensely, math more than programming a lot of the time. I like the elegance of pure math, and I try to find programming languages that reflect that. I'd say Haskell, Rust and Python achieve that state of zen in wildly different ways.

I play the piano as a hobby, and I've been trying to pick up the guitar. Cooking is something else that I love to do as well, though I'm not great at it.

Me as a Programmer

Operating Systems

I'm currently using Mac OS X, but I used to run ArchLinux. I still own Kali Linux and Windows 10, though they're not my daily drivers.

Text Editors

I switch between a bunch of them depending on what I'm doing:

  • If it's Python, then I generally use PyCharm / Sublime Text

  • If it's C++, then Vim with YouCompleteMe.

  • HTML/CSS/Javascript is Sublime Text and WebStorm

  • General text editing is again mostly Vim

Tooling

I contribute to a decent number of projects on Github, so git is something I use everyday.

As for shells, I use zsh with oh-my-zsh.

I prefer ag(silver searcher) over grep due to performance.

For one-off shell scripts, I tend to write Python scripts, or use Haskell's turtle library for "shell" scripting.

Stuff I've contributed to

PSSSPP - A PSP Emulator

Link to commits

I wrote a decent amount of code for PPSSPP, mainly related to the touch screen controls and

VisPy

Link to commits

I programmed for VisPy during GSoC 2015 (and I still continue to work on it). My project was to rewrite a large chunk of code with one of the mentors to improve performance and provide flexibility to future versions of VisPy, and to integrate Cassowary into VisPy for nice-looking plots.

Haskell ecosystem

I have random commits into different Haskell projects. None of them are large enough to write about, but they're small housekeeping stuff I took up while learning the language.

I also hang out on the Haskell IRC as bollu to answer questions and learn stuff.

Rust ecosystem

I have contributed to the Rust ecosystem here-and-there, reporting compiler bugs and sending PR's to a couple of Rust packages.

I was also a member of Piston Developers, a group of Rust programmers who are trying to experiment with different game engine architectures in Rust.

Python and me.

I like the terse and yet expressive nature of python, and I dislike it for the exact same reason.

Python has brilliantly written libraries, and feels very natural to use in a REPL for things like data exploration and web programming (and attacking).

One of the coolest things I've done in Python was to probably SQL inject databases during Microsoft's Build the Shield competition to access a 20-character long password.

Python's ability for higher-order-functions and the awesome requests library saved the day.

Code for SQL injection

Most advanced feature I've used

I've probably used most Python features here-and-there:

decorators, generators, even creating objects on-the-fly, and other weird things. I prefer to stick to simple code though, since it aids comprehension for everyone involved.

Some of the most advanced standard library features might be stuff like multiprocessing to parallelise an attack on a particular server, and pickle to serialise data.

Cool python trick I like:

# transpose a 2-d list with a zip

x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
x_transpose = list(zip(*x))

Stuff I dislike about python

Python sometimes has "too much magic", and I've submitted pull requests to actually reduce the amount of action-at-a-distance that Python allows.

Example PR to reduce magic: don't walk the module tree to create custom objects at import time.

The lack of types is a huge pain point as well, and is the reason I've been gravitating towards languages with strong type systems like Rust and Haskell in the recent past.

Favourite Sympy feature

[TODO: Need to fill this up]

Stuff I've created

I've written a sublime text plugin called SublimeBookmarks to manage bookmarks in Sublime Text.

There's another tool called teleport, to manage directories of projects.

Unfortunately, this one is much rougher around the edges both due to Haskell's own limitations in distribution, and due to my schedule that didn't let me polish it. I do plan on coming back to this and showing it some love over the summer.

I've started many toy language projects, though none of them are complete. In no real order:

  • A Lisp interpreter in C

  • Achilles - Rust implementation a toy programming language of my own called "achilles". It was going to be functional, with heavy inspiration from Haskell. Rust's closure + lifetime interaction problems during beta made me give this one up.

  • cellular Automata - A collection of cellular automata written in Haskell. This is written so it's more of a library than an executable, though it does contain sample cellular automata. The goal was to make pretty GIF's with a neat API (using commands).

Me and my project

The goal of the project is to create well formed Haskell bindings from SymPyEngine to Haskell. As of now, SymPyEngine is a subset of SymPy (in terms of feature set), and does not have SymPy running on it yet, though that is in the roadmap.

Github issue for migration.

SymPyEngine's existence is crucial because it allows for a C++ compiled codebase for symbolic computation, which is much faster than Python, and is also much more portable (most languages have C/C++ FFI bindings).

Why this project?

I'd love to do this for two reasons:

  1. I like math, and I'm sure that by spelunking the SymPyEngine codebase, I'll pick up some amount of math / algorithms / optimization lore by osmosis. So that makes the project "fun" as a whole

  2. I'd get to push Haskell's story of symbolic mathematics, something that is somewhat lacking right now as agreed upon by the Haskell community. SymPy bindings would be huge since the toolkit is mature, and would encourage a lot of interesting code in Haskell.

I have some selfish interest in this as well - I wanted to create really slick visualisations for my blog in relation to Group Theory (since I find the subject beautiful). Haskell has a brilliant library for visualisations called diagrams.

However, it lacks a proper library for symbolic math. Getting this into Haskell would mean cooler libraries to use and better data visualisation for me :)

Qualifications

I've written a decent amount of Haskell, C, and Python code in my past. This project needs all 3 of those languages - C for the FFI, Haskell for the Haskell bindings, Python to look at the Cython wrappers and read SymPy code in general.

I also understand pure math (I've taken analysis, algebra and topology, though I know more algebra than the other two branches of math), so I won't get "lost" in the docs so to speak.

Implementation Details

Stuff to Do

Approaches for implementation

Type Heirarchy

Operator Overloading for Symbol

New operators

One way to implement "operator overloading" is to simply construct new operators.

One of the neatest approaches is to create pairs of operators like so:

(.+) :: Num a -> Symbol -> Symbol
(+.) :: Num a -> Symbol -> Symbol

that allow adding a Num instance and a Symbol instance together.

Custom Typeclasses

Another design approach is to have a custom typeclass that can be added together using a new custom operator, say <+>.

class SymbolOperations where
    <+> :: SymbolOperations a -> SymbolOperations a -> SymbolOperations a

instance (Num a) => SymbolOperations a where
    ...

instance SymbolOperations Symbol where
    ...

this approach is "neater" in the sense that it just needs one operator, but the typeclass feels like a hack since it doesn't really impose any laws. I'm okay with both approaches.

Timeline

Past work in the same area

I'll be dedicating 8 hours a day, hopefully every day. It'll probably be a little less during weekends, but not a drastic drop in productivity. That brings it to ~50 hours a week, give or take.

As far as I am aware, This has not been done in the past, so there is no real "reference" implementation to go to.

Google's rough timeline

  • 23 May: Students begin coding for their Google Summer of Code projects; Google begins issuing initial student payments provided tax forms are on file and students are in good standing with their communities. Work Period Mentors give students a helping hand and guidance on their projects.

  • 20 June 19:00 UTC: Mentors and students can begin submitting mid-term evaluations.

  • 27 June 19:00 UTC: Mid-term evaluations deadline; Google begins issuing mid-term student payments provided passing student survey is on file. Work Period Mentors give students a helping hand and guidance on their projects.

  • 15 August - 23 August 19:00 UTC Final week: Students tidy code, write tests, improve documentation and submit their code sample. Students also submit their final mentor evaluation.

My timeline breakdown

Week 1, 2 - [23 May to 6 June]
  • Work on finishing simple operations supported currently by crwapper.h which is the basic C FFI layer to the C++ codebase.

  • Setup QuickCheck checking for the Haskell codebase, which provides automated testing given invariants. This ensures that there is a proper test framework in place to prevent regressions.

Week 3, 4 - [6 June to 20 June]
  • Export symbols from symbol.h which represents basic symbolic manipuation.

  • Export Integers from integer.h.

  • Export Rationals form rational.h.

  • Export Complex numbers from complex.h

  • Exporting these types will allow us to have the basic numeric tower for math computation.

  • Export trignometric, hyperbolic trig functions from functions.h submodule of SymEngine for trig support.

  • Export basic functions (max, abs, etc.) from functions.h.

Week 5 - [20 June to 27 June] Mid term Evaluation prep
  • Export special functions (Riemann Zeta, Dirichlet Eta, Gamma, Upper Gamma, Levi-Civita, etc) functions from functions.h

  • Push version 0.2.0.0 of the package version to Hackage, which is the Haskell community's open source package repository.

  • Since the package will be decently feature-complete, get the package onto Stackage, which is where stable, community-curated packages live.

27 June - Mid term Evaluation

Week 6, 7 - [27 June to 11 July]
  • Lambdify implementation for Haskell (convert from SymPy expressions to a proper Haskell function that can be used for computation). This will be restricted to simple functions as a starting point. (trig, basic functions, special functions will work.)
Week 8, 9 - [11 July to 25 July]
  • Implement numerical evaluation functions from symengine to Haskell
Week 10, 11 - [25 July - 8 August]
  • Implement polynomials FFI into cwrapper.h from polynomial.h.

  • Create Haskell bindings for polynomial.h.

  • Implement Matrix FFI from matrix.h.

  • Create Haskell bindings for matrix.h.

  • Write tests cases (for both the C FFI side and the Haskell side), library documentation for the above classes.

Week 12, 13 - [8 August - 22 August] Final Evaluation prep

23 August - Final Evaluation

Stretch Goals