Skip to content
bjpop edited this page Sep 14, 2010 · 13 revisions

Why did you make berp?

  • For fun. I enjoy programming. Haskell is a pleasure to use and Python is a fun language to (try to) implement.
  • For learning. When I started berp I was just beginning to learn Python (because I had to teach it to first-year university students). I wanted a better understanding of how Python worked, so I tried to implement it.
  • For a challenge. It was not immediately obvious to me how much of Python I could implement by translation to Haskell. In many ways berp is an experiment to see just how far I can get.

What are the potential technical advantages of your approach?

Berp was written primarily for the reasons stated in the previous section. Therefore technical merit was less important when I started. However, there may just be some benefit in the approach I have taken. Here are some speculative reasons:

  • The GHC runtime system. GHC provides a very sophisticated runtime system. By translating to Haskell I get to use all of its great features for free. One of the big early wins is that I get a garbage collector (you don’t want to write those if you don’t have to). Later on I hope to take advantage of GHC’s concurrency features, such as lightweight threads and software transactional memory.
  • Monad transformers. Haskell’s monad transformers, particularly the continuation monad transformer ContT, make it relatively easy to implement the effectful features of Python. This means that berp can translate Python constructs into high-level Haskell constructs. This made the compiler very easy to write, which in turn has allowed me to play with some fun extensions like tail call optimisation and first-class continuations (callCC). The interaction of control flow operators in Python can get rather tricky, so it has been a great advantage to work with high-level Haskell code as output. Sometimes I think of berp as a very detailed study of the continuation monad.
  • Safety. Haskell’s type system is essentially a form of lightweight formal verification. Compilers are hard to get right, so it helps to implement your compiler in a type safe language. In future work I might see if I can make the compiler even safer by using more sophisticated aspects of the type system.

What are the potential technical disadvantages of your approach?

  • Incompleteness. Python is known as a dynamic language. Haskell is known as a static language. To some extent this difference is hard to reconcile. For example, Python supports an eval() primitive which allows you to compile code from a string and load it into a running program. This might be possible in Haskell with dynamic linking, but it is likely to be tricky. It is quite likely that there will be features of Python than cannot reasonably be implemented in the approach taken by berp.
  • Performance. This is not necessarily a problem with Haskell, but more with the way things are currently represented in berp. There are occasions in berp where the encoding of Python concepts is less than ideal in terms of efficiency. Sometimes the fact that berp can generate machine code makes up for the overheads, but what you win on the swings you sometimes lose on the roundabouts. At the moment it is far too early to tell how the performance is going to compare with CPython. It would be unwise to pay much attention to micro-benchmarks because the code is too immature.
  • Laziness. I like lazy evaluation in Haskell, but Python is a strict language, so there is little benefit in translating Python to a lazy language. Perhaps down the track it might come in handy, but, by-and-large, berp does not take advantage of it. There is a potential runtime cost to laziness that we’d like to avoid. At the moment there are probably some performance warts in berp which are caused by unwarranted laziness. I plan to investigate the matter further when I start tuning the performance. In some ways it might make better sense to translate Python to a strict functional language, such as OCaml, but I’ve chosen Haskell for other reasons (as noted above).