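In this post I will describe, to the best of my knowledge, how suitable Haskell is for a variety of programming domains and tasks. The purpose of this post is to discuss both the good and the bad: advertising where Haskell shines and highlighting where I believe there is room for improvement.
This post is divided into two parts. The first part covers Haskell's suitability for particular application domains, such as servers, games, or data science. The second part covers Haskell's suitability for common programming needs, such as testing, IDEs, or concurrency.
The topics are sorted roughly from greatest strengths to greatest weaknesses. Each programming area is summarized by one of the following ratings:
- Best in class: the best experience in any language
- Mature: suitable for most programmers
- Immature: only acceptable for early adopters
- Bad: pretty much unusable
For the more positive ratings I will list success stories. For the more negative ratings I will offer constructive advice on how to improve things.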
Disclaimer #1: I obviously don't know everything about the Haskell ecosystem, so whenever I am unsure I will make a ballpark guess and clearly state my uncertainty in order to solicit opinions from others who have more experience. I keep tabs on the Haskell ecosystem pretty well, but even this post is stretching my knowledge. If you believe any of my ratings are incorrect, I am more than happy to accept corrections (both upwards and downwards)
Disclaimer #2: There are some "Educational resource" sections below which are remarkably devoid of books, since I am not as familiar with textbook-related resources. If you have suggestions for textbooks to add, please let me know.
Disclaimer #3: I am very obviously a Haskell fanboy if you haven't guessed from the name of my blog and I am also an author of several libraries mentioned below, so I'm highly biased. I've made a sincere effort to honestly appraise the language, but please challenge my ratings if you believe that my bias is blinding me! I've also clearly marked Haskell sales pitches as "Propaganda" in my external link sections. :)
Disclaimer #4: I've contributed the majority of these recommendations and I also play an editorial role. This means that although some contributions have been crowd-sourced I reserve the right to decline a pull request or edit/delete content if I feel that a resource is abandoned or if I feel there are better alternatives already listed. I try to be as fair as possible and if you disagree with any decision of mine or you feel that my recommendation does not reflect the consensus of the Haskell community you can challenge my decision by opening an issue and I will either defend my decision or change my mind.
- Application domains
- Common Programming Needs
- Maintenance
- Single-machine Concurrency
- Types / Type-driven development
- Parsing / Pretty-printing
- Domain-specific languages (DSLs)
- Testing
- Data structures and algorithms
- Benchmarking
- Unicode
- Stream programming
- Serialization / Deserialization
- Support for file formats
- Package management
- Logging
- Education
- Databases and data stores
- Debugging
- Cross-platform support
- Hot code loading
- IDE support
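Rating: Best in class
Haskell is a dream language for writing compilers. If you are writing a compiler in another language, you should genuinely consider switching to Haskell.
Haskell originated in academia, and like most languages of academic origin (such as the ML family) it excels at compiler-related tasks. One obvious reason is its rich ecosystem of libraries for compiler-related tasks, such as parsing, pretty-printing, bound variables, syntax tree manipulation, and optimization.
Anybody who has written a compiler knows how difficult it is to manipulate weakly typed data structures. As a result, there is a huge margin for error in everything a compiler does, from type-checking to optimization to code generation. Haskell heads off much of this, because its powerful type system, with its many extensions, can eliminate large classes of errors at compile time.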
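To make the point about strongly typed syntax trees concrete, here is a minimal sketch of a tiny expression language modelled as an algebraic data type. The names `Expr`, `eval`, and `simplify` are hypothetical and exist only for this illustration; they do not come from any of the libraries listed in this section.

```haskell
-- A tiny expression language. Every constructor and function name
-- here is hypothetical and exists only for this illustration.
data Expr
  = Lit Int
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- Evaluate an expression. With warnings enabled (-Wall), GHC reports
-- any constructor that a function like this forgets to handle.
eval :: Expr -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b
eval (Mul a b) = eval a * eval b

-- A toy optimization pass: drop additions of 0 and multiplications by 1.
simplify :: Expr -> Expr
simplify (Add a b) =
  case (simplify a, simplify b) of
    (Lit 0, b') -> b'
    (a', Lit 0) -> a'
    (a', b')    -> Add a' b'
simplify (Mul a b) =
  case (simplify a, simplify b) of
    (Lit 1, b') -> b'
    (a', Lit 1) -> a'
    (a', b')    -> Mul a' b'
simplify e = e
```

Because the tree is a closed data type rather than a map of strings, a pass such as `simplify` cannot build a malformed node, and adding a constructor later makes the compiler point at every function that needs updating.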
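I also believe there are many excellent educational resources for compiler writers, both papers and books. I'm not the best person to summarize them all, but I will list the high-quality ones that I have read.
Finally, there are a large number of parsers and pretty-printers for other languages that you can use.
Notable libraries:
- parsec / megaparsec / attoparsec / trifecta / alex + happy - parsing libraries
- bound / unbound - manipulating bound variables
- hoopl - optimization
- wl-pprint / ansi-wl-pprint - pretty-printing
- llvm-general - LLVM API
- language-{ecmascript|python|c-quote|lua|java|objc|cil} - parsers and pretty-printers for other languages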
Some compilers written in Haskell:
- Elm
- Purescript
- Idris
- Agda
- Pugs (the first Perl 6 implementation)
- oden
- ghc (self-hosting)
- frege (very similar to Haskell, also self-hosting)
- hython (a Python3 interpreter written in Haskell)
Educational resources:
- Write you a Haskell
- A Tutorial Implementation of a Dependently Typed Lambda Calculus
- Binders Unbound
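Propaganda:
Rating: Mature
Haskell's second biggest strength is the back end, for both web applications and services. The main features that Haskell brings to the table are:
- Server stability
- Performance
- Ease of concurrent programming
- Excellent support for web standards
The strong type system and polished runtime greatly improve server stability and simplify maintenance. This is what most differentiates Haskell from other back-end languages, because it significantly reduces the total cost of ownership. You should expect to maintain Haskell-based services with fewer programmers than you would need in other languages, even other statically typed languages.
However, the greatest weakness in Haskell's server stability is space leaks. The most common solution I know of is to use ekg (a process monitor) to check a server's memory stability before deploying to production. The second most common solution is to learn from experience how to detect and prevent space leaks, which is not as hard as people think.
Haskell's performance is excellent and comparable to Java's. Both languages give roughly the same performance in the hands of beginners or experts. Where Haskell shines is its runtime support for the following features:
- Software transactional memory (which differentiates Haskell from Go)
- Lightweight threads with non-blocking I/O (which differentiates Haskell from the JVM)
- Garbage collection (which differentiates Haskell from Rust)
If you have never tried Haskell's software transactional memory, you really should, because it eliminates a large number of concurrency logic bugs. STM is the most underestimated feature of the Haskell runtime.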
Notable libraries:
- warp / wai - the low-level server and API that all server libraries share, with the exception of snap
- scotty - A beginner-friendly server framework analogous to Ruby's Sinatra
- spock - Lighter than the "enterprise" frameworks, but more featureful than scotty (type-safe routing, sessions, conn pooling, csrf protection, authentication, etc)
- yesod / yesod-* / snap / snap-* / happstack-server / happstack-* - "Enterprise" server frameworks with all the bells and whistles
- servant / servant-* - Library for type-safe REST servers and clients that might blow your mind
- authenticate / authenticate-* - Shared authentication libraries
- ekg / ekg-* - Haskell service monitoring
- stm - Software-transactional memory
- lucid - Haskell DSL for building HTML
- mustache / karver - Templating libraries
Some web sites and services powered by Haskell:
- Facebook's spam filter: Sigma
- IMVU's REST API
- Utrecht's bicycle parking guidance system
- elm-lang.org
- glot.io
- The Perry Bible Fellowship
- Silk
- Shellcheck
- instantwatcher.com
- markup.rocks
Propaganda:
- Fighting spam with Haskell - Haskell in production, at scale, at Facebook
- IMVU Engineering - What it's like to use Haskell
- Haskell-based Bicycle Parking Guidance System in Utrecht
- Mio: A High-Performance Multicore IO Manager for GHC
- The Performance of Open Source Applications - Warp
- Optimising Garbage Collection Overhead in Sigma
- instantwatcher.com author comments on rewrite from Ruby to Haskell - [1] [2]
- A lot of websockets in Haskell - A load test showing that a Haskell server can handle 500K connections in 10 GB of memory. The load tester requires more resources than the server
Educational resources:
- Making a Website With Haskell
- Beautiful concurrency - a software-transactional memory tutorial
- The Yesod book
- The Servant tutorial
- Overview of Happstack
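Rating: Mature
Haskell's biggest advantage as a scripting language is that it is the most widely adopted language with global type inference. Many languages support local type inference (such as Rust, Go, Java, C#), which means that function argument types and interfaces must be declared explicitly while everything else can be inferred. In Haskell you can omit all type declarations: every type and interface can be inferred by the compiler.
Global type inference gives Haskell the feel of a scripting language while still providing the safety of static type checking. Type safety matters in particular in enterprise environments, where glue scripts running with elevated privileges are one of the weakest links.
The second benefit of Haskell's type safety is script maintenance. Many scripts written in dynamically typed languages become very difficult to maintain once they exceed 1000 LOC, and people rarely have time to write tests that cover every code path. Having a strong type system is like getting auto-generated tests for free, and the type system holds up under refactoring better than tests do.
However, the main reason I recommend Haskell is that it is also very easy to write one-off scripts in it. These Haskell scripts compare well with the equivalent Bash or Python scripts in size and simplicity, which lets you accomplish more with less code.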
Haskell has one advantage over many dynamic scripting languages, which is that Haskell can be compiled into a native and statically linked binary for distribution to others.
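As a rough illustration of what a small script looks like with no type signatures at all, here is a sketch that numbers the non-empty lines of its input and upper-cases them. The names `numberLines`, `render`, and `script` are hypothetical; only `base` is assumed.

```haskell
import Data.Char (toUpper)

-- No type signatures anywhere: GHC infers all of them.
-- (All names below are hypothetical, chosen for this illustration.)

-- Keep the non-empty lines of some text and pair each with a number.
numberLines text = zip [1 ..] (filter (not . null) (lines text))

-- Render one numbered line, for example (1, "hello") as "1: HELLO".
render (n, line) = show n ++ ": " ++ map toUpper line

-- The whole "script": read stdin, transform it, write stdout.
script = interact (unlines . map render . numberLines)
```

Binding `main = script` in a file turns this into a complete program that can be interpreted with `runghc` or compiled with `ghc`; the compiler still checks every line even though no types were written down.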
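Haskell's scripting libraries aim to provide the features you would expect from scripting in Python or Ruby, including:
- A rich suite of Unix-like utilities
- Advanced sub-process management
- POSIX support
- Lightweight idioms for exception handling and automatic resource disposal
Notable libraries:
- shelly / turtle - scripting libraries (Full disclosure: I authored turtle)
- optparse-applicative / cmdargs - command-line argument parsing
- haskeline - a complete Haskell implementation of readline for console building
- process - low-level library for sub-process management
Some scripting tools written in Haskell:
Educational resources:
Rating: Immature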
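Haskell's numerical programming story is just getting started and has room to improve. My main experience in this area comes from bioinformatics work a few years ago that involved a lot of vector and matrix computation, and my rating is largely colored by that experience. The biggest issues the ecosystem faces are:
- Very few matrix library APIs
- Fragile rewrite-rule-based optimizations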
When the optimizations work they are amazing and produce code competitive with C. However, small changes to your code can cause the optimizations to suddenly not trigger and then performance drops off a cliff.
There is one Haskell library that avoids this problem entirely which I believe
holds a lot of promise: accelerate
generates LLVM and CUDA code at runtime
and does not rely on Haskell's optimizer for code generation, which side-steps
the problem. accelerate
has a large set of supported algorithms that you
can find by just checking the library's reverse dependencies:
However, I don't have enough experience with accelerate
or enough familiarity
with numerical programming success stories in Haskell to vouch for this just
yet. If somebody has more experience than me in this regard and can provide
evidence that the ecosystem is mature then I might consider revising my rating
upward.
Notable libraries:
- accelerate / accelerate-* - GPU programming
- vector - high-performance arrays
- repa / repa-* - parallel shape-polymorphic arrays
- hmatrix / hmatrix-* - Haskell's BLAS / LAPACK wrapper
- ad - automatic differentiation
Propaganda:
- Exploiting vector instructions with generalized stream fusion
- Type-safe Runtime Code Generation: Accelerate to LLVM
Educational Resources:
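Rating: Immature
This boils down to Haskell's ability to compile to Javascript. ghcjs is the main tool for this, but setting up ghcjs has been non-trivial. Now that the toolchain supports ghcjs, you can very easily set up a new ghcjs project by following these instructions:
One big difference between ghcjs and other Haskell-to-Javascript compilers is that many Haskell libraries work out of the box, because ghcjs supports most ghc primitive operations.
It is also worth mentioning two Haskell-like languages that you should try for front-end programming: elm and purescript. Both are used in production and each has active maintainers and a community of its own. purescript is the one most similar to Haskell.
Areas for improvement:
- There needs to be a clear story for integrating with existing Javascript projects
- There need to be more educational resources aimed at non-experts that explain how to translate existing front-end code to Haskell
- There need to be several well-maintained and polished Haskell libraries for front-end programming
- The whole ghcjs ecosystem needs much more documentation; there isn't even a basic tutorial on how to use ghcjs
Notable Haskell-to-Javascript compilers:
Notable libraries: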
- reflex / reflex-dom - Functional reactive programming library for the front end
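Rating: Immature
I use this heading to cover both distributed computation and distributed services. For distributed service architectures, Haskell is keeping pace with its peers thanks to a range of service toolkit libraries, but for distributed computation Haskell still lags behind. A lot of work has gone into replicating Erlang-like functionality in Haskell through the Cloud Haskell project, not only in creating the low-level primitives for code distribution, networking, and transport, but also in assembling a Haskell analog of Erlang's OTP. Work on the higher-level libraries has stopped, but the low-level libraries are still useful for distributed computation.
Areas for improvement:
- We need more analytics libraries. Haskell has no analog of scalding or spark. The most we have is just a Haskell wrapper around hadoop
- We need a polished consensus library (i.e. a high quality Raft implementation in Haskell)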
Notable libraries:
- glue-core / glue-ekg / glue-example - Service toolkit supporting
- haxl - Facebook library for efficient batching and scheduling of concurrent data access
- distributed-process / distributed-process-* - Haskell analog to Erlang
- hadron - Haskell wrapper around hadoop
- amazonka / amazonka-* - Auto-generated bindings to the entire Amazon Web Services SDK
Rating: Immature
All Haskell GUI libraries are wrappers around toolkits written in other
languages (such as GTK+ or Qt). The last time I checked the gtk
bindings
were the most comprehensive, best maintained, and had the best documentation.
The reason for the "Immature" rating is that there still isn't a Haskell binding to a widget toolkit that doesn't have some sort of setup issues with the toolkit.
However, the Haskell bindings to GTK+ have a strongly imperative feel to them.
The way you do everything is communicating between callbacks by mutating
IORef
s. Also, you can't take extensive advantage of Haskell's awesome
threading features because the GTK+ runtime is picky about what needs to happen
on certain threads. I haven't really seen a Haskell library that takes this
imperative GTK+ interface and wraps it in a more idiomatic Haskell API.
My impression is that most Haskell programmers interested in applications programming have collectively decided to concentrate their efforts on improving Haskell web applications instead of standalone GUI applications. Honestly, that's probably the right decision in the long run.
Another post that goes into more detail about this topic is this post written by Keera Studios:
Areas for improvement:
- A GUI toolkit binding that is maintained, comprehensive, and easy to use
- Polished GUI interface builders
Notable libraries:
- gtk / glib / cairo / pango - The GTK+ suite of libraries
- wx - wxWidgets bindings
- X11 - X11 bindings
- threepenny-gui - Framework for local apps that use the web browser as the interface
- hsqml - A Haskell binding for Qt Quick, a cross-platform framework for creating graphical user interfaces.
- fltkhs - A Haskell binding to FLTK. Easy install/use, cross-platform, self-contained executables.
- FregeFX - Frege bindings to Java FX (Frege is essentially the Haskell for the JVM)
- typed-spreadsheet - Library for building composable interactive forms
Some example applications:
Educational resources:
- Haskell port of the GTK tutorial
- Building pragmatic user interfaces in Haskell with HsQML
- FLTK GUIs, including support for the Fluid visual interface builder
Rating: Immature? (Uncertain)
Native Haskell implementations in this area have been pioneered almost single-handedly
by one person: Mike Izbicki. He maintains the HLearn
suite of libraries for machine
learning in Haskell.
If you would like to learn more about this area the best place to begin is the
Github page for the HLearn
project:
Tweag.io has released Sparkle, a Haskell integration with Spark. This
enables the use of MLlib from Haskell. MLlib is widely used in the industry
for machine learning. Sparkle itself is fairly new.
Notable libraries:
Rating: Immature
Haskell really lags behind Python and R in this area. Haskell is somewhat usable for data science, but probably not ready for expert use under deadline pressure.
I'll primarily compare Haskell to Python since that's the data science
ecosystem that I'm more familiar with. Specifically, I'll compare to the
scipy
suite of libraries:
The Haskell analog of NumPy
is the hmatrix
library, which provides Haskell
bindings to BLAS, LAPACK. hmatrix
's main limitation is that the API is a bit
clunky, but all the tools are there.
Haskell's charting story is okay. Probably my main criticism of most charting APIs is that their APIs tend to be large, the types are a bit complex, and they have a very large number of dependencies.
Fortunately, Haskell does integrate into IPython so you can use Haskell within an IPython shell or an online notebook. For example, there is an online "IHaskell" notebook that you can use right now located here:
- IHaskell notebook - Click on "Welcome to Haskell.ipynb"
If you want to learn more about how to set up your own IHaskell notebook, visit this project:
The closest thing to Python's pandas
is the frames
library. I haven't used
it that much personally so I won't comment on it much other than to link to
some tutorials in the Educational Resources section.
I'm not aware of a Haskell analog to SciPy
(the library) or sympy
. If
you know of an equivalent Haskell library then let me know.
One Haskell library that deserves honorable mention here is the diagrams
library which lets you produce complex data visualizations very easily if
you want something a little bit fancier than a chart. Check out the diagrams
project if you have time:
Areas for improvement:
- Smooth user experience and integration across all of these libraries
- Simple types and APIs. The data science programmers I know dislike overly complex or verbose APIs
- Beautiful data visualizations with very little investment
Notable libraries:
- cassava - CSV encoding and decoding
- hmatrix - BLAS / LAPACK wrapper
- Frames - Haskell data analysis tool analogous to Python's pandas
- statistics - Statistics (duh!)
- Chart / Chart-* - Charting library
- diagrams / diagrams-* - Vector graphics library
- ihaskell - Haskell backend to IPython
Rating: Immature
Haskell is a garbage collected language, so Haskell is more appropriate for the scripting / logic layer of a game but not suitable for manipulating a large object graph or for implementing a high-performance game engine, because of the risk of perceptible GC pauses. For simple games, though, you can realistically use Haskell for the entire stack.
Examples of games that could be fully implemented in Haskell:
- Casual games
- Turn-based strategy games
- Adventure games
- Platform / side-scrolling games
- First-person shooter
Examples of games that are difficult to implement at all in Haskell:
- Real-time strategy games
- MMORPGs
Haskell has SDL and OpenGL bindings, which are actually quite good, but that's about it. You're on your own from that point onward. There is not a rich ecosystem of higher-level libraries built on top of those bindings. There is some work in this area, but I'm not aware of anything production quality or easy to use.
The primary reason for the immature rating is the difficulty of integrating Haskell with existing game platforms, which often are biased towards a particular language or toolchain. The only game platform where Haskell has no issues is native binaries for desktop games. For the web, you must compile to Javascript, which is doable. For mobile games on Android you have to cross compile and interface the Haskell logic with Android through JNI + Haskell's foreign function interface. For console games, you have no hope.
Areas for improvement:
- Improve the garbage collector and benchmark performance with large heap sizes
- Provide higher-level game engines
- Improve distribution of Haskell games on proprietary game platforms
Notable libraries:
- gloss - Simple graphics and game programming for beginners
- Code World - Similar to gloss, but you can try it in your browser
- gl - Comprehensive OpenGL bindings
- SDL / SDL-* / sdl2 - Bindings to the SDL library
- SFML - Bindings to the SFML library
- quine - Github project with cool 3D demos
- GPipe - Type-safe OpenGL API that also lets you embed shader code directly within Haskell. See the GPipe wiki to learn more
Rating: Bad / Immature (See description)
Since systems programming is an abused word, I will clarify that I mean programs where speed, memory layout, and latency really matter.
Haskell fares really poorly in this area because:
- The language is garbage collected, so there are no latency guarantees
- Executable sizes are large
- Memory usage is difficult to constrain (thanks to space leaks)
- Haskell has a large and unavoidable runtime, which means you cannot easily embed Haskell within larger programs
- You can't easily predict what machine code your Haskell code will compile to
Typically people approach this problem from the opposite direction: they write the low-level parts in C or Rust and then write Haskell bindings to the low-level code.
It's worth noting that there is an alternative approach which is Haskell DSLs that are strongly typed that generate low-level code at runtime. This is the approach championed by the company Galois.
Notable libraries:
- atom / ivory - DSL for generating embedded programs
- copilot - Stream DSL that generates C code
- improve - High-assurance DSL for embedded code that generates C and Ada
Educational resources:
Rating: Immature? / Bad? (Uncertain)
This greatly lags behind using the language that is natively supported by the mobile platform (i.e. Java for Android or Objective-C / Swift for iOS).
I don't know a whole lot about this area, but I'm definitely sure it is far from mature. All I can do is link to the resources I know of for Android and iPhone development using Haskell.
I also can't really suggest improvements because I'm pretty out of touch with this branch of the Haskell ecosystem.
Educational resources:
Rating: Immature
On hobbyist boards like the Raspberry Pi it's possible to compile Haskell code with ghc. There are limitations: some libraries have problems on the ARM platform, and ghci only works on newer compilers. Cross compiling doesn't work with Template Haskell. Stack and other large projects can take more than 1 GB of memory to compile.
However, if the Haskell code builds, it runs with respectable performance on these machines.
**Arch (Banana Pi)** update 2016-02-25:
- installed today from pacman, current versions are ghc 7.10.3 and cabal-install 1.22.6.0
- a compatible version of llvm also installed automatically.
- ghci passes hello world test; cabal/ghc compiled a modest project normally.
Raspbian (Raspberry Pi, pi2, others)
- current version: ghc 7.4, cabal-install 1.14
- ghci doesn't work.
Debian Jessie (Raspberry Pi 2)
- current version: ghc 7.6
- Requires llvm version 3.5.2 or higher. Do not use the llvm-3.5 provided by default in the Jessie package distribution
Arch (Raspberry Pi 2)
- current version 7.8.2, but llvm is 3.6, which is too new.
- downgrade packages for llvm not officially available.
- with llvm downgrade to 3.4, ghc and ghci work, but problems compiling yesod, scotty.
- compiler crashes, segfaults, etc.
Rating: Immature
There are Haskell bindings for OpenCV available via HOpenCV, which has bindings
for versions up to OpenCV 2.0. A fork maintained by Anthony Cowley has bindings
available for versions up to OpenCV 2.4, but it pretty much stops there.
Currently, OpenCV 3.0 has been released, and there are no Haskell bindings
covering it.
There are some interesting projects which try to tackle computer vision in a
purely functional manner. cv-combinators
, easyVision
, and Zef
are some
examples.
As for real world usage, Anthony Cowley has a success story in using Haskell for Robots, which likely used quite a bit of Computer Vision.
To be fair, OpenCV
is very complex and has many APIs, and the OpenCV bindings
so far are pretty extensive. Libraries like easyVision
can't compete with
OpenCV in terms of features, but they are very much feature rich. However, there
is still a lot of scope for improvement.
Notable libraries:
Rating: Best in class
Haskell is unbelievably awesome for maintaining large projects. There's nothing that I can say that will fully convey how nice it is to modify existing Haskell code. You can only appreciate this through experience.
When I say that Haskell is easy to maintain, I mean that you can easily approach a large Haskell code base written by somebody else and make sweeping architectural changes to the project without breaking the code.
You'll often hear people say: "if it compiles, it works". I think that is a bit of an exaggeration, but a more accurate statement is: "if you refactor and it compiles, it works". This lets you move fast without breaking things.
Most statically typed languages are easy to maintain, but Haskell is on its own level for the following reasons:
- Strong types
- Global type inference
- Type classes
- Laziness
The latter three features are what differentiate Haskell from other statically typed languages.
If you've ever maintained code in other languages you know that usually your test suite breaks the moment you make large changes to your code base and you have to spend a significant amount of effort keeping your test suite up to date with your changes. However, Haskell has a very powerful type system that lets you transform tests into invariants that are enforced by the types so that you can statically eliminate entire classes of errors at compile time. These types are much more flexible than tests when modifying code and types require much less upkeep as you make large changes.
The Haskell community and ecosystem use the type system heavily to "test" their applications, more so than other programming language communities. That's not to say that Haskell programmers don't write tests (they do), but rather they prefer types over tests when they have the option.
Global type inference means that you don't have to update types and interfaces as you change the code. Whenever I do a large refactor the first thing I do is delete all type signatures and let the compiler infer the types and interfaces for me as I go. When I'm done refactoring I just insert back the type signatures that the compiler infers as machine-checked documentation.
Type classes also assist refactoring because the compiler automatically infers type class constraints (analogous to interfaces in other languages) so that you don't need to explicitly annotate interfaces. This is a huge time saver.
Laziness deserves special mention because many outsiders do not appreciate how laziness simplifies maintenance. Many languages require tight coupling between producers and consumers of data structures in order to avoid wasteful evaluation, but laziness avoids this problem by only evaluating data structures on demand. This means that if your refactoring process changes the order in which data structures are consumed or even stops referencing them altogether you don't need to reorder or delete those data structures. They will just sit around patiently waiting until they are actually needed, if ever, before they are evaluated.
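The decoupling of producers from consumers described above can be sketched with an infinite list. This is only an illustration; the names `squares`, `firstEvenSquares`, `squaresBelow`, `unusedReport`, and `pickFirst` are hypothetical.

```haskell
-- Producer: an infinite list of squares. Nothing is computed until
-- a consumer demands elements. (All names here are hypothetical.)
squares :: [Integer]
squares = map (\n -> n * n) [1 ..]

-- Consumer A: only needs the first few even squares.
firstEvenSquares :: Int -> [Integer]
firstEvenSquares n = take n (filter even squares)

-- Consumer B: needs every square below a bound. The producer did not
-- have to change to serve this different consumption pattern.
squaresBelow :: Integer -> [Integer]
squaresBelow limit = takeWhile (< limit) squares

-- A value that would fail if evaluated is harmless when nothing
-- ever demands it.
unusedReport :: String
unusedReport = error "never evaluated unless something uses it"

-- Returns its first argument and never looks at the second.
pickFirst :: a -> b -> a
pickFirst x _ = x
```

For instance, `pickFirst "done" unusedReport` returns `"done"` without ever touching `unusedReport`, which is why a refactor that stops referencing a data structure does not force you to delete its definition right away.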
Rating: Best in class
I give Haskell a "Best in class" rating because Haskell's concurrency runtime performs as well or better than mainstream languages and is significantly easier to use due to the runtime support for software-transactional memory.
The best explanation of Haskell's threading module is the documentation in
Control.Concurrent
:
Concurrency is "lightweight", which means that both thread creation and context switching overheads are extremely low. Scheduling of Haskell threads is done internally in the Haskell runtime system, and doesn't make use of any operating system-supplied thread packages.
In Haskell, all I/O is non-blocking by default, so for example a web server will just spawn one lightweight thread per connection and each thread can be written in an ordinary synchronous style instead of nested callbacks like in Node.js.
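As a minimal sketch of the thread-per-task style described above, the following spawns one lightweight thread per job with `forkIO` and collects each result through an `MVar`. The names `runWorkers` and `work` are hypothetical, and the sketch assumes that no worker throws an exception (a failed worker would leave its `MVar` empty).

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Spawn `n` lightweight threads. Each one runs `work` in plain
-- synchronous style and reports its result through its own MVar.
-- (`runWorkers` and `work` are hypothetical names for this sketch;
-- exceptions inside `work` are not handled here.)
runWorkers :: Int -> (Int -> IO Int) -> IO [Int]
runWorkers n work = do
  boxes <- forM [1 .. n] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO (work i >>= putMVar box)
    return box
  -- Block until every worker has delivered its result.
  mapM takeMVar boxes
```

A call such as `runWorkers 1000 (\i -> return (i * 2))` starts a thousand threads; in real code the async library listed below wraps this pattern and also propagates exceptions.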
The best way to explain the performance of Haskell's threaded runtime is to give hard numbers:
- The Haskell thread scheduler can easily handle millions of threads
- Each thread requires 1 KB of memory, so the hard limitation to thread count is memory (1 GB per million threads).
- Haskell channel overhead for the standard library (using TQueue) is on the order of one microsecond per message and degrades linearly with increasing contention
- Haskell channel overhead using the unagi-chan library is on the order of 100 nanoseconds (even under contention)
- Haskell's MVar (a low-level concurrency communication primitive) requires 10-20 ns to add or remove values (roughly on par with acquiring or releasing a lock in other languages)
Haskell also provides software-transactional memory, which allows programmers to build composable and atomic memory transactions. You can compose transactions together in multiple ways to build larger transactions:
- You can sequence two transactions to build a larger atomic transaction
- You can combine two transactions using alternation, falling back on the second transaction if the first one fails
- Transactions can retry, rolling back their state and sleeping until one of their dependencies changes in order to avoid wasteful polling
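The three ways of composing transactions listed above can be sketched with the `stm` package. The bank-account setting and the names `Account`, `withdraw`, `deposit`, `transfer`, and `transferFromEither` are hypothetical; `retry`, `orElse`, `readTVar`, `writeTVar`, and `modifyTVar'` are functions exported by `Control.Concurrent.STM`.

```haskell
import Control.Concurrent.STM

-- Hypothetical example: an account is a TVar holding a balance.
type Account = TVar Int

-- Retrying: if funds are short, roll back and sleep until the
-- balance changes, instead of polling.
withdraw :: Account -> Int -> STM ()
withdraw acc amount = do
  balance <- readTVar acc
  if balance < amount
    then retry
    else writeTVar acc (balance - amount)

deposit :: Account -> Int -> STM ()
deposit acc amount = modifyTVar' acc (+ amount)

-- Sequencing: two transactions become one larger atomic transaction.
transfer :: Account -> Account -> Int -> STM ()
transfer from to amount = do
  withdraw from amount
  deposit to amount

-- Alternation: try the first source and fall back on the second
-- if the first transaction retries.
transferFromEither :: Account -> Account -> Account -> Int -> STM ()
transferFromEither a b to amount =
  transfer a to amount `orElse` transfer b to amount
```

Nothing happens until a transaction is run with `atomically`, for example `atomically (transferFromEither a b c 50)`, and other threads never observe a state in which the money has left one account but not yet arrived in the other.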
A few other languages provide software-transactional memory, but Haskell's implementation has two main advantages over other implementations:
- The type system enforces that transactions only permit reversible memory modifications. This guarantees at compile time that all transactions can be safely rolled back.
- Haskell's STM runtime takes advantage of enforced purity to improve the efficiency of transactions, retries, and alternation.
Haskell is also the only language that supports both software transactional memory and non-blocking I/O.
Notable libraries:
- stm - Software transactional memory
- unagi-chan - High performance channels
- async - Futures library
Educational resources:
- Parallel and Concurrent Programming in Haskell
- Parallel and Concurrent Programming in Haskell - Software transactional memory
- Beautiful concurrency - a software-transactional memory tutorial
- Performance numbers for primitive operations - Latency timings for various low-level operations
Propaganda:
Rating: Best in class
Haskell definitely does not have the most advanced type system (not even close if you count research languages) but out of all languages that are actually used in production Haskell is probably at the top. Idris is probably the closest thing to a type system more powerful than Haskell that has a realistic chance of use in production in the foreseeable future.
The killer features of Haskell's type system are:
- Type classes
- Global type and type class inference
- Light-weight type syntax
Haskell's type system really does not get in your way at all. You (almost) never need to annotate the type of anything. As a result, the language feels light-weight to use like a dynamic language, but you get all the assurances of a static language.
Many people are familiar with languages that support "local" type inference (like Rust, Java, C#), where you have to explicitly type function arguments but then the compiler can infer the types of local variables. Haskell, on the other hand, provides "global" type inference, meaning that the types and interfaces of all function arguments are inferred, too. Type signatures are optional (with some minor caveats) and are primarily for the benefit of the programmer.
Here is an example of writing a function without any types or interfaces at all and asking the compiler to infer them for you:
>>> let addAndShow x y = show (x + y)
>>> :type addAndShow
addAndShow :: (Num a, Show a) => a -> a -> String
This really benefits projects where you need to prototype quickly but refactor painlessly when you realize you are on the wrong track. You can leave out all type signatures while prototyping but the types are still there even if you don't see them. Then when you dramatically change course those strong and silent types step in and keep large refactors painless.
Some Haskell programmers use a "type-driven development" programming style, analogous to "test-driven development":
- they specify desired behavior as a type signature which initially fails to type-check (analogous to adding a test which starts out "red")
- they create a quick and dirty solution that satisfies the type-checker (analogous to turning the test "green")
- they improve on their initial solution while still satisfying the type-checker (analogous to a "red/green refactor")
"Type-driven development" supplements "test-driven development" and has different tradeoffs:
- The biggest disadvantage of types is that they don't test as many things as full-blown tests, because Haskell is not (yet) dependently typed
- The biggest advantage of types is that they can prove the complete absence of programming errors for all possible cases, whereas tests cannot examine every possibility
- Type-checking is much faster than running tests
- Type error messages are informative: they explain what went wrong
- Type-checking never hangs and never gives flaky results
Haskell also provides the "Typed Holes" extension, which lets you add an
underscore (i.e. "_") anywhere in the code whenever you don't know what
expression belongs there. The compiler will then tell you the expected type of
the hole and suggest terms in scope with related types that you can use to fill
the hole.
There is also a newly added "Liquid Haskell" extension under development which you can use to program with "refinement types". These types enrich Haskell's type system with the ability to decorate type signatures with logical predicates and arithmetic, and increases the number of invariants that you can encode at the type level.
Educational resources:
- Learn you a Haskell - Types and type classes
- Learn you a Haskell - Making our own types and type classes
- Typed holes
- Partial type signatures proposal
- Programming with refinement types - Very extensive tutorial on how to use Liquid Haskell with interactive examples you can run in your browser
Propaganda:
- What exactly makes the Haskell type system so revered (vs say, Java)?
- Difference between OOP interfaces and FP type classes
- Compile-time memory safety using Liquid Haskell - post illustrating an example use case for refinement types
Rating: Best in class
Haskell parsing is sooooooooooo slick. Recursive descent parser combinators are far-and-away the most popular parsing paradigm within the Haskell ecosystem, so much so that people use them even in place of regular expressions. I strongly recommend reading the "Monadic Parsing in Haskell" functional pearl linked below if you want to get a feel for why parser combinators are so dominant in the Haskell landscape.
If you're not sure what library to pick, I generally recommend the parsec
library as a default well-rounded choice because it strikes a decent balance
between ease-of-use, performance, good error messages, and small dependencies
(since it ships with GHC). There is also the megaparsec
library, which is a
modern and improved version of parsec
.
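To give a feel for the combinator style, here is a small sketch using parsec that parses a bracketed, comma-separated list of integers such as `[1, 2, 3]`. The grammar and the names `integer`, `lexeme`, `intList`, and `parseIntList` are hypothetical; the combinators themselves (`many1`, `digit`, `char`, `spaces`, `between`, `sepBy`, `eof`, `parse`) are exported by `Text.Parsec`.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- A hypothetical grammar: comma-separated non-negative integers
-- inside square brackets, such as "[1, 2, 3]".
integer :: Parser Int
integer = read <$> many1 digit

-- Run a parser, then skip any trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- Small parsers combine into larger ones with ordinary functions.
intList :: Parser [Int]
intList =
  between (lexeme (char '[')) (lexeme (char ']'))
          (lexeme integer `sepBy` lexeme (char ','))

-- Parse a whole string, rejecting leftover input.
parseIntList :: String -> Either ParseError [Int]
parseIntList = parse (spaces *> intList <* eof) "<input>"
```

On malformed input such as `"[1, x]"` the result is a `Left` carrying a `ParseError` with a source position, which is where the good error messages mentioned above come from.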
`attoparsec` deserves special mention as an extremely fast backtracking parsing library. The speed and simplicity of this library will blow you away. The main deficiency of `attoparsec` is the poor error messages.
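For readers who have not seen parser combinators before, the following is a small sketch of the style. To keep it dependency-free it uses `Text.ParserCombinators.ReadP` from `base` rather than `parsec` or `attoparsec`, so treat it as an illustration of the paradigm and not of those libraries' APIs. The grammar and the names `expr`, `term`, `factor` and `evalExpr` are invented for this example.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- A non-negative integer, followed by optional whitespace.
number :: ReadP Int
number = read <$> munch1 isDigit <* skipSpaces

-- A single-character symbol, followed by optional whitespace.
symbol :: Char -> ReadP Char
symbol c = char c <* skipSpaces

-- expr ::= term ('+' term)*
expr :: ReadP Int
expr = sum <$> sepBy1 term (symbol '+')

-- term ::= factor ('*' factor)*
term :: ReadP Int
term = product <$> sepBy1 factor (symbol '*')

-- factor ::= number | '(' expr ')'
factor :: ReadP Int
factor = number +++ between (symbol '(') (symbol ')') expr

-- Run the parser, keeping only a parse that consumes the whole input.
evalExpr :: String -> Maybe Int
evalExpr input =
  case [v | (v, "") <- readP_to_S (skipSpaces *> expr <* eof) input] of
    (v : _) -> Just v
    []      -> Nothing
```

Each grammar rule is an ordinary Haskell value, so rules can be named, reused and combined like any other function. In GHCi, `evalExpr "1 + 2 * (3 + 4)"` should evaluate the expression with the usual precedence, and malformed input such as `"2 *"` yields `Nothing`.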
The pretty-printing front is also excellent. Academic researchers just really love writing pretty-printing libraries in Haskell for some reason.
Notable libraries:
- `parsec` - Best overall "value"
- `megaparsec` - Modern, actively maintained fork of `parsec`
- `attoparsec` - Extremely fast backtracking parser
- `Earley` - Earley parsing embedded within the Haskell language. Parses all context-free grammars, even ambiguous ones, with no need to left factor. Returns all valid parses.
- `trifecta` - Best error messages (`clang`-style)
- `parsers` - Interface compatible with `attoparsec`, `parsec` and `trifecta` which lets you easily switch between them. People commonly use this library to begin with `trifecta` or `parsec` (for better error messages) then switch to `attoparsec` when done for performance
- `alex` / `happy` - Like `lex` / `yacc` but with Haskell integration
- `ansi-wl-pprint` - Pretty-printing library
- `text-format` - High-performance string formatting
Educational resources:
Propaganda:
Rating: Mature
Haskell rocks at DSL-building. While not as flexible as a Lisp language I would venture that Haskell is the most flexible of the non-Lisp languages. You can overload a large amount of built-in syntax for your custom DSL.
The most popular example of overloaded syntax is `do` notation, which you can overload to work with any type that implements the `Monad` interface. This syntactic sugar for `Monad`s in turn led to a huge overabundance of `Monad` tutorials.
However, there are lesser known but equally important things that you can overload, such as:
- numeric and string literals
- `if`/`then`/`else` expressions
- list comprehensions
- numeric operators
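As a concrete illustration of overloading literals and operators, here is a minimal sketch of an embedded arithmetic DSL. The `Expr` type and the `example` value are hypothetical names chosen for this example. String literals are overloaded through the `OverloadedStrings` extension and the `IsString` class, while numeric literals and operators are overloaded through the standard `Num` class.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- A tiny arithmetic expression language.
data Expr
  = Lit Integer
  | Var String
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- String literals become variables (requires OverloadedStrings).
instance IsString Expr where
  fromString = Var

-- Numeric literals and the (+) and (*) operators build syntax trees.
instance Num Expr where
  fromInteger = Lit
  (+) = Add
  (*) = Mul
  negate e = Mul (Lit (-1)) e
  abs = error "abs: not supported in this sketch"
  signum = error "signum: not supported in this sketch"

-- Looks like ordinary arithmetic, but evaluates to a syntax tree.
example :: Expr
example = 2 * "x" + 1
```

Here `example` is the tree `Add (Mul (Lit 2) (Var "x")) (Lit 1)`, which an interpreter, optimizer or pretty-printer for the DSL could then consume.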
Educational resources:
Rating: Mature
There are a few places where Haskell is the clear leader among all languages:
- property-based testing
- mocking / dependency injection
Haskell's `QuickCheck` is the gold standard which all other property-based testing libraries are measured against. The reason `QuickCheck` works so smoothly in Haskell is due to Haskell's type class system and purity. The type class system simplifies automatic generation of random data from the input type of the property test. Purity means that any failing test result can be automatically minimized by rerunning the check on smaller and smaller inputs until `QuickCheck` identifies the corner case that triggers the failure.
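The minimization step can be sketched without any library. The code below is not how `QuickCheck` is implemented and does not use its API; it is a hand-rolled illustration, with made-up names (`shrinkList`, `minimize`, `prop_smallSum`), of why purity makes it safe to re-run a property on smaller and smaller inputs. It covers only shrinking, not random generation.

```haskell
-- Candidate "smaller" inputs: drop one element, or halve one element.
shrinkList :: [Int] -> [[Int]]
shrinkList xs = removals ++ halvings
  where
    indexed  = zip [0 ..] xs
    removals = [take i xs ++ drop (i + 1) xs | (i, _) <- indexed]
    halvings = [take i xs ++ [x `quot` 2] ++ drop (i + 1) xs | (i, x) <- indexed, x /= 0]

-- Keep re-running the (pure) property on smaller inputs while it still fails.
minimize :: ([Int] -> Bool) -> [Int] -> [Int]
minimize prop failing =
  case filter (not . prop) (shrinkList failing) of
    (smaller : _) -> minimize prop smaller
    []            -> failing

-- A deliberately false property: "every list sums to less than 10".
prop_smallSum :: [Int] -> Bool
prop_smallSum xs = sum xs < 10
```

Starting from a failing input such as `[7, 3, 12, 5]`, `minimize prop_smallSum` returns a shorter list that still violates the property and whose own shrink candidates all pass. Because the property is a pure function, re-running it many times cannot disturb any external state.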
Mocking is another area where Haskell shines because you can overload almost all built-in syntax, including:
- `do` notation
- `if` statements
- numeric literals
- string literals
Haskell programmers overload this syntax (particularly `do` notation) to write code that looks like it is doing real work:
    example = do str <- readLine
                 putLine str
... and the code will actually evaluate to a pure syntax tree that you can use to mock in external inputs and outputs:
    example = ReadLine (\str -> PutStrLn str (Pure ()))
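To make the two snippets above concrete, here is a self-contained sketch of one way such a syntax tree and its interpreters could be written. The `Console` type and the `runPure` and `runIO` interpreters are assumptions made for this illustration; in practice the `free` library provides a generic version of this pattern, which this hand-rolled code does not use.

```haskell
-- One constructor per effect, plus Pure for "finished with a result".
data Console a
  = ReadLine (String -> Console a)
  | PutStrLn String (Console a)
  | Pure a

instance Functor Console where
  fmap f (ReadLine k)   = ReadLine (fmap f . k)
  fmap f (PutStrLn s k) = PutStrLn s (fmap f k)
  fmap f (Pure a)       = Pure (f a)

instance Applicative Console where
  pure = Pure
  mf <*> mx = mf >>= \f -> fmap f mx

instance Monad Console where
  ReadLine k   >>= f = ReadLine (\s -> k s >>= f)
  PutStrLn s k >>= f = PutStrLn s (k >>= f)
  Pure a       >>= f = f a

readLine :: Console String
readLine = ReadLine Pure

putLine :: String -> Console ()
putLine s = PutStrLn s (Pure ())

-- Looks like real I/O, but is just a value of type Console ().
example :: Console ()
example = do
  str <- readLine
  putLine str

-- Mock interpreter: feed canned input lines, collect the output lines.
runPure :: [String] -> Console a -> [String]
runPure (i : is) (ReadLine k)   = runPure is (k i)
runPure []       (ReadLine _)   = []  -- input exhausted: stop
runPure is       (PutStrLn s k) = s : runPure is k
runPure _        (Pure _)       = []

-- Real interpreter: the same program run against stdin/stdout.
runIO :: Console a -> IO a
runIO (ReadLine k)   = getLine >>= runIO . k
runIO (PutStrLn s k) = putStrLn s >> runIO k
runIO (Pure a)       = return a
```

A test can call `runPure ["hello"] example` and inspect the resulting list of output lines without touching the outside world, while production code would call `runIO example`.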
Haskell also supports most testing functionality that you expect from other languages, including:
- standard package interfaces for testing
- unit testing libraries
- test result summaries and visualization
Notable libraries:
- `QuickCheck` - property-based testing
- `doctest` - tests embedded directly within documentation
- `free` - Haskell's abstract version of "dependency injection"
- `hspec` - Testing library analogous to Ruby's RSpec
- `HUnit` - Testing library analogous to Java's JUnit
- `tasty` - Combination unit / regression / property testing library
Educational resources:
Rating: Mature
Haskell primarily uses persistent data structures, meaning that when you "update" a persistent data structure you just create a new data structure and you can keep the old one around (thus the name: persistent). Haskell data structures are immutable, so you don't actually create a deep copy of the data structure when updating; any new structure will reuse as much of the original data structure as possible.
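A short sketch of what "persistent" means in practice, using `Data.Map.Strict` from the `containers` package. The `inventory` and `restocked` names are invented for this example.

```haskell
import qualified Data.Map.Strict as Map

-- The original map.
inventory :: Map.Map String Int
inventory = Map.fromList [("apples", 3), ("pears", 5)]

-- "Updating" returns a new map; `inventory` itself is unchanged.
restocked :: Map.Map String Int
restocked = Map.insert "apples" 10 inventory
```

After the insert, `Map.lookup "apples" inventory` still returns the old value while `restocked` sees the new one. Internally the new map reuses the parts of the old tree that did not change, so no deep copy is made.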
The "Notable libraries" section contains links to Haskell collections libraries that are heavily tuned. You should realistically expect these libraries to compete with tuned Java code. However, you should not expect Haskell to match expertly tuned C++ code.
The selection of algorithms is not as broad as in Java or C++ but it is still pretty good and diverse enough to cover the majority of use cases.
Notable libraries:
- `vector` - High-performance arrays
- `containers` - High-performance `Map`s, `Set`s, `Tree`s, `Graph`s, `Seq`s
- `unordered-containers` - High-performance `HashMap`s, `HashSet`s
- `accelerate` / `accelerate-*` - GPU programming
- `repa` / `repa-*` - parallel shape-polymorphic arrays
Rating: Mature
This boils down exclusively to the `criterion` library, which was done so well that nobody bothered to write a competing library. Notable `criterion` features include:
- Detailed statistical analysis of timing data
- Beautiful graph output: (Example)
- High-resolution analysis (accurate down to nanoseconds)
- Customizable HTML/CSV/JSON output
- Garbage collection insensitivity
Notable libraries:
Educational resources:
Rating: Mature
Haskell's Unicode support is excellent. Just use the `text` and `text-icu` libraries, which provide a high-performance, space-efficient, and easy-to-use API for Unicode-aware text operations.
Note that there is one big catch: the default `String` type in Haskell is inefficient. You should always use `Text` whenever possible.
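A minimal sketch of working with `Text` instead of `String`, using only the `text` package. The names `greeting`, `shout` and `shoutString` are hypothetical; the `OverloadedStrings` extension is what lets a string literal be used directly as a `Text` value.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T

-- With OverloadedStrings, string literals can be used directly as Text.
greeting :: Text
greeting = "  héllo wörld  "

-- Unicode-aware operations from the text package.
shout :: Text -> Text
shout = T.toUpper . T.strip

-- Convert at the boundary when an API still wants a String.
shoutString :: String -> String
shoutString = T.unpack . shout . T.pack
```

A common approach is to keep `Text` throughout the program and convert with `T.pack` and `T.unpack` only at the edges where a `String`-based API is unavoidable.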
Notable libraries:
Rating: Mature
Haskell's streaming ecosystem is mature. Probably the biggest issue is that there are too many good choices (and a lot of ecosystem fragmentation as a result), but each of the streaming libraries listed below has a sufficiently rich ecosystem including common streaming tasks like:
- Network transmissions
- Compression
- External process pipes
- High-performance streaming aggregation
- Concurrent streams
- Incremental parsing
Notable libraries:
- `conduit` / `io-streams` / `pipes` - Stream programming libraries (Full disclosure: I authored `pipes` and wrote the official `io-streams` tutorial)
- `machines` - Networked stream transducers library
Educational resources:
Rating: Mature
Haskell's serialization libraries are reasonably efficient and very easy to use. You can easily automatically derive serializers/deserializers for user-defined data types and it's very easy to encode/decode values.
Haskell's serialization does not suffer from any of the gotchas that object-oriented languages deal with (particularly Java/Scala). Haskell data types don't have associated methods or state to deal with so serialization/deserialization is straightforward and obvious. That's also why you can automatically derive correct serializers/deserializers.
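The idea of deriving a serializer and deserializer from a type definition can be sketched with nothing but `base`. This example uses the derived `Show` and `Read` instances as a stand-in text format; it is an assumption made to keep the sketch dependency-free, and real serialization libraries (for example `binary` or `aeson`) derive their own, more efficient instances via `Generic`. The `Shape` type and the `encode` and `decode` names are invented here.

```haskell
import Text.Read (readMaybe)

-- A plain data type: no methods or hidden state, just fields.
data Shape
  = Circle Double
  | Rect Double Double
  deriving (Eq, Show, Read)

-- The derived Show instance acts as the serializer...
encode :: Shape -> String
encode = show

-- ...and the derived Read instance acts as the deserializer.
decode :: String -> Maybe Shape
decode = readMaybe
```

Because both directions are generated from the same type definition, `decode (encode x)` gives back `Just x`, and input that does not match any constructor is rejected with `Nothing`.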
Serialization performance is pretty good. You should expect to serialize data at a rate between 100 Mb/s and 1 Gb/s with careful tuning. Serialization performance still has about 3x-5x room for improvement by multiple independent estimates. See the "Faster binary serialization" link below for details of the ongoing work to improve the serialization speed of existing libraries.
Notable libraries:
Educational resources:
- Faster binary serialization / Better, faster binary serialization - Slides on serialization efficiency improvements
Rating: Mature
Haskell supports all the common domain-independent serialization formats (i.e. XML/JSON/YAML/CSV). For more exotic formats Haskell won't be as good as, say, Python (which is notorious for supporting a huge number of file formats) but it's so easy to write your own quick and dirty parser in Haskell that this is not much of an issue.
Notable libraries:
- `aeson` - JSON encoding/decoding
- `cassava` - CSV encoding/decoding
- `yaml` - YAML encoding/decoding
- `xml` - XML encoding/decoding
Rating: Mature
This rating is based entirely on the recent release of the `stack` package tool by FPComplete which greatly simplifies package installation and dependency management. This tool was created in response to a broad survey of existing Haskell users and potential users where `cabal-install` was identified as the single greatest issue for professional Haskell development.
The `stack` tool is not just good by Haskell standards but excellent even compared to other language package managers. Key features include:
- Excellent project isolation (including compiler isolation)
- Global caching of shared dependencies to avoid wasteful rebuilds
- Easily add local repositories or remote Github repositories as dependencies
`stack` is also powered by Stackage, which is a very large Hackage mono-build that ensures that a large subset of Hackage builds correctly against each other and automatically notifies package authors to fix or update libraries when they break the mono-build. Periodically this package set is frozen as a Stackage LTS release which you can supply to the `stack` tool in order to select dependencies that are guaranteed to build correctly with each other. Also, if all your projects use the same or similar LTS releases they will benefit heavily from the shared global cache.
Educational resources:
Propaganda:
Rating: Mature
Haskell has decent logging support. That's pretty much all there is to say.
Notable libraries:
- `fast-logger` - High-performance multicore logging system
- `hslogger` - Logging library analogous to Python's `logging` library
- `monad-logger` - Add logging with line numbers to your monad stack. Uses `fast-logger` under the hood.
Rating: Immature
This rating will switch to "Mature" once the "Haskell Programming from first principles" book is published. I highly recommend this book, even though it is still in Early Access form, for the following reasons:
- The book does not assume any prior programming experience
- The book does not have any conceptual gaps or out-of-order dependencies
- The book is extremely comprehensive
Educational resources:
- Haskell Programming from first principles - The best Haskell resource to learn from. The book costs $60, but it's worth the price.
- Haskell Wikibook — One of the highest quality among Wikimedia's Wikibooks, which starts from zero, with no assumption of previous programming experience
- How I Start - Haskell — Example development environment and workflow
- Learn You a Haskell for Great Good - A beginning Haskell book
- Real World Haskell - A book that contains several practical cookbook-style examples. Many code examples are out of date, but the book is still useful
- Parallel and Concurrent Programming in Haskell — Exactly what the title says
- Thinking Functionally with Haskell — Book targeting people who are interested in Haskell in order to "think differently"
- Haskell wiki — Grab bag of Haskell-related information with wide variation in quality. Excels at large lists of resources or libraries if you don't mind sifting through stale or abandoned entries
- The Haskell 2010 Report — The Haskell language specification
Rating: Immature
This is not one of my areas of expertise, but what I do know is that Haskell has bindings to most of the open source databases and datastores such as MySQL, Postgres, SQLite, Cassandra, Redis, DynamoDB and MongoDB. However, I haven't really evaluated the quality of these bindings other than the `postgresql-simple` library, which is the only one I've personally used and was decent as far as I could tell.
The "Immature" rating is based on the lack of bindings to commercial databases like Microsoft SQL Server and Oracle. So whether or not Haskell is right for you probably depends heavily on whether there are bindings to the specific data store you use.
Notable libraries:
- `mysql-simple` - MySQL bindings
- `postgresql-simple` - Postgres bindings
- `persistent` - Database-agnostic ORM that supports automatic migrations
- `esqueleto` / `relational-record` / `opaleye` - type-safe APIs for building well-formed SQL queries
- `acid-state` - Simple ACID data store that saves Haskell data types natively
- `aws` - Bindings to Amazon DynamoDB
- `hedis` - Bindings to Redis
Rating: Immature
The main Haskell debugging features are:
- Memory and performance profiling
- Stack traces
- Source-located errors, using the `assert` function
- Breakpoints, single-stepping, and tracing within the GHCi REPL
- Informal `printf`-style tracing using `Debug.Trace`
- ThreadScope
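Two of the features listed above, `assert` and `Debug.Trace`, can be shown in a few lines. This is a small sketch with invented function names (`collatzSteps`, `mean`); it assumes `collatzSteps` is only called with arguments of at least 1.

```haskell
import Control.Exception (assert)
import Debug.Trace (trace)

-- printf-style tracing: each message is written to stderr when the
-- corresponding part of the result is evaluated.
-- Assumes n >= 1; other inputs would not terminate.
collatzSteps :: Int -> Int
collatzSteps n
  | n == 1    = 0
  | otherwise = trace ("collatzSteps " ++ show n) (1 + collatzSteps next)
  where
    next = if even n then n `div` 2 else 3 * n + 1

-- assert reports the source location when its condition is False.
-- GHC ignores assertions when optimization is enabled (e.g. -O).
mean :: [Double] -> Double
mean xs = assert (not (null xs)) (sum xs / fromIntegral (length xs))
```

Because `trace` is driven by lazy evaluation, the messages appear in the order in which values are actually demanded, which can differ from the order suggested by reading the source.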
The two reasons I still mark debugging "Immature" are:
- GHC's stack traces require profiling to be enabled
- There is only one IDE that I know of (`leksah`) that integrates support for breakpoints and single-stepping and `leksah` still needs more polish
`ghc-7.10` also added preliminary support for DWARF symbols which allow support for `gdb`-based debugging and `perf`-based profiling, but there is still more work that needs to be done. See the following page for more details:
Educational resources:
- GHC Manual - Profiling chapter - Read the whole thing; you will thank me later
- Debugging runtime options - See the `+RTS -xc` flag which adds stack traces to all exceptions (requires profiling enabled)
- `GHC.Stack` - Programmatic access to the call stack
- Pinpointing space leaks in big programs
- Real World Haskell - Profiling and Optimization
- The GHCi Debugger - Manual for GHCi-based breakpoints and single-stepping
- Parallel and Concurrent Programming in Haskell - Debugging, Tuning, and Interfacing with Foreign Code - Debugging concurrent programs
- Haskell wiki - ThreadScope
Rating: Immature
I give Haskell an "Immature" rating primarily due to poor user experience on Windows:
- Most Haskell tutorials assume a Unix-like system
- Several Windows-specific GHC bugs
- Poor IDE support (Most Windows programmers don't use a command-line editor)
This is partly a chicken-and-egg problem. Haskell has many Windows-specific issues because it has such a small pool of Windows developers to contribute fixes. Most Haskell developers are advised to use another operating system or a virtual machine to avoid these pain points, which exacerbates the problem.
The situation is not horrible, though. I know because I do half of my Haskell programming on Windows in order to familiarize myself with the pain points of the Windows ecosystem and most of the issues affect beginners and can be worked around by more experienced developers. I wouldn't say any individual issue is an outright dealbreaker; it's more like a thousand papercuts which turn people off of the language.
If you're a Haskell developer using Windows, I highly recommend the following installs to get started quickly and with as few issues as possible:
- Git for Windows - A Unix-like command-line environment bundled with `git` that you can use to follow along with tutorials
- MinGHC - Use this for project-independent Haskell experimentation
- Stack - Use this for project development
Additionally, learn to use the command line a little bit until Haskell IDE support improves. Plus, it's a useful skill in general as you become a more experienced programmer.
For Mac, the recommended installation is:
- Haskell for Mac - A self-contained relocatable GHC build for project-independent Haskell experimentation
- Stack - Use this for project development
For other operating systems, use your package manager of choice to install `ghc` and `stack`.
Educational resources:
- Haskell wiki - Windows - Windows startup guide for Haskell
Rating: Immature
Haskell does provide support for hot code loading, although nothing in the same ballpark as in languages like Clojure.
There are two main approaches to hot code loading:
- Compiling and linking object code at runtime (i.e. the `plugins` or `hint` libraries)
- Recompiling the entire program and then reinitializing the program with the program's saved state (i.e. the `dyre` or `halive` libraries)
You might wonder how Cloud Haskell sends code over the wire and my understanding is that it doesn't. Any function you wish to send over the wire is instead compiled ahead of time on both sides and stored in a shared symbol table which each side references when encoding or decoding the function.
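The shared symbol table idea can be illustrated with a toy model. This is not the Cloud Haskell API; every name below (`remoteTable`, `Message`, `encodeCall`, `runCall`) is hypothetical, and `show`/`read` stand in for a real wire format. The point is only that the function's name crosses the wire while its code is already compiled into both sides.

```haskell
import Text.Read (readMaybe)

-- Compiled into every node ahead of time: names for the functions that
-- may be "sent". Both sides must agree on this table.
remoteTable :: [(String, Int -> Int)]
remoteTable =
  [ ("double", (* 2))
  , ("increment", (+ 1))
  ]

-- What actually crosses the wire: a function name and its argument.
data Message = Message String Int
  deriving (Eq, Show, Read)

-- Sender: encode the name of the function, never its code.
encodeCall :: String -> Int -> String
encodeCall name arg = show (Message name arg)

-- Receiver: decode the message and look the name up in the local table.
runCall :: String -> Maybe Int
runCall wire = do
  Message name arg <- readMaybe wire
  f <- lookup name remoteTable
  return (f arg)
```

In this model a call to a name that is missing from the receiver's table simply fails with `Nothing`, which mirrors why both sides need to be built from compatible code.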
Haskell does not let you edit a live program like Clojure does so Haskell will probably never be "Best in class" short of somebody releasing a completely new Haskell compiler built from the ground up to support this feature. The existing Haskell tools for hot code swapping seem as good as they are reasonably going to get, but I'm waiting for commercial success stories of their use before rating this "Mature".
The `halive` library has the best hot code swapping demo by far:
Notable libraries:
- `plugins` / `hint` - Runtime compilation and linking
- `dyre` / `halive` - Program reinitialization with saved state
Rating: Immature
The best supported editors at the moment appear to be:
- Emacs/Spacemacs (via `haskell-mode`)
- Vim (via `haskell-vim-now`)
- Atom (via `ide-haskell`)
I am not the best person to review this area since I do not use an IDE myself. I'm basing this "Immature" rating purely on what I have heard from others. The impression I get is that the biggest pain point is that Haskell IDEs, IDE plugins, and low-level IDE tools keep breaking. The above three editors are the ones that have historically had the fewest setup issues.
Most of the Haskell early adopters have been `vi`/`vim` or `emacs` users so those editors have gotten the most love. Support for more traditional IDEs has improved recently with Haskell plugins for Atom, IntelliJ and Eclipse and also the Haskell-native `leksah` IDE.
FPComplete has also released a web IDE for Haskell programming that is worth checking out; it is reasonably polished but cannot be used offline.
Also, if you have a Mac then the "Haskell for Mac" development environment is supposed to work really well for learning since it provides an interactive and visual playground for exploring the code.
Notable tools:
- `hoogle` - Type-based function search
- `hayoo` - Haskell function search covering more libraries
- `hlint` - Code linter
- `ghc-mod` - editor agnostic tool that powers many IDE-like features
- `ghcid` - lightweight background type-checker that triggers on code changes
- `haskell-vim-now` - streamlined Haskell setup for `vim`
- `haskell-mode` - Umbrella project for Haskell `emacs` support
- `structured-haskell-mode` - structural editing based on Haskell syntax for `emacs`
- `codex` - Tags file generator for cabal project dependencies.
- `hdevtools` - Persistent GHC-powered background server for development tools
- `ghc-imported-from` - editor agnostic tool that finds Haddock documentation page for a symbol
IDE plugins:
- Atom (the `ide-haskell` plugin)
- IntelliJ (the official plugin or Haskforce)
- Eclipse (the EclipseFP plugin)
IDEs:
- Haskell for Mac
- FPComplete Center
- `leksah`
Educational resources:
- A Vim + Haskell Workflow
- Survey: Which Haskell development tools are you using that make you a more productive Haskell programmer?
- FPComplete Center - A web-based Haskell IDE
- Aaron Levin
- Alois Cochard
- Ben Kovach
- Benno Fünfstück
- Carlo Hamalainen
- Chris Allen
- Curtis Gagliardi
- Deech
- David Howlett
- David Johnson
- Edward Cho
- Greg Weber
- Gregor Uhlenheuer
- Juan Pedro Villa Isaza
- Kazu Yamamoto
- Kevin Cantu
- Kirill Zaborsky
- Liam O'Connor-Davis
- Luke Randall
- Marcio Klepacz
- Mitchell Rosen
- Nicolas Kaiser
- Oliver Charles
- Pierre Radermecker
- Rodrigo B. de Oliveira
- Stephen Diehl
- Tim Docker
- Tran Ma
- Yuriy Syrovetskiy
- @bburdette
- @co-dan
- @ExternalReality
- @GetContented
- @psibi