Permalink
Switch branches/tags
Nothing to show
Find file Copy path
3aa7399 Dec 3, 2016
715 lines (586 sloc) 20.6 KB
title
Building a JSON REST API in Haskell

I spent the last couple weeks building a JSON REST API in Haskell. I called it Hairy. It's available on GitHub and on Hackage. I learned a lot in the process and wanted to share what I learned. I also wanted to put my code out there to see what I could do better. Patches welcome!

Before we dive into the code, let's take a look at what we'll end up with. Assuming you're running it on port 3000, which is the default, here's what some CRUD actions look like.

$ http post :3000/tasks \
  content='Do something!' created='2014-10-21T00:00:00Z'
{ "content": "Do something!"
, "created": "2014-10-21T00:00:00.000Z"
}
$ http :3000/tasks
[ { "content": "Do something!"
  , "created": "2014-10-21T00:00:00.000Z"
  , "id": 2
  }
]
$ http :3000/tasks/2
{ "content": "Do something!"
, "created": "2014-10-21T00:00:00.000Z"
}
$ http put :3000/tasks/2 \
  content='Do something else!' created='2014-10-21T00:00:00Z'
{ "content": "Do something else!"
, "created": "2014-10-21T00:00:00.000Z"
}
$ http delete :3000/tasks/2
null

This is a Literate Haskell file. Only lines that start with > are program code. Assuming you have the dependencies, you should be able to run it with this command:

$ runhaskell -optL -q this-post.lhs Hairy/Models.hs

Without any further ado, let's get to it!

Boilerplate

Before we can begin, we need to enable a few language extensions. The first allows string literals (like "cheese") to represent string-like types such as Text. It's not strictly required since you can accomplish the same thing using pack. But it's so convenient that it's hard to live without.

> {-# LANGUAGE OverloadedStrings #-}

These next couple are a little harder to explain, so I'll hold off doing that until we use them.

> {-# LANGUAGE FlexibleContexts #-}
> {-# LANGUAGE GeneralizedNewtypeDeriving #-}

Now that we've enabled the language extensions that we need, we have to let GHC know that our module is called Hairy. If we didn't do this, it would assume it's called Main.

> module Hairy where

Imports make up the last bit of the boilerplate. These are a little overly-specific in order to make it easier to see where everything comes from. In the real world you might import everything from Web.Scotty.Trans instead of explicitly listing the things you need from it.

For the most part, you shouldn't worry about these. If you're curious about something later on, come back up here to see where it's imported from. Then look it up on Hackage.

> import Control.Applicative (Applicative)
> import Control.Monad.IO.Class (MonadIO, liftIO)
> import Control.Monad.Logger (runNoLoggingT, runStdoutLoggingT)
> import Control.Monad.Reader (MonadReader, ReaderT, asks,
>   runReaderT)
> import Control.Monad.Trans.Class (MonadTrans, lift)
> import Data.Aeson (Value (Null), (.=), object)
> import Data.Default (def)
> import qualified Data.Text as T
> import Data.Text.Encoding (encodeUtf8)
> import Data.Text.Lazy (Text)
> import qualified Database.Persist as DB
> import qualified Database.Persist.Postgresql as DB
> import Network.HTTP.Types.Status (created201,
>   internalServerError500, notFound404)
> import Network.Wai (Middleware)
> import Network.Wai.Handler.Warp (Settings, defaultSettings,
>   setFdCacheDuration, setPort)
> import Network.Wai.Middleware.RequestLogger (logStdout,
>   logStdoutDev)
> import System.Environment (lookupEnv)
> import Web.Heroku (parseDatabaseUrl)
> import Web.Scotty.Trans (ActionT, Options, ScottyT,
>   defaultHandler, delete, get, json, jsonData, middleware,
>   notFound, param, post, put, scottyOptsT, settings,
>   showError, status, verbose)

This next import is special. If you're following along at home, you'll need to create a folder called Hairy and put Models.hs in it. I didn't include the models in this file because I couldn't get the Template Haskell to play nicely with the Literate Haskell.

> import Hairy.Models (Task, TaskId, migrateAll)

Main

With all that out of the way, we can start on the program itself. The top-level entry point, main, only has two responsibilities: get the current configuration and run the application with that configuration.

> main :: IO ()
> main = do
>   c <- getConfig
>   runApplication c

This is the same as the more succinct point-free version.

main :: IO ()
main = getConfig >>= runApplication

Configuration

Getting the current configuration involves reading the environment from the system and then setting up the database connection pool. After doing both of those, we create a new Config value with the environment and pool.

> getConfig :: IO Config
> getConfig = do
>   e <- getEnvironment
>   p <- getPool e
>   return Config
>     { environment = e
>     , pool = p
>     }

The data type for Config is pretty simple. It has two fields. The first is the environment, which uses a data type we'll define shortly. The second is Persistent's ConnectionPool.

> data Config = Config
>   { environment :: Environment
>   , pool :: DB.ConnectionPool
>   }

Environment

We want to read the environment from the SCOTTY_ENV environment variable, then parse that string as our Environment data type and return it. If it doesn't parse, we'll just blow up.

$ env SCOTTY_ENV=not-an-environment cabal run
hairy: Prelude.read: no parse

If we wanted to handle it more gracefully, we could use Text.Read.readMaybe. However I'm ok with the exception here.

> getEnvironment :: IO Environment
> getEnvironment = do
>   m <- lookupEnv "SCOTTY_ENV"
>   let e = case m of
>         Nothing -> Development
>         Just s -> read s
>   return e

This also could've been written in the point-free style.

getEnvironment :: IO Environment
getEnvironment = fmap
  (maybe Development read)
  (lookupEnv "SCOTTY_ENV")

Now that we've seen how to get the environment, let's see what the possible environments are. You could add more environments, like Staging, to suite your particular needs.

> data Environment
>   = Development
>   | Production
>   | Test
>   deriving (Eq, Read, Show)

The only instance we need to derive is Read. Show is handy for debugging and error reporting. Eq is useful if you want to do different things based on the environment without pattern matching.

Database

Next let's take a look at the database connection pool. It will be used by the application to make database queries, so it's responsible for configuring the database itself. That means logging, connection parameters, and pool size. To start, the top-level function gets the connection parameters and pool size, then determines which kind of logging to use.

> getPool :: Environment -> IO DB.ConnectionPool
> getPool e = do
>   s <- getConnectionString e
>   let n = getConnectionSize e
>   case e of
>     Development -> runStdoutLoggingT
>       (DB.createPostgresqlPool s n)
>     Production -> runStdoutLoggingT
>       (DB.createPostgresqlPool s n)
>     Test -> runNoLoggingT
>       (DB.createPostgresqlPool s n)

This function is a little weird. I wish it could be written like this:

getPool :: Environment -> IO DB.ConnectionPool
getPool e = do
  s <- getConnectionString e
  let n = getConnectionSize e
      p = DB.createPostgresqlPool s n
      t = case e of
        Development -> runStdoutLoggingT
        Production -> runStdoutLoggingT
        Test -> runNoLoggingT
  t p

Unfortunately the type system won't allow it. runStdoutLoggingT and runNoLoggingT work on different monad transformers. createPostgresqlPool is fine with either of them, but it can't accept both simultaneously.

Anyway, just as we looked up the environment through SCOTTY_ENV, we're going to look up the database connection parameters through DATABASE_URL. It's expected to look like this: postgres://user:pass@host:port/db. If it doesn't look like that, we'll blow up.

$ env DATABASE_URL=not-a-database-url cabal run
hairy: couldn't parse absolute uri
> getConnectionString :: Environment -> IO DB.ConnectionString
> getConnectionString e = do
>   m <- lookupEnv "DATABASE_URL"
>   let s = case m of
>         Nothing -> getDefaultConnectionString e
>         Just u -> createConnectionString (parseDatabaseUrl u)
>   return s

If it's not given at all, we'll fall back to using a hard-coded default based on the environment.

> getDefaultConnectionString :: Environment -> DB.ConnectionString
> getDefaultConnectionString e =
>   let n = case e of
>         Development -> "hairy_development"
>         Production -> "hairy_production"
>         Test -> "hairy_test"
>   in  createConnectionString
>         [ ("host", "localhost")
>         , ("port", "5432")
>         , ("user", "postgres")
>         , ("dbname", n)
>         ]

We used a utility function that converts a list of text tuples into a database connection string, which is byte string. It joins each tuple with an equals sign and then joins each element in the list with a space.

createConnectionString [("k1", "v1"), ("k2", "v2")]
-- "k1=v1 k2=v2"

This is necessary to convert what Web.Heroku.parseDatabaseUrl gives us into something that Persistent can understand. It also makes our definition of getDefaultConnectionString easier to read.

> createConnectionString
>   :: [(T.Text, T.Text)] -> DB.ConnectionString
> createConnectionString l =
>   let f (k, v) = T.concat [k, "=", v]
>   in  encodeUtf8 (T.unwords (map f l))

The last piece of the database puzzle is the size of the connection pool. In the real world you'd need to benchmark performance using different sizes to see what works best. A good baseline is two times the number of cores. That could be expressed here using GHC.Conc.numCapabilities, but there's no guarantee that the web server and the database server are even running on the same machine.

> getConnectionSize :: Environment -> Int
> getConnectionSize Development = 1
> getConnectionSize Production = 8
> getConnectionSize Test = 1

Transformers

So we've set up our environment and our database connection. That's enough to let us move on to setting up the application itself. All we need to do here is get the options for Scotty and set up a runner for reading the configuration.

> runApplication :: Config -> IO ()
> runApplication c = do
>   o <- getOptions (environment c)

This takes Scotty's monad m and adds the ability to read our custom config c from it. This is called a monad transformer stack. It allows us to use any monad in the stack. So after adding our reader monad, we can both deal with requests (using Scotty's monad) and read our config (using our monad).

>   let r m = runReaderT (runConfigM m) c
>   scottyOptsT o r r application

Next we'll actually define our reader monad. This requires GeneralizedNewtypeDeriving to easily and efficiently derive instances for our type alias. The type signature of runConfigM tells us that it adds the ability to read Config to the IO monad, which is the bottom of Scotty's monad transformer stack.

> newtype ConfigM a = ConfigM
>   { runConfigM :: ReaderT Config IO a
>   } deriving (Applicative, Functor, Monad, MonadIO,
>     MonadReader Config)
Monad transformers are a confusing topic. Read my other post about them for a deeper look.

Options

Let's circle back and see how we get Scotty's options. The data type exposed only has two fields, so there's not a lot for us to do here.

> getOptions :: Environment -> IO Options
> getOptions e = do
>   s <- getSettings e
>   return def
>     { settings = s
>     , verbose = case e of
>       Development -> 1
>       Production -> 0
>       Test -> 0
>     }

I explicitly listed all of the environments here to ensure that I got all of them. In the real world you might use a wildcard match.

verbose = case e of
  Development -> 1
  _ -> 0

Or you can accomplish the same thing on a single line. (This uses the Eq instance we derived for Environment.)

verbose = fromEnum (e == Development)

Settings

Most of the real options are in Wai's settings. The defaults are good for most of them, but we want to make two changes. First, we need to remove the file cache so that static file changes will be picked up. We only want to do this in development since static files should be static in other environments. Then if the PORT environment variable exists, we want to use it to set the port.

> getSettings :: Environment -> IO Settings
> getSettings e = do
>   let s = defaultSettings

Here I'm using primes (') to mark altered versions of the settings. There are probably better ways to do this type of modification, but this works and is fairly straightforward.

>       s' = case e of
>         Development -> setFdCacheDuration 0 s
>         Production -> s
>         Test -> s
>   m <- getPort
>   let s'' = case m of
>         Nothing -> s'
>         Just p -> setPort p s'
>   return s''

Finally we need to handle looking up the port. Like our other functions that read from environment variables, this one will blow up if you give it something it's not expecting.

$ env PORT=not-a-port cabal run
hairy: Prelude.read: no parse
> getPort :: IO (Maybe Int)
> getPort = do
>   m <- lookupEnv "PORT"
>   let p = case m of
>         Nothing -> Nothing
>         Just s -> Just (read s)
>   return p

Errors

The last bit of configuration is to set up our error type. We're going to make it an alias for Text. You could do something fancier here by enumerating the possible error states for your application.

data Error = NotFoundError | ForbiddenError | ...
instance ScottyError Error where ...

We're alright with the default textual errors, so we don't need anything that fancy yet.

> type Error = Text

That wraps up all of the configuration, options, and settings. Everything from here on out deals with the application itself.

Application

Our application has several responsibilities. It needs to run database migrations, set up middleware, install a default exception handler, and define routes. Since everything else could conceivably depend on the database, we'll run the migrations first.

> application :: ScottyT Error ConfigM ()
> application = do
>   runDB (DB.runMigration migrateAll)

runDB is a utility function we'll define a little later. It basically lifts a database operation into the current Scotty monad. migrateAll comes from Hairy.Models and is generated by Persistent's mkMigrate Template Haskell function.

Now that the database has been migrated, we can set up middleware and exception handlers. Both of them depend on the environment, so we have to get that from our reader monad first.

>   e <- lift (asks environment)
>   middleware (loggingM e)
>   defaultHandler (defaultH e)

Finally we can do the routing for our application. All we need is the HTTP method, the path, and the action to route it to.

>   get "/tasks" getTasksA
>   post "/tasks" postTasksA
>   get "/tasks/:id" getTaskA
>   put "/tasks/:id" putTaskA
>   delete "/tasks/:id" deleteTaskA

Routes are matched top down, so if nothing else matched we'll render our not found action.

>   notFound notFoundA

That's it! As your application grows you'll add more routes and middleware, but the basic structure shouldn't change too much.

Let's take a look at that runDB helper we used. It takes a SQL query q and runs it inside our monad transformer stack. It does this by asking the reader monad for the database connection pool, then running the query with that pool in the IO monad.

> runDB :: (MonadTrans t, MonadIO (t ConfigM)) =>
>   DB.SqlPersistT IO a -> t ConfigM a
> runDB q = do
>   p <- lift (asks pool)
>   liftIO (DB.runSqlPool q p)

Middleware

Up next is the logging middleware. In development we want colorful multiline logs flushed every request. In production we want plain log lines flushed sometimes. In testing we don't want logging at all.

> loggingM :: Environment -> Middleware
> loggingM Development = logStdoutDev
> loggingM Production = logStdout
> loggingM Test = id

Before we define our default exception handler, let's create an alias for our Scotty actions. They're all going to have the same type, so we don't want to repeat ourselves over and over again.

> type Action = ActionT Error ConfigM ()

Since our default exception handler handles uncaught exceptions in our application, we want it print out the exceptions in development but swallow them in production. We don't really care what happens to them in testing. In the real world you might send the exception to another service.

> defaultH :: Environment -> Error -> Action
> defaultH e x = do
>   status internalServerError500
>   let o = case e of
>         Development -> object ["error" .= showError x]
>         Production -> Null
>         Test -> object ["error" .= showError x]
>   json o

Actions

At long last we can get to the meat of our application: the actions. This is where all of your business logic lives. Since Hairy is just a basic CRUD app, there's not a lot going on here.

This action gets all the tasks from the database and renders them as JSON.

> getTasksA :: Action
> getTasksA = do
>   ts <- runDB (DB.selectList [] [])
>   json (ts :: [DB.Entity Task])

This one allows you to create new tasks by posting JSON to it. If the JSON isn't valid, an exception will be raised. That means in development you'll get a helpful error message, but in production you'll get a blank 500.

$ curl -X POST localhost:3000/tasks -d not-json
{"error":"jsonData - no parse: not-json"}
> postTasksA :: Action
> postTasksA = do
>   t <- jsonData
>   runDB (DB.insert_ t)
>   status created201
>   json (t :: Task)

This action gets a task from the database. If it was found, it renders it as JSON. If it wasn't, it renders the generic not found action.

> getTaskA :: Action
> getTaskA = do
>   i <- param "id"
>   m <- runDB (DB.get (toKey i))
>   case m of
>     Nothing -> notFoundA
>     Just t -> json (t :: Task)

This one will either update an existing task or create a new one with the given ID. Then it renders the task as JSON.

> putTaskA :: Action
> putTaskA = do
>   i <- param "id"
>   t <- jsonData
>   runDB (DB.repsert (toKey i) t)
>   json (t :: Task)

This is the last action. It will delete a task with the given ID. If there is no such task, it returns 200 anyway. In either case, null is returned.

> deleteTaskA :: Action
> deleteTaskA = do
>   i <- param "id"
>   runDB (DB.delete (toKey i :: TaskId))
>   json Null

Utilities

That wraps up the business logic. We only have a couple other things to attend to. We used toKey, a helper function that converts a request parameter into a database key. It allows us to query for stuff from the database using request parameters.

This helper function requires the FlexibleContexts language extension, although I can't really tell you why. If you don't have it, GHC complains. If you do have it, everything works fine.

> toKey :: DB.ToBackendKey DB.SqlBackend a =>
>   Integer -> DB.Key a
> toKey i = DB.toSqlKey (fromIntegral (i :: Integer))

The last thing we need to do is define our not found action. All it does is set the HTTP status to 404 and render null.

> notFoundA :: Action
> notFoundA = do
>   status notFound404
>   json Null

That's all there is to it! With less than 200 lines of code we've created a JSON REST API with some CRUD actions. It's all backed by a database and can be configured to run in development mode on your machine or in production on Heroku.

I used a few excellent resources while working on this post, including: