Haskell Avro Encoding and Decoding Native Support (no RPC)
Latest commit a4cb52b Oct 4, 2019

Native Haskell implementation of Avro

This is a Haskell Avro library for decoding and encoding Avro data structures. Avro can be thought of as a serialization format and an RPC specification, which induces two separable tasks:

  • Serialization/Deserialization - This library has been used "in anger" for:
    • Deserialization of Avro container files
    • Serialization/deserialization of Avro messages to/from Kafka topics
  • RPC - There is currently no support for Avro RPC in this library.

This library also provides functionality for automatically generating Avro-related data types and instances from Avro schemas (using TemplateHaskell).

Quickstart

This library provides the following conversions between Haskell types and Avro types:

Haskell type      Avro type
()                "null"
Bool              "boolean"
Int, Int64        "long"
Int32             "int"
Double            "double"
Text              "string"
ByteString        "bytes"
Maybe a           ["null", "a"]
Either a b        ["a", "b"]
Map Text a        {"type": "map", "values": "a"}
Map String a      {"type": "map", "values": "a"}
HashMap Text a    {"type": "map", "values": "a"}
HashMap String a  {"type": "map", "values": "a"}
[a]               {"type": "array", "items": "a"}
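The mapping above is type-directed: each Haskell type determines one Avro type. A minimal sketch of that idea follows, using only base. The class AvroTypeName and its method avroTypeName are hypothetical names invented for this illustration and are not part of this library's API (the library exposes schemas through HasAvroSchema). Text, ByteString and the map types are left out to keep the sketch dependency-free.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Data.Int   (Int32, Int64)
import Data.Proxy (Proxy (..))

-- | Hypothetical class (NOT this library's API): renders the Avro type
-- that a Haskell type maps to, as a JSON-like string.
class AvroTypeName a where
  avroTypeName :: Proxy a -> String

-- Primitive types map to Avro primitive type names.
instance AvroTypeName ()     where avroTypeName _ = "\"null\""
instance AvroTypeName Bool   where avroTypeName _ = "\"boolean\""
instance AvroTypeName Int    where avroTypeName _ = "\"long\""
instance AvroTypeName Int64  where avroTypeName _ = "\"long\""
instance AvroTypeName Int32  where avroTypeName _ = "\"int\""
instance AvroTypeName Double where avroTypeName _ = "\"double\""

-- 'Maybe a' maps to a two-part union whose first branch is "null".
instance AvroTypeName a => AvroTypeName (Maybe a) where
  avroTypeName _ = "[\"null\", " ++ avroTypeName (Proxy :: Proxy a) ++ "]"

-- 'Either a b' maps to a two-part union of both branches.
instance (AvroTypeName a, AvroTypeName b) => AvroTypeName (Either a b) where
  avroTypeName _ =
    "[" ++ avroTypeName (Proxy :: Proxy a)
        ++ ", "
        ++ avroTypeName (Proxy :: Proxy b)
        ++ "]"

-- Lists map to Avro arrays, which name their element type under "items".
instance AvroTypeName a => AvroTypeName [a] where
  avroTypeName _ =
    "{\"type\": \"array\", \"items\": "
      ++ avroTypeName (Proxy :: Proxy a)
      ++ "}"
```

For example, avroTypeName (Proxy :: Proxy (Maybe Int32)) evaluates to the string ["null", "int"].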

User-defined data types should provide HasAvroSchema/ToAvro/FromAvro instances to be encoded to and decoded from Avro.

Defining types and HasAvroSchema / FromAvro / ToAvro manually

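Typically these imports are useful (the examples below also assume the OverloadedStrings extension):

import           Data.Avro
import           Data.Avro.Schema   as S
import qualified Data.Avro.Types    as AT
import           Data.Int           (Int32)
import           Data.List.NonEmpty (NonEmpty (..))
import           Data.Text          (Text)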

Assuming there is a data type to be encoded to and decoded from Avro:

data Gender = Male | Female deriving (Eq, Ord, Show, Enum)
data Person = Person
     { fullName :: Text
     , age      :: Int32
     , gender   :: Gender
     , ssn      :: Maybe Text
     } deriving (Show, Eq)

An Avro schema for this type can be defined as:

genderSchema :: Schema
genderSchema = mkEnum "Gender" [] Nothing Nothing ["Male", "Female"]

personSchema :: Schema
personSchema =
  Record "Person" Nothing [] Nothing Nothing
    [ fld "name"   String       Nothing
    , fld "age"    Int          Nothing
    , fld "gender" genderSchema Nothing
    , fld "ssn" (mkUnion $ Null :| [String]) Nothing
    ]
    where
     fld nm ty def = Field nm [] Nothing Nothing ty def

instance HasAvroSchema Person where
  schema = pure personSchema

A ToAvro instance for Person can be defined as:

instance ToAvro Person where
  toAvro p = record personSchema
             [ "name"   .= fullName p
             , "age"    .= age p
             , "gender" .= gender p
             , "ssn"    .= ssn p
             ]

A FromAvro instance for Person can be defined as:

instance FromAvro Person where
  fromAvro (AT.Record _ r) =
    Person <$> r .: "name"
           <*> r .: "age"
           <*> r .: "gender"
           <*> r .: "ssn"
  fromAvro r = badValue r "Person"

Defining types and HasAvroSchema / FromAvro / ToAvro "automatically"

This library provides functionality to derive Haskell data types and HasAvroSchema/FromAvro/ToAvro instances "automatically" from already existing Avro schemas (using TemplateHaskell).

Examples

deriveAvro will derive data types, FromAvro and ToAvro instances from a provided Avro schema file:

{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE DeriveGeneric   #-}
import Data.Avro.Deriving

deriveAvro "schemas/contract.avsc"

Similarly, deriveFromAvro can be used to only derive data types and FromAvro, but not ToAvro instances.

If you prefer to define the Avro schema in Haskell rather than in an avsc file, deriveAvro' can be used instead of deriveAvro.

Conventions

When Haskell data types are generated, these conventions are followed:

  • Type and field names are "sanitized": all characters except [a-z,A-Z,',_] are removed from names.
  • Field names are prefixed with the name of the record they are declared in.

For example, if an Avro schema defines a Person record as:

{ "type": "record",
  "name": "Person",
  "fields": [
    { "name": "name", "type": "string"}
  ]
}

then the generated Haskell type will look like:

data Person = Person
     { personName :: Text
     } deriving (Show, Eq)
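The two conventions above can be sketched with plain string functions. This is only an illustration under stated assumptions: sanitise and fieldName are hypothetical helpers, not the library's actual code, and the casing rule (lower-case first letter of the record name, upper-case first letter of the field name) is inferred from the Person example rather than taken from the implementation.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, toLower, toUpper)

-- | Hypothetical helper: keep only the characters the convention
-- allows, i.e. [a-z, A-Z, ', _]; everything else is dropped.
sanitise :: String -> String
sanitise = filter allowed
  where
    allowed c = isAsciiLower c || isAsciiUpper c || c == '\'' || c == '_'

-- | Hypothetical helper: prefix the sanitised field name with the
-- sanitised record name. The casing is an assumption inferred from
-- the example (Person + name gives personName).
fieldName :: String -> String -> String
fieldName record field =
  lowerFirst (sanitise record) ++ upperFirst (sanitise field)
  where
    lowerFirst (c:cs) = toLower c : cs
    lowerFirst []     = []
    upperFirst (c:cs) = toUpper c : cs
    upperFirst []     = []
```

With these definitions, fieldName "Person" "name" gives "personName", and sanitise "full-name" gives "fullname" because the hyphen is not in the allowed set.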

Limitations

Two-part unions like ["null", "MyType"] or ["MyType", "YourType"] are supported (as Haskell's Maybe MyType and Either MyType YourType), but multi-part unions are currently not supported. This is not due to any fundamental problem; it simply has not been done yet. PRs are welcome! :)

TODO

Please see the TODO