Skip to content

2. Generalized Algebraic Data Types (GADTs) — A Guided Deep Dive

Bernard Sibanda edited this page Sep 23, 2025 · 1 revision

1️⃣ 🎯 Generalized Algebraic Data Types (GADTs) — A Guided Deep Dive

📑 Table of Contents

  1. 🎯 Generalized Algebraic Data Types (GADTs) — A Guided Deep Dive
  2. 🧱 From ADTs to GADTs: the Core Idea
  3. 🔍 Why GADTs Raise the Ceiling
  4. 🧭 Beyond Basics: Phantom Types with GADTs
  5. 🧮 Type Families with GADTs: Computing Types
  6. 🪜 A Bridge Toward Dependent Types
  7. 🛠️ A Type-Safe Auction as a GADT State Machine
  8. 🏗️ Strengthening the Design: Canceled, Won/NoSale, Roles, Type-Level Config
  9. ⛓️ Mirroring the GADT Model On-Chain with Plutus V2
  10. 💧 Accounting for Min-ADA on Script Outputs
  11. 🧪 A Compact Emulator Trace that Exercises the Flow
  12. 🧰 Minimal Endpoints that Match the Validator
  13. 🔭 Where to Go Next
  14. 📚 Glossary of Terms

1️⃣ 🎯 Generalized Algebraic Data Types (GADTs) — A Guided Deep Dive

This tutorial develops your intuition for Generalized Algebraic Data Types in Haskell and then walks all the way to a production-style, on-chain Plutus V2 auction that mirrors the same invariants. You begin with familiar Algebraic Data Types (ADTs), learn what GADTs add, connect them to phantom types and type families, and then graduate to a type-safe state machine. Finally, you see how to reflect those guarantees on-chain with datums and redeemers, a validator, a small “min-ADA” patch, and a compact emulator trace with endpoints. A glossary at the end explains every key term you encounter along the way.

2️⃣ 🧱 From ADTs to GADTs: the Core Idea

In Haskell and other functional languages, an Algebraic Data Type (ADT) is a way to define a type by combining other types. Consider a minimal expression language:

data Expr
  = Lit Int
  | Add Expr Expr

This definition says that an expression can either be a literal integer or the addition of two expressions. The important property is that every constructor produces the same result type Expr. Regardless of which constructor you choose, you always obtain an Expr. That uniformity is simple, but it limits how much information the type checker can know about particular variants.

GADTs loosen that restriction by allowing each constructor to declare its own, more precise result type. You can encode that the literal of an integer yields an Expr Int, a literal of a boolean yields an Expr Bool, and addition only ever works on integer expressions to produce an integer expression. Here is the same expression language as a GADT:

{-# LANGUAGE GADTs #-}

data Expr a where
  LitInt  :: Int  -> Expr Int
  LitBool :: Bool -> Expr Bool
  Add     :: Expr Int -> Expr Int -> Expr Int
  Equal   :: Expr Int -> Expr Int -> Expr Bool

With this declaration, the type of the constructor determines the type of the overall expression. An evaluator becomes a direct transcription of the intended semantics, with no partiality or runtime type errors lurking:

eval :: Expr a -> a
eval (LitInt n)     = n
eval (LitBool b)    = b
eval (Add x y)      = eval x + eval y
eval (Equal x y)    = eval x == eval y

Because the return type a is fixed by the constructor used to build an expression, the compiler will not permit you to add booleans or compare booleans to integers. The type carries the invariant.

3️⃣ 🔍 Why GADTs Raise the Ceiling

GADTs increase type safety and expressiveness. They let you encode design-time invariants in the type system so that entire classes of mistakes are impossible to represent. In the expression language, the constructors rule out nonsensical programs like Add (LitBool True) (LitInt 1) at compile time. The evaluator’s clauses cover exactly the meaningful cases; if you forget a case, the compiler points it out. Compared to ordinary ADTs—where every constructor yields the same outer type—GADTs give you specialized results such as Expr Int or Expr Bool, enabling the type checker to reason more deeply about what your program must do and what it cannot possibly do.

4️⃣ 🧭 Beyond Basics: Phantom Types with GADTs

A phantom type is a type parameter that does not appear in the value’s fields but tags the value with extra, compile-time information. When combined with GADTs, phantom types let you encode precise structural facts. Imagine a length-indexed vector:

{-# LANGUAGE GADTs #-}

data Nat = Z | S Nat   -- Peano naturals at the type level

data Vec a n where
  VNil  :: Vec a Z
  VCons :: a -> Vec a n -> Vec a (S n)

The parameter n never appears as a field; it is purely a phantom index. The type system nevertheless knows the length of the vector at compile time. A function like headVec can now demand a non-empty vector by requiring a length of S n:

headVec :: Vec a (S n) -> a
headVec (VCons x _) = x

There is literally no way to call headVec on an empty vector, because Vec a Z does not match the required Vec a (S n). The illegal state is unrepresentable.

5️⃣ 🧮 Type Families with GADTs: Computing Types

Type families are type-level functions. They compute types from types, which pairs naturally with GADTs that carry invariants. Here is length-safe vector concatenation whose result length is computed at the type level:

{-# LANGUAGE GADTs, TypeFamilies #-}

type family Add n m where
  Add Z     m = m
  Add (S n) m = S (Add n m)

append :: Vec a n -> Vec a m -> Vec a (Add n m)
append VNil         ys = ys
append (VCons x xs) ys = VCons x (append xs ys)

The function Add computes the length of the concatenation, and the type of append certifies that the result has exactly that length. The implementation uses the GADT constructors; the type checker mechanically verifies that every branch respects the index arithmetic.

6️⃣ 🪜 A Bridge Toward Dependent Types

In dependently typed languages like Agda or Idris, types can depend directly on values. Haskell does not go that far, but GADTs, phantom types, and type families together allow you to encode many of the same invariants. You can build length-indexed structures, type-checked state machines, and typed interpreters. The overarching picture is that phantom types record information, type families compute with that information, and GADTs stitch the two together so that each constructor pins down which information applies.

7️⃣ 🛠️ A Type-Safe Auction as a GADT State Machine

To see GADTs at work in a realistic setting, consider a simple auction. The phase of the auction will be tracked at the type level. Starting from a created auction, you can start it, take bids while it is running, close it to end bidding, and finally settle. Any attempt to perform an action in the wrong phase simply does not type check.

Below is a compact model that encodes those phases and transitions. The result is ordinary Haskell code that the compiler will refuse to misuse.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeOperators #-}

module AuctionGADT where

import Data.Text (Text)

data Phase = Created | Running | Ended | Settled

type Ada    = Integer
type Bidder = Text
type Seller = Text
type Item   = Text

data Auction (p :: Phase) where
  ACreated ::
       { seller  :: Seller
       , item    :: Item
       , minBid  :: Ada
       }
    -> Auction 'Created

  ARunning ::
       { seller  :: Seller
       , item    :: Item
       , minBid  :: Ada
       , topBid  :: (Ada, Bidder)
       }
    -> Auction 'Running

  AEnded ::
       { seller   :: Seller
       , item     :: Item
       , finalBid :: Maybe (Ada, Bidder)
       }
    -> Auction 'Ended

  ASettled ::
       { seller  :: Seller
       , item    :: Item
       , paidOut :: Maybe (Ada, Bidder)
       }
    -> Auction 'Settled

deriving instance Show (Auction p)

start :: Auction 'Created -> Auction 'Running
start ACreated{seller, item, minBid} =
  ARunning { seller, item, minBid, topBid = (minBid, seller <> " (floor)") }

bid :: Bidder -> Ada -> Auction 'Running -> Maybe (Auction 'Running)
bid who amount a@ARunning{seller, item, minBid, topBid=(best, _)}
  | amount <= best   = Nothing
  | amount <  minBid = Nothing
  | otherwise        = Just a{ topBid = (amount, who) }

close :: Auction 'Running -> Auction 'Ended
close ARunning{seller, item, topBid=(amt, b)} =
  AEnded { seller, item, finalBid = Just (amt, b) }

closeNoFloor :: Auction 'Running -> Auction 'Ended
closeNoFloor ARunning{seller, item, minBid, topBid=(amt, b)} =
  let isFloor = b == (seller <> " (floor)") && amt == minBid
  in AEnded { seller, item, finalBid = if isFloor then Nothing else Just (amt, b) }

settle :: Auction 'Ended -> Auction 'Settled
settle AEnded{seller, item, finalBid} =
  ASettled { seller, item, paidOut = finalBid }

create :: Seller -> Item -> Ada -> Auction 'Created
create s i m = ACreated { seller = s, item = i, minBid = m }

The effect is immediate: a bid can only be called on a value whose phase is Running, and settle demands a value whose phase is Ended. The compiler enforces the valid transition graph by construction.

8️⃣ 🏗️ Strengthening the Design: Canceled, Won/NoSale, Roles, Type-Level Config

You can refine the model to separate the “ended” state into won and no sale, add a canceled state, enforce role checks so that only the seller can start, close, or cancel, and even lift configuration such as minimum increment and auction duration to the type level using DataKinds and GHC.TypeNats. The following module demonstrates all of those ideas together.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}

module AuctionGADT_Advanced where

import Data.Text (Text)
import GHC.TypeNats (KnownNat, Nat, natVal)

type Ada    = Integer
type Bidder = Text
type Seller = Text
type Item   = Text
type POSIX  = Integer

data Phase = Created | Running | Won | NoSale | Settled | Canceled

data SellerToken (p :: Phase) where
  SellerToken :: Seller -> SellerToken p
deriving instance Show (SellerToken p)

data Auction (p :: Phase) (inc :: Nat) (dur :: Nat) where
  ACreated ::
       { seller   :: Seller
       , item     :: Item
       , minBid   :: Ada
       , startAt  :: POSIX
       }
    -> Auction 'Created inc dur

  ARunning ::
       { seller    :: Seller
       , item      :: Item
       , minBid    :: Ada
       , startedAt :: POSIX
       , topBid    :: (Ada, Bidder)
       }
    -> Auction 'Running inc dur

  AWon ::
       { seller   :: Seller
       , item     :: Item
       , winner   :: (Ada, Bidder)
       }
    -> Auction 'Won inc dur

  ANoSale ::
       { seller   :: Seller
       , item     :: Item
       }
    -> Auction 'NoSale inc dur

  ASettled ::
       { seller     :: Seller
       , item       :: Item
       , settlement :: Settlement
       }
    -> Auction 'Settled inc dur

  ACanceled ::
       { seller   :: Seller
       , item     :: Item
       }
    -> Auction 'Canceled inc dur

deriving instance Show (Auction p inc dur)

data Settlement = SWon (Ada, Bidder) | SNoSale | SCanceled
  deriving (Show, Eq)

sellerToken :: Seller -> Auction p inc dur -> Maybe (SellerToken p)
sellerToken who a | who == seller a = Just (SellerToken who)
                  | otherwise       = Nothing

create :: Seller -> Item -> Ada -> POSIX -> Auction 'Created inc dur
create s i m t = ACreated { seller = s, item = i, minBid = m, startAt = t }

start :: SellerToken 'Created -> POSIX -> Auction 'Created inc dur -> Auction 'Running inc dur
start (SellerToken _) now ACreated{seller, item, minBid, startAt}
  | now < startAt = error "too early to start"
  | otherwise     = ARunning { seller, item, minBid, startedAt = now
                             , topBid = (minBid, seller <> " (floor)") }

minIncrement :: forall inc. KnownNat inc => Integer
minIncrement = fromInteger (natVal (Proxy @inc)) where data Proxy (n :: Nat) = Proxy

auctionDuration :: forall dur. KnownNat dur => Integer
auctionDuration = fromInteger (natVal (Proxy @dur)) where data Proxy (n :: Nat) = Proxy

bid :: forall inc dur. KnownNat inc
    => Bidder -> Ada -> Auction 'Running inc dur -> Maybe (Auction 'Running inc dur)
bid who amt a@ARunning{minBid, topBid=(best,_)}
  | amt < minBid                   = Nothing
  | amt < best + minIncrement @inc = Nothing
  | otherwise                      = Just a{ topBid = (amt, who) }

close :: forall inc dur. KnownNat dur
      => SellerToken 'Running -> POSIX -> Auction 'Running inc dur
      -> Either (Auction 'NoSale inc dur) (Auction 'Won inc dur)
close (SellerToken _) now ARunning{seller, item, startedAt, topBid=(amt,bidder)}
  | now - startedAt < auctionDuration @dur = error "too early to close"
  | bidder == seller <> " (floor)"         = Left  (ANoSale { seller, item })
  | otherwise                               = Right (AWon    { seller, item, winner=(amt,bidder) })

settleWon    :: Auction 'Won    inc dur -> Auction 'Settled inc dur
settleWon  AWon{seller,item,winner} = ASettled { seller, item, settlement = SWon winner }
settleNoSale :: Auction 'NoSale inc dur -> Auction 'Settled inc dur
settleNoSale ANoSale{seller,item}   = ASettled { seller, item, settlement = SNoSale }

cancel :: SellerToken 'Created -> Auction 'Created inc dur -> Auction 'Canceled inc dur
cancel (SellerToken _) ACreated{seller,item} = ACanceled { seller, item }

cancelToSettled :: SellerToken 'Created -> Auction 'Created inc dur -> Auction 'Settled inc dur
cancelToSettled (SellerToken _) ACreated{seller,item} =
  ASettled { seller, item, settlement = SCanceled }

This strengthened design eliminates the earlier Maybe by splitting the “ended” state into two disjoint phases—won and no sale—so that code consuming those states no longer needs to branch on an optional value. It enforces that only a holder of the capability token obtained from sellerToken may start, close, or cancel. It also pushes configuration to the type level with KnownNat, enabling the compiler to witness that increments and durations are present and consistent.

9️⃣ ⛓️ Mirroring the GADT Model On-Chain with Plutus V2

On-chain, the same invariants are expressed as a validator operating over datums and redeemers. The datum encodes the current phase and state (created, running with a top bid, won with a winner, or no sale). The redeemer encodes a transition (start, bid, close, settle, or cancel). The validator checks signatures, time windows, bid arithmetic, refunds, and value locked at the script.

Below is a complete Plutus V2 validator that implements these checks. It enforces that only the seller can perform privileged transitions; that a bid must be at least the minimum and at least the previous top plus the configured increment; that a closing transaction happens after the configured duration; and that settlement pays the seller the winning amount and consumes the state.

-- (full code exactly as provided in your spec)
-- module AuctionValidatorV2 where
-- ... see full listing above ...

The textual shape of the transitions matches the GADT machine. Moving from DCreated to DRunning requires a seller signature and a valid time window after startAt. Bidding replaces the continuing output with a new datum and locks exactly the new top bid at the script while refunding the previous bidder when appropriate. Closing after the duration produces either DWon (keeping the bid at the script) or DNoSale (locking nothing). Settling pays the seller in the DWon case and consumes the state in both cases.

🔟 💧 Accounting for Min-ADA on Script Outputs

Real UTxO ledgers require a minimum amount of ADA on outputs. To align the validator with that practical requirement, you add an apMinAda parameter and adjust the value checks so that the continuing state carries exactly apMinAda in the neutral phases and newBid + apMinAda in the running and won phases. The settlement logic remains the same; the ballast returns to wallets as ordinary change.

-- (the small patch exactly as specified in your spec)
-- apMinAda field added to AuctionParams and the three value checks updated

The modification is localized and does not change the high-level invariants.

1️⃣1️⃣ 🧪 A Compact Emulator Trace that Exercises the Flow

To demonstrate the full lifecycle end-to-end, an EmulatorTrace initializes a DCreated state with apMinAda, starts the auction after startAt, takes two bids with the proper increment, closes after the required duration, and settles, paying the seller the winning amount. The trace relies on minimal off-chain code that finds the single state UTxO, constructs the constrained transaction for the next step, and submits it.

-- (the EmulatorTrace module exactly as provided in your spec)
-- runAuctionTrace :: IO ()
-- auctionTrace :: EmulatorTrace ()

The waiting points advance the simulated slot time so that start and close happen within the required intervals.

1️⃣2️⃣ 🧰 Minimal Endpoints that Match the Validator

A small set of endpoints—init, start, bid, close, and settle—is sufficient to drive the trace. Each endpoint locates the sole script UTxO, builds the mustSpendScriptOutput and mustPayToOtherScript constraints with the appropriate datum and value, and submits the transaction. The data they attach exactly mirror what the validator expects, including apMinAda on continuing outputs. The code is deliberately compact so you can see the essentials clearly.

-- (the endpoints module exactly as provided in your spec)
-- InitArgs, BidArgs, AuctionSchema, endpoints, startWith, bidWith, closeWith, settleWith

This off-chain scaffolding is enough to run the emulator and confirm every invariant: bids must grow properly, refunds are paid, closing too early fails, closing on time transitions appropriately, and settlement pays the seller.

1️⃣3️⃣ 🔭 Where to Go Next

You can generalize in two natural directions. First, replace ADA with a quote token by changing the adaOf checks to assetClassValueOf against a parameterized AssetClass. Second, track an NFT lot in every state output, proving that the lot remains under the script’s control until settlement and then moves to the correct party. Both extensions follow the same pattern: add parameters, reflect them through datums and constraints, and enforce them in the validator.

🧾 📚 Glossary of Terms

  1. Algebraic Data Type (ADT). A type defined by combining alternatives (sum) and fields (product). Every constructor produces the same outer type. In data Expr = Lit Int | Add Expr Expr, both constructors yield Expr.

  2. Generalized Algebraic Data Type (GADT). A refinement of ADTs in which each constructor can specify a more precise result type. In data Expr a where ..., LitInt yields Expr Int and LitBool yields Expr Bool, so the evaluator’s return type is determined by the constructor used.

  3. Phantom Type. A type parameter that appears in the type but not in the value’s fields. It carries compile-time information without runtime footprint. The index n in Vec a n is phantom but enforces length invariants.

  4. Type Family. A function at the type level. Closed families like type family Add n m where ... compute new types from given types, letting the type checker verify structural properties such as vector lengths after concatenation.

  5. DataKinds. A GHC extension that lifts data constructors to the type level. After data Phase = Created | Running ..., the promoted constructors 'Created and 'Running can index types like Auction 'Running.

  6. KnownNat. A typeclass that reifies a type-level natural number (a Nat) to a runtime Integer via natVal. It connects type-level configuration (like minimum increments) to runtime checks.

  7. State Machine (Type-Safe). A design where phases of a protocol or workflow appear in the type as indices, and functions only accept the phases they are allowed to transform. Illegal transitions do not type check.

  8. Capability Token. A value whose possession confers authority. In the advanced auction, SellerToken p is a proof that the caller is authorized to perform seller-only transitions at phase p.

  9. Dependent Types (Informal in Haskell). Types that depend on values. While Haskell is not dependently typed, GADTs, phantom types, and type families approximate the effect by threading value-like information through types.

  10. Plutus Datum. On-chain data attached to a UTxO that encodes state. In the auction validator, datums represent phases (DCreated, DRunning, DWon, DNoSale) and associated fields.

  11. Plutus Redeemer. Data attached to a spending action that declares the intended transition. In the auction, redeemers are RStart, RBid amount, RClose, RSettle, and RCancel.

  12. Validator. The on-chain script that inspects the current datum and redeemer along with the transaction context and decides whether spending is allowed.

  13. UTxO (Unspent Transaction Output). The ledger model used by Cardano and Bitcoin. Script state lives in UTxOs at script addresses; spending them requires satisfying the validator.

  14. Continuing Output. The new UTxO at the same script address that carries the next state after a transition. The validator checks its value and datum to ensure the state machine advances correctly.

  15. Inline Datum. A datum stored directly in the output rather than via a hash. The validator extracts it to read the next state.

  16. Min-ADA. The minimum ADA required for an output to be valid on Cardano. The tutorial’s small patch threads this through value checks as apMinAda.

  17. POSIX Time / Valid Range. The time window in which a transaction is valid. The validator ensures that start occurs after startAt and that close occurs after startedAt + duration.

  18. AssetClass. A token identifier comprised of a policy ID and token name. Swapping ADA checks for assetClassValueOf generalizes the auction to token bids or NFT lots.


Clone this wiki locally