Skip to content

0024. Email Tutorials Haskell For Beginners ‐ Algebraic Data Types with Haskell

Bernard Sibanda edited this page Dec 8, 2025 · 1 revision

Algebraic Data Types with Haskell

24.1 Enumeration Sum Types: Defining Month

We start with a small exercise:

Define a type Month that lists all the months in the year, and then define some values of that type.

A solution looks like this:

data Month
  = January
  | February
  | March
  | April
  | May
  | June
  | July
  | August
  | September
  | October
  | November
  | December
  deriving Show

And some values:

jan, feb, dec :: Month
jan = January
feb = February
dec = December

Key points:

  • Month is a sum type: it can be January OR February OR … OR December.

  • Each of January, February, etc. is a nullary constructor:

    • They take no arguments.
    • They just represent a single fixed value.
  • deriving Show lets us print values in ghci:

ghci> jan
January
ghci> [jan, feb, dec]
[January,February,December]

These kinds of types are often called enumeration types:

  • They enumerate all possible values in a small, fixed set.

  • They are much safer than using Int for months, because:

    • Int has many more possible values than 12.
    • You can easily create nonsense like “month 42” with plain Int.

24.2 A Sum Type with Data: Status

Next example: a sum type where one branch carries extra data.

import Data.Int (Int8)

data Status
  = Success
  | Failure Int8
  deriving Show

Intuition:

  • Status represents whether something succeeded or failed.

  • Success:

    • Nullary constructor → no extra info.
  • Failure Int8:

    • Constructor that carries an Int8 → a failure code.

How many values can Status have?

  • Int8 has 8 bits of precision:

    • Possible values are from –128 to 127, inclusive → 256 possibilities.
  • For Status:

    • Success contributes 1 possible value.
    • Failure Int8 contributes 256 possible values (one for each possible Int8).

So total:

1 (Success) + 256 (Failure n) = 257

Thus, Status has 257 possible values.

This shows the “sum” aspect of sum types:

  • Total number of values for a sum type = sum of the possibilities of each constructor.

24.3 Mixing Sum and Product: The Type A

Now a more contrived type:

data A
  = A Bool Int8
  | B Int8 Int8 Int8 Bool
  | C
  deriving Show

Observations:

  • Multiple constructors → sum type.
  • Some constructors have multiple fields → product types inside.

So A is both a sum and product type.

How many values can each constructor represent?

  1. Constructor A Bool Int8

    • Bool has 2 values: True, False.
    • Int8 has 256 values.
    • Product type, so we multiply:
    2 * 256 = 512
    

    So any value built with A can be one of 512 possibilities.

  2. Constructor B Int8 Int8 Int8 Bool

    • Three Int8s → each 256 possibilities.
    • One Bool → 2 possibilities.

    Product again:

    256 * 256 * 256 * 2 = 33,554,432
    

    So values built with B have 33,554,432 possibilities.

  3. Constructor C

    • Nullary constructor → 1 possible value (just C itself).

Total number of possible A values

Sum across constructors:

A:        512
B: 33,554,432
C:          1--------------
Total: 33,554,945

So:

  • Product parts → multiply possibilities.
  • Sum over constructors → add possibilities.
  • Hence the terms product types and sum types.

24.4 What Are Algebraic Data Types (ADTs)?

We’ve essentially been building up to this definition.

Algebraic Data Types (ADTs) are:

Types formed by combining other types using two primitive operations:

  • Product (and)
  • Sum (or)

Informal definitions:

  • Product type: “has these fields AND those fields”.
  • Sum type: “is this constructor OR that constructor OR another one”.

Quoting the Scala glossary (paraphrased):

An algebraic data type is a type defined by providing several alternatives, each of which comes with its own constructor.

Generic template in Haskell

Most ADTs in Haskell follow this pattern:

data TypeName
  = Constructor1 T11 T12 ...
  | Constructor2 T21 ...
  | Constructor3 ...
  | Constructor4
  deriving (Show, Eq, Ord, ...)
  • Each ConstructorN is a possible alternative.

  • Each constructor can take zero or more parameters:

    • Zero parameters → nullary (like True, False, January, C).
    • One or more → product of those types.

By mixing sums and products, Haskell lets you define rich, precise models of your problem domain with very lightweight syntax. This:

  • Improves type safety (fewer invalid states).
  • Improves readability (type signatures say what you mean, not just raw Int/String blobs).

24.5 Module Recap (Big Picture)

In this part of the course we:

  1. Looked at the theory of product types via the Cartesian product of sets.

  2. Saw that tuples are built-in product types in Haskell, but:

    • (Int, Int, Int) doesn’t say “date”.
    • It doesn’t prevent nonsense values.
  3. Learned about:

    • Type synonyms (type Date = (Day, Month, Year)).
    • Datatype renamings via newtype (e.g. newtype Day = Day Int).
    • Their strengths and limitations.
  4. Saw deriving Show as a way to make our custom types printable in ghci, with a promise to return to type classes later.

  5. Defined our own product types (Date), sum types (Bool, Month, Status), and combinations (A).

  6. Finally arrived at the notion of algebraic data types as combinations of sum and product types.

With this, you’re well equipped to:

  • Design your own ADTs.
  • Model your domain more clearly and safely.
  • Prepare for upcoming topics: pattern matching, polymorphism, and type classes.

24.6 Multiple Choice Questions (with Answers)

Q1. What kind of type is Month in:

data Month
  = January | February | March | April
  | May | June | July | August
  | September | October | November | December
  deriving Show

A. A product type B. A sum (enumeration) type C. A type synonym D. A newtype

Answer: B

Q2. Why is Month safer than using Int to represent months?

A. Because Month values are faster to compare than Int. B. Because it uses less memory than Int. C. Because it only allows one of the twelve valid month values, instead of any integer. D. Because it automatically checks the number of days in the month.

Answer: C

Q3. In data Status = Success | Failure Int8, how many values can Status take?

A. 2 B. 256 C. 257 D. Infinite

Answer: C (1 for Success, 256 for Failure n.)

Q4. For data A = A Bool Int8 | B Int8 Int8 Int8 Bool | C, what is the number of possibilities for constructor B?

A. 2 * 256 B. 256 * 256 C. 256 * 256 * 256 * 2 D. 256 * 256 * 256

Answer: C

Q5. What makes A in the previous question an algebraic data type?

A. It’s defined using newtype. B. It combines sum (multiple constructors) and product (constructors with several fields) structure. C. It uses only Int8 and Bool. D. It has exactly three constructors.

Answer: B

Q6. Which of the following is the core idea behind algebraic data types (ADTs)?

A. They allow only numeric types. B. They are types that are defined using loops and recursion. C. They are types formed by combining other types using sums (“or”) and products (“and”). D. They automatically derive instances for all type classes.

Answer: C

Quiz and Onchain Progress NFT

Clone this wiki locally