\chapter{Type theory} \label{cha:typetheory} \section{Type theory versus set theory} \label{sec:types-vs-sets} \label{sec:axioms} \index{type theory} Homotopy type theory is (among other things) a foundational language for mathematics, i.e., an alternative to Zermelo--Fraenkel\index{set theory!Zermelo--Fraenkel} set theory. However, it behaves differently from set theory in several important ways, and that can take some getting used to. Explaining these differences carefully requires us to be more formal here than we will be in the rest of the book. As stated in the introduction, our goal is to write type theory \emph{informally}; but for a mathematician accustomed to set theory, more precision at the beginning can help avoid some common misconceptions and mistakes. We note that a set-theoretic foundation has two layers'': the deductive system of first-order logic,\index{first-order!logic} and, formulated inside this system, the axioms of a particular theory, such as ZFC. Thus, set theory is not only about sets, but rather about the interplay between sets (the objects of the second layer) and propositions (the objects of the first layer). By contrast, type theory is its own deductive system: it need not be formulated inside any superstructure, such as first-order logic. Instead of the two basic notions of set theory, sets and propositions, type theory has one basic notion: \emph{types}. Propositions (statements which we can prove, disprove, assume, negate, and so on\footnote{Confusingly, it is also a common practice (dating back to Euclid) to use the word proposition'' synonymously with theorem''. We will confine ourselves to the logician's usage, according to which a \emph{proposition} is a statement \emph{susceptible to} proof, whereas a \emph{theorem}\indexfoot{theorem} (or lemma''\indexfoot{lemma} or corollary''\indexfoot{corollary}) is such a statement that \emph{has been} proven. Thus $0=1$'' and its negation $\neg(0=1)$'' are both propositions, but only the latter is a theorem.}) are identified with particular types, via the correspondence shown in \cref{tab:pov} on page~\pageref{tab:pov}. Thus, the mathematical activity of \emph{proving a theorem} is identified with a special case of the mathematical activity of \emph{constructing an object}---in this case, an inhabitant of a type that represents a proposition. \index{deductive system}% This leads us to another difference between type theory and set theory, but to explain it we must say a little about deductive systems in general. Informally, a deductive system is a collection of \define{rules} \indexdef{rule}% for deriving things called \define{judgments}. \indexdef{judgment}% If we think of a deductive system as a formal game, \index{game!deductive system as}% then the judgments are the positions'' in the game which we reach by following the game rules. We can also think of a deductive system as a sort of algebraic theory, in which case the judgments are the elements (like the elements of a group) and the deductive rules are the operations (like the group multiplication). From a logical point of view, the judgments can be considered to be the external'' statements, living in the metatheory, as opposed to the internal'' statements of the theory itself. In the deductive system of first-order logic (on which set theory is based), there is only one kind of judgment: that a given proposition has a proof. That is, each proposition $A$ gives rise to a judgment $A$ has a proof'', and all judgments are of this form. A rule of first-order logic such as from $A$ and $B$ infer $A\wedge B$'' is actually a rule of proof construction'' which says that given the judgments $A$ has a proof'' and $B$ has a proof'', we may deduce that $A\wedge B$ has a proof''. Note that the judgment $A$ has a proof'' exists at a different level from the \emph{proposition} $A$ itself, which is an internal statement of the theory. % In particular, we cannot manipulate it to construct propositions such as if $A$ has a proof, then $B$ does not have a proof''---unless we are using our set-theoretic foundation as a meta-theory with which to talk about some other axiomatic system. The basic judgment of type theory, analogous to $A$ has a proof'', is written $a:A$'' and pronounced as the term $a$ has type $A$'', or more loosely $a$ is an element of $A$'' (or, in homotopy type theory, $a$ is a point of $A$''). \indexdef{term}% \indexdef{element}% \indexdef{point!of a type}% When $A$ is a type representing a proposition, then $a$ may be called a \emph{witness}\index{witness!to the truth of a proposition} to the provability of $A$, or \emph{evidence}\index{evidence, of the truth of a proposition} of the truth of $A$ (or even a \emph{proof}\index{proof} of $A$, but we will try to avoid this confusing terminology). In this case, the judgment $a:A$ is derivable in type theory (for some $a$) precisely when the analogous judgment $A$ has a proof'' is derivable in first-order logic (modulo differences in the axioms assumed and in the encoding of mathematics, as we will discuss throughout the book). On the other hand, if the type $A$ is being treated more like a set than like a proposition (although as we will see, the distinction can become blurry), then $a:A$'' may be regarded as analogous to the set-theoretic statement $a\in A$''. However, there is an essential difference in that $a:A$'' is a \emph{judgment} whereas $a\in A$'' is a \emph{proposition}. In particular, when working internally in type theory, we cannot make statements such as if $a:A$ then it is not the case that $b:B$'', nor can we disprove'' the judgment $a:A$''. A good way to think about this is that in set theory, membership'' is a relation which may or may not hold between two pre-existing objects $a$'' and $A$'', while in type theory we cannot talk about an element $a$'' in isolation: every element \emph{by its very nature} is an element of some type, and that type is (generally speaking) uniquely determined. Thus, when we say informally let $x$ be a natural number'', in set theory this is shorthand for let $x$ be a thing and assume that $x\in\nat$'', whereas in type theory let $x:\nat$'' is an atomic statement: we cannot introduce a variable without specifying its type.\index{membership} At first glance, this may seem an uncomfortable restriction, but it is arguably closer to the intuitive mathematical meaning of let $x$ be a natural number''. In practice, it seems that whenever we actually \emph{need} $a\in A$'' to be a proposition rather than a judgment, there is always an ambient set $B$ of which $a$ is known to be an element and $A$ is known to be a subset. This situation is also easy to represent in type theory, by taking $a$ to be an element of the type $B$, and $A$ to be a predicate on $B$; see \cref{subsec:prop-subsets}. A last difference between type theory and set theory is the treatment of equality. The familiar notion of equality in mathematics is a proposition: e.g.\ we can disprove an equality or assume an equality as a hypothesis. Since in type theory, propositions are types, this means that equality is a type: for elements $a,b:A$ (that is, both $a:A$ and $b:A$) we have a type $\id[A]ab$''. (In \emph{homotopy} type theory, of course, this equality proposition can behave in unfamiliar ways: see \cref{sec:identity-types,cha:basics}, and the rest of the book). When $\id[A]ab$ is inhabited, we say that $a$ and $b$ are \define{(propositionally) equal}. \index{propositional!equality}% \index{equality!propositional}% However, in type theory there is also a need for an equality \emph{judgment}, existing at the same level as the judgment $x:A$''.\index{judgment} \symlabel{defn:judgmental-equality}% This is called \define{judgmental equality} \indexdef{equality!judgmental}% \indexdef{judgmental equality}% or \define{definitional equality}, \indexdef{equality!definitional}% \indexsee{definitional equality}{equality, definitional}% and we write it as $a\jdeq b : A$ or simply $a \jdeq b$. It is helpful to think of this as meaning equal by definition''. For instance, if we define a function $f:\nat\to\nat$ by the equation $f(x)=x^2$, then the expression $f(3)$ is equal to $3^2$ \emph{by definition}. Inside the theory, it does not make sense to negate or assume an equality-by-definition; we cannot say if $x$ is equal to $y$ by definition, then $z$ is not equal to $w$ by definition''. Whether or not two expressions are equal by definition is just a matter of expanding out the definitions; in particular, it is algorithmically\index{algorithm} decidable (though the algorithm is necessarily meta-theoretic, not internal to the theory).\index{decidable!definitional equality} As type theory becomes more complicated, judgmental equality can get more subtle than this, but it is a good intuition to start from. Alternatively, if we regard a deductive system as an algebraic theory, then judgmental equality is simply the equality in that theory, analogous to the equality between elements of a group---the only potential for confusion is that there is \emph{also} an object \emph{inside} the deductive system of type theory (namely the type $a=b$'') which behaves internally as a notion of equality''. The reason we \emph{want} a judgmental notion of equality is so that it can control the other form of judgment, $a:A$''. For instance, suppose we have given a proof that $3^2=9$, i.e.\ we have derived the judgment $p:(3^2=9)$ for some $p$. Then the same witness $p$ ought to count as a proof that $f(3)=9$, since $f(3)$ is $3^2$ \emph{by definition}. The best way to represent this is with a rule saying that given the judgments $a:A$ and $A\jdeq B$, we may derive the judgment $a:B$. Thus, for us, type theory will be a deductive system based on two forms of judgment: \begin{center} \medskip \begin{tabular}{cl} \toprule Judgment & Meaning\\ \midrule $a : A$ & $a$ is an object of type $A$''\\ $a \jdeq b : A$ & $a$ and $b$ are definitionally equal objects of type $A$''\\ \bottomrule \end{tabular} \medskip \end{center} % \symlabel{defn:defeq}% When introducing a definitional equality, i.e., defining one thing to be equal to another, we will use the symbol $\defeq$''. Thus, the above definition of the function $f$ would be written as $f(x)\defeq x^2$. Because judgments cannot be put together into more complicated statements, the symbols $:$'' and $\jdeq$'' bind more loosely than anything else.% \footnote{In formalized\indexfoot{mathematics!formalized} type theory, commas and turnstiles can bind even more loosely. For instance, $x:A,y:B\vdash c:C$ is parsed as $((x:A),(y:B))\vdash (c:C)$. However, in this book we refrain from such notation until \cref{cha:rules}.} Thus, for instance, $p:\id{x}{y}$'' should be parsed as $p:(\id{x}{y})$'', which makes sense since $\id{x}{y}$'' is a type, and not as $\id{(p:x)}{y}$'', which is senseless since $p:x$'' is a judgment and cannot be equal to anything. Similarly, $A\jdeq \id{x}{y}$'' can only be parsed as $A\jdeq(\id{x}{y})$'', although in extreme cases such as this, one ought to add parentheses anyway to aid reading comprehension. Moreover, later on we will fall into the common notation of chaining together equalities --- e.g.\ writing $a=b=c=d$ to mean $a=b$ and $b=c$ and $c=d$, hence $a=d$'' --- and we will also include judgmental equalities in such chains. Context usually suffices to make the intent clear. This is perhaps also an appropriate place to mention that the common mathematical notation $f:A\to B$'', expressing the fact that $f$ is a function from $A$ to $B$, can be regarded as a typing judgment, since we use $A\to B$'' as notation for the type of functions from $A$ to $B$ (as is standard practice in type theory; see \cref{sec:pi-types}). \index{assumption|(defstyle}% Judgments may depend on \emph{assumptions} of the form $x:A$, where $x$ is a variable \indexdef{variable}% and $A$ is a type. For example, we may construct an object $m + n : \nat$ under the assumptions that $m,n : \nat$. Another example is that assuming $A$ is a type, $x,y : A$, and $p : \id[A]{x}{y}$, we may construct an element $p^{-1} : \id[A]{y}{x}$. The collection of all such assumptions is called the \define{context};% \index{context} from a topological point of view it may be thought of as a parameter\index{parameter!space} space''. In fact, technically the context must be an ordered list of assumptions, since later assumptions may depend on previous ones: the assumption $x:A$ can only be made \emph{after} the assumptions of any variables appearing in the type $A$. If the type $A$ in an assumption $x:A$ represents a proposition, then the assumption is a type-theoretic version of a \emph{hypothesis}: \indexdef{hypothesis}% we assume that the proposition $A$ holds. When types are regarded as propositions, we may omit the names of their proofs. Thus, in the second example above we may instead say that assuming $\id[A]{x}{y}$, we can prove $\id[A]{y}{x}$. However, since we are doing proof-relevant'' mathematics, \index{mathematics!proof-relevant}% we will frequently refer back to proofs as objects. In the example above, for instance, we may want to establish that $p^{-1}$ together with the proofs of transitivity and reflexivity behave like a groupoid; see \cref{cha:basics}. Note that under this meaning of the word \emph{assumption}, we can assume a propositional equality (by assuming a variable $p:x=y$), but we cannot assume a judgmental equality $x\jdeq y$, since it is not a type that can have an element. However, we can do something else which looks kind of like assuming a judgmental equality: if we have a type or an element which involves a variable $x:A$, then we can \emph{substitute} any particular element $a:A$ for $x$ to obtain a more specific type or element. We will sometimes use language like now assume $x\jdeq a$'' to refer to this process of substitution, even though it is not an \emph{assumption} in the technical sense introduced above. \index{assumption|)}% By the same token, we cannot \emph{prove} a judgmental equality either, since it is not a type in which we can exhibit a witness. Nevertheless, we will sometimes state judgmental equalities as part of a theorem, e.g.\ there exists $f:A\to B$ such that $f(x)\jdeq y$''. This should be regarded as the making of two separate judgments: first we make the judgment $f:A\to B$ for some element $f$, then we make the additional judgment that $f(x)\jdeq y$. In the rest of this chapter, we attempt to give an informal presentation of type theory, sufficient for the purposes of this book; we give a more formal account in \cref{cha:rules}. Aside from some fairly obvious rules (such as the fact that judgmentally equal things can always be substituted\index{substitution} for each other), the rules of type theory can be grouped into \emph{type formers}. Each type former consists of a way to construct types (possibly making use of previously constructed types), together with rules for the construction and behavior of elements of that type. In most cases, these rules follow a fairly predictable pattern, but we will not attempt to make this precise here; see however the beginning of \cref{sec:finite-product-types} and also \cref{cha:induction}.\index{type theory!informal} \index{axiom!versus rules}% \index{rule!versus axioms}% An important aspect of the type theory presented in this chapter is that it consists entirely of \emph{rules}, without any \emph{axioms}. In the description of deductive systems in terms of judgments, the \emph{rules} are what allow us to conclude one judgment from a collection of others, while the \emph{axioms} are the judgments we are given at the outset. If we think of a deductive system as a formal game, then the rules are the rules of the game, while the axioms are the starting position. And if we think of a deductive system as an algebraic theory, then the rules are the operations of the theory, while the axioms are the \emph{generators} for some particular free model of that theory. In set theory, the only rules are the rules of first-order logic (such as the rule allowing us to deduce $A\wedge B$ has a proof'' from $A$ has a proof'' and $B$ has a proof''): all the information about the behavior of sets is contained in the axioms. By contrast, in type theory, it is usually the \emph{rules} which contain all the information, with no axioms being necessary. For instance, in \cref{sec:finite-product-types} we will see that there is a rule allowing us to deduce the judgment $(a,b):A\times B$'' from $a:A$'' and $b:B$'', whereas in set theory the analogous statement would be (a consequence of) the pairing axiom. The advantage of formulating type theory using only rules is that rules are procedural''. In particular, this property is what makes possible (though it does not automatically ensure) the good computational properties of type theory, such as canonicity''.\index{canonicity} However, while this style works for traditional type theories, we do not yet understand how to formulate everything we need for \emph{homotopy} type theory in this way. In particular, in \cref{sec:compute-pi,sec:compute-universe,cha:hits} we will have to augment the rules of type theory presented in this chapter by introducing additional axioms, notably the \emph{univalence axiom}. In this chapter, however, we confine ourselves to a traditional rule-based type theory. \section{Function types} \label{sec:function-types} \index{type!function|(defstyle}% \indexsee{function type}{type, function}% Given types $A$ and $B$, we can construct the type $A \to B$ of \define{functions} \index{function|(defstyle}% \indexsee{map}{function}% \indexsee{mapping}{function}% with domain $A$ and codomain $B$. We also sometimes refer to functions as \define{maps}. \index{domain!of a function}% \index{codomain, of a function}% \index{function!domain of}% \index{function!codomain of}% \index{functional relation}% Unlike in set theory, functions are not defined as functional relations; rather they are a primitive concept in type theory. We explain the function type by prescribing what we can do with functions, how to construct them and what equalities they induce. Given a function $f : A \to B$ and an element of the domain $a : A$, we can \define{apply} \indexdef{application!of function}% \indexdef{function!application}% \indexsee{evaluation}{application, of a function} the function to obtain an element of the codomain $B$, denoted $f(a)$ and called the \define{value} of $f$ at $a$. \indexdef{value!of a function}% It is common in type theory to omit the parentheses\index{parentheses} and denote $f(a)$ simply by $f\,a$, and we will sometimes do this as well. But how can we construct elements of $A \to B$? There are two equivalent ways: either by direct definition or by using $\lambda$-abstraction. Introducing a function by definition \indexdef{definition!of function, direct}% means that we introduce a function by giving it a name --- let's say, $f$ --- and saying we define $f : A \to B$ by giving an equation \begin{equation} \label{eq:expldef} f(x) \defeq \Phi \end{equation} where $x$ is a variable \index{variable}% and $\Phi$ is an expression which may use $x$. In order for this to be valid, we have to check that $\Phi : B$ assuming $x:A$. Now we can compute $f(a)$ by replacing the variable $x$ in $\Phi$ with $a$. As an example, consider the function $f : \nat \to \nat$ which is defined by $f(x) \defeq x+x$. (We will define $\nat$ and $+$ in \cref{sec:inductive-types}.) Then $f(2)$ is judgmentally equal to $2+2$. If we don't want to introduce a name for the function, we can use \define{$\lambda$-abstraction}. \index{lambda abstraction@$\lambda$-abstraction|defstyle}% \indexsee{function!lambda abstraction@$\lambda$-abstraction}{$\lambda$-abstraction}% \indexsee{abstraction!lambda-@$\lambda$-}{$\lambda$-abstraction}% Given an expression $\Phi$ of type $B$ which may use $x:A$, as above, we write $\lam{x:A} \Phi$ to indicate the same function defined by~\eqref{eq:expldef}. Thus, we have $(\lamt{x:A}\Phi) : A \to B.$ For the example in the previous paragraph, we have the typing judgment $(\lam{x:\nat}x+x) : \nat \to \nat.$ As another example, for any types $A$ and $B$ and any element $y:B$, we have a \define{constant function} \indexdef{constant!function}% \indexdef{function!constant}% $(\lam{x:A} y): A\to B$. We generally omit the type of the variable $x$ in a $\lambda$-abstraction and write $\lam{x}\Phi$, since the typing $x:A$ is inferable from the judgment that the function $\lam x \Phi$ has type $A\to B$. By convention, the scope'' \indexdef{variable!scope of}% \indexdef{scope}% of the variable binding $\lam{x}$'' is the entire rest of the expression, unless delimited with parentheses\index{parentheses}. Thus, for instance, $\lam{x} x+x$ should be parsed as $\lam{x} (x+x)$, not as $(\lam{x}x)+x$ (which would, in this case, be ill-typed anyway). Another equivalent notation is \symlabel{mapsto}% $(x \mapsto \Phi) : A \to B.$ \symlabel{blank}% We may also sometimes use a blank $\blank$'' in the expression $\Phi$ in place of a variable, to denote an implicit $\lambda$-abstraction. For instance, $g(x,\blank)$ is another way to write $\lam{y} g(x,y)$. Now a $\lambda$-abstraction is a function, so we can apply it to an argument $a:A$. We then have the following \define{computation rule}\indexdef{computation rule!for function types}\footnote{Use of this equality is often referred to as \define{$\beta$-conversion} \indexsee{beta-conversion@$\beta$-conversion}{$\beta$-reduction}% \indexsee{conversion!beta@$\beta$-}{$\beta$-reduction}% or \define{$\beta$-reduction}.% \index{beta-reduction@$\beta$-reduction|footstyle}% \indexsee{reduction!beta@$\beta$-}{$\beta$-reduction}% }, which is a definitional equality: $(\lamu{x:A}\Phi)(a) \jdeq \Phi'$ where $\Phi'$ is the expression $\Phi$ in which all occurrences of $x$ have been replaced by $a$. Continuing the above example, we have % $(\lamu{x:\nat}x+x)(2) \jdeq 2+2.$ % Note that from any function $f:A\to B$, we can construct a lambda abstraction function $\lam{x} f(x)$. Since this is by definition the function that applies $f$ to its argument'' we consider it to be definitionally equal to $f$:\footnote{Use of this equality is often referred to as \define{$\eta$-conversion} \indexsee{eta-conversion@$\eta$-conversion}{$\eta$-expansion}% \indexsee{conversion!eta@$\eta$-}{$\eta$-expansion}% or \define{$\eta$-expansion. \index{eta-expansion@$\eta$-expansion|footstyle}% \indexsee{expansion, eta-@expansion, $\eta$-}{$\eta$-expansion}% }} $f \jdeq (\lam{x} f(x)).$ This equality is the \define{uniqueness principle for function types}\indexdef{uniqueness!principle!for function types}, because it shows that $f$ is uniquely determined by its values. The introduction of functions by definitions with explicit parameters can be reduced to simple definitions by using $\lambda$-abstraction: i.e., we can read a definition of $f: A\to B$ by $f(x) \defeq \Phi$ as $f \defeq \lamu{x:A}\Phi.$ When doing calculations involving variables, we have to be careful when replacing a variable with an expression that also involves variables, because we want to preserve the binding structure of expressions. By the \emph{binding structure}\indexdef{binding structure} we mean the invisible link generated by binders such as $\lambda$, $\Pi$ and $\Sigma$ (the latter we are going to meet soon) between the place where the variable is introduced and where it is used. As an example, consider $f : \nat \to (\nat \to \nat)$ defined as $f(x) \defeq \lamu{y:\nat} x + y.$ Now if we have assumed somewhere that $y : \nat$, then what is $f(y)$? It would be wrong to just naively replace $x$ by $y$ everywhere in the expression $\lam{y}x+y$'' defining $f(x)$, obtaining $\lamu{y:\nat} y + y$, because this means that $y$ gets \define{captured}. \indexdef{capture, of a variable}% \indexdef{variable!captured}% Previously, the substituted\index{substitution} $y$ was referring to our assumption, but now it is referring to the argument of the $\lambda$-abstraction. Hence, this naive substitution would destroy the binding structure, allowing us to perform calculations which are semantically unsound. But what \emph{is} $f(y)$ in this example? Note that bound (or dummy'') variables \indexdef{variable!bound}% \indexdef{variable!dummy}% \indexsee{bound variable}{variable, bound}% \indexsee{dummy variable}{variable, bound}% such as $y$ in the expression $\lamu{y:\nat} x + y$ have only a local meaning, and can be consistently replaced by any other variable, preserving the binding structure. Indeed, $\lamu{y:\nat} x + y$ is declared to be judgmentally equal\footnote{Use of this equality is often referred to as \define{$\alpha$-conversion. \indexfoot{alpha-conversion@$\alpha$-conversion} \indexsee{conversion!alpha@$\alpha$-}{$\alpha$-conversion} }} to $\lamu{z:\nat} x + z$. It follows that $f(y)$ is judgmentally equal to $\lamu{z:\nat} y + z$, and that answers our question. (Instead of $z$, any variable distinct from $y$ could have been used, yielding an equal result.) Of course, this should all be familiar to any mathematician: it is the same phenomenon as the fact that if $f(x) \defeq \int_1^2 \frac{dt}{x-t}$, then $f(t)$ is not $\int_1^2 \frac{dt}{t-t}$ but rather $\int_1^2 \frac{ds}{t-s}$. A $\lambda$-abstraction binds a dummy variable in exactly the same way that an integral does. We have seen how to define functions in one variable. One way to define functions in several variables would be to use the cartesian product, which will be introduced later; a function with parameters $A$ and $B$ and results in $C$ would be given the type $f : A \times B \to C$. However, there is another choice that avoids using product types, which is called \define{currying} \indexdef{currying}% \indexdef{function!currying of}% (after the mathematician Haskell Curry). \index{programming}% The idea of currying is to represent a function of two inputs $a:A$ and $b:B$ as a function which takes \emph{one} input $a:A$ and returns \emph{another function}, which then takes a second input $b:B$ and returns the result. That is, we consider two-variable functions to belong to an iterated function type, $f : A \to (B \to C)$. We may also write this without the parentheses\index{parentheses}, as $f : A \to B \to C$, with associativity\index{associativity!of function types} to the right as the default convention. Then given $a : A$ and $b : B$, we can apply $f$ to $a$ and then apply the result to $b$, obtaining $f(a)(b) : C$. To avoid the proliferation of parentheses, we allow ourselves to write $f(a)(b)$ as $f(a,b)$ even though there are no products involved. When omitting parentheses around function arguments entirely, we write $f\,a\,b$ for $(f\,a)\,b$, with the default associativity now being to the left so that $f$ is applied to its arguments in the correct order. Our notation for definitions with explicit parameters extends to this situation: we can define a named function $f : A \to B \to C$ by giving an equation $f(x,y) \defeq \Phi$ where $\Phi:C$ assuming $x:A$ and $y:B$. Using $\lambda$-abstraction\index{lambda abstraction@$\lambda$-abstraction} this corresponds to $f \defeq \lamu{x:A}{y:B} \Phi,$ which may also be written as $f \defeq x \mapsto y \mapsto \Phi.$ We can also implicitly abstract over multiple variables by writing multiple blanks, e.g.\ $g(\blank,\blank)$ means $\lam{x}{y} g(x,y)$. Currying a function of three or more arguments is a straightforward extension of what we have just described. \index{type!function|)}% \index{function|)}% \section{Universes and families} \label{sec:universes} So far, we have been using the expression $A$ is a type'' informally. We are going to make this more precise by introducing \define{universes}. \index{type!universe|(defstyle}% \indexsee{universe}{type, universe}% A universe is a type whose elements are types. As in naive set theory, we might wish for a universe of all types $\UU_\infty$ including itself (that is, with $\UU_\infty : \UU_\infty$). However, as in set theory, this is unsound, i.e.\ we can deduce from it that every type, including the empty type representing the proposition False (see \cref{sec:coproduct-types}), is inhabited. For instance, using a representation of sets as trees, we can directly encode Russell's paradox\index{paradox} \cite{coquand:paradox}. % or alternatively, in order to avoid the use of % inductive types to define trees, we can follow Girard \cite{girard:paradox} and encode the Burali-Forti paradox, % which shows that the collection of all ordinals cannot be an ordinal. To avoid the paradox we introduce a hierarchy of universes \indexsee{hierarchy!of universes}{type, universe}% $\UU_0 : \UU_1 : \UU_2 : \cdots$ where every universe $\UU_i$ is an element of the next universe $\UU_{i+1}$. Moreover, we assume that our universes are \define{cumulative}, \indexdef{type!universe!cumulative}% \indexdef{cumulative!universes}% that is that all the elements of the $i^{\mathrm{th}}$ universe are also elements of the $(i+1)^{\mathrm{st}}$ universe, i.e.\ if $A:\UU_i$ then also $A:\UU_{i+1}$. This is convenient, but has the slightly unpleasant consequence that elements no longer have unique types, and is a bit tricky in other ways that need not concern us here; see the Notes. When we say that $A$ is a type, we mean that it inhabits some universe $\UU_i$. We usually want to avoid mentioning the level \indexdef{universe level}% \indexsee{level}{universe level or $n$-type}% \indexsee{type!universe!level}{universe level}% $i$ explicitly, and just assume that levels can be assigned in a consistent way; thus we may write $A:\UU$ omitting the level. This way we can even write $\UU:\UU$, which can be read as $\UU_i:\UU_{i+1}$, having left the indices implicit. Writing universes in this style is referred to as \define{typical ambiguity}. \indexdef{typical ambiguity}% It is convenient but a bit dangerous, since it allows us to write valid-looking proofs that reproduce the paradoxes of self-reference. If there is any doubt about whether an argument is correct, the way to check it is to try to assign levels consistently to all universes appearing in it. When some universe \UU is assumed, we may refer to types belonging to \UU as \define{small types}. \indexdef{small!type}% \indexdef{type!small}% To model a collection of types varying over a given type $A$, we use functions $B : A \to \UU$ whose codomain is a universe. These functions are called \define{families of types} (or sometimes \emph{dependent types}); \indexsee{family!of types}{type, family of}% \indexdef{type!family of}% \indexsee{type!dependent}{type, family of}% \indexsee{dependent!type}{type, family of}% they correspond to families of sets as used in set theory. \symlabel{fin}% An example of a type family is the family of finite sets $\Fin : \nat \to \UU$, where $\Fin(n)$ is a type with exactly $n$ elements. (We cannot \emph{define} the family $\Fin$ yet --- indeed, we have not even introduced its domain $\nat$ yet --- but we will be able to soon; see \cref{ex:fin}.) We may denote the elements of $\Fin(n)$ by $0_n,1_n,\dots,(n-1)_n$, with subscripts to emphasize that the elements of $\Fin(n)$ are different from those of $\Fin(m)$ if $n$ is different from $m$, and all are different from the ordinary natural numbers (which we will introduce in \cref{sec:inductive-types}). \index{finite!sets, family of}% A more trivial (but very important) example of a type family is the \define{constant} type family \indexdef{constant!type family}% \indexdef{type!family of!constant}% at a type $B:\UU$, which is of course the constant function $(\lam{x:A} B):A\to\UU$. As a \emph{non}-example, in our version of type theory there is no type family $\lam{i:\nat} \UU_i$''. Indeed, there is no universe large enough to be its codomain. Moreover, we do not even identify the indices $i$ of the universes $\UU_i$ with the natural numbers \nat of type theory (the latter to be introduced in \cref{sec:inductive-types}). \index{type!universe|)}% \section{Dependent function types (\texorpdfstring{$\Pi$}{Π}-types)} \label{sec:pi-types} \index{type!dependent function|(defstyle}% \index{function!dependent|(defstyle}% \indexsee{dependent!function}{function, dependent}% \indexsee{type!Pi-@$\Pi$-}{type, dependent function}% \indexsee{Pi-type@$\Pi$-type}{type, dependent function}% In type theory we often use a more general version of function types, called a \define{$\Pi$-type} or \define{dependent function type}. The elements of a $\Pi$-type are functions whose codomain type can vary depending on the element of the domain to which the function is applied, called \define{dependent functions}. The name $\Pi$-type'' is used because this type can also be regarded as the cartesian product over a given type. Given a type $A:\UU$ and a family $B:A \to \UU$, we may construct the type of dependent functions $\prd{x:A}B(x) : \UU$. There are many alternative notations for this type, such as $\tprd{x:A} B(x) \qquad \dprd{x:A}B(x) \qquad \lprd{x:A} B(x).$ If $B$ is a constant family, then the dependent product type is the ordinary function type: $\tprd{x:A} B \jdeq (A \to B).$ Indeed, all the constructions of $\Pi$-types are generalizations of the corresponding constructions on ordinary function types. \indexdef{definition!of function, direct}% We can introduce dependent functions by explicit definitions: to define $f : \prd{x:A}B(x)$, where $f$ is the name of a dependent function to be defined, we need an expression $\Phi : B(x)$ possibly involving the variable $x:A$, \index{variable}% and we write $f(x) \defeq \Phi \qquad \mbox{for x:A}.$ Alternatively, we can use \define{$\lambda$-abstraction}% \index{lambda abstraction@$\lambda$-abstraction|defstyle}% \begin{equation} \label{eq:lambda-abstraction} \lamu{x:A} \Phi \ :\ \prd{x:A} B(x). \end{equation} \indexdef{application!of dependent function}% \indexdef{function!dependent!application}% As with non-dependent functions, we can \define{apply} a dependent function $f : \prd{x:A}B(x)$ to an argument $a:A$ to obtain an element $f(a):B(a)$. The equalities are the same as for the ordinary function type, i.e.\ we have the computation rule \index{computation rule!for dependent function types}% given $a:A$ we have $f(a) \jdeq \Phi'$ and $(\lamu{x:A} \Phi)(a) \jdeq \Phi'$, where $\Phi'$ is obtained by replacing all occurrences of $x$ in $\Phi$ by $a$ (avoiding variable capture, as always). Similarly, we have the uniqueness principle $f\jdeq (\lam{x} f(x))$ for any $f:\prd{x:A} B(x)$. \index{uniqueness!principle!for dependent function types}% As an example, recall from \cref{sec:universes} that there is a type family $\Fin:\nat\to\UU$ whose values are the standard finite sets, with elements $0_n,1_n,\dots,(n-1)_n : \Fin(n)$. There is then a dependent function $\fmax : \prd{n:\nat} \Fin(n+1)$ which returns the largest'' element of each nonempty finite type, $\fmax(n) \defeq n_{n+1}$. \index{finite!sets, family of}% As was the case for $\Fin$ itself, we cannot define $\fmax$ yet, but we will be able to soon; see \cref{ex:fin}. Another important class of dependent function types, which we can define now, are functions which are \define{polymorphic} \indexdef{function!polymorphic}% \indexdef{polymorphic function}% over a given universe. A polymorphic function is one which takes a type as one of its arguments, and then acts on elements of that type (or of other types constructed from it). \symlabel{idfunc}% \indexdef{function!identity}% \indexdef{identity!function}% An example is the polymorphic identity function $\idfunc : \prd{A:\UU} A \to A$, which we define by $\idfunc{} \defeq \lam{A:\type}{x:A} x$. We sometimes write some arguments of a dependent function as subscripts. For instance, we might equivalently define the polymorphic identity function by $\idfunc[A](x) \defeq x$. Moreover, if an argument can be inferred from context, we may omit it altogether. For instance, if $a:A$, then writing $\idfunc(a)$ is unambiguous, since $\idfunc$ must mean $\idfunc[A]$ in order for it to be applicable to $a$. Another, less trivial, example of a polymorphic function is the swap'' operation that switches the order of the arguments of a (curried) two-argument function: $\mathsf{swap} : \prd{A:\UU}{B:\UU}{C:\UU} (A\to B\to C) \to (B\to A \to C)$ We can define this by $\mathsf{swap}(A,B,C,g) \defeq \lam{b}{a} g(a)(b).$ We might also equivalently write the type arguments as subscripts: $\mathsf{swap}_{A,B,C}(g)(b,a) \defeq g(a,b).$ Note that as we did for ordinary functions, we use currying to define dependent functions with several arguments (such as $\mathsf{swap}$). However, in the dependent case the second domain may depend on the first one, and the codomain may depend on both. That is, given $A:\UU$ and type families $B : A \to \UU$ and $C : \prd{x:A}B(x) \to \UU$, we may construct the type $\prd{x:A}{y : B(x)} C(x,y)$ of functions with two arguments. (Like $\lambda$-abstractions, $\Pi$s automatically scope\index{scope} over the rest of the expression unless delimited; thus $C : \prd{x:A}B(x) \to \UU$ means $C : \prd{x:A}(B(x) \to \UU)$.) In the case when $B$ is constant and equal to $A$, we may condense the notation and write $\prd{x,y:A}$; for instance, the type of $\mathsf{swap}$ could also be written as $\mathsf{swap} : \prd{A,B,C:\UU} (A\to B\to C) \to (B\to A \to C).$ Finally, given $f:\prd{x:A}{y : B(x)} C(x,y)$ and arguments $a:A$ and $b:B(a)$, we have $f(a)(b) : C(a,b)$, which, as before, we write as $f(a,b) : C(a,b)$. \index{type!dependent function|)}% \index{function!dependent|)}% \section{Product types} \label{sec:finite-product-types} Given types $A,B:\UU$ we introduce the type $A\times B:\UU$, which we call their \define{cartesian product}. \indexsee{cartesian product}{type, product}% \indexsee{type!cartesian product}{type, product}% \index{type!product|(defstyle}% \indexsee{product!of types}{type, product}% We also introduce a nullary product type, called the \define{unit type} $\unit : \UU$. \indexsee{nullary!product}{type, unit}% \indexsee{unit!type}{type, unit}% \index{type!unit|(defstyle}% We intend the elements of $A\times B$ to be pairs $\tup{a}{b} : A \times B$, where $a:A$ and $b:B$, and the only element of $\unit$ to be some particular object $\ttt : \unit$. \indexdef{pair!ordered}% However, unlike in set theory, where we define ordered pairs to be particular sets and then collect them all together into the cartesian product, in type theory, ordered pairs are a primitive concept, as are functions. \begin{rmk}\label{rmk:introducing-new-concepts} There is a general pattern for introduction of a new kind of type in type theory. We have already seen this pattern in \cref{sec:function-types,sec:pi-types}\footnote{The description of universes above is an exception.}, so it is worth emphasizing the general form. To specify a type, we specify: \begin{enumerate} \item how to form new types of this kind, via \define{formation rules}. \indexdef{formation rule}% \index{rule!formation}% (For example, we can form the function type $A \to B$ when $A$ is a type and when $B$ is a type. We can form the dependent function type $\prd{x:A} B(x)$ when $A$ is a type and $B(x)$ is a type for $x:A$.) \item how to construct elements of that type. These are called the type's \define{constructors} or \define{introduction rules}. \indexdef{constructor}% \indexdef{rule!introduction}% \indexdef{introduction rule}% (For example, a function type has one constructor, $\lambda$-abstraction. Recall that a direct definition like $f(x)\defeq 2x$ can equivalently be phrased as a $\lambda$-abstraction $f\defeq \lam{x} 2x$.) \item how to use elements of that type. These are called the type's \define{eliminators} or \define{elimination rules}. \indexsee{rule!elimination}{eliminator}% \indexsee{elimination rule}{eliminator}% \indexdef{eliminator}% (For example, the function type has one eliminator, namely function application.) \item a \define{computation rule}\indexdef{computation rule}\footnote{also referred to as \define{$\beta$-reduction}\index{beta-reduction@$\beta$-reduction|footstyle}}, which expresses how an eliminator acts on a constructor. (For example, for functions, the computation rule states that $(\lamu{x:A}\Phi)(a)$ is judgmentally equal to the substitution of $a$ for $x$ in $\Phi$.) \item an optional \define{uniqueness principle}\indexdef{uniqueness!principle}\footnote{also referred to as \define{$\eta$-expansion}\index{eta-expansion@$\eta$-expansion|footstyle}}, which expresses uniqueness of maps into or out of that type. For some types, the uniqueness principle characterizes maps into the type, by stating that every element of the type is uniquely determined by the results of applying eliminators to it, and can be reconstructed from those results by applying a constructor---thus expressing how constructors act on eliminators, dually to the computation rule. (For example, for functions, the uniqueness principle says that any function $f$ is judgmentally equal to the expanded'' function $\lamu{x} f(x)$, and thus is uniquely determined by its values.) For other types, the uniqueness principle says that every map (function) \emph{from} that type is uniquely determined by some data. (An example is the coproduct type introduced in \cref{sec:coproduct-types}, whose uniqueness principle is mentioned in \cref{sec:universal-properties}.) When the uniqueness principle is not taken as a rule of judgmental equality, it is often nevertheless provable as a \emph{propositional} equality from the other rules for the type. In this case we call it a \define{propositional uniqueness principle}. \indexdef{uniqueness!principle, propositional}% \indexsee{propositional!uniqueness principle}{uniqueness principle, propositional}% (In later chapters we will also occasionally encounter \emph{propositional computation rules}.) \indexdef{computation rule!propositional}% \end{enumerate} The inference rules in \cref{sec:syntax-more-formally} are organized and named accordingly; see, for example, \cref{sec:more-formal-pi}, where each possibility is realized. \end{rmk} The way to construct pairs is obvious: given $a:A$ and $b:B$, we may form $(a,b):A\times B$. Similarly, there is a unique way to construct elements of $\unit$, namely we have $\ttt:\unit$. We expect that every element of $A\times B$ is a pair'', which is the uniqueness principle for products; we do not assert this as a rule of type theory, but we will prove it later on as a propositional equality. Now, how can we \emph{use} pairs, i.e.\ how can we define functions out of a product type? Let us first consider the definition of a non-dependent function $f : A\times B \to C$. Since we intend the only elements of $A\times B$ to be pairs, we expect to be able to define such a function by prescribing the result when $f$ is applied to a pair $\tup{a}{b}$. We can prescribe these results by providing a function $g : A \to B \to C$. Thus, we introduce a new rule (the elimination rule for products), which says that for any such $g$, we can define a function $f : A\times B \to C$ by $f(\tup{a}{b}) \defeq g(a)(b).$ We avoid writing $g(a,b)$ here, in order to emphasize that $g$ is not a function on a product. (However, later on in the book we will often write $g(a,b)$ both for functions on a product and for curried functions of two variables.) This defining equation is the computation rule for product types\index{computation rule!for product types}. Note that in set theory, we would justify the above definition of $f$ by the fact that every element of $A\times B$ is an ordered pair, so that it suffices to define $f$ on such pairs. By contrast, type theory reverses the situation: we assume that a function on $A\times B$ is well-defined as soon as we specify its values on pairs, and from this (or more precisely, from its more general version for dependent functions, below) we will be able to \emph{prove} that every element of $A\times B$ is a pair. From a category-theoretic perspective, we can say that we define the product $A\times B$ to be left adjoint to the exponential'' $B\to C$, which we have already introduced. As an example, we can derive the \define{projection} \indexsee{function!projection}{projection}% \indexsee{component, of a pair}{projection}% \indexdef{projection!from cartesian product type}% functions \symlabel{defn:proj}% \begin{align*} \fst & : A \times B \to A \\ \snd & : A \times B \to B \end{align*} with the defining equations \begin{align*} \fst(\tup{a}{b}) & \defeq a \\ \snd(\tup{a}{b}) & \defeq b. \end{align*} % \symlabel{defn:recursor-times}% Rather than invoking this principle of function definition every time we want to define a function, an alternative approach is to invoke it once, in a universal case, and then simply apply the resulting function in all other cases. That is, we may define a function of type \begin{equation} \rec{A\times B} : \prd{C:\UU}(A \to B \to C) \to A \times B \to C \end{equation} with the defining equation $\rec{A\times B}(C,g,\tup{a}{b}) \defeq g(a)(b).$ Then instead of defining functions such as $\fst$ and $\snd$ directly by a defining equation, we could define \begin{align*} \fst &\defeq \rec{A\times B}(A, \lam{a}{b} a)\\ \snd &\defeq \rec{A\times B}(B, \lam{a}{b} b). \end{align*} We refer to the function $\rec{A\times B}$ as the \define{recursor} \indexsee{recursor}{recursion principle}% for product types. The name recursor'' is a bit unfortunate here, since no recursion is taking place. It comes from the fact that product types are a degenerate example of a general framework for inductive types, and for types such as the natural numbers, the recursor will actually be recursive. We may also speak of the \define{recursion principle} for cartesian products, meaning the fact that we can define a function $f:A\times B\to C$ as above by giving its value on pairs. \index{recursion principle!for cartesian product}% We leave it as a simple exercise to show that the recursor can be derived from the projections and vice versa. % Ex: Derive from projections \symlabel{defn:recursor-unit}% We also have a recursor for the unit type: $\rec{\unit} : \prd{C:\UU}C \to \unit \to C$ with the defining equation $\rec{\unit}(C,c,\ttt) \defeq c.$ Although we include it to maintain the pattern of type definitions, the recursor for $\unit$ is completely useless, because we could have defined such a function directly by simply ignoring the argument of type $\unit$. To be able to define \emph{dependent} functions over the product type, we have to generalize the recursor. Given $C: A \times B \to \UU$, we may define a function $f : \prd{x : A \times B} C(x)$ by providing a function \narrowequation{ g : \prd{x:A}\prd{y:B} C(\tup{x}{y}) } with defining equation $f(\tup x y) \defeq g(x)(y).$ For example, in this way we can prove the propositional uniqueness principle, which says that every element of $A\times B$ is equal to a pair. \index{uniqueness!principle, propositional!for product types}% Specifically, we can construct a function $\uniq{A\times B} : \prd{x:A \times B} (\id[A\times B]{\tup{\fst {(x)}}{\snd {(x)}}}{x}).$ Here we are using the identity type, which we are going to introduce below in \cref{sec:identity-types}. However, all we need to know now is that there is a reflexivity element $\refl{x} : \id[A]{x}{x}$ for any $x:A$. Given this, we can define \label{uniquenessproduct} $\uniq{A\times B}(\tup{a}{b}) \defeq \refl{\tup{a}{b}}.$ This construction works, because in the case that $x \defeq \tup{a}{b}$ we can calculate $\tup{\fst(\tup{a}{b})}{\snd{(\tup{a}{b})}} \jdeq \tup{a}{b}$ using the defining equations for the projections. Therefore, $\refl{\tup{a}{b}} : \id{\tup{\fst(\tup{a}{b})}{\snd{(\tup{a}{b})}}}{\tup{a}{b}}$ is well-typed, since both sides of the equality are judgmentally equal. More generally, the ability to define dependent functions in this way means that to prove a property for all elements of a product, it is enough to prove it for its canonical elements, the ordered pairs. When we come to inductive types such as the natural numbers, the analogous property will be the ability to write proofs by induction. Thus, if we do as we did above and apply this principle once in the universal case, we call the resulting function \define{induction} for product types: given $A,B : \UU$ we have \symlabel{defn:induction-times}% $\ind{A\times B} : \prd{C:A \times B \to \UU} \Parens{\prd{x:A}{y:B} C(\tup{x}{y})} \to \prd{x:A \times B} C(x)$ with the defining equation $\ind{A\times B}(C,g,\tup{a}{b}) \defeq g(a)(b).$ Similarly, we may speak of a dependent function defined on pairs being obtained from the \define{induction principle} \index{induction principle}% \index{induction principle!for product}% of the cartesian product. It is easy to see that the recursor is just the special case of induction in the case that the family $C$ is constant. Because induction describes how to use an element of the product type, induction is also called the \define{(dependent) eliminator}, \indexsee{eliminator!of inductive type!dependent}{induction principle}% and recursion the \define{non-dependent eliminator}. \indexsee{eliminator!of inductive type!non-dependent}{recursion principle}% \indexsee{non-dependent eliminator}{recursion principle}% \indexsee{dependent eliminator}{induction principle}% % We can read induction propositionally as saying that a property which % is true for all pairs holds for all elements of the product type. Induction for the unit type turns out to be more useful than the recursor: \symlabel{defn:induction-unit}% $\ind{\unit} : \prd{C:\unit \to \UU} C(\ttt) \to \prd{x:\unit}C(x)$ with the defining equation $\ind{\unit}(C,c,\ttt) \defeq c.$ Induction enables us to prove the propositional uniqueness principle for $\unit$, which asserts that its only inhabitant is $\ttt$. That is, we can construct \label{uniquenessunit} $\uniq{\unit} : \prd{x:\unit} \id{x}{\ttt}$ by using the defining equations $\uniq{\unit}(\ttt) \defeq \refl{\ttt}$ or equivalently by using induction: $\uniq{\unit} \defeq \ind{\unit}(\lamu{x:\unit} \id{x}{\ttt},\refl{\ttt}).$ \index{type!product|)}% \index{type!unit|)}% \section{Dependent pair types (\texorpdfstring{$\Sigma$}{Σ}-types)} \label{sec:sigma-types} \index{type!dependent pair|(defstyle}% \indexsee{type!dependent sum}{type, dependent pair}% \indexsee{type!Sigma-@$\Sigma$-}{type, dependent pair}% \indexsee{Sigma-type@$\Sigma$-type}{type, dependent pair}% \indexsee{sum!dependent}{type, dependent pair}% Just as we generalized function types (\cref{sec:function-types}) to dependent function types (\cref{sec:pi-types}), it is often useful to generalize the product types from \cref{sec:finite-product-types} to allow the type of the second component of a pair to vary depending on the choice of the first component. This is called a \define{dependent pair type}, or \define{$\Sigma$-type}, because in set theory it corresponds to an indexed sum (in the sense of coproduct or disjoint union) over a given type. Given a type $A:\UU$ and a family $B : A \to \UU$, the dependent pair type is written as $\sm{x:A} B(x) : \UU$. Alternative notations are $\tsm{x:A} B(x) \hspace{2cm} \dsm{x:A}B(x) \hspace{2cm} \lsm{x:A} B(x).$ Like other binding constructs such as $\lambda$-abstractions and $\Pi$s, $\Sigma$s automatically scope\index{scope} over the rest of the expression unless delimited, so e.g.\ $\sm{x:A} B(x) \times C(x)$ means $\sm{x:A} (B(x) \times C(x))$. \symlabel{defn:dependent-pair}% \indexdef{pair!dependent}% The way to construct elements of a dependent pair type is by pairing: we have $\tup{a}{b} : \sm{x:A} B(x)$ given $a:A$ and $b:B(a)$. If $B$ is constant, then the dependent pair type is the ordinary cartesian product type: $\Parens{\sm{x:A} B} \jdeq (A \times B).$ All the constructions on $\Sigma$-types arise as straightforward generalizations of the ones for product types, with dependent functions often replacing non-dependent ones. For instance, the recursion principle% \index{recursion principle!for dependent pair type} says that to define a non-dependent function out of a $\Sigma$-type $f : (\sm{x:A} B(x)) \to C$, we provide a function $g : \prd{x:A} B(x) \to C$, and then we can define $f$ via the defining equation $f(\tup{a}{b}) \defeq g(a)(b).$ \indexdef{projection!from dependent pair type}% For instance, we can derive the first projection from a $\Sigma$-type: \symlabel{defn:dependent-proj1}% \begin{equation*} \fst : \Parens{\sm{x : A}B(x)} \to A. \end{equation*} by the defining equation \begin{equation*} \fst(\tup{a}{b}) \defeq a. \end{equation*} However, since the type of the second component of a pair \narrowequation{ (a,b):\sm{x:A} B(x) } is $B(a)$, the second projection must be a \emph{dependent} function, whose type involves the first projection function: \symlabel{defn:dependent-proj2}% $\snd : \prd{p:\sm{x : A}B(x)}B(\fst(p)).$ Thus we need the \emph{induction} principle% \index{induction principle!for dependent pair type} for $\Sigma$-types (the dependent eliminator''). This says that to construct a dependent function out of a $\Sigma$-type into a family $C : (\sm{x:A} B(x)) \to \UU$, we need a function $g : \prd{a:A}{b:B(a)} C(\tup{a}{b}).$ We can then derive a function $f : \prd{p : \sm{x:A}B(x)} C(p)$ with defining equation\index{computation rule!for dependent pair type} $f(\tup{a}{b}) \defeq g(a)(b).$ Applying this with $C(p)\defeq B(\fst(p))$, we can define \narrowequation{ \snd : \prd{p:\sm{x : A}B(x)}B(\fst(p)) } with the obvious equation $\snd(\tup{a}{b}) \defeq b.$ To convince ourselves that this is correct, we note that $B (\fst(\tup{a}{b})) \jdeq B(a)$, using the defining equation for $\fst$, and indeed $b : B(a)$. We can package the recursion and induction principles into the recursor for $\Sigma$: \symlabel{defn:recursor-sm}% $\rec{\sm{x:A}B(x)} : \dprd{C:\UU}\Parens{\tprd{x:A} B(x) \to C} \to \Parens{\tsm{x:A}B(x)} \to C$ with the defining equation $\rec{\sm{x:A}B(x)}(C,g,\tup{a}{b}) \defeq g(a)(b)$ and the corresponding induction operator: \symlabel{defn:induction-sm}% \begin{narrowmultline*} \ind{\sm{x:A}B(x)} : \narrowbreak \dprd{C:(\sm{x:A} B(x)) \to \UU} \Parens{\tprd{a:A}{b:B(a)} C(\tup{a}{b})} \to \dprd{p : \sm{x:A}B(x)} C(p) \end{narrowmultline*} with the defining equation $\ind{\sm{x:A}B(x)}(C,g,\tup{a}{b}) \defeq g(a)(b).$ As before, the recursor is the special case of induction when the family $C$ is constant. As a further example, consider the following principle, where $A$ and $B$ are types and $R:A\to B\to \UU$: $\ac : \Parens{\tprd{x:A} \tsm{y :B} R(x,y)} \to \Parens{\tsm{f:A\to B} \tprd{x:A} R(x,f(x))}.$ We may regard $R$ as a proof-relevant relation'' \index{mathematics!proof-relevant}% between $A$ and $B$, with $R(a,b)$ the type of witnesses for relatedness of $a:A$ and $b:B$. Then $\ac$ says intuitively that if we have a dependent function $g$ assigning to every $a:A$ a dependent pair $(b,r)$ where $b:B$ and $r:R(a,b)$, then we have a function $f:A\to B$ and a dependent function assigning to every $a:A$ a witness that $R(a,f(a))$. Our intuition tells us that we can just split up the values of $g$ into their components. Indeed, using the projections we have just defined, we can define: $\ac(g) \defeq \Parens{\lamu{x:A} \fst(g(x)),\, \lamu{x:A} \snd(g(x))}.$ To verify that this is well-typed, note that if $g:\prd{x:A} \sm{y :B} R(x,y)$, we have \begin{align*} \lamu{x:A} \fst(g(x)) &: A \to B, \\ \lamu{x:A} \snd(g(x)) &: \tprd{x:A} R(x,\fst(g(x))). \end{align*} Moreover, the type $\prd{x:A} R(x,\fst(g(x)))$ is the result of substituting the function $\lamu{x:A} \fst(g(x))$ for $f$ in the family being summed over in the co\-do\-main of \ac: $\tprd{x:A} R(x,\fst(g(x))) \jdeq \Parens{\lamu{f:A\to B} \tprd{x:A} R(x,f(x))}\big(\lamu{x:A} \fst(g(x))\big).$ Thus, we have $\Parens{\lamu{x:A} \fst(g(x)),\, \lamu{x:A} \snd(g(x))} : \tsm{f:A\to B} \tprd{x:A} R(x,f(x))$ as required. If we read $\Pi$ as for all'' and $\Sigma$ as there exists'', then the type of the function $\ac$ expresses: \emph{if for all $x:A$ there is a $y:B$ such that $R(x,y)$, then there is a function $f : A \to B$ such that for all $x:A$ we have $R(x,f(x))$}. Since this sounds like a version of the axiom of choice, the function \ac has traditionally been called the \define{type-theoretic axiom of choice}, and as we have just shown, it can be proved directly from the rules of type theory, rather than having to be taken as an axiom. \index{axiom!of choice!type-theoretic}% However, note that no choice is actually involved, since the choices have already been given to us in the premise: all we have to do is take it apart into two functions: one representing the choice and the other its correctness. In \cref{sec:axiom-choice} we will give another formulation of an axiom of choice'' which is closer to the usual one. Dependent pair types are often used to define types of mathematical structures, which commonly consist of several dependent pieces of data. To take a simple example, suppose we want to define a \define{magma}\indexdef{magma} to be a type $A$ together with a binary operation $m:A\to A\to A$. The precise meaning of the phrase together with''\index{together with} (and the synonymous equipped with''\index{equipped with}) is that a magma'' is a \emph{pair} $(A,m)$ consisting of a type $A:\UU$ and an operation $m:A\to A\to A$. Since the type $A\to A\to A$ of the second component $m$ of this pair depends on its first component $A$, such pairs belong to a dependent pair type. Thus, the definition a magma is a type $A$ together with a binary operation $m:A\to A\to A$'' should be read as defining \emph{the type of magmas} to be $\mathsf{Magma} \defeq \sm{A:\UU} (A\to A\to A).$ Given a magma, we extract its underlying type (its carrier''\index{carrier}) with the first projection $\proj1$, and its operation with the second projection $\proj2$. Of course, structures built from more than two pieces of data require iterated pair types, which may be only partially dependent; for instance the type of pointed magmas (magmas $(A,m)$ equipped with a basepoint $e:A$) is $\mathsf{PointedMagma} \defeq \sm{A:\UU} (A\to A\to A) \times A.$ We generally also want to impose axioms on such a structure, e.g.\ to make a pointed magma into a monoid or a group. This can also be done using $\Sigma$-types; see \cref{sec:pat}. In the rest of the book, we will sometimes make definitions of this sort explicit, but eventually we trust the reader to translate them from English into $\Sigma$-types. We also generally follow the common mathematical practice of using the same letter for a structure of this sort and for its carrier (which amounts to leaving the appropriate projection function implicit in the notation): that is, we will speak of a magma $A$ with its operation $m:A\to A\to A$. Note that the canonical elements of $\mathsf{PointedMagma}$ are of the form $(A,(m,e))$ where $A:\UU$, $m:A\to A\to A$, and $e:A$. Because of the frequency with which iterated $\Sigma$-types of this sort arise, we use the usual notation of ordered triples, quadruples and so on to stand for nested pairs (possibly dependent) associating to the right. That is, we have $(x,y,z) \defeq (x,(y,z))$ and $(x,y,z,w)\defeq (x,(y,(z,w)))$, etc. \index{type!dependent pair|)}% \section{Coproduct types} \label{sec:coproduct-types} Given $A,B:\UU$, we introduce their \define{coproduct} type $A+B:\UU$. \indexsee{coproduct}{type, coproduct}% \index{type!coproduct|(defstyle}% \indexsee{disjoint!sum}{type, coproduct}% \indexsee{disjoint!union}{type, coproduct}% \indexsee{sum!disjoint}{type, coproduct}% \indexsee{union!disjoint}{type, coproduct}% This corresponds to the \emph{disjoint union} in set theory, and we may also use that name for it. In type theory, as was the case with functions and products, the coproduct must be a fundamental construction, since there is no previously given notion of union of types''. We also introduce a nullary version: the \define{empty type $\emptyt:\UU$}. \indexsee{nullary!coproduct}{type, empty}% \indexsee{empty type}{type, empty}% \index{type!empty|(defstyle}% There are two ways to construct elements of $A+B$, either as $\inl(a) : A+B$ for $a:A$, or as $\inr(b):A+B$ for $b:B$. (The names $\inl$ and $\inr$ are short for left injection'' and right injection''.) There are no ways to construct elements of the empty type. \index{recursion principle!for coproduct} To construct a non-dependent function $f : A+B \to C$, we need functions $g_0 : A \to C$ and $g_1 : B \to C$. Then $f$ is defined via the defining equations \begin{align*} f(\inl(a)) &\defeq g_0(a), \\ f(\inr(b)) &\defeq g_1(b). \end{align*} That is, the function $f$ is defined by \define{case analysis}. \indexdef{case analysis}% As before, we can derive the recursor: \symlabel{defn:recursor-plus}% $\rec{A+B} : \dprd{C:\UU}(A \to C) \to (B\to C) \to A+B \to C$ with the defining equations \begin{align*} \rec{A+B}(C,g_0,g_1,\inl(a)) &\defeq g_0(a), \\ \rec{A+B}(C,g_0,g_1,\inr(b)) &\defeq g_1(b). \end{align*} \index{recursion principle!for empty type} We can always construct a function $f : \emptyt \to C$ without having to give any defining equations, because there are no elements of \emptyt on which to define $f$. Thus, the recursor for $\emptyt$ is \symlabel{defn:recursor-emptyt}% $\rec{\emptyt} : \tprd{C:\UU} \emptyt \to C,$ which constructs the canonical function from the empty type to any other type. Logically, it corresponds to the principle \textit{ex falso quodlibet}. \index{ex falso quodlibet@\textit{ex falso quodlibet}} \index{induction principle!for coproduct} To construct a dependent function $f:\prd{x:A+B}C(x)$ out of a coproduct, we assume as given the family $C: (A + B) \to \UU$, and require \begin{align*} g_0 &: \prd{a:A} C(\inl(a)), \\ g_1 &: \prd{b:B} C(\inr(b)). \end{align*} This yields $f$ with the defining equations:\index{computation rule!for coproduct type} \begin{align*} f(\inl(a)) &\defeq g_0(a), \\ f(\inr(b)) &\defeq g_1(b). \end{align*} We package this scheme into the induction principle for coproducts: \symlabel{defn:induction-plus}% \begin{narrowmultline*} \ind{A+B} : \dprd{C: (A + B) \to \UU} \Parens{\tprd{a:A} C(\inl(a))} \to \narrowbreak \Parens{\tprd{b:B} C(\inr(b))} \to \tprd{x:A+B}C(x). \end{narrowmultline*} As before, the recursor arises in the case that the family $C$ is constant. \index{induction principle!for empty type} The induction principle for the empty type \symlabel{defn:induction-emptyt}% $\ind{\emptyt} : \prd{C:\emptyt \to \UU}{z:\emptyt} C(z)$ gives us a way to define a trivial dependent function out of the empty type. % In the presence of $\eta$-equality it is derivable % from the recursor. % ex \index{type!coproduct|)}% \index{type!empty|)}% \section{The type of booleans} \label{sec:type-booleans} \indexsee{boolean!type of}{type of booleans}% \index{type!of booleans|(defstyle}% The type of booleans $\bool:\UU$ is intended to have exactly two elements $\bfalse,\btrue : \bool$. It is clear that we could construct this type out of coproduct % this one results in a warning message just because it's on the same page as the previous entry % for {type!coproduct}, so it's not our fault \index{type!coproduct}% and unit\index{type!unit} types as $\unit + \unit$. However, since it is used frequently, we give the explicit rules here. Indeed, we are going to observe that we can also go the other way and derive binary coproducts from $\Sigma$-types and $\bool$. \index{recursion principle!for type of booleans} To derive a function $f : \bool \to C$ we need $c_0,c_1 : C$ and add the defining equations \begin{align*} f(\bfalse) &\defeq c_0, \\ f(\btrue) &\defeq c_1. \end{align*} The recursor corresponds to the if-then-else construct in functional programming: \symlabel{defn:recursor-bool}% $\rec{\bool} : \prd{C:\UU} C \to C \to \bool \to C$ with the defining equations \begin{align*} \rec{\bool}(C,c_0,c_1,\bfalse) &\defeq c_0, \\ \rec{\bool}(C,c_0,c_1,\btrue) &\defeq c_1. \end{align*} \index{induction principle!for type of booleans} Given $C : \bool \to \UU$, to derive a dependent function $f : \prd{x:\bool}C(x)$ we need $c_0:C(\bfalse)$ and $c_1 : C(\btrue)$, in which case we can give the defining equations \begin{align*} f(\bfalse) &\defeq c_0, \\ f(\btrue) &\defeq c_1. \end{align*} We package this up into the induction principle \symlabel{defn:induction-bool}% $\ind{\bool} : \dprd{C:\bool \to \UU} C(\bfalse) \to C(\btrue) \to \tprd{x:\bool} C(x)$ with the defining equations \begin{align*} \ind{\bool}(C,c_0,c_1,\bfalse) &\defeq c_0, \\ \ind{\bool}(C,c_0,c_1,\btrue) &\defeq c_1. \end{align*} As an example, using the induction principle we can deduce that, as we expect, every element of $\bool$ is either $\btrue$ or $\bfalse$. As before, in order to state this we use the equality types which we have not yet introduced, but we need only the fact that everything is equal to itself by $\refl{x}:x=x$. Thus, we construct an element of \begin{equation}\label{thm:allbool-trueorfalse} \prd{x:\bool}(x=\bfalse)+(x=\btrue), \end{equation} i.e.\ a function assigning to each $x:\bool$ either an equality $x=\bfalse$ or an equality $x=\btrue$. We define this element using the induction principle for \bool, with $C(x) \defeq (x=\bfalse)+(x=\btrue)$; the two inputs are $\inl(\refl{\bfalse}) : C(\bfalse)$ and $\inr(\refl{\btrue}):C(\btrue)$. In other words, our element of~\eqref{thm:allbool-trueorfalse} is $\ind{\bool}\big(\lam{x}(x=\bfalse)+(x=\btrue),\, \inl(\refl{\bfalse}),\, \inr(\refl{\btrue})\big).$ We have remarked that $\Sigma$-types can be regarded as analogous to indexed disjoint unions, while coproducts are binary disjoint unions. It is natural to expect that a binary disjoint union $A+B$ could be constructed as an indexed one over the two-element type \bool. For this we need a type family $P:\bool\to\type$ such that $P(\bfalse)\jdeq A$ and $P(\btrue)\jdeq B$. Indeed, we can obtain such a family precisely by the recursion principle for $\bool$. \index{type!family of}% (The ability to define \emph{type families} by induction and recursion, using the fact that the universe $\UU$ is itself a type, is a subtle and important aspect of type theory.) Thus, we could have defined \index{type!coproduct}% $A + B \defeq \sm{x:\bool} \rec{\bool}(\UU,A,B,x).$ with \begin{align*} \inl(a) &\defeq \tup{\bfalse}{a}, \\ \inr(b) &\defeq \tup{\btrue}{b}. \end{align*} We leave it as an exercise to derive the induction principle of a coproduct type from this definition. (See also \cref{ex:sum-via-bool,sec:appetizer-univalence}.) We can apply the same idea to products and $\Pi$-types: we could have defined $A \times B \defeq \prd{x:\bool}\rec{\bool}(\UU,A,B,x).$ Pairs could then be constructed using induction for \bool: $\tup{a}{b} \defeq \ind{\bool}(\rec{\bool}(\UU,A,B),a,b)$ while the projections are straightforward applications \begin{align*} \fst(p) &\defeq p(\bfalse), \\ \snd(p) &\defeq p(\btrue). \end{align*} The derivation of the induction principle for binary products defined in this way is a bit more involved, and requires function extensionality, which we will introduce in \cref{sec:compute-pi}. Moreover, we do not get the same judgmental equalities; see \cref{ex:prod-via-bool}. This is a recurrent issue when encoding one type as another; we will return to it in \cref{sec:htpy-inductive}. We may occasionally refer to the elements $\bfalse$ and $\btrue$ of $\bool$ as false'' and true'' respectively. However, note that unlike in classical\index{mathematics!classical} mathematics, we do not use elements of $\bool$ as truth values \index{value!truth}% or as propositions. (Instead we identify propositions with types; see \cref{sec:pat}.) In particular, the type $A \to \bool$ is not generally the power set\index{power set} of $A$; it represents only the decidable'' subsets of $A$ (see \cref{cha:logic}). \index{decidable!subset}% \index{type!of booleans|)}% \section{The natural numbers} \label{sec:inductive-types} \indexsee{type!of natural numbers}{natural numbers}% \index{natural numbers|(defstyle}% \indexsee{number!natural}{natural numbers}% The rules we have introduced so far do not allow us to construct any infinite types. The simplest infinite type we can think of (and one which is of course also extremely useful) is the type $\nat : \UU$ of natural numbers. The elements of $\nat$ are constructed using $0 : \nat$\indexdef{zero} and the successor\indexdef{successor} operation $\suc : \nat \to \nat$. When denoting natural numbers, we adopt the usual decimal notation $1 \defeq \suc(0)$, $2 \defeq \suc(1)$, $3 \defeq \suc(2)$, \dots. The essential property of the natural numbers is that we can define functions by recursion and perform proofs by induction --- where now the words recursion'' and induction'' have a more familiar meaning. \index{recursion principle!for natural numbers}% To construct a non-dependent function $f : \nat \to C$ out of the natural numbers by recursion, it is enough to provide a starting point $c_0 : C$ and a next step'' function $c_s : \nat \to C \to C$. This gives rise to $f$ with the defining equations\index{computation rule!for natural numbers} \begin{align*} f(0) &\defeq c_0, \\ f(\suc(n)) &\defeq c_s(n,f(n)). \end{align*} We say that $f$ is defined by \define{primitive recursion}. \indexdef{primitive!recursion}% \indexdef{recursion!primitive}% As an example, we look at how to define a function on natural numbers which doubles its argument. In this case we have $C\defeq \nat$. We first need to supply the value of $\dbl(0)$, which is easy: we put $c_0 \defeq 0$. Next, to compute the value of $\dbl(\suc(n))$ for a natural number $n$, we first compute the value of $\dbl(n)$ and then perform the successor operation twice. This is captured by the recurrence\index{recurrence} $c_s(n,y) \defeq \suc(\suc(y))$. Note that the second argument $y$ of $c_s$ stands for the result of the \emph{recursive call}\index{recursive call} $\dbl(n)$. Defining $\dbl:\nat\to\nat$ by primitive recursion in this way, therefore, we obtain the defining equations: \begin{align*} \dbl(0) &\defeq 0\\ \dbl(\suc(n)) &\defeq \suc(\suc(\dbl(n))). \end{align*} This indeed has the correct computational behavior: for example, we have \begin{align*} \dbl(2) &\jdeq \dbl(\suc(\suc(0)))\\ & \jdeq c_s(\suc(0), \dbl(\suc(0))) \\ & \jdeq \suc(\suc(\dbl(\suc(0)))) \\ & \jdeq \suc(\suc(c_s(0,\dbl(0)))) \\ & \jdeq \suc(\suc(\suc(\suc(\dbl(0))))) \\ & \jdeq \suc(\suc(\suc(\suc(c_0)))) \\ & \jdeq \suc(\suc(\suc(\suc(0))))\\ &\jdeq 4. \end{align*} We can define multi-variable functions by primitive recursion as well, by currying and allowing $C$ to be a function type. \indexdef{addition!of natural numbers} For example, we define addition $\add : \nat \to \nat \to \nat$ with $C \defeq \nat \to \nat$ and the following starting point'' and next step'' data: \begin{align*} c_0 & : \nat \to \nat \\ c_0 (n) & \defeq n \\ c_s & : \nat \to (\nat \to \nat) \to (\nat \to \nat) \\ c_s(m,g)(n) & \defeq \suc(g(n)). \end{align*} We thus obtain $\add : \nat \to \nat \to \nat$ satisfying the definitional equalities \begin{align*} \add(0,n) &\jdeq n \\ \add(\suc(m),n) &\jdeq \suc(\add(m,n)). \end{align*} As usual, we write $\add(m,n)$ as $m+n$. The reader is invited to verify that $2+2\jdeq 4$. % ex: define multiplication and exponentiation. As in previous cases, we can package the principle of primitive recursion into a recursor: $\rec{\nat} : \dprd{C:\UU} C \to (\nat \to C \to C) \to \nat \to C$ with the defining equations \symlabel{defn:recursor-nat}% \begin{align*} \rec{\nat}(C,c_0,c_s,0) &\defeq c_0, \\ \rec{\nat}(C,c_0,c_s,\suc(n)) &\defeq c_s(n,\rec{\nat}(C,c_0,c_s,n)). \end{align*} %ex derive rec from it Using $\rec{\nat}$ we can present $\dbl$ and $\add$ as follows: \begin{align} \dbl &\defeq \rec\nat\big(\nat,\, 0,\, \lamu{n:\nat}{y:\nat} \suc(\suc(y))\big) \label{eq:dbl-as-rec}\\ \add &\defeq \rec{\nat}\big(\nat \to \nat,\, \lamu{n:\nat} n,\, \lamu{n:\nat}{g:\nat \to \nat}{m :\nat} \suc(g(m))\big). \end{align} Of course, all functions definable only using the primitive recursion principle will be \emph{computable}. (The presence of higher function types --- that is, functions with other functions as arguments --- does, however, mean we can define more than the usual primitive recursive functions; see e.g.~\cref{ex:ackermann}.) This is appropriate in constructive mathematics; \index{mathematics!constructive}% in \cref{sec:intuitionism,sec:axiom-choice} we will see how to augment type theory so that we can define more general mathematical functions. \index{induction principle!for natural numbers} We now follow the same approach as for other types, generalizing primitive recursion to dependent functions to obtain an \emph{induction principle}. Thus, assume as given a family $C : \nat \to \UU$, an element $c_0 : C(0)$, and a function $c_s : \prd{n:\nat} C(n) \to C(\suc(n))$; then we can construct $f : \prd{n:\nat} C(n)$ with the defining equations:\index{computation rule!for natural numbers} \begin{align*} f(0) &\defeq c_0, \\ f(\suc(n)) &\defeq c_s(n,f(n)). \end{align*} We can also package this into a single function \symlabel{defn:induction-nat}% $\ind{\nat} : \dprd{C:\nat\to \UU} C(0) \to \Parens{\tprd{n : \nat} C(n) \to C(\suc(n))} \to \tprd{n : \nat} C(n)$ with the defining equations \begin{align*} \ind{\nat}(C,c_0,c_s,0) &\defeq c_0, \\ \ind{\nat}(C,c_0,c_s,\suc(n)) &\defeq c_s(n,\ind{\nat}(C,c_0,c_s,n)). \end{align*} Here we finally see the connection to the classical notion of proof by induction. Recall that in type theory we represent propositions by types, and proving a proposition by inhabiting the corresponding type. In particular, a \emph{property} of natural numbers is represented by a family of types $P:\nat\to\type$. From this point of view, the above induction principle says that if we can prove $P(0)$, and if for any $n$ we can prove $P(\suc(n))$ assuming $P(n)$, then we have $P(n)$ for all $n$. This is, of course, exactly the usual principle of proof by induction on natural numbers. \index{associativity!of addition!of natural numbers} As an example, consider how we might represent an explicit proof that $+$ is associative. (We will not actually write out proofs in this style, but it serves as a useful example for understanding how induction is represented formally in type theory.) To derive $\assoc : \prd{i,j,k:\nat} \id{i + (j + k)}{(i + j) + k},$ it is sufficient to supply $\assoc_0 : \prd{j,k:\nat} \id{0 + (j + k)}{(0+ j) + k}$ and \begin{narrowmultline*} \assoc_s : \prd{i:\nat} \left(\prd{j,k:\nat} \id{i + (j + k)}{(i + j) + k}\right) \narrowbreak \to \prd{j,k:\nat} \id{\suc(i) + (j + k)}{(\suc(i) + j) + k}. \end{narrowmultline*} To derive $\assoc_0$, recall that $0+n \jdeq n$, and hence $0 + (j + k) \jdeq j+k \jdeq (0+ j) + k$. Hence we can just set $\assoc_0(j,k) \defeq \refl{j+k}.$ For $\assoc_s$, recall that the definition of $+$ gives $\suc(m)+n \jdeq \suc(m+n)$, and hence \begin{align*} \suc(i) + (j + k) &\jdeq \suc(i+(j+k)) \qquad\text{and}\\ (\suc(i)+j)+k &\jdeq \suc((i+j)+k). \end{align*} Thus, the output type of $\assoc_s$ is equivalently $\id{\suc(i+(j+k))}{\suc((i+j)+k)}$. But its input (the inductive hypothesis'') \index{hypothesis!inductive}% \index{inductive!hypothesis}% yields $\id{i+(j+k)}{(i+j)+k}$, so it suffices to invoke the fact that if two natural numbers are equal, then so are their successors. (We will prove this obvious fact in \cref{lem:map}, using the induction principle of identity types.) We call this latter fact $\apfunc{\suc} : %\prd{m,n:\nat} (\id[\nat]{m}{n}) \to (\id[\nat]{\suc(m)}{\suc(n)})$, so we can define $\assoc_s(i,h,j,k) \defeq \apfunc{\suc}( %n+(j+k),(n+j)+k, h(j,k)).$ Putting these together with $\ind{\nat}$, we obtain a proof of associativity. \index{natural numbers|)}% \section{Pattern matching and recursion} \label{sec:pattern-matching} \index{pattern matching|(defstyle}% \indexsee{matching}{pattern matching}% \index{definition!by pattern matching|(}% The natural numbers introduce an additional subtlety over the types considered up until now. In the case of coproducts, for instance, we could define a function $f:A+B\to C$ either with the recursor: $f \defeq \rec{A+B}(C, g_0, g_1)$ or by giving the defining equations: \begin{align*} f(\inl(a)) &\defeq g_0(a)\\ f(\inr(b)) &\defeq g_1(b). \end{align*} To go from the former expression of $f$ to the latter, we simply use the computation rules for the recursor. Conversely, given any defining equations \begin{align*} f(\inl(a)) &\defeq \Phi_0\\ f(\inr(b)) &\defeq \Phi_1 \end{align*} where $\Phi_0$ and $\Phi_1$ are expressions that may involve the variables \index{variable}% $a$ and $b$ respectively, we can express these equations equivalently in terms of the recursor by using $\lambda$-abstraction\index{lambda abstraction@$\lambda$-abstraction}: $f\defeq \rec{A+B}(C, \lam{a} \Phi_0, \lam{b} \Phi_1).$ In the case of the natural numbers, however, the defining equations'' of a function such as $\dbl$: \begin{align} \dbl(0) &\defeq 0 \label{eq:dbl0}\\ \dbl(\suc(n)) &\defeq \suc(\suc(\dbl(n)))\label{eq:dblsuc} \end{align} involve \emph{the function $\dbl$ itself} on the right-hand side. However, we would still like to be able to give these equations, rather than~\eqref{eq:dbl-as-rec}, as the definition of \dbl, since they are much more convenient and readable. The solution is to read the expression $\dbl(n)$'' on the right-hand side of~\eqref{eq:dblsuc} as standing in for the result of the recursive call, which in a definition of the form $\dbl\defeq \rec{\nat}(\nat,c_0,c_s)$ would be the second argument of $c_s$. More generally, if we have a definition'' of a function $f:\nat\to C$ such as \begin{align*} f(0) &\defeq \Phi_0\\ f(\suc(n)) &\defeq \Phi_s \end{align*} where $\Phi_0$ is an expression of type $C$, and $\Phi_s$ is an expression of type $C$ which may involve the variable $n$ and also the symbol $f(n)$'', we may translate it to a definition $f \defeq \rec{\nat}(C,\,\Phi_0,\,\lam{n}{r} \Phi_s')$ where $\Phi_s'$ is obtained from $\Phi_s$ by replacing all occurrences of $f(n)$'' by the new variable $r$. This style of defining functions by recursion (or, more generally, dependent functions by induction) is so convenient that we frequently adopt it. It is called definition by \define{pattern matching}. Of course, it is very similar to how a computer programmer may define a recursive function with a body that literally contains recursive calls to itself. However, unlike the programmer, we are restricted in what sort of recursive calls we can make: in order for such a definition to be re-expressible using the recursion principle, the function $f$ being defined can only appear in the body of $f(\suc(n))$ as part of the composite symbol $f(n)$''. Otherwise, we could write nonsense functions such as \begin{align*} f(0)&\defeq 0\\ f(\suc(n)) &\defeq f(\suc(\suc(n))). \end{align*} If a programmer wrote such a function, it would simply call itself forever on any positive input, going into an infinite loop and never returning a value. In mathematics, however, to be worthy of the name, a \emph{function} must always associate a unique output value to every input value, so this would be unacceptable. This point will be even more important when we introduce more complicated inductive types in \cref{cha:induction,cha:hits,cha:real-numbers}. Whenever we introduce a new kind of inductive definition, we always begin by deriving its induction principle. Only then do we introduce an appropriate sort of pattern matching'' which can be justified as a shorthand for the induction principle. \index{pattern matching|)}% \index{definition!by pattern matching|)}% \section{Propositions as types} \label{sec:pat} \index{proposition!as types|(defstyle}% \index{logic!propositions as types|(}% As mentioned in the introduction, to show that a proposition is true in type theory corresponds to exhibiting an element of the type corresponding to that proposition. \index{evidence, of the truth of a proposition}% \index{witness!to the truth of a proposition}% \index{proof|(} We regard the elements of this type as \emph{evidence} or \emph{witnesses} that the proposition is true. (They are sometimes even called \emph{proofs}, but this terminology can be misleading, so we generally avoid it.) In general, however, we will not construct witnesses explicitly; instead we present the proofs in ordinary mathematical prose, in such a way that they could be translated into an element of a type. This is no different from reasoning in classical set theory, where we don't expect to see an explicit derivation using the rules of predicate logic and the axioms of set theory. However, the type-theoretic perspective on proofs is nevertheless different in important ways. The basic principle of the logic of type theory is that a proposition is not merely true or false, but rather can be seen as the collection of all possible witnesses of its truth. Under this conception, proofs are not just the means by which mathematics is communicated, but rather are mathematical objects in their own right, on a par with more familiar objects such as numbers, mappings, groups, and so on. Thus, since types classify the available mathematical objects and govern how they interact, propositions are nothing but special types --- namely, types whose elements are proofs. \index{propositional!logic}% \index{logic!propositional}% The basic observation which makes this identification feasible is that we have the following natural correspondence between \emph{logical} operations on propositions, expressed in English, and \emph{type-theoretic} operations on their corresponding types of witnesses. \index{false}% \index{true}% \index{conjunction}% \index{disjunction}% \index{implication}% \begin{center} \medskip \begin{tabular}{ll} \toprule English & Type Theory\\ \midrule True & $\unit$ \\ False & $\emptyt$ \\ $A$ and $B$ & $A \times B$ \\ $A$ or $B$ & $A + B$ \\ If $A$ then $B$ & $A \to B$ \\ $A$ if and only if $B$ & $(A \to B) \times (B \to A)$ \\ Not $A$ & $A \to \emptyt$ \\ \bottomrule \end{tabular} \medskip \end{center} The point of the correspondence is that in each case, the rules for constructing and using elements of the type on the right correspond to the rules for reasoning about the proposition on the left. For instance, the basic way to prove a statement of the form $A$ and $B$'' is to prove $A$ and also prove $B$, while the basic way to construct an element of $A\times B$ is as a pair $(a,b)$, where $a$ is an element (or witness) of $A$ and $b$ is an element (or witness) of $B$. And if we want to use $A$ and $B$'' to prove something else, we are free to use both $A$ and $B$ in doing so, analogously to how the induction principle for $A\times B$ allows us to construct a function out of it by using elements of $A$ and of $B$. Similarly, the basic way to prove an implication\index{implication} if $A$ then $B$'' is to assume $A$ and prove $B$, while the basic way to construct an element of $A\to B$ is to give an expression which denotes an element (witness) of $B$ which may involve an unspecified variable element (witness) of type $A$. And the basic way to use an implication if $A$ then $B$'' is deduce $B$ if we know $A$, analogously to how we can apply a function $f:A\to B$ to an element of $A$ to produce an element of $B$. We strongly encourage the reader to do the exercise of verifying that the rules governing the other type constructors translate sensibly into logic. Of special note is that the empty type $\emptyt$ corresponds to falsity.\index{false} When speaking logically, we refer to an inhabitant of $\emptyt$ as a \define{contradiction}: \indexdef{contradiction}% thus there is no way to prove a contradiction,% \footnote{More precisely, there is no \emph{basic} way to prove a contradiction, i.e.\ \emptyt has no constructors. If our type theory were inconsistent, then there would be some more complicated way to construct an element of \emptyt.} while from a contradiction anything can be derived. We also define the \define{negation} \indexdef{negation}% of a type $A$ as % \begin{equation*} \neg A \ \defeq\ A \to \emptyt. \end{equation*} % Thus, a witness of $\neg A$ is a function $A \to \emptyt$, which we may construct by assuming $x : A$ and deriving an element of~$\emptyt$. \index{proof!by contradiction}% \index{logic!constructive vs classical} Note that although the logic we obtain is constructive'', as discussed in the introduction, this sort of proof by contradiction'' (assume $A$ and derive a contradiction, concluding $\neg A$) is perfectly valid constructively: it is simply invoking the \emph{meaning} of negation''. The sort of proof by contradiction'' which is disallowed is to assume $\neg A$ and derive a contradiction as a way of proving $A$. Constructively, such an argument would only allow us to conclude $\neg\neg A$, and the reader can verify that there is no obvious way to get from $\neg\neg A$ (that is, from $(A\to \emptyt)\to\emptyt$) to $A$. \mentalpause The above translation of logical connectives into type-forming operations is referred to as \define{propositions as types}: it gives us a way to translate propositions and their proofs, written in English, into types and their elements. For example, suppose we want to prove the following tautology (one of de Morgan's laws''): \index{law!de Morgan's|(}% \index{de Morgan's laws|(}% \begin{equation}\label{eq:tautology1} \text{\emph{If not $A$ and not $B$, then not ($A$ or $B$)''}.} \end{equation} An ordinary English proof of this fact might go as follows. \begin{quote} Suppose not $A$ and not $B$, and also suppose $A$ or $B$; we will derive a contradiction. There are two cases. If $A$ holds, then since not $A$, we have a contradiction. Similarly, if $B$ holds, then since not $B$, we also have a contradiction. Thus we have a contradiction in either case, so not ($A$ or $B$). \end{quote} Now, the type corresponding to our tautology~\eqref{eq:tautology1}, according to the rules given above, is \begin{equation}\label{eq:tautology2} (A\to \emptyt) \times (B\to\emptyt) \to (A+B\to\emptyt) \end{equation} so we should be able to translate the above proof into an element of this type. As an example of how such a translation works, let us describe how a mathematician reading the above English proof might simultaneously construct, in his or her head, an element of~\eqref{eq:tautology2}. The introductory phrase Suppose not $A$ and not $B$'' translates into defining a function, with an implicit application of the recursion principle for the cartesian product in its domain $(A\to\emptyt)\times (B\to\emptyt)$. This introduces unnamed variables \index{variable}% (hypotheses) \index{hypothesis}% of types $A\to\emptyt$ and $B\to\emptyt$. When translating into type theory, we have to give these variables names; let us call them $x$ and $y$. At this point our partial definition of an element of~\eqref{eq:tautology2} can be written as $f((x,y)) \defeq\; \Box\;:A+B\to\emptyt$ with a hole'' $\Box$ of type $A+B\to\emptyt$ indicating what remains to be done. (We could equivalently write $f \defeq \rec{(A\to\emptyt)\times (B\to\emptyt)}(A+B\to\emptyt,\lam{x}{y} \Box)$, using the recursor instead of pattern matching.) The next phrase also suppose $A$ or $B$; we will derive a contradiction'' indicates filling this hole by a function definition, introducing another unnamed hypothesis $z:A+B$, leading to the proof state: $f((x,y))(z) \defeq \;\Box\; :\emptyt$ Now saying there are two cases'' indicates a case split, i.e.\ an application of the recursion principle for the coproduct $A+B$. If we write this using the recursor, it would be $f((x,y))(z) \defeq \rec{A+B}(\emptyt,\lam{a} \Box,\lam{b}\Box,z)$ while if we write it using pattern matching, it would be \begin{align*} f((x,y))(\inl(a)) &\defeq \;\Box\;:\emptyt\\ f((x,y))(\inr(b)) &\defeq \;\Box\;:\emptyt. \end{align*} Note that in both cases we now have two holes'' of type $\emptyt$ to fill in, corresponding to the two cases where we have to derive a contradiction. Finally, the conclusion of a contradiction from $a:A$ and $x:A\to\emptyt$ is simply application of the function $x$ to $a$, and similarly in the other case. \index{application!of hypothesis or theorem}% (Note the convenient coincidence of the phrase applying a function'' with that of applying a hypothesis'' or theorem.) Thus our eventual definition is \begin{align*} f((x,y))(\inl(a)) &\defeq x(a)\\ f((x,y))(\inr(b)) &\defeq y(b). \end{align*} As an exercise, you should verify the converse tautology \emph{If not ($A$ or $B$), then (not $A$) and (not $B$)}'' by exhibiting an element of $((A + B) \to \emptyt) \to (A \to \emptyt) \times (B \to \emptyt),$ for any types $A$ and $B$, using the rules we have just introduced. \index{logic!classical vs constructive|(} However, not all classical\index{mathematics!classical} tautologies hold under this interpretation. For example, the rule \emph{If not ($A$ and $B$), then (not $A$) or (not $B$)''} is not valid: we cannot, in general, construct an element of the corresponding type $((A \times B) \to \emptyt) \to (A \to \emptyt) + (B \to \emptyt).$ This reflects the fact that the natural'' propositions-as-types logic of type theory is \emph{constructive}. This means that it does not include certain classical principles, such as the law of excluded middle (\LEM{})\index{excluded middle} or proof by contradiction,\index{proof!by contradiction} and others which depend on them, such as this instance of de Morgan's law. \index{law!de Morgan's|)}% \index{de Morgan's laws|)}% Philosophically, constructive logic is so-called because it confines itself to constructions that can be carried out \emph{effectively}, which is to say those with a computational meaning. Without being too precise, this means there is some sort of algorithm\index{algorithm} specifying, step-by-step, how to build an object (and, as a special case, how to see that a theorem is true). This requires omission of \LEM{}, since there is no \emph{effective}\index{effective!procedure} procedure for deciding whether a proposition is true or false. The constructivity of type-theoretic logic means it has an intrinsic computational meaning, which is of interest to computer scientists. It also means that type theory provides \emph{axiomatic freedom}.\index{axiomatic freedom} For example, while by default there is no construction witnessing \LEM{}, the logic is still compatible with the existence of one (see \cref{sec:intuitionism}). Thus, because type theory does not \emph{deny} \LEM{}, we may consistently add it as an assumption, and work conventionally without restriction. In this respect, type theory enriches, rather than constrains, conventional mathematical practice. We encourage the reader who is unfamiliar with constructive logic to work through some more examples as a means of getting familiar with it. See \cref{ex:tautologies,ex:not-not-lem} for some suggestions. \index{logic!classical vs constructive|)} \mentalpause So far we have discussed only propositional logic. \index{quantifier}% \index{quantifier!existential}% \index{quantifier!universal}% \index{predicate!logic}% \index{logic!predicate}% Now we consider \emph{predicate} logic, where in addition to logical connectives like and'' and or'' we have quantifiers there exists'' and for all''. In this case, types play a dual role: they serve as propositions and also as types in the conventional sense, i.e., domains we quantify over. A predicate over a type $A$ is represented as a family $P : A \to \UU$, assigning to every element $a : A$ a type $P(a)$ corresponding to the proposition that $P$ holds for $a$. We now extend the above translation with an explanation of the quantifiers: \begin{center} \medskip \begin{tabular}{ll} \toprule English & Type Theory\\ \midrule For all $x:A$, $P(x)$ holds & $\prd{x:A} P(x)$ \\ There exists $x:A$ such that $P(x)$ & $\sm{x:A}$ $P(x)$ \\ \bottomrule \end{tabular} \medskip \end{center} As before, we can show that tautologies of (constructive) predicate logic translate into inhabited types. For example, \emph{If for all $x:A$, $P(x)$ and $Q(x)$ then (for all $x:A$, $P(x)$) and (for all $x:A$, $Q(x)$)} translates to $(\tprd{x:A} P(x) \times Q(x)) \to (\tprd{x:A} P(x)) \times (\tprd{x:A} Q(x)).$ An informal proof of this tautology might go as follows: \begin{quote} Suppose for all $x$, $P(x)$ and $Q(x)$. First, we suppose given $x$ and prove $P(x)$. By assumption, we have $P(x)$ and $Q(x)$, and hence we have $P(x)$. Second, we suppose given $x$ and prove $Q(x)$. Again by assumption, we have $P(x)$ and $Q(x)$, and hence we have $Q(x)$. \end{quote} The first sentence begins defining an implication as a function, by introducing a witness for its hypothesis:\index{hypothesis} $f(p) \defeq \;\Box\; : (\tprd{x:A} P(x)) \times (\tprd{x:A} Q(x)).$ At this point there is an implicit use of the pairing constructor to produce an element of a product type, which is somewhat signposted in this example by the words first'' and second'': $f(p) \defeq \Big( \;\Box\; : \tprd{x:A} P(x) \;,\; \Box\; : \tprd{x:A}Q(x) \;\Big).$ The phrase we suppose given $x$ and prove $P(x)$'' now indicates defining a \emph{dependent} function in the usual way, introducing a variable \index{variable}% for its input. Since this is inside a pairing constructor, it is natural to write it as a $\lambda$-abstraction\index{lambda abstraction@$\lambda$-abstraction}: $f(p) \defeq \Big( \; \lam{x} \;\big(\Box\; : P(x)\big) \;,\; \Box\; : \tprd{x:A}Q(x) \;\Big).$ Now we have $P(x)$ and $Q(x)$'' invokes the hypothesis, obtaining $p(x) : P(x)\times Q(x)$, and hence we have $P(x)$'' implicitly applies the appropriate projection: $f(p) \defeq \Big( \; \lam{x} \proj1(p(x)) \;,\; \Box\; : \tprd{x:A}Q(x) \;\Big).$ The next two sentences fill the other hole in the obvious way: $f(p) \defeq \Big( \; \lam{x} \proj1(p(x)) \;,\; \lam{x} \proj2(p(x)) \; \Big).$ Of course, the English proofs we have been using as examples are much more verbose than those that mathematicians usually use in practice; they are more like the sort of language one uses in an introduction to proofs'' class. The practicing mathematician has learned to fill in the gaps, so in practice we can omit plenty of details, and we will generally do so. The criterion of validity for proofs, however, is always that they can be translated back into the construction of an element of the corresponding type. \symlabel{leq-nat}% As a more concrete example, consider how to define inequalities of natural numbers. One natural definition is that $n\le m$ if there exists a $k:\nat$ such that $n+k=m$. (This uses again the identity types that we will introduce in the next section, but we will not need very much about them.) Under the propositions-as-types translation, this would yield: $(n\le m) \defeq \sm{k:\nat} (\id{n+k}{m}).$ The reader is invited to prove the familiar properties of $\le$ from this definition. For strict inequality, there are a couple of natural choices, such as \[ (n