## Ordering, Choice and Zorn's Lemma

### Definitions related to Partial Order

*(strict) initial segment $s(a)$*: If X is a partially ordered set, and a is an element, then the set { x in X : x < a} is called the initial segment of X determined by a.

*weak initial segment $\bar{s}(a)$* : like the initial segment, but using the weak relation <=

*(strict) predecessors, successors, between* If x < y, x is a (strict) predecessor of y, y is a (strict) successor of x. If x < z < y, we say z is (strictly) between x and y. We can also use the weak equivalents explicitly.

*immediate successor / predecessor* When x < y, and there is no element strictly between x and y, we say x is an immediate predecessor of y, and y is an immediate successor of x.

*greatest (largest, last, maximum), least(smallest, first, minimum) element* : In a poset X, the unique element a such that a >= x for all x in X is the greatest element - it may not exist. Similarly for the least element.

*maximal, minimal element* : Any element a of a poset X that has no strict successor is called maximal. One with no strict predecessor is called minimal. These may not exist, or there may be many elements of each type.

*lower bound, upper bound* A lower bound of a subset E of a poset X is an element a of X such that a <= x for all x in E; an upper bound is defined similarly, with a >= x. A set E may have none, one or many lower or upper bounds, and if they exist, they may or may not be in E.

*infimum (inf, glb, greatest lower bound), supremum (sup, lub, least upper bound)* an infimum is a lower bound a of a subset E of X such that a is greater than every other lower bound of E. a may not be in E. If $E_*$ is the set of all lower bounds of E, then the inf a is the greatest element of $E_*$. Similarly, if $E^*$ is the set of all upper bounds of E, then the supremum is the least element of $E^*$.
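The definitions above can be tried out on a small finite poset. The following is a minimal sketch, assuming the set X = {2, 3, 4, 6, 8, 12} ordered by divisibility; the helper names (`leq`, `maximal`, `infimum`, and so on) are illustrative and not part of the notes.

```python
# Assumed example poset: X ordered by divisibility (a <= b means a divides b).
X = {2, 3, 4, 6, 8, 12}

def leq(a, b):
    """Weak order: a <= b iff a divides b."""
    return b % a == 0

def maximal(S):
    """Elements of S with no strict successor in S."""
    return {a for a in S if not any(leq(a, x) and x != a for x in S)}

def minimal(S):
    """Elements of S with no strict predecessor in S."""
    return {a for a in S if not any(leq(x, a) and x != a for x in S)}

def greatest(S):
    """The greatest element of S, or None if it does not exist."""
    found = [a for a in S if all(leq(x, a) for x in S)]
    return found[0] if found else None

def least(S):
    """The least element of S, or None if it does not exist."""
    found = [a for a in S if all(leq(a, x) for x in S)]
    return found[0] if found else None

def lower_bounds(E, S):
    return {a for a in S if all(leq(a, x) for x in E)}

def upper_bounds(E, S):
    return {a for a in S if all(leq(x, a) for x in E)}

def infimum(E, S):
    """Greatest element of the set of lower bounds of E."""
    return greatest(lower_bounds(E, S))

def supremum(E, S):
    """Least element of the set of upper bounds of E."""
    return least(upper_bounds(E, S))

print(maximal(X), minimal(X))   # two maximal and two minimal elements
print(greatest(X))              # None: there is no greatest element
print(infimum({8, 12}, X), supremum({2, 3}, X))
```

Here X has several maximal elements but no greatest element, and the infimum of {8, 12} is not a member of {8, 12}, which illustrates the remarks in the definitions.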

### The Axiom of Choice

**Given a finite sequence of sets, a necessary and sufficient condition for the cartesian product to be empty is that one of them be empty**

Proof : TBD

**Axiom of Choice** : The cartesian product of a non-empty family of non-empty sets is non-empty.

More formally : If $\{X_i\}$ is a family of non-empty sets indexed by a non-empty set I, then there exists a family $\{x_i\}$, i in I, such that $x_i \in X_i$ for each i in I.

Given any collection C of non-empty sets, we can use C itself as the index set, with the identity mapping playing the role of indexing. The axiom of choice says the cartesian product of the sets of C has at least one element. Such an element can be considered a function f with domain C, whose value at each set of C is an element of that set i.e. if $A \in C$, then $f(A) \in A$.

**Choice function**

Given a set X, let P(X) be the power set of X. We can define, using the axiom of choice, a function f, such that given any non-empty subset of X in P(X), we can choose one element of that subset i.e. f is a function that allows us to arbitrarily choose one element of any non-empty subset of X when we want.

**If a set is infinite, it has a subset equivalent to N**  

Let f be a function from $2^X - \{\emptyset\}$ to X, such that $f(A) \in A$ (choice function).

Let C be collection of all finite subsets of X. Since X is infinite, if A is in C, then X-A is not empty.

Let g be a function on C, such that $g(A) = A \cup \{f(X-A)\}$.

By the recursion theorem, we can define a function U, such that $U(0) = g(\emptyset)$, $U(n^+) = g(U(n)) = U(n) \cup \{f(X - U(n))\}$

Let v(n) = f(X - U(n)). We want to show v is a one-to-one correspondence between N and a subset of X.

i) v(n) is not a member of U(n), for any n (by definition of v(n))  
ii) v(n) is in U(n+) for all n (by definition of U)  
iii) If n <= m, then $U(n) \subseteq U(m)$ (by induction on m : true for m = n, and if true for m, it is true for m+, since $U(m) \subseteq U(m^+)$)  
iv) If n < m, $v(n) \neq v(m)$ : v(n) in U(m), but v(m) not in U(m).

Given that for any two distinct natural numbers, one is strictly less than the other, the last statement proves this is a one-to-one correspondence.
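The construction of U and v can be traced on a small example. This is only a sketch under stated assumptions: X is a finite stand-in for an infinite set (so only the first few steps make sense), and the choice function f is taken to be "pick the least element", which is a hypothetical choice made for illustration.

```python
# Finite stand-in for an infinite set X (assumption: only the first few steps are run).
X = set(range(0, 40, 3))

def f(A):
    """Hypothetical choice function on non-empty subsets of X: pick the least element."""
    return min(A)

def g(A):
    """Grow a finite subset A of X by the one chosen element of X - A."""
    return A | {f(X - A)}

def U(n):
    """U(0) = g(empty set), U(n+) = g(U(n))."""
    A = g(set())
    for _ in range(n):
        A = g(A)
    return A

def v(n):
    """v(n) = f(X - U(n)), the element added when passing from U(n) to U(n+)."""
    return f(X - U(n))

values = [v(n) for n in range(5)]
print(values)  # [3, 6, 9, 12, 15]
```

The values are pairwise distinct, v(n) is never in U(n), and v(n) is always in U(n+), matching steps i), ii) and iv).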

**A set is infinite if and only if it is equivalent to a proper subset of itself**

From the previous result : let v be a one-one correspondence from N to a subset of X, an infinite set.

Let h(x) = v(n+) if x = v(n), h(x) = x, if x is not in range of v.

Then h is a one-one function, but it is not onto, because v(0) is not in range of h. Thus X is equivalent to a proper subset of itself.
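As a concrete instance, take X = N and v(n) = 2n (both are assumptions chosen for illustration). Then h shifts the even numbers up by two and fixes the odd numbers. The sketch below checks on a finite sample that h is one-to-one and that v(0) = 0 is never a value of h.

```python
def v(n):
    """Assumed one-to-one map from N into X = N: v(n) = 2n."""
    return 2 * n

def h(x):
    """h(v(n)) = v(n+); h fixes every x outside the range of v (the odd numbers)."""
    return x + 2 if x % 2 == 0 else x

sample = range(50)
image = [h(x) for x in sample]

print(len(set(image)) == len(image))  # True: h is one-to-one on the sample
print(v(0) in image)                  # False: v(0) is not in the range of h
```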

### Exercises on AC

**Every relation includes a function with the same domain**

Proof : Let R be a relation from X to Y. Let D = dom R. If D is empty, then there is just one function from D to Y. Let f be a choice function on P(Y)

Then, we can define a function g : dom R -> Y, such that g(a) = f({b in Y : aRb}).
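For a finite relation the argument can be carried out directly. A minimal sketch, assuming a small relation R given as a set of pairs and using "least element" as a stand-in choice function f on non-empty subsets of Y:

```python
# Assumed example relation R from X = {1, 2, 3} to Y = {'a', 'b', 'c'}.
R = {(1, 'a'), (1, 'b'), (2, 'c'), (3, 'a'), (3, 'c')}

def f(B):
    """Stand-in choice function on non-empty subsets of Y: pick the least element."""
    return min(B)

dom = {a for (a, _) in R}

# g(a) = f({b in Y : a R b}), stored as a dict from dom R to Y.
g = {a: f({b for (x, b) in R if x == a}) for a in dom}

print(g == {1: 'a', 2: 'c', 3: 'a'})  # True
```

Every pair (a, g(a)) belongs to R, so g, viewed as a set of pairs, is included in R and has the same domain.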

**If C' is a collection of pairwise disjoint non-empty sets, then there exists a set A such that $A \cap C$ is a singleton for each C in C'.**

Take the cartesian product, using C' as the index set. By AC, we have a family $\{x_i\}$ such that $x_i$ is in i for each i in C'; let A be the set of all the $x_i$. Now, consider $A \cap C$, where C is in C'. Clearly $x_C$ is in A and in C, so the intersection is not empty. But for any D != C in C', $x_D$ belongs to D, and hence does not belong to C, since C and D are disjoint. Thus the intersection is a singleton.
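A finite illustration of this exercise, assuming a hand-picked family of pairwise disjoint sets and "least element" as the stand-in for the choice supplied by AC:

```python
# Assumed family C' of pairwise disjoint non-empty sets.
family = [{1, 2}, {3}, {4, 5, 6}]

# One chosen element per set; min plays the role of the choice.
A = {min(C) for C in family}

print(A == {1, 3, 4})                           # True
print(all(len(A & C) == 1 for C in family))     # True: each intersection is a singleton
```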

### Zorn's Lemma

If X is a partially ordered set such that every chain in X has an upper bound, then X has a maximal element.

Proof :

**Let S = collection of all weak initial segments of X**

For every element x of X, we can map it to its weak initial segment $\bar{s}(x)$. 

x <= y => $\bar{s}(x) \subset \bar{s}(y)$

We now convert our problem of finding a maximal value in X, to finding a maximal set in S, i.e. an initial segment which is not included in any other initial segment.

The hypothesis on chains (every chain in X has an upper bound), now becomes the hypothesis that every chain in S has an upper bound (now ordered by inclusion).

**Let CC = collection of all chains in X**

Every member of CC is a subset of some $\bar{s}(x)$ in S. If C is a chain in CC (a chain of chains), then U C is in CC, since it is also a chain.

The hypothesis now changes : instead of saying each chain C has some upper bound in S, we can say that the union of the sets of C, which is an upper bound of C, is a member of CC (i.e. is also a chain).

We now have a non-empty collection CC of subsets of a non-empty set X with two properties : every subset of each set in CC is in CC (including the empty set, which is a chain!), and the union of any chain of sets in CC is in CC (note that we are not talking of all unions, only of unions of chains ordered by inclusion, which form a bigger chain). We have to show CC contains a maximal set.

**Choice function f, and growth function g**

Let f be a choice function on X.

We want a way to grow chains in CC slowly, one at a time. To do that we define a function g as follows :

For each chain A in CC, we define A' as the set of all x in X, such that A U {x} is also a chain (and hence in CC). 

Define g on CC as follows :

g(A) = A U {f(A' - A)} if A' - A is not empty  
g(A) = A, otherwise.

We have to find an A for which g(A) = A - such chains are maxed out!

It turns out crucial to the proof that we can grow A by one element at a time.
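On a finite poset the growth function can simply be iterated from the empty chain until it stops changing. The sketch below assumes X = {1, 2, 3, 4, 6, 9, 12, 18} ordered by divisibility and uses "least element" as the choice function f; it illustrates g only, and is not the tower argument needed in the infinite case.

```python
# Assumed finite poset: X ordered by divisibility.
X = {1, 2, 3, 4, 6, 9, 12, 18}

def leq(a, b):
    return b % a == 0

def is_chain(A):
    """A is a chain if any two of its elements are comparable."""
    return all(leq(a, b) or leq(b, a) for a in A for b in A)

def f(S):
    """Stand-in choice function: pick the least element of a non-empty set."""
    return min(S)

def g(A):
    """Grow the chain A by exactly one chosen element, if that is possible."""
    candidates = {x for x in X - A if is_chain(A | {x})}  # this is A' - A
    return A | {f(candidates)} if candidates else A

A = frozenset()
while g(A) != A:      # stop at a fixed point of g
    A = g(A)

top = max(A)          # in a chain of divisors the largest number is the top element
print(sorted(A), top)
```

The fixed point is a maximal chain, and its top element has no strict successor in X, so it is a maximal element of X.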

**Towers and the smallest tower T0**

A tower is a subset T of CC such that :

i) the empty set is in T  
ii) If A is in T, g(A) is in T  
iii) If C is a chain in T, then the union of the sets in C is in T

We have at least one tower - CC itself.

Consider the collection of all towers of CC, and consider their intersection T0. T0 is a tower. 

**T0 and Comparability**

We define a set C in T0 to be comparable if for every set A in T0, C is either included in A, or vice versa. To prove T0 is a chain, we have to show every element in T0 is comparable. Note that comparable sets definitely exist - the empty set is one.

Consider any given comparable set C in T0.

Suppose A is in T0 and A is a proper subset of C. Then g(A) must be a subset of C (this is like in natural numbers where we say if m < n, then m+ <= n). Since C is comparable and g(A) is in T0, either C is a proper subset of g(A) or g(A) is a subset of C. But in the former case A is a proper subset of C, which is a proper subset of g(A), so g(A) - A has at least two elements, which is not possible (the importance of growing A slowly).

Now consider a collection U of all sets A in T0 such that A is a subset of C, or g(C) is a subset of A. Our intention is to show that this more restricted set is in fact a tower. 

Clearly U contains the empty set, since it is a subset of C.

For the second condition : if A is in U, g(A) is in U:

i) A is a proper subset of C. Then g(A) is a subset of C (shown above), hence it is in U  
ii) A = C. Then g(C) is a subset of g(A) = g(C), so g(A) is in U  
iii) g(C) is a subset of A. Then g(C) is also a subset of g(A) and hence g(A) is in U

Finally, for the third condition (the union of a chain in U belongs to U), this follows from the definition of U. Thus U is a tower. But U is a subset of the smallest tower T0, which means U = T0.

**T0 is a chain**

We have shown that if C is comparable, so is g(C) : with U as above, U = T0 means that every A in T0 is either a subset of C, in which case A is a subset of g(C), or includes g(C). Thus starting from the empty set, we can build up, using our trusted function g, a chain of comparable elements.

Consider the comparable sets in T0. The empty set is comparable. If A is comparable, so is g(A). And the union of a chain of comparable sets is comparable : it is in T0, and given any A in T0, either every set of the chain is a subset of A, in which case so is the union, or some set of the chain includes A, in which case so does the union. Thus the comparable sets in T0 form a tower i.e. they exhaust T0 - every set in T0 is comparable. Thus T0 is a chain.

Since T0 is a chain, the union, say A, of all sets in T0 is itself a set in T0. Then g(A) is also in T0, so g(A) is a subset of the union A. Since A is always a subset of g(A), this is only possible if g(A) = A.

**Wrapping Up**

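This means there is no x in X such that we can make the chain A bigger, so A is a maximal chain in CC. Since every chain in CC is a subset of some $\bar{s}(x)$ in S, there must be an a in X such that $A \subset \bar{s}(a)$, i.e. a is an upper bound of A. The chain A must contain a (otherwise A U {a} would be a larger chain and g(A) != A). This a is unique : if A were also included in some other $\bar{s}(y)$, then y would be an upper bound of A, hence y would be in A (again, otherwise g(A) != A), giving y <= a; the same argument with the roles reversed gives a <= y, so y = a. In particular, if a <= y for some y in X, then $A \subset \bar{s}(y)$ and so y = a.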

So a is the unique element such that $A \subset \bar{s}(a)$, and is the promised maximal element.

https://terrytao.wordpress.com/2009/01/28/245b-notes-7-well-ordered-sets-ordinals-and-zorns-lemma-optional/

https://arxiv.org/pdf/1207.6698.pdf

### Exercises on Zorn's Lemma

**ZL is equivalent to AC**

Let X be a set, P(X) be the power set. Let F be the set of all functions f such that dom f is a collection of non-empty subsets of X, and $f(A) \in A$ for each A in dom f.

Order F by extension. This is a partial order. Does every chain have an upper bound? Let C be a chain in F. U C is an upper bound and is in F - why? The primary question is whether U C is a function, i.e. uniqueness of values. Assume (A,a) is in U C. If (A,b) with b != a were also in U C, there would be two functions in the chain neither of which would include the other, which is not possible. Hence, U C is a function in F and hence an upper bound of C. Thus every chain is bounded above. Hence, by Zorn's Lemma, there must be a maximal element, say, f. Then dom f = P(X) - {$\emptyset$}. If not, let A be a non-empty element of P(X) that is not in dom f, and let a be in A. Then f U {(A,a)} is in F and strictly extends f, which contradicts the statement that f is maximal. Thus dom f = P(X) - {$\emptyset$}, and f is a choice function for X.

AC was used to prove ZL earlier. So the two are equivalent.

**(HMP) Every partially ordered set has a maximal chain / In every partially ordered set, every chain is contained in a maximal chain**

ZL => HMP. Consider the set C of all chains of the (non-empty) poset X. C is ordered by inclusion, and every element of C is a chain in X. A maximal element in C would be a maximal chain in X. Now consider a chain in C. Take the union. Any 2 elements a and b of the union belong to two members of the chain, one of which includes the other; so both lie in a single chain of X, and either a < b, a = b, or a > b. Thus this union is in C and an upper bound of the chain in C. Hence every chain in C has an upper bound => C has a maximal element => X has a maximal chain. Note that this does not mean it has a maximal element.

HMP => ZL. Every partially ordered set X has a maximal chain. By hypothesis of ZL, every chain is bounded above => the maximal chain is bounded above - let A be such a maximal chain, with a as an upper bound. Then i) a is in A (otherwise A U {a} would be a larger chain) ii) For every s in A, s <= a iii) There is no z in X such that a < z, otherwise A U {z} would be a larger chain and A would not be maximal. Hence a is a maximal element.

Following on from this, given any poset P with a chain C, the argument in the first part of the proof above, applied to the collection of all chains that include C, shows that C must be contained in a maximal chain.

## Well-Ordering

A partially ordered set may not have a smallest element, or some subset may not have one. A well-ordered set is one where every non-empty subset has a smallest element.

In particular, every well-ordered set has a least element.

**Principle of transfinite induction**

Suppose S is a subset of a well-ordered set X, and suppose whenever an initial segment of x in X is a subset of S, then x is in S. Then S = X.

Note: For the least element e, initial segment of e = $\emptyset$, which is included in S, so e is in S.

Proof : Assume X - S is not empty, and let x be its least element. Then every element of s(x) is in S, i.e. s(x) is a subset of S, so by hypothesis x is in S - a contradiction.

**Comparison with Principle of Mathematical induction**

The principle of mathematical induction discussed for natural numbers goes from a predecessor of an element to itself. The principle of transfinite induction flows from the set of all predecessors to the element. In the set of natural numbers every number has an immediate predecessor ... this is not true of all well-ordered sets.

In N the two principles are connected as follows. Assume S is a set, 0 is in S, and if n is in S, then n+ is in S. Let m < n+. Then m <= n, i.e. m is in n or m = n, so s(n+) = s(n) U {n}.

**continuation**

A well ordered set A is a continuation of a well ordered set B if B is a subset of A, B is the initial segment of some element of A, and the ordering of B is the one inherited from A.

In a well ordered set X with two elements a and b, such that a > b, s(a) is a continuation of s(b), and X itself is a continuation of both s(a) and s(b). The set of initial segments of X is totally ordered by continuation.

**Theorem**: The union U of a chain C of well-ordered sets ordered by continuation has a unique well-ordering such that U is a continuation of each set distinct from U in the collection C.

Proof: If a and b belong to U, then there must be a set in C which contains both a and b, since C is a chain. We say a < b if this holds in such a set; since C is a chain by continuation, the order has to be the same in every set containing a and b. Indeed, if a < b, any set which contains b must contain a, because the sets of C are initial segments of one another. Similarly, if a < b, and b < c, a set containing c contains all three, hence the relation is transitive. Finally, given a and b, either a < b, or b < a, or a = b, as there has to be a set in C containing both, and each set in C is well ordered.

Is U a well-ordered set? Any non-empty subset E of U has a non-empty intersection with some element of C, say W. Note that if a is any element of the intersection, all elements of E less than a are also in the intersection, because C is a chain by continuation. Thus the smallest element of the intersection is the smallest element of E. Now, this smallest element exists, because W is well ordered. Hence every non-empty subset of U has a smallest element and U is well-ordered.

**Exercise** Prove every totally ordered set X has a co-final well ordered subset.

Let Z be the set of all well-ordered subsets of X. Z is not empty - the empty set is a member. Order these subsets by continuation. Any chain in Z is bounded above, because the union of the chain is itself a well ordered subset of X. Hence, by Zorn's Lemma, there is a maximal element M of Z. Suppose there is an element x of X such that a < x for every a in M. Then M U {x} is a well ordered subset of X that is a continuation of M, contradicting the conclusion that M is maximal. Hence this maximal well-ordered subset is cofinal.

**Well Ordering Theorem** Every set can be well ordered.

Proof : In an argument similar to above, consider a set Z of well ordered subsets of X - each element of Z is a subset of X with a well-order on it.

We order these subsets by continuation - this is a partial order. Then any chain in Z is bounded above (by its union, as in the theorem above). Hence, by Zorn's Lemma, Z has a maximal element M, which is well ordered. Assume there is an element x of X not in M. Then, we can define a new element N = M U {x}, with m < x for all m in M. N is a well ordered subset of X because M is well ordered, and N is a continuation of M, so N is in Z and is strictly larger than M. Thus, we have a contradiction i.e. there is no element of X not in M => M = X.

**Exercise**

**A totally ordered set is well ordered iff strict predecessors of each element are well ordered.**

Let X be totally ordered. Assume the strict predecessors of each element are well ordered. Take any non-empty subset A of X. If a is an element of A, then the predecessors of a in A form a subset of the set of all predecessors of a, and hence, if there are any, they have a smallest element. This will also be the smallest element of A; if there are none, a is itself the smallest element of A. In either case A has a smallest element. Hence, X is well ordered.

Assume X is a totally ordered set where there exists an element x of X whose predecessors are not well ordered. Then there is a (non-empty) subset of these predecessors which has no smallest element => there is a non-empty subset of X with no smallest element => X is not well ordered.

**Well-Ordering => AC**

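We already showed ZL => WO. Now we show that WO => AC and hence ZL. Let $\{X_i\}$ be a family of non-empty sets, and let X be the union of the $X_i$. Well order X, and for each i let $x_i$ be the smallest element of $X_i$. The family $\{x_i\}$ is an element of the cartesian product of the $X_i$.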

**Every partial order can be extended to a total order**

X is a set, R is a partial order on X. We want to create T, a total order, such that $R \subset T$. Take the set of all partial orders S on X such that $R \subset S$. This set is partially ordered by inclusion. Every chain in it has an upper bound (its union), so by Zorn's Lemma it has a maximal element. This maximal element must be a total order. If not, let there be two elements x and y which are not related - we can then create a strictly larger partial order in which x <= y, a contradiction. Hence this must be a total order.

https://math.stackexchange.com/questions/271003/every-partial-order-can-be-extended-to-a-linear-ordering
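For a finite set no appeal to Zorn's Lemma is needed: a topological sort produces a total order that includes the given partial order. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the example poset, divisibility on {1, 2, 3, 4, 6, 12}, is an assumption chosen for illustration.

```python
from graphlib import TopologicalSorter

# Assumed finite poset: R is divisibility on X, stored as a set of pairs (a, b) with a <= b.
X = [1, 2, 3, 4, 6, 12]
R = {(a, b) for a in X for b in X if b % a == 0}

# For each element, the set of its strict predecessors under R.
preds = {b: {a for a in X if (a, b) in R and a != b} for b in X}

# static_order() lists every element after all of its predecessors.
order = list(TopologicalSorter(preds).static_order())
pos = {x: i for i, x in enumerate(order)}

# T is the total order "appears no later in the list".
T = {(a, b) for a in X for b in X if pos[a] <= pos[b]}

print(R <= T)                                                  # True: T extends R
print(all((a, b) in T or (b, a) in T for a in X for b in X))   # True: T is total
```

Different topological sorts give different total orders, so the extension is in general not unique.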

## Transfinite Recursion

Let W be a well ordered set, a an element of W, X is any set. Let f be a function from the s(a) of a to X. We call f a sequence of type a in X - an example is to take a function U from W to X, and restrict it to s(a).

A sequence function of type W in X is a function f from the set of all sequences of type a for all a in W, to X. In effect f allows us to lengthen the sequence by one more element.

**Transfinite Recursion Theorem** If W is a well ordered set, and if f is sequence function of type W in X, then there is a unique function U from W into X such that U(a) = f(U|s(a)) for each a in W. 

Unlike normal recursion, we are considering a function of all predecessors, and not of the immediate predecessor, which may not exist. Using this theorem to define a function is called definition by transfinite induction.
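On a finite well ordered set the theorem reduces to course-of-values recursion, which is easy to run. A minimal sketch, assuming W = {0, ..., 5} with the usual order and a hypothetical sequence function f(t) = 1 + (sum of the values of t); the helper name `recurse` is illustrative.

```python
def recurse(W, f):
    """Build U with U(a) = f(U | s(a)) for a finite well ordered set W.

    W must be listed in increasing order; f takes a dict representing
    a sequence of type a, i.e. a function from s(a) into X.
    """
    U = {}
    for a in W:
        U[a] = f(dict(U))   # at this point U is defined exactly on s(a)
    return U

def f(t):
    """Hypothetical sequence function: one more than the sum of all earlier values."""
    return 1 + sum(t.values())

U = recurse(range(6), f)
print(U)  # {0: 1, 1: 2, 2: 4, 3: 8, 4: 16, 5: 32}
```

Each value depends on all earlier values rather than only on the immediate predecessor, which is the feature that carries over to well ordered sets where immediate predecessors may not exist.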

Proof :

(uniqueness of U) : Assume there is another such function Q. Let S be the set of all elements a of W such that U(a) = Q(a). Assume W - S is not empty, and let a be its smallest element. Then s(a) is included in S. Thus Q|s(a) = U|s(a). Hence Q(a) = f(Q|s(a)) = f(U|s(a)) = U(a) => a is in S. Contradiction.

(existence) Let an f-closed subset A of W x X be such that : Given an element a in W, if there is a sequence of type a (t) included in A, then (a,f(t)) is included in A. We consider all f-closed subsets of WxX (this is not empty since WxX is itself such a subset). Let U be an intersection of all such subsets. 

U is f-closed (else find an a where condition is not true). We have to show U is a function. 

Let S be the set of all c in W for which (c,x) is in U for at most one x. Assume there is an a such that s(a) is included in S => there is a sequence t of type a, such that $t \subset U$ => (a,f(t)) is in U. Suppose there is also an (a, y) in U, such that y != f(t). Then U - {(a,y)} is f-closed, which contradicts the fact that U is the smallest f-closed set. Hence a is in S, and by transfinite induction S = W.

In the above, the least element of W is included in S because its initial segment is the empty set, which is trivially included in S.

**Similar Sets** Two posets A and B are similar $A \simeq B$ if there is an order preserving one-to-one correspondence between them (order isomorphism).

Thus, if f is a one-to-one correspondence from A onto B, such that f(a) <= f(b) (in B) iff a <= b (in A), then A and B are said to be similar, and f is called a similarity. Note that similarities in general are not unique - we shall show that a similarity between well ordered sets is unique.

Simple conclusions

**similarity preserves <**

If a <= b, and a != b => a < b. We have to show f(a) < f(b). Obviously, f(a) <= f(b). If f(a) = f(b), then f is not one-one.

Conversely, if f(a) < f(b) => a <= b. If a = b, then f(a) = f(b), contradicting f(a) < f(b). Hence a < b.

**inverse of similarity is a similarity**  
**composition of similarities is a similarity**  

**Comparability Theorem: If f is a similarity from a well ordered set W into itself (i.e. onto a subset of itself), then for each a in W, a <= f(a)**

Proof : Let S be the set of all elements a of W for which a <= f(a). If W-S is not empty, let x be its least element. Then x > f(x). Since f is order preserving, f(x) > f(f(x)).

But, since f(x) < x and x is the least element of W-S, f(x) is in S => f(x) <= f(f(x)), a contradiction.

**If 2 well ordered sets are similar, then the similarity is unique**

Let f and g be similarities from a well ordered set X onto Y. Then h = $g^{-1}f$ is a similarity from X to itself, so a <= $g^{-1}f(a)$ for all a. Applying g, g(a) <= f(a) for all a, and by symmetry f(a) <= g(a). Hence f(a) = g(a).

**A well ordered set is never similar to one of its initial segments**

Let X be similar to s(a), by similarity f. Then f(a) is in s(a), so a > f(a), which violates the theorem that a <= f(a) for any similarity from a well ordered set into itself.

**For any two well ordered sets, either they are similar, or one is similar to an initial segment of the other**

Assume X and Y are non-empty well ordered sets, such that neither is similar to an initial segment of the other. We show that X and Y are similar.

Let a be in X, and let t be a function from s(a) into Y (a sequence of type a in Y). Let f(t) be the least of the proper upper bounds of the range of t in Y, if one exists, else let f(t) be the smallest element of Y. Then f is a sequence function of type X in Y. Let U be the corresponding function (so that U(a) = f(U|s(a))).

Let S be the set of all a in X such that U|s(a) is a one-to-one correspondence from s(a) onto the initial segment s(U(a)) of Y. S is not empty, since the smallest element of X is mapped to the smallest element of Y, and both have empty initial segments. Let a be such that s(a) is included in S. The range of U|s(a) cannot be all of Y (otherwise Y would be similar to the initial segment s(a) of X), so Y - ran(U|s(a)) is non-empty, with a least element b. Every element b' of ran(U|s(a)) satisfies b' < b (otherwise b would lie in the initial segment of some U(c) with c < a, which is included in the range). This implies ran(U|s(a)) = s(b), so b is the least proper upper bound of the range and U(a) = b. Hence a is also in S. By transfinite induction S = X; the range of U is then either Y or an initial segment of Y, and the second case is excluded by assumption, so U is a similarity from X onto Y.

## Ordinal Numbers - Counting Beyond Natural Numbers

How do we count beyond the natural numbers? $\omega = N$ itself is the first step. What about $\omega^+$ - the successor? And its successor, and so on? Remember $\omega^+ = \omega \cup \{\omega\}$.

For each natural number n we can define a function f on $n^+$ as follows : f(0) = $\omega$, f($m^+$) = $(f(m))^+$ for m < n. This is a function from $n^+$ onto a set F(n) which is made up of $\omega,\omega+1,\omega+2$ and so on. Note that we are using F(n) = range of f.

It seems we can now quite easily create a function Q on N itself, such that Q(n) = F(n) for each n in N. Though this is a little obscure, this is in fact not possible using the existing axioms of set formation - we need a new axiom.

**Axiom of Substitution** Let S(a,b) be a statement such that for each a in some set A, the set {b : S(a,b)} can be formed. Then there exists a function F with domain A, such that F(a) = {b: S(a,b)} for each a in A.

This is called the axiom of substitution because we create a new set of {b}'s, by substituting these {b}'s for each a in A. The fact that we can carry out this procedure is in effect the axiom of substitution.

We can use the axiom of substitution to define **ordinal numbers**. An ordinal number is a well ordered set which has the additional property that each element of the set is equal to its initial segment i.e. e = s(e) for each e in the ordinal number. Thus every element of the ordinal number is also a subset of the ordinal number.

We have already shown all natural numbers are ordinal numbers. Now, by this new definition, N (a.k.a. $\omega$) is also an ordinal. In fact, if a is an ordinal, so is the set $a^+ = a \cup \{a\}$ (proof : every element is either a, or an element of a. Given a is an ordinal, the result follows). We can therefore confidently proceed beyond $\omega^+$ and further. How far?
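The finite ordinals can be built literally as sets and the defining property e = s(e) checked mechanically. A minimal sketch using Python `frozenset`s; the names `succ`, `nat`, `initial_segment` and `is_ordinal_like` are illustrative, and the check covers only the condition e = s(e) with the order "x < y iff x is an element of y" (well-ordering is automatic for these finite examples).

```python
def succ(a):
    """The successor a+ = a U {a}."""
    return a | frozenset([a])

def nat(n):
    """The natural number n as a set: 0 is the empty set, n+ = n U {n}."""
    a = frozenset()
    for _ in range(n):
        a = succ(a)
    return a

def initial_segment(e, a):
    """s(e) inside a, where x < y means x is an element of y."""
    return frozenset(x for x in a if x in e)

def is_ordinal_like(a):
    """Check that every element of a equals its own initial segment in a."""
    return all(initial_segment(e, a) == e for e in a)

four = nat(4)
print(len(four))                           # 4
print(nat(3) in four, nat(3) < four)       # True True: element and proper subset
print(is_ordinal_like(succ(four)))         # True: the successor passes the check
print(is_ordinal_like(frozenset([nat(2)])))  # False: {2} is not an ordinal
```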

We now bring in the axiom of substitution. Coming to our function earlier on N, we can define, using the axiom of substitution, a function F with domain N, such that F(n) = {x : x is in the range of the function f defined for n}. The important thing here is the range of F.

The set $\omega2 = \omega \cup range (F)$ is an ordinal.

Proof : TBD

One way : We know this is true up to $\omega$. Now prove it for $\omega + 1$ onwards by induction. Order is defined by a < b if $a \in b$. By definition, every element contains its initial segment. We have to show that every element is equal to its initial segment. Let S be the set of natural numbers n for which it is true that $\omega + n$ is equal to its initial segment. Then 0 is in S (by definition of $\omega$). If n is in S, then $\omega+n$ is equal to its initial segment. $\omega + n^+ = (\omega + n) \cup \{\omega + n\}$. The main thing to verify is that every element is in $\omega2$. For this we see that every element of $\omega + n^+$ is its predecessor $\omega + n$ or belongs to the initial segment of its predecessor. These are all elements of $\omega2$. By definition then, $\omega + n^+$ is equal to its initial segment and $n^+$ belongs to S. This implies that ran(F) meets the criteria for an ordinal. And of course, all elements of $\omega$ are already ordinals. So $\omega2$ is an ordinal.

**An order (partial or total) is determined by its initial segments**

Proof : To prove : If A is a poset, a <= b in A <=> $\bar{s}(a) \subseteq \bar{s}(b)$ in S, the set of weak initial segments ordered by inclusion.

This follows immediately from the reflexivity, transitivity and anti-symmetric properties of partial orders. 

**If a well ordered set is an ordinal number, there is only one possible ordering**

Since each set is equal to its initial segment, the set is itself a collection of initial segments and hence induces a specific order.

**Every element of an ordinal number is a subset of that ordinal number**

By definition.

**Every element of an ordinal number is an ordinal number, or every initial segment of an ordinal number is an ordinal number**

If e is an element, then it is a subset of the ordinal number and hence every element b of e is itself equal to its initial segment. But b is a subset of e, because if c in b => c < b => c < e => c is in e. Hence every element of e contains its initial segment in e. And of course, being a subset of a well ordered set, e is well ordered, so e is an ordinal.

The statement is even more immediate : the initial segment of an ordinal number has several elements, each of which are the same as their initial segments, and this segment is well ordered => it is an ordinal. 

**Two similar ordinal numbers are the same**

Let f be the similarity from a to b, both ordinal numbers. Write S = {e : f(e) = e}. 

For any element e in a, e is the "lub" of s(e), and since f is a similarity, f(e) is the "lub" of the image of s(e). If s(e) $\subset$ S, then f(e) and e have the same initial segments, and hence e = f(e) i.e. e is also in S. By transfinite induction S = a, so f is the identity and a = b.

**Either two ordinals are similar, or one is similar to an initial segment of the other.**

This follows since ordinals are well ordered sets.

**If an ordinal number b is similar to an initial segment of ordinal number a, then b < a, b $\in$ a, b $\subset$ a, and a is a continuation of b**

**Every set of ordinal numbers is well ordered**

Let E be a non-empty set of ordinals, and let a be an element of E. If a <= b for all elements b of E, then a is the least element of E. Otherwise there must be an element b of E such that b < a => $b \in a$ => $a \cap E$ is not empty. Since a is an ordinal and is well ordered, the intersection must have a least element, say a0. a0 is the least element of E, since any element of E less than a belongs to the intersection, and hence the least of these is the least element of E. Thus every non-empty set of ordinals has a least element.

**finite ordinals - natural numbers, transfinite ordinals - not finite, limit ordinals - no immediate predecessor**

**Every set of ordinal numbers has a unique supremum**

Let C be such a set. C is totally ordered by continuation.

Let a be the union of C. a is well ordered, since it is the union of a chain of well ordered sets ordered by continuation (we have proved this earlier).

Every element e of a belongs to some ordinal b in C. Since a is a continuation of b (or equal to it), the initial segment s(e) is the same whether formed in a or in b, so s(e) = e. Thus a is an ordinal number.

Every element of C is a subset of a, so a is an upper bound of C. If b is another upper bound of C, then every element of C is a subset of b, so $a \subset b$ => a is the least upper bound of C.

**Burali-Forti Paradox : There is no set of all ordinals**

If there was - take the supremum a. This is an ordinal, greater than or equal to every ordinal. But given an ordinal a, its successor $a^+$ is a strictly greater ordinal; hence there is no set of all ordinals.

**Counting Theorem: Every well-ordered set is similar to some ordinal number**

Uniqueness is obvious since two similar ordinal numbers are the same.

*In a well ordered set, if for an element a, the initial segment of every element of s(a) is similar to an ordinal, then s(a) itself is similar to an ordinal* : 

Let X be a well ordered set. Let a be an element of X, such that for each x in s(a), s(x) is similar to an ordinal. Let S(x,b) be the statement : b is an ordinal number, and $s(x) \simeq b$, where x is in s(a).

For each x in s(a), the set { b: S(x,b) } exists and is non-empty (it is a singleton). Then, by the axiom of substitution, there is a function F with domain s(a), such that F(x) = {b : S(x,b)}. Identifying each singleton with its element, ran(F) is a (well-ordered) collection of ordinals.

Let x be in s(a), such that $s(x) \simeq b$. Then all initial segments of s(x) are similar to initial segments of b. But the initial segments of b are in fact elements of b => these are also elements of ran(F). Thus b is a subset of ran(F). This implies that ran(F) is an ordinal. F then is a similarity and s(a) is similar to an ordinal.

*In a well ordered set, every initial segment is similar to an ordinal*

Let S be the set of all elements of X whose initial segments are similar to an ordinal. If X - S is not empty, let a be its smallest element. Then $s(a) \subset S$, so the initial segment of each element of s(a) is similar to an ordinal, and hence, by the result above, s(a) itself is similar to an ordinal. Thus a is in S, a contradiction. Thus X = S i.e. every initial segment of X is itself similar to an ordinal.

*In a well ordered set, if every initial segment is similar to an ordinal, then the set itself is similar to an ordinal*

Let X be a well ordered set. As proved, each of X's initial segments is similar to an ordinal. Let T(x,b) be the statement : b is an ordinal number, and $s(x) \simeq b$, where x is in X. Then, by the axiom of substitution, we can form a function H with domain X, such that H(x) = {b : T(x,b)}. Since each initial segment of X is similar to an element of ran(H), it follows as before that every element of each b in ran(H) is in ran(H), and hence ran(H) is itself an ordinal. But then H is a similarity and X is similar to an ordinal.

## Ordinal Numbers

For a well ordered set X, let ord X be the unique ordinal similar to X. 

Let A and B be disjoint well ordered sets, with ord A = a, and ord B = b. Then C = A U B is called the ordinal sum of A and B. Let ord C = c. 

Note: To derive the ordering on C, we can assume that any element of B is greater than any element of A. We can extend this to infinite unions also. Also, even if A,B are not disjoint, we can create equivalent disjoint sets, by creating (x,0) for each x in A, and (x,1) for each x in B and then taking the union.

We then define a + b = c.

Properties:  
a + 0 = a, 0 + a = a  
a + 1 = $a^+$  
a + (b + c) = (a + b) + c

Commutativity fails in general:

$1 + \omega = \omega, \omega + 1 = \omega^+$. In general, tacking something to the beginning of an infinite well ordered set does not change the structure of the ordering. 
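The failure of commutativity can be checked by computing with the ordinals below $\omega\omega$. The sketch below rests on an assumed encoding that is not part of the notes: the pair (a, b) of natural numbers stands for $\omega a + b$, that is, a copies of $\omega$ followed by b further elements. In a sum, a finite tail of the left operand is absorbed whenever the right operand starts with at least one copy of $\omega$.

```python
def add(x, y):
    """Ordinal sum for ordinals below omega*omega, encoded as (a, b) ~ omega*a + b."""
    (a, b), (c, d) = x, y
    # If y contains a copy of omega, the finite tail b of x is swallowed by it.
    return (a + c, d) if c > 0 else (a, b + d)

ZERO, ONE, OMEGA = (0, 0), (0, 1), (1, 0)

print(add(ONE, OMEGA) == OMEGA)    # True:  1 + omega = omega
print(add(OMEGA, ONE))             # (1, 1): omega + 1, the successor of omega
print(add(OMEGA, OMEGA))           # (2, 0): omega + omega = omega2
```

Associativity and the identities a + 0 = 0 + a = a can be spot-checked over small pairs in the same way.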

The ordinal product of two sets A and B can similarly be defined as the set C = A x B, with ordering in reverse lexicographic order i.e. given (a1,b1) and (a2, b2) in A x B, if b1 != b2, use the order in B, else use the order in A.
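The reverse lexicographic ordering is easy to see on finite sets. A minimal sketch, where `ordinal_product` is an illustrative helper that lists A x B in that order (second coordinate compared first):

```python
from itertools import product

def ordinal_product(A, B):
    """List A x B in reverse lexicographic order: compare b first, then a."""
    return sorted(product(A, B), key=lambda p: (p[1], p[0]))

# A = 2 = {0, 1}, B = 3 = {0, 1, 2}: the pair (a, b) lands at position 2b + a.
two_by_three = ordinal_product([0, 1], [0, 1, 2])
print(two_by_three)
# [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]

# A = 3, B = 2: a full copy of A (second coordinate 0) comes before (0, 1).
three_by_two = ordinal_product([0, 1, 2], [0, 1])
print(three_by_two)
# [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

If the finite set B = 3 is replaced by N, every pair (a, b) in the first product still has only the finitely many predecessors counted by 2b + a, whereas if A = 3 is replaced by N in the second product, the pair (0, 1) has infinitely many predecessors. This is the finite shadow of $2\omega = \omega$ versus $\omega2$.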

ab = c

a0 = 0a = 0  
a1 = 1a = a  
a(bc) = (ab)c  
a(b + c) = ab + ac (left distributive law)

But again, commutativity fails, and right distributive law fails in general.

$2\omega = \omega \neq \omega2$

$(1 + 1)\omega = 2\omega = \omega$, but $1.\omega + 1.\omega = \omega + \omega = \omega2$

Similarly, exponentiation can be dealt with.

$0^a = 0, a \geq 1$

$1^b = 1$

$a^{b+c} = a^b.a^c$

$a^{bc} = (a^b)^c$

But,

$(ab)^c$ is not, in general equal to $a^cb^c$. E.g. $(2.2)^\omega = 4^\omega = \omega$, but $2^\omega.2^\omega = \omega.\omega = \omega^2$