# What is an atom?

I don't mean an atom like on the periodic table, which is really a complicated enough thing of its own "made of" electrons, protons, neutrons, and so forth. I mean, an actual atom. The word in Ancient Greek means something indivisible. Sometimes "atomism" is characterized as the belief that the world is made of tiny little hard impenetrable pebbles whizzing around in empty space, perhaps, according to Lucretius, swerving from time to time in a pique of will. But in a literal sense, atomism is the idea that the world is in some sense composed of indivisibles. We can divide something up, but at some point we won't be able to divide any longer. And what's left are the atoms. 

One could interpret this "materialistically" that what's left is tiny little bits of "matter", perhaps differently shaped, perhaps of different substances, that can fit into each other in different ways. But "idealistically", we could interpret this as a reflection on the fact that when we analyze an idea, what we do is we reduce it down to a bunch of basic concepts fitted together into a jigsaw; and the most basic concepts have to be taken for granted, atomically. Of course, an idea can be interpreted in many ways, expressed in different ways as a complex of perhaps different basic concepts. So that from one point of view, a concept can be resolved into these atoms; whereas from another point of view, the concept can be resolved into some other notion of atoms. We prove our understanding to each other by demonstrating how we can reduce some shared concept into some agreed upon atoms. And arguably, what we mean by "understanding" is the ability to apprehend what something "is made of" immediately upon recognizing it. 

It's worth observing that when the materialist wants to have their atoms be tiny little "physical" cubes or octahedra, knocking around, naturally selecting, in actual practice, what they really need to do is plausibly reduce *all their concepts* down into a single set of basic atomic concepts, which correspond exactly to the ideas of "the tiny hard impenetrable sphere," "the tiny little pyramid," out in the real world. When put this way, it actually seems rather extravagant. While it is "obvious" in some sense that ideas can be analyzed into atomic ideas, it would be really quite something if *all* concepts could be reduced to jigsaws made of exactly the same ideas. It would mean all those inequivalent ways of analyzing concepts were really worthless: after all, why not just use the universal ideas, the universal atoms? 

The materialist could respond: The real world is made up of one set of atoms; it's only in the mind that we can analyze concepts into seemingly different inequivalent atomic schemes, leading to the disagreement and confusion. The response at this juncture can only be that if you're willing to consider that, why not consider the case where, it's only in the mind that all concepts can be reduced down to a single set of ur-concepts (your own!), whereas in the real world, real things can be reduced down to ur-concepts in different inequivalent ways.

The only way to proceed is to actually try to do it: we have to try to divide the world and our concepts into universal atoms and see what happens.

<hr>


So we begin with an atom. We symbolize it with a pebble: $\Large\bullet$.  Tiny, hard, impenetrable. Point-like. You can turn over a real pebble in your hand. Of course, you can split an actual pebble, but maybe you imagine splitting it again and again, and eventually maybe you can't split it any more: and we're talking about that, that ur-pebble at the bottom in the dust, separated from its fellow dust. Or from another point of view, we're tracing the roots of our human concept of a "pebble" and what we can do with it.

So we have a pebble: $\Large\bullet$. What can we do with it? Well, if we had it, we could lose it, get rid of it, take it away. We'll denote the absence of the pebble with $\Large\circ$.

For later reasons, I want to call what we just did: homogenization. For now, the terminology isn't important. The point is in our way of representing things, we have a way of symbolizing both a thing and its absence. And indeed, this is how we perceive the situation: the empty space there in your hand is just inviting me to place a pebble in it. To understand the concept of a pebble is to see the world as one giant set of opportunities to place a pebble somewhere where it's absent or to take a pebble from where it sits. 

Furthermore, homogenization is necessary for reliable communication. If we can in fact recognize them, we can use the presence or absence of a pebble to communicate something perhaps totally unrelated to the pebble. A little rock on a ledge by a window, a sign that the riots begin tomorrow. But there's a catch! Just the same, we could have prepared a little surprise, which is that it would be precisely the *absence* of the pebble that would signify the riots begin tomorrow, whereas its presence would have signified regrouping. So it's clear: in order to communicate something with a pebble, we have to both understand pebbles, to be able to pick them out of the background of perception, promoting their presence or absence to the foreground, but also: we have to make an ultimately arbitrary choice about whether presence or absence will be significant, which one is $0$ and which one is $1$. The choice is arbitrary, but a choice must be made. And this fact means that if we swap pebbles for absences, we can on the flip side correct our rule for signification by swapping them in the same way, riots for regrouping.

<hr>

So far we've just been really considering a single pebble and what we can do with it. 

But, if $1$ pebble is sitting there, why not put another one beside it? So now we have $2$ pebbles: $ \Large \{ \Large\bullet, \Large\bullet \}$. We could add another pebble. Then we'd have $3$ pebbles: $ \Large \{ \Large\bullet, \Large\bullet, \Large\bullet \}$. And, all things being equal, there's no limit to our being able to add more pebbles. And so, we've discovered counting up. And naturally, if we can count up, we can count down, removing pebbles until we're back to our original pebble. 

One way to look at this is that we're providing more context for our pebble $\Large\bullet$. Picture it like our pebble is actually floating in an infinite sea of pebble-absences: $ \Large \{\dots, \Large\circ, \Large\bullet, \Large\circ, \dots \}$. Each absence represents an opportunity to add another pebble. You can take the bait: $ \Large \{\dots, \Large\circ, \Large\bullet, \Large\bullet, \dots \}$, and again: $ \Large \{\dots, \Large\bullet, \Large\bullet, \Large\bullet, \dots \}$, without limit: which implies there's always one more $\Large\circ$ left no matter how many $\Large\bullet$'s you add, just as you started with a single $\Large\bullet$ in a sea of $\Large\circ$'s. Of course the order doesn't matter. If before we had atoms, we now have composites made of these atoms, in which the atoms all coexist simultaneously, atop each other, however you want to look at it.

We've discovered the counting numbers $1, 2, 3, \dots$. You could picture them like: a bunch of pebbles in a pile. But notice we start counting at 1. This is necessary because we'd like to preserve the original symmetry, the fact that we could swap a pebble for its absence, as well as our interpretations, and still get a successful signification. In this case, we just flip the colors on all the pebbles. You could imagine it as "a universe which consists of a single pile of pebbles (or absences)."
<hr>

Well, naturally we'd be interested in having multiple piles of pebbles to play with.

What does it mean to group pebbles into piles? We could imagine bringing the pebbles closer and closer together until actually they're in the "same place," but since that's tricky, we signify grouping instead by drawing a circle around the pebbles, or just kind of making a little stack of them in a pile. The order/placement of the pebbles doesn't matter, just the fact that the pebbles are all in there, and there is some number of them. And just as we can have multiple pebbles in a pile, we can have multiple piles lying around. (Living analogously in a single "pile of piles" which provides the background context, just as a single pile provided the background context for a single pebble. But to that we will come in time. For now, we're just thinking about having multiple piles.)

What's the most basic thing we could do with piles? Well, we could combine piles. This is called "addition."

$ \Large \{ \Large\bullet, \Large\bullet \} + \Large \{ \Large\bullet, \Large\bullet, \Large\bullet \} = \Large \{ \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet \} $

In other words, $2 + 3 = 5$. We could think of this like iterated counting up: we count up by as many pebbles as are in the pile, "all at once," as opposed to one at a time, and if we think in terms of combining piles, obviously: $2 + 3 = 3 + 2 = 5$. But if we want an inverse operation, iterated counting down, where we count down by as many pebbles as are in a pile, we suddenly have a problem: what if we subtract past 1 to 0 or, what's worse, to "negative numbers"? For example, $2 - 3 = -1$:

$ \Large \{ \Large\bullet, \Large\bullet \} - \Large \{ \Large\bullet, \Large\bullet, \Large\bullet \} = \Large \{ \Large{\color{red} \bullet} \} $

Clearly, we have to introduce a second kind of pebble (with its own kind of absence) to keep track of our "debts." The rule is that if a $\Large\bullet$ and a $\Large{\color{red} \bullet}$ are in a pile together, we can remove them both: $\Large \{ \Large\bullet, \Large{\color{red} \bullet} \} \rightarrow \Large\{ \}$. So now we can represent an empty pile, $0$, just a circle with nothing in it. Or with colored pebbles themselves as any pile with an equal number of $\Large\bullet$'s and $\Large{\color{red} \bullet}$'s. Black and red pebbles can't coexist in a pile together, and so eventually after all the credits and debts cancel out, the pebbles in a pile are all red, all black, or 0. And so we have a new kind of freedom in our representation: we can add arbitrary pairs of black and red pebbles to any pile and this won't change the number that it represents: $\Large \{ \Large\bullet, \Large\bullet,  \Large{\color{red} \bullet}, \Large{\color{red} \bullet}, \Large{\color{red} \bullet}, \Large{\color{red} \bullet}\} \rightarrow \Large\{ \Large{\color{red} \bullet}, \Large{\color{red} \bullet} \}$.
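As a rough sketch of this bookkeeping in Python, we could cancel black/red pairs like so. The names `reduce_pile`, `add_piles`, and `negate`, and the string labels `"black"` and `"red"`, are made up here purely for illustration.

```python
from collections import Counter

def reduce_pile(pile):
    """Cancel black/red pairs; return the reduced pile and the integer it stands for."""
    counts = Counter(pile)
    value = counts["black"] - counts["red"]
    if value >= 0:
        reduced = ["black"] * value      # only credits are left (or nothing at all)
    else:
        reduced = ["red"] * (-value)     # only debts are left
    return reduced, value

def add_piles(a, b):
    """Combine two piles, then let the credits and debts cancel."""
    return reduce_pile(list(a) + list(b))

def negate(pile):
    """Swap every black pebble for a red one and vice versa."""
    return ["red" if p == "black" else "black" for p in pile]

two = ["black"] * 2
three = ["black"] * 3
print(add_piles(two, negate(three)))   # 2 - 3: a single red pebble, i.e. -1
```

Subtraction here is just addition of the color-swapped pile, which is one way of reading the "debts" idea.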

We started with an atomic pebble. We then joined pebbles into a complex we called a pile, in which multiple pebbles coexist. We then decided to work at an even higher level where we can have multiple piles, just as before we had multiple pebbles. We've invented the "integers": $ \ldots -3, -2, -1, 0, 1, 2, 3, \ldots$. 

<hr>

As if inevitably, the dialectic carries on. If we can iterate counting up to get addition, we should be able to iterate addition to get: multiplication, whose "all-at-once" inverse is division. And just as before, we graduate to a new notion of number, a new interpretation of our atoms: the rational numbers, which can be regarded as "piles of piles." Here's how it works.

If we have two piles of pebbles $A$ and $B$, we can multiply them by iterating the addition of $A$ by the number of pebbles in $B$, or vice versa.

$ \Large \{ \Large\bullet, \Large\bullet \} \times \Large \{ \Large\bullet, \Large\bullet, \Large\bullet \} = \Large \{ \Large\bullet, \Large\bullet \} + \Large \{ \Large\bullet, \Large\bullet \}  + \Large \{ \Large\bullet, \Large\bullet \} = \Large \{ \Large\bullet, \Large\bullet, \Large\bullet \} + \Large \{ \Large\bullet, \Large\bullet, \Large\bullet \} $

$ \Large \{ \Large\bullet, \Large\bullet \} \times \Large \{ \Large\bullet, \Large\bullet, \Large\bullet \} = \Large \{ \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet \} $

$ 2 \times 3 = 3 \times 2 = 6 $

If we wanted we could display the last identity like this:

$ 2 \times 3 = 3 \times 2 = \begin{matrix} \Large\bullet & \Large\bullet \\\ \Large\bullet & \Large\bullet \\\ \Large\bullet & \Large\bullet \end{matrix} = \begin{matrix} \Large\bullet & \Large\bullet & \Large\bullet \\\ \Large\bullet & \Large\bullet & \Large\bullet \end{matrix} $

And correspondingly: $\frac{6}{3} = 2 $ and $ \frac{6}{2} = 3$. And we have it that $\frac{n}{1} = n$ for any $n$.
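To make the "iterated addition" reading concrete, here is a minimal sketch for non-negative whole numbers. The function names `multiply` and `divide_exactly` are hypothetical, and `divide_exactly` assumes a positive divisor.

```python
def multiply(a, b):
    """Add a to an empty pile, b times over."""
    total = 0
    for _ in range(b):
        total += a
    return total

def divide_exactly(n, d):
    """Count how many times d can be taken away from n (d must be positive).

    Returns None when pebbles are left over, i.e. when d does not divide n.
    """
    count = 0
    while n >= d:
        n -= d
        count += 1
    return count if n == 0 else None

print(multiply(2, 3), multiply(3, 2))              # both give 6
print(divide_exactly(6, 3), divide_exactly(6, 2))  # 2 and 3
print(divide_exactly(3, 2))                        # None: no counting number works
```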

Now an important structure arises out of the interplay between addition and multiplication: the prime numbers. A prime number is a number greater than $1$ that is only divisible by itself and $1$. Whereas $12$ could be expressed $2 \times 2 \times 3$, $7$ is just $7 \times 1$. The primes go like: $2, 3, 5, 7, 11, 13, 17, 19, \dots$. And in fact, there are an infinite number of primes. The proof goes back at least to Euclid.

First we observe that every 2nd number is divisible by 2, every 3rd number is divisible by 3, every 4th number is divisible by 4, every 5th number is divisible by 5, and so on. It follows that if we add $1$ to any number, the result won't be divisible by any of its old primes. For example, $14 = 2 \times 7$, but $15 = 3 \times 5$. If we think about it "musically," $14$ falls on the beat of $2$ and $7$: as I say, every 2nd number is divisible by $2$, every 7th number is divisible by $7$, and both beats fall on $14$. But if we add $1$ to $14$, it can't fall on the $2$-beat nor the $7$-beat, and this is a general rule.

$ 1 \mid \color{green}2 \mid \color{blue}3 \mid \color{green}2 \times \color{green}2 \mid \color{orange}5 \mid \color{green}2 \times \color{blue}3 \mid \color{purple}7\mid \color{green}2 \times \color{green}2 \times \color{green}2 \mid \color{blue}3 \times \color{blue}3 \mid \color{green}2 \times \color{orange}5 \mid \color{pink}{11}\mid \color{green}2 \times \color{green}2 \times \color{blue}3 \dots $

So suppose there were a largest prime number. We could then take that prime and all the prime numbers lower than it, multiply them all together and add $1$: this number couldn't be divisible by any of our known primes! And so, it must either be a new prime itself or contain a yet larger prime within it. Hence, our initial assumption has led to a contradiction: therefore, its negation is true: there *are* an infinite number of prime numbers.
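We can watch the argument at work on a pretend "complete" list of primes. This is only a sketch: the list `primes` and the helper `smallest_prime_factor` are chosen here for illustration.

```python
from math import prod

def smallest_prime_factor(n):
    """Trial division: the first d >= 2 that divides n (n itself if n is prime)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

primes = [2, 3, 5, 7, 11, 13]            # pretend these are all the primes
candidate = prod(primes) + 1             # 2*3*5*7*11*13 + 1 = 30031
remainders = [candidate % p for p in primes]
new_prime = smallest_prime_factor(candidate)

print(candidate, remainders, new_prime)
```

Every remainder is $1$, so none of the listed primes divides the candidate. In this case the candidate is not itself prime ($30031 = 59 \times 509$), but its factors are primes missing from the list, which is all the proof needs.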

Now it is a fact that every counting number can be broken down uniquely into a product of primes. Which suggests the following idea. What if we introduce now an infinite number of colored pebbles, one for each prime? Under the hood, each will be just a pile of pebbles.

$ 1 \rightarrow \Large{\color{black} \bullet} \rightarrow \{ \Large\bullet \}$

$ 2 \rightarrow \Large{\color{green} \bullet} \rightarrow \{ \Large\bullet, \Large\bullet \}$

$ 3 \rightarrow \Large{\color{blue} \bullet} \rightarrow \{ \Large\bullet, \Large\bullet, \Large\bullet \}$

$ 5 \rightarrow \Large{\color{orange} \bullet} \rightarrow \{ \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet \}$

$ 7 \rightarrow \Large{\color{purple} \bullet} \rightarrow \{ \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet \}$

$ 11 \rightarrow \Large{\color{pink} \bullet} \rightarrow \{ \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet \}$

We could now write $12$ like this:

$ 12 = 2 \times 2 \times 3 = \Large\{ \Large{\color{green} \bullet}, \Large{\color{green} \bullet}, \Large{\color{blue} \bullet} \} = \Large\{ \{ \Large\bullet, \Large\bullet \}, \{ \Large\bullet, \Large\bullet \}, \{ \Large\bullet, \Large\bullet, \Large\bullet \} \} = \{ \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet, \Large\bullet \}$

So we have a new kind of pile: a pile of prime numbers. And the rule is: we *multiply* elements in a pile of primes together into a composite. Just as pebbles coexist "unordered" in an "additive pile," prime pebbles coexist "unordered" in a "multiplicative pile." At this level, the primes are our new atoms and we interpret "piles of piles" multiplicatively. We can have, of course, multiple "piles of piles" and we can combine them just like piles:

$ \Large\{ \Large{\color{green} \bullet}, \Large{\color{green} \bullet} \Large\} \times \Large\{ \Large{\color{orange} \bullet} \} \rightarrow \Large\{ \Large{\color{green} \bullet}, \Large{\color{green} \bullet}, \Large{\color{orange} \bullet} \Large\}$ 

$ 4 \times 5 = 20$
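One way to play with multiplicative piles is to model them as multisets. The sketch below uses Python's `collections.Counter`, mapping each prime to how many copies of its pebble sit in the pile; `prime_pile` and `value` are hypothetical helper names.

```python
from collections import Counter
from math import prod

def prime_pile(n):
    """Break a counting number n >= 1 into its multiset of prime pebbles."""
    pile = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            pile[d] += 1
            n //= d
        d += 1
    if n > 1:
        pile[n] += 1     # whatever is left over is itself prime
    return pile

def value(pile):
    """Multiply the pebbles in a pile back together."""
    return prod(p ** e for p, e in pile.items())

twelve = prime_pile(12)                   # two 2-pebbles and one 3-pebble
combined = prime_pile(4) + prime_pile(5)  # pouring two piles together
print(twelve, combined, value(combined))
```

Adding two `Counter`s pours the piles together, and doing so multiplies the numbers they represent.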

But what happens if we do something crazy like $\frac{3}{2}$? There's no counting number $n$ such that $n \times 2 = 3$. But if we can multiply, we must be able to divide. So again we have to introduce a new kind of number: a rational number. We just interpret $\frac{3}{2}$ as a fraction. We could think of this like introducing a little inverse pebble for each prime.

$ \frac{1}{1} \rightarrow \bar{\Large{\color{black} \bullet}}$

$ \frac{1}{2} \rightarrow \bar{\Large{\color{green} \bullet}}$

$ \frac{1}{3} \rightarrow \bar{\Large{\color{blue} \bullet}}$

$ \frac{1}{5} \rightarrow \bar{\Large{\color{orange} \bullet}}$

$ \frac{1}{7} \rightarrow \bar{\Large{\color{purple} \bullet}}$

$ \frac{1}{11} \rightarrow \bar{\Large{\color{pink} \bullet}}$

And the rule is that:

$\Large\{ \Large{\color{green} \bullet}, \bar{\Large{\color{green} \bullet}} \Large\} \rightarrow \Large\{ \Large{\color{black} \bullet} \Large\}$

$ 2 \times \frac{1}{2} = 1$.

If a prime pebble and its inverse appear in a pile together, we can remove them both: which is the same as replacing them with a $1$ pebble. And actually we can add as many $1$ pebbles as we please without changing the number represented by the pile. Moreover, we can add any number of pairs of primes and inverses, and our pile will still represent the same number. Once all the cancellations have taken place, we say we've reduced the rational number to "lowest terms." 

We can take advantage of this freedom to add rational numbers. First, we have to find a common denominator, and then we can add the numerators without trouble.

$ \frac{1}{5} + \frac{2}{3} = \frac{3}{15} + \frac{10}{15} = \frac{13}{15} $
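The common-denominator step can be spelled out with Python's standard `fractions.Fraction` type. This is a sketch; the function name `add_with_common_denominator` is made up for the occasion.

```python
from fractions import Fraction
from math import gcd

def add_with_common_denominator(a, b):
    """Rewrite a and b over a shared denominator, then add the numerators."""
    common = a.denominator * b.denominator // gcd(a.denominator, b.denominator)
    num_a = a.numerator * (common // a.denominator)
    num_b = b.numerator * (common // b.denominator)
    # Fraction(...) reduces the result to lowest terms automatically
    return (num_a, num_b, common), Fraction(num_a + num_b, common)

steps, total = add_with_common_denominator(Fraction(1, 5), Fraction(2, 3))
print(steps, total)   # numerators 3 and 10 over 15, summing to 13/15
```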

Finally, we also have our $-1$ pebble: $\Large{\color{red} \bullet}$, which can make piles negative. We can toss pairs of these pebbles into any multiplicative pile without changing the number.

$ \Large{\color{black} \bullet} \times \Large{\color{red} \bullet} = \Large{\color{red} \bullet}$ 

$ \Large{\color{red} \bullet} \times \Large{\color{red} \bullet} = \Large{\color{black} \bullet}$ 

If integers can keep track of credits and debts, which is related to the ability to arbitrarily decide what to regard as "0", rational numbers can keep track of different denominations, which is related to the ability to arbitrarily decide what to regard as "1".

For instance, maybe I have a ruler $A$ and I measure something to be $3$ units. You have a ruler $B$, and each unit of your ruler is worth two of mine. We could describe the relationship between our rulers with a rational number: $\frac{\text{yours}}{\text{mine}} = \frac{1}{2}$. So you would measure the same object to be $3 \times \frac{1}{2}$ or $\frac{3}{2}$ in your units. So rational numbers allow us to switch between different "currencies": they provide the exchange rate from gold to silver, dollars to cents, hours to minutes. Indeed, we recognize that if we want to communicate a number, in general we need to provide more context since we could be using different units to measure our numbers: we need to agree on how much of your "1" there is per my "1".

Finally, there remains a question about $0$. What happens if we take $\frac{1}{0}$? If we want to be able to add, subtract, multiply, and divide all our numbers, we need to find an interpretation of this. (This assumes we've added a new kind of pebble for $0$, which must have an inverse.) So we actually need one last number.

We say: $\frac{1}{0} \rightarrow \infty$ and $\frac{1}{\infty} = 0$. But if that's the case, then $-\frac{1}{\infty}$ must also be $0$ since $-0 = 0$. So we have the idea of identifying positive and negative $\infty$.

Here's one way to think of this:

![](img/2d_stereographic_projection.png)

The idea is to imagine all the rational numbers except $\infty$ arranged on a line. This can be done because every rational number divides the rationals into those less than and those greater than it. We then imagine a point at infinity beyond all the points on the left, and a point at infinity beyond all the points on the right, and we say that it's the same infinity. What we've done is wrap up the number line into a circle. Every rational number can be uniquely mapped to the circle via a stereographic projection. This we know: we can think about a rational number as divisions of a pie.
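Here is one possible convention for that projection, sketched in Python. It assumes the unit circle centered at the origin with $\infty$ sitting at the top point $(0, 1)$, which may not match the picture above exactly. The function name `to_circle` and the use of `None` to stand for $\infty$ are choices made just for this sketch.

```python
from fractions import Fraction

def to_circle(x):
    """Send a rational x (or None, standing for infinity) to a point on the unit circle."""
    if x is None:
        return (Fraction(0), Fraction(1))       # the single point at infinity
    d = 1 + x * x
    return (2 * x / d, (x * x - 1) / d)

for x in [Fraction(0), Fraction(1), Fraction(-1), Fraction(3, 2), None]:
    px, py = to_circle(x)
    print(x, (px, py), px * px + py * py)       # the last entry is always 1
```

Under this convention $0$ lands at the bottom of the circle, $\pm 1$ at the sides, and very large positive and negative numbers crowd toward the same top point from either side.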

<hr>

The saga continues, and here's where things may begin to come as a surprise. The story so far is ultimately an old one: almost all the aforementioned was known even to the world of Antiquity. But even then, there was recognized to be an anomaly: the irrational numbers.

If we continue on our path, we ought now to iterate multiplication to get exponentiation, whose inverse is root-taking. If addition is repeated counting all-at-once, and multiplication is repeated addition all at once, then exponentiation is repeated multiplication all-at-once. 

$2^{3} = 2 \times 2 \times 2 = 8$

Notice that the order matters:

$3^{2} = 3 \times 3 = 9$.

Root-taking is just the reverse:

$ \sqrt[3] 8 = 2 $

$ \sqrt 9 = 3$

But what about something like $\sqrt 2$?

Indeed, Pythagoras tells us that if we have a right triangle, then the relationship between the sides $a$ and $b$ and the hypotenuse $c$ is:

$a^{2} + b^{2} = c^{2}$

![](img/pythagorean_theorem.png)

Suppose $a = 1$ and $b = 1$, then:

$ 1^{2} + 1^{2} = c^{2} \rightarrow c = \sqrt 2$

There is a famous proof that the square root of 2 can't be any rational number. It is again a proof by contradiction.

Suppose $\sqrt 2 = \frac{p}{q}$, where $\frac{p}{q}$ is a rational number in lowest terms, so that $p$ and $q$ share no prime factors in common. Then:

$\sqrt 2 = \frac{p}{q}$

$ 2 = \frac{p^{2}}{q^{2}}$

$ 2q^{2} = p^{2} $

This says that $p^{2}$ has a factor of $2$, which actually means that $p$ has a factor of $2$ and $p^{2}$ has a factor of $4$. Let's factor out that $4$ by introducing a new symbol $r$.

$ 2q^{2} = 4r^{2}$, where $r^{2} = \frac{p^{2}}{4}$

We can then divide out by 2.

$ q^{2} = 2r^{2}$

This says that $q^{2}$ has a factor of $2$, really a factor of $4$, just as before. So we've shown that both $p$ and $q$ are even, but we assumed at the beginning that $p$ and $q$ had no factors in common! Therefore our initial assumption was wrong:

$ \sqrt 2 \neq \frac{p}{q} $

In other words, $\sqrt 2$ is not any of our rational numbers. It must be a new kind of number, an irrational number. 

This has something to do with 2D. We can use a rational number to convert between a horizontal ruler and a vertical ruler; but we can't use a rational number to convert between those rulers and a ruler at $45^{\circ}$. There is no way to come to a common "1" between them. Think about it like: the triangle is actually being displayed by tiny square pixels on the computer screen. This is no problem if we have horizontal and vertical lines, but a diagonal line would have to zig-zag.

<img src="img/pixel_triangle.png" width=200>

The idea is that if we kept adding finer and finer zig-zags to the diagonal, the staircase would hug the hypotenuse ever more closely, even though no whole number of pixel-widths ever adds up to its "actual" length: $\sqrt 2$. If we assume that we can always refine our zig-zags, we're making an assumption about continuity, that between any two points, there always lies another point.

Indeed, let's observe that whether the length of something is irrational is somewhat in the eye of the beholder. For example, suppose we set $c = 1$ in $a^{2} + b^{2} = c^{2}$, and assume $a=b$.

$2a^{2} = 1$

$ a = b = \frac{1}{\sqrt 2} $

Imagine instead we'd started with diagonal grid lines. Then the hypotenuse would be "measurable," but the base and height wouldn't be, unless we imagine shrinking the grid lines literally to an infinitesimally small size.

How can we work with the $\sqrt 2$ then? Well, we could treat it as a symbol to which we can do algebra, obviously. But also, we could approximate it.

The following idea works for any $\sqrt n $. We make a starting guess at the $\sqrt n$. Call it $g_{0}$. Now if $g_{0}$ is an over estimate, then $\frac{n}{g_{0}}$ will be an under estimate. Therefore, the average of the two should provide a better approximation.

$g_{1} = \frac{1}{2}(g_{0} + \frac{n}{g_{0}})$

We then make the same argument about $g_{1}$: if it's an overestimate, then $\frac{n}{g_{1}}$ will be an underestimate, and the average is our next guess $g_{2}$. In general, writing $k$ for the step number so that it doesn't collide with the $n$ under the root:

$g_{k} = \frac{1}{2}(g_{k-1} + \frac{n}{g_{k-1}})$

Clearly, the longer we do this, the closer our guess will be to the $\sqrt n$, and we say that the $\sqrt n$ is the unique limit of this procedure. And so we can approximate $\sqrt n$ as closely as we like.

In the case of $\sqrt 2$:

$g_{0} = 1$

$g_{1} = \frac{1}{2}(1 + 2) = \frac{3}{2} = \textbf{1}.5$

$g_{2} = \frac{1}{2}(\frac{3}{2} + \frac{2}{\frac{3}{2}}) = \frac{17}{12} = \textbf{1.41}6\dots$

$g_{3} = \frac{1}{2}(\frac{17}{12} + \frac{2}{\frac{17}{12}}) = \frac{577}{408} = \textbf{1.41421}5\dots$

$g_{4} = \frac{665857}{470832} = \textbf{1.41421356237}46\dots$

And so, we get a series of fractions that gets ever closer to the $\sqrt 2$.
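The averaging procedure is short enough to run directly. Below is a sketch using exact `Fraction` arithmetic so that each guess stays a genuine rational number; the function name `sqrt_guesses` and its parameters are invented for this example.

```python
from fractions import Fraction

def sqrt_guesses(n, g0, steps):
    """Repeatedly average a guess g with n / g, collecting every guess along the way."""
    g = Fraction(g0)
    guesses = [g]
    for _ in range(steps):
        g = (g + n / g) / 2      # if g is too big, n / g is too small, and vice versa
        guesses.append(g)
    return guesses

for g in sqrt_guesses(2, 1, 4):
    print(g, float(g))
```

Starting from $1$, four rounds give $\frac{3}{2}$, $\frac{17}{12}$, $\frac{577}{408}$, and $\frac{665857}{470832}$, with the number of correct digits roughly doubling each time.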

Now, one thing that's nice about having exponents around is we can use a base-numerical system of representation for our numbers, as we just did.

For example, we write $123 = 1 \times 10^{2} + 2 \times 10^{1} + 3 \times 10^{0} = 100 + 20 + 3$.

This is called "base 10." We fix 10 numerals: $0, 1, 2, 3, 4, 5, 6, 7, 8, 9$, and we can then represent any number as an ordered sequence of these numerals, with the understanding that the numerals weight the powers of $10$ as in the above sum. Every whole number will have a unique representation.

$ \dots d_{2}d_{1}d_{0} = \dots d_{2} \times b^{2} + d_{1} \times b^{1} + d_{0} \times b^{0}$, for some base $b$.

We can use decimals to represent fractions:

$ 1.23 = 1 \times 10^0 + 2 \times 10^{-1} + 3 \times 10^{-2}$

In other words, we continue with "negative powers," which are defined like:

$ a^{-b} = \frac{1}{a^b}$

So we have:

$ \dots d_{2}d_{1}d_{0}.d_{-1}d_{-2} \dots = \dots d_{2} \times b^{2} + d_{1} \times b^{1} + d_{0} \times b^{0} + d_{-1} \times b^{-1} + d_{-2} \times b^{-2} \dots$, for some base $b$.
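For whole numbers, going back and forth between a number and its digits in some base $b$ might be sketched as follows. The helper names `to_digits` and `from_digits` are hypothetical.

```python
def to_digits(n, base=10):
    """Digits of a whole number n >= 0 in the given base, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, d = divmod(n, base)   # peel off the lowest digit
        digits.append(d)
    return digits[::-1]

def from_digits(digits, base=10):
    """Rebuild the number: each step multiplies by the base and adds the next digit."""
    total = 0
    for d in digits:
        total = total * base + d
    return total

print(to_digits(123))          # [1, 2, 3]
print(to_digits(7, base=2))    # [1, 1, 1]
print(from_digits([1, 2, 3]))  # 123
```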

Some rational numbers have infinitely long decimal expansions which repeat. For example:

$ \frac{1}{3} = 0.\overline{3} $

Irrational numbers have infinitely long decimal expansions which don't repeat, e. g., $\sqrt{2}$.
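Why must a rational number's expansion either stop or repeat? Long division only ever has finitely many possible remainders, so eventually one of them comes around again. A sketch, assuming $p \geq 0$ and $q > 0$, with the function name `decimal_expansion` made up here:

```python
def decimal_expansion(p, q):
    """Long division of p by q.

    Returns (integer part, non-repeating digits, repeating digits).
    """
    integer, r = divmod(p, q)
    digits, seen = [], {}
    while r != 0 and r not in seen:
        seen[r] = len(digits)    # remember where this remainder first showed up
        r *= 10
        digits.append(r // q)
        r %= q
    if r == 0:
        return integer, digits, []          # the expansion terminates
    start = seen[r]
    return integer, digits[:start], digits[start:]

print(decimal_expansion(1, 3))   # (0, [], [3])
print(decimal_expansion(1, 4))   # (0, [2, 5], [])
print(decimal_expansion(1, 6))   # (0, [1], [6])
```

No such cycle of remainders is available for $\sqrt 2$, since it isn't a fraction to begin with.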

Clearly, infinity is starting to play an important role at this stage of the game, but it's a different kind of infinity than we've met before. Before we dealt with the infinity of the counting numbers. But now we're dealing with a continuous infinity of rationals and irrationals, and this infinity is actually larger.

The famous proof is thanks to Cantor and very nicely it is known as the "diagonal argument."

Now it depends on the fact that you can prove that there are no more rational numbers than counting numbers: in other words, they are the same size of infinity, and can be placed in a 1-to-1 correspondence via an enumeration. I won't give the full proof, but the intuition is that there is the following more or less obvious enumeration of the rationals:

![](img/rational_enumeration.png)

So in what follows, we prove the theorem for counting numbers, but it also applies to the rationals as a whole.

Suppose we're working in base-2, and we try to make a list of all the numbers.

$\begin{matrix} 
0 & 1 & 0 & 0 & 1 & 1 & \dots \\
1 & 1 & 1 & 1 & 0 & 0 & \dots \\
0 & 0 & 1 & 0 & 0 & 1 & \dots \\
1 & 0 & 0 & 0 & 0 & 1 & \dots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\
\end{matrix}$

We think we can do this since it seems that by enumeration we can obtain all possible sequences of 0's and 1's.

$ 0 \rightarrow 0 $

$ 1 \rightarrow 1 $

$ 2 \rightarrow 10 $

$ 3 \rightarrow 11 $

$ 4 \rightarrow 100 $

$ 5 \rightarrow 101 $

$ 6 \rightarrow 110 $

$ 7 \rightarrow 111 $

But now go down the diagonal of our infinite list and flip every $0 \rightarrow 1$ and every $1 \rightarrow 0$.

$\begin{matrix} 
\textbf{1} & 1 & 0 & 0 & 1 & 1 & \dots \\
1 & \textbf{0} & 1 & 1 & 0 & 0 & \dots \\
0 & 0 & \textbf{0} & 0 & 0 & 1 & \dots \\
1 & 0 & 0 & \textbf{1} & 0 & 1 & \dots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\
\end{matrix}$

The sequence of 0's and 1's along the diagonal is different by construction from the first sequence in the list in the first place, the second sequence in the list in the second place, the third sequence in the list in the third place. Therefore this sequence of 0's and 1's is well defined, but it can't be found anywhere on our list which apparently *enumerates* all possible sequences of 0's and 1's. Such a sequence represents an irrational number, and you get a different one for each possible ordering of the counting numbers. This is a proof that the infinity of the "real numbers" is greater than the infinity of the counting numbers/rationals, in the sense that the real numbers can't be *enumerated*. (The argument can be iterated to show there is actually an infinite hierarchy of ever more infinite infinities.)
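The flipping step is easy to try on the finite corner of the list shown earlier. This is only an illustration on four rows; the real argument needs the whole infinite list. The function name `flipped_diagonal` is invented for the sketch.

```python
def flipped_diagonal(rows):
    """Take the i-th entry of the i-th row and flip it, for every row."""
    return [1 - rows[i][i] for i in range(len(rows))]

rows = [
    [0, 1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
]
new = flipped_diagonal(rows)
print(new)   # [1, 0, 0, 1]

# The new sequence disagrees with row i at position i, so it can't equal any row.
print(all(new[i] != rows[i][i] for i in range(len(rows))))
```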

One nice definition of real numbers goes like this. It's based on an idea called the Dedekind cut. We imagine "cutting" the rational number line into two infinite sets $A$ and $B$ such that every element of $A$ is less than every element of $B$, and $A$ has no greatest element. We consider the least element of $B$. If this number is rational, then the cut defines that rational number. But there might be an irrational number which is greater than every element in $A$. Since $A$ has no greatest element, then the first number in $B$ ought to be that irrational number, but instead: there's a gap, since we're working with the rational number line. We fill in that gap: that is the irrational number defined by the "cut". $A$ then contains every rational number less than the cut, and $B$ contains every rational number greater than or equal to the cut.
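For $\sqrt 2$, the set $A$ can be described without ever mentioning an irrational number: it is the negative rationals together with those whose square is less than $2$. The sketch below checks membership and shows that $A$ really has no greatest element. The names `in_lower_set` and `bigger_in_lower_set` are hypothetical, and the formula $\frac{2q + 2}{q + 2}$ is one standard choice, not the only one.

```python
from fractions import Fraction

def in_lower_set(q):
    """True if the rational q belongs to A for the cut that defines sqrt(2)."""
    return q < 0 or q * q < 2

def bigger_in_lower_set(q):
    """Given a rational q >= 0 in A, return a strictly larger rational still in A."""
    return (2 * q + 2) / (q + 2)

q = Fraction(7, 5)
steps = [q]
for _ in range(3):
    q = bigger_in_lower_set(q)
    steps.append(q)

print(steps)                                  # 7/5, 24/17, 41/29, 140/99
print([in_lower_set(s) for s in steps])       # all True
print(in_lower_set(Fraction(3, 2)))           # False: 3/2 belongs to B
```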

Clearly, at this stage, infinity is taking center stage and also a kind of self-referential logic: approximations where we feed outputs reflexively back into inputs, lists we make and then negate to infinity...

The culmination of these ideas can be expressed in terms of computability: the general theory of following definite rules. After all, there are many ways to calculate things, feeding outputs back into inputs... and not all of them correspond to rationals, or even real numbers!

For example, let's build a computer out of fractions. This is an idea due to John Conway, and goes by the name "fractran."

A computer program in fractran is an ordered list of fractions. For example:

![](img/fractran_primes.png)

The input to the program is some integer $n_{0}$. The idea is: you start going through the list and you try multiplying $n$ by the first element in the list. If the denominator cancels out and you get an integer back, then that's the output. You stop, go back to the beginning, and start over with the input to the program being $n_{1}$. If you don't get an integer back, however, then you move on to the second fraction, and so on. If you get to the end of the list without getting an integer back, then the program halts. And that's fractran!

For example, the above program, if the input is $2$, generates a sequence of integers which contains the following powers of 2:

$ 2^{2} \dots 2^{3} \dots 2^{5} \dots 2^{7} \dots 2^{11} \dots $

In other words, it calculates the prime numbers!

A program which adds two integers is given simply by $ \frac{3}{2}$. Given an input $2^{a}3^{b}$, it eventually outputs $3^{a+b}$: it adds the integers! More examples can be found on the Wikipedia page.
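As a rough sketch, the rules above fit in a few lines of Python. The function name `fractran` is my own, and the list `PRIMEGAME` is the commonly quoted form of Conway's prime-generating program, which I'm assuming matches the one pictured above.

```python
from fractions import Fraction
from itertools import islice

def fractran(program, n):
    """Yield n and every integer the program produces from it, until it halts."""
    while True:
        yield n
        for f in program:
            if n % f.denominator == 0:      # the denominator cancels
                n = n * f.numerator // f.denominator
                break                        # go back to the start of the list
        else:
            return                           # no fraction worked: halt

ADDER = [Fraction(3, 2)]
# Assumed to be the program in the figure (the widely quoted PRIMEGAME list).
PRIMEGAME = [Fraction(a, b) for a, b in [
    (17, 91), (78, 85), (19, 51), (23, 38), (29, 33), (77, 29), (95, 23),
    (77, 19), (1, 17), (11, 13), (13, 11), (15, 2), (1, 7), (55, 1)]]

total = list(fractran(ADDER, 2**3 * 3**4))[-1]          # the adder halts
first_terms = list(islice(fractran(PRIMEGAME, 2), 20))  # this one never halts
print(total, first_terms)
```

The adder turns $2^{3}3^{4}$ into $3^{7}$ and stops. The prime program never halts, since its last fraction $\frac{55}{1}$ always applies, so we only peek at the first twenty integers; the twentieth is $4 = 2^{2}$, the first power of 2 after the input.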

The idea behind this is that just as we can write out an integer in base-n representation as an ordered list of numerals, we can also write out an integer as a product of primes.

$60 = 2^{2}3^{1}5^{1}$

In fact, we could really just denote it $211$, where it is understood that the first digit is the exponent of the first prime ($2$), the second digit is the exponent of the second prime ($3$), and so on.

A computer needs a memory, a series of registers. The idea is that each prime acts as a register for our computer, and the value of the register is the exponent of that prime. Multiplying by fractions allows us to shift data between the registers, and these rules are complete for universal classical computation. In other words, this computer can compute anything any other computer can compute. For example, we could have a list of fractions which represents: a computer program that approximates the $\sqrt 2$. 
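Here is a small sketch of the "primes as registers" picture, reading the registers off an integer by repeated division. The helper names `is_prime`, `next_prime` and `registers` are hypothetical, and trial division is used only because it is short.

```python
def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k**0.5) + 1))

def next_prime(p):
    p += 1
    while not is_prime(p):
        p += 1
    return p

def registers(n):
    """Exponents of 2, 3, 5, ... in n, i.e. the contents of each register."""
    exps, p = [], 2
    while n > 1:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
        p = next_prime(p)
    return exps

print(registers(60))            # registers of 2^2 * 3 * 5
print(registers(60 * 3 // 2))   # after multiplying by 3/2
```

Multiplying by $\frac{3}{2}$ takes one unit out of the 2-register and puts it in the 3-register, which is exactly the step the adding program repeats.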

So some computations can be shown to represent irrational numbers: if the more you run the computation, the closer the output gets to a single real number. But of course, not all algorithms "converge" on a number. And in fact, not everything can be computed!
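For instance, here is a sketch of one such converging computation for $\sqrt 2$. The particular rule, $x \rightarrow \frac{1}{2}(x + \frac{2}{x})$, is my choice for illustration (it is the old Babylonian method), and the function name is hypothetical. Every output is a rational number, and none of them squares to exactly 2, but they close in on it very quickly.

```python
from fractions import Fraction

def sqrt2_approximations(steps):
    """Feed each output back in as the next input: x -> (x + 2/x) / 2."""
    x = Fraction(1)
    out = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        out.append(x)
    return out

approx = sqrt2_approximations(5)
print([str(x) for x in approx[:4]])
print(float(approx[-1]))
```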

The famous proof of this is due to Godel/Turing. Godel actually used something quite akin to fractran in his proof, but Turing's argument is easier to understand. The whole thing rests on self-reference: the fact that the instructions in a computer program which computes numbers are also numbers themselves, so that one can write computer programs *whose input is other computer programs*. It's yet another example of a "reductio ad absurdum" argument, and also a diagonal argument all in one!

The question is this. It would be great to have a computer program $H(A, I)$ that takes as input another computer program $A$ and an input $I$, and tells you whether $A$ will halt or run forever on that input. In some cases, it's clear, like if $A$ has an obvious infinite loop, or just returns a constant. But is there a program $H$ that can handle all cases?

Suppose there were. We have some $H(A, I)$ which returns true if $A$ halts, and false if $A$ doesn't, when run on $I$.

Now consider the Godel function $G(A)$, which takes a program $A$ and asks $H$ what $A$ does when fed its own code: if $H(A, A)$ says that $A$ halts, then $G$ loops forever; but if $H(A, A)$ says that $A$ doesn't halt, then $G$ halts. This is akin to going along the diagonal in Cantor's proof and flipping the bits.

Now we consider running $G$ on itself. $H(G, G)$ is supposed to return true if $G$ halts on $G$, and false if $G$ runs forever on $G$. But according to the definition of $G$, if $H$ says that $G$ halts, then $G$ runs forever, and if $H$ says that $G$ runs forever, then $G$ halts! Either way $H$ is wrong. This is a contradiction, and so: there is no program $H$ that takes a computer program and an input and can determine whether the program definitely halts or not on that input.

The moral is that if you want to know if a computer program halts or not, in general you have to run it potentially "forever" and just wait and see if it does actually halt. There may be no shortcut.
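The diagonal trick can be sketched in Python. This is only an illustration under stated assumptions: `make_diagonal` and `always_no` are hypothetical names, and `always_no` is a deliberately bad candidate for $H$ that claims nothing ever halts. The point is that the same recipe defeats any candidate you hand it.

```python
def make_diagonal(halts):
    """Build a program g that does the opposite of what `halts` predicts."""
    def g(program):
        if halts(program, program):
            while True:       # predicted to halt, so loop forever
                pass
        return "halted"       # predicted to loop, so halt immediately
    return g

def always_no(program, data):
    """A (wrong) candidate decider: claims that nothing ever halts."""
    return False

g = make_diagonal(always_no)
print(always_no(g, g))   # the candidate's prediction about g run on g
print(g(g))              # what g run on g actually does
```

The candidate predicts that `g(g)` runs forever, and `g(g)` promptly halts. A candidate that answered the other way would be wrong too, except that we could never finish running the program to show it.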

And so we see that there is an intrinsic limit to computation: there are questions which are in principle *not computable*.

Now Godel originally proved this theorem in terms of mathematical logic. He assumed the axioms which give you the elementary operations of addition, multiplication, subtraction, and division, and the quantifiers like: There exists... and For all... And so he was able to turn statements of mathematical logic into numbers (using an encoding based on primes! Indeed, the primes in their interplay between addition and multiplication are necessary!), so that then statements of mathematical logic could talk about other statements. He then constructed a self-referential statement, one which in effect says of itself that it cannot be proved, so that either proving or disproving it would lead to a contradiction.

The full statement of Godel's Incompleteness Theorems is that:

A system of logic powerful enough to contain basic arithmetic is necessarily incomplete in the sense that there are statements that can neither be proved nor disproved by those rules of logic. One such statement is the consistency of that formal system, i.e. whether the rules of logic lead to a contradiction. So that a system of logic powerful enough to contain arithmetic can't prove its own consistency. It may, in fact, be consistent; but that can only be proven in a different system of logic. Like: you could try to repair your incomplete system of axioms and rules of inference by adding more axioms and rules of inference, and then you might be able to resolve some previously unanswerable questions; but this will lead to yet more unanswerable questions, which can only be resolved by adding more axioms...

In other words, mathematics cannot be reduced down to a single set of axioms, from which all true statements can be derived via the mechanical procedure of following out all the rules of inference. This goes a long way towards answering our original question whether in some sense all composites can be broken down into the same kinds of atoms! But the unpacking of our concept of atom is hardly complete, and many unanticipatable surprises are in store.

To return to our main subject, the real line, which has a third point between any two points, contains rationals, irrationals, even uncomputable numbers!

But sometimes we can have recursively defined algorithms that converge on a certain real number, for example, the $\sqrt 2$. And it is these numbers that we are now taking as our atoms, which each contain within themselves a whole completed infinity "all at once."

Recall that we're now working with exponentiation and root-taking. And actually, there's something we've completely overlooked. What happens if we take $\sqrt{-1}$?

There is no rational (or even real number) such that when you square it, you get $-1$. So as usual, we have to add this number to our system. It's usually called $i = \sqrt{-1}$, because such numbers were originally thought of as "imaginary."

Once we have $i$, we can think of $\sqrt{-7}$ as $i\sqrt 7$.

Note that we have:

$ 1 \times i = i $

$ i \times i = -1 $

$ -1 \times i = -i $

$ -i \times i = 1 $

There's a four-fold repeating pattern. This suggests the interpretation of multiplying by $i$ as a $90^{\circ}$ rotation.

![](img/complex_plane.png)

So actually, we have more than just the real axis! There's a second axis, the imaginary axis, and actually our new kinds of numbers can live anywhere on the resulting plane, called the "complex plane."

So our "complex numbers" can have a real part and an imaginary part, and they can be written $z = a+bi$, which picks out a point on the plane.

![](img/complex_cartesian.gif)

Alternatively, we could use polar coordinates: $z = r(\cos(\theta) + i \sin(\theta))$, where $r$ is the radius and $\theta$ the angle.

<img src="img/eulers_formula.png"  width=300>

Later we'll come to understand why we can also write this $z = re^{i\theta}$, but for now take it as a useful notation.

Complex numbers follow the rules of algebra. You can add them:

$ (a + bi) + (c + di) = (a + c) + i(b + d)$

This just corresponds to laying the arrows represented by the two numbers end to end.

![](img/complex_addition.gif)

Multiplying complex numbers means to stretch the one by the other's length, and rotate by the other's angle.

$ re^{i\theta} \times se^{i\phi} = rs e^{i(\theta + \phi)}$

Finally, we have to mention complex conjugation. Given a complex number $z = a + bi = re^{i\theta} $, its conjugate is $z^{*} = a - bi = re^{-i\theta}$.

<img src="img/complex_conjugate.png" width=200>

Look at what happens when we multiply a complex number by its conjugate.

$zz^{*} = (a + bi)(a - bi) = a^{2} - abi + abi - b^{2}i^{2} = a^{2} + b^{2}$

So if we consider a complex number $z$ to represent a right triangle with sides $a$ and $b$, the length of its hypotenuse is given by $\sqrt{zz^{*}}$.
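These rules are easy to try out, since Python has complex numbers built in (it writes $i$ as `1j`). The following is just a quick check using the standard `cmath` module; the variable names are mine.

```python
import cmath
import math

# Multiplying by i four times walks around the four-fold pattern.
z = 1 + 0j
powers = []
for _ in range(4):
    z *= 1j
    powers.append(z)
print(powers)

# Multiplying in polar form: lengths multiply, angles add.
a = cmath.rect(2, math.pi / 6)    # r = 2, theta = 30 degrees
b = cmath.rect(3, math.pi / 3)    # s = 3, phi = 60 degrees
r, theta = cmath.polar(a * b)
print(r, math.degrees(theta))

# A number times its conjugate gives the squared length.
w = 3 + 4j
length = math.sqrt((w * w.conjugate()).real)
print(length)
```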

If it wasn't clear before, it is now undeniable: there is something actually two-dimensional going on at this stage of the game. In fact, our numbers now have two parts and live on the plane, where they can represent points/arrows, and rotations/stretches thereon.

And this is exactly what we need.

Think about it like this: all this time, we've been providing more and more context to the interpretation of a single pebble. First, we have to decide whether the presence of the pebble or its absence is significant. Then, once we have multiple pebbles, we have to agree where we start counting from. Then, once we have integers, we have to agree on where $0$ is, and so if we should consider the number to be positive or negative. Then, once we have the rational numbers, we have to agree on what is $1$: we need a rational number to translate between our different units. But now at this stage, we need to agree *on the angle between our axes*. And that's just what a complex number can do. It can take a diagonal line, which has to be represented with an "unwritable" $\sqrt 2$ for example, and translate it into "our reference frame" by multiplying by $e^{-i \pi/4}$, which by rotation aligns the diagonal line with the real axis. So each stage in our journey represents another kind of context, another kind of number that we have to give to align our reference frames, so that we can agree with certainty about the meaning of a pebble.

Before we transcend this plane, however, we have one final point to make. What about division by 0? Before, this wrapped up the line into a circle. Now that we have complex numbers, how can we interpret division by 0?

The answer is the Riemann sphere.

![](img/riemann_sphere_brr.jpg)

In the case of the line, there were two infinities, positive and negative. In the plane, the horizon is an infinite circle, and we imagine a point beyond the horizon, and you reach the same point no matter which direction you approach it in: that's the point at infinity. And if it's there, what it means is that our plane is really a sphere: the point $\infty$ is just the point directly opposite you on the sphere.

![](img/stereographic_projection.jpg)

We imagine standing at the North Pole of the sphere, and drawing a straight line from that point to a chosen point on the complex plane. That line will intersect the sphere at one location. If the point on the plane is inside the unit circle, it gets mapped to the Southern Hemisphere, and if the point on the plane is outside the unit circle, it gets mapped to the Northern Hemisphere. All points on the plane are mapped uniquely to a point on the sphere, but we have an extra point left over, and that's the one at the North Pole, the point of projection itself.

In what follows, however, we'll find it most convenient to take our projection from the South Pole. In that case:

If we have a complex number $c = a+bi$ or $\infty$, then:

$ c \rightarrow (x, y, z) = (\frac{2a}{1 + a^{2} + b^{2}}, \frac{2b}{1+a^{2}+b^{2}}, \frac{1 - a^{2} - b^{2}}{1 + a^{2} + b^{2}})$ or $(0, 0, -1)$ if $c = \infty$

And inversely, $(x, y, z) \rightarrow \frac{x}{1+z} + i\frac{y}{1+z}$ or $\infty$ if $(x, y, z) = (0, 0, -1)$.
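As a sanity check, here is a minimal sketch of the two maps in plain Python, projecting from the South Pole as in the formulas above. The function names `to_sphere` and `to_plane` are my own, and `math.inf` stands in for the point $\infty$.

```python
import math

def to_sphere(c):
    """Complex number (or math.inf) -> point (x, y, z) on the unit sphere."""
    if c == math.inf:
        return (0.0, 0.0, -1.0)
    a, b = c.real, c.imag
    d = 1 + a * a + b * b
    return (2 * a / d, 2 * b / d, (1 - a * a - b * b) / d)

def to_plane(x, y, z):
    """Point on the unit sphere -> complex number (or math.inf at the South Pole)."""
    if math.isclose(z, -1.0):
        return math.inf
    return complex(x / (1 + z), y / (1 + z))

c = 0.3 - 1.7j
point = to_sphere(c)
print(point, sum(t * t for t in point))   # the point has unit length
print(to_plane(*point))                    # and maps back to c
print(to_sphere(0j), to_sphere(1 + 0j), to_sphere(math.inf))
```

With this convention $0$ sits at the North Pole, the unit circle is the equator, and $\infty$ is the South Pole.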

And so, we discover that our numbers at this stage are actually: points on a sphere, picking out a direction in 3D space. At this level, such a number is a composite, a completed infinity defined by rational numbers. And we can count, add, subtract, multiply, divide, exponentiate, and take roots with wild abandon.

<hr>

At this stage, we should expect that the complex numbers will be our atoms, and we'll form composites out of them. What are these composites? You've probably heard of them before: they are *polynomials*. Indeed, we will now take for our composites "equations" themselves.

Recall that a polynomial is defined quite like a "base-n" number, but what before was the "base" is now a *variable*. The $c_{n}$ are generally complex coefficients.

$ f(z) = \dots + c_{4}z^{4} +  c_{3}z^{3} + c_{2}z^{2} + c_{1}z + c_{0} $

A classic problem is to solve for the "roots" of $f(z)$. These are the values of $z$ that make $f(z) = 0$.

The degree of a polynomial is the highest power of the variable that appears within it. It turns out that a degree $n$ polynomial has exactly $n$ complex roots, counted with multiplicity. A polynomial can therefore be factored into roots, just as a whole number can be factored into primes.

$ f(z) =  \dots + c_{4}z^{4} +  c_{3}z^{3} + c_{2}z^{2} + c_{1}z + c_{0} = (z - \alpha_{0})(z - \alpha_{1})(z - \alpha_{2})\dots $

The only catch is that the roots of the polynomial $f(z)$ are left invariant if you multiply the whole polynomial by any complex number. So the roots define the polynomial up to multiplication by a complex "scalar." 

This fact is called the fundamental theorem of algebra, just as the fact that whole numbers can be uniquely decomposed into primes is called the fundamental theorem of arithmetic. Interestingly, the proof doesn't require much more than some geometry. Here's a brief sketch:

So suppose we have some polynomial $ f(z) = c_{n}z^n + \dots + c_{4}z^{4} +  c_{3}z^{3} + c_{2}z^{2} + c_{1}z + c_{0}$. Now suppose that we take $z$ to be very large. For a very large $z$ the difference between $z^{n-1}$ and $z^n$ is considerable, and we can say that the polynomial is dominated by the $c_{n}z^n$ term. Now imagine tracing out a big circle in the complex plane corresponding to different values of $z$. If we look at $f(z)$, it'll similarly wind around a circle $n$ times as fast (with a little wiggling as it does so corresponding to the other tiny terms). Now imagine shrinking the $z$ circle down smaller and smaller until $z=0$. But $f(0) = c_{0}$, which is just the constant term. So as the "input" circle shrinks down to 0, the "output" circle shrinks down to $c_{0}$. To get there, however, the shrinking circle in the output plane must have passed the origin at some point, and therefore the polynomial has at least one root. You factor that root out of the polynomial leading to a polynomial of one less degree, and repeat the argument, until there are no more roots left. Therefore, a degree $n$ polynomial has exactly $n$ roots in the complex numbers.
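The winding in this argument can be watched numerically. The sketch below, with hypothetical helper names `poly_eval` and `winding_number`, walks $z$ once around a circle of a given radius and counts how many times $f(z)$ goes around the origin. It assumes no root lies exactly on the circle, and that the circle is sampled finely enough that $f(z)$ never turns by more than half a revolution between samples.

```python
import cmath
import math

def poly_eval(coeffs, z):
    """Horner evaluation; coeffs are listed from the highest power down."""
    total = 0j
    for c in coeffs:
        total = total * z + c
    return total

def winding_number(coeffs, radius, samples=2000):
    """Number of times f(z) circles the origin as z goes once around |z| = radius."""
    total = 0.0
    prev = poly_eval(coeffs, radius)
    for k in range(1, samples + 1):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        cur = poly_eval(coeffs, z)
        total += cmath.phase(cur / prev)   # small change in angle of f(z)
        prev = cur
    return round(total / (2 * math.pi))

cubic = [1, 0, 0, -1]                      # f(z) = z^3 - 1
print(winding_number(cubic, 2.0))          # big circle
print(winding_number(cubic, 0.5))          # small circle
```

On the big circle the output winds three times, once for each unit of degree; on the small circle it just hovers near the constant term and doesn't wind at all. Somewhere in between, the output curve had to cross the origin.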

![](img/fundamental_theorem_of_algebra.png)


This is related to the fact that the complex numbers represent the algebraic closure of the elementary operations of arithmetic. One could consider other number fields: the real numbers, for instance.

It's worth noting "Vieta's formulas" which relate the roots to the coefficients:

Given a polynomial $f(z) = c_{n}z^n + \dots + c_{4}z^{4} +  c_{3}z^{3} + c_{2}z^{2} + c_{1}z + c_{0} = c_{n}(z - \alpha_{0})(z - \alpha_{1})(z - \alpha_{2})\dots(z - \alpha_{n-1}) $, we find:

$ 1 = \frac{c_{n}}{c_{n}}$ 

$ \alpha_{0} + \alpha_{1} + \alpha_{2} + \dots + \alpha_{n-1} = -\frac{c_{n-1}}{c_{n}}$

$ \alpha_{0}\alpha_{1} + \alpha_{0}\alpha_{2} + \dots + \alpha_{1}\alpha_{2} + \dots + \alpha_{2}\alpha_{3} \dots = \frac{c_{n-2}}{c_{n}}$

$ \alpha_{0}\alpha_{1}\alpha_{2} + \alpha_{0}\alpha_{1}\alpha_{3} + \dots + \alpha_{1}\alpha_{2}\alpha_{3} + \dots + \alpha_{2}\alpha_{3}\alpha_{4} \dots = - \frac{c_{n-3}}{c_{n}}$

$ \vdots $

$ \alpha_{0}\alpha_{1}\alpha_{2}\dots\alpha_{n-1} = (-1)^{n}\frac{c_{0}}{c_{n}} $

In other words, the $c_{n-1}$ coefficient is the sum of roots taken one at a time times $(-1)^{1}$, the $c_{n-2}$ coefficient is given by the sum of the products of the roots taken two at a time times $(-1)^{2}$, and so on, until the constant term is given by the product of the roots times $(-1)^{n}$, in other words the $n$ roots taken all at a time. We divide out by $c_{n}$ as the roots only define the polynomial up to multiplication by a complex scalar.

In other words, the relationship between coefficients and roots is "holistic"--just like the relationship between a composite whole number and its prime factors, the latter of which are contained "unordered" within it.
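Vieta's formulas are easy to check by brute force. In this sketch (the function names are mine), `expand` multiplies out the factors $(z - \alpha)$ one at a time, and `symmetric_sum` adds up the products of the roots taken $k$ at a time; the two should agree up to the sign $(-1)^{k}$.

```python
from itertools import combinations
from math import prod

def expand(roots):
    """Coefficients, highest power first, of the monic polynomial with these roots."""
    coeffs = [1]
    for r in roots:
        # Multiply the polynomial so far by (z - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def symmetric_sum(roots, k):
    """Sum of the products of the roots taken k at a time."""
    return sum(prod(group) for group in combinations(roots, k))

roots = [1, 2, 3]
coeffs = expand(roots)    # coeffs[k] is the coefficient c_{n-k}
n = len(roots)
for k in range(1, n + 1):
    print(k, symmetric_sum(roots, k), (-1) ** k * coeffs[k] / coeffs[0])
```

Notice that shuffling the list of roots changes nothing: the coefficients only see the roots as an unordered collection.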

Now an important point is that polynomials form a "vector space." A vector space consists of multidimensional arrows called "vectors" and "scalars" which are just normal numbers. You can multiply vectors by scalars, and add them up, and you'll always get a vector in the vector space. Implicitly, we've worked with real vector spaces when we used $(x, y)$ and $(x, y, z)$ coordinates. Here, we're working with complex vector spaces. (Note we could use other "division algebras" for our scalars: reals, complex numbers, quaternions, or octonions, but complex numbers give us all we need.) But just like real vector spaces, the point is that we can expand any vector in some "basis." For real 3D vectors, we often use $(1,0,0)$, $(0,1,0)$, and $(0,0,1)$. The idea is that any 3D point can be written as a linear combination of these three vectors, which are orthonormal: they are at right angles, and of unit length. 

E.g. $(x, y, z) = x(1,0,0) + y(0,1,0) + z(0,0,1)$. Any set of linearly independent vectors can form a basis, although we'll stick to using orthonormal vectors.

In real vector spaces, the length of a vector is given by $\sqrt{v \cdot v}$, where $v \cdot v$ is the inner product: you multiply the entries of the two vectors pairwise and sum. If you take $u \cdot v$, this quantity will be 0 if the vectors are at right angles. For complex vector spaces, we use the bracket notation $\langle v \mid v \rangle$, where recall that the $\langle v \mid$ is a bra, which is the complex conjugated row vector corresponding to the column vector $\mid v \rangle$. 

Already this is looking like quantum mechanics. Indeed, in what follows, we'll be using polynomials to represent quantum spin states. You should note that while the exposition has been more or less self-contained so far, in what follows we'll be assuming knowledge from the previous essays.

For example, we've already alluded to the fact that instead of working with $\mathbb{C} + \infty$, we can work in the two dimensional complex projective space. Given some $\alpha$ which can be a complex number or infinity:

$\alpha \rightarrow \begin{pmatrix} 1 \\ \alpha \end{pmatrix}$ or $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ if $\alpha = \infty$.

In reverse:

$ \begin{pmatrix} a \\ b \end{pmatrix} \rightarrow \frac{b}{a}$ or $\infty$ if $a=0$.

We're free to multiply this complex vector by any complex number and this won't change the root, so we can always normalize the vector so its length is 1. Then it represents the state of a qubit, a spin-$\frac{1}{2}$ particle, consisting of an average spin-axis (a point on the sphere), and a complex phase. Supposing we've quantized along the Z-axis, then $aa^{*}$ is the probability of measuring this qubit to be $\uparrow$ along the Z direction, and $bb^{*}$ is the probability of measuring it to be $\downarrow$. ("Quantized along the Z-axis" just means that we take eigenstates of the Pauli Z operator to be our basis states, so that each component of our complex vector weights one of those eigenstates. All Hermitian matrices have eigenvectors which form an orthogonal basis.)
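Here is a small sketch of this back-and-forth in plain Python, with hypothetical function names: a complex number (or `math.inf`) is upgraded to a normalized two-component vector, the two probabilities are read off, and the original number is recovered as the ratio of the components even after the vector is multiplied by a phase.

```python
import math

def to_vector(alpha):
    """Complex number (or math.inf) -> normalized two-component vector (a, b)."""
    if alpha == math.inf:
        return (0j, 1 + 0j)
    norm = math.sqrt(1 + abs(alpha) ** 2)
    return (complex(1 / norm), alpha / norm)

def to_number(a, b):
    """Two-component vector -> the ratio b/a, or math.inf if a is zero."""
    return math.inf if a == 0 else b / a

def probabilities(a, b):
    """Probabilities of measuring up and down along Z."""
    return (a * a.conjugate()).real, (b * b.conjugate()).real

a, b = to_vector(1j)
print(probabilities(a, b))

phase = complex(math.cos(0.3), math.sin(0.3))
print(to_number(a, b), to_number(phase * a, phase * b))
```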

Now we can see where this comes from. It's the difference between considering a "root" vs a "monomial".

Suppose we have a polynomial with a single root, a monomial: $f(z) = c_{1}z + c_{0}$. Let's solve it:

$ 0 = c_{1}z + c_{0}$

$ z = -\frac{c_{0}}{c_{1}} $

Indeed, we could have written $f(z) = (z + c_{0}/c_{1})$. The point is that the monomial can be identified with its root up to multiplication by any complex number--just like our complex projective vector!

So if we had some complex number $\alpha$, which we upgraded to a complex projective vector, we could turn it into a monomial by remembering about that negative sign:

$\alpha \rightarrow \begin{pmatrix} 1 \\ \alpha  \end{pmatrix}  \rightarrow f(z) = z -\alpha  $

$\begin{pmatrix} c_{1} \\ c_{0} \end{pmatrix} \rightarrow f(z) = c_{1}z - c_{0} \rightarrow \frac{c_{0}}{c_{1}} $


But what if $\alpha = \infty$? Above, we suggested that this should get mapped to the vector $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.

Interpreted as a monomial this says that $f(z) = 0z - 1 = -1$. Which has no roots! Indeed, we've reduced the polynomial by a degree: from degree 1 to degree 0. However, it makes a lot of sense to interpret this polynomial as having a root: $\infty$.

So we could just tack on the rule that if we lose a degree, we add a root at infinity.

But there's a more systematic way to deal with this. We *homogenize* our polynomial. In other words, we add a second variable so that each term in the resulting two-variable polynomial has the same degree. In this case, we want to do something like:

$f(z) = c_{1}z + c_{0} \rightarrow f(w, z) = c_{1}z + c_{0}w$

Suppose we want $f(w, z)$ to have a root $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$. In other words, we want $f(1, 0) = 0$. Then we should want $f(w, z) = 1z + 0w = z$, which has a root when $z=0$ and $w$ is anything. If we want $f(w, z)$ to have a root $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, aka $f(0, 1) = 0$, then $f(w, z) = 0z + 1w = w$, which has a root when $w=0$ and $z$ is anything. So we have it now that a $z$-root lives at the North Pole, and a $w$-root lives at the South Pole.

We also have to consider the sign. If we want $f(w, z)$ to have a root $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$, aka $f(2, 3) = 0$, then we want $f(w, z) = 2z - 3w$, so that $f(2, 3) = 2(3) - 3(2) = 0$. 

So the correct homogeneous polynomial with root $\begin{pmatrix} c_{1} \\ c_{0} \end{pmatrix}$ is $f(w, z) = c_{1}z - c_{0}w$. Or the other way around, if we have $f(w, z) = c_{1}z + c_{0}w$, then its root will be $\begin{pmatrix} c_{1} \\ -c_{0} \end{pmatrix}$, up to overall sign.

Now it turns out that this construction generalizes to any spin-$j$, not just a spin-$\frac{1}{2}$. If a degree 1 polynomial represents a spin-$\frac{1}{2}$ state as a 2d complex vector, a degree 2 polynomial represents a spin-$1$ state as a 3d complex vector, a degree 3 polynomial represents a spin-$\frac{3}{2}$ state as a 4d complex vector, and so on. Considering the roots, instead of the coefficients, this is saying that up to a complex phase, a spin-$\frac{1}{2}$ state can be identified with a point on the sphere, a spin-$1$ state with two points on the sphere, a spin-$\frac{3}{2}$ state with three points on the sphere.

This construction is due to Ettore Majorana, and is often known as the "stellar representation" of spin, and the points on the sphere are often referred to as "stars." The theory of quantum spin, therefore, becomes in many ways the theory of "constellations on the sphere."

So for example, we could have a degree 2 polynomial. For simplicity, let's consider one that has two roots at $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$.

If before we had $f(w,z) = 2z - 3w$, we now want $f(w,z) = (2z - 3w)^2 = (2z - 3w)(2z - 3w) = 4z^2 - 12zw + 9w^2$.

If we want a polynomial with two roots at $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$, we want $f(w, z) = z^2$. If we want a polynomial with two roots at $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, we want $f(w, z) = w^2$. If we want a polynomial with one root at $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and one root at $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, we want $f(w, z) = zw$.

Indeed, we can use these last three as basis states. Let's take a look:

$
\begin{array}{ |c|c|c|c|c|c| }
 \hline
 z^{2} & z & 1 & & & \\
 \hline
 z^{2} & zw & w^{2} & \text{polynomial} & \text{roots as vectors} & \text{roots as numbers} \\
 \hline
 \hline
 1 & 0 & 0 & \rightarrow f(w, z) = z^{2} = 0 & \{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \end{pmatrix} \} & \{ 0, 0 \}\\
 0 & 1 & 0 & \rightarrow f(w, z) = zw = 0 & \{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \} & \{ 0, \infty \}\\
 0 & 0 & 1 & \rightarrow f(w, z) = w^{2} = 0 & \{ \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}\} & \{ \infty, \infty\}\\
 \hline
\end{array}
$


So we have three basis states $z^{2}$, $zw$, and $w^{2}$, and they correspond to three constellations, albeit simple ones. The first constellation has two stars at the North Pole. The second has one star at the North Pole and one star at the South Pole. And the third has two stars at the South Pole.

Now the interesting fact is that *any 2-star constellation can be written as a complex superposition of these three constellations*. The stars are just given by the roots of the homogeneous polynomial (or the single variable polynomial with the rule about roots at infinity).
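For two stars this can be checked directly with the quadratic formula. The sketch below uses a hypothetical function `constellation`, which takes the three coefficients of $c_{2}z^{2} + c_{1}zw + c_{0}w^{2}$ (assumed not all zero) and returns the two roots as numbers $z/w$, counting a root at `inf` for each degree lost.

```python
import cmath
from math import inf

def constellation(c2, c1, c0):
    """Roots z/w of c2*z^2 + c1*z*w + c0*w^2, with inf for each lost degree."""
    if c2 == 0 and c1 == 0:
        return [inf, inf]              # the polynomial is just w^2
    if c2 == 0:
        return [inf, -c0 / c1]         # one star at infinity, one ordinary root
    disc = cmath.sqrt(c1 * c1 - 4 * c2 * c0)
    return [(-c1 + disc) / (2 * c2), (-c1 - disc) / (2 * c2)]

print(constellation(1, 0, 0))     # z^2
print(constellation(0, 1, 0))     # zw
print(constellation(0, 0, 1))     # w^2
print(constellation(4, -12, 9))   # (2z - 3w)^2
print(constellation(1, 0, 1))     # z^2 + w^2, a superposition of the first and third
```

The last line is the interesting one: adding the "two stars at the North Pole" state to the "two stars at the South Pole" state gives two stars on the equator, at $\pm i$.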

There's one more subtlety, however, we need to take into account to be consistent with how the X, Y, Z operators are defined. The full story is this:

If we have a spin state in the usual $\mid j, m \rangle$ representation, quantized along the Z-axis, we can express it as an n-dimensional ket, where $n = 2j + 1$. E.g., if $j = \frac{1}{2}$, the dimension of the representation is 2; for $j = 1$, the dimension is 3, and so on.

$\begin{pmatrix} a_{0} \\ a_{1} \\ a_{2} \\ \vdots \\ a_{n-1} \end{pmatrix}$

Then the polynomial whose 2j roots correspond to the correct stars, taking into account all the secret negative signs, and the various conventions for the spin matrices, is given by:

$p(z) = \sum_{m=-j}^{m=j} (-1)^{j+m} \sqrt{\frac{(2j)!}{(j-m)!(j+m)!}} a_{j+m} z^{j-m}$.

Equivalently, re-indexing with $i = j + m$:

$p(z) = \sum_{i=0}^{i=2j} (-1)^{i} \sqrt{\begin{pmatrix} 2j \\ i \end{pmatrix}} a_{i} z^{2j-i}$, where $\begin{pmatrix} n \\ k \end{pmatrix}$ is the binomial coefficient aka $\frac{n!}{k!(n-k)!}$.

Or homogeneously, $p(w, z) = \sum_{i=0}^{i=2j} (-1)^{i} \sqrt{\begin{pmatrix} 2j \\ i \end{pmatrix}} a_{i} z^{2j-i} w^{i}$.

This is known as the Majorana polynomial. Why does the binomial coefficient come into play? I'll just note that $\begin{pmatrix} 2j \\ i \end{pmatrix}$ is the number of groupings of 2j roots taken 0 at a time, 1 at a time, 2 at a time, 3 at a time, eventually 2j at a time. That's just the number of terms in each of Vieta's formulas, which relate the roots to the coefficients! So when we go from a polynomial $\rightarrow$ the $\mid j, m \rangle$ state, we're normalizing each coefficient by the number of terms that contribute to that coefficient.
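Before bringing in any libraries, here is a bare-bones sketch of the formula using only the standard library. The function names are hypothetical, and `single_star` only handles the spin-$\frac{1}{2}$ case, where the polynomial has degree 1 and its root can be written down directly.

```python
from math import comb, sqrt, inf

def majorana_coeffs(state):
    """Coefficients of p(z), highest power first, for amplitudes a_0 .. a_{2j}."""
    n = len(state) - 1                      # n = 2j
    return [(-1) ** i * sqrt(comb(n, i)) * a for i, a in enumerate(state)]

def single_star(state):
    """Root of the degree-1 polynomial of a spin-1/2 state (inf if the degree drops)."""
    c1, c0 = majorana_coeffs(state)
    return inf if c1 == 0 else -c0 / c1

up, down = [1, 0], [0, 1]
x_plus = [1 / sqrt(2), 1 / sqrt(2)]
print(single_star(up), single_star(down), single_star(x_plus))
print(majorana_coeffs([0, 1, 0]))
```

The $\uparrow$ state has its star at $0$ (the North Pole), the $\downarrow$ state at $\infty$ (the South Pole), and the equal superposition at $1$, which is the point $(1, 0, 0)$ on the sphere: the positive X direction. The spin-$1$ state $(0, 1, 0)$ gives a multiple of $z$ alone, so it has one root at $0$ and, having lost a degree, one at $\infty$.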

In [None]:
# Displays the eigenstates of the X, Y, Z operators 
# for a given value of j and the associated constellations.
# X, Y, Z are arranged left to right, and their height is given by their m value.
from math import comb

import numpy as np
import qutip as qt
import vpython as vp
np.set_printoptions(precision=3)

scene = vp.canvas(background=vp.color.white)

##########################################################################################

# Stereographic projection from the south pole:
# a complex number (or infinity) -> a point on the unit sphere.
def c_xyz(c):
    if c == float("Inf"):
        return np.array([0,0,-1])
    x, y = c.real, c.imag
    d = 1 + x**2 + y**2
    return np.array([2*x/d, 2*y/d, (1-x**2-y**2)/d])

# np.roots takes: p[0] * x**n + p[1] * x**(n-1) + ... + p[n-1]*x + p[n]
# Each (numerically) vanishing leading coefficient lowers the degree by one
# and is counted as a root at infinity.
def poly_roots(poly):
    head_zeros = 0
    for c in poly:
        if np.isclose(c, 0):
            head_zeros += 1
        else:
            break
    return [float("Inf")]*head_zeros + [complex(root) for root in np.roots(poly[head_zeros:])]

# Coefficients of the Majorana polynomial, highest power of z first.
# math.comb replaces np.math.factorial, which newer NumPy releases no longer provide.
def spin_poly(spin):
    n = spin.shape[0]-1  # n = 2j
    return [((-1)**i)*np.sqrt(comb(n, i))*spin[i] for i in range(n+1)]

def spin_XYZ(spin):
    return [c_xyz(root) for root in poly_roots(spin_poly(spin))]

##########################################################################################

def display(spin, where):
    j = (spin.shape[0]-1)/2
    vsphere = vp.sphere(color=vp.color.blue,\
                        opacity=0.5,\
                        pos=where)
    vstars = [vp.sphere(emissive=True,\
                        radius=0.3,\
                        pos=vsphere.pos+vp.vector(*xyz))\
                            for i, xyz in enumerate(spin_XYZ(spin.full().T[0]))]
    varrow = vp.arrow(pos=vsphere.pos,\
                      axis=vp.vector(qt.expect(qt.jmat(j, 'x'), spin),\
                                     qt.expect(qt.jmat(j, 'y'), spin),\
                                     qt.expect(qt.jmat(j, 'z'), spin)))
    return vsphere, vstars, varrow

##########################################################################################

j = 1/2
XYZ = {"X": qt.jmat(j, 'x'),\
       "Y": qt.jmat(j, 'y'),\
       "Z": qt.jmat(j, 'z')}
for i, o in enumerate(["X", "Y", "Z"]):
    L, V = XYZ[o].eigenstates()
    for k, v in enumerate(V):  # k indexes the eigenstates, so the spin j is not overwritten
        display(v, vp.vector(3*i,3*L[k],0))
        spin = v.full().T[0]
        print("%s(%.2f):" % (o, L[k]))
        print("\t|j, m> = %s" % spin)
        poly_str = " + ".join(["(%.1f+%.1fi)z^%d" % (c.real, c.imag, len(spin)-p-1) for p, c in enumerate(spin_poly(spin))])
        print("\tpoly = %s" % poly_str)
        print("\troots = ")
        for root in poly_roots(spin_poly(spin)):
            print("\t  %s" % root)
        print()

So we can see that the eigenstates of the X, Y, Z operators for a given spin-$j$ representation correspond to constellations with 2j stars, and there are 2j+1 such constellations, one for each eigenstate, and they correspond to the following simple constellations:

All the stars at, say, Y+, and none at Y-;
then all but one star at Y+, and one star at Y-;
then all but two stars at Y+, and two stars at Y-;
then all but three stars at Y+, and three stars at Y-;
until you get to all the stars at Y-. 

We could choose X, Y, Z or some combination thereof to be the axis we quantize along, but generally we'll choose the Z axis. 

And the remarkable fact is that any constellation of 2j stars can be written as a superposition of these basic constellations. We simply find the roots of the corresponding polynomial. And you can check, for instance, that the whole constellation rotates rigidly around the X axis when you evolve the spin state with $e^{iXt}$, for example. (And $e^{Xt}$ corresponds to a boost!) It is fun to watch the constellation evolve under some arbitrary Hamiltonian: the stars swirl around, permuting among themselves, seeming to repel each other like little charged particles.
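For a single star the rotation claim can be checked by hand, without any libraries. This sketch assumes a spin-$\frac{1}{2}$, for which $X$ is half the Pauli matrix $\sigma_{x}$, so that $e^{iXt} = \cos(t/2) + i\sin(t/2)\sigma_{x}$. The function names are mine, and `star` uses the same South Pole projection as before.

```python
import math

def rotate_about_x(state, t):
    """Apply exp(i X t), with X = sigma_x / 2, to a spin-1/2 state (a, b)."""
    a, b = state
    c, s = math.cos(t / 2), math.sin(t / 2)
    return (c * a + 1j * s * b, 1j * s * a + c * b)

def star(state):
    """The point on the sphere for the root b/a, projecting from the South Pole."""
    a, b = state
    if a == 0:
        return (0.0, 0.0, -1.0)
    r = b / a
    d = 1 + abs(r) ** 2
    return (2 * r.real / d, 2 * r.imag / d, (1 - abs(r) ** 2) / d)

up = (1 + 0j, 0j)
for t in [0.0, 0.7, 1.4]:
    print(t, star(rotate_about_x(up, t)))
```

Starting from the North Pole, the star stays in the plane $x = 0$ and sweeps through $(0, \sin t, \cos t)$: a rotation about the X axis by the angle $t$.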

Below, you can display a random spin-$j$ state and evolve it under some random Hamiltonian and watch its constellation evolve. Meanwhile, you can see to the side the Z-basis constellations, and the amplitudes corresponding to them: in other words, the yellow arrows are the components of the spin vector in the Z basis.

It's worth noting: if the state is an eigenstate of the Hamiltonian, then the constellation doesn't change: there's only a phase evolution. Perturbing the state slightly from that eigenstate will cause the stars to precess around their former locations. Further perturbations will eventually cause the stars to begin to swap places, until eventually it becomes visually unclear which point they had been precessing around.

In [None]:
from math import comb

import numpy as np
import qutip as qt
import vpython as vp
scene = vp.canvas(background=vp.color.white)

##########################################################################################

# Stereographic projection from the south pole:
# a complex number (or infinity) -> a point on the unit sphere.
def c_xyz(c):
    if c == float("Inf"):
        return np.array([0,0,-1])
    x, y = c.real, c.imag
    d = 1 + x**2 + y**2
    return np.array([2*x/d, 2*y/d, (1-x**2-y**2)/d])

# np.roots takes: p[0] * x**n + p[1] * x**(n-1) + ... + p[n-1]*x + p[n]
# Each (numerically) vanishing leading coefficient lowers the degree by one
# and is counted as a root at infinity.
def poly_roots(poly):
    head_zeros = 0
    for c in poly:
        if np.isclose(c, 0):
            head_zeros += 1
        else:
            break
    return [float("Inf")]*head_zeros + [complex(root) for root in np.roots(poly[head_zeros:])]

# Coefficients of the Majorana polynomial, highest power of z first.
# math.comb replaces np.math.factorial, which newer NumPy releases no longer provide.
def spin_poly(spin):
    n = spin.shape[0]-1  # n = 2j
    return [((-1)**i)*np.sqrt(comb(n, i))*spin[i] for i in range(n+1)]

def spin_XYZ(spin):
    return [c_xyz(root) for root in poly_roots(spin_poly(spin))]

##########################################################################################

def display(spin, where, radius=1):
    j = (spin.shape[0]-1)/2
    vsphere = vp.sphere(color=vp.color.blue,\
                        opacity=0.5,\
                        radius=radius,
                        pos=where)
    vstars = [vp.sphere(emissive=True,\
                        radius=radius*0.3,\
                        pos=vsphere.pos+vsphere.radius*vp.vector(*xyz))\
                            for i, xyz in enumerate(spin_XYZ(spin.full().T[0]))]
    varrow = vp.arrow(pos=vsphere.pos,\
                      axis=vsphere.radius*vp.vector(qt.expect(qt.jmat(j, 'x'), spin),\
                                                    qt.expect(qt.jmat(j, 'y'), spin),\
                                                    qt.expect(qt.jmat(j, 'z'), spin)))
    return vsphere, vstars, varrow

def update(spin, vsphere, vstars, varrow):
    j = (spin.shape[0]-1)/2
    for i, xyz in enumerate(spin_XYZ(spin.full().T[0])):
        vstars[i].pos = vsphere.pos+vsphere.radius*vp.vector(*xyz)
    varrow.axis = vsphere.radius*vp.vector(qt.expect(qt.jmat(j, 'x'), spin),\
                                           qt.expect(qt.jmat(j, 'y'), spin),\
                                           qt.expect(qt.jmat(j, 'z'), spin))
    return vsphere, vstars, varrow

##########################################################################################

j = 3/2
n = int(2*j+1)
dt = 0.001
XYZ = {"X": qt.jmat(j, 'x'),\
       "Y": qt.jmat(j, 'y'),\
       "Z": qt.jmat(j, 'z')}
state = qt.rand_ket(n)#qt.basis(n, 0)#
H = qt.rand_herm(n)#qt.jmat(j, 'x')#qt.rand_herm(n)#
U = (-1j*H*dt).expm()

vsphere, vstars, varrow = display(state, vp.vector(0,0,0), radius=2)

ZL, ZV = qt.jmat(j, 'z').eigenstates()
vamps = []
for i, v in enumerate(ZV):
    display(v, vp.vector(4, 2*ZL[i], 0), radius=0.5)
    amp = state.overlap(v)
    vamps.append(vp.arrow(color=vp.color.yellow, pos=vp.vector(3, 2*ZL[i], 0),\
                            axis=vp.vector(amp.real, amp.imag, 0)))

T = 100000
for t in range(T):
    state = U*state
    update(state, vsphere, vstars, varrow)
    for i, vamp in enumerate(vamps):
        amp = state.overlap(ZV[i])
        vamp.axis = vp.vector(amp.real, amp.imag, 0)
        vp.rate(2000)

Finally, let's check out our "group equivariance."

In other words, we could treat our function $f(w, z)$ as a function that takes a spinor/qubit/2d complex projective vector/spin-$\frac{1}{2}$ state as input. So we'll write $f(\psi_{little})$, and it's understood that we plug the first component of $\psi_{little}$ in for $w$ and the second component in for $z$. And we'll say $f(\psi_{little}) \rightarrow \psi_{big}$, where $\psi_{big}$ is the corresponding $\mid j, m \rangle$ state of our spin-$j$. If $\psi_{little}$ is a root, then if we rotate it around some axis, and also rotate $\psi_{big}$ the same amount around the same axis, then $U_{little}\psi_{little}$ should be a root of $U_{big}\psi_{big}$.


In [None]:
import numpy as np
import qutip as qt
import vpython as vp
scene = vp.canvas(background=vp.color.white)

##########################################################################################

# from the south pole
def c_xyz(c):
        if c == float("Inf"):
            return np.array([0,0,-1])
        else:
            x, y = c.real, c.imag
            return np.array([2*x/(1 + x**2 + y**2),\
                             2*y/(1 + x**2 + y**2),\
                   (1-x**2-y**2)/(1 + x**2 + y**2)])

# np.roots takes: p[0] * x**n + p[1] * x**(n-1) + ... + p[n-1]*x + p[n]
def poly_roots(poly):
    head_zeros = 0
    for c in poly:
        if c == 0:
            head_zeros += 1 
        else:
            break
    return [float("Inf")]*head_zeros + [complex(root) for root in np.roots(poly)]

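import math  # np.math was removed in NumPy 2.0; the standard library module is used instead

def spin_poly(spin):
    j = (spin.shape[0]-1)/2.
    v = spin if type(spin) != qt.Qobj else spin.full().T[0]
    poly = []
    for m in np.arange(-j, j+1, 1):
        i = int(m+j)
        # weight each component by (-1)^i * sqrt((2j)! / ((j-m)! (j+m)!))
        poly.append(v[i]*\
            (((-1)**(i))*np.sqrt(math.factorial(int(round(2*j)))/\
                        (math.factorial(int(round(j-m)))*math.factorial(int(round(j+m)))))))
    return poly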

def spin_XYZ(spin):
    return [c_xyz(root) for root in poly_roots(spin_poly(spin))]

def spin_homog(spin):
    n = spin.shape[0]
    print("".join(["(%.2f + %.2fi)z^%dw^%d + " % (c.real, c.imag, n-i-1, i) for i, c in enumerate(spin_poly(spin))])[:-2])
    def hom(spinor):
        w, z = spinor.full().T[0]
        return sum([c*(z**(n-i-1))*(w**(i)) for i, c in enumerate(spin_poly(spin))])
    return hom

def c_spinor(c):
    if c == float('Inf'):
        return qt.Qobj(np.array([0,1]))
    else:
        return qt.Qobj(np.array([1,c])).unit()

def spin_spinors(spin):
    return [c_spinor(root) for root in poly_roots(spin_poly(spin))]

##########################################################################################

j = 3/2
n = int(2*j+1)
spin = qt.rand_ket(n)
print(spin)
h = spin_homog(spin)
spinors = spin_spinors(spin)

for spinor in spinors:
    print(h(spinor))
print()

dt = 0.5
littleX = (-1j*qt.jmat(0.5, 'x')*dt).expm()
bigX = (-1j*qt.jmat(j, 'x')*dt).expm()

spinors2 = [littleX*spinor for spinor in spinors]
spin2 = bigX*spin
print(spin2)
h2 = spin_homog(spin2)

for spinor in spinors2:
    print(h2(spinor))

Now you might ask: what is all this good for?

After all, if we really can represent a spin-$j$ state as $2j$ points on the sphere, why not just keep track of those $(x, y, z)$ points and not worry about all these complex vectors and polynomials and so forth? Even for the spin-$\frac{1}{2}$ case, what's the use of working with a two dimensional complex vector representation?

And so, now we need to talk about the Stern-Gerlach experiment.

Suppose we have a bar magnet. It has a North Pole and a South Pole, and it's oriented along some axis. Let's say you shoot it through a magnetic field that's stronger in one direction, so that it gets weaker the more you move up, stronger the more you move down. To the extent that the magnet is aligned with or against the magnetic field, it'll be deflected up or deflected down and tilted a little bit. For a big bar magnet, it seems like it can be deflected any continuous amount, depending on the original orientation of the magnet.

But what makes a bar magnet magnetic? We know that moving charges produce a magnetic field, so we might imagine that there is something "circulating" within the bar magnet.

What happens if we start dividing the bar magnet into pieces? We divide it in two: now we have two smaller bar magnets, each with its own pair of poles. We keep dividing. Eventually, we reach the "atoms" themselves, whose electrons are spin-$\frac{1}{2}$ particles, which means they're spinning, and this generates a little magnetic field: to wit, a spin-$\frac{1}{2}$ particle is like the tiniest possible bar magnet. It turns out that what makes a big bar magnet magnetic is that all of its spins are entangled so that they're all pointing in the same direction, leading to a big magnetic field.

Okay, so what happens if we shoot a spin-$\frac{1}{2}$ particle through the magnetic field?

Here's the surprising thing. It is deflected up or down by some fixed amount, with a certain probability, and if it is deflected up, then it ends up spinning perfectly aligned with the magnetic field; and if it is deflected down, then it ends up spinning perfectly anti-aligned with the magnetic field. There are only two outcomes.

![](img/stern_gerlach1.jpg)

The experiment was originally done with silver atoms, which are big neutral atoms with a single unpaired electron in their outer shells. In fact, here's the original photographic plate.

![](img/stern_gerlach2.jpg)

This was one of the crucial experiments that established quantum mechanics. It was conceived in 1921 and performed in 1922. The message is that spin angular momentum isn't "continuous" when you go to measure it; it comes in discrete quantized amounts.

To wit, suppose your magnetic field is oriented along the Z-axis. If you have a spin-$\frac{1}{2}$ state $\begin{pmatrix} a \\ b \end{pmatrix}$ represented in the Z-basis, then the probability that the spin will end up in the $\mid \uparrow \rangle$ state is $aa^{*}$ and the probability that the spin will end up in the $\mid \downarrow \rangle$ state is given by $bb^{*}$. This is of course if we've normalized our state so that $aa^{*} + bb^{*} = 1$, which of course we always can.

What if we measured the spin along the Y-axis? Or the X-axis? Or any axis? We'd express the vector $\begin{pmatrix} a \\ b \end{pmatrix}$ in terms of the eigenstates of the Y operator or the X operator, etc. We'd get another vector $\begin{pmatrix} c \\ d \end{pmatrix}$, and $cc^{*}$ would be the probability of getting $\uparrow$ in the Y direction while $dd^{*}$ would be the probability of getting $\downarrow$ in the Y direction.
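As a small numerical sketch of this change of basis (NumPy only; the particular state `psi` is an arbitrary example chosen for illustration, not something from the experiment), we can compute the Z-basis probabilities directly and then re-express the same vector in the eigenbasis of the Pauli Y matrix:

```python
import numpy as np

# A hypothetical normalized spin-1/2 state (a, b) written in the Z basis.
psi = np.array([3, 4j]) / 5.0

# Z basis: the probabilities are simply |a|^2 and |b|^2.
p_up_z, p_down_z = np.abs(psi) ** 2

# Pauli Y. Its two eigenvectors are "up" and "down" along the Y axis.
Y = np.array([[0, -1j], [1j, 0]])
eigvals, eigvecs = np.linalg.eigh(Y)      # eigenvalues come back sorted: -1, +1

# Components of the same state in the Y eigenbasis: <v_k | psi> for each column v_k.
components = eigvecs.conj().T @ psi
p_down_y, p_up_y = np.abs(components) ** 2

print("Z basis: P(up) = %.2f, P(down) = %.2f" % (p_up_z, p_down_z))
print("Y basis: P(up) = %.2f, P(down) = %.2f" % (p_up_y, p_down_y))
```

The eigenvectors returned by `eigh` are only fixed up to a phase, but the probabilities don't care about that phase, which is why taking the squared magnitude of the components is enough.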

For spin-$\frac{1}{2}$ there is an easy way to think about where these probabilities come from geometrically.

![](img/spin_probability.jpg)

The eigenstates of a 2x2 Hermitian matrix correspond to orthogonal complex vectors, but to antipodal points on the sphere, so they define an axis: this is the axis you're measuring along, the axis of the magnetic field in the Stern-Gerlach set-up. Now given a point on the sphere representing a spin-$\frac{1}{2}$ particle's spin axis, you project that point perpendicularly onto the measurement axis. This divides the line segment between the two antipodal points into two pieces. If you imagine that the line is like an elastic band that snaps randomly in some location, and then the two ends are dragged to the two antipodal points, carrying the projected point with it, the projected point will end up at one or the other of the two antipodal locations with a certain probability, which is indeed the correct probability for that experiment. Note that this simple geometric picture only works for spin-$\frac{1}{2}$; the generalization to higher spin isn't so easily visualizable, but is basically analogous.
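Here is one way to check the elastic band picture numerically. It is a sketch under an assumption I'm making explicit: the band is taken to snap at a uniformly random point, so the chance of being dragged to the $+\hat{n}$ end is the fraction of the band lying on the other side of the projected point, namely $(1 + \vec{s} \cdot \hat{n})/2$, where $\vec{s}$ is the spin's point on the sphere. The helper name `bloch_vector` is just for this example.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]])
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def bloch_vector(psi):
    """(x, y, z) point on the sphere for a normalized spin-1/2 state."""
    return np.array([np.real(np.vdot(psi, S @ psi)) for S in (SX, SY, SZ)])

rng = np.random.default_rng(0)

# A random spin state and a random measurement axis n.
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)
n = rng.normal(size=3)
n /= np.linalg.norm(n)

# Geometric picture: the band runs from -n to +n (length 2); the projected
# point sits at distance 1 + s.n from the -n end.
p_geometric = (1 + bloch_vector(psi) @ n) / 2

# Algebraic picture: squared overlap with the +1 eigenvector of n.sigma.
N = n[0] * SX + n[1] * SY + n[2] * SZ
vals, vecs = np.linalg.eigh(N)            # eigenvalues sorted: -1, +1
p_born = abs(np.vdot(vecs[:, 1], psi)) ** 2

print(p_geometric, p_born)
```

The two numbers agree for any state and any axis, which is the content of the picture above.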

What happens if we send a spin-$1$ particle through the machine?

![](img/stern_gerlach3.jpg)

The particle ends up in one of three locations, and its spin will be in one of the eigenstates of the operator in question, each with some probability: recall that they correspond to two points at the North Pole; one point at the North Pole and one point at the South Pole; and two points at the South Pole. So if we're in the Z-basis we have some $\begin{pmatrix} a \\ b \\ c \end{pmatrix}$ and the probability of the first outcome is $aa^{*}$, the second outcome is $bb^{*}$, and the third outcome is $cc^{*}$.

And in general, a spin-$j$ particle sent through a Stern-Gerlach set-up will end up in one of $2j+1$ locations, each correlated with one of the $2j+1$ eigenstates of the spin-operator in that direction.
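To make the counting concrete, here is a minimal NumPy sketch that builds the spin-$j$ operators from the standard ladder-operator matrix elements and then "measures" a spin-1 particle, prepared as $m = +1$ along Z, along the X axis. The function name `spin_operators` and the basis ordering ($m = j, j-1, \dots, -j$) are choices made for this example.

```python
import numpy as np

def spin_operators(j):
    """Return (Jx, Jy, Jz) for spin j in the basis m = j, j-1, ..., -j."""
    dim = int(round(2 * j + 1))
    m = j - np.arange(dim)                          # m values down the diagonal
    Jz = np.diag(m).astype(complex)
    # Raising operator: <m+1| J+ |m> = sqrt(j(j+1) - m(m+1))
    off = np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1))
    Jp = np.diag(off, k=1).astype(complex)
    Jx = (Jp + Jp.conj().T) / 2
    Jy = (Jp - Jp.conj().T) / 2j
    return Jx, Jy, Jz

Jx, Jy, Jz = spin_operators(1)

# Spin-1 particle with both stars at the North Pole (m = +1 along Z).
state = np.array([1, 0, 0], dtype=complex)

# The possible beams along X are the eigenvalues of Jx; the probability of
# each beam is the squared overlap with the corresponding eigenvector.
x_vals, x_vecs = np.linalg.eigh(Jx)
probs = np.abs(x_vecs.conj().T @ state) ** 2

for val, p in zip(x_vals, probs):
    print("outcome m_x = %+.0f with probability %.2f" % (val, p))
```

Changing the argument of `spin_operators` to any other integer or half-integer gives $2j+1$ eigenvalues, one per beam.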

The idea is that a big bar magnet could be treated as practically a spin-$\infty$ particle, with an infinite number of stars at the same point. And so, it "splits into so many beams" that it seems like it could end up in a whole continuum of possible locations, so it took so many thousands of years before we realized that (spin) angular momentum is actually quantized.

Below is a simple version of a Stern-Gerlach set-up. We make some simplifications. We work in 2D. The particle starts to the left and zips along horizontally to the right. The magnetic field/Z direction is up/down. The particle's Z operator couples to its position along the Z axis. And the magnetic field acts on the spin. Finally, we make a finite approximation which is effectively like putting the particle in a box, so after a while it starts reflecting off the edges. In any case, the probability of the particle to be at a given location is visualized as the radius of the sphere shown at that point. Try changing the particle's $j$-value, its initial state, and even the number of lattice spacings!

In [None]:
import numpy as np
import qutip as qt
import vpython as vp

vp.scene.background = vp.color.white

dt = 0.001
n = 5
spinj = 0.5
spinn = int(2*spinj+1)

S = {"X": qt.tensor(qt.identity(n), qt.identity(n), qt.jmat(spinj, 'x')/spinj),\
     "Y": qt.tensor(qt.identity(n), qt.identity(n), qt.jmat(spinj, 'y')/spinj),\
     "Z": qt.tensor(qt.identity(n), qt.identity(n), qt.jmat(spinj, 'z')/spinj)}

P = {"X": qt.tensor(qt.momentum(n), qt.identity(n), qt.identity(spinn)),\
     "Z": qt.tensor(qt.identity(n), qt.momentum(n), qt.identity(spinn))}

B = np.array([0, 0, -1])
H = (P["X"]*P["X"] + P["Z"]*S["Z"]) + \
        sum([B[i]*S[o] for i, o in enumerate(S)])
U = (-1j*dt*H).expm()

Q = qt.position(n)
Ql, Qv = Q.eigenstates()
zero_index = -1
for i, l in enumerate(Ql):
    if np.isclose(l, 0):
        zero_index = i

# use spinn (not a hard-coded 2) so that changing spinj above still works
initial = qt.tensor(Qv[0], Qv[zero_index], qt.basis(spinn, 0))
state = initial.copy()

vspheres = [[vp.sphere(color=vp.color.blue,\
                        radius=0.5, opacity=0.3,\
                        pos=vp.vector(Ql[i], Ql[j], 0))\
                for j in range(n)] for i in range(n)]
varrows = [[vp.arrow(pos=vspheres[i][j].pos,\
                      axis=vp.vector(0,0,0))\
                 for j in range(n)] for i in range(n)]

Qproj = [[qt.tensor(qt.tensor(Qv[i], Qv[j])*qt.tensor(Qv[i], Qv[j]).dag(), qt.identity(spinn))\
              for j in range(n)] for i in range(n)]

evolving = True
def keyboard(event):
    global evolving
    key = event.key
    if key == "q":
        evolving = False if evolving else True
#vp.scene.bind('keydown', keyboard)

while True:
    if evolving:
        for i in range(n):
            for j in range(n):
                vspheres[i][j].radius = qt.expect(Qproj[i][j], state)
                proj_state = Qproj[i][j]*state
                axis = [qt.expect(S[o], proj_state) for i, o in enumerate(["X", "Z"])]
                varrows[i][j].axis = vp.vector(axis[0], axis[1], 0)
                vp.rate(2000)
        state = U*state

Okay, so let's take a moment to stop and think.

If we wanted to be fancy, we could describe a point on a plane $(x, y)$ as being in a "superposition" of $(1,0)$ and $(0,1)$. For example, the point $(1,1)$ which makes a diagonal line from the origin would be an equal superposition of being "horizontal" and "vertical." In other words, we can add locations up and the sum of two locations is also a location. 

It's clear that we can describe a point on the plane in many different ways. We could rotate our coordinate system, use any set of two linearly independent vectors in the plane, and write out coordinates for the same point using many different frames of reference. This doesn't have anything to do with the point. The point is where it is. But we could describe it in many, many different ways as different superpositions of different basis states. Nothing weird about that. They just correspond to "different perspectives" on that point.

Another example of superposition, in fact the ur-example if you will, is waves. You drop a rock in the water and the water starts rippling outwards. Okay. You wait until the water is calm again, and then you drop another rock in the water at a nearby location and the water ripples outwards, etc. Now suppose you dropped both rocks in at the same time. How will the water ripple? It's just a simple sum of the first two cases. The resulting ripples will be a sum of the waves from the one rock and the waves from the other rock. In other words, you can superpose waves, and that just makes another wave. This works for waves in water, sound waves, even light waves.

Waves are described by differential equations. And solutions to linear (homogeneous) differential equations have the property that a sum of two solutions is also a solution.

For example, suppose we have a simple ordinary differential equation $y'' + 2y' + y = 0$. The idea is to find some function $y(x)$ such that the sum of its second derivative plus two times its first derivative plus the function equals 0. Suppose we had such a function. And what if we had another function $z(x)$ such that $z'' + 2z' + z = 0$, in other words, $z(x)$ is also a solution to the differential equation? If we took any linear combination of $y$ and $z$ then that will also be a solution.

E.g:

$y'' + 2y' + y = 0$

$z'' + 2z' + z = 0$

$(y+z)'' + 2(y+z)' + (y+z) = y'' + z'' + 2y' + 2z' + y + z = (y'' + 2y' + y) + (z'' + 2z' + z) = 0 + 0 = 0 $

This works because taking the derivative is a linear operator! So that $(y + z)' = y' + z'$.

Indeed, we can find a general solution to the differential equation using an auxiliary polynomial.

$y'' + 2y' + y = 0 \rightarrow k^{2} + 2k + 1 = 0 \rightarrow (k + 1)(k + 1) = 0 \rightarrow k = \{-1,-1\}$

So that the general solution is $y = c_{0}e^{-x} + c_{1}xe^{-x}$. The whole theory of how I solved this equation doesn't matter here. You can confirm that it works:

$y' = -c_{0}e^{-x} + c_{1}(-xe^{-x} + e^{-x})$

$y'' = c_{0}e^{-x} + c_{1}(xe^{-x} - e^{-x} - e^{-x})$

$y'' + 2y' + y = (c_{0}e^{-x} + c_{1}xe^{-x} - c_{1}e^{-x} - c_{1}e^{-x}) + 2(-c_{0}e^{-x} - c_{1}xe^{-x} + c_{1}e^{-x}) + (c_{0}e^{-x} + c_{1}xe^{-x}) = 2c_{0}e^{-x} - 2c_{0}e^{-x} + 2c_{1}xe^{-x} - 2c_{1}xe^{-x} - 2c_{1}e^{-x} + 2c_{1}e^{-x} = 0$

The point is that the solutions form a 2D vector space and any solution can be written as a linear combination of $e^{-x}$ and $xe^{-x}$. In other words, I can plug any scalars in for $c_{0}$ and $c_{1}$ and I'll still get a solution.
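If you'd rather not trust the algebra, here is a quick numerical sanity check. It is only a sketch: the derivatives are approximated with central finite differences (step `h`, my choice), so the residual of $y'' + 2y' + y$ comes out tiny rather than exactly zero, and the coefficients $c_{0}, c_{1}$ are drawn at random.

```python
import numpy as np

def y(x, c0, c1):
    """The proposed general solution c0*e^{-x} + c1*x*e^{-x}."""
    return c0 * np.exp(-x) + c1 * x * np.exp(-x)

def residual(x, c0, c1, h=1e-3):
    """Approximate y'' + 2y' + y using central finite differences."""
    y_minus, y_0, y_plus = y(x - h, c0, c1), y(x, c0, c1), y(x + h, c0, c1)
    dy = (y_plus - y_minus) / (2 * h)
    d2y = (y_plus - 2 * y_0 + y_minus) / h ** 2
    return d2y + 2 * dy + y_0

x = np.linspace(-1.0, 3.0, 9)
rng = np.random.default_rng(1)

worst = 0.0
for _ in range(5):
    c0, c1 = rng.normal(size=2)       # any scalars should work
    worst = max(worst, np.max(np.abs(residual(x, c0, c1))))

print("largest residual over random (c0, c1):", worst)
```

Whatever scalars are plugged in, the residual stays at the level of the finite-difference error, which is the numerical shadow of the statement that the solutions form a vector space.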

Furthermore, we've seen this happen in terms of our constellations. We can take any two constellations (with the same number of stars) and add them together, and we always get a third constellation. Moreover, any given constellation can be expressed in a multitude of ways, as a complex linear combination of (linearly independent) constellations.

We could take one and the same constellation and express it as a complex linear superposition of X eigenstates, Y eigenstates, or any eigenstates of some Hermitian operator. The nice thing about Hermitian operators is that their eigenvectors are all orthogonal, and so we can use them as a basis.

In other words, the constellation is how it is, but it can be described in different reference frames in different ways, as a superposition of different basis constellations.
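The claim that constellations add can be tried out directly. The sketch below is a compact, NumPy-only re-implementation of the conventions used by the `spin_poly`, `poly_roots` and `c_xyz` helpers in the cells above (the single function `constellation` is my own packaging of them, and the complex weights in the sum are arbitrary): it takes two random spin-1 states, forms a complex linear combination, and confirms that the result again has exactly $2j = 2$ stars on the unit sphere.

```python
import math
import numpy as np

def constellation(v):
    """Stars (x, y, z) of a spin vector v given in the Z basis."""
    n = len(v) - 1                                    # number of stars, 2j
    poly = [v[i] * (-1) ** i * math.sqrt(math.comb(n, i)) for i in range(n + 1)]
    # Each vanishing leading coefficient is a root at infinity: the South Pole.
    lead = 0
    while lead < n and poly[lead] == 0:
        lead += 1
    stars = [np.array([0.0, 0.0, -1.0]) for _ in range(lead)]
    for r in np.roots(poly):
        r = complex(r)
        d = 1 + abs(r) ** 2
        stars.append(np.array([2 * r.real / d, 2 * r.imag / d, (1 - abs(r) ** 2) / d]))
    return stars

def random_spin(dim, rng):
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(2)
a, b = random_spin(3, rng), random_spin(3, rng)

# An arbitrary complex linear combination of the two states.
c = (0.3 + 0.4j) * a + (0.5 - 0.2j) * b
c /= np.linalg.norm(c)

for name, v in (("a", a), ("b", b), ("c", c)):
    print(name, np.round(constellation(v), 3).tolist())
```

As a spot check of the convention, the three Z-basis vectors $(1,0,0)$, $(0,1,0)$, $(0,0,1)$ come out as two stars at the North Pole, one at each pole, and two at the South Pole.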

Now the daring thing about quantum mechanics is that it says that basically, everything is just like waves. If you have one state of a physical system and another state of a physical system, then if you add them up, you get another state of the physical system. In theory, everything obeys the superposition principle!

There is something mysterious about this, but not in the way it's usually framed. Considering a spin-$\frac{1}{2}$ particle, it seems bizarre to say that its quantum state is a linear superposition of being $\uparrow$ and $\downarrow$ along the Z axis. But we've seen that it's not bizarre at all! Any point on the sphere can be written as a linear superposition of $\uparrow$ and $\downarrow$ along some axis! So that you really can imagine that the spin is definitely just spinning around some definite axis, given by a point on the sphere. But when you go to measure it, you pick out a certain special axis: the axis you're measuring along. You can express the point on the sphere as a linear combination of $\uparrow$ and $\downarrow$ along that axis, and then $aa^{*}$ and $bb^{*}$ give you the probabilities that you get either the one or the other outcome.

In other words, the particle may be spinning totally concretely around some axis, but when you go to measure it, you just get one outcome or the other with a certain probability. You can only reconstruct what that axis must have been before you measured it by preparing lots of such particles in the same state, and measuring them all in order to empirically determine the probabilities. In fact, to nail down the state you have to calculate (given some choice of X, Y and Z) $(\langle \psi \mid X \mid \psi \rangle, \langle \psi \mid Y \mid \psi \rangle, \langle \psi \mid Z \mid \psi \rangle)$, in other words, you have to do the experiment a bunch of times measuring the (identically prepared) particles along three orthogonal axes, and get the proportions of outcomes in each of those cases, weight those probabilities by the eigenvalues of X, Y, Z, and then you'll get the $(x, y, z)$ coordinate of the spin's axis.
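This reconstruction procedure can be simulated. The following is a rough sketch, not a model of a real apparatus: `measure` is a hypothetical helper that draws random $\pm 1$ outcomes with the Born-rule probabilities, and the number of shots (20000 per axis) is an arbitrary choice. Averaging the outcomes is the same as weighting the observed proportions by the eigenvalues.

```python
import numpy as np

PAULI = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]]),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def measure(psi, op, shots, rng):
    """Simulate `shots` measurements of `op` on identically prepared copies of psi."""
    vals, vecs = np.linalg.eigh(op)
    probs = np.abs(vecs.conj().T @ psi) ** 2
    return rng.choice(vals, size=shots, p=probs / probs.sum())

rng = np.random.default_rng(3)

# The "unknown" state we pretend not to know.
psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)

# What we are trying to recover: (<X>, <Y>, <Z>).
true_axis = np.array([np.real(np.vdot(psi, PAULI[k] @ psi)) for k in "XYZ"])

# What the experimenter sees: the average of many +1/-1 outcomes along each axis.
estimated_axis = np.array([measure(psi, PAULI[k], 20000, rng).mean() for k in "XYZ"])

print("true axis:     ", np.round(true_axis, 3))
print("estimated axis:", np.round(estimated_axis, 3))
```

No single run tells you the axis; only the statistics over many identically prepared particles do, and the estimate sharpens as the number of shots grows.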

So yeah, we could just represent our spin-$j$ state as a set of $(x, y, z)$ points on the sphere. But it's better to use the *unitary* representation, as a complex vector, because then we can describe the constellation as a superposition of *outcomes to an experimental situation* where the components of the vector have the interpretation of *probability amplitudes* whose "squares" give the probability for that outcome.

In other words, quantum mechanics effects a radical generalization of the idea of a perspective shift or reference frame. We normally think of a reference frame as provided by three orthogonal axes in 3D, say, by my thumb, pointer finger, and middle finger splayed out. In quantum mechanics, the idea is that *an experimental situation* provides a reference frame in the form of a Hermitian operator--an *observable*--whose eigenstates provide "axes" along which we can decompose the state of the system, even as they correspond to *possible outcomes to the experiment*. The projection of the state on each of these axes gives the probability (amplitude) of that outcome.

It's an amazing twist on the positivist idea of describing everything as a black box: some things go in, some things come out, and you look at the probabilities of the outcomes, and anything else one must pass over in silence. And that's all one can hope for. In the spin case, we can see that this works, but with a twist: the spin state is a superposition of "possible outcomes to the experiment," which at first seems metaphysically bizarre, but this superposition can also be interpreted geometrically as: a perfectly concrete constellation on the sphere. (Indeed, you can interpret the "stars" in the constellations as little vortices, little tornados, churning the sphere around--more on this momentarily.)

Indeed, Dirac writes in his Principles of Quantum Mechanics, "[QM] requires us to assume that ... whenever the system is definitely in one state we can consider it as being partly in each of two or more other states. The original state must be regarded as the result of a kind of superposition of the two or more new states, in a way that cannot be conceived on classical ideas."

Similarly, Schrodinger proposed his cat as a kind of reductio ad absurdum to the universal applicability of quantum ideas. He asks us to consider whether a cat can be in a superposition of $\mid alive \rangle \ + \mid dead \rangle$. Such a superposition seems quite magical if we're considering a cat. How can a cat be both alive and dead? And yet, the analogous question for a spin isn't mysterious at all. How can a spin be in a superposition of $\mid \uparrow \rangle \ + \mid \downarrow \rangle$ (in the Z basis)? Easily! It's just pointed in the X+ direction! The analogous thing for a cat would be something like $\mid alive \rangle \ + \mid dead \rangle = \mid sick? \rangle$--I don't propose that seriously, other than to emphasize that what the superposition principle is saying is that superpositions of states *have to be perfectly good comprehensible states of the system as well*.

In the case of a quantum spin, we can see that we can regard the spin's constellation as being perfectly definite, but describable from many different reference frames, different linearly independent sets of constellations. Which reference frame you use is quite irrelevant to the constellation, of course, until you go to measure it, and then the spin ends up in one of the reference states with a certain probability. So the mystery isn't superposition per se; it has something to do with measurement. (You might ask how this plays out in non-spin cases, for example, superpositions of position: I'll return to this issue.) 

You might ask: Doesn't the *actual* Hamiltonian fix a basis? Sure, but the Hamiltonian can change. The deeper question is: we prepare a spin in some superposition, and we can regard it as definitely having a certain constellation, but implicitly, since we prepared the spin, we fixed what we mean by the X, Y, Z axes. This is implicit in our ability to even represent the particle's state relative to us. But what if we hadn't done that, so that we didn't necessarily have any shared reference frame between us? Are we allowed to still say that the particle has a "definite constellation"? We return to these questions later on when we briefly discuss spin networks.

One nice analogy is the idea of a "forced choice question." I show you a picture of a diagonal line and I ask you is it a vertical line or a horizontal line? And you're like it's both! It's equal parts vertical and horizontal! How can I choose? I have as many reasons to choose vertical as horizontal! If the line were inclined a little to the horizontal, then maybe I'd be inclined to say it was horizontal more than vertical, although not by much! But I force you to choose. You have to pick one! What do you do? You have to make a choice, but there is literally no reason for you to choose one or the other, in the sense that you have as many reasons to choose horizontal as to choose vertical. And so, you have to choose randomly.

It's the same with the spin. Suppose it's spinning in the X+ direction. And I ask it, Are you up or down along the Z axis? Well, both! In terms of the question, it's equal parts up and down in the Z direction. But the experimental situation forces the particle to give some answer, and it has as many reasons to choose the one or the other, and so it just... picks one at random! In other words, this is the ultimate origin for quantum indeterminism: you've forced a physical system to make a choice, but it has no reason to choose: and so, it just decides for itself. It's like: you ask a stupid question, you get a stupid answer.

There's an old folktale from the Middle Ages called Buridan's Ass or the Philosopher's Donkey. The old donkey has been ridden hard by the philosopher all day, and by evening, it's both hungry and thirsty. The philosopher (being, who knows, perhaps in an experimental mindset) lays out a bucket of water and a bucket of oats before the donkey. The donkey is really hungry, so it wants to go for the oats, but it's also really thirsty, so it wants to go for the water. In fact, it's just as hungry as it is thirsty. It has to pick oats or water to dig into first, but it has as many reasons to pick the oats first as the water first, and so it looks back and forth and back and forth, and can't decide, and eventually passes out, steam coming out of its ears.

Leibniz articulated something called the Principle of Sufficient Reason, which was once very popular, and it basically said that: everything happens for a reason. The donkey exposes the consequence of this idea. If everything has to happen for a reason, and the donkey has as many reasons to choose the water as the oats, then the donkey can't do anything, since there would be no reason for choosing one over the other. And so, the donkey keels over--for what reason? It had as many reasons to choose the water as the oats.

Quantum mechanics strongly suggests that the Principle of Sufficient Reason is false. Nature, as it were, cuts the Gordian knot of the philosopher. An X+ spin presented with a bucket of $\uparrow$ and a bucket of $\downarrow$ doesn't glitch out, even though it has as many reasons to choose the one as the other. Instead: it just fucking picks one, for no reason at all. It decides all on its own without any help from "reasons." 

If the spin had been in the Z+ state, of course, and we asked $\uparrow$ or $\downarrow$ along the Z axis, then it would answer up with certainty, and vice versa for Z-.

A further point. A spin in itself can be described in terms of its constellation. This constellation can be described from many points of view, each corresponding to some "observable," some experimental situation, which as it were provides a filter, separating out this outcome from that outcome. We can also think about this in terms of entanglement. In the Stern-Gerlach apparatus, we actually measure the position of the particle, which has become entangled with the spin, so that if the particle is deflected up, its spin is $\uparrow$ and if it's deflected down, its spin is $\downarrow$. So that even though the particle's constellation is defined intrinsically, regardless of reference frame, the entanglement between the particle's spin and its position provides a reference frame, in that some (arbitrary) basis states of the particle's spin are paired with some (arbitrary) basis states of the particle's position. So that now there is a dependency between the outcomes. If the particle is measured to be here, then its spin will be $\uparrow$. If the particle is measured to be there, then its spin will be $\downarrow$--and vice versa. (You might wonder how we know the spin's state at all? Well, if we measure a spin to be $\uparrow$ along the Z-axis, and then send it through another Stern-Gerlach set-up oriented along the Z-axis, then it'll always deflect up, but if we send it through an X-oriented apparatus, it'll give $\uparrow$ or $\downarrow$ with equal probability, etc--and the spin state explains why this is the case, which can't be derived from the position state alone.)
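The sequential Stern-Gerlach argument in the parenthesis can be played out in a few lines. This is a toy sketch that ignores position entirely and treats each apparatus as an ideal projective measurement; `stern_gerlach` is a hypothetical helper name, and the 2000 trials are an arbitrary choice.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def stern_gerlach(psi, op, rng):
    """Ideal measurement of `op`: return (outcome, post-measurement state)."""
    vals, vecs = np.linalg.eigh(op)
    probs = np.abs(vecs.conj().T @ psi) ** 2
    k = rng.choice(len(vals), p=probs / probs.sum())
    return vals[k], vecs[:, k]

rng = np.random.default_rng(4)
second_z, second_x = [], []

for _ in range(2000):
    # A spin in some random initial state enters a Z-oriented apparatus.
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)
    first, post = stern_gerlach(psi, SZ, rng)
    if first < 0:
        continue                       # keep only the beam that was deflected up
    # Send copies of the "up" beam through a second Z and a second X apparatus.
    second_z.append(stern_gerlach(post, SZ, rng)[0])
    second_x.append(stern_gerlach(post, SX, rng)[0])

print("second Z apparatus, fraction up:", np.mean(np.array(second_z) > 0))
print("second X apparatus, fraction up:", np.mean(np.array(second_x) > 0))
```

The second Z apparatus always agrees with the first, while the X apparatus splits the beam roughly in half, just as described.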

So although the reference frame by which we describe something is arbitrary, things can get entangled which pairs basis states of the one with basis states of the other, so that each pairing of basis states now has its own probability, and the pairs are all in superposition, leading to correlations between the outcomes of experiments done on the two things.

So that, while in its own terms the spin has its constellation, it interfaces with the world via some decomposition of that constellation into a superposition over "experimental outcomes" (which are constellations themselves), which can get entangled with other outcomes of other systems. So that the in itself arbitrary decomposition of the constellation into basis constellations provides the hooks by which the spin becomes correlated with the rest of the world.

(Returning to the pebble. If at first, to communicate with a pebble, we had to agree on whether an absence or a presence was significant, then on where to start counting, and then where 0 was, and then what rational number translated between our different notions of "1", and then a complex number to rotate from my axis to your axis, now we need to specify the basis vectors in terms of which we've written our "polynomial." This could mean: the 3D coordinate frame for some $(x, y, z)$ point, but in the general case, the basis vectors are bundled into a Hermitian matrix representing a quantum observable, so that to specify the basis vectors is to: specify which experiment you did. To wit, I do the Stern-Gerlach experiment, and I put a pebble in pile A if I got up, and a pebble in pile B if I got down. In order to successfully communicate to you what I mean by those pebbles, I have to specify: what experiment provided the reference frame for those outcomes, and which eigenvalue/eigenvector pair the pebble corresponds to. Then we can both compute the proportions between the outcomes, and reconstruct the state of the physical system, and we three will all be in agreement about what we're talking about.)

A further point.

Do we ever observe the "stars" themselves? Well, not exactly: we can reconstruct the quantum state of the spin via a bunch of measurements on identically prepared systems, and then: express it in the Z-basis, and find the roots of the polynomial, etc. But they really are there, as the following argument suggests.

A "spin coherent state" is just a state where all the stars are in the same location. The "coherent states" are like the "most classical states." In the infinite dimensional case, you could define them as eigenstates of the annihilation operator--in other words, you remove a quantum and it doesn't make a difference. In the finite dimensional spin case, if you had a spin-$j$ state, then the coherent states would be all the states with $2j$ stars in the same location. They have very interesting properties. For example, any set of $2j+1$ coherent states (each with its $2j$ stars at a different location on the sphere) forms a linearly independent (but not necessarily orthogonal) basis for the spin-$j$ Hilbert space.

In [None]:
import math
import numpy as np
import qutip as qt
import itertools

def xyz_c(xyz):
    # Stereographic projection from the south pole: (x, y, z) -> complex root.
    x, y, z = xyz
    if np.isclose(z,-1):
        return float("Inf")
    else:
        return x/(1+z) + 1j*y/(1+z)

def roots_coeffs(roots):
    # Coefficients of the monic polynomial with the given (finite) roots.
    coeffs = np.array([((-1)**(-i))*sum([np.prod(term) for term in itertools.combinations(roots, i)]) for i in range(0, len(roots)+1)])
    return coeffs/coeffs[0]

def roots_poly(roots):
    zeros = roots.count(0j)
    if zeros == len(roots):
        return [1j] + [0j]*len(roots)
    poles = roots.count(float("Inf"))
    roots = [root for root in roots if root != float('Inf')]
    if len(roots) == 0:
        return [0j]*poles + [1j]
    return [0j]*poles + roots_coeffs(roots).tolist()

def poly_spin(poly):
    # np.math is gone in recent NumPy, and math.factorial rejects floats,
    # so we use the integer binomial coefficient C(2j, i) directly.
    n = len(poly) - 1  # n = 2j, the degree of the polynomial
    aspin = np.array([poly[i]/(((-1)**i)*np.sqrt(math.comb(n, i))) for i in range(n+1)])
    return aspin/np.linalg.norm(aspin)

def XYZ_spin(XYZ):
    return qt.Qobj(poly_spin(roots_poly([xyz_c(xyz) for xyz in XYZ])))

def spinor_xyz(spinor):
    if isinstance(spinor, np.ndarray):
        spinor = qt.Qobj(spinor)
    return np.array([qt.expect(qt.sigmax(), spinor),\
                     qt.expect(qt.sigmay(), spinor),\
                     qt.expect(qt.sigmaz(), spinor)])

############################################################################

j = 3/2
n = int(2*j+1)
M = np.array([XYZ_spin([spinor_xyz(qt.rand_ket(2))]*(n-1)).full().T[0] for i in range(n)]).T
print("%s != 0" % np.linalg.det(M))

Furthermore, the spin coherent states define a resolution of the identity. For a given $j$, define the spin coherent state (of $2j$ stars) at a given location on the sphere in terms of spherical coordinates $\theta$ and $\phi$: $\mid \theta, \phi \rangle$. Then:

$\int \mid \theta, \phi \rangle \langle \theta, \phi \mid d\mu(\theta, \phi) = \mathbb{1}$, where $d\mu(\theta, \phi) = \frac{2j+1}{4\pi}\sin(\theta)\,d\theta\,d\phi$
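As a quick numerical sanity check of this resolution of the identity, here is a minimal numpy sketch. It assumes one particular phase convention for the coherent state (all $2j$ spinors equal to $(\cos\frac{\theta}{2}, e^{i\phi}\sin\frac{\theta}{2})$, with components ordered from all-up to all-down), which may differ by phases from what qutip's `spin_coherent` returns. The names `coherent_ket` and `coherent_resolution` are made up for this illustration.

```python
import numpy as np
from math import comb

def coherent_ket(j, theta, phi):
    """Spin-j coherent state with all 2j stars at (theta, phi) (assumed convention)."""
    n = int(round(2 * j))
    k = np.arange(n + 1)  # k = number of "down" spinors in each basis state
    binom = np.array([comb(n, int(kk)) for kk in k], dtype=float)
    return (np.sqrt(binom) * np.cos(theta / 2) ** (n - k)
            * np.sin(theta / 2) ** k * np.exp(1j * k * phi))

def coherent_resolution(j, n_theta=16, n_phi=32):
    """Approximate (2j+1)/(4 pi) * integral of |theta,phi><theta,phi| over the sphere."""
    n = int(round(2 * j))
    # Gauss-Legendre nodes in x = cos(theta) absorb the sin(theta) d(theta) factor.
    x, w = np.polynomial.legendre.leggauss(n_theta)
    phis = 2 * np.pi * np.arange(n_phi) / n_phi
    total = np.zeros((n + 1, n + 1), dtype=complex)
    for xi, wi in zip(x, w):
        theta = np.arccos(xi)
        for phi in phis:
            ket = coherent_ket(j, theta, phi)
            total += wi * (2 * np.pi / n_phi) * np.outer(ket, ket.conj())
    return (n + 1) / (4 * np.pi) * total

print(np.round(coherent_resolution(3 / 2), 6))
```

Under these assumptions the printed matrix should be the $4 \times 4$ identity up to floating point error, for any $j$ small enough that the quadrature grid resolves the integrand.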

(One minor point is that, in a previous adventure, we talked about one qubit steering another qubit. We went through all the points on the sphere of qubit A, projected A into those states, and looked to see where B was steered to in turn. This was in the context of a spin-$\frac{1}{2}$. But we can see it kinda works for higher spin too: we just go through all the points on the sphere of spin A, and project A into the *spin coherent state of $2j$ stars at that point*, and see where B is steered to (which of course might be some arbitrary constellation). Since going over the whole sphere exhausts the states of A, this can completely characterize the entanglement as before.)

One thing we can do with the spin coherent states is use them to define a wavefunction on $S^{2}$, the sphere, corresponding to our spin state. Given a spin state $\mid \psi \rangle$, 

$\psi(x, y, z) = \langle (x, y, z) \mid \psi \rangle$, where $\langle (x, y, z) \mid$ is the spin coherent state with $2j$ stars at $(x, y, z)$.

This wave function will be 0 at the points directly opposite to the Majorana stars. This makes sense because $\psi(x, y, z)$ is the probability amplitude for *all the stars to be at $(x, y, z)$*, but we know that there's a star in the opposite direction, so the amplitude has to be 0 at $(x, y, z)$. This is sometimes called the Husimi wave function. Its zeros are precisely opposite to the zeros of the Majorana polynomial.

This gives us an operational meaning for the individual stars. They represent the directions along which there is 0% probability that the spin will be entirely in the opposite direction. And the state on the whole sphere can be completely characterized by these points.
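To make the claim about the zeros concrete, here is a small numpy sketch that builds a spin state from a chosen constellation and evaluates the coherent-state amplitude at the points opposite each star. It is a sketch under assumptions: the helper names (`xyz_to_spinor`, `stars_to_spin`, `coherent_amplitude`) are hypothetical, and instead of going through roots it multiplies out the spinor components directly, which gives the same components as dividing the polynomial coefficients by $(-1)^{i}\sqrt{\binom{2j}{i}}$ (the signs cancel).

```python
import numpy as np
from math import comb

def xyz_to_spinor(xyz):
    """Spinor (cos(theta/2), e^{i phi} sin(theta/2)) pointing along xyz."""
    x, y, z = np.asarray(xyz, dtype=float) / np.linalg.norm(xyz)
    theta, phi = np.arccos(np.clip(z, -1, 1)), np.arctan2(y, x)
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def stars_to_spin(stars):
    """Normalized spin-j vector whose constellation is the given list of stars."""
    poly = np.array([1 + 0j])
    for a, b in map(xyz_to_spinor, stars):
        poly = np.convolve(poly, [a, b])  # multiply in one spinor at a time
    n = len(stars)
    spin = poly / np.sqrt([comb(n, i) for i in range(n + 1)])
    return spin / np.linalg.norm(spin)

def coherent_amplitude(direction, spin):
    """Amplitude <direction, ..., direction | spin> for all stars at `direction`."""
    n = len(spin) - 1
    return np.vdot(stars_to_spin([direction] * n), spin)

stars = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
spin = stars_to_spin(stars)
for s in stars:
    opposite = -np.array(s, dtype=float)
    print(s, abs(coherent_amplitude(opposite, spin)))
```

The amplitude factorizes into a product of single-spinor overlaps, so it vanishes exactly when the chosen direction is antipodal to at least one star.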

In [None]:
import numpy as np
import qutip as qt
import vpython as vp
from magic import *
scene = vp.canvas(background=vp.color.white)

def coherent_states(j, N=25):
    theta = np.linspace(0, np.pi, N)  # np.pi avoids relying on an unimported math module
    phi = np.linspace(0, 2*np.pi, N)
    THETA, PHI = np.meshgrid(theta, phi)
    return ([[qt.spin_coherent(j, THETA[i][k], PHI[i][k])\
                    for k in range(N)] for i in range(N)], THETA, PHI)

def husimi(state, CS):
    cs, THETA, PHI = CS
    N = len(THETA)
    Q = np.zeros_like(THETA, dtype=complex)
    for i in range(N):
        for j in range(N):
            amplitude = cs[i][j].overlap(state)
            Q[i][j] = amplitude#*np.conjugate(amplitude)
    pts = []
    for i, j, k in zip(Q, THETA, PHI):
        for q, t, p in zip(i, j, k):
            pts.append([q, sph_xyz([1, t, p])])
    return pts

def tangent_rotations(CS):
    cs, THETA, PHI = CS
    rots = []
    for j, k in zip(THETA, PHI):
        for t, p in zip(j, k):
            normal = sph_xyz([1, t, p])
            tan = sph_xyz([1, t+np.pi/2, p])
            vv = np.cross(tan, normal)
            vv = vv/np.linalg.norm(vv)
            trans = np.array([tan, vv, normal])
            itrans = np.linalg.inv(trans)
            rots.append(itrans)
    return rots

dt = 0.01
j = 3/2
n = int(2*j+1)

CS = coherent_states(j, N=20)
rots = tangent_rotations(CS)

state = qt.rand_ket(n)
H = qt.rand_herm(n)
U = (-1j*H*dt).expm()

vsphere = vp.sphere(color=vp.color.blue,\
                    opacity=0.4)
vstars = [vp.sphere(radius=0.2, emissive=True,\
                    pos=vp.vector(*xyz)) for xyz in spin_XYZ(state)]

pts = husimi(state, CS)
#vpts = [vp.sphere(pos=vp.vector(*pt[1]), radius=0.5*pt[0]) for pt in pts]
vpts = []
for i, pt in enumerate(pts):
    amp, normal = pt
    amp_vec = np.array([amp.real, amp.imag, 0])
    amp_vec = np.dot(rots[i], amp_vec)
    vpts.append(vp.arrow(pos=vp.vector(*normal), axis=0.5*vp.vector(*amp_vec)))

while True:
    state = U*state
    for i, xyz in enumerate(spin_XYZ(state)):
        vstars[i].pos = vp.vector(*xyz)
    pts = husimi(state, CS)
    #for i, pt in enumerate(pts):
    #    vpts[i].radius = 0.5*pt[0]
    for i, pt in enumerate(pts):
        amp, normal = pt
        amp_vec = np.array([amp.real, amp.imag, 0])
        amp_vec = np.dot(rots[i], amp_vec)
        vpts[i].axis = 0.5*vp.vector(*amp_vec)
    vp.rate(2000)

One is reminded of the (in)famous Hairy Ball theorem, that if you have a hairy ball and try to comb its hair flat, no matter how you try, there will always be one hair like Alfalfa's that'll always be sticking up. That single hair? It's the point at infinity.


Okay, now we're going to shift gears a little bit. It turns out that there is another related and very useful representation of a spin-$j$ particle.

It turns out there is a 1-to-1 map between spin-$j$ states and the symmetrized tensor product of $2j$ spin-$\frac{1}{2}$ states. And you'll never guess: the spin-$\frac{1}{2}$ states are the spinors corresponding to the roots from before.

Suppose we have two spin-$\frac{1}{2}$ states $\begin{pmatrix} a \\ b \end{pmatrix}$ and $\begin{pmatrix} c \\ d \end{pmatrix}$. Their homogeneous polynomials are: $f(w, z) = az - bw$ and $g(w, z) = cz - dw$. Let's multiply them together: $h(w, z) = (az - bw)(cz - dw) = acz^{2} - adzw - bcwz + bdw^{2} = acz^{2} - (ad + bc)zw + bdw^{2}$. Converting things into a $\mid j, m \rangle$ state, we get: $\begin{pmatrix} \frac{ac}{ (-1)^{0}\sqrt{\begin{pmatrix} 2 \\ 0 \end{pmatrix}}} \\ \frac{-(ad + bc)}{(-1)^{1}\sqrt{\begin{pmatrix} 2 \\ 1 \end{pmatrix}}} \\ \frac{bd}{(-1)^{2}\sqrt{\begin{pmatrix} 2 \\ 2 \end{pmatrix}}} \end{pmatrix}$ or $\begin{pmatrix} ac \\ \frac{1}{\sqrt{2}}(ad + bc) \\ bd \end{pmatrix}$. This is a spin-$1$ state.

On the other hand, let's consider the permutation symmetric tensor product of our two states:

$ \begin{pmatrix} a \\ b \end{pmatrix} \otimes \begin{pmatrix} c \\ d \end{pmatrix} + \begin{pmatrix} c \\ d \end{pmatrix} \otimes \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} ac \\ ad \\ bc \\ bd \end{pmatrix} + \begin{pmatrix} ca \\ cb \\ da \\ db \end{pmatrix} = \begin{pmatrix} 2ac \\ ad + bc \\ ad + bc \\ 2bd \end{pmatrix}$.

Just looking at the components we can see that the spin-$1$ state and the permutation symmetric product of the two spin-$\frac{1}{2}$ states are really encoding the same information.
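The comparison can also be done numerically for any number of spinors. The following is a minimal numpy sketch, not the notebook's own implementation: `poly_route` multiplies the factors $(a_{k}z - b_{k}w)$ and rescales the coefficients as described above, while `tensor_route` sums the tensor product over all orderings and reads off its components along the normalized permutation symmetric basis states. Both function names are invented for this example.

```python
import numpy as np
from functools import reduce
from itertools import permutations, product
from math import comb

def poly_route(spinors):
    """Spin vector from the product of the homogeneous polynomials a*z - b*w."""
    poly = np.array([1 + 0j])
    for a, b in spinors:
        poly = np.convolve(poly, [a, -b])
    n = len(spinors)
    signs = (-1.0) ** np.arange(n + 1)
    spin = poly / (signs * np.sqrt([comb(n, i) for i in range(n + 1)]))
    return spin / np.linalg.norm(spin)

def tensor_route(spinors):
    """Spin vector from the permutation symmetric tensor product of the spinors."""
    n = len(spinors)
    sym = sum(reduce(np.kron, [spinors[i] for i in p])
              for p in permutations(range(n)))
    # Row i of `basis` is the normalized sum of all basis strings with i downs.
    basis = np.zeros((n + 1, 2 ** n))
    for idx, bits in enumerate(product([0, 1], repeat=n)):
        basis[sum(bits), idx] = 1
    basis /= np.linalg.norm(basis, axis=1, keepdims=True)
    spin = basis @ sym
    return spin / np.linalg.norm(spin)

rng = np.random.default_rng(0)
spinors = [rng.normal(size=2) + 1j * rng.normal(size=2) for _ in range(3)]
print(np.allclose(poly_route(spinors), tensor_route(spinors)))
```

With this ordering convention (components indexed by the number of down spinors) the two routes should agree without any extra phase fixing, since they differ only by a positive overall factor before normalization.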

Now consider the following basis for permutation symmetric states of two spin-$\frac{1}{2}$'s:

$\begin{matrix} 
\mid \uparrow \uparrow \rangle \\ 
\mid \uparrow \downarrow \rangle \ + \mid \downarrow \uparrow \rangle \\ 
\mid \downarrow \downarrow \rangle
\end{matrix}$

It's three dimensional, just like a spin-$1$ state! (Incidentally, we could normalize the middle term with a $\frac{1}{\sqrt{2}}$, etc.)

For three:

$\begin{matrix} 
\mid \uparrow \uparrow \uparrow \rangle \\ 
\mid \uparrow \uparrow \downarrow \rangle \ + \mid \uparrow \downarrow \uparrow \rangle \ + \mid \downarrow \uparrow \uparrow \rangle \\ 
\mid \downarrow \downarrow \uparrow \rangle \ + \mid \downarrow \uparrow \downarrow \rangle \ + \mid \uparrow \downarrow \downarrow \rangle \\ 
\mid \downarrow \downarrow \downarrow \rangle
\end{matrix}$

It's four dimensional, just like a spin-$\frac{3}{2}$ state!

For four:

$\begin{matrix} 
\mid \uparrow \uparrow \uparrow \uparrow \rangle \\ 
\mid \uparrow \uparrow \uparrow \downarrow \rangle \ + \mid \uparrow \uparrow \downarrow \uparrow \rangle \ + \mid \uparrow \downarrow \uparrow \uparrow \rangle \ + \mid \downarrow \uparrow \uparrow \uparrow \rangle \\ 
\mid \uparrow \uparrow \downarrow \downarrow \rangle \ + \mid \uparrow \downarrow \downarrow \uparrow \rangle \ + \mid \uparrow \downarrow \uparrow \downarrow \rangle \ + \mid \downarrow \uparrow \downarrow \uparrow \rangle \ + \mid \downarrow \uparrow \uparrow \downarrow \rangle \ + \mid \downarrow \downarrow \uparrow \uparrow \rangle \\
\mid \downarrow \downarrow \downarrow \uparrow \rangle \ + \mid \downarrow \downarrow \uparrow \downarrow \rangle \ + \mid \downarrow \uparrow \downarrow \downarrow \rangle \ + \mid \uparrow \downarrow \downarrow \downarrow \rangle \\ 
\mid \downarrow \downarrow \downarrow \downarrow \rangle
\end{matrix}$

It's five dimensional, just like a spin-$2$ state!

We can see that the symmetrized basis states are just: all the stars $\uparrow$, all but one star $\uparrow$, all but two stars $\uparrow$, etc.

In [None]:
import qutip as qt
import numpy as np
from magic import *  # spin_XYZ, spin_roots and c_spinor are assumed to be defined in magic
from itertools import permutations, product

def spin_sym_trans(j):
    n = int(2*j)
    if n == 0:
        return qt.Qobj([1])
    N = {}
    for p in product([0,1], repeat=n):
        if p.count(1) in N:
            N[p.count(1)] += qt.tensor(*[qt.basis(2, i) for i in p])
        else:
            N[p.count(1)] = qt.tensor(*[qt.basis(2, i) for i in p])
    Q = qt.Qobj(np.array([N[i].unit().full().T[0].tolist() for i in range(n+1)]))
    Q.dims[1] = [2]*n
    return Q.dag()

def spin_sym(spin):
    spinors = [qt.Qobj(c_spinor(r)) for r in spin_roots(spin)]
    return sum([qt.tensor(*[spinors[i] for i in p]) for p in permutations(range(len(spinors)))]).unit()

def get_phase(v):
    c = None
    if isinstance(v, qt.Qobj):
        v = v.full().T[0]
    i = (v!=0).argmax(axis=0)
    c = v[i]
    return np.exp(1j*np.angle(c))

def normalize_phase(v):
    return v/get_phase(v)

j = 3/2
n = int(2*j + 1)
spin = qt.rand_ket(n)
S = spin_sym_trans(j)

sym = S*spin
spin2 = S.dag()*sym
print(spin)
print(sym)
print(spin == spin2)

sym2 = spin_sym(spin)
print(np.isclose(normalize_phase(sym).full().T[0], normalize_phase(sym2).full().T[0]).all())

So we calculate our symmetrized state in two ways. First, we calculate the symmetrized basis states and make a change-of-basis matrix that transforms our $\mid j, m \rangle$ state into a symmetrized state of $2j$ spin-$\frac{1}{2}$'s. The symmetrized basis states in the order given above are paired with the $\mid j, m \rangle$ states in their natural order, from largest $m$ to smallest $m$. We also show we can undo the transformation without any harm.

Second, we find the roots of the associated polynomial of the $\mid j, m\rangle$ state, upgrade the roots to spinors, and then tensor them in all possible orders, and add up all the permutations, normalizing the state in the end. We find that we get exactly the same symmetrized state, up to a complex phase. So we normalize the phases (basically, impose that the first (non-0) component is real), and behold: it's precisely the same symmetrized state.


One nice thing that immediately follows from this is that we can actually explicitly calculate the X, Y, Z matrices for any spin-$j$. If we wanted the X matrix for spin-$\frac{3}{2}$, for example, we'd just take:

$X_{sym} = (X_{\frac{1}{2}} \otimes I \otimes I) + (I \otimes X_{\frac{1}{2}} \otimes I) + (I \otimes I \otimes X_{\frac{1}{2}})$

This applies the spin-$\frac{1}{2}$ X operator (the Pauli X matrix with the usual factor of $\frac{1}{2}$) to each of the symmetrized spin-$\frac{1}{2}$ guys separately. Then we downgrade this to act on our $\mid j, m \rangle$ vector with $X_{\frac{3}{2}} = S^{\dagger} X_{sym} S$, where $S$ is our change-of-basis matrix.

In [None]:
import qutip as qt
import numpy as np
from magic import spin_XYZ
from itertools import permutations, product

def spin_sym_trans(j):
    n = int(2*j)
    if n == 0:
        return qt.Qobj([1])
    N = {}
    for p in product([0,1], repeat=n):
        if p.count(1) in N:
            N[p.count(1)] += qt.tensor(*[qt.basis(2, i) for i in p])
        else:
            N[p.count(1)] = qt.tensor(*[qt.basis(2, i) for i in p])
    Q = qt.Qobj(np.array([N[i].unit().full().T[0].tolist() for i in range(n+1)]))
    Q.dims[1] = [2]*n
    return Q.dag()

j = 3/2
n = int(2*j + 1)
spin = qt.rand_ket(n)
S = spin_sym_trans(j)

X = S.dag()*sum([qt.tensor(*[qt.jmat(0.5, 'x') if i == j else qt.identity(2)\
            for j in range(n-1)]) for i in range(n-1)])*S
Y = S.dag()*sum([qt.tensor(*[qt.jmat(0.5, 'y') if i == j else qt.identity(2)\
            for j in range(n-1)]) for i in range(n-1)])*S
Z = S.dag()*sum([qt.tensor(*[qt.jmat(0.5, 'z') if i == j else qt.identity(2)\
            for j in range(n-1)]) for i in range(n-1)])*S

print(X == qt.jmat(j, 'x'))
print(Y == qt.jmat(j, 'y'))
print(Z == qt.jmat(j, 'z'))

So anyway this is interesting. We can represent our spin-$j$ state as: a set of roots/$(x, y, z)$ points, a single variable polynomial, a homogeneous two variable polynomial, a $\mid j, m \rangle$ state, and also a state of $2j$ symmetrized spin-$\frac{1}{2}$'s. In the latter case, we can swap any of the symmetrized qubits and this doesn't change the constellation. Indeed, the constellation is "holistically" encoded in the entanglement between the parts and not in the parts (qubits) individually. What does this mean? Well, suppose we just rotate a single one of the symmetrized qubits around the X axis a bit. This won't lead to a rigid rotation of the whole sphere. In fact, we're no longer in a permutation symmetric state! But a local rotation can't change the entanglement between the parts in any way. And yet: if we then rotate the rest of the qubits in exactly the same way, we can recover our original constellation (just rotated around the X-axis)! In other words, the constellation isn't ultimately affected by local rotations of the qubits: we can rotate all the qubits separately locally, but the constellation is still in there somewhere--there will always be a way to undo all those changes and recover the constellation by acting locally and separately on the qubits. In this sense, the constellation is encoded in the entanglement of the whole and not in the parts.

Indeed, we could separate the $2j$ spin-$\frac{1}{2}$ particles and distribute them to $2j$ parties. The constellation is still encoded in them, even though the individual parts might be scattered around the universe!
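The point about local rotations can be checked with a few lines of numpy. This is only an illustrative sketch with made-up helper names (`symmetrize`, `is_symmetric`, `x_rotation`, `kron_all`): it rotates one qubit of a random permutation symmetric three-qubit state about the X axis, confirms the result has left the symmetric subspace, and then applies the same rotation to the other two qubits to land back inside it.

```python
import numpy as np
from functools import reduce
from itertools import permutations

def symmetrize(vec, n):
    """Project an n-qubit state vector onto the permutation symmetric subspace."""
    t = vec.reshape((2,) * n)
    perms = list(permutations(range(n)))
    return sum(np.transpose(t, p) for p in perms).reshape(-1) / len(perms)

def is_symmetric(vec, n):
    return np.allclose(symmetrize(vec, n), vec)

def x_rotation(angle):
    """Spin-1/2 rotation by `angle` about the X axis."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def kron_all(ops):
    return reduce(np.kron, ops)

n = 3
rng = np.random.default_rng(1)
raw = rng.normal(size=2 ** n) + 1j * rng.normal(size=2 ** n)
state = symmetrize(raw, n)
state /= np.linalg.norm(state)

U, I = x_rotation(0.3), np.eye(2)
one_qubit = kron_all([U, I, I]) @ state       # rotate only the first qubit
all_qubits = kron_all([I, U, U]) @ one_qubit  # then rotate the other two the same way

print(is_symmetric(state, n), is_symmetric(one_qubit, n), is_symmetric(all_qubits, n))
```

The final state equals the original with every qubit rotated by the same `U`, which is the symmetrized-qubit picture of a rigid rotation of the whole constellation.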

There's an idea that goes by the name SLOCC: Stochastic Local Operations and Classical Communication. The idea is: we want to describe what's invariant about a multipartite quantum state, in other words, characterize its entanglement structure, supposing we can: act separately with unitaries on the parts, classically communicate with each other (like if we each had one of the parts), and also: entangle our part with some auxiliary system (which will change the entanglement structure), but then measure the auxiliary system (which will restore the entanglement structure). In this context, if we do the same auxiliary operation to all the qubits, we can *boost* the constellation: we can get not just SU(2), but also SL(2,C). By coordinating our local operations, we can always recover the original constellation by local ops: we can always turn local rotations/boosts into global rotations/boosts by making sure each qubit is rotated in the same way.

Another great thing about the symmetrized qubit representation of spin is that it makes calculating spin couplings aka "the addition of angular momentum" very elegant and easy! Normally, we need to calculate a basis transformation in terms of the so-called Clebsch-Gordan coefficients, or find the eigenvalues/eigenvectors of the total spin operator, but we can get the same answer in a very cool way with our symmetrized states.

So what are we talking about? It's a kind of generalization of the Fourier transform to SU(2). Suppose we have two spin-$\frac{1}{2}$ particles tensored together. We normally describe this in terms of the standard tensor basis: $\{\mid \uparrow \uparrow \rangle, \mid \uparrow \downarrow \rangle, \mid \downarrow \uparrow \rangle, \mid \downarrow \downarrow \rangle \}$. But we could also expand it in another basis, for example, in terms of eigenstates of the total spin operator: $J^{2} = \textbf{XX} + \textbf{YY} + \textbf{ZZ}$, where $\textbf{X}$ is the sum of the X operators on each of the qubits, $\textbf{Y}$ is the sum of the Y operators on each of the qubits, etc, each tensored with identities appropriately.


In [None]:
import qutip as qt
import numpy as np
from itertools import permutations  # needed by symmeterize below

def symmeterize(qubits):
    return sum(qt.tensor(*perm)\
        for perm in permutations(qubits, len(qubits))).unit()

j1, j2 = 1/2, 1/2
n1, n2 = int(2*j1+1), int(2*j2+1)
state = qt.rand_ket(n1*n2)
state.dims = [[n1,n2], [1,1]]

X = qt.tensor(qt.jmat(j1, 'x'), qt.identity(n2)) + qt.tensor(qt.identity(n1), qt.jmat(j2, 'x'))
Y = qt.tensor(qt.jmat(j1, 'y'), qt.identity(n2)) + qt.tensor(qt.identity(n1), qt.jmat(j2, 'y'))
Z = qt.tensor(qt.jmat(j1, 'z'), qt.identity(n2)) + qt.tensor(qt.identity(n1), qt.jmat(j2, 'z'))

J = X*X + Y*Y + Z*Z
JL, JV = J.eigenstates()
M = qt.Qobj(np.array([v.full().T[0] for v in JV]))
M.dims = [[n1,n2], [n1,n2]]
state2 = M*state
print(M)
print(state2)

Looking at the rows of $M$, which are the eigenvectors of $J^{2}$, we find that the following provide an alternative basis for the states of two qubits:

$\begin{matrix}
\mid \uparrow \downarrow \rangle \ - \mid \downarrow \uparrow \rangle \\
\mid \uparrow \uparrow \rangle \\
\mid \uparrow \downarrow \rangle \ + \mid \downarrow \uparrow \rangle \\
\mid \downarrow \downarrow \rangle
\end{matrix}
$

Well, we recognize the latter three basis states as just the 3 permutation symmetric states from before! So the last three components in this basis correspond to a spin-$1$ state. The first basis state, in contrast, is the antisymmetric state: there's only one of them, and this corresponds to a spin-$0$ state, a singlet. So we can decompose the tensor product of two spin-$\frac{1}{2}$ states into a "direct sum" or concatenation of a spin-$0$ state and a spin-$1$ state.

$ H_{\frac{1}{2}} \otimes H_{\frac{1}{2}} \rightarrow H_{0} \oplus H_{1}$

I like to think of this like: we can turn a spin-$\frac{1}{2}$ AND a spin-$\frac{1}{2}$ into a spin-$0$ OR a spin-$1$. It's like finding the "spectrum": but instead of a direct sum of "frequencies", we get a direct sum of spin sectors. If the "product/tensor" basis makes manifest the fact that there are two "separate" particles (which might be entangled), the "Clebsch-Gordan basis" makes manifest their togetherness. Moreover, it's worth pointing out that a concatenation of Hilbert spaces is also a Hilbert space. The spin-${0}$ sector and the spin-${1}$ sector are each weighted by an amplitude, like: $\begin{pmatrix} a_{spin-0} \\ b_{spin-1} \end{pmatrix}$. So we can interpret $a$ and $b$ as the probability amplitudes that if two spin-$\frac{1}{2}$'s come in, depending on their state, they'll combine into either a spin-$0$ or a spin-$1$.
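As a small worked example of these sector amplitudes, the sketch below computes the spin-$0$ and spin-$1$ probabilities for a state of two spin-$\frac{1}{2}$'s using plain numpy. The function name `sector_probabilities` is just an illustrative choice; it relies only on the fact that the spin-$0$ sector is spanned by the normalized antisymmetric state, so the spin-$1$ probability is whatever is left over.

```python
import numpy as np

# Normalized antisymmetric (singlet) state in the basis uu, ud, du, dd.
SINGLET = np.array([0, 1, -1, 0]) / np.sqrt(2)

def sector_probabilities(state):
    """Return (P(spin-0), P(spin-1)) for a two-qubit state vector."""
    state = np.asarray(state, dtype=complex)
    state = state / np.linalg.norm(state)
    p_spin0 = abs(np.vdot(SINGLET, state)) ** 2
    return p_spin0, 1 - p_spin0

up, down = np.array([1, 0]), np.array([0, 1])
parallel = sector_probabilities(np.kron(up, up))
antiparallel = sector_probabilities(np.kron(up, down))
print(parallel, antiparallel)
```

Two spins pointing the same way can only combine into a spin-$1$, while two spins pointing in opposite directions split evenly between the two sectors.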

We have:

$ H_{\frac{1}{2}} \otimes H_{1} \rightarrow H_{\frac{1}{2}} \oplus H_{\frac{3}{2}}$

$ H_{\frac{1}{2}} \otimes H_{\frac{3}{2}} \rightarrow H_{1} \oplus H_{2}$

In other words, if we combine a spin-$\frac{1}{2}$ and a spin-$j$, we get either a spin-($j+\frac{1}{2}$) or spin-($j-\frac{1}{2}$).

$ H_{1} \otimes H_{1} \rightarrow H_{0} \oplus H_{1} \oplus H_{2}$

Indeed, $3 \times 3 = 1 + 3 + 5 = 9$. So the dimensionality checks out.

$ H_{1} \otimes H_{2} \rightarrow H_{1} \oplus H_{2} \oplus H_{3}$

As: $3 \times 5 = 3 + 5 + 7 = 15$.

$ H_{1} \otimes H_{\frac{3}{2}} \rightarrow H_{\frac{1}{2}} \oplus H_{\frac{3}{2}} \oplus H_{\frac{5}{2}}$

As: $3 \times 4 = 2 + 4 + 6 = 12$.

$ H_{\frac{3}{2}} \otimes H_{\frac{3}{2}} \rightarrow H_{0} \oplus H_{1} \oplus H_{2} \oplus H_{3}$

As: $4 \times 4 = 1 + 3 + 5 + 7 = 16$.
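The pattern in these examples is the usual rule that the total spin runs from $|j_{1} - j_{2}|$ up to $j_{1} + j_{2}$ in integer steps. A short sketch using exact fractions (the helper names `coupled_spins` and `dimensions_match` are hypothetical) lists the sectors and repeats the dimension count for each case:

```python
from fractions import Fraction

def coupled_spins(j1, j2):
    """Total spins allowed when coupling j1 and j2: |j1-j2|, ..., j1+j2."""
    j1, j2 = Fraction(j1), Fraction(j2)
    lowest, highest = abs(j1 - j2), j1 + j2
    return [lowest + k for k in range(int(highest - lowest) + 1)]

def dimensions_match(j1, j2):
    """Check (2*j1+1)*(2*j2+1) == sum of (2j+1) over the coupled sectors."""
    product_dim = (2 * j1 + 1) * (2 * j2 + 1)
    sum_dim = sum(2 * j + 1 for j in coupled_spins(j1, j2))
    return product_dim == sum_dim

half, three_halves = Fraction(1, 2), Fraction(3, 2)
cases = [(half, half), (half, 1), (1, 1), (1, 2), (1, three_halves),
         (three_halves, three_halves)]
for j1, j2 in cases:
    print(j1, j2, [str(j) for j in coupled_spins(j1, j2)], dimensions_match(j1, j2))
```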

We'll come to the case of more than two spins later.

So we can calculate this decomposition by just diagonalizing the total spin operator (and making sure the basis states are in the right order!). But here's another way.

First we define $\epsilon = \begin{pmatrix} 0 \\ 1 \\ -1 \\ 0 \end{pmatrix}$: it's just the (unnormalized) antisymmetric state of two spin-$\frac{1}{2}$'s. Now suppose we have two spins with $j_{1}$ and $j_{2}$, represented as the symmetrized tensor products of $2j_{1}$ and $2j_{2}$ spinors respectively. We tensor them together. If we then symmetrize over *all* the spinors, we obtain a spin-($j_{1} + j_{2}$) state. Its constellation is just given by: the constellation of the first spin overlaid with the constellation of the second spin. If we had two spin-$\frac{1}{2}$'s, this would be the spin-$1$ state. To get the rest of the states, before we symmetrize, we contract $k$ spinors of the first group with $k$ spinors of the second group using the $\epsilon$. In other words, we contract a spinor from the first group with the first spinor in $\epsilon$ and a spinor from the second group with the second spinor in $\epsilon$. Once we've contracted $k$ spinors from each group, we symmetrize the $2(j_{1} + j_{2} - k)$ spinors which are left. And by going over all the possible $k$'s, we obtain the Clebsch-Gordan decomposition.

Now the normalization of the states is a little tricky so let's ignore that! It's the principle that matters. To get each lower state, we remove a star's worth of angular momentum from each of the two groups: and because we use $\epsilon$ to do our contractions, the overall state gets multiplied by a factor that depends on the degree to which the removed quanta point in opposite directions. (Note because of the permutation symmetry, it doesn't matter which spinors we choose to contract within a group!)
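Here is a stripped-down numpy sketch of a single $\epsilon$ contraction, for a spin-$\frac{1}{2}$ tensored with a spin-$1$ (stored as a symmetrized pair of spinors). It ignores normalization, as in the text, and all names (`EPS`, `random_su2`, `rotate_all`, `contract_first_pair`) are invented for this illustration. What it checks is the property that makes the construction work: because $\epsilon$ is invariant under SU(2), rotating every spinor and then contracting gives the same result as contracting first and rotating the leftover spinor, so the leftover really does transform like a spin-$\frac{1}{2}$.

```python
import numpy as np

EPS = np.array([[0, 1], [-1, 0]], dtype=complex)  # epsilon as a 2x2 array

def random_su2(rng):
    """A random 2x2 unitary with determinant 1."""
    a, b = rng.normal(size=2) + 1j * rng.normal(size=2)
    norm = np.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = a / norm, b / norm
    return np.array([[a, -np.conj(b)], [b, np.conj(a)]])

def rotate_all(tensor, U):
    """Apply the same rotation U to each of the three spinor indices."""
    return np.einsum('ai,bj,ck,ijk->abc', U, U, U, tensor)

def contract_first_pair(tensor):
    """Contract the spin-1/2 index with one spin-1 index using epsilon."""
    return np.einsum('ab,abc->c', EPS, tensor)

rng = np.random.default_rng(2)
s, u, v = (rng.normal(size=2) + 1j * rng.normal(size=2) for _ in range(3))
pair = np.outer(u, v) + np.outer(v, u)    # symmetrized pair of spinors: a spin-1
tensor = np.einsum('a,bc->abc', s, pair)  # spin-1/2 tensored with the spin-1

U = random_su2(rng)
lhs = contract_first_pair(rotate_all(tensor, U))
rhs = U @ contract_first_pair(tensor)
print(np.allclose(lhs, rhs))
```

Because the pair is symmetric, contracting the spin-$\frac{1}{2}$ index with the other spin-$1$ index instead gives the same leftover spinor, which is the sense in which the choice of spinor within a group doesn't matter.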

In [None]:
import numpy as np
import qutip as qt
from itertools import permutations, product
from magic import *

######################################################

def spin_sym_trans(j):
    n = int(2*j)
    if n == 0:
        return qt.Qobj(np.array([1]))
    N = {}
    for p in product([0,1], repeat=n):
        if p.count(1) in N:
            N[p.count(1)] += qt.tensor(*[qt.basis(2, i) for i in p])
        else:
            N[p.count(1)] = qt.tensor(*[qt.basis(2, i) for i in p])
    Q = qt.Qobj(np.array([N[i].unit().full().T[0].tolist() for i in range(n+1)]))
    Q.dims[1] = [2]*n
    return Q.dag()

######################################################

def symmeterize_indices(tensor, dims):
    tensor = tensor.copy()
    old_dims = tensor.dims
    tensor.dims = [dims, [1]*len(dims)]
    n = tensor.norm()
    pieces = [tensor.permute(p) for p in permutations(list(range(len(tensor.dims[0]))))]
    for piece in pieces:
        piece.dims = [[piece.shape[0]], [1]]
    v = sum(pieces)/len(pieces)
    v.dims = old_dims
    return v

######################################################

def clebsch_split(state, sectors):
    v = state.full().T[0]
    dims = [int(2*sector + 1) for sector in sectors]
    running = 0
    clebsch_states = []
    for d in dims:
        clebsch_states.append(qt.Qobj(v[running:running+d]))
        running += d
    return clebsch_states

def possible_j3s(j1, j2):
    J3 = [j1-m2 for m2 in np.arange(-j2, j2+1)]\
            if j1 > j2 else\
                [j2-m1 for m1 in np.arange(-j1, j1+1)]
    return J3[::-1]

def tensor_clebsch(j1, j2):
    J3 = possible_j3s(j1, j2)
    states = []
    labels = []
    for j3 in J3:
        substates = []
        sublabels = []
        for m3 in np.arange(-j3, j3+1):
            terms = []
            for m1 in np.arange(-j1, j1+1):
                for m2 in np.arange(-j2, j2+1):
                    terms.append(\
                        qt.clebsch(j1, j2, j3, m1, m2, m3)*\
                        qt.tensor(qt.spin_state(j1, m1),\
                                    qt.spin_state(j2, m2)))
            substates.append(sum(terms))
            sublabels.append((j3, m3))
        states.extend(substates[::-1])
        labels.append(sublabels[::-1])
    return qt.Qobj(np.array([state.full().T[0] for state in states])), labels

######################################################

j1, j2 = 1/2, 1/2# make sure the smaller spin is first
n1, n2 = int(2*j1+1), int(2*j2+1)
state = qt.rand_ket(n1*n2)
state.dims = [[n1,n2], [1,1]]

######################################################

CG, labels = tensor_clebsch(j1, j2)
CG.dims = [state.dims[0], state.dims[0]]
cg_state = CG*state
cgr = clebsch_split(cg_state, possible_j3s(j1, j2))
for i, r in enumerate(cgr):
    if r.shape[0] != 1:
        cgr[i] = normalize_phase(r.unit())
    else:
        cgr[i] = cgr[i].unit()

######################################################

def repair(q):
    q.dims[1] = [1]*len(q.dims[0])
    return q

S1, S2 = spin_sym_trans(j1), spin_sym_trans(j2)
S = qt.tensor(S1, S2)
together = S*state

a_indices = [2]*(n1-1)
b_indices = [2]*(n2-1)
results = [repair(symmeterize_indices(together, together.dims[0]))]
contracted = [together.copy()]
for i in range(int(2*j1)):
    if contracted[-1].dims[0] == [2,2]:
        results.append(np.sqrt(2)*qt.singlet_state().dag()*contracted[-1])
    else:
        intermediate = qt.tensor(np.sqrt(2)*qt.singlet_state(), contracted[-1])
        intermediate = repair(qt.tensor_contract(intermediate, (0, 2)))
        a_indices.pop()
        if intermediate.dims[0] == [2,2]:
            results.append(np.sqrt(2)*qt.singlet_state().dag()*intermediate)
        else:
            intermediate = repair(qt.tensor_contract(intermediate, (0, len(intermediate.dims[0])-1)))
            b_indices.pop()
            contracted.append(intermediate.copy())
            if intermediate.dims[0] != [2]:
                results.append(repair(symmeterize_indices(intermediate, intermediate.dims[0])))
            else:
                results.append(intermediate.copy())

cgr2 = []
for result in results:
    if result.shape[0] == 1:
        cgr2.append(result.unit())
    else:
        SR = spin_sym_trans(len(result.dims[0])/2)
        temp = repair(normalize_phase(SR.dag()*result))
        if temp.norm() != 0:
            temp = temp.unit()
        cgr2.append(temp)
cgr2 = cgr2[::-1]

print(cgr)
print(cgr2)

To see whether the above code worked, we use qutip's built-in Clebsch-Gordan calculator. We construct the basis transformation given $j_{1}$ and $j_{2}$ by iterating over the possible $j$'s that can result. For each possible output $j$, we iterate over its possible $m$ values. For each one, we iterate over the possible $m_{1}$ and $m_{2}$ values for $j_{1}$ and $j_{2}$: for each $(j_{1}, j_{2}, j, m_{1}, m_{2}, m)$, we get the Clebsch-Gordan coefficient, which weights the state $\mid j_{1}, m_{1} \rangle \mid j_{2}, m_{2}\rangle$. We then sum all those states. We repeat the procedure for each $m$ value of $j$. We stick all those states into a matrix, and that gives us our basis transformation.

Anyway, let's see it in action!

In [None]:
import qutip as qt
import numpy as np
from magic import *
import vpython as vp

######################################################

def clebsch_split(state, sectors):
    v = state.full().T[0]
    dims = [int(2*sector + 1) for sector in sectors]
    running = 0
    clebsch_states = []
    for d in dims:
        clebsch_states.append(qt.Qobj(v[running:running+d]))
        running += d
    return clebsch_states

def possible_j3s(j1, j2):
    J3 = [j1-m2 for m2 in np.arange(-j2, j2+1)]\
            if j1 > j2 else\
                [j2-m1 for m1 in np.arange(-j1, j1+1)]
    return J3[::-1]

def tensor_clebsch(j1, j2):
    J3 = possible_j3s(j1, j2)
    states = []
    labels = []
    for j3 in J3:
        substates = []
        sublabels = []
        for m3 in np.arange(-j3, j3+1):
            terms = []
            for m1 in np.arange(-j1, j1+1):
                for m2 in np.arange(-j2, j2+1):
                    terms.append(\
                        qt.clebsch(j1, j2, j3, m1, m2, m3)*\
                        qt.tensor(qt.spin_state(j1, m1),\
                                    qt.spin_state(j2, m2)))
            substates.append(sum(terms))
            sublabels.append((j3, m3))
        states.extend(substates[::-1])
        labels.append(sublabels[::-1])
    return qt.Qobj(np.array([state.full().T[0] for state in states])), labels

######################################################

j1, j2 = 1,2# make sure the smaller spin is first
n1, n2 = int(2*j1+1), int(2*j2+1)
state = qt.tensor(qt.rand_ket(n1), qt.rand_ket(n2))#qt.rand_ket(n1*n2)
state.dims = [[n1,n2], [1,1]]

######################################################

CG, labels = tensor_clebsch(j1, j2)
CG.dims = [state.dims[0], state.dims[0]]
cg_state = CG*state
poss_js = possible_j3s(j1, j2)
cgr = clebsch_split(cg_state, poss_js)

######################################################

Astate = state.ptrace(0)
AL, AV = Astate.eigenstates()
Bstate = state.ptrace(1)
BL, BV = Bstate.eigenstates()
vsphereA = vp.sphere(pos=vp.vector(-2, 5, 0), radius=j1, opacity=0.5, color=vp.color.blue)
vstarsA = [[vp.sphere(pos=vsphereA.pos + vsphereA.radius*vp.vector(*xyz),\
                      radius=0.2*vsphereA.radius,\
                      opacity=AL[i])\
                 for xyz in spin_XYZ(v)] for i,v in enumerate(AV)]
vsphereB = vp.sphere(pos=vp.vector(2, 5, 0), radius=j2, opacity=0.5, color=vp.color.blue)
vstarsB = [[vp.sphere(pos=vsphereB.pos + vsphereB.radius*vp.vector(*xyz),\
                      radius=0.2*vsphereB.radius,\
                      opacity=BL[i])\
                 for xyz in spin_XYZ(v)] for i,v in enumerate(BV)]

vcgspheres = []
vcgstars = []
colors = [vp.color.red, vp.color.orange, vp.color.yellow, vp.color.green, vp.color.blue, vp.color.magenta, vp.color.cyan]
lengths = [c.shape[0] for c in cgr]
L = sum(lengths)/2
running = -L
for i, c in enumerate(cgr):
    if c.shape[0] == 1:
        z = c.unit()
        vsph = vp.arrow(color=colors[i], opacity=c.norm(), pos=vp.vector(running, -2, 0),\
                        axis=vp.vector(z[0][0][0].real, z[0][0][0].imag, 0))
        vcgspheres.append(vsph)
        vcgstars.append([])
        running += 4
    else:
        vsph = vp.sphere(radius=(c.shape[0]-1)/2,\
                         pos=vp.vector(running, -2, 0),\
                         opacity=c.norm(),\
                         color=colors[i])
        vsts = [vp.sphere(radius=0.2*vsph.radius, \
                          pos=vsph.pos + vsph.radius*vp.vector(*xyz))\
                            for xyz in spin_XYZ(c)]
        vcgspheres.append(vsph)
        vcgstars.append(vsts)
        running += 1.5*c.shape[0]

######################################################

dt = 0.01
H = qt.rand_herm(n1*n2)
H.dims = [[n1, n2], [n1, n2]]
U = (-1j*H*dt).expm()

while True:
    state = U*state

    Astate = state.ptrace(0)
    AL, AV = Astate.eigenstates()
    Bstate = state.ptrace(1)
    BL, BV = Bstate.eigenstates()

    for i, v in enumerate(AV):
        for j, xyz in enumerate(spin_XYZ(v)):
            vstarsA[i][j].pos = vsphereA.pos + vsphereA.radius*vp.vector(*xyz)
            vstarsA[i][j].opacity = AL[i]

    for i, v in enumerate(BV):
        for j, xyz in enumerate(spin_XYZ(v)):
            vstarsB[i][j].pos = vsphereB.pos + vsphereB.radius*vp.vector(*xyz)
            vstarsB[i][j].opacity = BL[i]

    cg_state = CG*state
    cgr = clebsch_split(cg_state, poss_js)
    for i, c in enumerate(cgr):
        if c.shape[0] == 1:
            z = c.unit()
            vcgspheres[i].opacity = c.norm()
            vcgspheres[i].axis = vp.vector(z[0][0][0].real, z[0][0][0].imag, 0)
        else:
            vcgspheres[i].opacity = c.norm()
            for j, xyz in enumerate(spin_XYZ(c)):
                vcgstars[i][j].pos = vcgspheres[i].pos + vcgspheres[i].radius*vp.vector(*xyz)

    vp.rate(2000)

Now there's so much one could talk about here. One could talk about spin networks, originally developed by Penrose in the 1960s. You imagine a network of spin interactions, with edges labeled by $j$ values consistent with the addition of angular momenta. It turns out you can extract probabilities from a closed network without ever specifying the states per se, just the $j$ values at the interactions--and these probabilities happen to be rational numbers. Penrose's motivation was to give an example of how space can emerge out of interaction. His reasoning was that we can only assign a state to a spin relative to some X, Y, Z axes, which presupposes that 3D space exists. So instead of specifying the $\mid j, m \rangle$ state, he just wants to specify the $j$ values. He then wonders what happens when you consider the limit of large networks and high spin $j$'s, and he proves the Spin Geometry Theorem. The idea is this: suppose you have a large network, and in that context, a large $j$ spin. You could imagine it interacts with a single spin-$\frac{1}{2}$ particle. Now the interaction will result in either a $j+\frac{1}{2}$ or a $j-\frac{1}{2}$ state, depending on the angle between the spins. But we know how to calculate probabilities using the spin network itself! (The rules are interesting and relate to our string diagram language from before, where the role of cup and cap is played by the $\epsilon$. But we won't go into details here.) So imagine the network where the interaction results in a $j+\frac{1}{2}$, and get the probability; and then the network where the interaction results in a $j-\frac{1}{2}$, and get the probability. These two probabilities should relate to the angle between the spins' rotation axes, which we haven't determined at all, and so we work backwards from the probabilities to the angles. Actually one imagines doing this twice to separate out classical uncertainty from quantum uncertainty (see his original paper for details!). 
Anyway, you could imagine doing this "experiment" for any spins in your network, but it's not clear that the angles so obtained will be consistent with each other. One might find that A and B have this angle between them, and B and C have that angle between them, which in normal 3D space would imply something about A and C's angle; but here this isn't generally the case. Penrose, however, shows that in the limit of big networks and high spin-$j$, the angles between the spins become consistent with each other, as if they were all embedded consistently in an emergent 3D geometry. 

Such ideas have been explored in other forms in loop quantum gravity. Again, we'll save the details for another time, but for example, one can consider the tensor product of a bunch of spins, and demand that they live in the angular momentum 0 subspace, which may have some number of dimensions. If you do this, then these states act like an "intertwiner," an angular momentum preserving interaction vertex, and the spins can be interpreted as the faces of a *quantum polyhedron* living at that vertex, where the $j$ values now refer to the areas of the faces. It turns out the best way to quantize polyhedra is in terms of Minkowski's theorem, which says that a polyhedron is uniquely specified by the normal vectors to each of its faces, multiplied by their areas, all of which sum to 0 if the polyhedron is closed. But we digress... (Okay one more thing: there's an intimate relationship between quantum spin and Bezier curves, the components of the $\mid j, m \rangle$ vector being like the control points of a complexified Bezier curve! What!)
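To make the closure condition in Minkowski's theorem concrete, here is a minimal classical sketch. The regular tetrahedron, its vertex coordinates, and the helper `area_vector` are illustrative choices of mine, not anything from the quantum construction: we just check that the outward face normals, each scaled by its face's area, sum to zero.

```python
import numpy as np

# Vertices of a regular tetrahedron centered on the origin (illustrative choice).
verts = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)
faces = [(0, 1, 2), (0, 3, 1), (0, 2, 3), (1, 3, 2)]
centroid = verts.mean(axis=0)

def area_vector(face):
    """Outward normal of a triangular face, scaled by the face's area."""
    p, q, r = verts[list(face)]
    n = 0.5 * np.cross(q - p, r - p)   # the length of n is the triangle's area
    # Flip if needed so the vector points away from the centroid.
    if np.dot(n, p - centroid) < 0:
        n = -n
    return n

area_vectors = np.array([area_vector(f) for f in faces])
closure = area_vectors.sum(axis=0)     # vanishes for a closed polyhedron
print(closure)
```

Going the other way (from a set of area vectors summing to zero back to the polyhedron) is the hard direction of the theorem, and this sketch doesn't attempt it.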

For our final destination in this tour of spin angular momentum theory (which itself is just one chapter in this larger story about "atoms"), we have to discuss the "Jordan-Schwinger" representation of a spin, as a fixed energy subspace of two quantum harmonic oscillators. 

A brief review of the quantum harmonic oscillator. 

Its Hamiltonian is basically: $H = \frac{1}{2}(P^{2} + Q^{2})$, where $P$ is the momentum operator and $Q$ is the position operator. One great thing to do is define the creation and annihilation operators:

$a = \frac{1}{\sqrt 2} (Q + iP)$
$a^{\dagger} = \frac{1}{\sqrt 2} (Q - iP)$

So that: $Q = \frac{1}{\sqrt 2} (a^{\dagger} + a)$ and $P =  \frac{i}{\sqrt 2} (a^{\dagger} - a)$.

What's nice about them is that they increase or decrease the energy levels of the harmonic oscillator by 1: you can interpret them as adding or subtracting a "quantum" from the oscillator. So if you can get the state with 0 quanta, the ground state of the oscillator, then you can get all the other states by repeated action of the creation operator. 

The number operator, which counts the number of quanta, is $N = a^{\dagger}a$, and the Hamiltonian can be rewritten $H = N + \frac{1}{2}$.

Now this is often developed using the formalism of position and momentum wave functions, where one relates the Hermite polynomials to the eigenstates of the oscillator. But there's another way to represent the state of a harmonic oscillator: in the energy basis. In that basis, the vacuum state is just $\begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \end{pmatrix}$. 

And in this basis,

$a^\dagger =\begin{pmatrix}           
0 & 0 & 0 & \dots & 0 &\dots \\
\sqrt{1} & 0 & 0 & \dots & 0 & \dots\\
0 & \sqrt{2} & 0 & \dots & 0 & \dots\\
0 & 0 & \sqrt{3} & \dots & 0 & \dots\\
\vdots & \vdots & \vdots & \ddots  & \vdots  & \dots\\
0 & 0 & 0 & \dots & \sqrt{n} & \dots \\
\vdots & \vdots & \vdots & \vdots & \vdots  &\ddots \end{pmatrix}$

$a =\begin{pmatrix}
0 & \sqrt{1} & 0 & 0 & \dots & 0 & \dots \\
0 & 0 & \sqrt{2} & 0 & \dots & 0 & \dots \\
0 & 0 & 0 & \sqrt{3} & \dots & 0 & \dots \\
0 & 0 & 0 & 0 & \ddots & \vdots & \dots \\
\vdots & \vdots & \vdots & \vdots & \ddots & \sqrt{n} & \dots \\
0 & 0 & 0 & 0 & \dots & 0 & \ddots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$

(One thing that's nice is that we can truncate the number of possible quanta in our oscillator and obtain a finite dimensional oscillator representation, although the commutation relations between $P$ and $Q$ won't be exactly satisfied.)
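Here's a minimal NumPy sketch of that truncation. The cutoff `d` is an arbitrary choice of mine, and I'm assuming the normalization $H = \frac{1}{2}(P^{2} + Q^{2})$, with $P = \frac{i}{\sqrt 2}(a^{\dagger} - a)$ obtained by inverting the definitions of $a$ and $a^{\dagger}$. It shows that $[Q, P] = i$ and $H = N + \frac{1}{2}$ hold everywhere except in the very last level, where the truncation bites.

```python
import numpy as np

d = 6  # number of levels kept (0 .. d-1 quanta); an arbitrary illustrative cutoff

# Annihilation operator in the energy basis: sqrt(1), sqrt(2), ... on the superdiagonal.
a = np.diag(np.sqrt(np.arange(1, d)), k=1)
adag = a.conj().T

Q = (adag + a) / np.sqrt(2)
P = 1j * (adag - a) / np.sqrt(2)
N = adag @ a

comm = Q @ P - P @ Q         # would be i * identity without the truncation
H = 0.5 * (P @ P + Q @ Q)    # equals N + 1/2 away from the last level

print(np.round(np.diag(comm), 6))
print(np.round(np.diag(H).real, 6))
```

The last diagonal entry of the commutator comes out as $-i(d-1)$ rather than $i$, which keeps the trace of the commutator zero, as it must be for finite matrices.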

Another way of thinking about this is that we can represent the state of a quantum harmonic oscillator as a *polynomial*, where the powers of $z$ represent the number of quanta. 

$f(z) = n_{0}z^{0} + n_{1}z^{1} + n_{2}z^{2} + n_{3}z^{3} + \dots $

Let's act with the annihilation operator:

$af(z) = n_{1}z^{0} + \sqrt{2}n_{2}z^{1} + \sqrt{3}n_{3}z^{2} + \dots $

Up to the square roots, this is just taking the derivative of the polynomial!

$ \frac{d}{dz} z^{n} = nz^{n-1}$

The creation operator is kind of like a janky version of integration--up to the square roots, it's just multiplication by $z$.

$a^{\dagger}f(z) = n_{0}z^{1} + \sqrt{2}n_{1}z^{2} + \sqrt{3}n_{2}z^{3} \dots $

This analogy can be made rigorous and goes by the name of the Segal-Bargmann representation.
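One way to see how the square roots get absorbed: if we write the state as $f(z) = \sum_{k} n_{k} \frac{z^{k}}{\sqrt{k!}}$ (the rescaling used in the Segal-Bargmann picture), then the annihilation operator becomes exactly $\frac{d}{dz}$ and the creation operator exactly multiplication by $z$. Below is a small numerical check of that claim. The cutoff `d`, the random amplitudes, and the helper names are all just illustrative assumptions.

```python
import numpy as np
from math import factorial

d = 8                                    # illustrative cutoff
rng = np.random.default_rng(0)
n = rng.normal(size=d) + 1j * rng.normal(size=d)   # amplitudes n_0 ... n_{d-1}

a = np.diag(np.sqrt(np.arange(1, d)), k=1)          # truncated annihilation operator
adag = a.T                                           # real matrix, so transpose = dagger

root_fact = np.sqrt([float(factorial(k)) for k in range(d + 1)])

def to_poly(amps):
    """Amplitudes n_k -> coefficients of f(z) = sum_k n_k z^k / sqrt(k!)."""
    return amps / root_fact[:len(amps)]

def from_poly(coeffs):
    """Inverse of to_poly: polynomial coefficients -> amplitudes."""
    return coeffs * root_fact[:len(coeffs)]

def derivative(coeffs):
    """Coefficients of d/dz of the polynomial."""
    return coeffs[1:] * np.arange(1, len(coeffs))

def times_z(coeffs):
    """Coefficients of z times the polynomial."""
    return np.concatenate([[0], coeffs])

lowered = from_poly(derivative(to_poly(n)))   # length d-1
raised = from_poly(times_z(to_poly(n)))       # length d+1 (one level past the cutoff)

print(np.allclose(lowered, (a @ n)[:-1]))
print(np.allclose(raised[:d], adag @ n))
```

Note that multiplication by $z$ pushes one amplitude past the cutoff, which the truncated matrix $a^{\dagger}$ simply drops.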

Okay, so we have a 1D quantum harmonic oscillator. We can view it as a little guy that "counts" quanta: we can add quanta to it, subtract quanta from it, measure the number of quanta. The more quanta, the higher the energy. And a general state of the oscillator is a superposition of different numbers of quanta. What is a quantum? Here, it's like a nugget of energy. It doesn't have an identity of its own: it's like the idea of a pebble.

One of the simplest ways of understanding quantum field theory, at least at first, is via something called "second quantization." Basically, if you start with a quantum system living in some Hilbert space of dimension $d$, then you can construct a theory of a *variable number of indistinguishable copies of that system* by introducing a quantum harmonic oscillator for each basis state of the original quantum system. Hence, second quantization. (We won't get into the eventual difficulties with this view as regards QFT as a whole.) 

Each oscillator counts the number of particles in the state it represents. And because we use a quantum harmonic oscillator to count them, the particles as a whole will always be indistinguishable, like quanta in a harmonic oscillator. For a bosonic field, there can be any number of particles in the same state, and the particles will always be in the permutation symmetric state. For a fermionic field, there can be at most one particle in a given state, and the particles will always be in the permutation antisymmetric state.
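For contrast with the bosonic matrices above, here is a sketch of a single fermionic mode. This is the standard textbook two-level construction, not something developed in this text: the "oscillator" has only two levels (empty and occupied), and the operators obey an *anti*commutation relation, so trying to create a second particle in the same state gives zero.

```python
import numpy as np

# One fermionic mode in the basis (empty, occupied).
c = np.array([[0, 1], [0, 0]], dtype=complex)   # removes the single particle
cdag = c.conj().T                                # adds a particle to the empty state

anticomm = c @ cdag + cdag @ c      # identity: {c, c^dagger} = 1
double_creation = cdag @ cdag       # zero matrix: at most one particle per state
number = cdag @ c                   # occupation number, with eigenvalues 0 and 1

print(anticomm.real)
print(double_creation.real)
print(np.diag(number).real)
```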

Our current theories of physics regard all "particles" as quanta of some quantum field, as indistinguishable nuggets, which can be counted. 

Usually, one imagines: a first quantized quantum wave function, with an amplitude at each position (or momentum). And then one second quantizes, imagining a quantum harmonic oscillator at each location, counting the number of particles at that location (or in that momentum mode), and indeed: this is like a quantized model of a field, with little oscillating springs at each point. And then one can add other things to the first quantized state, like spin or more exotic things, and so forth, and even make it relativistic. (The difficulty comes in dealing with interactions.)

But actually, the simplest "quantum field theory" involves second quantizing a little spin-$\frac{1}{2}$ particle. It has two states, and so we introduce two quantum harmonic oscillators. It's like the first oscillator keeps track of the number of $\uparrow$ quanta, and the second oscillator keeps track of the number of $\downarrow$ quanta. And we'll get a theory of a variable number of permutation symmetric spin-$\frac{1}{2}$'s. But we know what a permutation symmetric state of spin-$\frac{1}{2}$'s is! It's a spin-$j$ particle! And so we can look at our double oscillator as a model of spin with *variable $j$*, where the $j$ value can change, and indeed, we can have a superposition of different $j$ values.

Given two oscillators:

$f(z) = z^{0} + z^{1} + z^{2} + z^{3} + \dots$

$g(w) = w^{0} + w^{1} + w^{2} + w^{3} + \dots $

We can tensor them:

$F(z, w) = z^{0}w^{0} + z^{0}w^{1} + z^{0}w^{2} + z^{0}w^{3} + \dots + z^{1}w^{0} + z^{1}w^{1} + z^{1}w^{2} + z^{1}w^{3} + \dots + z^{2}w^{0} + z^{2}w^{1} + z^{2}w^{2} + z^{2}w^{3} + \dots + z^{3}w^{0} + z^{3}w^{1} + z^{3}w^{2} + z^{3}w^{3} + \dots$

But this can be rearranged:

$F(z, w) = \Big{\{} z^{0}w^{0} \Big{\}} + \Big{\{} z^{1}w^{0} + z^{0}w^{1} \Big{\}} + \Big{\{} z^{2}w^{0} + z^{1}w^{1} + z^{0}w^{2} \Big{\}} + \Big{\{} z^{3}w^{0} + z^{2}w^{1} + z^{1}w^{2} + z^{0}w^{3} \Big{\}} + \dots$

So we see that the two variable polynomial is a sum of *homogeneous* polynomials of degree 0, 1, 2, 3, ..., which span spaces of dimension 1, 2, 3, 4, ... So each sector of the Hilbert space corresponding to a given total energy $n$ can be interpreted as a spin-$\frac{n}{2}$ state. In other words, $n$ is just $2j$, the number of stars.
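As a quick sanity check on the counting, we can build the total number operator for two truncated oscillators and tally how many basis states share each total. The cutoff `d` is an illustrative assumption, and only sectors with total $n \leq d-1$ are unaffected by it.

```python
import numpy as np

d = 5                                   # levels per oscillator (illustrative cutoff)
N1 = np.diag(np.arange(d))              # number operator of a single oscillator
I = np.eye(d)

# Total number of quanta on the tensor product of the two oscillators.
N_total = np.kron(N1, I) + np.kron(I, N1)
totals = np.diag(N_total).astype(int)

# Dimension of each fixed-total sector that fits below the cutoff.
sector_dims = {n: int(np.sum(totals == n)) for n in range(d)}
print(sector_dims)
```

Each sector with $n$ quanta has dimension $n + 1 = 2j + 1$, matching a spin-$\frac{n}{2}$.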

Furthermore, we can upgrade any 2x2 operator to act globally on the whole space via a simple map:

$ \sum_{i, j} a_{i}^{\dagger} O_{i, j} a_{j} $

Or: $\begin{pmatrix} a_{0}^{\dagger} & a_{1}^{\dagger} \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a_{0} \\ a_{1}\end{pmatrix}$, where the $a_{i}$'s, recall, are matrices.

If we upgrade the X operator, it will act as an X operator on each of the spin-$j$ subspaces. If we use it to rotate, it'll rotate them all at once, although at a speed proportional to the $j$ value.
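Here is a rough NumPy sketch of this upgrade map, under a few assumptions of mine: both oscillators are truncated at `d` levels, the 2x2 operators are the Pauli matrices divided by 2, and `second_quantize` is a hypothetical helper name. The check is that the upgraded operators obey the angular momentum algebra and that $J^{2} = j(j+1)$ with $j = \frac{n}{2}$ in each sector, restricted to sectors with total $n \leq d-1$, where the truncation has no effect.

```python
import numpy as np

d = 5                                            # levels per oscillator (illustrative)
a = np.diag(np.sqrt(np.arange(1, d)), k=1)
I = np.eye(d)
modes = [np.kron(a, I), np.kron(I, a)]           # a_0 and a_1 on the joint space

def second_quantize(O):
    """Upgrade a 2x2 matrix O to sum_ij a_i^dagger O_ij a_j."""
    return sum(O[i, j] * modes[i].conj().T @ modes[j]
               for i in range(2) for j in range(2))

sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]], dtype=complex) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

Jx, Jy, Jz = (second_quantize(s) for s in (sx, sy, sz))
N_total = second_quantize(np.eye(2, dtype=complex))
totals = np.rint(np.diag(N_total).real).astype(int)

# Keep only the sectors that the cutoff leaves untouched.
keep = np.flatnonzero(totals <= d - 1)
block = np.ix_(keep, keep)

comm = Jx @ Jy - Jy @ Jx
J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz
j_values = totals / 2

print(np.allclose(comm[block], 1j * Jz[block]))
print(np.allclose(J2[block], np.diag(j_values[keep] * (j_values[keep] + 1))))
```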

We can also form a creation operator that creates a star at a given location. Given some spinor $\begin{pmatrix} \alpha \\ \beta \end{pmatrix}$, the star creation operator is $a_{star}^{\dagger} = \alpha a_{0}^{\dagger} + \beta a_{1}^{\dagger}$. And so naturally, we can decompose a spin-$j$ state into $2j$ spinors, and so lift a constellation into the double harmonic oscillator Hilbert space. Or we can form a homogeneous polynomial in the creation operators:

$ a_{constellation}^{\dagger} = \sum_{i=0}^{2j} \frac{c_{i}}{\sqrt{i!(2j-i)!}} (a_{0}^{\dagger})^{2j-i}(a_{1}^{\dagger})^{i} $

where the $c_{i}$'s are the components of the $\mid j, m \rangle$ state: $\begin{pmatrix} c_{0} \\ c_{1} \\ c_{2} \\ \vdots \end{pmatrix}$.
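A small sketch of this lifting, with an example spin-$\frac{3}{2}$ state whose components I've made up for illustration. It builds the polynomial in the creation operators, applies it to the vacuum, and compares the result with simply placing each $c_{i}$ on the basis state with $2j-i$ quanta in the first oscillator and $i$ in the second. The cutoff `d` just needs to exceed $2j$.

```python
import numpy as np
from math import factorial

d = 5                                     # levels per oscillator; must exceed 2j
two_j = 3                                 # a spin-3/2 example
a = np.diag(np.sqrt(np.arange(1, d)), k=1)
I = np.eye(d)
a0dag = np.kron(a, I).T                   # real matrices, so transpose = dagger
a1dag = np.kron(I, a).T

c = np.array([0.5, 0.5j, -0.5, 0.5])      # made-up normalized components c_0 .. c_3

vacuum = np.zeros(d * d, dtype=complex)
vacuum[0] = 1

# Homogeneous polynomial of degree 2j in the two creation operators.
a_constellation_dag = sum(
    c[i] / np.sqrt(factorial(i) * factorial(two_j - i))
    * np.linalg.matrix_power(a0dag, two_j - i) @ np.linalg.matrix_power(a1dag, i)
    for i in range(two_j + 1)
)
state = a_constellation_dag @ vacuum

# Direct embedding: c_i sits on the basis state |2j - i, i>.
expected = np.zeros(d * d, dtype=complex)
for i in range(two_j + 1):
    expected[(two_j - i) * d + i] = c[i]

print(np.allclose(state, expected))
```

The factor $\frac{1}{\sqrt{i!(2j-i)!}}$ is exactly what cancels the square roots picked up by the repeated creation operators, so the lifted state stays normalized.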

Once we're working with this representation, it's interesting to consider the meaning of the position operators associated to each oscillator. If one examines the eigenstates of the X, Y, Z operators, one finds that they correspond to diagonal/antidiagonal oriented states, circular states, and vertical/horizontal oriented states in the plane: one imagines the 2 1D oscillators at right angles, so that we have a quantum harmonic oscillator in 2D. This immediately makes one think of the polarization of light.

Indeed, imagine a light wave is rushing toward you. It has to move forward at the speed of light, as always, but it can oscillate in the plane orthogonal to its motion. This gives light a corkscrew character, and this is called its polarization. You can send light through a polarizing filter to filter out horizontally polarized light, circularly polarized light, etc. So we have a model of the polarization of light, oscillating in the 2D plane orthogonal to its motion.

If we imagine the simplest model of polarization, we imagine that light can generally oscillate in an ellipse. There is an intimate connection between the sphere and the ellipse. Since what is an ellipse, but a circle seen askew in 3D?

![](img/polarization_ellipse.jpeg)

Indeed, if you have a qubit quantized along the Y axis (in other words, the left/right circularly polarized axis), if you rotate its overall phase, while taking the real parts of the two components to be $(x, y)$ points on the plane, it makes the appropriate ellipse. There is a relationship here to the old theory of epicycles.
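Here's a minimal sketch of that phase-winding picture. I'm assuming the convention where the two components of the qubit are the horizontal and vertical amplitudes (so the Z eigenstates are horizontal/vertical, the X eigenstates diagonal/antidiagonal, and the Y eigenstates circular). The function name `trace_polarization` is hypothetical.

```python
import numpy as np

def trace_polarization(spinor, samples=200):
    """Real parts of the two components as the overall phase winds once around.

    Returns an array of shape (samples, 2) whose columns are the x and y points.
    """
    phases = np.exp(1j * np.linspace(0, 2 * np.pi, samples, endpoint=False))
    points = np.outer(phases, np.asarray(spinor, dtype=complex))
    return points.real

circular = np.array([1, 1j]) / np.sqrt(2)    # eigenstate of Pauli Y
diagonal = np.array([1, 1]) / np.sqrt(2)     # eigenstate of Pauli X
horizontal = np.array([1, 0])                # eigenstate of Pauli Z

circle_pts = trace_polarization(circular)
diagonal_pts = trace_polarization(diagonal)
horizontal_pts = trace_polarization(horizontal)

# The Y eigenstate sweeps out a circle; the X and Z eigenstates collapse to lines.
print(np.ptp(np.hypot(circle_pts[:, 0], circle_pts[:, 1])))
```

A generic spinor gives something in between: a tilted ellipse.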

So this gives us a model for the polarization state of a photon, ignoring its momentum. Our double oscillator construction gives us a model of potentially several photons, all sharing the same momentum. Each photon has a qubit representing its polarization state, and photons are a bosonic field, so the qubits must be permutation symmetric. (A photon is technically a spin-$1$ particle, but because it's massless, it has only two possible states, hence its polarization is described by a qubit.)

And so, after moving from the plane to the sphere, we return again to the plane.

Anyway, check out [oscillators.py](code/oscillators.py)! One can choose a max number of quanta in each of the two oscillators, as well as a spin state to be loaded into the double oscillator Hilbert space, and a Hamiltonian to evolve with (X, N, a random H, etc). One sees above a red sphere representing the original spin state (just to make sure it's loaded in correctly!), and below a series of blue spheres representing the spin-$0$, spin-$\frac{1}{2}$, spin-$1$, $\dots$ states that the oscillator state decomposes into. Their opacity is just the norm of that sector. In the middle, one sees the plane, with little arrows at each point representing the amplitude at that location. Using the keyboard, one can measure the X, Y, Z, N (number), and Q (position) operators with "x", "y", "z", "n", and "q". And with "i" one can reset to a random state. Play around with it!

Finally, instead of using SU(2), we could use SU(3), which acts on 3D states, and so then we'd have 3 harmonic oscillators, but actually we need 6, since SU(3) is rank 2. And so on, and so forth. And so we can construct theories of a variable number of indistinguishable "particles" with different symmetries, which just like in the spin case can be interpreted as a superposition of possible higher order states, which can also be represented by an unordered collection of states *to be symmetrized*, corresponding to the "roots."

<hr>

At last, it's time to conclude. 

We began (1) by investigating our most basic idea of a "pebble." We could place a pebble somewhere, and we could take it away. To communicate with a pebble, we had to agree on whether presence or absence would be significant. 

We then engaged in a mathematical dialectic. We imagined that anything we could do once, we could do again. So we could place another pebble, and another, and another. And make a little pile of indistinguishable pebbles which we could also take away from. And so we invented the counting numbers (2). The rule was we could repeat an action, but we always had to be able to go backwards too. Indeed, to communicate with a counting number, we have to agree on what number we start counting from. We can conceive of this as a contextualization of our original pebble: this pebble is actually the 4th pebble, if you've been keeping count, and to keep count is: to build up a little pile. So the pile is a contextualization of the pebble.

Then (3), we could imagine repeating "counting" itself, counting "all at once," and so we invented addition, or combining piles. We had to have an inverse, and so we invented the negative numbers, when it became clear that we could subtract past 0. And thus, we found the "integers." To communicate with an integer, we have to agree on what is 0, and also which direction we're in, positive or negative.

Then (4), we imagined iterating addition to get multiplication, whose inverse is division. We thus found the rational numbers, or piles of "prime pebbles." We can now yet further contextualize: we can specify a rational number which translates between your units and my units, between our different ideas of "1". We also noticed that allowing division by 0 wraps the number line into a circle.

Then (5), we imagined iterating multiplication to get exponentiation, whose inverse is root-taking. We thus discovered the irrational numbers like $\sqrt 2$, but also the complex numbers $a + b\sqrt{-1}$. We could thus rotate between two coordinate axes, turning irrationals into rationals, and back. We investigated the meaning of a "limit" and how at this stage the numbers turn reflexive, leading us to reflect on the general theory of computation and its limitations. We observed that no single consistent set of logical atoms powerful enough to axiomatize arithmetic is complete in the sense that it can prove all true theorems about itself, and indeed, one such undecidable statement is the consistency of that formal system itself, in other words, whether it leads to a contradiction. In order to prove the consistency of such a logical system, one has to move to a larger, more powerful system, adding axioms, which leads to yet other undecidable truths, which can be decided with yet more axioms. The point is that one can't start from a single set of axioms and rules for inference and imagine a machine trying out every possible rearrangement of symbols and thus proving all possible theorems. And so we proved that conceptually, there can't be a single set of "master concepts" from which all concepts can be mechanically derived. Hence, one requires a *dialectical* unfolding of mathematical ideas alongside the purely deductive. And indeed, that's exactly what we've been doing.

Then (6), we began by considering an unordered set of complex numbers, which we associated to points on the plane/sphere, and realized we could interpret them as the roots of a polynomial, giving us a yet higher order version of the idea of a "pile of pebbles": an *equation*. And we developed an interpretation of polynomials as "quantum states," indeed, as the states of spin-$j$ particles. Polynomials brought with them the idea of vector spaces, and we realized that polynomials were only defined up to a set of basis vectors, and that Hermitian matrices, which are observables in quantum mechanics, provide such basis sets via their eigenvectors. And so, to give context to our polynomials, to correctly communicate a quantum state, we had to specify a set of basis polynomials. Linear algebra provides the theory of generalized "perspective switches," and we realized we could generalize our idea of perspective to that of an "experimental situation," by which a state is filtered probabilistically into outcome states, which provide a complete basis for the state: re the Stern-Gerlach apparatus. And we realized the importance, therefore, of unitary representations. 

Vector spaces, as well, bring with them the idea of the tensor product, and we realized that the symmetric tensor product of $2j$ spin-$\frac{1}{2}$ states gives us a representation of a spin-$j$ state. And this was interesting because it involved entanglement between potentially spatially separate systems. In one of the great defeats of all time, entanglement proves that any simple reductionism can't work in science. When particles get entangled, there is more information in the whole than in the parts. For example, in the antisymmetric state, two spin-$\frac{1}{2}$ particles are each maximally uncertain with regard to their rotation axis, but their entanglement means that they must always point in opposite directions. So that if one is measured to be $\uparrow$, the other one must be $\downarrow$. Of course, through repeated experimentation on the two particles, one could precisely determine the quantum state of the whole. But for each instance of the experiment, there is more information contained in the two particles together than can be separated into the two spatially separate parts. To wit, we found that our constellation was encoded not individually in any of the symmetric spin-$\frac{1}{2}$ particles, but in the entanglement between them. What's more, however, when we consider the constellation itself: it is just a simple juxtaposition, a product of roots; so that it seems that even if we have a situation where an entangled whole is greater than its parts, from another perspective, we can describe it as a whole (constellation) which is a simple sum of its parts (stars). We digressed to discuss the theory of Clebsch-Gordan coefficients, and the idea that the tensor product of a bunch of spins could be split into separate sectors, turning the AND of the tensor product into the OR of a choice. And so we arrived at the theory of angular momentum conserving interactions, and alluded to spin networks. 
(Indeed, this construction can be generalized to other types of interactions that conserve other quantities besides angular momentum.) 
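The antisymmetric example above is easy to check by hand in NumPy. This is just a sketch with made-up variable names: each particle on its own is in the maximally mixed state (no preferred axis at all), yet the two are never found pointing the same way along Z.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# The antisymmetric (singlet) state of two spin-1/2 particles.
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

# Reduced states of each particle: trace out the other one.
rho = np.outer(singlet, singlet.conj()).reshape(2, 2, 2, 2)   # indices (a, b, a', b')
rho_A = np.einsum('abcb->ac', rho)
rho_B = np.einsum('abac->bc', rho)

# Joint outcome probabilities for measuring both particles along Z.
p_same = (abs(np.vdot(np.kron(up, up), singlet)) ** 2
          + abs(np.vdot(np.kron(down, down), singlet)) ** 2)
p_up_down = abs(np.vdot(np.kron(up, down), singlet)) ** 2

print(rho_A.real)      # half the identity: maximally uncertain
print(p_same)          # the particles never agree
```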

We then discussed the idea of second quantization. We realized that polynomials (without the Majorana interpretation, in other words, generally infinite dimensional) also provide a representation of a quantum harmonic oscillator, full of indistinguishable, countable energy quanta. And we imagined introducing a quantum harmonic oscillator to each degree of freedom of a first quantized quantum system. In our case, we introduced two harmonic oscillators, one each for the $\uparrow$ state and the $\downarrow$ state of a spin-$\frac{1}{2}$. And the fixed total energy subspaces of this Hilbert space turned out to correspond to spin-$0$, spin-$\frac{1}{2}$, spin-$1$, $\dots$ states, each of them indeed being a permutation symmetric state of spin-$\frac{1}{2}$'s.

And so we developed a representation capable of expressing a superposition of spins with different $j$ values, which was also a representation of the polarization of light. It turned out to be a theory of indistinguishable particles, and all known actual particles are of this type, quanta of some quantum field, albeit much more complex. It's then clear we could rephrase our theory of spin in terms of the repeated measurements of "the number of quanta in the $\uparrow$ oscillator" and "the number of quanta in the $\downarrow$ oscillator," and that due to second quantization, we can think about any measurement, in some sense, as being reducible to the measurement of *some* number operator. And so we've come full circle: we can now contextualize our pebbles as counting the number of quanta of some mode of a quantum field. Indeed, we could say that the world is conceptually made of "pebbles" insofar as we can use pebbles, appropriately contextualized, to represent it. And any actual clay pebble has emerged in some limit of this quantum theory.

But one thing that is also clear is that there can be many perspectives conceptually on the same quantum system. We have a constellation. Is it a spin-$j$ particle? Is it $2j$ spin-$\frac{1}{2}$'s, is it two quantum harmonic oscillators, is it the polarization of a beam of photons? The interpretation is physically fixed by how the system interacts with the rest of the world; and conceptually each new  interpretation brings with it a whole unheralded set of ideas and connections to the rest of the world.

Although there is a notion of identity that persists through different physical representations. If my spin is entangled with other spins, then if I split it into symmetrized spin-$\frac{1}{2}$'s, then these guys as a whole will still be entangled with the rest of the world in the same way. So that even if something is transformed into different physical systems it still retains its unique connection to the rest of the universe. To wit, I can load a given quantum state into a quantum computer made of spins, of light, of whatever, and this won't in principle affect its entanglement in any way. So it really is *that unique non-clonable* quantum state that you have there, and not merely a *representation*, a *copy* of it, as would be the case if you loaded in some classical data into your computer.

Moreover, when one works in relativistic quantum field theory, a basic demand is that the vacuum state (with no particles) is Lorentz invariant. In other words, translated, rotated, boosted observers all agree on the vacuum state, indeed, on the idea of "what is a particle." This theory leads to Wigner's classification scheme. The idea is that what we mean by a "particle" anyway is whatever is invariant, or the same, whether we rotate it, whether we translate it, whether we translate by it, whether we see it while moving at top speed, etc. In other words, we define a particle in terms of what we can do that leaves it the same, that thing which is invariant underneath the different perspectives we might have on it, in this case, spatio-temporal perspectives. Hence the importance of group theory. Indeed, today in the field of neural networks, people are starting to build networks which respect the underlying group structure of some domain, learning representations of images that are translation invariant in the plane, and so forth. The idea being that a neural network should be able to recognize the identity of something even if it's seen askew.

But when one moves to field theory in a curved space, one must employ (not necessarily unitary) symplectic transformations between perspectives. Famously, in the Unruh effect, while a stationary observer in the vacuum measures 0 particles, an accelerated observer experiences a different vacuum state, indeed, one with a non-0 number of particles! A symplectic transformation is one that maps creation and annihilation operators to different creation and annihilation operators while preserving the fact that they are, indeed, creation and annihilation operators. So that the $a/a^{\dagger}$'s in the accelerated reference frame are built out of the $a/a^{\dagger}$'s in the rest frame, in a particular way relating to the acceleration. Indeed, there is a class of symplectic transformations known as Bogoliubov transformations, which can turn any Hamiltonian quadratic in the creation and annihilation operators into a simple oscillator Hamiltonian: $\sum_{i} a_{i}^{\dagger}a_{i}$, etc. This can be used in condensed matter, to describe how in certain systems electrons pair up to rove around as meta-particles. So that, what is one set of particles from one point of view may be another set of particles from another point of view. This reaches its culmination in holographic theory, where, for example, a conformal field theory on the boundary of some space (like on the surface of the sphere) is from another point of view a gravitational theory in the interior of that space (like in the interior of the sphere), usually an anti-de Sitter space, one with negative curvature, so that signals going off to infinity return in finite time--just like the inside of a black hole. Indeed, that was the motivation: one wants to regard the seemingly inaccessible quantum state of the interior of the black hole (including the things that fall into it) as being, from another perspective, the quantum state that lives on a surface, the event horizon (and also the Hawking radiation). 
In these models, there is an interesting connection between phenomena that extend across large scales of the boundary and phenomena that occur deep in the interior of the bulk of the "emergent spacetime."
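To get a feel for how "one observer's vacuum is another observer's particles" can even happen, here is a single-mode toy Bogoliubov transformation. To be clear, this is only a sketch under my own assumptions (one truncated mode, an arbitrary squeezing parameter `r`), not a model of the Unruh effect or of holography. The new operator $b = \cosh(r)\, a + \sinh(r)\, a^{\dagger}$ is still a legitimate annihilation operator, but the state with no $a$-quanta contains $\sinh^{2}(r)$ $b$-quanta on average.

```python
import numpy as np

d = 10        # truncation of the single mode (illustrative)
r = 0.7       # squeezing parameter (arbitrary choice)

a = np.diag(np.sqrt(np.arange(1, d)), k=1)
adag = a.T

u, v = np.cosh(r), np.sinh(r)     # satisfy u^2 - v^2 = 1
b = u * a + v * adag              # new annihilation operator, mixing a and a^dagger
bdag = b.T                        # everything is real, so transpose = dagger

# Away from the cutoff, b and b^dagger obey the same commutation relation as a and a^dagger.
comm = b @ bdag - bdag @ b

# The a-vacuum is not empty as far as the b-number operator is concerned.
vacuum = np.zeros(d)
vacuum[0] = 1
b_quanta_in_a_vacuum = vacuum @ (bdag @ b) @ vacuum

print(b_quanta_in_a_vacuum, np.sinh(r) ** 2)
```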

And so we realize that even a quantum field has to be contextualized with reference to some observer: one inside the black hole vs one outside.

Indeed, step by step, as it has unfolded, our story has been as much about translating between the perspective of observers (so that communication can be successful) as it has been about the pebbles which are passed between them. And so we have to ask: what is an observer, anyway? We have a theory that allows observers to come to agreement about the way the world is independent of their perspectives, using mathematics to represent the relationship between their points of view, so that they don't get out of synch. But the observers themselves, as it were, stand outside the theory as such, as *users* of the theory, which provides a reliable means of guaranteeing communication and agreement.

But before we get there, some loose ends.

There is something kinda magical about the mathematics behind quantum spin. It's almost too easy, the sphere the center of a remarkable series of coincidences, the way we can view points on n-spheres as (n-2)-stars on the 2-sphere, that the space of quantum states of a spin-$\frac{1}{2}$ particle is just the same as the space of the classical states of a spinning top. There is a theory that generalizes this idea called "Geometric Quantization," and the full story isn't complete. But generally, one investigates how copies of a classical system can be interpreted as states of a corresponding quantum system. 

The concept of coherent states also applies to the harmonic oscillator: the simplest example is a Gaussian wave packet, and indeed they provide a basis for all the states of the oscillator, just as the coherent states of $2j$ points at the same place on the sphere, across the sphere, provide a basis of states for a spin-$j$ particle. (Incidentally, you can look at an infinite dimensional polynomial as being determined by an infinite number of roots, but there are some subtleties, re: Weierstrass factorization theorem.)

I bring this up because it's very nice to be able to say, with regard to a spin-$\frac{1}{2}$, say: well, it's pointed in the X+ direction, but you asked, is it $\uparrow$ or $\downarrow$ in the Z direction, and that's a dumb question since it's equally both, and so you get a dumb answer: it's $\uparrow$ or $\downarrow$ at random. And that, even though the system can be quantized in terms of these outcomes ($Z-\uparrow$ and $Z-\downarrow$), in itself, it's just pointed along the X+ direction, perfectly definitely. In other words, superpositions of constellations are sensible constellations as well, unlike the situation of a dead cat plus an alive cat.
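To make the X+ example concrete, here is a minimal NumPy sketch. It assumes the usual conventions (Pauli matrices, $Z-\uparrow = (1, 0)$, $Z-\downarrow = (0, 1)$); the helper names `z_probabilities` and `bloch_vector` are just made up for illustration. It shows that the state gives 50/50 odds for the Z question, while its expectation values pick out a single definite direction on the sphere.

```python
import numpy as np

# Pauli matrices (standard convention, assumed here).
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Z-up and Z-down basis states, and their equal superposition X+.
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
x_plus = (up + down) / np.sqrt(2)

def z_probabilities(state):
    """Born-rule probabilities for the outcomes Z-up and Z-down."""
    return abs(np.vdot(up, state)) ** 2, abs(np.vdot(down, state)) ** 2

def bloch_vector(state):
    """Expectation values (<X>, <Y>, <Z>): the direction the spin points in."""
    return np.array([np.vdot(state, s @ state).real for s in (sx, sy, sz)])

p_up, p_down = z_probabilities(x_plus)
direction = bloch_vector(x_plus)

print("P(Z up), P(Z down):", p_up, p_down)
print("direction on the sphere:", direction)
```

The "dumb question" shows up as the two equal probabilities; the "perfectly definite" state shows up as a unit vector along X.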

But what about a position state, where the particle is in a superposition of being at location A and location B? I ask, is it at A? And it's a dumb question because it's at A *and* at B, but I forced it to choose, and so maybe it's there or not. But in the spin case, the equivalent of being at A *and* B was a perfectly comprehensible, classical, if you will, state: it was just pointed in some direction on the sphere. But classically, you can't have one thing in two places at once. And so it seems like our luck has run out.

Arguably, we can't answer this question until we have a final theory of quantum gravity, and really nail down what we mean by "position" anyway. But here's something I was thinking about.

Maybe it's like, if the particle is at A *and* B for me, whereas I'm definitely here, then perhaps symmetrically for the particle, it's definitely "here", and I'm the one at two possible locations.

On that same note, one could try (and some have tried) to replace the idea of a position with that of a "view." Whereas I can't imagine a superposition of "particle at A" and "particle at B" being itself a classical ordinary state of a particle's position, I could imagine that a superposition of a particle being seen from the left and a particle being seen from the right is just: particle being seen from head on.

And so, I leave you on this note. There is an interesting conjecture by Atiyah which goes like this: arrange $n$ non-coincident points in 3D, imagine a little sphere around each point, and imagine drawing straight lines between the points. Where the line from a point toward each other point passes through that point's sphere, put a star. So each point has a constellation on a sphere associated to it, which is its "view" of the other points. So we have $n$ constellations, each of $n-1$ stars. We can interpret these $n-1$ stars as an $n$-dimensional spin state, or as a polynomial with $n$ coefficients. Atiyah has conjectured, and no one has found a counterexample, that these states are always linearly independent, and so form a basis for constellations of $n-1$ stars. In other words, the mutual views of each other of points arranged in 3D always form a basis of the space of possible views. Of course, not all collections of constellations form consistent views. Nor is this in general an orthogonal basis, and so the components can't be interpreted so neatly as probability amplitudes. But given an $(n-1)$-star constellation, we can express it in the basis provided by a collection of $n$ views, and so give a kind of weight to the degree to which that constellation is "at" those vantage points.
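Here is a rough numerical sketch of the construction, under some assumptions of mine: each direction is turned into a unit spinor $(a, b)$ via one particular choice of stereographic coordinate, each star becomes the linear factor $bz - a$ (so a star at infinity is just a constant factor), and a view is the product of its factors. The function names are hypothetical, and other conventions differ by phases or a change of basis, which doesn't affect linear independence.

```python
import numpy as np

def direction_to_spinor(d):
    """Unit spinor (a, b) for a direction d in 3D; a/b is a stereographic coordinate.
    Two proportional formulas are used so neither pole causes a division by zero."""
    x, y, z = d / np.linalg.norm(d)
    if z >= 0:
        s = np.array([1 + z, x + 1j * y], dtype=complex)
    else:
        s = np.array([x - 1j * y, 1 - z], dtype=complex)
    return s / np.linalg.norm(s)

def view_polynomial(points, i):
    """Coefficients (highest power first) of point i's view of the other points:
    the product of one linear factor b*z - a per star."""
    coeffs = np.array([1.0 + 0j])
    for j, p in enumerate(points):
        if j == i:
            continue
        a, b = direction_to_spinor(p - points[i])
        coeffs = np.convolve(coeffs, np.array([b, -a]))  # polynomial multiplication
    return coeffs

def atiyah_matrix(points):
    """n x n matrix whose i-th row is the view polynomial of point i."""
    points = np.asarray(points, dtype=float)
    return np.array([view_polynomial(points, i) for i in range(len(points))])

def atiyah_determinant(points):
    """Nonzero exactly when the n views are linearly independent."""
    return np.linalg.det(atiyah_matrix(points))

def weights_in_view_basis(points, target):
    """Expand a target constellation (given as n polynomial coefficients)
    as a weighted sum of the n views: sum_i w[i] * view_i = target."""
    return np.linalg.solve(atiyah_matrix(points).T, np.asarray(target, dtype=complex))

# Example: four points at the corners of a tetrahedron-like arrangement.
points = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
print("|det| =", abs(atiyah_determinant(points)))

# Weights of an arbitrary 3-star constellation (here z^3 + 1) on the four views.
print("weights:", weights_in_view_basis(points, [1, 0, 0, 1]))
```

Since the views aren't orthogonal, the weights shouldn't be read as probability amplitudes; they're just the coordinates of the constellation in this particular basis. As a sanity check, for points lined up along the z-axis the matrix comes out as a signed permutation matrix, so the determinant has magnitude 1.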

Who knows! But it is certainly true that our basic experience of being situated in the world is not being "at a position" per se, but instead: being surrounded by a sphere of incoming momenta, carrying views of the world with them. (Atiyah generalizes his conjecture to hyperbolic space, Minkowski space, curved space, more general manifolds, etc.) (And it's also worth noting Atiyah arrived at the conjecture following up on Berry and Robbins's attempt at proving the spin-statistics theorem, which says that half-integer spin particles are antisymmetric under exchange in space (fermions) whereas integer spin particles are symmetric under exchange in space (bosons), without invoking relativistic quantum field theory. The proof hinges on the use of the oscillator representation, using four oscillators to represent two spins, and building X, Y, Z rotation operators that act across the spins, rotating the one into the other.) (At another time, we'll show off some visualizations of what Atiyah is going on about.)