# DC4 FFT Part 1

## 207 Polynomial Multiplication

What we've seen so far is how to use divide and conquer in a clever way to multiply large integers. For n-bit integers, we were able to compute their product in time better than O(n^2). A similar idea applies to matrices as well. What we're going to do now is multiply polynomials, and to do this we're going to use the beautiful divide and conquer algorithm known as FFT. FFT stands for fast Fourier transform.

Here's the setup. We have a pair of polynomials A(x) and B(x). For A(x), we'll denote its coefficients as a_0, a_1, a_2, up to a_{n-1}, so it's of degree at most n - 1. For B(x), we'll denote its coefficients as b_0, b_1, up to b_{n-1}, so it also is of degree at most n - 1. Our goal is to compute the product polynomial C(x), which is A(x) times B(x). We'll denote the coefficients of C(x) as c_0, c_1, up to c_{2n-2}, since the degree of C(x) is at most 2n - 2.

Now recall how we get the kth coefficient of C(x), which is denoted c_k. We take the constant term of A(x), which is a_0, times the coefficient of the kth term of B(x), which is b_k. Then we add that to the other possibilities: a_1 times b_{k-1}, and so on, up to a_k times b_0. Any coefficient whose index is larger than n - 1 is treated as zero.

We want to solve the following computational problem. We're given the vector of coefficients a defining A(x) and the vector of coefficients b defining B(x), and we want to compute the vector of coefficients c for the product polynomial C(x). This vector c is known as the convolution of a and b, written c = a * b, where the star symbol denotes convolution.
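Written as a formula, the rule just described for the kth coefficient is the following. The convention that out-of-range coefficients are zero is stated here explicitly as an assumption, since the lecture describes the sum only in words.

$$
c_k \;=\; \sum_{j=0}^{k} a_j \, b_{k-j}, \qquad 0 \le k \le 2n-2, \qquad \text{with } a_j = b_j = 0 \text{ for } j > n-1 .
$$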

## 208 Quiz PM Example Question

Let's take a quick quiz to make sure you're familiar with multiplying polynomials and let's consider the polynomial A(x), which is 1 + 2x + 3x squared. And B(x), let's define as 2 - x + 4x squared. So the vector of coefficients for A(x) is (1, 2, 3), and the vector for B(x) is (2, -1, 4). And now compute the convolution of A and B.

## 209 Quiz PM Example Solution

The solution to this problem is (2, 3, 8, 5, 12). To get the x cubed coefficient, 5, we obtain it in the following way. Multiplying 3x squared by negative x, we get negative 3x cubed. Multiplying 2x by 4x squared, we get 8x cubed. Adding these up, we get 5x cubed.
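The quiz answer can be checked with a direct implementation of the coefficient rule. This is a minimal sketch; the function name `convolve_naive` is made up for illustration and is not from the lecture.

```python
def convolve_naive(a, b):
    """Return the coefficients of A(x) * B(x) by summing every pairwise product."""
    c = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            # The term a_i x^i times b_j x^j contributes to the x^(i+j) coefficient.
            c[i + j] += a_i * b_j
    return c


quiz_result = convolve_naive([1, 2, 3], [2, -1, 4])
print(quiz_result)  # [2, 3, 8, 5, 12]
```

The two nested loops make the quadratic cost of this direct approach visible.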

## 210 PM General Problem

Once again, the general computational problem that we're considering is: given the vector of coefficients a for A(x) and the vector of coefficients b for B(x), we want to compute their convolution c. A simple algorithm for computing this convolution computes each of the coefficients in turn. There are O(n) coefficients. How long does it take to compute each coefficient? A naive algorithm takes O(k) time to compute the kth coefficient, because there are O(k) terms that we have to sum up. This leads to an O(n^2) time algorithm to compute all of the O(n) coefficients of the polynomial C(x). What we are going to do in this lecture is give an O(n log n) time algorithm for computing the product polynomial C(x), the convolution of a and b.

## 211 Convolution Applications

Before we dive into the algorithm, let's take a look at a few of the many applications of convolution. One important application is filtering. Here we have a data set of n points, and we replace each data point by a function of its neighboring points. This is used for such things as reducing noise or adding effects.

Let's take a look at a more detailed example of filtering so it becomes clear. In mean filtering we have a parameter M, and we replace the data point y_j by the mean of the 2M + 1 points centered at y_j, and we do this for all j. This smoothed data set, y-hat, can be viewed as the convolution of y with a vector f. The vector f has size 2M + 1, and each of its entries is 1/(2M + 1). So to smooth the data set in this way, we compute its convolution with this simple vector f.

Of course, we can smooth in more sophisticated ways. For example, we can replace this vector f by a Gaussian function, so that points near y_j are given more weight. A Gaussian filter uses a vector whose entries follow a Gaussian curve centered on the middle entry, divided by a quantity Z. This quantity Z is just a normalizing factor so that the entries of the vector sum to one.

There are many other filters one might consider. A different type of filter is Gaussian blur, which is used to add a visual effect to an image, namely to blur it. Gaussian blur applies a two-dimensional Gaussian filter to an image. Now let's get back to our original problem: how do we compute the convolution of a pair of vectors?
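As an illustration of mean filtering as a convolution, here is a small sketch. The function name `mean_filter` and the treatment of the boundary (missing neighbors count as zero) are assumptions made for this example; the lecture does not say how the ends of the data set are handled.

```python
def mean_filter(y, M):
    """Smooth y by convolving it with f = (1/(2M+1), ..., 1/(2M+1))."""
    width = 2 * M + 1
    f = [1.0 / width] * width
    full = [0.0] * (len(y) + width - 1)
    for i, y_i in enumerate(y):
        for j, f_j in enumerate(f):
            full[i + j] += y_i * f_j
    # Entry j + M of the full convolution is the window centered at y[j].
    # Assumption: neighbors that fall outside the data are treated as zero.
    return full[M:M + len(y)]


noisy = [1.0, 2.0, 9.0, 4.0, 5.0]
smoothed = mean_filter(noisy, M=1)
print(smoothed)  # approximately [1, 4, 5, 6, 3], up to floating-point rounding
```

The spike at 9.0 is pulled down toward its neighbors, which is the noise-reducing effect described above.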

## 212 Polynomials Basics

Let's look at some basic properties of polynomials. Consider the polynomial A(x). There are two natural ways of representing it. The first way is by its coefficients; this is the vector a that we've been considering so far. The second way is by its values at n points: we take n points x_1, x_2, up to x_n, and we evaluate the polynomial A(x) at these n points.

The key fact is that a polynomial of degree n - 1 is uniquely determined by its values at any n distinct points. To see that this statement makes intuitive sense, the example to keep in mind is that of a line. A line is a degree 1 polynomial, and a line is defined by any two points on it. In general, a degree n - 1 polynomial is defined by any n points on that polynomial.

The vector of coefficients is often the more natural way to represent a polynomial. However, the values are more convenient for the purposes of multiplying polynomials, as we'll see in a second. What FFT does is convert from coefficients to values and from values to coefficients; it transforms between these two representations of the polynomial. So we'll take the coefficients as input, we'll use FFT to convert them to the values, we'll multiply the polynomials, and then we'll use FFT to convert back to the coefficients once again.

One important point is that FFT converts from the coefficients to the values not for any choice of x_1 through x_n, but for a particular, well-chosen set of points x_1 through x_n. Part of the beauty of the FFT algorithm comes from the choice of these points. Before we dive into FFT, let's take a look at why the values are convenient for multiplying polynomials.

## 213 MP Values

One of the key ideas is that multiplying polynomials is easy when we have the polynomials in the values representation. Suppose that we know the polynomial A(x) evaluated at 2n points, x_1 through x_{2n}, and we know the polynomial B(x) at the same 2n points. Then we can compute the product polynomial C(x) at these 2n points by just computing the product of two numbers for each i: C(x_i) is the product of the number A(x_i) and the number B(x_i). Since C(x_i) is just the product of two numbers, it takes O(1) time for each i, and therefore it takes O(n) total time to compute the product polynomial at these points.

Now why do we take A and B at 2n points? Well, C is a polynomial of degree at most 2n - 2, so we need at least 2n - 1 points to determine it, and 2n points suffice. The summary is that if we have the values of the polynomials A(x) and B(x) at 2n points, then we can compute the product polynomial at the same points in O(n) total time.

So here's what we're going to do. We use FFT to convert from the coefficients to the values; this conversion takes O(n log n) time. Then it takes O(n) total time to compute the product polynomial at these 2n points. Then we use FFT to convert back from the values of C(x) at these 2n points to the coefficients of C(x), and that again takes O(n log n) time. So let's dive in now to see how we do FFT, which converts from the coefficients to the values.
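The pointwise multiplication step can be illustrated on the quiz polynomials. This sketch evaluates by Horner's rule at arbitrary integer points, which is not what FFT does (FFT uses specially chosen points); the helper name `evaluate` and the choice of points 1 through 2n are assumptions for illustration only.

```python
def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficient vector at x (Horner's rule)."""
    result = 0
    for coeff in reversed(coeffs):
        result = result * x + coeff
    return result


a = [1, 2, 3]          # A(x) = 1 + 2x + 3x^2
b = [2, -1, 4]         # B(x) = 2 - x + 4x^2
c = [2, 3, 8, 5, 12]   # coefficients of C(x) = A(x) B(x), from the quiz

n = len(a)
points = list(range(1, 2 * n + 1))   # any 2n distinct points would do here

a_values = [evaluate(a, x) for x in points]
b_values = [evaluate(b, x) for x in points]

# One multiplication per point gives the values of C at the same points.
c_values = [a_val * b_val for a_val, b_val in zip(a_values, b_values)]
```

Comparing `c_values` with `evaluate(c, x)` at each point confirms that the values representation of the product is obtained with one multiplication per point.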

## 214 FFT Opposites

Here's the smaller computational task that we're focusing on now. We're given the vector a, which holds the coefficients of the polynomial A(x), and we want to compute the value of A(x) at 2n points, x_1 through x_{2n}. The key point is that we get to choose these 2n points. How do we choose them? Well, that's our main task.

A crucial observation: suppose that the first n points are the opposites of the next n points. In other words, x_{n+1} is -x_1, x_{n+2} is -x_2, and so on, up to the last one, x_{2n}, which is -x_n. Let's suppose that our 2n points satisfy this property, which we'll call the plus-minus property, and let's see how this plus-minus property is useful.

## 215 FFT Splitting A(x)

Once again, the plus-minus property says that the first n points are the opposites of the next n points. Let's see how this property is useful for recursively computing A(x) at 2n points. Look at A evaluated at the point x_i and A evaluated at the point x_{n+i}. Since we're assuming the plus-minus property, these two points are opposites of each other.

Now let's break up the polynomial A(x) into the even terms and the odd terms. Even terms are of the form a_{2k} times x raised to the power 2k. Since the power is even, this is the same for x_i and -x_i, so the even terms are the same at the point x_i and the point x_{n+i}. What about the odd terms? Since each one has an odd power, it is opposite at x_i and x_{n+i}.

Given this observation about the even terms and the odd terms, it makes sense to split A(x) into the even terms and the odd terms. So let's partition the coefficients of the polynomial A(x) into the coefficients of the even terms, which we'll call the vector a_even, and the coefficients of the odd terms, which we'll call the vector a_odd.

## 216 FFT Even & Odd

So given the vector of coefficients a for the polynomial A(x), we define a_even = (a_0, a_2, a_4, ..., a_{n-2}) as the coefficients of the even terms and a_odd = (a_1, a_3, a_5, ..., a_{n-1}) as the coefficients of the odd terms. Then we can look at the polynomials defined by these vectors of coefficients. The vector a_even defines the polynomial A_even(y) = a_0 + a_2 y + a_4 y^2 + ... + a_{n-2} y^{n/2-1}. I've used the variable y to avoid confusion with A(x). Notice that a_2 is the coefficient of the quadratic term in A(x), but in A_even it's the coefficient of the linear term, since it's the second coefficient. Similarly, a_4 is the coefficient of the quadratic term in A_even. In the same way, the vector a_odd defines the polynomial A_odd(y) = a_1 + a_3 y + a_5 y^2 + ... + a_{n-1} y^{n/2-1}. Notice that the degree of these two polynomials is at most n/2 - 1. So we took a polynomial A(x) of degree at most n - 1 and we defined a pair of polynomials of degree at most n/2 - 1.

How does the polynomial A(x) relate to the polynomials A_even and A_odd? Notice that if I take A_even and evaluate it at y = x^2, then I get the even terms of A(x). Similarly, if I evaluate A_odd at the point y = x^2, then I get the odd terms, except that each exponent is off by one, so I multiply by x. So A(x) = A_even(x^2) + x A_odd(x^2).

At this point you start to see the semblance of the divide and conquer approach. We started with a polynomial of degree at most n - 1, and now we have defined a pair of polynomials of degree at most n/2 - 1. So we went down from n to n/2 and we got two subproblems, A_even and A_odd. And if we want A at a point x, it suffices to know A_even and A_odd at the point x^2. So we've got two subproblems of half the size.

However, if we need A(x) at 2n points, we still need A_even and A_odd at 2n points, namely the squares of the original 2n points. So the degree of the polynomials went down by a factor of two, but the number of points we need to evaluate them at hasn't gone down by a factor of two yet. This is where we're going to use the plus-minus property.
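The identity relating A(x) to A_even and A_odd can be checked numerically. This is a small sketch with an arbitrary example polynomial; the helper `evaluate` and the chosen coefficients are made up for illustration.

```python
def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficient vector at x (Horner's rule)."""
    result = 0
    for coeff in reversed(coeffs):
        result = result * x + coeff
    return result


a = [5, -1, 4, 2, 0, 3, 7, 1]   # arbitrary example with n = 8 coefficients
a_even = a[0::2]                # a_0, a_2, a_4, a_6
a_odd = a[1::2]                 # a_1, a_3, a_5, a_7

x = 3
direct = evaluate(a, x)
via_split = evaluate(a_even, x * x) + x * evaluate(a_odd, x * x)
```

Both `direct` and `via_split` compute A(3); the second does so using only the two half-size polynomials evaluated at the point 3^2 = 9.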

## 217 FFT Recursion

Given the polynomial A(x), we defined the pair of polynomials A_even and A_odd. They satisfy the following identity: A evaluated at the point x is equal to A_even evaluated at the point x^2, plus x times A_odd evaluated at the point x^2.

Now let's suppose that the 2n points at which we want to evaluate A(x) satisfy the plus-minus property, so the first n points are the opposites of the next n points. Let's look at A evaluated at the point x_i and A evaluated at the point x_{n+i}. Because of the plus-minus property, this is A evaluated at x_i and A evaluated at -x_i. Plugging this into our earlier formula, we have that A(x_i) is equal to A_even evaluated at the point x_i^2, plus x_i times A_odd evaluated at the point x_i^2. Similarly, A(-x_i) is equal to A_even again at the point x_i^2, minus x_i times A_odd again at the point x_i^2. So notice that to get A at these two different points, we need A_even and A_odd at the same point.

Our conclusion is the following. Suppose we're given A_even and A_odd at the n points y_1 through y_n, which are the squares of the 2n points. Notice that since the 2n points satisfy the plus-minus property, their squares give only these n distinct points. Then in O(n) time we get A evaluated at the 2n points. In particular, to evaluate A at the point x_i takes O(1) time given the values of A_even and A_odd at y_i, and similarly to evaluate A at the point x_{n+i} takes O(1) time given the values of A_even and A_odd at the same point y_i.
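Written out, the two evaluations described above share the same pair of sub-values. The symbols follow the lecture's notation, with y_i denoting the common square of x_i and x_{n+i}.

$$
\begin{aligned}
A(x_i) &= A_{\text{even}}(y_i) + x_i \, A_{\text{odd}}(y_i), \\
A(x_{n+i}) = A(-x_i) &= A_{\text{even}}(y_i) - x_i \, A_{\text{odd}}(y_i),
\end{aligned}
\qquad \text{where } y_i = x_i^2 = x_{n+i}^2, \quad 1 \le i \le n .
$$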

## 218 FFT Summary

Now, let's summarize our approach. We have a polynomial A(x) of degree at most n - 1, and we want to evaluate it at 2n points, x_1 through x_{2n}. We get to choose these 2n points however we want, and we're looking at how to choose them. One very minor point that I want to identify now is why we consider this polynomial at 2n points instead of n points. In fact, later we'll go back and look at it at n points instead of 2n points. But for now, we want A(x) at 2n points. Why? Because of our application to polynomial multiplication.

Recall that the first step in our construction is to define the pair of polynomials A_even and A_odd. We take the even terms of A(x) to define A_even, and the odd terms to define A_odd. Whereas the original polynomial A(x) was of degree at most n - 1, each of these polynomials is of degree at most n/2 - 1, so the degree went down by half.

Next, we recursively run the FFT algorithm on this pair of polynomials, evaluating each of them at n points. What are the n points? They are the points y_1 through y_n, which are the squares of the 2n points. Since the original 2n points satisfy the plus-minus property, x_1 is the opposite of x_{n+1}, so their squares are the same; that's our first point y_1. x_2 is the opposite of x_{n+2}, so their squares are the same, and that's y_2. And so on up to y_n, which is both x_n^2 and x_{2n}^2.

Why do we want this pair of polynomials at the squares of these 2n points? Well, recall that evaluating the polynomial A at a point x is straightforward if we know A_even and A_odd at the point x^2. So if we know this pair of polynomials at the squares of these points, then it's straightforward to get A at these 2n points. In particular, in O(1) time per point, we can evaluate A(x) at that point using A_even and A_odd at the square of that point. So in O(n) total time, we get A(x) at these 2n points using A_even and A_odd at these n points, which are the squares of the 2n points. This is the high-level idea of our divide and conquer algorithm for FFT.

What's the running time of this algorithm? Let T(n) denote the running time on an input of size n. We have two subproblems of exactly half the size, which gives 2T(n/2). It takes us O(n) time to form the two polynomials, and it takes us O(n) time to merge their solutions together to get the solution to the original problem. So we get the following recurrence: T(n) is at most 2T(n/2) + O(n). This is the same recurrence as in the classic merge sort, and many of you will recall that it solves to O(n log n). So it looks like we have an O(n log n) time algorithm to solve this problem.

All we need is that these 2n points satisfy the plus-minus property, so the first n are the opposites of the second n. But notice that we're going to recursively run this algorithm on the pair of polynomials A_even and A_odd with the squares of these points. So we're going to need the plus-minus property again for this subproblem, and for all smaller subproblems. Looking at this subproblem, we have y_1 through y_n, and we need the first n/2 to be the opposites of the second n/2. It is straightforward to define 2n points which satisfy the plus-minus property. The challenge is to define 2n points which satisfy the plus-minus property and for which all recursive subproblems also satisfy the plus-minus property. So let's dive into how we achieve this.

## 219 FFT Recursive Problem

In order to get 2n points which satisfy the plus-minus property, we choose the points so that the second n are the opposites of the first n. What happens at the next level of the recursion? There, the points we're considering are the squares of these 2n points. These are the n points x_1^2 up to x_n^2, and we want these n points to also satisfy the plus-minus property. Let's assume that n is a power of 2. Then we need y_1 to be the opposite of y_{n/2+1}, and so on, up to y_{n/2} being the opposite of y_n. What does this mean in terms of x? It means that x_1^2 is the opposite of x_{n/2+1}^2. But the square of a nonzero real number is always positive, so this is a positive number, and this is a positive number. How can they be opposites of each other? Well, it's impossible if we're working with real numbers. The only way to achieve this is by looking at complex numbers. So we can easily achieve the plus-minus property at the top level of the recursion, but for all further levels of the recursion, we need to look at complex numbers in order to achieve the plus-minus property. So let's do a quick review of complex numbers, and then we'll see the appropriate choice of these 2n numbers.

## 220 Review Complex Numbers

We saw that we need to consider complex numbers for our choice of the n or 2n points where we evaluate the polynomial A(x). Here, I'll give a brief review of the relevant concepts regarding complex numbers. Some of you may get a bit scared at the use of complex numbers, but the mathematics involved is fairly simple, so don't get intimidated at all. And the final algorithm that we get is very simple and very beautiful.

We have a complex number z, which is a + bi. Here a is the real part and b is the imaginary part. It's often convenient to look at complex numbers in the complex plane. In the complex plane, one of the axes corresponds to the real component and the other axis corresponds to the imaginary component. So for the number a + bi, we look at a on the real axis and b on the imaginary axis, and the number z corresponds to the point (a, b) in the complex plane.

There's another way to describe this number z, called polar coordinates. In polar coordinates, we look at the length of the vector from the origin to the point z; let's call this length r. And we look at the angle theta that this vector makes with the positive real axis. The number z in polar coordinates is (r, theta). Just as we had two representations of a polynomial, either the coefficients or the values, we also have two representations of a complex number: either the Cartesian coordinates (a, b) or the polar coordinates (r, theta). It turns out that polar coordinates are more convenient for certain operations, such as multiplication.

Before we delve into properties of polar coordinates, let's review some basics. First off, how do we convert between polar coordinates and Cartesian coordinates? If you remember a little bit of trigonometry, you'll recall that the length a is equal to r times cos theta and the length b is equal to r times sin theta. This gives us a way to convert between Cartesian coordinates and polar coordinates.

Now there's a third way of representing complex numbers, which is quite compact. The basic fact is Euler's formula, which says that cos theta + i sin theta equals e^{i theta}. Multiplying both sides by r, we have that the complex number z equals r times e^{i theta}. The basic idea of the proof of Euler's formula is to look at the Taylor expansions of the exponential function, the cosine function, and the sine function.
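The three representations can be compared numerically with Python's standard `cmath` module. This is only an illustrative sketch; the particular values of r and theta are arbitrary.

```python
import cmath
import math

r, theta = 2.0, math.pi / 3

# Cartesian form from polar coordinates: a = r cos(theta), b = r sin(theta).
z_cartesian = complex(r * math.cos(theta), r * math.sin(theta))

# Euler's formula: r * e^{i theta} is the same complex number.
z_euler = r * cmath.exp(1j * theta)

# cmath.polar converts a complex number back to (r, theta).
r_back, theta_back = cmath.polar(z_cartesian)
```

Up to floating-point rounding, `z_cartesian` and `z_euler` agree, and `cmath.polar` recovers the original length and angle.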

## 221 Multiplying in Polar

As we mentioned earlier, polar coordinates are convenient for multiplication. Let's consider an example. Take a pair of complex numbers z_1 and z_2, and let's look at their product. Say that z_1 in polar coordinates is (r_1, theta_1) and z_2 in polar coordinates is (r_2, theta_2). To get their product in polar coordinates, we simply multiply the lengths and add up the angles: the product is (r_1 r_2, theta_1 + theta_2).

Now suppose that I have a complex number z and I want to look at its opposite, negative z. This corresponds to multiplying z by negative 1. What is negative 1 in polar coordinates? Here's the point negative 1 in the complex plane. It's at distance 1 from the origin and it has angle 180 degrees, or pi, so its polar coordinates are (1, pi). Say z is the point with polar coordinates (r, theta). Then negative z corresponds to the product of (r, theta) with negative 1, which in polar coordinates is (1, pi). Using the rule above, negative z is at the polar coordinates (r, theta + pi). So the point negative z is the reflection of the point z through the origin. It's the same distance r from the origin, and instead of going at angle theta, we go at angle theta + pi. So if you view the complex numbers in the complex plane, it's easy to find their opposites by looking at their reflections.
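The multiplication rule and the reflection rule can be checked with `cmath.rect` and `cmath.polar` from the standard library. The specific lengths and angles below are arbitrary choices for illustration.

```python
import cmath
import math

z1 = cmath.rect(2.0, math.pi / 6)   # polar coordinates (2, pi/6)
z2 = cmath.rect(3.0, math.pi / 4)   # polar coordinates (3, pi/4)

# Multiply the lengths, add the angles: expect (6, pi/6 + pi/4).
r_product, theta_product = cmath.polar(z1 * z2)

# The opposite of z = (r, theta) should be (r, theta + pi).
z = cmath.rect(1.5, math.pi / 3)
expected_opposite = cmath.rect(1.5, math.pi / 3 + math.pi)
```

Note that `cmath.polar` reports angles in the range from -pi to pi, so an angle such as theta + pi may come back reduced by 2 pi; comparing the complex numbers themselves avoids that ambiguity.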

## 224 Roots Graphical View

Now we're trying to find those z where z raised to the power n equals 1. Let's draw the number 1 on our complex plane. What is this point in polar coordinates? It's at distance 1 from the origin, obviously. What's the angle? The angle is zero. It's also 2 pi, or 4 pi, or 6 pi, or 2 pi times j for any integer j. And note that j can also be a negative number. So the angle is any multiple of 2 pi.

Now let's say z is the point (r, theta) in polar coordinates. We're looking at those z where z raised to the power n equals 1, so it equals (1, 2 pi j). Recall the expression for multiplying two complex numbers in polar coordinates, and look at z raised to the power n. Since z is (r, theta), the length of z^n is r raised to the power n, and we add up the angles, so we get n times theta. Now we're assuming this equals 1. So what does r equal and what does theta equal? Well, r raised to the power n equals 1, and r is a nonnegative real number, so we know that r equals 1. What does that mean? It means that all the complex roots of unity lie on the unit circle, the circle of radius 1.

Where do they lie on the circle? For that we look at the angle: n times theta equals 2 pi j. Solving for theta, we have theta equals 2 pi j over n.

Let's take the case n equals 8 and see what this looks like on the unit circle. The case j equals 0 corresponds to angle zero, so this is the point 1. That's good, because we know that the number 1 is an nth root of unity. Let's look at j equals 1, so we get 2 pi over n. What does this angle 2 pi over n correspond to? It's like you took the whole pie, the whole circle, and you subdivided it into n equal slices. For the case n equals 8, this angle is pi over 4, so this is the point (1, pi/4). j equals 2 corresponds to the point (1, pi/2), which is i. The next point is (1, 3 pi/4). Then we get (1, pi), which is negative 1. Then we get (1, 5 pi/4), and so on. Notice that when we get to j equals 8, we repeat: we get back to the point 1. And likewise for j equals 9, j equals 10, and so on. So notice there are n distinct values. What we're doing is starting at the point 1 and taking steps of angle 2 pi over n.

## 225 Roots Notation

What we just saw is that the nth roots of unity correspond to the points with polar coordinates (1, 2 pi j / n). j equals 0 corresponds to the point 1. Then we take steps of angle theta, where theta equals 2 pi over n; this gives us j equals 1, j equals 2, and so on, and we keep taking these equal-sized steps around the unit circle.

Let's introduce a little bit of notation to make it more convenient to express these nth roots of unity. Let omega_n denote the point corresponding to j equals 1. Now what does the next point correspond to? Well, we just doubled the angle, so it's the square of omega_n. The next one is the cube of omega_n, and the jth point is the jth power. So the last one is omega_n to the power n - 1, and then we have omega_n to the nth power, which is the same as omega_n to the zero power, which is the number 1.

So the nth roots of unity are the following n numbers: omega_n to the zero power, omega_n to the first power, omega_n squared, up to omega_n to the power n - 1. In polar coordinates, these n numbers are (1, 2 pi j / n), where j varies from 0 to n - 1. And recalling Euler's formula, omega_n equals e raised to the power 2 pi i / n. So the jth of the nth roots of unity, omega_n raised to the jth power, where j varies between 0 and n - 1, is e raised to the power 2 pi i j / n.
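The notation above translates directly into a few lines of Python. This is a minimal sketch using the standard `cmath` module; the function name `roots_of_unity` is an assumption for illustration.

```python
import cmath
import math


def roots_of_unity(n):
    """Return [omega_n^0, omega_n^1, ..., omega_n^(n-1)], with omega_n = e^(2 pi i / n)."""
    return [cmath.exp(2j * math.pi * j / n) for j in range(n)]


eighth_roots = roots_of_unity(8)
for j, root in enumerate(eighth_roots):
    # Each root has length 1, and raising it to the 8th power gives 1 (up to rounding).
    print(j, root, abs(root), root ** 8)
```

For n = 8, the entry at j = 2 is (up to rounding) the number i and the entry at j = 4 is negative 1, matching the picture described for the unit circle.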

## 226 Roots Examples

We saw that for the case n equals 2, the square roots of unity are the points 1 and -1. For the case n equals 4, we also have the points i and -i, so we have the four points 1, i, -1, -i. This corresponds to going around the unit circle in steps of pi over 2. For n equals 8, we further subdivide the unit circle; now we go in steps of pi over 4. And how do we get the points for n equals 16? We further subdivide. The first new point corresponds to omega_16. The next one is omega_16 squared, and notice that it's also omega_8; it's the first one of the 8th roots of unity. And the point i is omega_16 to the fourth power. It's also omega_8 squared, the second one of the 8th roots, and it's the first of the 4th roots.

So take the nth roots and square them, and let's suppose that n is a power of 2, as in these examples. What happens when you take the 16th roots squared? Each root moves to the point at double its angle. What do you get? You get every other one of the 16th roots, which are the 8th roots. So the 16th roots squared are the 8th roots. And in general, when n is a power of 2, the nth roots squared are the (n/2)th roots.

The other key property is the plus-minus property. Take the point omega_16. What is the opposite of it? It's the point reflected through the origin, negative omega_16. It's also a 16th root of unity. Which one is it? Well, counting around the circle, omega_16 is the 1st, negative 1 is the 8th, and the next one is the 9th. So it's the point omega_16 to the 9th power. In general, for even n, omega_n to the jth power is the opposite of omega_n to the power j + n/2. By going an extra n/2 steps, I'm going an extra angle pi around the circle, so I'm getting the reflection.

## 229 Key Property Opposites

There are two key properties of the nth roots of unity which we're going to utilize. The first holds for even n: the nth roots satisfy the plus-minus property. This was the key property for our divide and conquer approach. The first n/2 of the nth roots of unity are the opposites of the last n/2. In other words, omega_n to the zero power equals negative omega_n to the power n/2. And in general, omega_n to the jth power equals negative omega_n to the power n/2 + j. So the nth roots of unity look like a perfect choice for our divide and conquer approach, because they satisfy the plus-minus property.
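The plus-minus property can be verified numerically for a particular even n. This is a quick check rather than a proof; n = 16 is an arbitrary choice and the tolerance accounts for floating-point rounding.

```python
import cmath
import math

n = 16
roots = [cmath.exp(2j * math.pi * j / n) for j in range(n)]

# omega_n^j should equal -omega_n^(n/2 + j) for j = 0, ..., n/2 - 1,
# so each such pair should sum to zero.
plus_minus_holds = all(
    abs(roots[j] + roots[j + n // 2]) < 1e-12 for j in range(n // 2)
)
print(plus_minus_holds)  # True
```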

## 230 Key Property Squares

The second key property holds when n is a power of 2. If we look at the nth roots squared, we get the (n/2)th roots. Take the jth of the nth roots, omega_n to the jth power, and square it. In polar coordinates, we're squaring the point (1, 2 pi j / n), so we simply double the angle. We get the point (1, 2 pi j / (n/2)), which is omega_{n/2} to the jth power. So the jth of the nth roots squared is the jth of the (n/2)th roots. Similarly, if we take the (n/2 + j)th of the nth roots and we square it, well, this root is just the opposite of the jth root, so when we square it we get the same thing.

Why do we need this property that the nth roots squared are the (n/2)th roots? Well, we're going to take the polynomial A(x), and we want to evaluate it at the nth roots. These nth roots satisfy the plus-minus property, so we can do a divide and conquer approach. But then what do we need? We need to evaluate A_even and A_odd at the squares of these nth roots, which are the (n/2)th roots. So each subproblem is of the exact same form as the original problem. In order to evaluate A(x) at the nth roots, we need to solve two subproblems: A_even and A_odd at the (n/2)th roots. And then we can recursively continue this algorithm. Now we're all set to do our FFT algorithm, and the n points where we choose to evaluate the polynomial A(x) are the nth roots of unity.
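Putting the two key properties together, the recursion described in this part can be sketched as follows. This is an illustrative sketch under the lecture's assumptions (n is a power of 2, and the evaluation points are the nth roots of unity with omega_n = e^(2 pi i / n)), not necessarily the exact pseudocode presented later. The function name `fft` is chosen here for convenience, and some numerical libraries define their transform with the conjugate root, so results may differ from theirs by that convention.

```python
import cmath
import math


def fft(a):
    """Evaluate the polynomial with coefficients a at the len(a)-th roots of unity.

    Assumes len(a) is a power of 2.
    Returns [A(omega^0), A(omega^1), ..., A(omega^(n-1))] with omega = e^(2 pi i / n).
    """
    n = len(a)
    if n == 1:
        return [complex(a[0])]          # a constant polynomial has the same value everywhere
    even_values = fft(a[0::2])          # A_even at the (n/2)th roots of unity
    odd_values = fft(a[1::2])           # A_odd at the (n/2)th roots of unity
    values = [0j] * n
    for j in range(n // 2):
        omega_j = cmath.exp(2j * math.pi * j / n)
        # (omega_n^j)^2 = omega_{n/2}^j, so both opposite points reuse the same sub-values.
        values[j] = even_values[j] + omega_j * odd_values[j]
        values[j + n // 2] = even_values[j] - omega_j * odd_values[j]
    return values


print(fft([1, 2, 3, 4]))  # approximately [10, -2-2j, -2, -2+2j]
```

Each call does O(n) work outside its two half-size recursive calls, which matches the recurrence T(n) = 2T(n/2) + O(n) from the summary.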