# Chapter 9

## Problem 1

Rewrite and test either the k-means or k nearest neighbors algorithm with $l_1$ distances substituted for $l_2$ distances.

Soln:

In this solution, we implement the k nearest neighbors algorithm with $l_1$ distances substituted for $l_2$ distances. To change from the $l_2$ (Euclidean) distance to the $l_1$ distance, we use the Cityblock distance from Distances.jl in the KNN code provided in the book.
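As a small illustration of why the choice of norm matters, the following sketch compares the two distances on a made-up query point and two made-up candidate neighbors. The points and the helper names l1 and l2 are assumptions for this example and are not part of Distances.jl; only base Julia is used.

```julia
# Hypothetical 2-D example: one query point and two candidate neighbors.
query = [0.0, 0.0]
a = [2.5, 0.0]
b = [1.5, 1.5]

l1(x, y) = sum(abs.(x .- y))         # cityblock (l1) distance
l2(x, y) = sqrt(sum(abs2.(x .- y)))  # Euclidean (l2) distance

println("l1: a = ", l1(query, a), ", b = ", l1(query, b))
println("l2: a = ", l2(query, a), ", b = ", l2(query, b))
```

Here a is nearer to the query under $l_1$ (2.5 versus 3.0), while b is nearer under $l_2$ (about 2.12 versus 2.5), so the two versions of k nearest neighbors can disagree on which neighbors to use.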

In [9]:
function knn_euclidean(X::Matrix{T}, Y::Matrix{T}, class::Vector{Int}, k::Int) where T <: Real
    testing = size(X, 2)
    predicted_class = zeros(Int, testing)
    distance = pairwise(Euclidean(), Y, X) # l2 (Euclidean) distances
    for i = 1:testing
        perm = selectperm(distance[:, i], 1:k) # k nearest neighbors
        predicted_class[i] = mode(class[perm]) # most common class
    end
    return predicted_class
end

knn_euclidean (generic function with 1 method)

In [10]:
function knn_cityblock(X::Matrix{T}, Y::Matrix{T}, class::Vector{Int}, k::Int) where T <: Real
    testing = size(X, 2)
    predicted_class = zeros(Int, testing)
    distance = pairwise(Cityblock(), Y, X) # l1 (cityblock) distances instead of Euclidean
    for i = 1:testing
        perm = selectperm(distance[:, i], 1:k) # k nearest neighbors
        predicted_class[i] = mode(class[perm]) # most common class
    end
    return predicted_class
end

knn_cityblock (generic function with 1 method)

Let's compare both classification methods on randomly generated data with randomly assigned classes.

In [11]:
using Distances, StatsBase;
(training, testing, features) = (100, 10, 30);
X = randn(features, testing);
Y = randn(features, training);
(k, classes) = (3, 2);
class = rand(1:classes, training);
predicted_class_euclidean = knn_euclidean(X, Y, class, k);
predicted_class_cityblock = knn_cityblock(X, Y, class, k);
errors = countnz(predicted_class_cityblock - predicted_class_euclidean);
println(" classification mismatch = ", errors)

 classification mismatch = 1


Let's compare both methods on the iris dataset. After shuffling, the first 130 rows serve as training data and the remaining 20 rows as testing data. The classification errors on the 20 test points are almost the same.

In [12]:
using RDatasets,Distances, StatsBase;
srand(1234)
iris = dataset("datasets", "iris")
(m,n) = size(iris);
dc = Dict("setosa"=> 1, "versicolor"=> 2, "virginica"=> 3); 
function categorise_species(x::String)  
    return dc[x];        
end
Data = convert(Array, iris[:,1:4]);
classData =  convert(Array,map(categorise_species,iris[:Species]))
shuffleperm = shuffle(1:m);
Data = Data[shuffleperm,:];
classData = classData[shuffleperm,:];
training = transpose(Data[1:130, :])
testing = transpose(Data[131:150, :]);
classTraining = vec(classData[1:130, :]);
classTesting = vec(classData[131:150, :]);
predicted_class_euclidean = knn_euclidean(testing, training, classTraining, k);
predicted_class_cityblock = knn_cityblock(testing, training, classTraining, k);
ea = countnz(predicted_class_euclidean - classTesting);
ca = countnz(predicted_class_cityblock - classTesting);
println(" classification errors of knn_euclidean= ", ea);
println(" classification errors of knn_cityblock= ", ca)

 classification errors of knn_euclidean= 2
 classification errors of knn_cityblock= 2


## Problem 5

Describe and program a naive Bayes classification algorithm for Gaussian distributed features. Assume that the features are independently distributed.

Soln:

Using Bayes rule, we have:

$P(Y=y_k | X_1,X_2,...,X_n) = \frac{P(Y=y_k,X_1,X_2,...,X_n)}{\sum_{j}{P(Y=y_j,X_1,X_2,...,X_n)}}$

$\implies P(Y=y_k | X_1,X_2,...,X_n) = \frac{P(X_1,X_2,...,X_n | Y=y_k) P(Y=y_k)}{\sum_{j}{P(X_1,X_2,...,X_n|Y=y_j)P(Y=y_j)}}$

Assuming that the features are independently distributed given the class label, we can further simplify as:

$\implies P(Y=y_k | X_1,X_2,...,X_n) = \frac{P(Y=y_k)\prod_i{P(X_i| Y=y_k)}}{\sum_{j}{(P(Y=y_j)\prod_i{P(X_i| Y=y_j)})}}$

The class probability $\pi_k = Pr(Y = y_k)$ is typically estimated as the fraction of training cases falling in class $y_k$. In Gaussian Naive Bayes, the components $X_i$ of a feature vector $X$ are real values, which are assumed to follow Gaussian distribution given the class label, i.e.

$ X_i | Y=y_k \sim \mathcal{N}(\mu_{ik},\sigma_{ik})$

$\implies P(X_i = x| Y=y_k) = \frac{1}{\sqrt{2\pi\sigma_{ik}^{2}}}e^{-\frac{1}{2}(\frac{x-\mu_{ik}}{\sigma_{ik}})^2}$

The Gaussian Naive Bayes training algorithm is then: for each class label value $y_k$, estimate the prior $\pi_k = Pr(Y = y_k)$, and for each attribute $X_i$, estimate the class conditional mean $\mu_{ik}$ and variance $\sigma_{ik}^2$.

With $X_i^j$ denoting the ith feature of the jth training case, $Y^j$ its class label, and $\delta(\cdot)$ the indicator function, the maximum likelihood estimates for the class conditional mean $\mu_{ik}$ and variance $\sigma_{ik}^2$ are:

$\hat{\mu_{ik}} = \frac{\sum_j{X_i^j\delta(Y^j=y_k)}}{\sum_j{\delta(Y^j=y_k)}} $

$\hat{\sigma_{ik}^2} = \frac{\sum_j{(X_i^j - \hat{\mu_{ik}})^2\delta(Y^j=y_k)}}{\sum_j{\delta(Y^j=y_k)}}$

Hence, the class conditional mean $\mu_{ik}$ is the average of the ith feature over the training cases whose class label is $y_k$. Similarly, the class conditional variance $\sigma_{ik}^2$ is the variance of the ith feature over those same cases.
The following function estimate learns these parameters and returns the priors $\pi_k$, the class conditional means $\mu_{ik}$ and the class conditional standard deviations $\sigma_{ik}$.
It updates a running mean and a running variance in a single pass, which avoids accumulating large sums and so reduces the risk of numerical overflow, as described [here](https://dsp.stackexchange.com/questions/811/determining-the-mean-and-standard-deviation-in-real-time).
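The running updates can be checked in isolation. The sketch below applies the same recurrences to a single made-up data vector and compares the result with the usual two-pass formulas; the function name running_mean_var and the data are assumptions for illustration, and only base Julia is used.

```julia
# One-pass running mean and population (maximum likelihood) variance.
function running_mean_var(x)
    mu = 0.0
    sigma_sq = 0.0
    for (seen, value) in enumerate(x)
        prev_mu = mu
        mu += (value - mu) / seen  # running mean
        # running variance: previous sum of squares plus the new contribution
        sigma_sq = ((seen - 1) * sigma_sq + (value - prev_mu) * (value - mu)) / seen
    end
    return mu, sigma_sq
end

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up sample
mu, sigma_sq = running_mean_var(x)

# Two-pass reference values.
batch_mu = sum(x) / length(x)
batch_sigma_sq = sum(abs2.(x .- batch_mu)) / length(x)

@assert isapprox(mu, batch_mu)
@assert isapprox(sigma_sq, batch_sigma_sq)
println("mean = ", mu, ", variance = ", sigma_sq)
```

For this sample the mean is 5 and the population variance is 4, up to rounding. Note that the divisor is the number of cases, matching the maximum likelihood estimate rather than the unbiased sample variance.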

In [96]:
function estimate(X::Matrix{Float64}, class::Vector{Int}, classes::Int)
    (cases, features) = size(X) # X[j, i] = value of feature i for case j
    mu = zeros(Float64, features, classes)
    sigma_sq = zeros(Float64, features, classes)
    prior = zeros(classes)
    for j = 1:cases
        k = class[j]
        prev_freq = prior[k]
        prior[k] = prior[k] + 1.0
        for i = 1:features
            prev_mu = mu[i, k]
            mu[i, k] = mu[i,k] + ((X[j, i] - mu[i,k])/prior[k]) #running mean
            sigma_sq[i, k] = ((prev_freq*sigma_sq[i, k]) + (X[j, i] - prev_mu)*(X[j, i] - mu[i,k]))/prior[k] #running variance
        end
    end
    prior = prior / sum(prior)
    return (prior, mu, sqrt.(sigma_sq))
end

estimate (generic function with 2 methods)

The next part of Gaussian Naive Bayes is to predict the class label for a test instance $X^{new}$.

$Y^{new} \to argmax_{y_k}{P(Y=y_k)\prod_i{P(X_i^{new}| Y=y_k)}}$

$\implies Y^{new} \to argmax_{y_k}{\pi_k\prod_i{\mathcal{N}(X_i^{new},\mu_{ik},\sigma_{ik})}}$

The denominator of Bayes rule is omitted in this reckoning because it does not depend on $y_k$ and is the normalizing term. The following function calculates the log of the R.H.S. of the above term and returns the prediction, which maximizes that logarithmic posterior.
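For reference, taking logarithms of the product above and inserting the Gaussian density gives the quantity that is maximized over $y_k$. In this expression $\pi$ inside the logarithm is the usual constant, while $\pi_k$ is the class prior:

$\ln\left(\pi_k\prod_i{P(X_i^{new}| Y=y_k)}\right) = \ln\pi_k - \sum_i{\left[\frac{1}{2}\ln(2\pi\sigma_{ik}^{2}) + \frac{(X_i^{new}-\mu_{ik})^2}{2\sigma_{ik}^{2}}\right]}$

Working with this sum of logarithms rather than the product of densities avoids underflow when there are many features.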

In [97]:
function predict(test_cases::Matrix{Float64}, prior::Vector{T}, mu::Matrix{T}, sigma::Matrix{T}) where T <: Real
    cases = size(test_cases, 1)
    features = size(test_cases, 2)
    prediction = zeros(cases)
    for i = 1:cases
        ln_posterior = log.(prior)
        for j = 1:features
            ln_posterior = ln_posterior + logpdf.(Normal.(mu[j, :], sigma[j, :]),test_cases[i,j])
        end 
        prediction[i] = indmax(ln_posterior)
    end
    return prediction
end

predict (generic function with 3 methods)

Let's check the Gaussian Naive Bayes functions on randomly generated data with randomly assigned classes.

In [98]:
using Distributions
(features, cases, classes) = (4, 100, 2);
X = rand(cases, features); # training data
class = rand(1:classes, cases); # classes of training data
(prior, mu, sigma) = estimate(X, class, classes);
test_case = rand(2,features); # test data
test_class = predict(test_case, prior, mu, sigma)

2-element Array{Float64,1}:
 2.0
 2.0

Let's run the Gaussian Naive Bayes classification on the iris dataset. After shuffling, the first 130 rows serve as training data and the remaining 20 rows as testing data. On these 20 test points it makes fewer classification errors than the k nearest neighbors methods.

In [111]:
using RDatasets, Distances, StatsBase, Distributions;
srand(1234)
iris = dataset("datasets", "iris")
(m,n) = size(iris);
dc = Dict("setosa"=> 1, "versicolor"=> 2, "virginica"=> 3); 
function categorise_species(x::String)  
    return dc[x];        
end
Data = convert(Array, iris[:,1:4]);
classData =  convert(Array,map(categorise_species,iris[:Species]))
shuffleperm = shuffle(1:m);
Data = Data[shuffleperm,:];
classData = classData[shuffleperm,:];
training = Data[1:130, :]
testing = Data[131:150, :];
classTraining = vec(classData[1:130, :]);
classTesting = vec(classData[131:150, :]);
(prior, mu, sigma) = estimate(training, classTraining, 3);
predicted_class_NB = predict(testing, prior, mu, sigma);
eg = countnz(predicted_class_NB - classTesting);
error_rate = eg / length(classTesting) # fraction of misclassified test points
println("Classification errors of Gaussian Naive Bayes = ", eg);
println("Error rate: $error_rate");

Classification errors of Gaussian Naive Bayes = 1
Error rate: 0.05


# Chapter 10

## Problem 3

Explicitly calculate the finite Fourier transforms of the sequences $c_j = 1,
c_j = 1_{\{0\}}, c_j = (-1)^j$ , and $c_j = 1_{\{0,1,2,...,\frac{n}{2}-1\}}$ defined on $\{0,1,2,...,n-1\}$.
For the last two sequences assume that n is even.

Soln :

Periodic sequences of complex numbers $\{c_j\}_{j=-\infty}^{\infty}$ of period n constitute the natural domain of the finite Fourier transform. The transform of such a sequence
is defined by

$\hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_je^{-2\pi i \frac{jk}{n}}$

where $i = \sqrt{-1}$ and $e^{is}$ is the exponential function with an imaginary argument. If we let $u_n = e^{2\pi i/n}$,

$\implies \hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_ju_n^{-jk}$

Note that, $u_n^0 = 1$ and $u_n^n = 1$.

a) $c_j = 1$

The Finite Fourier Transform is calculated as,

$\hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_ju_n^{-jk}$

$\implies \hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}u_n^{-jk}$

For $k=0$, the terms in summation will all reduce to 1 as $u_n^{0} = 1$.

$\implies \hat{c_0} = \frac{1}{n} \sum_{j=0}^{n-1}1 = 1$

For $k \ne 0$, the terms in the summation form a geometric series with first term $u_n^{0} = 1$ and ratio $u_n^{-k}$, whose sum of n terms is given by the [standard formula](https://en.wikipedia.org/wiki/Geometric_progression#Geometric_series).

$\implies \hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}u_n^{-jk}$

$\implies \hat{c_k} = \frac{1}{n} \frac{1 - u_n^{-nk}}{1 - u_n^{-k}}$

Note that, $u_n^0 = 1$ and $u_n^n = 1$. Hence, $u_n^{-nk} = 1$.

$\implies \hat{c_k} = \frac{1}{n} \frac{1 - 1}{1 - u_n^{-k}}  =0 $

Hence, the sequence $c_j = 1$ on $\{0,1,2,...,n-1\}$ has finite Fourier transform:

\begin{equation*}
\hat{c_k} = \begin{cases}
1 &\text{$k=0$}\\
0 &\text{$k \ne 0$}
\end{cases}
\end{equation*}

b) $c_j = 1_{\{0\}}$

This sequence $c_j = 1_{\{0\}}$ on $\{0,1,2,...,n-1\}$ represents the following:

\begin{equation*}
c_j = \begin{cases}
1 &\text{$j=0$}\\
0 &\text{$j \ne 0$}
\end{cases}
\end{equation*}

The Finite Fourier Transformation is calculated as,

$\hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_ju_n^{-jk}$

$\implies \hat{c_k} = \frac{1}{n} u_n^{0}$

Note that, $u_n^0 = 1$.

$\implies \hat{c_k} = \frac{1}{n}$

Hence, the sequence $c_j = 1_{\{0\}}$ on $\{0,1,2,...,n-1\}$ has finite Fourier transform:

\begin{equation*}
\hat{c_k} = \begin{cases}
\frac{1}{n} &\text{$\forall k$}
\end{cases}
\end{equation*}

c) $c_j = (-1)^j$

The Finite Fourier Transform is calculated as,

$\hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_ju_n^{-jk}$

$\implies \hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}(-1)^ju_n^{-jk} = \frac{1}{n} \sum_{j=0}^{n-1}(-u_n^{-k})^{j}$

For $k \ne \frac{n}{2}$, the terms in the summation form a geometric series with first term $(-1)^0u_n^{0} = 1$ and ratio $-u_n^{-k}$, whose sum of n terms is given by the [standard formula](https://en.wikipedia.org/wiki/Geometric_progression#Geometric_series).

$\implies \hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}(-u_n^{-k})^{j}$

$\implies \hat{c_k} = \frac{1}{n} \frac{1 - (-u_n^{-k})^n}{1 - (-u_n^{-k})}$

Note that, $u_n^0 = 1$ and $u_n^n = 1$. Hence, $u_n^{-nk} = 1$.

$\implies \hat{c_k} = \frac{1}{n} \frac{1 - (-1)^n}{1 + u_n^{-k}}$

Given, n is even. Hence, $(-1)^{n} = 1$.

$\implies \hat{c_k} = \frac{1}{n} \frac{1 - 1}{1 + u_n^{-k}}  =0 $

For $k=\frac{n}{2}$, the Geometric sum can't be applied as the denominator of the sum $1 + u_n^{-\frac{n}{2}}$ is 0 since $u_n^{-\frac{n}{2}} = e^{-\pi i} = -1$. 
So, calculating explicitly, 

$\hat{c_{\frac{n}{2}}} = \frac{1}{n} \sum_{j=0}^{n-1}(-1)^ju_n^{-j\frac{n}{2}}$

$\implies \hat{c_{\frac{n}{2}}} = \frac{1}{n} \sum_{j=0}^{n-1}(-u_n^{-\frac{n}{2}})^{j}$

The terms in summation will all reduce to $1$ as $-u_n^{-\frac{n}{2}} = -e^{-\pi i} = -(-1) = 1$.

$\implies \hat{c_{\frac{n}{2}}} = \frac{1}{n} \sum_{j=0}^{n-1}(1)^j = 1$

Hence, the sequence $c_j = (-1)^j$ on $\{0,1,2,...,n-1\}$ has finite Fourier transform:

\begin{equation*}
\hat{c_k} = \begin{cases}
1 &\text{$k=\frac{n}{2}$}\\
0 &\text{$k \ne \frac{n}{2}$}
\end{cases}
\end{equation*}

d) $c_j = 1_{\{0,1,2,...,\frac{n}{2}-1\}}$

This sequence $c_j = 1_{\{0,1,2,...,\frac{n}{2}-1\}}$ on $\{0,1,2,...,n-1\}$ represents the following:

\begin{equation*}
c_j = \begin{cases}
1 &\text{$j < \frac{n}{2}$}\\
0 &\text{$j \geq \frac{n}{2}$}
\end{cases}
\end{equation*}

Given that n is even, let $l=\frac{n}{2}$ for convenience.
The Finite Fourier Transformation is calculated as,

$\hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_ju_n^{-jk}$

$\implies \hat{c_k} = \frac{1}{n} \sum_{j=0}^{2l-1}c_ju_n^{-jk} = \frac{1}{n} \sum_{j=0}^{l-1}u_n^{-jk}$

For $k=0$, the terms in summation will all reduce to 1 as $u_n^{0} = 1$.

$\implies \hat{c_0} = \frac{1}{n} \sum_{j=0}^{l-1}1 = \frac{l}{n} = \frac{1}{2}$

For $k \ne 0$, the terms in the summation form a geometric series with first term $u_n^{0} = 1$ and ratio $u_n^{-k}$, whose sum of l terms is given by the [standard formula](https://en.wikipedia.org/wiki/Geometric_progression#Geometric_series).

$\implies \hat{c_k} = \frac{1}{n} \sum_{j=0}^{l-1}u_n^{-jk}$

$\implies \hat{c_k} = \frac{1}{n} \frac{1 - u_n^{-lk}}{1 - u_n^{-k}}$

Note that, $u_n^0 = 1$ and $u_n^n = 1$ and $u_n^{-l} = u_n^{-\frac{n}{2}} = e^{-\pi i} = -1$. Hence, $u_n^{-lk} = (-1)^k$.

$\implies \hat{c_k} = \frac{1}{n} \frac{1 - (-1)^k}{1 - u_n^{-k}}$

For $k \ne 0$ and k is even, $(-1)^k=1$,

$\implies \hat{c_k} = \frac{1}{n} \frac{1 - 1}{1 - u_n^{-k}}  =0 $

For $k \ne 0$ and k is odd, $(-1)^k=-1$,

$\implies \hat{c_k} = \frac{1}{n} \frac{1 + 1}{1 - u_n^{-k}}  = \frac{2}{n} \frac{1 - u_n^{k}}{(1 - u_n^{k})(1 - u_n^{-k})} $

Since, $(1 - u_n^{k})(1 - u_n^{-k}) = 1 + 1 - u_n^{k} - u_n^{-k} = 2(1 - \cos(\frac{2\pi k}{n}))$ as $u_n^{k} = e^{\frac{2\pi i k}{n}} = \cos(\frac{2\pi k}{n}) + i\sin(\frac{2\pi k}{n})$.

$\implies \hat{c_k} = \frac{2}{n} \frac{1 - u_n^{k}}{ (2 - u_n^{k} - u_n^{-k})} = \frac{1}{n} \frac{1 - \cos(\frac{2\pi k}{n}) - i\sin(\frac{2\pi k}{n})}{1 - \cos(\frac{2\pi k}{n})}$

Since, $\sin(2\theta) = 2\sin(\theta)\cos(\theta)$ and $\cos(2\theta) = 1 - 2\sin^2(\theta)$. 
Hence, $\sin(\frac{2\pi k}{n}) = 2\sin(\frac{\pi k}{n})\cos(\frac{\pi k}{n})$ and $1 - \cos(\frac{2\pi k}{n}) = 2\sin^2(\frac{\pi k}{n})$.

$\implies \hat{c_k} = \frac{1}{n} ( 1 - i\frac{\sin(\frac{2\pi k}{n})}{1 - \cos(\frac{2\pi k}{n})}) = \frac{1}{n} ( 1 - i\frac{\cos(\frac{\pi k}{n})}{\sin(\frac{\pi k}{n})}) $

$\implies \hat{c_k} = \frac{1}{n} ( 1 - i\cot(\frac{\pi k}{n})) $

Hence, the sequence $c_j = 1_{\{0,1,2,...,\frac{n}{2}-1\}}$ on $\{0,1,2,...,n-1\}$ has finite Fourier transform:

\begin{equation*}
\hat{c_k} = \begin{cases}
\frac{1}{2} &\text{$k=0$}\\
0 &\text{$k \ne 0$ and k is even}\\
\frac{1}{n} ( 1 - i\cot(\frac{k\pi}{n})) &\text{$k \ne 0$ and k is odd}\\
\end{cases}
\end{equation*}
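The four closed forms above can be checked numerically. The sketch below evaluates the defining sum directly for $n = 8$ and compares it with each result; the helper names finite_fourier and max_error are assumptions introduced here, and only base Julia is used.

```julia
# Direct O(n^2) finite Fourier transform with the 1/n scaling used above.
# Entry k + 1 of the result holds c-hat_k for k = 0, ..., n - 1.
function finite_fourier(c)
    n = length(c)
    return [sum(c[j + 1] * exp(-2pi * im * j * k / n) for j in 0:n-1) / n for k in 0:n-1]
end

# Largest entrywise deviation between two vectors.
max_error(x, y) = maximum(abs.(x .- y))

n = 8  # any even n works for parts c) and d)

# The four sequences of the problem.
seq_a = ones(n)
seq_b = [j == 0 ? 1.0 : 0.0 for j in 0:n-1]
seq_c = [(-1.0)^j for j in 0:n-1]
seq_d = [j < n ÷ 2 ? 1.0 : 0.0 for j in 0:n-1]

# The closed forms derived above.
hat_a = [k == 0 ? 1.0 : 0.0 for k in 0:n-1]
hat_b = fill(1 / n, n)
hat_c = [k == n ÷ 2 ? 1.0 : 0.0 for k in 0:n-1]
hat_d = [k == 0 ? 0.5 + 0.0im :
         (iseven(k) ? 0.0 + 0.0im : (1 - im * cot(pi * k / n)) / n) for k in 0:n-1]

@assert max_error(finite_fourier(seq_a), hat_a) < 1e-12
@assert max_error(finite_fourier(seq_b), hat_b) < 1e-12
@assert max_error(finite_fourier(seq_c), hat_c) < 1e-12
@assert max_error(finite_fourier(seq_d), hat_d) < 1e-12
println("all four closed forms match the direct sums for n = ", n)
```

A direct sum is used instead of an FFT library so that the $\frac{1}{n}$ scaling and the sign of the exponent match the definition in this chapter exactly.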

## Problem 4

Show that the sequence $c_j = j$ on $\{0,1,2,...,n-1\}$ has finite Fourier
transform

\begin{equation*}
\hat{c_k} = \begin{cases}
\frac{n-1}{2} &\text{$k=0$}\\
-\frac{1}{2} + \frac{i}{2}\cot(\frac{k\pi}{n}) &\text{$k \ne 0$}
\end{cases}
\end{equation*}

Given $c_j = j$ on $\{0,1,2,...,n-1\}$,

the Finite Fourier Transformation is calculated as,

$\hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_ju_n^{-jk}$

$\implies \hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}ju_n^{-jk}$

For $k=0$, the terms in summation will all reduce to j as $u_n^{0} = 1$.

$\implies \hat{c_0} = \frac{1}{n} \sum_{j=0}^{n-1}j = \frac{1}{n}\frac{n(n-1)}{2}$

$\implies \hat{c_0} = \frac{n-1}{2}$

For $k \ne 0$, the terms in the summation form an arithmetico-geometric series with first term $0 \cdot u_n^{0} = 0$, common difference $1$, ratio $u_n^{-k}$ and generic term $ju_n^{-jk}$, whose sum of n terms can be derived as described [here](https://en.wikipedia.org/wiki/Arithmetico%E2%80%93geometric_sequence).

$\implies \hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}ju_n^{-jk}$

Multiplying by ratio, $u_n^{-k}$ on both sides,

$\implies \hat{c_k}u_n^{-k} = \frac{1}{n} \sum_{j=0}^{n-1}ju_n^{-(j+1)k} = \frac{1}{n} \sum_{j=1}^{n}(j-1)u_n^{-jk}$

Taking the difference with previous summation,

$\implies (1-u_n^{-k})\hat{c_k} = \frac{1}{n} (\sum_{j=1}^{n-1}u_n^{-jk} - (n-1)u_n^{-nk})$

By the sum of a geometric series, as in Problem 3 a) above,

$\implies (1-u_n^{-k})\hat{c_k} = \frac{1}{n} (\frac{u_n^{-k} - u_n^{-nk}}{1 - u_n^{-k}} - (n-1)u_n^{-nk})$

$\implies \hat{c_k} = \frac{1}{n} (\frac{u_n^{-nk} - nu_n^{-nk}}{1 - u_n^{-k}} + \frac{u_n^{-k} - u_n^{-nk}}{(1 - u_n^{-k})^2})$

Note that, $u_n^0 = 1$ and $u_n^n = 1$. Hence, $u_n^{-nk} = 1$.

$\implies \hat{c_k} = \frac{1}{n} (\frac{1 - n}{1 - u_n^{-k}} + \frac{u_n^{-k} - 1}{(1 - u_n^{-k})^2})
= \frac{1}{n} (\frac{n - 1}{u_n^{-k} - 1} + \frac{1}{u_n^{-k}-1})$

$\implies \hat{c_k} = \frac{1}{n} \frac{n}{u_n^{-k} - 1}= \frac{-1}{1 - u_n^{-k}}$

Multiplying by the conjugate $1 - u_n^{k}$ on both numerator and denominator,

$\implies \hat{c_k} = -\frac{1 - u_n^{k}}{(1 - u_n^{k})(1 - u_n^{-k})} $

Since, $(1 - u_n^{k})(1 - u_n^{-k}) = 1 + 1 - u_n^{k} - u_n^{-k} = 2(1 - \cos(\frac{2\pi k}{n}))$ as $u_n^{k} = e^{\frac{2\pi i k}{n}} = \cos(\frac{2\pi k}{n}) + i\sin(\frac{2\pi k}{n})$.

$\implies \hat{c_k} = - \frac{1 - u_n^{k}}{ (2 - u_n^{k} - u_n^{-k})} = - \frac{1 - \cos(\frac{2\pi k}{n}) - i\sin(\frac{2\pi k}{n})}{2 - 2\cos(\frac{2\pi k}{n})}$

Since, $\sin(2\theta) = 2\sin(\theta)\cos(\theta)$ and $\cos(2\theta) = 1 - 2\sin^2(\theta)$. 
Hence, $\sin(\frac{2\pi k}{n}) = 2\sin(\frac{\pi k}{n})\cos(\frac{\pi k}{n})$ and $1 - \cos(\frac{2\pi k}{n}) = 2\sin^2(\frac{\pi k}{n})$.

$\implies \hat{c_k} = \frac{-1}{2} ( 1 - i\frac{\sin(\frac{2\pi k}{n})}{1 - \cos(\frac{2\pi k}{n})}) = \frac{-1}{2} ( 1 - i\frac{\cos(\frac{\pi k}{n})}{\sin(\frac{\pi k}{n})}) $

$\implies \hat{c_k} = \frac{-1}{2} ( 1 - i\cot(\frac{\pi k}{n})) $

$\implies \hat{c_k} = -\frac{1}{2} + \frac{i}{2}\cot(\frac{k\pi}{n})$

Hence, the sequence $c_j = j$ on $\{0,1,2,...,n-1\}$ has finite Fourier transform:

\begin{equation*}
\hat{c_k} = \begin{cases}
\frac{n-1}{2} &\text{$k=0$}\\
-\frac{1}{2} + \frac{i}{2}\cot(\frac{k\pi}{n}) &\text{$k \ne 0$}
\end{cases}
\end{equation*}

$QED$.
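As a numerical sanity check of this result, the sketch below compares the closed form with a direct evaluation of the defining sum for $n = 7$. The helper name finite_fourier and the choice of n are assumptions made for this example; only base Julia is used.

```julia
# Direct O(n^2) finite Fourier transform with the 1/n scaling used above.
# Entry k + 1 of the result holds c-hat_k for k = 0, ..., n - 1.
function finite_fourier(c)
    n = length(c)
    return [sum(c[j + 1] * exp(-2pi * im * j * k / n) for j in 0:n-1) / n for k in 0:n-1]
end

n = 7                                   # the result holds for odd and even n
c_hat = finite_fourier(collect(0:n-1))  # the sequence c_j = j

# Closed form: (n - 1)/2 at k = 0 and -1/2 + (i/2) cot(k pi / n) otherwise.
closed_form = [k == 0 ? (n - 1) / 2 + 0.0im : -0.5 + 0.5im * cot(k * pi / n) for k in 0:n-1]

@assert maximum(abs.(c_hat .- closed_form)) < 1e-12
println("closed form matches the direct sum for n = ", n)
```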

## Problem 6

Prove parts (b) and (c) of Proposition 10.2.1.

Soln:

b) The finite Fourier transform satisfies the rule:

$\hat{T_rc_k} = u_{n}^{-kr}\hat{c_k}$

Proof:

Given $c_j$, the Finite Fourier Transformation is calculated as,

$\hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_ju_n^{-jk}$

The translate of the periodic sequence $c_j$ by index r is the periodic sequence $T_rc_j$ defined by $T_rc_j = c_{j-r}$. Thus, the operator $T_r$ translates a sequence r places to the right.

Given $T_rc_j$, the Finite Fourier Transformation is calculated as

$\hat{T_rc_k} = \frac{1}{n} \sum_{j=0}^{n-1}T_rc_ju_n^{-jk}$

Since, $T_rc_j = c_{j-r}$, substituting in the above,

$\implies \hat{T_rc_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_{j-r}u_n^{-jk}$

Let, $j-r = l$ and hence, $j = l+r$. Now, l varies from $-r$ to $n-1-r$.

$\implies \hat{T_rc_k} = \frac{1}{n} \sum_{l=-r}^{n-1-r}c_{l}u_n^{-(l+r)k}$

$\implies \hat{T_rc_k} = u_n^{-kr} \frac{1}{n} \sum_{l=-r}^{n-1-r}c_{l}u_n^{-lk}$

$\implies \hat{T_rc_k} = u_n^{-kr} \frac{1}{n}(\sum_{l=-r}^{-1}c_{l}u_n^{-lk} + \sum_{l=0}^{n-1-r}c_{l}u_n^{-lk})$

Substitute, $m=n+l$ i.e. $l=m-n$ and $m$ varies from $n-r$ to $n-1$ in first part of summation.

$\implies \hat{T_rc_k} = u_n^{-kr} \frac{1}{n}(\sum_{m=n-r}^{n-1}c_{m-n}u_n^{-(m-n)k} + \sum_{l=0}^{n-1-r}c_{l}u_n^{-lk})$

Since, $c_j$ is a periodic sequence with period n, $c_{m-n}=c_{m}$ and $u_n$ also has period n as $u_n^{m-n} = u_n^{m}$ since $u_n^{n} = e^{2\pi i}=1$. 

$\implies \hat{T_rc_k} = u_n^{-kr} \frac{1}{n}(\sum_{m=n-r}^{n-1}c_{m}u_n^{-mk} + \sum_{l=0}^{n-1-r}c_{l}u_n^{-lk})$

Hence, $\sum_{l=-r}^{n-1-r}c_{l}u_n^{-lk} = \sum_{l=0}^{n-1}c_{l}u_n^{-lk}$.

$\implies \hat{T_rc_k} = u_n^{-kr} \frac{1}{n}\sum_{l=0}^{n-1}c_{l}u_n^{-lk}$

$\implies \hat{T_rc_k} = u_{n}^{-kr}\hat{c_k}$

$QED$.
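The translation rule can also be verified numerically. The sketch below uses a made-up complex sequence with $n = 6$ and $r = 2$, stores one period in a vector, and handles periodicity with mod; the helper name finite_fourier and the test sequence are assumptions for illustration.

```julia
# Direct O(n^2) finite Fourier transform with the 1/n scaling used above.
# Entry k + 1 of the result holds c-hat_k for k = 0, ..., n - 1.
function finite_fourier(c)
    n = length(c)
    return [sum(c[j + 1] * exp(-2pi * im * j * k / n) for j in 0:n-1) / n for k in 0:n-1]
end

n, r = 6, 2
c = [sin(j) + im * cos(2 * j) for j in 0:n-1]    # arbitrary test sequence

# T_r c_j = c_{j-r}, with indices taken modulo n because c is periodic.
shifted = [c[mod(j - r, n) + 1] for j in 0:n-1]

lhs = finite_fourier(shifted)
rhs = [exp(-2pi * im * k * r / n) for k in 0:n-1] .* finite_fourier(c)

@assert maximum(abs.(lhs .- rhs)) < 1e-12
println("translation rule holds for n = ", n, ", r = ", r)
```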

c) The finite Fourier transform satisfies the rule:

$\hat{Rc_k} = R\hat{c_k}$

Proof:

Given $c_j$, the Finite Fourier Transformation is calculated as,

$\hat{c_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_ju_n^{-jk}$

The reversion operator R takes a sequence $c_j$ into $Rc_j = c_{-j}$ .

Given $Rc_j$, the Finite Fourier Transformation is calculated as

$\hat{Rc_k} = \frac{1}{n} \sum_{j=0}^{n-1}Rc_ju_n^{-jk}$

Since, $Rc_j = c_{-j}$, substituting in the above,

$\implies \hat{Rc_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_{-j}u_n^{-jk}$

Let, $j = -l$ and hence, $l = -j$. Now, l varies from $0$ to $-(n-1)$.

$\implies \hat{Rc_k} = \frac{1}{n} \sum_{l=-(n-1)}^{0}c_{l}u_n^{lk}$

Substitute, $m=n+l$, i.e. $l=m-n$ and hence, $m$ varies from $1$ to $n$ in the summation.

$\implies \hat{Rc_k} = \frac{1}{n} \sum_{m=1}^{n}c_{m-n}u_n^{(m-n)k}$

Since, $c_j$ is a periodic sequence with period n, $c_{m-n}=c_{m}$ and $u_n$ also has period n as $u_n^{m-n} = u_n^{m}$ since $u_n^{n} = e^{2\pi i}=1$. 

$\implies \hat{Rc_k} = \frac{1}{n} \sum_{m=1}^{n}c_{m}u_n^{mk} = \frac{1}{n} (\sum_{m=1}^{n-1}c_{m}u_n^{mk} + c_{n}u_n^{nk})$

Again, by periodicity, $c_{n}=c_{0}$ and $u_n^{nk}=u_n^{0k}=1$,

$\implies \hat{Rc_k} = \frac{1}{n} (\sum_{m=1}^{n-1}c_{m}u_n^{mk} + c_{0}u_n^{0k}) = \frac{1}{n} \sum_{m=0}^{n-1}c_{m}u_n^{mk}$

$\implies \hat{Rc_k} = \frac{1}{n} \sum_{j=0}^{n-1}c_{j}u_n^{jk}$

However, $\frac{1}{n} \sum_{j=0}^{n-1}c_{j}u_n^{jk} = \hat{c_{-k}}$ by substituting $-k$ in DFT equation.

Hence, $\hat{Rc_k} = \hat{c_{-k}}$.

But $\hat{c_{-k}} = R\hat{c_{k}}$ as obtained by applying Reversion operator R on $\hat{c_{k}}$.

$\implies \hat{Rc_k} = \hat{c_{-k}} = R\hat{c_{k}}$

$\implies \hat{Rc_k} = R\hat{c_{k}}$

$QED$.
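A similar numerical check works for the reversion rule. In the sketch below, reversal of a periodic sequence is implemented by indexing with mod(-j, n); the helper name finite_fourier and the test sequence are assumptions for illustration, and only base Julia is used.

```julia
# Direct O(n^2) finite Fourier transform with the 1/n scaling used above.
# Entry k + 1 of the result holds c-hat_k for k = 0, ..., n - 1.
function finite_fourier(c)
    n = length(c)
    return [sum(c[j + 1] * exp(-2pi * im * j * k / n) for j in 0:n-1) / n for k in 0:n-1]
end

n = 6
c = [sin(j) + im * cos(2 * j) for j in 0:n-1]    # arbitrary test sequence

# R c_j = c_{-j}, with indices taken modulo n because c is periodic.
reversed = [c[mod(-j, n) + 1] for j in 0:n-1]

c_hat = finite_fourier(c)
lhs = finite_fourier(reversed)                   # transform of the reversed sequence
rhs = [c_hat[mod(-k, n) + 1] for k in 0:n-1]     # reversal of the transform

@assert maximum(abs.(lhs .- rhs)) < 1e-12
println("reversion rule holds for n = ", n)
```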