# Homework – Spectral Analysis 
## IEOR 135/290, Data-X: Applied Data Ventures
Author: Sudarshan Gopalakrishnan (in collaboration with Ikhlaq Sidhu)

UC Berkeley, B.S. EECS'21

Email: sudarshan.gopal@berkeley.edu



## Objective

In this homework, we will use spectral analysis to build a time-series forecast of the number of single-family homes purchased in the United States. The dataset is sourced from the Federal Reserve Bank of St. Louis and covers the number of single-family homes purchased from 1963 to 2020.

### Setting Up Imports and Dataset

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import mean_squared_error

%matplotlib inline

sns.set(rc={'figure.figsize':(14,6)})

In [None]:
df = pd.read_csv("housing_data.csv")
df["DATE"] = pd.to_datetime(df["DATE"])

date_mask = (df["DATE"] > "1963-01-01") & (df["DATE"] <= "1992-12-31")
data = df[date_mask]

### Your Dataset

In [None]:
plt.plot('DATE', 'HSN1F', data=data)

### Training/Testing Split

In [None]:
test_split = 0.4  # fraction of the data held out for testing
test_index = int(len(data) * (1 - test_split))  # position of the first test row

In [None]:
train, test = data.iloc[:test_index], data.iloc[test_index:]

In [None]:
train.shape, test.shape, data.shape

### Question 1
Compute the Fourier transform of the HSN1F values in the dataset, and store it in fft_hsn1f.

In [None]:
fft_hsn1f = np.fft.fft(data["HSN1F"])
fft_hsn1f[:10]

-------------------
As expected, you notice a real and an imaginary component in each entry of the above array. The plot below shows the magnitude of fft_hsn1f plotted against sample count.


**Plot:** Magnitude of the Fourier Transform of HSN1F vs. Sample Count

In [None]:
plt.plot(abs(fft_hsn1f))
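
The `abs` call in the plot above takes the magnitude of each complex FFT coefficient. As a minimal sketch of what the real and imaginary parts encode, the block below uses a small synthetic signal (an assumption for illustration, not the housing data) and recovers its mean, amplitude, and phase from the spectrum. The names `signal`, `magnitude`, and `phase` are illustrative only.

```python
import numpy as np

# Synthetic signal (assumed for illustration, not the housing data):
# a constant offset of 2 plus a cosine with 4 cycles over 64 samples.
n = 64
t = np.arange(n)
signal = 2.0 + 3.0 * np.cos(2 * np.pi * 4 * t / n + 0.5)

spectrum = np.fft.fft(signal)

magnitude = np.abs(spectrum)   # strength of each frequency bin
phase = np.angle(spectrum)     # phase offset of each bin, in radians

# Bin 0 holds the sum of the samples, so magnitude[0] / n is the mean.
mean_estimate = magnitude[0] / n
# A real cosine of amplitude A contributes A * n / 2 to bins k and n - k.
amplitude_estimate = 2 * magnitude[4] / n
phase_estimate = phase[4]
```

Under these assumptions the three estimates match the offset, amplitude, and phase used to build the signal, up to floating-point error.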

### Question 2
What do you notice about this signal?

In [None]:
# Use this space to manipulate the signal for your analysis. You might consider taking the log of the data for clarity.

*Your answer here*

### Question 3

**Question 3a: Shifting**

Use the Swap Half Spaces technique to process the FFT signal. Generate plots and discuss your conclusions.
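
One possible reading of "Swap Half Spaces" is the operation NumPy provides as `np.fft.fftshift`, which reorders the FFT output so that the zero-frequency bin sits in the middle and the negative frequencies come first. That mapping is an assumption here, and the sketch below uses a tiny synthetic signal rather than the housing data.

```python
import numpy as np

# Synthetic signal (assumed for illustration): 2 cycles over 8 samples.
n = 8
t = np.arange(n)
signal = np.cos(2 * np.pi * 2 * t / n)

spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(n)  # bin frequencies in cycles per sample, FFT order

# Move the negative-frequency half in front of the positive-frequency half.
shifted = np.fft.fftshift(spectrum)
shifted_freqs = np.fft.fftshift(freqs)

# For even n this is the same as swapping the two halves by hand.
manual = np.concatenate([spectrum[n // 2:], spectrum[:n // 2]])
```

After the shift, `shifted_freqs` increases monotonically from -0.5 to 0.375, so plotting `np.abs(shifted)` against it shows the two mirrored peaks on either side of zero.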

In [None]:
fft_shifted = ...

In [None]:
# Write your code here to generate plots. Optional: consider taking the log of the signal to visualize it better.

*Discuss your conclusions here*

**Question 3b: Folding**

Use the Fold Negative Frequencies technique to process the FFT signal.
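
For a real-valued signal the FFT is conjugate-symmetric, so each negative-frequency bin mirrors a positive-frequency bin. A minimal sketch of folding, assuming it means adding each mirrored bin onto its positive counterpart to obtain a one-sided spectrum, is shown below on a synthetic signal (not the housing data). The variable names are illustrative.

```python
import numpy as np

# Synthetic signal (assumed for illustration): offset 1, amplitude-2 cosine.
n = 16
t = np.arange(n)
signal = 1.0 + 2.0 * np.cos(2 * np.pi * 3 * t / n)

spectrum = np.fft.fft(signal)
magnitude = np.abs(spectrum) / n  # normalise by the number of samples

# Keep bins 0..n/2 and fold the mirrored bins onto them.
half = n // 2
folded = magnitude[:half + 1].copy()
# Bins 1..half-1 each have a mirror at n - k with equal magnitude;
# bin 0 and the Nyquist bin (n/2, even n) have no separate mirror.
folded[1:half] *= 2
```

With this normalisation, `folded[0]` reads as the signal mean and `folded[3]` as the amplitude of the cosine. `np.fft.rfft` returns the same non-negative-frequency bins directly, without the doubling.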

In [None]:
fft_folding = ...

In [None]:
# Write your code here to generate plots. Optional: consider taking the log of the signal to visualize it better.

*Discuss your conclusions here*

### Question 4

**Question 4a** 

Identify the features you want to use for building the predictor.
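
One common way to choose features is to rank the frequency bins by magnitude and keep the strongest few. The sketch below illustrates this on a synthetic signal with known components (an assumption for illustration, not the housing data); `top_k` and `top_bins` are hypothetical names. Note that `argsort` on raw complex values orders them by real part and then imaginary part, so the magnitude is taken first.

```python
import numpy as np

# Synthetic signal (assumed for illustration) with three cosine components.
n = 120
t = np.arange(n)
signal = (5.0
          + 4.0 * np.cos(2 * np.pi * 2 * t / n)
          + 1.5 * np.cos(2 * np.pi * 10 * t / n + 1.0)
          + 0.2 * np.cos(2 * np.pi * 25 * t / n))

spectrum = np.fft.fft(signal)

# Only the non-negative frequencies are ranked; the rest are mirror images.
half_magnitude = np.abs(spectrum[:n // 2 + 1])

top_k = 3
top_bins = np.argsort(half_magnitude)[-top_k:][::-1]  # strongest first
```

Here the three strongest bins are 0 (the mean), 2, and 10, which correspond to the three largest components in the synthetic signal.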

In [None]:
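# Rank the first 10 frequency bins by magnitude, largest first.
# (argsort on the raw complex values would order them by real part instead.)
feature_samples = np.abs(fft_hsn1f[:10]).argsort()[-10:][::-1]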
feature_samples

**Question 4b** 

Complete the function generate_sine_function so that it returns a lambda function defining the sine function that fits a particular feature.

*Hint:* Please refer to the function predictor to see how generate_sine_function is used.
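
As background, each FFT bin k of an n-sample real signal corresponds to a sinusoid with frequency k/n cycles per sample, an amplitude derived from the coefficient's magnitude, and a phase given by its angle. The sketch below shows this on a synthetic signal (not the housing data). `component_from_bin` is a hypothetical helper written for this illustration, and it uses a phase-shifted cosine, which is equivalent to a sine with a different phase. It is one possible construction under these assumptions, not necessarily the intended solution.

```python
import numpy as np

# Synthetic signal (assumed for illustration) built from known components.
n = 120
t = np.arange(n)
signal = (5.0
          + 4.0 * np.cos(2 * np.pi * 2 * t / n)
          + 1.5 * np.sin(2 * np.pi * 10 * t / n))

spectrum = np.fft.fft(signal)

def component_from_bin(k):
    """Return a function x -> contribution of FFT bin k (hypothetical helper)."""
    amplitude = np.abs(spectrum[k]) / n
    phase = np.angle(spectrum[k])
    # Bins strictly between 0 and n/2 have a mirror at n - k with equal
    # magnitude, so their amplitude is doubled; bin 0 and bin n/2 are not.
    scale = 1.0 if k == 0 or 2 * k == n else 2.0
    return lambda x: scale * amplitude * np.cos(
        2 * np.pi * k * np.asarray(x) / n + phase)

# Sum the contributions of the bins that carry the signal.
bins = [0, 2, 10]
reconstruction = sum(component_from_bin(k)(t) for k in bins)
```

Because the synthetic signal contains only these three components, the summed sinusoids reproduce it up to floating-point error. On real data, a handful of bins gives only an approximation.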

In [None]:
def generate_sine_function(feature):
    return lambda x: ...

In [None]:
def predictor(x):
    functions = [generate_sine_function(feature) for feature in feature_samples]
    return sum([function(x) for function in functions])


In [None]:
y_prediction = predictor(data.index)
y_prediction_test = predictor(data.index[test_index:])

In [None]:
plt.plot(data["HSN1F"])
plt.plot(y_prediction)
plt.show()

In [None]:
print("Your time-series prediction has a mean squared error of {} on your test set".format(mean_squared_error(test["HSN1F"], y_prediction_test)))