<a href="https://colab.research.google.com/github/TiantianWang-Sara/Machine-Learning-Projects/blob/main/FIN553_2024_Graded_Project_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Due date:
October 11

# Problem statement

You are part of a trading team that makes investments based on [Fundamental Analysis](https://en.wikipedia.org/wiki/Fundamental_analysis).

Your job is to produce revenue growth forecasts for firms in a given industry based on financial characteristics of these firms and relevant economic data. Your forecasts are then used in a larger decision-making system.

You have access to a dataset of 15,000 observations. Each observation $(X_i, y_i)$ consists of a vector $X_i$ of 4,000 economic and financial indicators that could plausibly forecast revenue growth for the next quarter, and the corresponding revenue growth $y_i$.

The corresponding data set is given below:

X.npy: https://drive.google.com/file/d/1SbC0xE1PPK0gL6J2yolIaQ07eoNVk2bM


y.npy: https://drive.google.com/file/d/1HxnGlU_epSaiDppRCzkZAX8AJIkV0xvS


# Task
Construct a linear model to forecast revenue growth based on the data you have. Your model will be evaluated by the mean squared error between its predictions and the labels on a test set. The test set comes from the same distribution as the training set and the evaluation set.
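
The evaluation metric can be sketched in a few lines. The example below is only an illustration on made-up toy data, not the project dataset; the helper name `mse_linear` and the toy arrays are assumptions introduced here.

```python
import numpy as np

def mse_linear(X, y, theta):
    """Mean squared error of the linear predictions X @ theta against y."""
    residuals = X @ theta - y
    return float(np.mean(residuals ** 2))

# Toy data (hypothetical): y is exactly linear in X, so the true
# coefficients reproduce the labels without error.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(50, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y_toy = X_toy @ theta_true

print(mse_linear(X_toy, y_toy, theta_true))       # exact fit: 0.0
print(mse_linear(X_toy, y_toy, np.zeros(3)) > 0)  # a worse model scores higher
```

A lower value means the predictions are closer to the labels on average.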



# Deliverables
You should send me the entire code used to solve the problem, either as a Colab link or as a single .py file.

As part of your solution, you should also deliver a function.


In [None]:
import jax.numpy as jnp
from google.colab import drive
import jax
import optax
import numpy as np

drive.mount('/content/drive')

%cd /content/drive/MyDrive/Colab Notebooks

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/MyDrive/Colab Notebooks


In [None]:
X=jnp.load("X.npy")
y=jnp.load("y.npy")

In [None]:
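def stack(θ):
  # Replicate an array along a new leading axis, one copy per model.
  return jnp.array([θ] * n_models)

def f(X, θ):
  # Linear model: one prediction per row of X.
  return jnp.dot(X, θ)

def loss(θ,λ):
  # Lasso objective: mean squared error plus an L1 penalty on θ.
  # Note: this reads the module-level X and y, not a mini-batch.
  return jnp.mean((f(X, θ) - y)**2) + λ*jnp.abs(θ).sum()

def sample():
  # Draw a random batch of rows (with replacement) from the current X, y.
  batch_size = 10000
  idx = np.random.choice(N, batch_size)
  return X[idx], y[idx]

grad_loss = jax.grad(loss)  # gradient with respect to θ only

@jax.jit
def update(θ, opt_state, X, y, α, beta, beta2,λ):
  # One Adam step. The X and y arguments are not used here: loss() reads
  # the module-level arrays, so every step is a full-batch gradient step.
  grad = grad_loss(θ,λ)
  optimizer = optax.adam(α, beta, beta2)
  updates, opt_state = optimizer.update(grad, opt_state)
  θ = optax.apply_updates(θ, updates)
  return θ, opt_state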

α = 0.001
beta = 0.9
beta2 = 0.999
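
For reference, the objective that `loss` computes can be written out as follows, where $n$ is the number of rows of `X` and $p$ is the number of indicators (both symbols are introduced here for illustration):

$$
L(\theta, \lambda) = \frac{1}{n}\sum_{i=1}^{n}\left(X_i^\top \theta - y_i\right)^2 + \lambda \sum_{j=1}^{p} \lvert \theta_j \rvert
$$

The penalty term does not depend on $i$, so adding it inside or outside the average over observations gives the same value.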

In [None]:
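N = X.shape[0]
# Hold out a random 20% of the rows (sampled without replacement) as a test set.
idx = np.random.choice(N, int(N*0.2),replace=False)

X_test = X[idx]
y_test = y[idx]

# Keep the remaining rows for training.
X = jnp.delete(X, idx, axis=0)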
y = jnp.delete(y, idx)
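
A hold-out split like the one above can also be expressed by shuffling the row indices once and slicing them, which makes it easy to check that the two parts are disjoint. This is a minimal sketch; the helper name `split_indices` and the fixed seed are assumptions made for illustration.

```python
import numpy as np

def split_indices(n_rows, test_fraction, seed=0):
    """Return disjoint (train_idx, test_idx) arrays that together cover range(n_rows)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_rows)          # random ordering of all row indices
    n_test = int(n_rows * test_fraction)    # size of the held-out part
    return perm[n_test:], perm[:n_test]

train_idx, test_idx = split_indices(15_000, 0.2)
print(len(train_idx), len(test_idx))  # 12000 3000
```

The resulting index arrays could then be used as `X[train_idx]` and `X[test_idx]`.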

In [None]:
# Finding best lambda, skip if takes long time

@jax.jit
@jax.vmap
def mse(θ):
  # Held-out mean squared error, evaluated for every model at once.
  return jnp.mean((f(X_test, θ) - y_test)**2)

# Vectorize the Adam step over models: each model has its own θ, optimizer
# state and λ; the remaining arguments are shared. A separate name keeps
# this cell safe to re-run.
update_all = jax.jit(jax.vmap(update, in_axes=(0, 0, None, None, None, None, None, 0)))

n_models = 100
N = X.shape[0]

# Search interval for λ, narrowed after every round.
low = 0
high = 1

for i in range(10):
  # Random candidates inside the interval plus both endpoints, sorted.
  λ = np.sort(np.hstack([np.random.uniform(low, high, n_models-2),[low,high]]))
  # Restart every model from zero with a fresh optimizer state.
  θ = jnp.zeros(X.shape[1])
  opt_state = optax.adam(α, beta, beta2).init(θ)
  θ = stack(θ)
  opt_state = jax.tree.map(stack, opt_state)
  for iteration in range(10000):
    Xi, yi = sample()
    θ, opt_state = update_all(θ, opt_state, Xi, yi, α, beta, beta2,λ)
  mse_number = mse(θ)
  best = int(np.argmin(mse_number))
  optimal = λ[best]
  # Shrink the interval to the neighbours of the best candidate,
  # clamping at the ends of the grid.
  low = λ[max(best-1, 0)]
  high = λ[min(best+1, n_models-1)]
  print(optimal)

0.0
0.001020432577526862
0.0011034309819248753
0.0011124879062249372
0.0011130614350518665
0.0011130614463100207
0.0011130613895146226
0.0011130613857435827
0.0011130613856957631
0.0011130613856949578
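
The search above narrows an interval around the best candidate in each round. The sketch below shows the same refinement idea on a toy one-dimensional objective that stands in for the held-out error; `refine_search` and `toy_objective` are hypothetical names, and the toy function is noise-free, which a real validation error is not.

```python
import numpy as np

def refine_search(objective, low, high, n_points=100, n_rounds=10, seed=0):
    """Repeatedly sample candidates in [low, high] and shrink the interval
    to the neighbours of the best candidate."""
    rng = np.random.default_rng(seed)
    best = low
    for _ in range(n_rounds):
        # Random interior candidates plus both endpoints, sorted.
        grid = np.sort(np.concatenate([rng.uniform(low, high, n_points - 2), [low, high]]))
        scores = np.array([objective(g) for g in grid])
        i = int(np.argmin(scores))
        best = grid[i]
        # Clamp the neighbour indices at the ends of the grid.
        low = grid[max(i - 1, 0)]
        high = grid[min(i + 1, n_points - 1)]
    return best

def toy_objective(lam):
    # Hypothetical stand-in for a validation error with its minimum at 0.3.
    return (lam - 0.3) ** 2

best_lam = refine_search(toy_objective, 0.0, 1.0)
print(abs(best_lam - 0.3) < 1e-3)
```

For a function with a single minimum, the minimizer always lies between the two neighbours of the best grid point, so the interval keeps bracketing it. With a noisy objective that guarantee no longer holds.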


In [None]:
# Reload the full dataset so the final model is trained on every observation.
X=jnp.load("X.npy")
y=jnp.load("y.npy")
N = X.shape[0]  # refresh the row count used by sample()
θ = jnp.zeros(X.shape[1])
opt_state = optax.adam(α, beta, beta2).init(θ)

# Redefine update after reloading the data: jax.jit bakes in the arrays that
# loss() reads at trace time, so the step must be traced against the full X, y.
@jax.jit
def update(θ, opt_state, X, y, α, beta, beta2,λ):
  grad = grad_loss(θ,λ)
  optimizer = optax.adam(α, beta, beta2)
  updates, opt_state = optimizer.update(grad, opt_state)
  θ = optax.apply_updates(θ, updates)
  return θ, opt_state

try:
  λ = optimal
except NameError:
  # The λ search cell was skipped.
  λ = 0.0005 # After testing, we estimate this value will be the best.

print(λ)

for iteration in range(10000):
  Xi, yi = sample()
  θ, opt_state = update(θ, opt_state, Xi, yi, α, beta, beta2,λ)

  if iteration % 1000 == 0:
    print(θ)

0.0011130613856949578
[-0.00099999 -0.00099999  0.00099999 ... -0.00099999  0.00099999
 -0.00099999]
[-0.8613272  -0.5595428   0.88259137 ... -0.00617992  0.0072826
 -0.00466918]
[-1.5502613  -0.11304063  1.6991305  ... -0.00245418  0.00575801
 -0.00268312]
[-2.1400676e+00  7.3420894e-01  2.3306179e+00 ... -4.8147759e-04
  3.8450016e-03 -1.4240554e-03]
[-2.6283228e+00  1.6936095e+00  2.8356528e+00 ...  1.6224492e-04
  1.9775473e-03 -3.0377480e-05]
[-3.0043218e+00  2.6291127e+00  3.2306387e+00 ...  2.2768902e-04
  7.8674476e-04 -2.4365055e-04]
[-3.2749345e+00  3.4762940e+00  3.4778061e+00 ...  2.9893732e-04
  9.8539836e-05 -1.1413362e-04]
[-3.4665723e+00  4.2396989e+00  3.5513141e+00 ...  2.7967207e-04
 -1.6171644e-04  2.5095843e-04]
[-3.6056838e+00  4.9246049e+00  3.3389261e+00 ...  2.3610330e-04
 -4.3844229e-05  1.4751423e-04]
[-3.7113843e+00  5.5821447e+00  2.8458681e+00 ... -8.2906510e-05
 -6.5397347e-05  2.8547319e-04]


In [None]:
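def f(X):
  # Predict revenue growth for each row of X using the trained θ.
  # Note: this replaces the two-argument f used during training.
  prediction = jnp.dot(X,θ)
  return prediction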



The function `f` above should return the predictions of your model for a new dataset `X`.
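
One way to make the dependence on the trained coefficients explicit, shown here only as a sketch, is to build the prediction function from a coefficient vector instead of reading a global variable. The names `make_predictor`, `theta_demo` and `X_new` are hypothetical, and the numbers are made up for illustration.

```python
import numpy as np

def make_predictor(theta):
    """Return a function f(X) that predicts with a fixed coefficient vector."""
    theta = np.asarray(theta)

    def f(X):
        # One prediction per row of X.
        return np.asarray(X) @ theta

    return f

theta_demo = np.array([2.0, -1.0])  # hypothetical coefficients
f_demo = make_predictor(theta_demo)
X_new = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(f_demo(X_new))  # predictions 2, -1 and 1 for the three rows
```

A function built this way keeps working even if a variable named `θ` is later overwritten.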
