GitHub - utsavbgp/new

Branches Tags
Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
README		README
eval.jl		eval.jl
sgd.jl		sgd.jl
Repository files navigation

Description of the model:

Step 1: Predicting the log return of each signal at all dates using the information till previous days.

x[t] = a_1 * x[t-1] + a_2 * x[t-2] + ....... + a_p * x[t-p] + c + b_1 * e[t-1] + b_2 * e[t-2] + ....... + b_q * e[t-q] 

where,
x[t] = log retrun of the signal at time t.
e[t] = actual_x[t] - predicted_x[t], denotes the prediction error at time instant t
c    = intercept
This is a ARMA(p,q) (Auto-regressive Moving Average Model). The parameters are estimated using Stochastic Gradient Descent assuming L2-norm are the loss function. 
Please Note: Either of p or q can be set to zero but they cannot be set to zero together simultaneously

In each iteration the update step is:
a_i = a_i - learning_rate * (x_predicted[t] - x_actual[t]) * x[t-i]
b_j = b_j - learning_rate * (x_predicted[t] - x_actual[t]) * e[t-j]

After making the predictions a log bias correction term to added to remove the bias which results when predictions are made in log domain. For more details refer: 
http://www.vims.edu/people/newman_mc/pubs/Newman1993.pdf

logBiasCorrection = (Mean Squared Error in the log domain)/2
therefore,
x[t] = x[t] + logBiasCorrection

Step 2: Assigning the weights.
To account for the risk we consider balancing_factor. 

The signals with positive returns are assigned weights in proportion of their predictions, which sum up to balancing_factor.
The signal with negative predictions are assigned weights in inverse ratio of their predicted magnitude and these weights sum up to 1 - balancing factor.

Description of various module:

#########################################################################################################################################################################
@Function: hypothesis
find the value of the hypothesis function trans(theta) * x

@param theta: Array of theta vector, i.e., [a_1, a_2, ....., a_p, c, b_1, b_2, ......, b_q]
@param     x: Array consisting of previous values of signal and previous prediction errors, i.e., [x[t-1], x[t-2], ......., x[t-p], 1, e[t-1], e[t-2], ........., e[t-q]]

@output returns trans(theta) * x
#########################################################################################################################################################################
@Function: gradient
compute the stochastic gradient of L2-norm loss function

@param theta: Array of theta vector, i.e., [a_1, a_2, ....., a_p, c, b_1, b_2, ......, b_q]
@param     x: Array consisting of previous values of signal and previous prediction errors, i.e., [x[t-1], x[t-2], ......., x[t-p], 1, e[t-1], e[t-2], ........., e[t-q]]
@param     y: Float value containing actual value of the signal

@function calls: hypothesis

@output returns an Array of gradients for each predictor variables x[t-1], x[t-2], ...x[t-p], intercept, e[t-1], e[t-2], ......., e[t-q]
##########################################################################################################################################################################
@Function: updateWeights
Update the theta vector using stochastic gradient descent

@param theta: Array of theta vector, i.e., [a_1, a_2, ....., a_p, c, b_1, b_2, ......, b_q]
@param     x: Array consisting of previous values of signal and previous prediction errors, i.e., [x[t-1], x[t-2], ......., x[t-p], 1, e[t-1], e[t-2], ........., e[t-q]]
@param     y: Float value containing actual value of the signal
@param learningRate

@function calls: gradient

@output returns updated theta vector theta = theta - learningRate * gradient(theta, x, y)
##########################################################################################################################################################################
@Function: getPred
Computes the predicted values of the log returns of the input signal

@param   signalCol: Data Array containing actual values of log return of the signal on each date
@param  windowSize: Order of the Auto-Regressive Model, i.e., the value of p in the model equation
@param learningRate
@param   intercept: Boolean variable to select if intercept is part of model equation, i.e., if set to false the model equation won't have c in it.
@param errorWindow: Order of the Moving-Average Model, i.e., the value of q in the model equation
@param 	   logBias: Boolean Operator to select if log bias correction has to be used

@function calls: hypothesis: calculate the predicted value of the log returns of the signal
@function calls: updateWeights

@output predicted log returns on each date
##########################################################################################################################################################################
@Function getPredSignals
Computes the predicted log returns of each signal

@param  		dt: Data Frame of signals
@param  windowSize: Order of the Auto-Regressive Model, i.e., the value of p in the model equation
@param learningRate
@param   intercept: Boolean variable to select if intercept is part of model equation, i.e., if set to false the model equation won't have c in it.
@param errorWindow: Order of the Moving-Average Model, i.e., the value of q in the model equation
@param 	   logBias: Boolean Operator to select if log bias correction has to be used

@function calls: getPred: compute the predicted log returns of each signal

@output Data Frame of predicted values of the signals on each date
##########################################################################################################################################################################
@Function evalWeights
Evaluate the weights of various signals of the portfolio

@param 		  signalVec: Array of values of log returns of all signals on a date
@param balancing_Factor

@output returns the array of weights of all signals on the date
##########################################################################################################################################################################
@Function main
A wrapper function which calls getPredSignals and evalWeights

@param  		dt: Data Frame of signals
@param  windowSize: Order of the Auto-Regressive Model, i.e., the value of p in the model equation
@param learningRate
@param   intercept: Boolean variable to select if intercept is part of model equation, i.e., if set to false the model equation won't have c in it.
@param errorWindow: Order of the Moving-Average Model, i.e., the value of q in the model equation
@param 	   logBias: Boolean Operator to select if log bias correction has to be used
@param balancing_Factor

@function calls: getPredSignals
@function calls: evalWeights

@output returns the weight of each signal at all dates
##########################################################################################################################################################################

Function eval.py has been modified to eval.jl
To run the code:
__init__("path_of_the_file")