A practice implementation of a linear regression machine learning algorithm for predicting network traffic capacity utilization.
This will be a linear regression algorithm to predict a resulting capacity index value
y, given features of
Goal: Accurately predict the y value (within +/- 5) given just those two figures, based on the training set.
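One way to check the +/- 5 goal is to count how many predictions land within that tolerance of the true value. The snippet below is only a sketch: the numbers in y_true and y_pred are made up for illustration and do not come from the training set or from this repo's output.

```octave
% Hypothetical true capacity index values and predictions (illustration only).
y_true = [10; 16; 21; 27];
y_pred = [12; 15; 27; 23];

tolerance = 5;                       % the +/- 5 target from the goal above

abs_err  = abs(y_pred - y_true);     % absolute error per example
within   = abs_err <= tolerance;     % true where the prediction meets the goal
hit_rate = mean(within);             % fraction of predictions inside the tolerance

fprintf('%d of %d predictions within +/- %d\n', sum(within), numel(within), tolerance);
```

Tracking this fraction alongside the squared-error cost makes it easier to see whether a change to the model actually moves it toward the stated goal.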
/resources/ contains the training set, and
/src/ holds the code. In its current state the code gives inaccurate results, so if you have insights into how it can be improved, I'd love to hear them as an Issue on the repo.
Each row of the training set consists of two features, followed by the resulting "capacity index" in the last column.
main.m is run to kick off the algorithm.
addCustomFeatures.m generates new features from the input features, which gives us a more complex polynomial and, as a result, a more flexible model and a better fit to the training set.
computeCostMulti.m is our squared-error cost function.
featureNormalize.m normalizes our features, since the difference in scale between our input features is significant.
gradientDescentMulti.m runs our gradient descent.
hypothesis.m is our actual hypothesis. It didn't really need to be broken out, but keeping it separate makes it easier to try sweeping changes later in the work.
plotData.m is used for making visual representations of the data, so I can quickly reuse the same settings across multiple figures.
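To show how the pieces listed above fit together, here is a minimal, self-contained sketch of the same pipeline: feature expansion, normalization, squared-error cost, and batch gradient descent. It is not the code in /src/. The toy data, the choice of degree-2 terms, the learning rate, the iteration count, and every name in it (expandFeatures, predict, squaredErrorCost, and so on) are assumptions made for illustration.

```octave
% Hypothetical toy data: two features per row, capacity index in the last column.
data = [1 300 10; 2 100 10; 3 500 16; 4 200 15; ...
        5 600 21; 6 400 21; 7 800 27; 8 700 28];
X = data(:, 1:2);
y = data(:, 3);
m = numel(y);

% Feature expansion (assumed degree-2 terms: x1^2, x2^2, x1*x2).
expandFeatures = @(X) [X, X(:, 1) .^ 2, X(:, 2) .^ 2, X(:, 1) .* X(:, 2)];
X_poly = expandFeatures(X);

% Normalization: zero mean and unit standard deviation per column.
mu     = mean(X_poly);
sigma  = std(X_poly);
X_norm = bsxfun(@rdivide, bsxfun(@minus, X_poly, mu), sigma);

% Prepend the intercept column.
Xb = [ones(m, 1), X_norm];

% Linear hypothesis and squared-error cost.
predict          = @(X, theta) X * theta;
squaredErrorCost = @(X, y, theta) sum((predict(X, theta) - y) .^ 2) / (2 * numel(y));

% Batch gradient descent (alpha and num_iters are assumed values).
alpha     = 0.1;
num_iters = 500;
theta     = zeros(size(Xb, 2), 1);
J_history = zeros(num_iters, 1);
for iter = 1:num_iters
  grad  = (Xb' * (predict(Xb, theta) - y)) / m;   % gradient of the cost
  theta = theta - alpha * grad;                   % simultaneous update of all parameters
  J_history(iter) = squaredErrorCost(Xb, y, theta);
end

% Predict for a new input, reusing the training mu and sigma.
x_new  = [3.5 350];
x_norm = bsxfun(@rdivide, bsxfun(@minus, expandFeatures(x_new), mu), sigma);
y_new  = predict([1, x_norm], theta);
```

Plotting J_history against the iteration number is a quick sanity check: the curve should fall steadily, and if it rises or oscillates, alpha is likely too large. Note that new inputs must be expanded and normalized with the mu and sigma computed from the training set, not recomputed from the new data.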
This GitHub repository accompanies the project described in this blog post: Predicting Server Capacity with Linear Regression ML.