Performance Comparison of Supervised Classification Linear Machine Learning Algorithms in Scikit-Learn Applied to Algorithmic Trading using Python
With the advent of big data, that is, data collected in huge amounts from fields as diverse as the sciences, healthcare, government, and financial markets, machine learning has become increasingly mainstream. The relatively low cost of computer processing power and data storage has also fueled this popularity. Machine learning affects a large portion of the population daily, most notably through the recommendation engines of Amazon and Netflix. Another newsworthy application of machine learning is the self-driving car (from either Tesla or Google). Big data is also transforming the way investment and trading decisions are made. A computer can analyze huge amounts of data and, with sophisticated mathematics, detect patterns (hopefully profitable ones) that are impossible for humans to detect. Python has a library dedicated to machine learning algorithms called scikit-learn. This paper will describe and compare the supervised classification machine learning algorithms in scikit-learn as they pertain to algorithmic trading. Each applicable algorithm will be tested on the same instrument (EUR/USD, 1-minute granularity) and the same dataset, with parameters optimized as applicable, and the results will then be compared and analyzed.
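As a rough illustration of the comparison described above, the following minimal sketch tunes two linear scikit-learn classifiers on one shared dataset and scores them on the same held-out period. Everything specific in it is an assumption made for illustration and is not taken from this paper: the price series is a synthetic random walk standing in for 1-minute EUR/USD closes, the features are simply the last five one-minute returns, the two classifiers and their parameter grids are arbitrary choices, and accuracy is used as the only metric.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic random walk standing in for 1-minute EUR/USD closes.
# This is NOT real market data; replace it with the actual price series.
rng = np.random.default_rng(0)
prices = 1.10 + np.cumsum(rng.normal(0.0, 1e-4, size=2000))
returns = np.diff(prices)

# ASSUMPTION: features are the previous n_lags one-minute returns;
# the label is 1 if the next return is positive, else 0.
n_lags = 5
X = np.column_stack(
    [returns[i:len(returns) - n_lags + i] for i in range(n_lags)]
)
y = (returns[n_lags:] > 0).astype(int)

# Chronological split: the last 20% of samples are held out, never shuffled.
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Candidate linear classifiers with illustrative (hypothetical) grids.
# make_pipeline names each step after its lowercased class name.
candidates = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    ),
    "ridge_classifier": (
        RidgeClassifier(),
        {"ridgeclassifier__alpha": [0.1, 1.0, 10.0]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    pipeline = make_pipeline(StandardScaler(), model)
    # TimeSeriesSplit keeps every validation fold after its training fold.
    search = GridSearchCV(
        pipeline, grid, cv=TimeSeriesSplit(n_splits=5), scoring="accuracy"
    )
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "test_accuracy": search.score(X_test, y_test),
    }

for name, summary in results.items():
    print(name, summary)
```

Because the input here is a random walk, the reported accuracies should hover around chance and carry no meaning about real markets; the point is only the shape of the workflow. Splitting chronologically and using TimeSeriesSplit for tuning avoids training on data that comes after the period being evaluated, which a shuffled split would allow.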