This project combines unsupervised and supervised machine learning techniques with financial theory and statistics to build a diversified stock portfolio from stocks in the S&P 500 Index.
Data:
- S&P_500_End_of_Year_2020_Fundamental_FINAL.xlsx
Software/Deliverables:
- Constructing a Diversified Stock Portfolio using Machine Learning Techniques - FULL VERSION.ipynb
- Classification Models - EVALUATION.ipynb
- Clustering Algorithms - EVALUATION.ipynb
- main.py
Runs on macOS Big Sur (Version 11.1) and Windows 10 (or later)
- May also be compatible with earlier versions
This project requires you to have the following tools installed:
- Python v3 (https://www.python.org/downloads/)
- Anaconda (https://www.anaconda.com/products/distribution)
- A Python IDE of your choice
After installing Anaconda, install Streamlit (for example, with 'pip install streamlit').
The following Python libraries should be installed:
- yfinance 0.1.70 or later (https://pypi.org/project/yfinance/)
- PyPortfolioOpt (https://pyportfolioopt.readthedocs.io/en/latest/)
- cvxpy (https://www.cvxpy.org/install/)
- arch (https://pypi.org/project/arch/)
- matplotlib (https://pypi.org/project/matplotlib/)
- scikit-learn, imported as sklearn (https://pypi.org/project/scikit-learn/)
- numpy (https://pypi.org/project/numpy/)
- pandas (https://pypi.org/project/pandas/)
- plotly (https://plotly.com/python/getting-started/)
You should be able to install most, if not all, of the above libraries using 'pip install (LIBRARY_NAME)'. Note that sklearn is published on PyPI as scikit-learn.
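To confirm that the libraries listed above are available before opening the notebooks, a short check like the following may help. It is a minimal sketch, not part of the project: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names, and the import names (for example `pypfopt` for PyPortfolioOpt and `sklearn` for scikit-learn) are the ones these packages normally expose.

```python
from importlib.util import find_spec

# Maps each import name to the name used with 'pip install'.
# The two differ for PyPortfolioOpt and scikit-learn.
REQUIRED = {
    "streamlit": "streamlit",
    "yfinance": "yfinance",
    "pypfopt": "PyPortfolioOpt",
    "cvxpy": "cvxpy",
    "arch": "arch",
    "matplotlib": "matplotlib",
    "sklearn": "scikit-learn",
    "numpy": "numpy",
    "pandas": "pandas",
    "plotly": "plotly",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of libraries that cannot be found in this environment."""
    return [
        pip_name
        for import_name, pip_name in required.items()
        if find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing libraries, try: pip install " + " ".join(missing))
    else:
        print("All listed libraries were found.")
```

The check only looks for each library and does not verify versions, so the yfinance 0.1.70 minimum still needs to be confirmed separately.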
Constructing a Diversified Stock Portfolio using Machine Learning Techniques - FULL VERSION.ipynb
- Contains the main investigation carried out in this project
- Save the 'S&P_500_End_of_Year_2020_Fundamental_FINAL.xlsx' file in a location of your choice
- Go to the second block of code, where pandas reads the Excel file into a DataFrame, and change the file path to the location where you saved the Excel file.
- Run the code
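The file path change described above could look roughly like the sketch below. It is an illustration under assumptions rather than the notebook's actual cell: `DATA_DIR` is a hypothetical folder, `load_fundamentals` is a hypothetical helper, and reading the first sheet with default options is assumed. pandas generally relies on the openpyxl package to read .xlsx files, so that may need installing as well.

```python
from pathlib import Path

# Hypothetical location: replace with the folder where you saved the workbook.
DATA_DIR = Path.home() / "Downloads"
DATA_FILE = DATA_DIR / "S&P_500_End_of_Year_2020_Fundamental_FINAL.xlsx"


def load_fundamentals(path=DATA_FILE):
    """Read the workbook's first sheet into a DataFrame, failing early on a bad path."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Excel file not found: {path}")
    # Imported here so the path check above runs even if pandas is not installed yet.
    import pandas as pd

    return pd.read_excel(path)
```

Building the path with `pathlib` avoids backslash escaping problems on Windows and works unchanged on macOS.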
Clustering Algorithms - EVALUATION.ipynb
- Contains the evaluation of the clustering algorithms
- Save the 'S&P_500_End_of_Year_2020_Fundamental_FINAL.xlsx' file in a location of your choice
- Go to the second block of code, where pandas reads the Excel file into a DataFrame, and change the file path to the location where you saved the Excel file.
- Run the code
Classification Models - EVALUATION.ipynb
- Contains the evaluation of the classification models
main.py
- Contains the user interface which demonstrates the practical use of our investigation
- It simulates the 2021 trading year (from 4 January 2021 to 31 December 2021)
- The data is from yfinance
- The application uses the stocks selected in 'Constructing a Diversified Stock Portfolio using Machine Learning Techniques - FULL VERSION.ipynb'
- Set up streamlit
- Open terminal and change the directory to the location where you downloaded 'main.py'
- Run 'streamlit run main.py'
- Input an initial amount on the sidebar (this has to be a minimum of 100 dollars)
- Select allocation strategy
- Click on the 'Generate Portfolio' button
- You should see the current date, current portfolio value, profit/loss (which is initially N/A) and a pie chart of the initial allocation.
- Click the 'Rebalance Portfolio' button to see the pie chart update to the current holdings, then scroll down to see a graph of the portfolio history.
- Click the 'Rebalance Portfolio' button another 12 times to simulate the whole year.
- Clear the Streamlit cache before re-running the program (for example, via 'Clear cache' in the app menu).
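For readers who want to inspect the same 2021 window outside the app, the sketch below shows one way to request it from yfinance. It is not taken from main.py: `EXAMPLE_TICKERS`, `download_window` and `fetch_prices` are hypothetical names, and the tickers are placeholders rather than the stocks chosen by the investigation. The one detail worth noting is that the `end` argument of `yfinance.download` is exclusive, so the day after 31 December 2021 is passed to include the final trading day.

```python
from datetime import date, timedelta

# Simulation window stated in this README (4 January 2021 to 31 December 2021).
SIM_START = date(2021, 1, 4)
SIM_END = date(2021, 12, 31)

# Placeholder tickers for illustration only, not the project's selected stocks.
EXAMPLE_TICKERS = ["AAPL", "MSFT"]


def download_window(start=SIM_START, end=SIM_END):
    """Return (start, end) strings for yfinance, shifting end by one day because it is exclusive."""
    return start.isoformat(), (end + timedelta(days=1)).isoformat()


def fetch_prices(tickers=EXAMPLE_TICKERS):
    """Download daily price data for the simulation window (requires network access)."""
    # Imported here so the date helper above works without yfinance installed.
    import yfinance as yf

    start, end = download_window()
    return yf.download(tickers, start=start, end=end)
```

Calling `fetch_prices()` returns a pandas DataFrame indexed by trading date; the exact column layout depends on the installed yfinance version.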