Predicting Stable Portfolios using Machine Learning / Deep Learning

This repo contains notebooks for the 4 modules described in the report linked here.

Modules:

SEC Scraper

Note: We have already scraped the SEC filings for the S&P 500. For testing purposes, the notebook scrapes the filings for just 2 tickers as an example.

  • scraping_cleaning_sec.ipynb scrapes the filing data from the SEC EDGAR website. It also cleans and parses the HTML into raw text files, and it checkpoints its progress so that extraction can resume later.
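The clean-and-checkpoint step described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: the names `html_to_text` and `clean_filings`, the JSON checkpoint file, and the in-memory `filings` dict are all assumptions, and the standard-library `HTMLParser` stands in for BeautifulSoup so the example has no dependencies. Downloading from EDGAR is left out entirely.

```python
import json
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Strip markup from one filing (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def clean_filings(filings, out_dir, checkpoint_path):
    """Write one .txt per filing and record finished ids in a checkpoint.

    `filings` maps a filing id to its raw HTML (an assumed input shape).
    Returns the ids processed during this call.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    checkpoint = Path(checkpoint_path)
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()

    processed = []
    for filing_id, html in filings.items():
        if filing_id in done:
            continue  # already cleaned in an earlier run
        text = html_to_text(html)
        (out_dir / f"{filing_id}.txt").write_text(text, encoding="utf-8")
        done.add(filing_id)
        # Save after every filing so an interrupted run loses at most one item.
        checkpoint.write_text(json.dumps(sorted(done)))
        processed.append(filing_id)
    return processed
```

Calling `clean_filings` a second time with the same checkpoint file skips everything already written, which is the resume behaviour the bullet describes.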

Sentiment Analyzer

Notebooks for this module are located in EDGAR-reports-Text-Analysis.

  • Sentiment_Analysis_SEC.ipynb contains all the code to extract sentiment scores from SEC filings.
  • Sentiment_Stocks_Visualization.ipynb plots stock trends against the calculated sentiments.
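One way to score a long filing is to split it into chunks, score each chunk, and average the results. The sketch below assumes that approach; the README does not say how the notebook aggregates scores, and `score_filing` and `chunk_words` are hypothetical names. The scorer is passed in as a function so that NLTK VADER (listed in the requirements) can be plugged in.

```python
def score_filing(text, polarity_fn, chunk_words=200):
    """Average per-chunk polarity scores over one filing.

    polarity_fn takes a string and returns a dict of named scores,
    for example VADER's polarity_scores. chunk_words is an assumed
    chunk size, not a value taken from the notebooks.
    """
    words = text.split()
    if not words:
        return {}
    chunks = [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]
    totals = {}
    for chunk in chunks:
        for key, value in polarity_fn(chunk).items():
            totals[key] = totals.get(key, 0.0) + value
    return {key: total / len(chunks) for key, total in totals.items()}


# With NLTK installed and the lexicon fetched via nltk.download("vader_lexicon"):
# from nltk.sentiment.vader import SentimentIntensityAnalyzer
# scores = score_filing(filing_text, SentimentIntensityAnalyzer().polarity_scores)
```

VADER's `polarity_scores` returns `neg`, `neu`, `pos` and `compound` values, so the averaged dict has the same keys.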

Stock Predictor

Note that feature engineering code is common across these notebooks.

  • lstmstocks-sentiments.ipynb is used to train models with and without sentiments. This notebook also performs all the feature engineering, such as windowing, cascading, merging and interpolation.

  • lstmstocks-gww.ipynb constructs, trains and saves three models (no sentiments, positive sentiments and negative sentiments) for a single stock; the stock can be changed in the notebook.

  • generate-predictions-csv.ipynb generates dataframes of predictions using the models saved during earlier training.
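Two of the feature-engineering steps named above, interpolation and windowing, can be illustrated with a dependency-free sketch. It assumes sentiment scores are sparse (known only on filing dates, `None` elsewhere) and that the model predicts the next day's price from a fixed-length window. The function names and the choice of linear interpolation are assumptions, and the notebooks themselves most likely rely on Pandas and NumPy rather than plain lists.

```python
def interpolate_gaps(values):
    """Linearly fill None gaps; hold the nearest known value at the edges."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("need at least one known value")
    filled = list(values)
    for i in range(known[0]):                      # leading gap
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):    # trailing gap
        filled[i] = values[known[-1]]
    for left, right in zip(known, known[1:]):      # interior gaps
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def make_windows(rows, window):
    """Build (inputs, targets) for next-step prediction.

    Each input is `window` consecutive feature rows; the target is the
    first feature (assumed here to be the price) of the following row.
    """
    if window < 1 or len(rows) <= window:
        raise ValueError("need more rows than the window length")
    inputs, targets = [], []
    for start in range(len(rows) - window):
        inputs.append([list(row) for row in rows[start:start + window]])
        targets.append(rows[start + window][0])
    return inputs, targets


# Merge daily prices with sparse sentiment, then window the result.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
sentiment = interpolate_gaps([0.2, None, None, None, 0.6])
inputs, targets = make_windows(list(zip(prices, sentiment)), window=2)
```

The nested `inputs` list has the (samples, timesteps, features) layout that a Keras LSTM layer expects once it is converted to an array.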

Portfolio Generator and Optimizer

  • generating_portfolio.ipynb constructs stable portfolios using metrics such as correlation, covariance, Sharpe ratio and volatility. It also creates related visualizations that help select stable portfolios.
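For reference, the metrics named above can be computed from a price series as in the following sketch. It uses only the standard library; the 252-trading-day annualization factor, the default risk-free rate of zero, and all function names are assumptions rather than values taken from the notebook.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization factor


def daily_returns(prices):
    """Simple day-over-day returns from a list of prices."""
    return [b / a - 1.0 for a, b in zip(prices, prices[1:])]


def annualized_volatility(returns):
    """Sample standard deviation of daily returns, scaled to one year."""
    return stdev(returns) * math.sqrt(TRADING_DAYS)


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Annualized excess return divided by annualized volatility."""
    vol = annualized_volatility(returns)
    if vol == 0:
        raise ValueError("volatility is zero; Sharpe ratio is undefined")
    return (mean(returns) * TRADING_DAYS - risk_free_rate) / vol


def correlation(xs, ys):
    """Pearson correlation between two equal-length return series."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denom = math.sqrt(
        sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    )
    if denom == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / denom
```

Stocks whose returns are weakly or negatively correlated tend to reduce combined volatility, which is why correlation appears alongside the Sharpe ratio when screening for stable portfolios.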

Libraries and requirements:

  • Jupyter
  • NLTK VADER
  • Keras
  • Scikit-learn
  • fastai datepart
  • Pandas
  • NumPy
  • BeautifulSoup
  • Stocker
  • Quandl
  • Plotly
  • TensorBoard