isomiki/sttm-trader

STTM Trader

STTM Trader is an ML model and tool that looks at topics in the news and predicts stock price movements, which can then be used for trading. It builds on STTM, an approach that detects associations between news topics and stock price movements, by using those associations as a signal and suggesting trades.

I also built a trading simulation (backtest) to assess performance. In my experience the results are very noisy, but some of them are promising when using Kommersant news and data from before 2022 (Russo-Ukrainian war).

Inspired by STTM. Article link: https://peerj.com/articles/cs-1156/

How to run it:

  1. Setup

    Python

    • Install Python and pip.

    • Then create a Python virtual environment inside this project:

      python3.9 -m venv .venv

      Note: the project was developed with Python 3.9, but other Python 3 versions should also work.

    • Activate the virtual environment:

      source .venv/bin/activate

    • Install requirements:

      pip install -r requirements.txt

    MyStem

    • Download Yandex MyStem for your system. Download link: https://yandex.ru/dev/mystem/

    • Place it in core/lib/ and name it mystem.

    • On Linux/macOS make sure to mark it executable: chmod +x mystem.
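The MyStem placement can be checked before running anything else. The snippet below is a minimal sketch, not part of the repository: the function name `check_mystem` is hypothetical, and the default path simply follows the `core/lib/mystem` location described above.

```python
import os
from pathlib import Path


def check_mystem(binary_path="core/lib/mystem"):
    """Report whether the MyStem binary is in place and executable.

    The default path follows the README; the function itself is a
    hypothetical helper, not something the project ships.
    """
    path = Path(binary_path)
    if not path.is_file():
        return "missing"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


# Example usage, run from the project root:
# print(check_mystem())
```

If it reports "not executable" on Linux/macOS, the chmod step above was probably skipped.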

    Data

Place raw news and market series data in data/raw/. The data can be found online or by contacting me.

Expected structure of data/raw/:

data/raw
├── market_series
│   ├── AFKS.xlsx
│   ├── AFLT.xlsx
│   ├── ALRS.xlsx
│   ...
└── news
    └── kommersant
        ├── 2016-01-01
        │   ├── 0001__2016-01-01.txt
        │   ├── 0002__2016-01-01.txt
        │   ├── 0003__2016-01-01.txt
        │   ...
        ├── 2016-01-02
        ├── 2016-01-04
        ...

Make sure the sequence numbers in the news file names are zero-padded to 4 digits, otherwise the files will not be parsed in the correct order. You can use the bash script data/fix_file_names to fix this.
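To illustrate why the padding matters: in a plain lexicographic sort, `10__2016-01-01.txt` comes before `2__2016-01-01.txt`. The following is a rough Python sketch of the renaming logic, assuming the `<sequence>__<YYYY-MM-DD>.txt` pattern shown in the tree above. It is not the repository's `data/fix_file_names` script, and the function names are made up for this example.

```python
import re
from pathlib import Path

# Assumed pattern: <sequence>__<YYYY-MM-DD>.txt, as in the directory tree above.
_NAME_RE = re.compile(r"^(\d+)__(\d{4}-\d{2}-\d{2})\.txt$")


def pad_sequence_number(file_name, width=4):
    """Return file_name with its leading sequence number zero-padded to width."""
    match = _NAME_RE.match(file_name)
    if match is None:
        return file_name  # leave files that do not follow the pattern untouched
    sequence, day = match.groups()
    return f"{int(sequence):0{width}d}__{day}.txt"


def fix_directory(news_dir, width=4):
    """Rename badly padded files under news_dir and return how many were renamed."""
    renamed = 0
    # sorted() materializes the listing first, so renames do not disturb iteration.
    for path in sorted(Path(news_dir).rglob("*.txt")):
        target = path.with_name(pad_sequence_number(path.name, width))
        if target == path:
            continue
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        path.rename(target)
        renamed += 1
    return renamed


# Example usage (path taken from the expected structure above):
# fix_directory("data/raw/news/kommersant")
```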

  2. Config

    Open core/config.py and set the range of training years to match your data. The system automatically finds the dates for which both news and market data are available.
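The date matching can be thought of as a set intersection restricted to the configured years. Here is a minimal sketch of that idea. The function and parameter names are hypothetical and not taken from core/config.py, and treating the year range as inclusive is an assumption.

```python
from datetime import date


def overlapping_dates(news_dates, market_dates, start_year, end_year):
    """Return the sorted dates present in both inputs within [start_year, end_year].

    Hypothetical helper: the project's own logic may differ in detail.
    """
    common = set(news_dates) & set(market_dates)
    return sorted(d for d in common if start_year <= d.year <= end_year)


# Example usage with made-up dates:
news = [date(2016, 1, 1), date(2016, 1, 2), date(2016, 1, 4)]
market = [date(2016, 1, 4), date(2016, 1, 5), date(2015, 12, 31)]
usable = overlapping_dates(news, market, 2016, 2016)
```

In this example only 2016-01-04 has both news and market data, so it is the only date kept.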

  3. Training

    Run the training scripts in sequence. After each one, check that the data it generated looks clean rather than malformed.

    • python -m core.run_news_preprocessing processes your news data and stores output in data/preprocessed_news/.

    • python -m core.run_lda_modeling builds LDA models and stores output in data/models/.

    • python -m core.run_sttm_training builds the STTM index and stores output in a few directories under data/.
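If you prefer a single entry point, the three commands above can be chained with a small wrapper that stops at the first failure. This is only a sketch and not part of the repository: the module names come from the list above, while `build_command` and `run_training` are hypothetical helpers. You should still inspect the generated data between steps.

```python
import subprocess
import sys

# Module names as listed in the training steps above.
TRAINING_MODULES = [
    "core.run_news_preprocessing",
    "core.run_lda_modeling",
    "core.run_sttm_training",
]


def build_command(module):
    """Build the `python -m <module>` command using the current interpreter."""
    return [sys.executable, "-m", module]


def run_training(modules=TRAINING_MODULES):
    """Run each module in order; check=True raises if a step exits non-zero."""
    for module in modules:
        print(f"Running {module} ...")
        subprocess.run(build_command(module), check=True)


# Example usage, from the project root with the virtual environment active:
# run_training()
```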

  4. Trading simulation

    • Open core/run_backtest.py and, near the top of the file, configure which trained dataset to use and which year to run the test on (change _TRAINED_DATASET_START_YEAR, _TRAINED_DATASET_END_YEAR and _TEST_YEAR).

    • python -m core.run_backtest runs the backtest and stores data in data/backtest_results/. It also stores run logs in logs/.

  5. Results and statistics

    • Open core/run_results_calc.py and specify the years whose results you want to include (edit _RESULTS_YEARS).

    • python -m core.run_results_calc calculates statistics for your results: mean weekly returns, annualized returns, z-score and p-value. Output is stored in data/aggregated_results/.
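For orientation, the statistics named above could be computed roughly as follows. This is a sketch under stated assumptions, not the code in core/run_results_calc.py: it assumes 52 weeks per year with compounding, a z-score of the mean return against zero, and a two-sided p-value from the normal distribution. The function name is hypothetical, and the project may define these quantities differently.

```python
import math
from statistics import mean, stdev


def summarize_weekly_returns(weekly_returns):
    """Summarize weekly returns given as fractions (0.01 means +1%)."""
    n = len(weekly_returns)
    if n < 2:
        raise ValueError("need at least two weekly returns")
    mu = mean(weekly_returns)
    sd = stdev(weekly_returns)  # sample standard deviation
    if sd == 0:
        raise ValueError("returns have zero variance; z-score is undefined")
    annualized = (1.0 + mu) ** 52 - 1.0  # assumption: compound over 52 weeks
    z_score = mu / (sd / math.sqrt(n))  # mean return tested against zero
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))  # two-sided, normal
    return {
        "mean_weekly_return": mu,
        "annualized_return": annualized,
        "z_score": z_score,
        "p_value": p_value,
    }


# Example usage with made-up returns:
stats = summarize_weekly_returns([0.01, 0.03, 0.02, 0.02])
```

With noisy results and few weeks of data, the p-value from a sketch like this should be read with caution.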


The original repo published by the authors is here.
