Full option chain scraper

Iterates through specified tickers and scrapes entire option chain for those tickers
Store option chain data in MDB
Currently only works on weeklies
Scraper uses multiprocessing for higher throughput. Set workers to > 1 in settings.yml

Motivation

This is a fork of https://github.com/melder/iv_scraper but it scrapes entire option chain of a small set of stocks (instead of scraping 4 option chains closest to current stock price for all stocks that offers options).

The idea is to construct a series of volume / open interest over time to attempt to isolate "smart" money plays. In this case I'm trying to look into cryptocurrency related stocks / option chains because crypto is ripe and legal for manipulation. However this can be configured to track any collection of stocks.

Since option chains can be very large it isn't recommended to track more than a couple dozen at a time

Requirements

Python 3.12.2
pyenv + pipenv
redis7 server
mongodb7 (feel free to fork if you prefer implementing a different DB)
robinhood account with options enabled
Optional: server with high # of CPU cores for fast scraping (6 core server throughput of 2k options / minute)

Installation

Install pyenv / python 3.12.2 / pipenv
Clone, init environment, init submodules:

git clone git@github.com:melder/full_option_chain_scraper.git
cd full_option_chain_scraper
git submodule update --init --recursive
pipenv install

In the config directory create settings.yml / vendors.yml and set them up with the appropriate values
To test

$ pipenv shell
$ python scraper.py scrape

Commands

scrape - self explanitory
scrape-force - scrapes regardless of market open status
populate-exprs - retrieving expirations is API heavy so cache is utilized. This is intended to prepopulate before scraping to avoid rate limit slowdowns
purge-exprs - Deletes expiration date cache on weekly / monthly expiration days

Configuration

settings.yml

namespace: option_chain_scraper
workers: 3
crypto_tickers:
  - MSTR
  - COIN
  - BITO
  - MARA
  - CLSK
  - RIOT

vendors.yml

hood:
  login: melder@example.com
  password: password
  my2fa: ABCD
  pickle_name: rick

redis:
  host: localhost
  port: 6379
  password: pass # optional

mongo:
  host: localhost
  port: 27017
  database: production
  username: user        # optional
  password: pass        # optional
  auth_source: auth_db  # optional

MDB indexes

Recommended MDB indexes

[
  { v: 2, key: { _id: 1 }, name: '_id_' },
  { v: 2, key: { ticker: 1 }, name: 'ticker_1' },
  { v: 2, key: { expiration: 1 }, name: 'expiration_1' },
  { v: 2, key: { scraper_timestamp: 1 }, name: 'scraper_timestamp_1' },
  {
    v: 2,
    key: { ticker: 1, scraper_timestamp: 1 },
    name: 'ticker_1_scraper_timestamp_1',
    unique: true
  }
]

crontab example

# FYI some distros don't support CRON_TZ

CRON_TZ=America/New_York

*/2  9-16    * * 1-5 ec2-user cd ~/full_option_chain_scraper; pipenv run python scraper.py scrape
2    16      * * 1-5 ec2-user cd ~/full_option_chain_scraper; pipenv run python scraper.py scrape-force
*    1       * * 1-5 ec2-user cd ~/full_option_chain_scraper; pipenv run python scraper.py purge-exprs
1    2       * * 1-5 ec2-user cd ~/full_option_chain_scraper; pipenv run python scraper.py populate-exprs

Notes:

Scraper will only scrape when market is open (unless using scrape-force command)
Expiration dates cache will only be purged on the day of expiration. Since it is a very lightweight operation, it is set to run every minute for an hour to greatly reduce odds that cache persists and messes things up.

Background Jobs

Note: scrape command performs all the queueing and then launching the workers (in sequence). But if you have personal preference on how to run the job processor scrape can be substituted with the following:

# queuing: 
python queue_jobs.py

# run workers
# X = number of parallel workers

rq worker-pool -b <namespace> -n X

# osx:

OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES rq worker-pool -b <namespace> -n 5

Name		Name	Last commit message	Last commit date
Latest commit History 79 Commits
config		config
helpers		helpers
models		models
workers		workers
.gitignore		.gitignore
.gitmodules		.gitmodules
Pipfile		Pipfile
Pipfile.lock		Pipfile.lock
README.md		README.md
queue_jobs.py		queue_jobs.py
scraper.py		scraper.py
wsgi.py		wsgi.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Full option chain scraper

Motivation

Requirements

Installation

Commands

Configuration

settings.yml

vendors.yml

MDB indexes

crontab example

Background Jobs

About

Releases

Packages

Languages

melder/full_option_chain_scraper

Folders and files

Latest commit

History

Repository files navigation

Full option chain scraper

Motivation

Requirements

Installation

Commands

Configuration

settings.yml

vendors.yml

MDB indexes

crontab example

Background Jobs

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages