Extracts clues from the J! Archive website. Added options for using multithreading and gevent.


Python crawler for Jeopardy! games on J! Archive.


git clone
cd jeopardy-parser
pip install -r requirements.txt

This crawler supports two output file formats: JSON and HTML. Select the format you want with -o html or -o json.

If you want to download all seasons up to date, run python or python uses multithreading and gevent while uses multiprocessing. Generally speaking, the former is faster than the latter. If you want to download a specific season in html files, say season 34, run python -s 34 -o html.

A sample JSON output file is included here. Each clue has the following attributes:

  • Jtype:
    • "single": single jeopardy. The price of the clue should be one of 200, 400, 600, 800 or 1000.
    • "double": daily doubles. Prices vary.
    • "placeholder": the clue was missing from the J! Archive website. All other fields are null.
  • price
  • prompt
  • solution
  • parsed_solution
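
The attribute rules above can be checked mechanically. The following is a minimal sketch, assuming each clue is a JSON object whose keys match the attribute names listed and whose price is stored as an integer. The helper `check_clue` and the sample clues are hypothetical and not part of the crawler.

```python
# Hypothetical validator for one clue; not part of the crawler itself.
SINGLE_PRICES = {200, 400, 600, 800, 1000}
CLUE_FIELDS = ("Jtype", "price", "prompt", "solution", "parsed_solution")


def check_clue(clue):
    """Return a list of problems found in one clue dict (empty if none)."""
    problems = [f"missing field: {name}" for name in CLUE_FIELDS if name not in clue]
    if problems:
        return problems

    jtype = clue["Jtype"]
    if jtype == "placeholder":
        # A placeholder clue should have every other field set to null (None).
        for name in CLUE_FIELDS:
            if name != "Jtype" and clue[name] is not None:
                problems.append(f"placeholder has non-null {name}")
    elif jtype == "single":
        # Assumption: price is an integer, not a string such as "$200".
        if clue["price"] not in SINGLE_PRICES:
            problems.append(f"unexpected single price: {clue['price']}")
    elif jtype != "double":
        # Daily doubles may carry any price, so only the type is checked.
        problems.append(f"unknown Jtype: {jtype}")
    return problems


# Made-up sample data for illustration only.
good = {"Jtype": "single", "price": 400, "prompt": "p", "solution": "s", "parsed_solution": "s"}
bad = {"Jtype": "single", "price": 450, "prompt": "p", "solution": "s", "parsed_solution": "s"}
print(check_clue(good))
print(check_clue(bad))
```

If the real output stores prices in another form, only the `"single"` branch would need to change.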

Each game contains the following fields:

  • keys: the rounds present in this game. If a game has keys equal to [1, 2], that game only has a Jeopardy! Round and a Double Jeopardy! Round.
  • 1: stands for Jeopardy! Round
  • 2: stands for Double Jeopardy! Round. Might be missing for some games.
  • 3: stands for Final Jeopardy! Round. Might be missing for some games.
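
The game fields above suggest a simple way to walk the rounds of a game. This is a sketch under stated assumptions: a game is taken to be a JSON object, so the round entries are looked up by the string form of each number in keys, and the contents of each round are treated as opaque because their layout is not described here. `rounds_in` and the sample game are hypothetical.

```python
import json

# Round numbers as described in the field list above.
ROUND_NAMES = {
    1: "Jeopardy! Round",
    2: "Double Jeopardy! Round",
    3: "Final Jeopardy! Round",
}


def rounds_in(game):
    """Yield (round name, round data) for each round listed under "keys"."""
    for key in game["keys"]:
        # JSON object keys are always strings, so try "1" before 1.
        data = game.get(str(key), game.get(key))
        yield ROUND_NAMES.get(int(key), f"Round {key}"), data


# Made-up game with no Final Jeopardy! Round; round contents left empty.
sample = json.loads('{"keys": [1, 2], "1": [], "2": []}')
names = [name for name, _ in rounds_in(sample)]
print(names)
```

Driving the loop from keys rather than probing for 1, 2 and 3 directly avoids special-casing games where a round is missing.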