Dataset creation tools for long-context HotpotQA evaluation: each question is padded with distractor articles to test language models on reasoning across large documents.
Get started with creating long-context HotpotQA datasets in minutes:
```bash
# Install uv package manager (if not already installed)
pip install uv

# Create virtual environment and install project dependencies
uv sync
```
Download HotpotQA questions and Wikipedia dump:
```bash
# Download HotpotQA dev set and Wikipedia articles
uv run hotpot-download --raw-dir ./data --processed-dir ./data
```
This downloads:
- `hotpot_dev_fullwiki_v1.json` - HotpotQA questions with Wikipedia links
- `enwiki-*-processed.tar.bz2` - Wikipedia articles in JSON format
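If you want to sanity-check the download, the questions file can be inspected directly. This is a minimal sketch assuming the standard HotpotQA schema, where each entry carries `question`, `answer`, `supporting_facts`, and `context` fields:

```python
import json

# Peek at the downloaded HotpotQA dev questions (standard HotpotQA schema assumed).
with open("./data/hotpot_dev_fullwiki_v1.json") as f:
    questions = json.load(f)

print(f"{len(questions)} questions")

first = questions[0]
print("Question:", first["question"])
print("Answer:", first["answer"])
# supporting_facts is a list of [article title, sentence index] pairs
print("Gold articles:", {title for title, _ in first["supporting_facts"]})
```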
Create datasets with varying context sizes (8k, 32k, 128k tokens):
```bash
# Process full dataset (may take hours for first run due to Wikipedia indexing)
uv run hotpot-process \
    --hotpot-path ./data/hotpot_dev_fullwiki_v1.json \
    --wikipedia-path ./data/enwiki-*-processed.tar.bz2 \
    --output-dir ./data

# For testing with limited questions
uv run hotpot-process \
    --hotpot-path ./data/hotpot_dev_fullwiki_v1.json \
    --wikipedia-path ./data/enwiki-*-processed.tar.bz2 \
    --output-dir ./data \
    --max-questions 10
```
The processing script generates three JSON files in the output directory:
- `long_hotpot_8k.json` - Questions with ~8k token contexts
- `long_hotpot_32k.json` - Questions with ~32k token contexts
- `long_hotpot_128k.json` - Questions with ~128k token contexts
Each question includes supporting facts plus distractor articles to test long-context reasoning.
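The selection logic itself lives in `hotpot-process`, but the underlying idea can be sketched roughly: keep every article named in a question's supporting facts, then append randomly chosen distractor articles until an approximate token budget (8k, 32k, or 128k) is reached. The function and variable names below are illustrative, not the package's API:

```python
import random

def pad_with_distractors(gold_articles, distractor_pool, budget_tokens):
    """Illustrative sketch: fill a context up to ~budget_tokens with distractors.

    Both arguments map article titles to article text; token counts are
    approximated here by whitespace splitting.
    """
    context = dict(gold_articles)  # supporting articles are always kept
    used = sum(len(text.split()) for text in context.values())

    candidates = [t for t in distractor_pool if t not in context]
    random.shuffle(candidates)
    for title in candidates:
        size = len(distractor_pool[title].split())
        if used + size > budget_tokens:
            continue  # skip articles that would overshoot the budget
        context[title] = distractor_pool[title]
        used += size
    return context
```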
```bash
# Complete workflow
uv sync
uv run hotpot-download --raw-dir ./data/raw --processed-dir ./data/processed
uv run hotpot-process \
    --hotpot-path ./data/raw/hotpot_dev_fullwiki_v1.json \
    --wikipedia-path ./data/processed/enwiki-20171001-pages-meta-current-withlinks-processed \
    --output-dir ./data/processed \
    --tokenizer nltk
```
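After processing, a quick way to check that the generated contexts land near the target sizes is to load one split and count tokens. The `context` field name and the top-level list structure below are assumptions about the output schema; adjust them to match the actual files:

```python
import json

# Rough size check on one generated split; "context" is an assumed field name.
with open("./data/processed/long_hotpot_8k.json") as f:
    examples = json.load(f)

lengths = []
for ex in examples:
    context = ex["context"]
    if isinstance(context, list):      # context may be stored as a list of articles
        context = " ".join(map(str, context))
    lengths.append(len(context.split()))

mean_len = sum(lengths) / len(lengths)
print(f"{len(examples)} examples, ~{mean_len:.0f} whitespace tokens per context")
```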
This project uses uv for fast Python dependency management.
To create a virtual environment and install dependencies:
```bash
uv sync
```
Download HotpotQA datasets and the Wikipedia dump:
```bash
uv run hotpot-download --raw-dir ./data/raw --processed-dir ./data/processed
```
- `--raw-dir`: Directory to save raw downloaded data (HotpotQA JSON files and Wikipedia dump).
- `--processed-dir`: Directory for processed data (reserved for future use in the pipeline).
This replaces the legacy `scripts/00-download.sh`
for cross-platform compatibility and package integration.
To see all processing options:

```bash
uv run hotpot-process --help
```
Key options:
- `--hotpot-path`: Path to HotpotQA JSON file (required)
- `--wikipedia-path`: Path to Wikipedia dump (.bz2 or directory, required)
- `--output-dir`: Output directory for JSON files (default: `data`)
- `--tokenizer`: Tokenizer type - "simple" (default) or "nltk" (illustrated in the sketch below)
- `--max-questions`: Limit number of questions to process (useful for testing)
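The two tokenizer options trade speed for accuracy when measuring context size. As a rough illustration (not the package's internal implementation), whitespace splitting counts fewer tokens than NLTK's punctuation-aware tokenizer:

```python
import nltk

nltk.download("punkt", quiet=True)  # tokenizer models; recent NLTK may also need "punkt_tab"

text = "The Eiffel Tower, completed in 1889, is 330 m tall."

simple_tokens = text.split()            # "simple": plain whitespace splitting
nltk_tokens = nltk.word_tokenize(text)  # "nltk": punctuation becomes separate tokens

print(len(simple_tokens), simple_tokens)
print(len(nltk_tokens), nltk_tokens)
```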
A legacy Bash script for downloading data, `scripts/00-download.sh`, is still included; use the CLI above instead for better compatibility.