HyPE (Hierarchical Category Path-Enhanced Generative Retrieval)

HyPE makes generative retrieval more explainable by generating a hierarchical category path step by step before decoding the docid.
The category path serves as the explanation, progressing from broad to specific semantic categories.

Demo Webpage

You can explore HyPE’s explainable retrieval on the demo page.

Dataset

The dataset directory contains two datasets, NQ320K and MSMARCO, along with a backbone category hierarchy.

Each dataset directory includes path-augmented datasets, whose entries are linked to linearized hierarchical category paths derived from the backbone category hierarchy.
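As a rough illustration of what "linearized" can mean here, the sketch below walks from a leaf category up to the root and joins the names into one string. The child-to-parent dictionary, the function names, and the " > " separator are all assumptions made for this example; the actual file format and separator used in the repository may differ.

```python
def category_path(leaf, parent_of):
    """Return the categories from root to leaf (hypothetical helper).

    parent_of maps each category to its parent; the root maps to None.
    """
    path = []
    seen = set()
    node = leaf
    while node is not None:
        if node in seen:
            # Guard against a malformed hierarchy that contains a cycle.
            raise ValueError(f"cycle detected at category {node!r}")
        seen.add(node)
        path.append(node)
        node = parent_of.get(node)
    path.reverse()  # collected leaf-to-root, so flip to broad-to-specific
    return path


def linearize(path, sep=" > "):
    """Join a root-to-leaf path into one string (separator is an assumption)."""
    return sep.join(path)


# Toy hierarchy, invented for illustration only.
toy_hierarchy = {
    "Science": None,
    "Physics": "Science",
    "Quantum mechanics": "Physics",
}
print(linearize(category_path("Quantum mechanics", toy_hierarchy)))
```

The broad-to-specific ordering mirrors the description above: the root category comes first and the most specific category last.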

In each dataset's directory, the backbone_file directory contains path-augmented datasets for:

Original training set
Query generation (QG) set
Indexing training set (documents)

Final path-augmented datasets can be generated within this directory for various docid types, such as atomic_docid and keyword_docid.

Because the path-augmented files are kept separate from the docid assignment, the same files can be reused to build datasets for different docid types.
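A minimal sketch of how a path-augmented target might be assembled for the two docid types named above. The function name, the " || " delimiter between path and docid, and the representation of each docid type (an integer for atomic_docid, a list of keywords for keyword_docid) are assumptions for illustration, not the repository's actual format.

```python
def build_target(path_text, docid, docid_type):
    """Combine a linearized category path with a docid (hypothetical format).

    The category path comes first, so a model trained on this target
    would emit the path before the docid.
    """
    if docid_type == "atomic_docid":
        # Assumed: an atomic docid is a single identifier.
        docid_text = str(docid)
    elif docid_type == "keyword_docid":
        # Assumed: a keyword docid is a sequence of keyword strings.
        docid_text = ", ".join(docid)
    else:
        raise ValueError(f"unsupported docid type: {docid_type!r}")
    return f"{path_text} || {docid_text}"


# Toy values, invented for illustration only.
print(build_target("Science > Physics", 42, "atomic_docid"))
print(build_target("Science > Physics", ["quantum", "entanglement"], "keyword_docid"))
```

Keeping the path text independent of the docid type means only the final step changes when switching between docid types.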

Training

The model can be trained by running HyPE/src/ours/execute_shell/baseline_shell.sh:

cd HyPE/src/ours/execute_shell
bash baseline_shell.sh

Edit baseline_shell.sh to change the dataset, model, and other hyperparameters.

Requirements

transformers==4.35.2
sentence-transformers==2.5.1
marisa-trie==1.2.0
torch==2.0.1
pytorch_lightning==2.1.0
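To confirm that an environment matches these pins before training, a small check along the following lines may help. It is a sketch, not part of the repository: the helper names are hypothetical, and it relies only on the standard-library importlib.metadata module. Local version suffixes such as "+cu118", which CUDA builds of torch often carry, are ignored when comparing.

```python
from importlib import metadata

# Pinned versions copied from the list above.
PINS = {
    "transformers": "4.35.2",
    "sentence-transformers": "2.5.1",
    "marisa-trie": "1.2.0",
    "torch": "2.0.1",
    "pytorch_lightning": "2.1.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package that is missing or off-pin to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        # Drop a local suffix such as "+cu118" before comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: expected {wanted}, found {found}")
```

The lookup parameter exists so the comparison logic can be exercised without the packages installed.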
