xmc-aalto/mips-negative-sampling

Meta-classifier free negative sampling for extreme multilabel classification.

The code is adapted from the LightXML source code [1].

Requirements

  • tokenizers==0.7.0
  • numpy==1.18.5
  • pandas==1.0.4
  • tqdm==4.46.1
  • scipy==1.4.1
  • transformers==2.11.0
  • scikit_learn==0.23.2
  • torch==1.5.1
  • faiss-gpu
  • apex

Datasets

The datasets can be downloaded from the following links:

Please place the datasets in the data folder.

Train and evaluation

Run the following commands to train and evaluate with the MIPS-s method on Amazon-670K and Wikipedia-500K:

python src/main.py --epoch 20 --dataset amazon670k --swa --batch 16 --max_len 128 --hidden_dim 400 --model_type mips --num_neg_mips 5 --nlist 818 --nprobe_eval 350

python src/main.py --epoch 10 --dataset wiki500k --swa --batch 32 --max_len 128 --hidden_dim 500 --model_type mips --num_neg_mips 5 --nlist 707 --nprobe_eval 256
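
For readers unfamiliar with the idea behind the --num_neg_mips flag, the following is a minimal sketch of hard-negative selection by maximum inner product search (MIPS). It is not the code in src/main.py: the function names, the toy embeddings, and the exhaustive search are assumptions made for illustration. The --nlist and --nprobe_eval flags suggest that the repository uses an approximate index (faiss-gpu is listed in the requirements) rather than the brute-force scan shown here.

```python
import heapq


def inner_product(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mips_negatives(query, label_embeddings, positive_ids, num_neg):
    """Return the ids of the num_neg non-positive labels with the
    largest inner product against the query embedding.

    Hypothetical helper: an exhaustive scan that stands in for an
    approximate MIPS index.
    """
    positives = set(positive_ids)
    scored = (
        (inner_product(query, emb), label_id)
        for label_id, emb in enumerate(label_embeddings)
        if label_id not in positives  # true labels are never negatives
    )
    # Highest-scoring wrong labels are the "hard" negatives.
    top = heapq.nlargest(num_neg, scored)
    return [label_id for _, label_id in top]


if __name__ == "__main__":
    # Toy data: five labels with 2-dimensional embeddings.
    labels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]
    query = [1.0, 0.5]   # embedding of one training document
    positives = [2]      # its ground-truth label
    print(mips_negatives(query, labels, positives, num_neg=2))  # [0, 4]
```

With several hundred thousand labels, an exhaustive scan per training document is expensive, which is the usual reason to swap in an inverted-file index that partitions the label embeddings into clusters and probes only a few of them per query.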

References

[1] Jiang, Ting, et al. LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification. AAAI, 2021.
