Even the Simplest Baseline Needs Careful Re-investigation: A Case Study on XML-CNN

This is the code for the NAACL 2022 paper "Even the Simplest Baseline Needs Careful Re-investigation: A Case Study on XML-CNN". The repository can be used to reproduce the experimental results in our paper. Results may differ slightly because of randomness and differences in the environment. If you find our work useful, please consider citing the following paper:

@inproceedings{SC2022,
  author = {Si-An Chen and Jie-Jyun Liu and Tsung-Han Yang and Hsuan-Tien Lin and Chih-Jen Lin},
  title = {Even the Simplest Baseline Needs Careful Re-investigation: A Case Study on {XML-CNN}},
  booktitle = {Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
  month = jul,
  year = 2022,
}

Please feel free to contact Si-An Chen if you have any questions about the code/paper.

Datasets

All datasets used in our experiments can be downloaded from here. Each dataset contains 5 files:

Xf.txt: vocabulary set of the bag-of-words (BOW) features used in the Extreme Multi-Label Repository. We use this set to generate train_rv.txt and test_rv.txt.
train.txt, test.txt: training set and test set obtained from AttentionXML.
train_rv.txt, test_rv.txt: training set and test set restricted to the reduced vocabulary set (Xf.txt). More details can be found in the Appendix of our paper.
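
As an illustration of what restricting a data set to a reduced vocabulary involves, here is a minimal sketch. It is not the repository's remove_vocab.py. It assumes that Xf.txt lists one token per line and that each example is a tab-separated line whose last field is whitespace-tokenized text; both layouts are assumptions, and the helper names `load_vocab` and `reduce_line` are hypothetical.

```python
def load_vocab(lines):
    """Collect the vocabulary, assuming one token per line (assumed Xf.txt layout)."""
    return {line.strip() for line in lines if line.strip()}


def reduce_line(line, vocab):
    """Drop out-of-vocabulary tokens from the text field of one example.

    Assumes tab-separated fields with the text in the last field;
    all earlier fields (e.g. an ID or labels) are passed through unchanged.
    """
    *head, text = line.rstrip("\n").split("\t")
    kept = [tok for tok in text.split() if tok in vocab]
    return "\t".join(head + [" ".join(kept)])


if __name__ == "__main__":
    vocab = load_vocab(["cat\n", "dog\n"])
    # Only "cat" and "dog" survive in the text field.
    print(reduce_line("1\tl1 l2\tthe cat saw a dog\n", vocab))
```

For the actual pre-processing used in the paper, refer to clean_tab.py and remove_vocab.py in this repository.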

Explanation of directories and files

XML-CNN/: The code for the SIGIR 2017 paper "Deep learning for extreme multi-label text classification" by Liu et al.
config/: The config files used in our experiments.
libmultilabel/: The main experiment code. Modified from an older version of LibMultiLabel.
libmultilabel/networks/: The source code of different network architectures used in our experiments.
main.py: The script for training and testing.
search_params.py: The script for hyperparameter tuning.
clean_tab.py, remove_vocab.py: The scripts for data pre-processing.

How to run the experiments

Download the datasets and place them in data/.
Run the command with a specified config file (provided in the following sections):

# train
python main.py --config [CONFIG_FILE]

# test
python main.py --config [CONFIG_FILE] --eval --checkpoint_path [CHECKPOINT_PATH]

Experiment Results (Table 4 and Table 14)

Method:

Kim: Kim-CNN
XML: XML-CNN

CNN sweeping direction (CNN):

E: embeddings
W: words

Dynamic Max-pooling (DM):

NA: Not applicable
Eq. (7): dynamic max-pooling in Liu et al.'s implementation
Eq. (6): dynamic max-pooling in Liu et al.'s paper
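
For readers unfamiliar with the term, the sketch below contrasts standard max-pooling with a generic chunked ("dynamic") max-pooling over a single 1-D feature map. It is an illustration only: it does not claim to reproduce either Eq. (6) or Eq. (7), and the way the feature map is split into chunks here is an assumption. See the paper and libmultilabel/networks/ for the exact definitions.

```python
def max_pool(feature_map):
    """Standard max-pooling: one value for the whole feature map."""
    return [max(feature_map)]


def dynamic_max_pool(feature_map, num_chunks):
    """Chunked max-pooling: split the map into num_chunks contiguous
    pieces of near-equal length and keep the maximum of each piece."""
    n = len(feature_map)
    if not 1 <= num_chunks <= n:
        raise ValueError("num_chunks must be between 1 and len(feature_map)")
    # Chunk boundaries; every chunk is non-empty because num_chunks <= n.
    bounds = [i * n // num_chunks for i in range(num_chunks + 1)]
    return [max(feature_map[bounds[i]:bounds[i + 1]]) for i in range(num_chunks)]


if __name__ == "__main__":
    fmap = [1.0, 5.0, 2.0, 7.0, 0.0, 3.0]
    print(max_pool(fmap))             # one value per feature map
    print(dynamic_max_pool(fmap, 2))  # one value per chunk
```

With a single chunk, the chunked version reduces to standard max-pooling, which is why the Kim-CNN rows list DM as not applicable.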

Method/CNN/DM	P@1	P@3	P@5	NDCG@1	NDCG@3	NDCG@5	Config
Kim/E/NA	45.38	34.02	27.72	45.38	36.72	33.04	Cfg
Kim/W/NA	75.83	61.08	50.19	75.83	64.75	58.93	Cfg
XML/E/Eq. (7)	75.96	60.56	49.23	75.96	64.31	58.20	Cfg
XML/W/Eq. (7)	58.09	45.19	37.06	58.09	48.30	43.81	Cfg
XML/E/Eq. (6)	63.03	48.31	39.32	63.03	51.92	46.88	Cfg
XML/W/Eq. (6)	75.73	61.82	50.82	75.73	65.31	59.54	Cfg
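
The metrics in these tables follow the usual definitions for extreme multi-label classification. As a reference, here is a minimal sketch of P@k and NDCG@k for a single instance with binary relevance; it is not the evaluation code of this repository, and the function names are hypothetical. The functions return fractions, while the tables report percentages averaged over the test set.

```python
import math


def top_k(scores, k):
    """Indices of the k highest-scoring labels, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def precision_at_k(scores, relevant, k):
    """Fraction of the top-k predicted labels that are relevant."""
    return sum(1 for i in top_k(scores, k) if i in relevant) / k


def ndcg_at_k(scores, relevant, k):
    """DCG of the top-k predictions divided by the best achievable DCG."""
    dcg = sum(1 / math.log2(rank + 2)
              for rank, i in enumerate(top_k(scores, k)) if i in relevant)
    ideal = sum(1 / math.log2(rank + 2) for rank in range(min(k, len(relevant))))
    return dcg / ideal if ideal else 0.0


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.8, 0.3]  # one score per label
    relevant = {0, 3}              # indices of the true labels
    print(precision_at_k(scores, relevant, 3))
    print(ndcg_at_k(scores, relevant, 3))
```

For k = 1 the two definitions coincide whenever an instance has at least one true label, which is why the P@1 and NDCG@1 columns match in the table above.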

Experiment Results (Table 7 and Table 15)

EUR-Lex

Loss/Hidden layer/Max-pooling	P@1	P@3	P@5	NDCG@1	NDCG@3	NDCG@5	Config
CE/N/standard	72.78	59.84	49.94	72.78	59.84	49.94	Cfg
BCE/N/standard	80.93	66.38	55.34	80.93	66.38	55.34	Cfg
BCE/N/dynamic	77.67	64.94	53.29	77.88	64.58	53.38	Cfg
BCE/Y/standard	76.40	62.78	51.88	76.56	62.92	51.84	Cfg
BCE/Y/dynamic	77.98	65.11	53.90	78.94	65.77	54.15	Cfg

Wiki10-31K

Loss/Hidden layer/Max-pooling	P@1	P@3	P@5	NDCG@1	NDCG@3	NDCG@5	Config
CE/N/standard	80.70	64.83	55.43	80.70	64.83	55.43	Cfg
BCE/N/standard	82.78	68.07	57.63	82.78	68.07	57.63	Cfg
BCE/N/dynamic	83.15	70.32	59.91	83.37	70.64	60.16	Cfg
BCE/Y/standard	80.89	67.89	58.17	81.73	68.82	58.65	Cfg
BCE/Y/dynamic	84.19	71.55	61.14	84.70	71.80	61.03	Cfg

AmazonCat-13K

Loss/Hidden layer/Max-pooling	P@1	P@3	P@5	NDCG@1	NDCG@3	NDCG@5	Config
CE/N/standard	91.01	75.07	60.50	92.85	76.90	61.76	Cfg
BCE/N/standard	93.31	78.02	62.93	93.41	78.11	62.95	Cfg
BCE/N/dynamic	93.63	78.55	63.42	93.65	78.56	63.41	Cfg
BCE/Y/standard	94.73	79.64	63.95	94.73	79.64	63.94	Cfg
BCE/Y/dynamic	94.79	80.04	64.49	94.78	80.03	64.52	Cfg

Amazon-670K

Loss/Hidden layer/Max-pooling	P@1	P@3	P@5	NDCG@1	NDCG@3	NDCG@5	Config
CE/N/standard	27.14	24.70	22.70	27.23	24.65	22.70	Cfg
BCE/N/standard	33.38	29.99	27.47	33.38	29.99	27.47	Cfg
BCE/N/dynamic	34.58	30.89	28.24	34.61	30.91	28.25	Cfg
BCE/Y/standard	33.62	30.15	27.62	33.86	30.27	27.69	Cfg
BCE/Y/dynamic	35.53	31.82	29.03	35.69	31.89	29.08	Cfg
