As for detailed data processing, please refer to this link.
Code for the paper "An Alignment and Matching Network with Hierarchical Visual Features for Multimodal Named Entity and Relation Extraction".
To run the code, install the requirements:
pip install -r requirements.txt
To extract visual object images, we first use the NLTK parser to extract noun phrases from the text, and then apply a visual grounding toolkit to detect the corresponding objects. Detailed steps are as follows: link.
==========
You can download the Twitter2015 data via this link and the Twitter2017 data via this link. Please place them in data/NER_data.
You can download the MRE data via Google Drive. Please place it in data/RE_data.
The expected file structure is:
HMNeT
|-- data
| |-- NER_data
| | |-- twitter2015 # text data
| | | |-- train.txt
| | | |-- valid.txt
| | | |-- test.txt
| | | |-- twitter2015_train_dict.pth # {imgname: [object-image]}
| | | |-- ...
| | |-- twitter2015_images # raw image data
| | |-- twitter2015_aux_images # object image data
| | |-- twitter2017
| | |-- twitter2017_images
| | |-- twitter2017_aux_images
| |-- RE_data
| | |-- img_org # raw image data
| | |-- img_vg # object image data
| | |-- txt # text data
| | |-- ours_rel2id.json # relation data
|-- models # models
| |-- bert_model.py
| |-- modeling_bert.py
|-- modules
| |-- metrics.py # metric
| |-- train.py # trainer
|-- processor
| |-- dataset.py # processor, dataset
|-- logs # code logs
|-- run.py # main
|-- run_ner_task.sh
|-- run_re_task.sh
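The twitter2015_train_dict.pth files above store the {imgname: [object-image]} mapping. A minimal sketch of how such a file can be written and read with torch.save/torch.load is shown below; the filename and tensor shapes are illustrative assumptions, and `torch` is assumed to be installed.

```python
# Hedged sketch: the {imgname: [object-image]} mapping stored in the *.pth files.
import torch

# Toy mapping in the expected shape: each value is a list of (C, H, W) object crops.
toy = {"O_1234.jpg": [torch.zeros(3, 64, 64), torch.zeros(3, 32, 48)]}
torch.save(toy, "twitter2015_train_dict_demo.pth")  # illustrative filename

# Loading mirrors what a dataset class would do at training time.
img2objs = torch.load("twitter2015_train_dict_demo.pth")
for name, crops in img2objs.items():
    print(name, [tuple(c.shape) for c in crops])
```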
The data paths and GPU-related configuration are set in run.py. To train the NER model, run one of the following scripts:
bash run_twitter15.sh
bash run_twitter17.sh
To train the RE model, run this script:
bash run_re_task.sh
To test the NER model, use the trained model, set load_path to the model path, and run the following script:
python -u run.py \
--dataset_name="twitter15/twitter17" \
--bert_name="bert-base-uncased" \
--seed=1234 \
--only_test \
--max_seq=128 \
--use_prompt \
--use_contrastive \
--use_matching \
--prompt_len=12 \
--sample_ratio=1.0 \
--load_path='ckpt/ner/twitter15/17/best_model.pth'
To test the RE model, use the trained model, set load_path to the model path, and run the following script:
python -u run.py \
--dataset_name="MRE" \
--bert_name="bert-base-uncased" \
--seed=1234 \
--only_test \
--max_seq=80 \
--use_prompt \
--use_contrastive \
--use_matching \
--prompt_len=12 \
--sample_ratio=1.0 \
--load_path='ckpt/re/best_model.pth'
The Twitter2015 and Twitter2017 data are acquired with the code from UMT, many thanks.
The MNRE data for the multimodal relation extraction task are acquired with the code from MEGA, many thanks.
This work extends HVPNeT and reuses some code from HVPNeT, many thanks.