cannot reproduce the RichpediaMEL results? #2
Hi, Zhiwei. I retrained the model with the RichpediaMEL dataset, and everything seems fine. Based on the training logs you provided, I notice that the loss appears to be much larger than usual. In my training, after the first epoch, the Train/loss_epoch is around 3.19.
Is the issue of not being able to reproduce the results limited to RichpediaMEL, or does it apply to all datasets?
Hi, Pengfei. It is limited to RichpediaMEL only; the other two datasets give results close to those reported in the paper.
In addition, I see that many attr fields in the dataset are empty. Is this field not used in the end?
That's strange. I've checked the MD5 of the files, and they appear to match the ones on my training server. Can you please check the learning rate during training? It seems that after the second epoch, the loss no longer exhibits significant changes.
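For anyone following along, a minimal way to compute such a checksum in Python (the file name below is only a placeholder, not a specific file from this thread) is:
import hashlib

def md5sum(path, chunk_size=1 << 20):
    # Read the file in chunks so large archives don't need to fit in memory.
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

print(md5sum('RichpediaMEL.tar'))  # placeholder path; point it at the file you want to verify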
For some entities, I couldn't retrieve suitable attributes from Wikidata (possibly due to a network issue), so I left them blank. In the implementation, the attributes are concatenated with the entity's name (see lines 55 to 56 in 59ef385).
I need to print the learning rate after each epoch, right? I also found that the loss did not change much after the second epoch.
Okay, that means attr is not used in the current dataset, right?
You can log the learning rate without hassle by using PyTorch Lightning callbacks; you simply need to add a LearningRateMonitor to the trainer's callbacks.
import os
import torch
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint, EarlyStopping, LearningRateMonitor
from codes.utils.functions import setup_parser
from codes.model.lightning_mimic import LightningForMIMIC
from codes.utils.dataset import DataModuleForMIMIC

if __name__ == '__main__':
    args = setup_parser()
    pl.seed_everything(args.seed, workers=True)
    torch.set_num_threads(1)
    data_module = DataModuleForMIMIC(args)
    lightning_model = LightningForMIMIC(args)
    logger = pl.loggers.CSVLogger("./runs", name=args.run_name, flush_logs_every_n_steps=30)
    # Keep the best checkpoint by validation MRR and stop early when it stops improving.
    ckpt_callbacks = ModelCheckpoint(monitor='Val/mrr', save_weights_only=True, mode='max')
    early_stop_callback = EarlyStopping(monitor="Val/mrr", min_delta=0.00, patience=3, verbose=True, mode="max")
    # Log the current learning rate at every optimizer step.
    lr_callback = LearningRateMonitor(logging_interval='step')
    trainer = pl.Trainer(**args.trainer,
                         deterministic=True, logger=logger, default_root_dir="./runs",
                         callbacks=[ckpt_callbacks, early_stop_callback, lr_callback])
    trainer.fit(lightning_model, datamodule=data_module)
    trainer.test(lightning_model, datamodule=data_module, ckpt_path='best')
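With the CSVLogger configured above, the monitored learning rate should appear as an extra metric column in the metrics.csv file under ./runs/<run_name>/version_*/; by default LearningRateMonitor names the column after the optimizer class (e.g. lr-Adam).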
I'm not sure what you mean by "not used." Our intention is to utilize the attributes to enhance the representation of entities. Therefore, we concatenate the flattened key-value attributes with the entity's name as textual input.
I will give feedback this afternoon or evening.
What I mean is that I saw that the attr field is empty, indicating that attr is not used. In the code, I saw that there is indeed a part where attr is concatenated.
No, I did use attributes. However, due to network issues or the absence of suitable attributes, some entities have empty or missing attributes.
Ok, I understand.
In addition, would you mind providing the Figure 4 datasets (10% and 20% for RichpediaMEL and WikiDiverse) and the numerical results? I need to draw my own histogram, but I don't know the specific values behind yours.
Hi Pengfei. I have provided the training logs with learning-rate logs. Please also pay attention to the question I mentioned above about the datasets and numerical results of Figure 4; looking forward to the discussion.
Hi, Pengfei. Any updates?
Hi, sorry for the late response. I have reviewed your log file, and the learning rate appears to be fine. I attempted to retrain the model using the code and original data we uploaded, and the loss and evaluation results match our reported findings. Could you please check the configuration file? If you want to reproduce the reported results right now, I have uploaded a model checkpoint here (password: KDD2023richpedia). In the low-resource setting, we only utilized the first 10% and 20% of the training data for each dataset, following the order in the training data file. This means that if you want to access the low-resource training data, you only need to control the amount of training data used. Please add a new line after line 44 in 59ef385:
train_data = train_data[:int(len(train_data) * 0.1)]  # or 0.2
Then you can obtain either 10% or 20% of the training data we used. Regarding the numerical results you've requested, I will update them in the readme file in the next few days. Please stay tuned.
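For illustration, here is a minimal sketch of where such a slice could sit in a data-loading routine; the function name, variable names, and the assumption that the training file is a JSON list are mine, not the exact MIMIC code:
import json

def load_train_data(path, fraction=1.0):
    # Read the full training file (assumed here to be a JSON list of samples).
    with open(path, 'r', encoding='utf-8') as f:
        train_data = json.load(f)
    # Low-resource setting: keep only the first fraction of samples,
    # following the order in the training file.
    train_data = train_data[:int(len(train_data) * fraction)]
    return train_data

train_10 = load_train_data('RichpediaMEL_train.json', fraction=0.1)
train_20 = load_train_data('RichpediaMEL_train.json', fraction=0.2)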
Hi Pengfei. First, I have uploaded the yaml file I used, and I did not make any modifications except the paths. Second, for the running environment, I created a separate conda environment, and the environment information is exactly the same as your requirements.txt.
run_name: RichpediaMEL
seed: 43
pretrained_model: '/checkpoint/clip-vit-base-patch32'
lr: 1e-5
data:
num_entity: 160933
kb_img_folder: /data/RichpediaMEL/kb_image
mention_img_folder: /data/RichpediaMEL/mention_image
qid2id: /data/RichpediaMEL/qid2id.json
entity: /data/RichpediaMEL/kb_entity.json
train_file: /data/RichpediaMEL/RichpediaMEL_train.json
dev_file: /data/RichpediaMEL/RichpediaMEL_dev.json
test_file: /data/RichpediaMEL/RichpediaMEL_test.json
batch_size: 128
num_workers: 8
text_max_length: 40
eval_chunk_size: 6000
eval_batch_size: 20
embed_update_batch_size: 512
model:
input_hidden_dim: 512
input_image_hidden_dim: 768
hidden_dim: 96
dv: 96
dt: 512
TGLU_hidden_dim: 96
IDLU_hidden_dim: 96
CMFU_hidden_dim: 96
trainer:
accelerator: 'gpu'
devices: 1
max_epochs: 20
num_sanity_val_steps: 0
check_val_every_n_epoch: 2
log_every_n_steps: 30
The full environment information is:
absl-py 1.4.0
aiohttp 3.8.5
aiosignal 1.3.1
antlr4-python3-runtime 4.9.3
async-timeout 4.0.3
attrs 23.1.0
cachetools 5.3.1
certifi 2023.7.22
charset-normalizer 3.2.0
click 8.1.7
filelock 3.12.3
frozenlist 1.4.0
fsspec 2023.9.0
google-auth 2.22.0
google-auth-oauthlib 1.0.0
grpcio 1.57.0
huggingface-hub 0.16.4
idna 3.4
importlib-metadata 6.8.0
joblib 1.3.2
Markdown 3.4.4
MarkupSafe 2.1.3
multidict 6.0.4
numpy 1.24.4
oauthlib 3.2.2
omegaconf 2.2.3
packaging 23.1
Pillow 9.3.0
pip 23.2.1
protobuf 4.24.2
pyasn1 0.5.0
pyasn1-modules 0.3.0
pyDeprecate 0.3.2
pytorch-lightning 1.7.7
PyYAML 6.0.1
regex 2023.8.8
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
sacremoses 0.0.53
setuptools 68.0.0
six 1.16.0
tensorboard 2.14.0
tensorboard-data-server 0.7.1
tokenizers 0.12.1
torch 1.11.0
torchmetrics 0.11.0
tqdm 4.66.1
transformers 4.18.0
typing_extensions 4.7.1
urllib3 1.26.16
Werkzeug 2.3.7
wheel 0.38.4
yarl 1.9.2
zipp 3.16.2
Thanks for the information about how to run the low-resource experiments. I am very much looking forward to your numerical results; thank you for your efforts. In addition, regarding the reproduction of the RichpediaMEL results, I wonder whether there may be some differences between the code you used to reproduce them and the code you uploaded, because I ran it twice on this dataset and the results were exactly the same as the ones I uploaded above.
I can reproduce the results with the code we shared and the data we uploaded to OneDrive. Is there any difference in the pretrained model? I saw that you changed the path. I use the one from Hugging Face. SHA256: a63082132ba4f97a80bea76823f544493bffa8082296d62d71581a4feff1576f
I downloaded the pretrained CLIP from https://huggingface.co/openai/clip-vit-base-patch32/tree/main. I will replace the pytorch_model.bin with the one from the link you provided and upload the results tomorrow morning.
But I found that the CLIP weights I downloaded are actually exactly the same as the pytorch_model.bin behind the link you provided.
Hi Pengfei. I may need further help from you, because I still have difficulty reproducing the results on the RichpediaMEL dataset, even though I have used the CLIP pretrained model from the URL you gave me (actually the same pretrained model I used previously). I will upload my running logs on the three datasets below.
This is very strange. The other two datasets work fine; only RichpediaMEL has an issue. Maybe you could double-check the RichpediaMEL.tar file you downloaded? I will share an online Wandb report later to show that everything is normal on my end. RichpediaMEL.tar
I downloaded the RichpediaMEL dataset from the link you provided: https://mailustceducn-my.sharepoint.com/:u:/g/personal/pfluo_mail_ustc_edu_cn/ERikbOQuoWFHrA_AizcuCbgB8PBOiRqCV4U0lZfxUN-6kg?e=speIdh
Could you please try upgrading transformers to version 4.27.1? I noticed that the version of transformers might have an impact on the results, although I'm not sure what's causing the differences.
pip install transformers==4.27.1 --upgrade
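As a quick sanity check (my addition, not part of the original instructions), you can confirm which version is active in the environment before retraining:
import transformers
print(transformers.__version__)  # expected to print 4.27.1 after the upgrade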
Let me check.
The Wandb report is here.
You used transformers==4.27.1, right?
Yes, in the Wandb report run, I used torch==1.11.0 and transformers==4.27.1. Other packages are the same as the requirements. I attempted to downgrade transformers to 4.18.0 and noticed that it did lead to a performance drop. I have no idea why this occurred.
If the performance degradation is due to transformers, then this is beyond the scope of our discussion. As long as the results can be reproduced, everything is good. I'll re-run and give my reproduction results.
Hi, Pengfei. The code in the repository is:
self.tokenizer = CLIPProcessor.from_pretrained(self.args.pretrained_model).tokenizer
The code used by the Wandb run is:
self.tokenizer = CLIPProcessor.from_pretrained(self.args.pretrained_model, local_files_only=True).tokenizer
But I think this is not the main problem, because after I added the local_files_only=True parameter, I found the result was the same. Then I created an environment exactly matching the requirements.txt that the Wandb run used, and the running results are exactly the same as mine before, indicating that the difference in results is not caused by environment problems. So I need to confirm now: is the RichpediaMEL dataset you are using the version you uploaded? Because now all the code and environment information are completely consistent, the performance difference is difficult to accept.
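As a side note, one way to confirm that the two loading calls really yield the same tokenizer (assuming the model files are already in the local cache) is to compare their vocabularies; the path below is taken from the yaml shown earlier:
from transformers import CLIPProcessor

path = '/checkpoint/clip-vit-base-patch32'
tok_a = CLIPProcessor.from_pretrained(path).tokenizer
tok_b = CLIPProcessor.from_pretrained(path, local_files_only=True).tokenizer
print(tok_a.get_vocab() == tok_b.get_vocab())  # True if the two tokenizers are identical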
The parameter local_files_only only controls whether the files are loaded from the local cache instead of being fetched from the Hugging Face Hub, so it should not affect the results.
You can ignore the .pkl files; I found a difference between kb_image and mention_images.
Can you provide the MD5 values for kb_image.zip and mention_images.zip? I directly extracted these two ZIP files.
Wait a few minutes; I deleted the original files after decompressing them, and I need to download them again.
I can't think of any other reason why it is difficult to reproduce. The sizes of the .zip files are the same, but the sizes after decompression are different?
I checked your running log on Wandb, and your loss is obviously much lower than what I reproduced.
It seems all the files are normal. The difference in folder sizes may be due to differences in how the operating system organizes files.
Perhaps you can try changing some hyperparameters, such as the random seed, learning rate, and batch size, to see if they have an impact on the loss. If you have access to other servers, maybe you can try configuring the environment and running it on other servers. I don't know what's causing the inability to reproduce the results. All the results on my end are normal.
I can try it on other machines, but judging from my experience running your code, as long as the random seed is fixed, the results will be exactly the same every time.
I think it is necessary to share some new findings. I originally ran the code on a V100 32G GPU. I have now tried it on an A6000 and found that the final results of the model are almost the same as on the V100. Have you made any other modifications? The hyperparameters I use are completely consistent with the yaml you provided.
As you can see on Wandb, I confirm that no changes were made to the model or data-processing code. We have verified that the pre-trained model, environment configuration, code, and data are all consistent. I don't think the CUDA version and NVIDIA driver version should have such a significant impact.
This is a very vexing question, because I can't think of any other reason that could cause this problem.
My thought is, perhaps you can make some adjustments to the parameters and then observe whether there's a decrease in loss within a few iterations (compared to your previous loss). Maybe consider replacing the optimizer with stochastic gradient descent? I'm not sure.
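For illustration only, a generic PyTorch Lightning pattern for such a swap (not the actual MIMIC implementation; the subclass and the hard-coded learning rate taken from the yaml above are assumptions) would be to override configure_optimizers:
import torch
from codes.model.lightning_mimic import LightningForMIMIC

class LightningForMIMICWithSGD(LightningForMIMIC):
    def configure_optimizers(self):
        # Replace the original optimizer with plain SGD; 1e-5 matches the lr in the yaml above.
        return torch.optim.SGD(self.parameters(), lr=1e-5, momentum=0.9)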
I'll try adjusting the parameters tomorrow and keep in touch at any time.
I discovered that the folder for the mention images should be named
You need to manually modify the following line in the config file for RichpediaMEL: MIMIC/config/richpediamel.yaml, line 10 in 59ef385.
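For reference (the folder names below are an assumption based on the kb_image / mention_images discussion above), a quick way to check which directory name actually exists after extraction:
import os

root = '/data/RichpediaMEL'  # same root as in the yaml above
for name in ('mention_image', 'mention_images'):
    print(name, os.path.isdir(os.path.join(root, name)))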
It is indeed like this; a very subtle problem. Thank you very much. I will re-run the code and give the final result.
As a reminder, would you mind uploading the numerical results of Figure 4?
I have just updated the detailed results in the README file.
Great, thanks.
Hi, Pengfei. I have reproduced the results; thanks for your solution. Good luck. I will close the issue.
I can't seem to reproduce the results on WikiMEL. My hyperparameter file is the one the authors provided on GitHub. Could you please tell me whether there is anything I should pay attention to when reproducing the results?
Hi, Pengfei. Nice work. I find I cannot reproduce the RichpediaMEL dataset results. I use the same yaml as you provided; can you help me? The attachment is the training logs.
richpediamel.txt