Can't load wiki archive knowbert_wiki_wordnet_model.tar.gz. #4

Closed
JingjunYi opened this issue Sep 14, 2023 · 2 comments

@JingjunYi

When I run the training code, I get the following error. Can you help me solve it? Thanks a lot.
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 381 column 5 (char 15191)

Error log:
(scenetext) [yjj23@gpu2 KnowledgeMiningWithSceneText-main]$ CUDA_VISIBLE_DEVICES=0 python main.py -c configs/train_knowbert_attention_bottle.toml
[2023-09-14 23:54:00,256][RANK=00][I]: unknown_args=[] [main.py:114]
[2023-09-14 23:54:03,193][RANK=00][I]: Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex . [/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/site-packages/pytorch_pretrained_bert/modeling.py:230]
/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/site-packages/sklearn/utils/linear_assignment_.py:18: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
warnings.warn(
[2023-09-14 23:54:04,196][RANK=00][I]: instantiating class <class 'allennlp.data.token_indexers.token_indexer.TokenIndexer'> from params {'type': 'characters_tokenizer', 'tokenizer': {'type': 'word', 'word_splitter': {'type': 'just_spaces'}}, 'namespace': 'entity'} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,196][RANK=00][I]: type = characters_tokenizer [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,196][RANK=00][I]: instantiating class <class 'allennlp.data.tokenizers.tokenizer.Tokenizer'> from params {'type': 'word', 'word_splitter': {'type': 'just_spaces'}} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,196][RANK=00][I]: tokenizer.type = word [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,196][RANK=00][I]: instantiating class <class 'allennlp.data.tokenizers.word_tokenizer.WordTokenizer'> from params {'word_splitter': {'type': 'just_spaces'}} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,197][RANK=00][I]: instantiating class <class 'allennlp.data.tokenizers.word_splitter.WordSplitter'> from params {'type': 'just_spaces'} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,197][RANK=00][I]: tokenizer.word_splitter.type = just_spaces [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: instantiating class <class 'allennlp.data.tokenizers.word_splitter.JustSpacesWordSplitter'> from params {} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,197][RANK=00][I]: tokenizer.start_tokens = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: tokenizer.end_tokens = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: instantiating class <class 'allennlp.data.token_indexers.token_characters_indexer.TokenCharactersIndexer'> from params {'namespace': 'entity'} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,197][RANK=00][I]: namespace = entity [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: start_tokens = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: end_tokens = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: min_padding_length = 0 [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
/home/yjj23/SceneText/allennlp-master/allennlp/data/token_indexers/token_characters_indexer.py:47: UserWarning: You are using the default value (0) of min_padding_length, which can cause some subtle bugs (more info see allenai/allennlp#1954). Strongly recommend to set a value, usually the maximum size of the convolutional layer size when using CnnEncoder.
warnings.warn("You are using the default value (0) of min_padding_length, "
[2023-09-14 23:54:04,281][RANK=00][I]: start logging [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:230]
[2023-09-14 23:54:04,281][RANK=00][I]: OUTPUT_DIR: ./outputs/vit_knowbert_bottle_0914235404 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:231]
[2023-09-14 23:54:04,281][RANK=00][I]: TB_DIR: ./outputs/vit_knowbert_bottle_0914235404/others/tb_logs [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:232]
[2023-09-14 23:54:04,283][RANK=00][I]: configs:
{'ACCUM_ITERS': 32,
'ATTENTION_DEV': False,
'BATCH_SIZE_PERGPU': 8,
'BERT_BOTTLE_CHECKPOINT_PATH': 'pretrained/BERT_pretrained_on_bottle.pth',
'BERT_MLM_CHECKPOINT_PATH': 'pretrained/BERT_pretrained_mlm.pth',
'DATASET_TYPE': 'bottle',
'DEBUG': False,
'DEVICE': 'cuda',
'DISTRIBUTED': False,
'DYNACONF_INCLUDE': ['train_base.toml', 'bottle.toml'],
'EFFECTIVE_BATCH_SIZE': 256,
'EMBEDDING_PATH': '',
'EMBEDDING_PATH_FASTTEXT': '/data1/yjj/SceneText/bottle/fasttext',
'EMBEDDING_PATH_GLOVE': '/data1/yjj/SceneText/bottle/glove/glove_300',
'FREEZE_VIT_KNOWBERT': False,
'GOOGLE_OCR_PATH': '/data1/yjj/SceneText/bottle/google_ocr',
'HEAD_TYPE': 18,
'IMG_ONLY': False,
'INTERACTION': {},
'INTERACTION_MODEL': True,
'IS_TESTING_LR_RANGE': False,
'LOAD_DOTENV': True,
'LOCAL_RANK': 0,
'LOG_EVERY_STEP': 100,
'LR': 3e-05,
'LR_COSINE_T0': 1000,
'LR_COSINE_T_MULT': 1,
'LR_NO_RESTARTS': False,
'LR_WARMUP_STEP': 1000,
'MASTER_ADDR': '127.0.0.1',
'NUM_CLASS': 20,
'NUM_EPOCHS': 50,
'NUM_EPOCH_FREEZE': 40,
'NUM_T': 25,
'NUM_WORKERS': 8,
'OUTPUT_DIR': './outputs/vit_knowbert_bottle_0914235404',
'POSITION_EMBEDDING': False,
'PRETRAINED_BERT': 'Wikipedia',
'PRETRAINED_VISION': 'ImageNet',
'PRETRAINED_WHOLE_MODEL': 'None',
'RANK': 0,
'ROOT_PATH': '/data1/yjj/SceneText/bottle',
'SAVE_MODEL_EVERY_STEP': 16000,
'SEED': 42,
'SGD_MOMENTUM': 0.9,
'SGD_WEIGHT_DECAY': 0,
'TB_DIR': './outputs/vit_knowbert_bottle_0914235404/others/tb_logs',
'TEST_ONLY': False,
'TEXT_BACKBONE': 'knowbert',
'TEXT_PATH': '/data1/yjj/SceneText/bottle/texts',
'TOKEN_ONLY': False,
'UNFREEZE_ALL_STEP': 3080,
'USE_AMP': True,
'USE_BBOX_EMBEDDING': False,
'USE_CATEGORY': False,
'USE_GOOGLE_OCR': True,
'USE_MULTISTEP': False,
'USE_NUM_T': True,
'USE_PADDLE_OCR': False,
'USE_TIMM': True,
'VALIDATION_EVERY_STEP': 400,
'VISION_BACKBONE': 'vit',
'VISION_BOTTLE_CHECKPOINT_PATH': 'best_trained_model.pth',
'VIT_IMAGENET_CHECKPOINT_PATH': 'pretrained/ViT-B_16.npz',
'VKAC_DROPOUT': 0.0,
'WIKIDATA_PATH': '/data1/yjj/SceneText/bottle/wikidata_result',
'WORLD_SIZE': 1} [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:233]
[2023-09-14 23:54:04,283][RANK=00][I]: cfg.local_rank=0, cfg.rank=0, cfg.world_size=1, cfg.distributed=False [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:234]
[2023-09-14 23:54:04,284][RANK=00][I]: loading datasets... [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:107]
[2023-09-14 23:54:04,332][RANK=00][I]: len(trainset): 12325 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:141]
[2023-09-14 23:54:04,332][RANK=00][I]: len(valset): 6163 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:142]
[2023-09-14 23:54:04,332][RANK=00][I]: len(train_loader): 1541 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:179]
[2023-09-14 23:54:04,332][RANK=00][I]: len(val_loader): 771 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:180]
[2023-09-14 23:54:04,333][RANK=00][I]: new cfg.unfreeze_all_step = 62640 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:289]
[2023-09-14 23:54:04,333][RANK=00][I]: new cfg.LR_COSINE_T0 = 3042 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:290]
[2023-09-14 23:54:08,174][RANK=00][I]: archive_file = https://allennlp.s3-us-west-2.amazonaws.com/knowbert/models/knowbert_wiki_wordnet_model.tar.gz [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:08,175][RANK=00][I]: overrides = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:09,389][RANK=00][I]: https://allennlp.s3-us-west-2.amazonaws.com/knowbert/models/knowbert_wiki_wordnet_model.tar.gz not found in cache, downloading to /tmp/tmpxakfp5fi [/home/yjj23/SceneText/allennlp-master/allennlp/common/file_utils.py:222]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1935373195/1935373195 [03:18<00:00, 9755375.98B/s]
[2023-09-14 23:57:29,130][RANK=00][I]: copying /tmp/tmpxakfp5fi to cache at /home/yjj23/.allennlp/cache/f9ae390d324418b5fd7be2cdd2344e53aa911e6f442647664e60409fe3997116.aa3ff7c3c096d56d836ce729ee5ca504b205dd51ab08b4523a4f87edcc2e6cc7 [/home/yjj23/SceneText/allennlp-master/allennlp/common/file_utils.py:235]
[2023-09-14 23:57:33,610][RANK=00][I]: creating metadata file for /home/yjj23/.allennlp/cache/f9ae390d324418b5fd7be2cdd2344e53aa911e6f442647664e60409fe3997116.aa3ff7c3c096d56d836ce729ee5ca504b205dd51ab08b4523a4f87edcc2e6cc7 [/home/yjj23/SceneText/allennlp-master/allennlp/common/file_utils.py:239]
[2023-09-14 23:57:33,614][RANK=00][I]: removing temp file /tmp/tmpxakfp5fi [/home/yjj23/SceneText/allennlp-master/allennlp/common/file_utils.py:245]
[2023-09-14 23:57:34,031][RANK=00][I]: loading archive file https://allennlp.s3-us-west-2.amazonaws.com/knowbert/models/knowbert_wiki_wordnet_model.tar.gz from cache at /home/yjj23/.allennlp/cache/f9ae390d324418b5fd7be2cdd2344e53aa911e6f442647664e60409fe3997116.aa3ff7c3c096d56d836ce729ee5ca504b205dd51ab08b4523a4f87edcc2e6cc7 [/home/yjj23/SceneText/allennlp-master/allennlp/models/archival.py:175]
[2023-09-14 23:57:34,032][RANK=00][I]: extracting archive file /home/yjj23/.allennlp/cache/f9ae390d324418b5fd7be2cdd2344e53aa911e6f442647664e60409fe3997116.aa3ff7c3c096d56d836ce729ee5ca504b205dd51ab08b4523a4f87edcc2e6cc7 to temp dir /tmp/tmp6rttolqm [/home/yjj23/SceneText/allennlp-master/allennlp/models/archival.py:182]
[2023-09-14 23:57:50,807][RANK=00][W]: _jsonnet not loaded, treating /tmp/tmp6rttolqm/config.json as json [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:21]
Traceback (most recent call last):
File "main.py", line 121, in <module>
main()
File "main.py", line 117, in main
return train_knowbert.main()
File "/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py", line 294, in main
model = NetWithAttention(cfg)
File "/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/model/vit_knowbert_interaction_timm.py", line 56, in __init__
self.knowbert = ModelArchiveFromParams.from_params(params=params)
File "/home/yjj23/SceneText/kb-master/kb/include_all.py", line 50, in from_params
archive = load_archive(archive_file)
File "/home/yjj23/SceneText/allennlp-master/allennlp/models/archival.py", line 214, in load_archive
config = Params.from_file(os.path.join(serialization_dir, CONFIG_NAME), overrides)
File "/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py", line 459, in from_file
file_dict = json.loads(evaluate_file(params_file, ext_vars=ext_vars))
File "/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 381 column 5 (char 15191)
[2023-09-14 23:57:50,820][RANK=00][I]: removing temporary unarchived model dir at /tmp/tmp6rttolqm [/home/yjj23/SceneText/allennlp-master/allennlp/models/archival.py:237]

@github-actions

Hi! This is your first issue. Welcome!

@Leojc
Collaborator

Leojc commented Sep 15, 2023

It seems an error occurs when decoding the file /tmp/tmp6rttolqm/config.json. You can open it and see what's wrong at the reported position (line 381, column 5), or try downloading the archive again manually.
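
Note that the log shows the temporary directory /tmp/tmp6rttolqm being removed right after the failure, so config.json has to be read from the cached archive instead. The following is a minimal sketch of how that inspection could be done with the Python standard library only. The helper names `read_member` and `locate_json_error` are hypothetical and not part of allennlp or kb. One possible cause, not confirmed in this thread, is suggested by the `_jsonnet not loaded` warning in the log: if config.json uses Jsonnet syntax such as comments or trailing commas, the plain JSON parser will reject it, and installing the `jsonnet` Python package may be worth trying.

```python
import json
import tarfile


def read_member(archive_path, member="config.json"):
    """Return the text of one file inside a .tar.gz archive (hypothetical helper)."""
    with tarfile.open(archive_path, "r:gz") as tar:
        for info in tar:
            # Accept both "config.json" and "./config.json" as member names.
            if info.name in (member, "./" + member):
                return tar.extractfile(info).read().decode("utf-8")
    raise FileNotFoundError(f"{member} not found in {archive_path}")


def locate_json_error(text, context=2):
    """Return None if text is valid JSON, else a dict describing where parsing failed."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        # err.lineno is 1-based; collect `context` lines on each side of it.
        first = max(err.lineno - 1 - context, 0)
        last = min(err.lineno + context, len(lines))
        return {
            "message": err.msg,
            "line": err.lineno,
            "column": err.colno,
            "snippet": lines[first:last],
        }
    return None


# Usage (the path is the cache file named in the log above):
# text = read_member("/home/yjj23/.allennlp/cache/f9ae390d...")  # use the full cache path
# print(locate_json_error(text))
```

If the snippet around the reported line shows a comment or a trailing comma, the file is probably intact and only needs a Jsonnet-capable loader. If it shows truncated or garbled content, deleting the cache entry and downloading the archive again is the more likely fix.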

@Leojc Leojc added the staled label Jan 20, 2024
@Leojc Leojc closed this as completed Jan 20, 2024