Can't load wiki archive knowbert_wiki_wordnet_model.tar.gz. #4

JingjunYi · 2023-09-14T16:10:11Z

When I run the training code, I get the error below. Could you help me solve it? Thanks a lot.
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 381 column 5 (char 15191)

Error log:
(scenetext) [yjj23@gpu2 KnowledgeMiningWithSceneText-main]$ CUDA_VISIBLE_DEVICES=0 python main.py -c configs/train_knowbert_attention_bottle.toml
[2023-09-14 23:54:00,256][RANK=00][I]: unknown_args=[] [main.py:114]
[2023-09-14 23:54:03,193][RANK=00][I]: Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex . [/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/site-packages/pytorch_pretrained_bert/modeling.py:230]
/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/site-packages/sklearn/utils/linear_assignment_.py:18: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
warnings.warn(
[2023-09-14 23:54:04,196][RANK=00][I]: instantiating class <class 'allennlp.data.token_indexers.token_indexer.TokenIndexer'> from params {'type': 'characters_tokenizer', 'tokenizer': {'type': 'word', 'word_splitter': {'type': 'just_spaces'}}, 'namespace': 'entity'} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,196][RANK=00][I]: type = characters_tokenizer [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,196][RANK=00][I]: instantiating class <class 'allennlp.data.tokenizers.tokenizer.Tokenizer'> from params {'type': 'word', 'word_splitter': {'type': 'just_spaces'}} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,196][RANK=00][I]: tokenizer.type = word [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,196][RANK=00][I]: instantiating class <class 'allennlp.data.tokenizers.word_tokenizer.WordTokenizer'> from params {'word_splitter': {'type': 'just_spaces'}} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,197][RANK=00][I]: instantiating class <class 'allennlp.data.tokenizers.word_splitter.WordSplitter'> from params {'type': 'just_spaces'} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,197][RANK=00][I]: tokenizer.word_splitter.type = just_spaces [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: instantiating class <class 'allennlp.data.tokenizers.word_splitter.JustSpacesWordSplitter'> from params {} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,197][RANK=00][I]: tokenizer.start_tokens = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: tokenizer.end_tokens = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: instantiating class <class 'allennlp.data.token_indexers.token_characters_indexer.TokenCharactersIndexer'> from params {'namespace': 'entity'} and extras set() [/home/yjj23/SceneText/allennlp-master/allennlp/common/from_params.py:256]
[2023-09-14 23:54:04,197][RANK=00][I]: namespace = entity [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: start_tokens = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: end_tokens = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:04,197][RANK=00][I]: min_padding_length = 0 [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
/home/yjj23/SceneText/allennlp-master/allennlp/data/token_indexers/token_characters_indexer.py:47: UserWarning: You are using the default value (0) of min_padding_length, which can cause some subtle bugs (more info see allenai/allennlp#1954). Strongly recommend to set a value, usually the maximum size of the convolutional layer size when using CnnEncoder.
warnings.warn("You are using the default value (0) of min_padding_length, "
[2023-09-14 23:54:04,281][RANK=00][I]: start logging [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:230]
[2023-09-14 23:54:04,281][RANK=00][I]: OUTPUT_DIR: ./outputs/vit_knowbert_bottle_0914235404 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:231]
[2023-09-14 23:54:04,281][RANK=00][I]: TB_DIR: ./outputs/vit_knowbert_bottle_0914235404/others/tb_logs [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:232]
[2023-09-14 23:54:04,283][RANK=00][I]: configs:
{'ACCUM_ITERS': 32,
'ATTENTION_DEV': False,
'BATCH_SIZE_PERGPU': 8,
'BERT_BOTTLE_CHECKPOINT_PATH': 'pretrained/BERT_pretrained_on_bottle.pth',
'BERT_MLM_CHECKPOINT_PATH': 'pretrained/BERT_pretrained_mlm.pth',
'DATASET_TYPE': 'bottle',
'DEBUG': False,
'DEVICE': 'cuda',
'DISTRIBUTED': False,
'DYNACONF_INCLUDE': ['train_base.toml', 'bottle.toml'],
'EFFECTIVE_BATCH_SIZE': 256,
'EMBEDDING_PATH': '',
'EMBEDDING_PATH_FASTTEXT': '/data1/yjj/SceneText/bottle/fasttext',
'EMBEDDING_PATH_GLOVE': '/data1/yjj/SceneText/bottle/glove/glove_300',
'FREEZE_VIT_KNOWBERT': False,
'GOOGLE_OCR_PATH': '/data1/yjj/SceneText/bottle/google_ocr',
'HEAD_TYPE': 18,
'IMG_ONLY': False,
'INTERACTION': {},
'INTERACTION_MODEL': True,
'IS_TESTING_LR_RANGE': False,
'LOAD_DOTENV': True,
'LOCAL_RANK': 0,
'LOG_EVERY_STEP': 100,
'LR': 3e-05,
'LR_COSINE_T0': 1000,
'LR_COSINE_T_MULT': 1,
'LR_NO_RESTARTS': False,
'LR_WARMUP_STEP': 1000,
'MASTER_ADDR': '127.0.0.1',
'NUM_CLASS': 20,
'NUM_EPOCHS': 50,
'NUM_EPOCH_FREEZE': 40,
'NUM_T': 25,
'NUM_WORKERS': 8,
'OUTPUT_DIR': './outputs/vit_knowbert_bottle_0914235404',
'POSITION_EMBEDDING': False,
'PRETRAINED_BERT': 'Wikipedia',
'PRETRAINED_VISION': 'ImageNet',
'PRETRAINED_WHOLE_MODEL': 'None',
'RANK': 0,
'ROOT_PATH': '/data1/yjj/SceneText/bottle',
'SAVE_MODEL_EVERY_STEP': 16000,
'SEED': 42,
'SGD_MOMENTUM': 0.9,
'SGD_WEIGHT_DECAY': 0,
'TB_DIR': './outputs/vit_knowbert_bottle_0914235404/others/tb_logs',
'TEST_ONLY': False,
'TEXT_BACKBONE': 'knowbert',
'TEXT_PATH': '/data1/yjj/SceneText/bottle/texts',
'TOKEN_ONLY': False,
'UNFREEZE_ALL_STEP': 3080,
'USE_AMP': True,
'USE_BBOX_EMBEDDING': False,
'USE_CATEGORY': False,
'USE_GOOGLE_OCR': True,
'USE_MULTISTEP': False,
'USE_NUM_T': True,
'USE_PADDLE_OCR': False,
'USE_TIMM': True,
'VALIDATION_EVERY_STEP': 400,
'VISION_BACKBONE': 'vit',
'VISION_BOTTLE_CHECKPOINT_PATH': 'best_trained_model.pth',
'VIT_IMAGENET_CHECKPOINT_PATH': 'pretrained/ViT-B_16.npz',
'VKAC_DROPOUT': 0.0,
'WIKIDATA_PATH': '/data1/yjj/SceneText/bottle/wikidata_result',
'WORLD_SIZE': 1} [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:233]
[2023-09-14 23:54:04,283][RANK=00][I]: cfg.local_rank=0, cfg.rank=0, cfg.world_size=1, cfg.distributed=False [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:234]
[2023-09-14 23:54:04,284][RANK=00][I]: loading datasets... [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:107]
[2023-09-14 23:54:04,332][RANK=00][I]: len(trainset): 12325 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:141]
[2023-09-14 23:54:04,332][RANK=00][I]: len(valset): 6163 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:142]
[2023-09-14 23:54:04,332][RANK=00][I]: len(train_loader): 1541 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:179]
[2023-09-14 23:54:04,332][RANK=00][I]: len(val_loader): 771 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:180]
[2023-09-14 23:54:04,333][RANK=00][I]: new cfg.unfreeze_all_step = 62640 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:289]
[2023-09-14 23:54:04,333][RANK=00][I]: new cfg.LR_COSINE_T0 = 3042 [/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py:290]
[2023-09-14 23:54:08,174][RANK=00][I]: archive_file = https://allennlp.s3-us-west-2.amazonaws.com/knowbert/models/knowbert_wiki_wordnet_model.tar.gz [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:08,175][RANK=00][I]: overrides = None [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:251]
[2023-09-14 23:54:09,389][RANK=00][I]: https://allennlp.s3-us-west-2.amazonaws.com/knowbert/models/knowbert_wiki_wordnet_model.tar.gz not found in cache, downloading to /tmp/tmpxakfp5fi [/home/yjj23/SceneText/allennlp-master/allennlp/common/file_utils.py:222]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1935373195/1935373195 [03:18<00:00, 9755375.98B/s]
[2023-09-14 23:57:29,130][RANK=00][I]: copying /tmp/tmpxakfp5fi to cache at /home/yjj23/.allennlp/cache/f9ae390d324418b5fd7be2cdd2344e53aa911e6f442647664e60409fe3997116.aa3ff7c3c096d56d836ce729ee5ca504b205dd51ab08b4523a4f87edcc2e6cc7 [/home/yjj23/SceneText/allennlp-master/allennlp/common/file_utils.py:235]
[2023-09-14 23:57:33,610][RANK=00][I]: creating metadata file for /home/yjj23/.allennlp/cache/f9ae390d324418b5fd7be2cdd2344e53aa911e6f442647664e60409fe3997116.aa3ff7c3c096d56d836ce729ee5ca504b205dd51ab08b4523a4f87edcc2e6cc7 [/home/yjj23/SceneText/allennlp-master/allennlp/common/file_utils.py:239]
[2023-09-14 23:57:33,614][RANK=00][I]: removing temp file /tmp/tmpxakfp5fi [/home/yjj23/SceneText/allennlp-master/allennlp/common/file_utils.py:245]
[2023-09-14 23:57:34,031][RANK=00][I]: loading archive file https://allennlp.s3-us-west-2.amazonaws.com/knowbert/models/knowbert_wiki_wordnet_model.tar.gz from cache at /home/yjj23/.allennlp/cache/f9ae390d324418b5fd7be2cdd2344e53aa911e6f442647664e60409fe3997116.aa3ff7c3c096d56d836ce729ee5ca504b205dd51ab08b4523a4f87edcc2e6cc7 [/home/yjj23/SceneText/allennlp-master/allennlp/models/archival.py:175]
[2023-09-14 23:57:34,032][RANK=00][I]: extracting archive file /home/yjj23/.allennlp/cache/f9ae390d324418b5fd7be2cdd2344e53aa911e6f442647664e60409fe3997116.aa3ff7c3c096d56d836ce729ee5ca504b205dd51ab08b4523a4f87edcc2e6cc7 to temp dir /tmp/tmp6rttolqm [/home/yjj23/SceneText/allennlp-master/allennlp/models/archival.py:182]
[2023-09-14 23:57:50,807][RANK=00][W]: _jsonnet not loaded, treating /tmp/tmp6rttolqm/config.json as json [/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py:21]
Traceback (most recent call last):
File "main.py", line 121, in <module>
main()
File "main.py", line 117, in main
return train_knowbert.main()
File "/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/train_knowbert.py", line 294, in main
model = NetWithAttention(cfg)
File "/home/yjj23/SceneText/KnowledgeMiningWithSceneText-main/model/vit_knowbert_interaction_timm.py", line 56, in __init__
self.knowbert = ModelArchiveFromParams.from_params(params=params)
File "/home/yjj23/SceneText/kb-master/kb/include_all.py", line 50, in from_params
archive = load_archive(archive_file)
File "/home/yjj23/SceneText/allennlp-master/allennlp/models/archival.py", line 214, in load_archive
config = Params.from_file(os.path.join(serialization_dir, CONFIG_NAME), overrides)
File "/home/yjj23/SceneText/allennlp-master/allennlp/common/params.py", line 459, in from_file
file_dict = json.loads(evaluate_file(params_file, ext_vars=ext_vars))
File "/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/json/__init__.py", line 357, in loads
return _default_decoder.decode(s)
File "/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/home/yjj23/anaconda3/envs/scenetext/lib/python3.8/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 381 column 5 (char 15191)
[2023-09-14 23:57:50,820][RANK=00][I]: removing temporary unarchived model dir at /tmp/tmp6rttolqm [/home/yjj23/SceneText/allennlp-master/allennlp/models/archival.py:237]


github-actions · 2023-09-14T16:10:50Z

Hi! This is your first issue. Welcome!

Leojc · 2023-09-15T06:45:40Z

It seems an error occurred while decoding the file /tmp/tmp6rttolqm/config.json. You can open it and see what's wrong, or try downloading the archive again manually.
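
A minimal sketch of how one might "open and see what's wrong": the helper below (`describe_json_error` is a hypothetical name, not part of allennlp or kb) parses a string with Python's strict `json` module and, on failure, reports the lines around the position the decoder complains about. Note that the log prints `_jsonnet not loaded, treating /tmp/tmp6rttolqm/config.json as json` right before the traceback, so one possibility, not confirmed in this thread, is that `config.json` contains syntax that Jsonnet accepts but strict JSON does not (for example an unquoted key or a trailing comma). The sample text in the demo is made up for illustration.

```python
import json


def describe_json_error(text, context=2):
    """Return None if `text` is valid strict JSON.

    Otherwise return a short report: the decoder's message followed by
    the offending line and `context` lines on either side of it.
    """
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        lines = text.splitlines()
        # err.lineno is 1-based; convert to a 0-based window of lines.
        start = max(err.lineno - 1 - context, 0)
        end = min(err.lineno + context, len(lines))
        report = [f"{err.msg}: line {err.lineno} column {err.colno} (char {err.pos})"]
        for index in range(start, end):
            marker = ">>" if index + 1 == err.lineno else "  "
            report.append(f"{marker} {index + 1}: {lines[index]}")
        return "\n".join(report)
    return None


if __name__ == "__main__":
    # Illustrative input only: an unquoted key, which strict JSON rejects.
    sample = '{\n  "a": 1,\n  b: 2\n}'
    print(describe_json_error(sample))
```

To apply this to the real file, extract `config.json` from the cached archive (for example with `tar -xzf`), read it into a string, and pass it to the function; the original traceback points at line 381 of that file.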

Leojc added the staled label Jan 20, 2024

Leojc closed this as completed Jan 20, 2024
