
Question about fine-tuning on squad dataset #24

Closed
curtis0982 opened this issue Mar 26, 2020 · 3 comments

Comments

@curtis0982

I downloaded this model and tried to use it.
I chose the SQuAD 2.0 dataset for fine-tuning.
When I ran fine-tuning from the command line,
the program just stopped and the command line exited without any error message.
The output is like this:


(env_tf115) D:\python_code\NLP\electra>python run_finetuning.py --data-dir "D:\python_code\NLP\electra\datadir" --model-name electra_small --hparams {"model_size": "small", "task_names": ["squad"], "num_trials": 2}
2020-03-26 22:00:19.349133: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
{"model_size": "small", "task_names": ["squad"], "num_trials": 2}
================================================================================
Config: model=electra_small, trial 1/2
================================================================================
answerable_classifier True
answerable_uses_start_logits True
answerable_weight 0.5
beam_size 20
data_dir D:\python_code\NLP\electra\datadir
debug False
do_eval True
do_lower_case True
do_train True
doc_stride 128
double_unordered True
embedding_size 128
eval_batch_size 32
gcp_project None
init_checkpoint D:\python_code\NLP\electra\datadir\models\electra_small
iterations_per_loop 1000
joint_prediction True
keep_all_models True
layerwise_lr_decay 0.8
learning_rate 0.0001
log_examples False
max_answer_length 30
max_query_length 64
max_seq_length 512
model_dir D:\python_code\NLP\electra\datadir\models\electra_small\finetuning_models\squad_model
model_hparam_overrides {}
model_name electra_small
model_size small
n_best_size 20
n_writes_test 5
num_tpu_cores 1
num_train_epochs 2.0
num_trials 2
predict_batch_size 32
preprocessed_data_dir D:\python_code\NLP\electra\datadir\models\electra_small\finetuning_tfrecords\squad_tfrecords
qa_eval_file <built-in method format of str object at 0x0000027494EB70B0>
qa_na_file <built-in method format of str object at 0x0000027494EB1AE0>
qa_na_threshold -2.75
qa_preds_file <built-in method format of str object at 0x0000027494EB70F0>
raw_data_dir <built-in method format of str object at 0x0000027494EAECA8>
results_pkl D:\python_code\NLP\electra\datadir\models\electra_small\results\squad_results.pkl
results_txt D:\python_code\NLP\electra\datadir\models\electra_small\results\squad_results.txt
save_checkpoints_steps 1000000
task_names ['squad']
test_predictions <built-in method format of str object at 0x0000027494EAD8F0>
tpu_job_name None
tpu_name None
tpu_zone None
train_batch_size 32
use_tfrecords_if_existing True
use_tpu False
vocab_file D:\python_code\NLP\electra\datadir\models\electra_small\vocab.txt
vocab_size 30522
warmup_proportion 0.1
weight_decay_rate 0.01
write_distill_outputs False
write_test_outputs False

Loading dataset squad_train
Existing tfrecords not found so creating

(env_tf115) D:\python_code\NLP\electra>

My computer setting:
Windows 10
Cuda 10.0.130
Cudnn 7.6.3
Tensorflow 1.15 GPU

I've put the pre-trained model files under the "datadir\models\electra_small" directory
and the SQuAD dataset under the "datadir\finetuning_data\squad" directory.
Does anyone know why the fine-tuning is not working?

@clarkkev
Collaborator

Huh, it's strange that it exits with no error message. Can you add some print statements to find out exactly where it exits?
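Besides print statements, one way to surface a silent exit is Python's standard-library `faulthandler` module, which dumps the Python traceback even when the interpreter is killed from native code (e.g. inside a TensorFlow op). A minimal sketch, assuming the snippet is placed near the top of run_finetuning.py (the placement is an assumption, not part of the repo):

```python
import faulthandler
import sys

# Dump the Python traceback of all threads to stderr if the interpreter
# crashes in native code (e.g. a segfault in a TensorFlow op), instead
# of letting the process exit silently.
faulthandler.enable(file=sys.stderr, all_threads=True)
```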

@curtis0982
Author

curtis0982 commented Mar 27, 2020

I added some print statements in preprocessing.py to see what's going on:
```python
utils.log("Loading dataset", dataset_name)
n_examples = None
if (self._config.use_tfrecords_if_existing and
    tf.io.gfile.exists(metadata_path)):
  n_examples = utils.load_json(metadata_path)["n_examples"]
print("      n_examples = utils.load_json(metadata_path)[n_examples]\n")
if n_examples is None:
  utils.log("Existing tfrecords not found so creating")
  examples = []
  print("\n examples = [] \n")
  for task in tasks:
    print("\ntask:", task, "\n")
    task_examples = task.get_examples(split)
    print("\n task_examples = task.get_examples(split)\n")
    examples += task_examples
    print("\nexamples += task_examples\n")
  if is_training:
    random.shuffle(examples)
    print("\n random.shuffle(examples) \n")
  utils.mkdir(tfrecords_path.rsplit("/", 1)[0])
  print('\n utils.mkdir(tfrecords_path.rsplit("/", 1)[0]) \n')
  n_examples = self.serialize_examples(
      examples, is_training, tfrecords_path, batch_size)
  print("\n n_examples = self.serialize_examples( \n")
  utils.write_json({"n_examples": n_examples}, metadata_path)
```

The output is like this:
Loading dataset squad_train
n_examples = utils.load_json(metadata_path)[n_examples]

Existing tfrecords not found so creating

examples = []

task: Task(squad)

(env_tf115) D:\python_code\NLP\electra>

It seems that the part that fails is `task_examples = task.get_examples(split)`.
I'm trying to figure it out.

To see what is going on with the .get_examples method, I checked qa_tasks.py
and found that this is the line where it stops:

`input_data = json.load(f)["data"]`
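A sketch of one way to make that failure loud instead of silent (`load_squad_data` is a hypothetical wrapper, not a function in the ELECTRA codebase): check that the resolved path exists before calling `json.load`, and raise with the absolute path so a bad path shows up in the console:

```python
import json
import os

def load_squad_data(path):
    # Hypothetical debugging wrapper around the failing line
    # `input_data = json.load(f)["data"]` in qa_tasks.py: raise with the
    # fully resolved path instead of letting the process die silently.
    if not os.path.exists(path):
        raise FileNotFoundError(
            "SQuAD file not found: %s" % os.path.abspath(path))
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)["data"]
```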

@curtis0982
Author

curtis0982 commented Mar 28, 2020

Updated:
In my case, some code in finetune/qa/qa_tasks.py was not functioning properly:

```python
with tf.io.gfile.GFile(os.path.join(
    self.config.raw_data_dir(self.name),
    split + ("-debug" if self.config.debug else "") + ".json"), "r") as f:
```

I printed

```python
os.path.join(self.config.raw_data_dir(self.name),
             split + ("-debug" if self.config.debug else "") + ".json")
```

to see whether the path was right.

My SQuAD data directory is "D:\python_code\NLP\electra\datadir\finetuning_data\squad",
so the SQuAD training set path should be
"D:\python_code\NLP\electra\datadir\finetuning_data\squad\train.json".
But it was actually "squad\train.json".
So I changed

```python
self.config.raw_data_dir(self.name)
```

in that line to my data_dir path.
The program started making tfrecord files, so I think this issue has been solved.
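That workaround can be sketched like this (`squad_json_path` is a hypothetical helper mirroring the path construction described above; it assumes the SQuAD files live under `<data_dir>/finetuning_data/<task_name>/`): anchor the task directory at the absolute data_dir rather than relying on the relative path that `raw_data_dir` resolved to here:

```python
import os

def squad_json_path(data_dir, task_name="squad", split="train", debug=False):
    # Hypothetical helper: build the raw-data path from the absolute
    # data_dir instead of raw_data_dir(), which here resolved to just
    # "squad" and so produced the relative path "squad\train.json".
    raw_dir = os.path.join(data_dir, "finetuning_data", task_name)
    return os.path.join(raw_dir, split + ("-debug" if debug else "") + ".json")
```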
