This repository has been archived by the owner on Dec 16, 2022. It is now read-only.
We often need to experiment with different configurations. If AllenNLP could automatically perform several training tasks in turn, it would save us some time. Yesterday I added a subcommand 'group-train', which is a wrapper around the train command, and ran a simple test; I found it very useful.
Command execution process
I just used allennlp-as-a-library-example for testing.
Experiments dir: venue_classifier.json, venue_classifier_boe.json, venue_classifier_boe_adam
Input: allennlp group-train /home/wxy/PycharmProjects/allennlp-as-a-library/experiments -s ./group_save --include-package my_library
Process:
1. Before training begins, create a training_progress.json file that records each configuration file and whether its training has completed: {"venue_classifier_boe_adam": false, "venue_classifier.json": false, "venue_classifier_boe.json": false}
2. Use a for loop to train each JSON file in turn.
3. Update training_progress.json after each training task completes: {"venue_classifier_boe_adam": true, "venue_classifier.json": false, "venue_classifier_boe.json": false}
4. If training is interrupted at some point, it resumes at the first entry marked false in training_progress.json, and the remaining runs proceed normally.
5. After all training is completed, the serialization_dir contains three directories: venue_classifier, venue_classifier_boe, and venue_classifier_boe_adam. Their names correspond to the configuration files. training_progress.json then reads: {"venue_classifier_boe_adam": true, "venue_classifier.json": true, "venue_classifier_boe.json": true}
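The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual subcommand: `train_one` is a hypothetical stand-in for a single `allennlp train` run (the real wrapper would call AllenNLP's training entry point), and only the progress-file bookkeeping mirrors the description.

```python
import json
import os


def train_one(config_path, serialization_dir):
    """Hypothetical stand-in for one `allennlp train` run."""
    os.makedirs(serialization_dir, exist_ok=True)


def group_train(experiments_dir, serialization_dir):
    os.makedirs(serialization_dir, exist_ok=True)
    progress_path = os.path.join(serialization_dir, "training_progress.json")

    # Step 1: create (or reload) the progress file mapping each config
    # file name to a done/not-done flag.
    if os.path.exists(progress_path):
        with open(progress_path) as f:
            progress = json.load(f)
    else:
        progress = {name: False for name in sorted(os.listdir(experiments_dir))}
        with open(progress_path, "w") as f:
            json.dump(progress, f)

    # Steps 2-4: train each config in turn, resuming at the first False
    # entry if a previous invocation was interrupted.
    for name, done in progress.items():
        if done:
            continue  # finished before the interruption
        run_dir = os.path.join(serialization_dir, os.path.splitext(name)[0])
        train_one(os.path.join(experiments_dir, name), run_dir)
        progress[name] = True  # Step 3: mark this run completed
        with open(progress_path, "w") as f:
            json.dump(progress, f)
```

Because the progress file is rewritten after every run, re-running the same command after a crash naturally picks up where it left off.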
Additional context
1. Because I am not familiar with the test module, I tested my code with allennlp-as-a-library-example.
2. If one training run fails with an exception, the subsequent runs will not continue, so skipping the failed run would be a better choice. But I don't know how to do this.
3. I uploaded this code to my fork of the allennlp repository.
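For point 2, one possible approach (not part of the original code) is to wrap each run in try/except, record the failure in the progress file, and move on. A sketch, again using a hypothetical `train_one` callable in place of a real training run:

```python
import json
import os


def group_train_skip_errors(experiments_dir, serialization_dir, train_one):
    """Variant of the loop that skips, rather than aborts on, a failed run."""
    os.makedirs(serialization_dir, exist_ok=True)
    progress_path = os.path.join(serialization_dir, "training_progress.json")
    progress = {name: False for name in sorted(os.listdir(experiments_dir))}

    for name in progress:
        try:
            train_one(os.path.join(experiments_dir, name))
            progress[name] = True
        except Exception as exc:
            # Skip this config but keep a record of why it failed,
            # so the error is visible after the whole group finishes.
            progress[name] = f"error: {exc}"
        with open(progress_path, "w") as f:
            json.dump(progress, f)
    return progress
```

Storing the error message instead of a bare false also keeps the resume logic simple: only entries still equal to false need re-running.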
Do you have any suggestions for my code? @matt-gardner
I'm glad you found a good way to do this that works well for you! I think there are a lot of different ways to solve this basic problem, and we so far have taken the position that these kinds of scripts or commands should be outside the main library. Maybe some day we'll put something in here, but because there are so many different setups a person could have (a single GPU, multiple CPUs, multiple GPUs, some cloud infrastructure, a cluster with a job queue...), it's hard to make a general solution. I think the right thing to do is leave a link to your solution, like you did, so that others with a similar situation can use it if they find it useful.