What are the training settings for the track_1 and track_2 models? #4
Comments
Hi, we did not use curriculum learning or anything like this, although it might help a bit.
Hi,
On a P100 it should take a couple of days.
I used the following command to train a model on track_1:

    python3 arnold.py --exp_name track_1 --main_dump_path $PWD/dumped \
    --frame_skip 3 --action_combinations "attack+move_lr;turn_lr;move_fb" \
    --network_type "dqn_rnn" --recurrence "lstm" --n_rec_layers 1 --hist_size 4 --remember 1 \
    --labels_mapping "" --game_features "target,enemy" --bucket_size "[10, 1]" --dropout 0.5 \
    --speed "on" --crouch "off" --map_ids_test 1 --manual_control 1 \
    --scenario "deathmatch" --wad "deathmatch_rockets" --gpu_id 0

The training process runs up to 10119600 iterations, while the log shows that the best-performing model is best-120000.pth, achieving a frag score of 62. I also notice that the variance of the frag score is quite large between different iterations. Is that normal?
What is the K/D ratio? The number of frags isn't necessarily very relevant, because a bot can get quite a good number of frags by shooting randomly. Did you visualize the agent to have a look at how it behaves? Regarding the variance between different iterations, yes, this is normal. Variance should decrease if you increase the evaluation time, but the number of frags usually oscillates quite a lot (as opposed to the K/D, which is usually more stable).
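To make the frags-vs-K/D distinction concrete, here is a minimal sketch of computing the K/D ratio from per-evaluation kill and death counts. The helper name and the numbers are hypothetical, not part of Arnold's codebase or logs.

```python
def kd_ratio(kills, deaths):
    """Kill/death ratio; guard against division by zero for a deathless run."""
    return kills / max(deaths, 1)

# Hypothetical evaluation stats: frag counts can swing widely while
# the K/D ratio stays comparatively stable across evaluations.
evals = [
    {"frags": 62, "deaths": 40},
    {"frags": 35, "deaths": 23},
    {"frags": 58, "deaths": 37},
]
ratios = [kd_ratio(e["frags"], e["deaths"]) for e in evals]
print([round(r, 2) for r in ratios])  # → [1.55, 1.52, 1.57]
```

A random-firing bot can rack up frags while dying constantly, which is why the ratio is the more meaningful number here.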
Hello @flyers, did you manage to reproduce the pretrained model? I don't know if it is caused by my parameter settings, but after three days of training, my track_1 result is not very high: the K/D ratio is less than 1.
First of all thanks for releasing the source code for the successful training of Doom agents.
The pretrained models contain the winning model from last year's competition. May I know the exact training settings for those two models? Do they require curriculum learning stages?
Thanks very much.