
Reproducing the recent SL model featured in README.md #18

Closed
infphilo opened this issue Jan 2, 2022 · 12 comments

infphilo commented Jan 2, 2022

@liuruoze, let me first thank you for your effort to develop a practical version of AlphaStar!

I am really interested in reproducing your recent SL-based agent, which you featured in README.md via several video clips. I'd like to know which StarCraft II version (e.g., 3.16.1) and training settings (batch size and number of epochs) you used when building the SL model.

Thank you again for your amazing work. Hopefully I can contribute to it, too!

liuruoze commented Jan 4, 2022

Hi, thanks for your interest in this project.

The results shown in README.md can be reproduced with the scripts and code provided in this project. The hyperparameters are the same as those defined in the project.

To do so, you should use the main (master) branch, not the v1.05 release. The main branch will become the soon-to-be-published v1.06 version.

Note that the replays used for training were self-played by us on the Simple64 map. We will also provide them in a few days (no more than two weeks).

As soon as the data are available, we will notify you (by mentioning @infphilo) in this issue.

infphilo commented Jan 4, 2022

Thank you @liuruoze for the detailed response! I look forward to reproducing the model featured in the README with your replay files (and your code).

infphilo commented:

Hello @liuruoze,

I just wanted to check with you to see if you had a chance to provide the replay data set you generated.

Thank you!
@infphilo

liuruoze commented:

Hi @infphilo, sorry for the late reply.

These days I have been stuck fine-tuning the RL training of AlphaStar (it is very exhausting, and we have also added a new training mode combining multi-processing with multi-threading, which makes it harder to optimize). However, we will provide everything (the replays and an introduction to using them) no later than the Chinese Spring Festival (this is the final deadline, and we will not go beyond it again).

Sorry again for the delay, and thanks for your patience! I will @ you when everything is ready.

liuruoze commented:

Hi @infphilo, the replays are now published; you can find them at https://github.com/liuruoze/Useful-Big-Resources/blob/main/replays/simple64_replays.7z. Have a good time!

liuruoze commented:

The v_1.07 release is also published; you can find an introduction to running supervised learning with these replays at https://github.com/liuruoze/mini-AlphaStar/tree/v_1.07. Good luck!

infphilo commented:

@liuruoze Thank you for sharing the replays and additional information! I'll try to reproduce your excellent work and let you know how it goes. Have a wonderful Chinese Spring Festival!

infphilo commented:

Hello @liuruoze,

I hope you had a great Chinese Spring Festival!

Thanks to your help and hard work, I believe I was able to produce some results by training the supervised model on the replays you provided. I used the same hyperparameters (NUM_EPOCHS = 10, LEARNING_RATE = 1e-4, STEP_SIZE = 30, GAMMA = 0.2, WEIGHT_DECAY = 1e-5). My trained model seems to perform worse than what you showed in README.md, perhaps because I used a different version of mini-AlphaStar.
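
For reference, this is how I wired those constants, assuming a standard PyTorch Adam + StepLR setup; the placeholder network and dummy batch below are just for illustration, not mini-AlphaStar's actual code:

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import StepLR

NUM_EPOCHS = 10
LEARNING_RATE = 1e-4
STEP_SIZE = 30
GAMMA = 0.2
WEIGHT_DECAY = 1e-5

model = nn.Linear(128, 10)  # placeholder for the SL network
optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE,
                       weight_decay=WEIGHT_DECAY)
# Multiply the learning rate by GAMMA every STEP_SIZE scheduler steps.
scheduler = StepLR(optimizer, step_size=STEP_SIZE, gamma=GAMMA)

criterion = nn.CrossEntropyLoss()
for epoch in range(NUM_EPOCHS):
    # In the real run this loops over replay batches; a dummy batch here.
    inputs = torch.randn(32, 128)
    targets = torch.randint(0, 10, (32,))
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()
    scheduler.step()
```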

Nonetheless, I also trained the supervised model on 1,000 replays (SC2 version: 3.16.1, Protoss vs. Terran, MMR: 3,500, map: AbyssalReef), but the model doesn't seem to perform well. The actions, selected units, and targets generated by the SL model don't seem to make sense. These results may be partly due to overfitting.

I have two questions:

  • Would it be possible for you to provide the weights of the supervised model and/or the reinforcement learning model? That might help me figure out what I did incorrectly.
  • Could you also share some strategies for training the models and choosing hyperparameter settings?

Thank you again for your wonderful work and dedication to mAS.
@infphilo

liuruoze commented:

  1. Yes, of course. Could you please provide your email address? I will send the SL and RL models to you.
  2. Actually, training on AbyssalReef is considerably harder than on Simple64. However, I think your problem is more likely underfitting than overfitting. A larger batch size and a longer training time may alleviate the problem (see the sketch below).
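
For illustration, a minimal sketch of those two knobs; the dataset here is a random placeholder, and the values 64 and 30 are hypothetical examples, not tuned settings:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random placeholder standing in for the parsed replay features/labels.
replay_dataset = TensorDataset(torch.randn(1000, 128),
                               torch.randint(0, 10, (1000,)))

# Hypothetical underfitting fixes: a larger batch and more epochs.
loader = DataLoader(replay_dataset, batch_size=64, shuffle=True)
NUM_EPOCHS = 30  # e.g. increased from 10
```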

infphilo commented:

Thank you! My email address is infphilo@gmail.com.
Also, thank you for the suggestion; I'll give it a try.

liuruoze commented:

Hi @infphilo, we have sent you the email; please check it.

Another trick we have tried and found useful: increase WEIGHT_DECAY to a larger value. mini-AlphaStar has far fewer parameters than AlphaStar, and increasing the weight decay can improve the model's robustness to overfitting; see the sketch below.
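
As a sketch (the placeholder network mirrors the earlier example, and 1e-3 is only a hypothetical value to show the direction of the change, not a tuned setting):

```python
from torch import nn, optim

model = nn.Linear(128, 10)  # placeholder network, as in the earlier sketch
# Raising weight_decay strengthens Adam's L2-style regularization.
optimizer = optim.Adam(model.parameters(), lr=1e-4,
                       weight_decay=1e-3)  # hypothetical, up from 1e-5
```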

liuruoze commented Mar 2, 2022

Hi. It seems that there are no more questions. I will close this issue. If you find any new problems, please open a new issue to discuss them.

liuruoze closed this as completed Mar 2, 2022