How many states and actions does the author's Dou Dizhu design have, and what is the action reward for each state? #14

Open
peterwangx opened this issue Aug 13, 2020 · 3 comments

Comments

@peterwangx

How many states and actions does your Dou Dizhu design have, and what is the action reward for each state?

@qq456cvb
Owner

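The number of Dou Dizhu states is obtained by solving the "N choose K with duplicates" problem from combinatorics: take 17 cards out of 54, treating cards that differ only in suit as the same card; the next player then takes 17 from the remaining 37, and so on. There should be ready-made algorithms and solvers for this online. The number of actions is roughly 13,000+, and it is computed in utils. The reward is sparse: it is given only at the end of the game, +1 for a win and -1 otherwise.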
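As a rough illustration of the "N choose K with duplicates" count mentioned above, here is a minimal sketch. It is not the code from this repository's utils. The function name `count_hands` and the `caps` parameter are hypothetical, and it assumes the standard 54-card deck: 13 ranks with 4 copies each, plus two distinct jokers. It counts the suit-agnostic hands of a given size with a small dynamic program.

```python
def count_hands(hand_size, caps=None):
    """Count distinct suit-agnostic hands of `hand_size` cards.

    `caps[i]` is how many copies of rank i are still available.
    By default (an assumption): 13 ranks x 4 copies, plus 2 distinct jokers.
    """
    if caps is None:
        caps = [4] * 13 + [1, 1]

    # ways[t] = number of ways to pick t cards from the ranks processed so far
    ways = [0] * (hand_size + 1)
    ways[0] = 1
    for cap in caps:
        new_ways = [0] * (hand_size + 1)
        for total, n in enumerate(ways):
            if n == 0:
                continue
            # take 0..cap copies of this rank without exceeding hand_size
            for take in range(min(cap, hand_size - total) + 1):
                new_ways[total + take] += n
        ways = new_ways
    return ways[hand_size]


if __name__ == "__main__":
    print(count_hands(17))
```

For the second player, the same function can be called with `caps` set to the per-rank counts left among the remaining 37 cards, so that count depends on which hand the first player holds.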

@loserZhang

How well does this algorithm perform in the end? How does it compare with expert human players?

@qq456cvb
Owner

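We do not have large-scale distributed resources for training, and we cannot use an arena-pool style of training like DeepMind's StarCraft work, so expert human players are still stronger. The agent can, however, hold its own against average players.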
