Corresponding paper #1

Open
Jksnck opened this issue Jul 15, 2022 · 3 comments

@Jksnck

Jksnck commented Jul 15, 2022

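Hi author, is there a paper that corresponds to this code? I'd like to study it. I've only just started with Q-learning and am not very familiar with it yet.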

@sumizomechou
Owner

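This is just a small demo I put together when I was first learning Q-learning, applied to my own research area; it may not even be correct...
If you want to learn, I'd still recommend the tutorials by 莫烦 (Mofan), which are excellent: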
https://mofanpy.com/tutorials/machine-learning/reinforcement-learning/

@Jksnck
Author

Jksnck commented Jul 15, 2022

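OK, thank you very much. One more question: when you use Q-learning to solve an objective function that has constraints, how do you handle those constraints? (Hoping for an answer.)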

@sumizomechou
Owner

sumizomechou commented Jul 15, 2022

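The most critical part of Q-learning is the design of the reward function. My understanding is this: the reward function is a microscopic (per-step) expression of the objective function, and the objective function can ultimately be derived from the reward function. As for how the constraints on the objective show up in the reward function, the examples I've seen generally express them with a fairly large reward or penalty value. In a game, for instance, reaching the goal yields a large positive reward, while failing yields a negative reward. By analogy with a person playing a game: their mood gradually rises as they get close to winning, they are very happy when they win, and they feel down or angry when they lose.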
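To make the reward/penalty idea concrete, here is a minimal sketch in Python. It is not the code from this repository. It assumes a hypothetical 3x3 grid world in which reaching `GOAL` stands for meeting the objective and entering `FORBIDDEN` stands for violating a constraint. The grid layout, the reward values (+10, -10, -1), and the hyperparameters are all illustrative assumptions.

```python
import random

# Hypothetical 3x3 grid world (an assumption for illustration only).
ROWS, COLS = 3, 3
START, GOAL, FORBIDDEN = (0, 0), (2, 2), (1, 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    # Moves that would leave the grid keep the agent on the border cell.
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    nxt = (r, c)
    if nxt == GOAL:
        return nxt, 10.0, True    # large positive reward: objective reached
    if nxt == FORBIDDEN:
        return nxt, -10.0, True   # large penalty: constraint violated
    return nxt, -1.0, False       # small step cost: shorter paths score higher


def train(episodes=5000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))          # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            # Terminal transitions have no bootstrapped future value.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the learned greedy policy from START and record visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


q_table = train()
path = greedy_path(q_table)
print(path)
```

In this sketch the constraint never appears as an explicit rule. The agent only sees that entering the forbidden cell is very costly, so the learned greedy path goes around it. A penalty of this kind is a soft constraint: how strongly it is respected depends on how large the penalty is relative to the other rewards.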
