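Is there a paper that goes with this code? I would like to study it. I have only just started with Q-learning and am not very familiar with it yet.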
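This is just a small demo I wrote when I first started learning Q-learning, tied to my own research area, and it may not even be correct. If you want to learn, I would still recommend Mofan's tutorials, which are excellent: https://mofanpy.com/tutorials/machine-learning/reinforcement-learning/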
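OK, thank you. One more question: when you use Q-learning to optimize an objective function that has constraints, how do you handle those constraints? (I hope you can answer.)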
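The most critical part of Q-learning is the design of the reward function. My understanding is this: the reward function is the per-step (microscopic) expression of the objective function, and the objective function can ultimately be derived from the reward function. As for how the objective's constraints show up in the reward function, the examples I have seen generally express them with a relatively large reward or penalty value. In a game, for instance, reaching the goal earns a high positive reward, while failing earns a negative one. The analogy with a person playing a game is that your mood gradually rises as you get closer to winning, you are happy when you win, and you feel frustrated or angry when you lose.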
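As a rough illustration of the penalty idea above, here is a minimal sketch. It is not the code from this repository. The names `reward`, `q_update`, `GOAL_BONUS`, and `VIOLATION_PENALTY`, as well as the numeric values, are assumptions made up for this example. The per-step reward is the negative step cost, so summing rewards over an episode recovers the (negated) total cost, and a constraint violation simply adds a large negative term.

```python
# Hypothetical constants, chosen only for illustration.
GOAL_BONUS = 100.0          # large positive reward for reaching the goal
VIOLATION_PENALTY = -100.0  # large negative reward for breaking a constraint


def reward(step_cost, violates_constraint, reached_goal):
    """Per-step reward: negative cost, plus a large penalty or bonus."""
    r = -step_cost
    if violates_constraint:
        r += VIOLATION_PENALTY
    if reached_goal:
        r += GOAL_BONUS
    return r


def q_update(q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a dict keyed by (state, action)."""
    # Unseen (state, action) pairs are treated as having value 0.0.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (r + gamma * best_next - old)
    return q[(state, action)]


# Example: an action that violates a constraint gets a strongly negative value.
q_table = {}
r = reward(step_cost=1.0, violates_constraint=True, reached_goal=False)
q_update(q_table, state=0, action="left", r=r, next_state=1,
         actions=["left", "right"])
```

The penalty does not forbid the violating action outright. It only makes it unattractive, so its size needs to be large relative to the ordinary step costs for the learned policy to avoid it.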