What Is the Point of Using "n" in UCB Formula? #38

Closed
Parnia opened this issue Oct 1, 2018 · 1 comment

@Parnia

Parnia commented Oct 1, 2018

Hello,

In "ucb1.py" in the "rl" folder, which solves the bandit problem, what is the point of using n (the total number of plays) in the UCB formula, mean + np.sqrt(2*np.log(n) / nj)?
I tested the following two formulas (without n) instead, and both worked fine:
mean + np.sqrt(2 / nj)
and even
mean + (1 / nj)
I also tested them with different total numbers of plays, and the agents' final results were very similar.
I would be grateful if you could elaborate on the role of n in the formula.

Best,
Parnia
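
For reference, here is a minimal sketch comparing the three exploration bonuses from the question. It is not taken from ucb1.py; the function names (bonus_ucb1, bonus_sqrt, bonus_inverse) and the sample values of n and nj are assumptions chosen for illustration. It holds nj fixed, as if one arm had stopped being pulled, and lets the total play count n grow.

```python
import numpy as np

# Hypothetical helper names; only the three formulas come from the question.

def bonus_ucb1(n, nj):
    """Bonus with the log(n) term: sqrt(2 * log(n) / nj)."""
    return np.sqrt(2 * np.log(n) / nj)

def bonus_sqrt(nj):
    """Bonus without n: sqrt(2 / nj)."""
    return np.sqrt(2 / nj)

def bonus_inverse(nj):
    """Bonus without n: 1 / nj."""
    return 1 / nj

# Assumed scenario: an arm has been pulled nj = 10 times and is then ignored
# while the total number of plays n keeps increasing.
nj = 10
for n in (10, 100, 1000, 10000):
    print(n, bonus_ucb1(n, nj), bonus_sqrt(nj), bonus_inverse(nj))
```

In this sketch only the first bonus changes as n grows; the other two depend on nj alone, so they stay constant for an arm that is no longer being pulled. With log(n) in the numerator, the bonus of a neglected arm keeps rising slowly, so that arm is eventually tried again.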

@lazyprogrammer
Owner

Post your course-related questions on the course Q&A
