What Is the Point of Using "n" in UCB Formula? #38

@Parnia

Hello,

In "ucb1.py" in the "rl" folder, which solves the bandit problem, what is the point of using n (the total number of plays) in the UCB formula, mean + np.sqrt(2*np.log(n) / nj)?
I tested the following two formulas (without n) instead, and both worked fine:
mean + np.sqrt(2 / nj)
and even
mean + (1 / nj)
I also tested them with different total numbers of plays, and the agents' final results were very similar.
I would be grateful if you could elaborate on the role of n in the formula.
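
To make the comparison concrete, here is a minimal sketch of the three index variants side by side. The function names (ucb1_index, sqrt_index, inverse_index) and the sample values are hypothetical, chosen only for illustration; this is not the code from ucb1.py.

```python
import numpy as np


def ucb1_index(mean, n, nj):
    """Formula from the post: the bonus depends on total plays n and arm plays nj."""
    return mean + np.sqrt(2 * np.log(n) / nj)


def sqrt_index(mean, nj):
    """First variant without n: the bonus depends only on nj."""
    return mean + np.sqrt(2 / nj)


def inverse_index(mean, nj):
    """Second variant without n: the bonus shrinks as 1 / nj."""
    return mean + 1 / nj


# Hypothetical arm: sample mean 0.5 after nj = 10 pulls,
# evaluated at several values of the total play count n.
for n in (100, 10_000, 1_000_000):
    print(n, ucb1_index(0.5, n, 10), sqrt_index(0.5, 10), inverse_index(0.5, 10))
```

Only the first function takes n as an argument: for a fixed nj its bonus keeps growing (slowly, through the logarithm) as n increases, while the other two return the same value no matter how many total plays have happened.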

Best,
Parnia
