Smart choices on important options #2

Open
mchsyu opened this issue May 15, 2018 · 1 comment

@mchsyu

mchsyu commented May 15, 2018

In subsection A.2 of the appendix ("importance features"), the paper states that, based on Table 3, Harmonica concluded that the initial learning rate for the small network and for the large network is in the range from 0.001 to 0.1. (See stage 1-3: 04. Initial learning rate * 05. Initial learning rate (Detail 1).)

My question is: how does this conclusion follow from the 4th, 5th and 6th options (Initial learning rate)? For example, if "-1" stands for "T", then (x_4, x_5, x_6) = (-1, -1, -1) means the initial learning rate is 0.3. Am I reading this correctly?

If so, then since Table 3 suggests that x_4 * x_5 is important, I would expect to get one of the ranges >= 0.1, [0.01, 0.1], [0.001, 0.01], or <= 0.001.

The paper does not seem to settle on any of them.

@callowbird
Owner

Thanks. In Table 3, feature 1-5 is X4, which has a negative weight, showing that we want X4 = 1, i.e., learning rate < 0.01.
Then feature 1-3 is X4X5, with a positive weight, showing that we want X4X5 = -1, so X5 = -1. That is, [0.001, 0.01].

This kind of inference is not needed in the algorithm itself, since we simply enumerate all possible assignments of the selected hyperparameters.
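
To make the sign argument and the enumeration concrete, here is a minimal sketch in Python. It is not taken from the Harmonica code. It assumes the fitted sparse polynomial is being minimized, and the weight values are made-up placeholders: only their signs (negative for X4, positive for X4X5) follow the reading of Table 3 above. The names `surrogate` and `best_assignment` are hypothetical.

```python
from itertools import product

# Placeholder weights (NOT the values from Table 3); only the signs
# match the discussion: X4 is negative, X4*X5 is positive.
weights = {("x4",): -1.0, ("x4", "x5"): 0.5}


def surrogate(assign, weights):
    """Evaluate the sparse polynomial: sum of weight * product of variables."""
    total = 0.0
    for monomial, w in weights.items():
        term = w
        for name in monomial:
            term *= assign[name]
        total += term
    return total


def best_assignment(variables, weights):
    """Try every +/-1 assignment and keep the one with the lowest value."""
    best = None
    for values in product((-1, 1), repeat=len(variables)):
        assign = dict(zip(variables, values))
        score = surrogate(assign, weights)
        if best is None or score < best[0]:
            best = (score, assign)
    return best


score, assign = best_assignment(["x4", "x5"], weights)
print(assign, score)
```

With these placeholder weights the enumeration picks X4 = 1 and X5 = -1, the same assignment as the manual sign argument, without any case-by-case reasoning.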
