Issue in visualizer.py when config_on_hypercube=True #59

Closed
ndangtt opened this issue Nov 21, 2017 · 6 comments

Comments

@ndangtt

ndangtt commented Nov 21, 2017

In visualizer.py:

  • In function generate_marginal, for continuous parameters, I think "grid" (line 187) should be built within [0,1] when "config_on_hypercube=True", before it is passed to "fanova.marginal_mean_variance_for_values", since fanova is trained on data points scaled to [0,1] in this case.

  • The same issue occurs in function generate_pairwise_marginal.

@Krxsy

Krxsy commented Nov 22, 2017

@ndangtt I didn't encounter any issue when I tested it.
Could you send me a minimal working example, please?

@ndangtt
Author

ndangtt commented Nov 22, 2017

test.zip

Please find the example in the attachment. I called fanova through the PIMP package. You can find detailed information in test/Readme.txt :)

@AndreBiedenkapp

Hi Nguyen,
I encountered the same problem when incorporating the latest fANOVA changes into PIMP.
For that reason I opened #56. When I realized that the fANOVA issues were caused only by the parameter input being squeezed onto a unit hypercube, I changed the PIMP fANOVA preprocessing code so that the data fANOVA sees are no longer on a unit hypercube.

The relevant changes to PIMP are only in the following function:
https://github.com/automl/ParameterImportance/blob/466918ac7ddb8631ba4d3a778d55155db4c7a36f/pimp/evaluator/fanova.py#L43-L69

Other than that, @Krxsy changed how the grid is generated. It is first generated on [0, 1], and then, for plotting purposes, all values are transformed back into the original parameter ranges. After that she closed #56.

If these latest changes still cause you trouble, let me know and we'll see what is going wrong.
Best,
André
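
The grid handling described in this comment (generate on [0, 1], predict, then transform back for plotting) can be sketched roughly as below. This is only an illustration with NumPy, not the actual visualizer.py code: `unit_to_original`, `marginal_on_unit_grid`, and the stand-in `toy_marginal` are hypothetical names, and a linear (non-log) continuous parameter is assumed.

```python
import numpy as np


def unit_to_original(u, lower, upper):
    """Map values from [0, 1] back to [lower, upper] (linear scale assumed)."""
    return lower + np.asarray(u, dtype=float) * (upper - lower)


def marginal_on_unit_grid(marginal_fn, lower, upper, resolution=5):
    """Evaluate a marginal on a [0, 1] grid and return plot-ready x values."""
    # 1) Build the grid on [0, 1], the scale the model was trained on.
    unit_grid = np.linspace(0.0, 1.0, resolution)
    # 2) Query the model on the unit-scale grid.
    means = np.array([marginal_fn(u) for u in unit_grid])
    # 3) Only now map the grid back to the original range for the x-axis.
    plot_grid = unit_to_original(unit_grid, lower, upper)
    return plot_grid, means


def toy_marginal(u):
    """Hypothetical stand-in for a model that only understands [0, 1] inputs."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("input must lie on the unit interval")
    return (u - 0.5) ** 2


plot_grid, means = marginal_on_unit_grid(toy_marginal, lower=10.0, upper=50.0)
```

If step 3 ran before step 2, the stand-in model would be queried with values between 10 and 50 and would reject them, which mirrors the mismatch discussed in this issue.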

@ndangtt
Author

ndangtt commented Nov 22, 2017

Hi André (and Christina),

Thank you! You're right. I missed the latest change in that function. I've also realized that since I didn't use instance features, the _preprocess function was not called. Now the plot is generated!

However, I am a little bit confused now. Why do you no longer pass [0,1] data to fanova? If I understand correctly, other tools in PIMP use the model trained on [0,1] data, so wouldn't it be inconsistent if the fanova analysis uses the non-normalized data?

I saw that in PIMP you now set config_on_hypercube=False, so the code for config_on_hypercube=True inside visualizer.py is actually not triggered. But suppose that config_on_hypercube=True and you give [0,1] data to fanova. Then in visualizer.py -> generate_marginal, I guess that the grid should first be generated in the range [0,1] instead of [lower_bound,upper_bound], and after the mean and std are calculated, the grid is transformed back to the original range for plotting. Or perhaps I misunderstood it?

Kind regards,
Nguyen

@Krxsy

Krxsy commented Nov 22, 2017

No problem. Thanks for letting us know :-)

@Krxsy Krxsy closed this as completed Nov 22, 2017
@AndreBiedenkapp

Hi @ndangtt,
internally, fANOVA uses the same ConfigSpace object as PIMP. That means that everything still ends up on [0, 1] when the X and y matrices are read in and, ultimately, when the marginals are computed.
So putting everything into fANOVA already on [0, 1] just meant that a second transformation step was not needed. However, this caused issues with how the fANOVA code handles the plotting.
The bug originally showed up because the grid for prediction was not on [0, 1], as the retransformation step was executed too early. This left the plots empty, since the prediction was outside the plotted area.

PIMP no longer passes already-transformed data so that the internal flow of fANOVA does not have to change. Since PIMP uses the same ConfigSpace object as fANOVA, the transformation and retransformation yield the correct values. It is now just more straightforward to plot the results in fANOVA.

Best,
André
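
As a rough illustration of why the transformation and retransformation give back the correct values when both sides share the same bounds, here is a small NumPy sketch. The helpers `to_unit` and `from_unit` and the bounds are hypothetical; they are not ConfigSpace or fANOVA functions, and a linear parameter scale is assumed.

```python
import numpy as np


def to_unit(x, lower, upper):
    """Scale values from [lower, upper] to [0, 1] (linear scale assumed)."""
    return (np.asarray(x, dtype=float) - lower) / (upper - lower)


def from_unit(u, lower, upper):
    """Inverse of to_unit: map [0, 1] values back to [lower, upper]."""
    return lower + np.asarray(u, dtype=float) * (upper - lower)


# Hypothetical bounds, standing in for those stored in the shared ConfigSpace.
bounds = (10.0, 50.0)
original = np.array([10.0, 18.0, 42.0, 50.0])

unit = to_unit(original, *bounds)      # what the model sees internally
restored = from_unit(unit, *bounds)    # what ends up on the plot axis
```

Because both steps use the same bounds, `restored` matches `original` up to floating-point error. A mismatch would only appear if the two steps used different parameter definitions.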


3 participants