Results not saved due to issues with get_digest #194
This is a good question, and it may be related to the problem in doc generation of PR #174.
I could not reproduce your problem:
and I obtain something like:
Maybe a MWE from my side:

```python
from moabb.analysis.results import get_digest

class Classifier:
    def __init__(self, kernel):
        self.kernel = kernel

still_working_1 = Classifier(kernel=lambda x: x**2)
still_working_2 = Classifier(kernel=lambda x: 5*x)

print()
for ppl in [still_working_1, still_working_2]:
    print(get_digest(ppl))
```

This will print (on my machine):
I know the issue is caused by wiping memory addresses during `get_string_rep`.
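For illustration, here is a minimal sketch of why removing memory addresses makes two objects that differ only by their lambdas collapse to the same string. The `strip_addresses` helper and its regex are hypothetical, written only to demonstrate the effect; they are not the actual implementation of `get_string_rep`:

```python
import re

class Classifier:
    def __init__(self, kernel):
        self.kernel = kernel

    def __repr__(self):
        # Mimics how scikit-learn estimators show their parameters: a lambda
        # appears as "<function <lambda> at 0x7f...>", i.e. only its memory
        # address distinguishes it from any other lambda.
        return f"Classifier(kernel={self.kernel!r})"

a = Classifier(kernel=lambda x: x**2)
b = Classifier(kernel=lambda x: 5*x)

def strip_addresses(text):
    # Replace hex memory addresses with a fixed token.
    return re.sub(r"0x[0-9a-fA-F]+", "0x__", text)

print(repr(a) == repr(b))                                    # False: addresses differ
print(strip_addresses(repr(a)) == strip_addresses(repr(b)))  # True: lambdas indistinguishable
```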
Yes, the …
Yes, I do have a custom LDA classifier inheriting from the scikit-learn estimators that shows the same behaviour. And I know that the issue comes from `get_string_rep`:

```python
from moabb.analysis.results import get_string_rep

class Classifier:
    def __init__(self, kernel):
        self.kernel = kernel

still_working_1 = Classifier(kernel=lambda x: x**2)
still_working_2 = Classifier(kernel=lambda x: 5*x)

for ppl in [still_working_1, still_working_2]:
    print(repr(ppl))
    print(get_string_rep(ppl))
```

My points are:
Interesting, I'm wondering if the lambda expression isn't playing a role here. Anyway, I agree with you on the second point: MOABB should not pool the results of both classifiers. For the first point, I'm not sure about your concern. There is a … Do you have an idea on how we could avoid the pooling part?
I do agree that …
Not really for now, as I usually avoid using cached results whenever possible and am not aware of the specific behaviour. But we could raise an error / a warning whenever a classifier that contains lambda functions in its constructor is used with …
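As a rough sketch of what such a check could look like (the `warn_if_lambda_params` helper below is hypothetical and assumes a scikit-learn-style `get_params()` interface; where moabb would actually call it is left open):

```python
import types
import warnings

def warn_if_lambda_params(estimator):
    """Warn when a constructor parameter of an estimator is a lambda.

    Digests of such estimators may collide with digests of estimators that
    differ only by their lambda, so cached results could be pooled wrongly.
    """
    for name, value in estimator.get_params(deep=True).items():
        if isinstance(value, types.LambdaType) and value.__name__ == "<lambda>":
            warnings.warn(
                f"Parameter '{name}' of {type(estimator).__name__} is a lambda; "
                "results caching may not distinguish it from other pipelines.",
                UserWarning,
            )
```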
Yes, we could raise a warning if there is a lambda function. Could it be interesting to use GridSearchCV for evaluating your different kernels instead of creating separate pipelines? Something along the lines of:
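The snippet originally posted here is not shown above. As a hedged reconstruction of the idea, a single pipeline with a kernel grid could be registered instead of one pipeline per kernel; the SVC pipeline, the parameter names, and the dictionary key below are illustrative, not taken from the original comment:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# One pipeline whose kernel is selected by cross-validation, instead of
# several nearly identical pipelines, so only one entry has to be hashed
# and stored in the results file.
pipeline = Pipeline([("svc", SVC())])
param_grid = {"svc__kernel": ["linear", "rbf", "poly"]}

pipelines = {
    "svc_kernel_search": GridSearchCV(pipeline, param_grid=param_grid, cv=3),
}
```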
That would be possible if I only wanted to report the performance of the best-performing kernel. But this way I would not know which kernel actually worked best, and could not compare the classification performance of two different kernels.
The `get_digest` function yields duplicate hash values even if two classifier objects have different internal state via different lambda functions (see moabb/moabb/analysis/results.py, line 111 at fd0ffe6).

Short example:

As of now, after running moabb the results file / dataframe would contain 2 scores for `pipeline_b`, as `get_digest` returns the same hash for both pipelines. (The memory addresses identifying the different lambda functions are wiped, but I do not really know why.) Is there a reason moabb does not simply use the pipeline description string? Or just the hash thereof? I.e.:
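For illustration, a minimal sketch of the suggested alternative, i.e. hashing the user-facing pipeline name/description string instead of the estimator's repr (the `digest_from_name` helper is hypothetical, not moabb code):

```python
import hashlib

def digest_from_name(pipeline_name):
    # Hash the pipeline description string chosen by the user, so two pipelines
    # that the user named differently can never collide in the results file.
    return hashlib.sha1(pipeline_name.encode("utf-8")).hexdigest()

print(digest_from_name("pipeline_a"))
print(digest_from_name("pipeline_b"))  # different digest, so scores are not pooled
```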