
[BUG]: Target transformation with reversible transformers leads to faulty scoring #236

Closed
samihamdan opened this issue Sep 7, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@samihamdan (Collaborator)

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Using z-scoring on the target leads to wrong scores: most likely, we evaluate the (correctly) inverse-transformed predictions against a z-scored ground truth. You can see this because r2_corr looks fine while r2 shows a large error, as r2 is scale-sensitive.
See the screenshot under Steps To Reproduce.
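
To illustrate the symptom, here is a minimal sketch with synthetic data (not julearn code): scoring original-scale predictions against a z-scored ground truth wrecks R², while the Pearson correlation is unaffected because it is scale-invariant.

```python
# Hypothetical illustration with synthetic data (not julearn code):
# scoring original-scale predictions against a z-scored ground truth.
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import r2_score

rng = np.random.default_rng(42)
y_true = rng.normal(loc=50.0, scale=10.0, size=200)   # original scale
y_pred = y_true + rng.normal(scale=2.0, size=200)     # good predictions

# The mismatch: ground truth z-scored, predictions inverse-transformed.
y_true_z = (y_true - y_true.mean()) / y_true.std()

print(r2_score(y_true, y_pred))       # close to 1: correct pairing
print(r2_score(y_true_z, y_pred))     # hugely negative: scale mismatch
print(pearsonr(y_true_z, y_pred)[0])  # equals pearsonr(y_true, y_pred)
```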

Expected Behavior

When the target transformer is invertible, scoring should be done against the original ground truth.
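
For comparison, a minimal sketch of how scikit-learn's own TransformedTargetRegressor behaves (synthetic data): predict() inverse-transforms the predictions, so any scorer compares them against the original-scale ground truth.

```python
# Minimal sketch: scikit-learn's TransformedTargetRegressor inverse-transforms
# predictions, so scoring happens in the original target space.
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 50.0 + X @ np.array([10.0, -5.0, 2.0]) + rng.normal(size=200)

model = TransformedTargetRegressor(regressor=Ridge(),
                                   transformer=StandardScaler())
model.fit(X, y)
y_pred = model.predict(X)   # already back in the original scale of y
print(r2_score(y, y_pred))  # scored against the original ground truth
```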

Steps To Reproduce

[screenshot: see the original issue]

Environment

anyio==4.0.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.2.3
asttokens==2.4.0
async-lru==2.0.4
attrs==23.1.0
Babel==2.12.1
backcall==0.2.0
beautifulsoup4==4.12.2
bleach==6.0.0
certifi==2023.7.22
cffi==1.15.1
charset-normalizer==3.2.0
comm==0.1.4
contourpy==1.1.0
cycler==0.11.0
debugpy==1.6.7.post1
decorator==5.1.1
defusedxml==0.7.1
executing==1.2.0
fastjsonschema==2.18.0
fonttools==4.42.1
fqdn==1.5.1
idna==3.4
ipykernel==6.25.2
ipython==8.15.0
ipython-genutils==0.2.0
ipywidgets==8.1.0
isoduration==20.11.0
jedi==0.19.0
Jinja2==3.1.2
joblib==1.3.2
json5==0.9.14
jsonpointer==2.4
jsonschema==4.19.0
jsonschema-specifications==2023.7.1
julearn==0.3.0
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.7.0
jupyter-lsp==2.2.0
jupyter_client==8.3.1
jupyter_core==5.3.1
jupyter_server==2.7.3
jupyter_server_terminals==0.4.4
jupyterlab==4.0.5
jupyterlab-pygments==0.2.2
jupyterlab-widgets==3.0.8
jupyterlab_server==2.24.0
kiwisolver==1.4.5
MarkupSafe==2.1.3
matplotlib==3.7.2
matplotlib-inline==0.1.6
mistune==3.0.1
nbclient==0.8.0
nbconvert==7.8.0
nbformat==5.9.2
nest-asyncio==1.5.7
notebook==7.0.3
notebook_shim==0.2.3
numpy==1.25.2
overrides==7.4.0
packaging==23.1
pandas==2.0.3
pandocfilters==1.5.0
parso==0.8.3
patsy==0.5.3
pexpect==4.8.0
pickleshare==0.7.5
Pillow==10.0.0
platformdirs==3.10.0
prometheus-client==0.17.1
prompt-toolkit==3.0.39
psutil==5.9.5
ptyprocess==0.7.0
pure-eval==0.2.2
pycparser==2.21
Pygments==2.16.1
pyparsing==3.0.9
python-dateutil==2.8.2
python-json-logger==2.0.7
pytz==2023.3.post1
PyYAML==6.0.1
pyzmq==25.1.1
qtconsole==5.4.4
QtPy==2.4.0
referencing==0.30.2
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.10.2
scikit-learn==1.3.0
scipy==1.11.2
seaborn==0.12.2
Send2Trash==1.8.2
six==1.16.0
sniffio==1.3.0
soupsieve==2.5
stack-data==0.6.2
statsmodels==0.14.0
terminado==0.17.1
threadpoolctl==3.2.0
tinycss2==1.2.1
tornado==6.3.3
traitlets==5.9.0
tzdata==2023.3
uri-template==1.3.0
urllib3==2.0.4
wcwidth==0.2.6
webcolors==1.13
webencodings==0.5.1
websocket-client==1.6.2
widgetsnbextension==4.0.8

Relevant log output

No response

Anything else?

No response

samihamdan added the bug (Something isn't working) label on Sep 7, 2023
@fraimondo (Contributor)

Here's where the Extended Scorer transforms the y (always):

```python
y_true = (
    estimator
    .steps[-1][-1]  # last est
    .transform_target(X_trans, y)
)
```

This is where the scorers are "wrapped" only if the extend parameter is true:

```python
def _extend_scorer(scorer, extend):
    if extend:
        return _ExtendedScorer(scorer)
    return scorer
```

This is where check_scoring passes the wrap_score parameter as the extend argument to _extend_scorer:

```python
def check_scoring(
    estimator: EstimatorLike,
    scoring: Union[ScorerLike, str, Callable, List[str], None],
    wrap_score: bool
) -> Union[None, ScorerLike, Callable, Dict[str, ScorerLike]]:
    """Check the scoring.

    Parameters
    ----------
    estimator : EstimatorLike
        estimator to check the scoring for
    scoring : Union[ScorerLike, str, Callable]
        scoring to check
    wrap_score : bool
        Does the score need to be wrapped
        to handle non-inverse-transformable target pipelines.
    """
    if scoring is None:
        return scoring
    if isinstance(scoring, str):
        scoring = _extend_scorer(get_scorer(scoring), wrap_score)
    if callable(scoring):
        return _extend_scorer(
            sklearn_check_scoring(estimator, scoring=scoring),
            wrap_score,
        )
    if isinstance(scoring, list):
        scorer_names = typing.cast(List[str], scoring)
        scoring_dict = {
            score: _extend_scorer(get_scorer(score), wrap_score)
            for score in scorer_names
        }
        return _check_multimetric_scoring(  # type: ignore
            estimator, scoring_dict
        )
```
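
For context, a hypothetical call using only the names quoted above: with wrap_score=True, every scorer resolved here comes back wrapped in _ExtendedScorer, whether or not the target transformer is invertible.

```python
# Hypothetical usage of the quoted API: both scorers end up wrapped in
# _ExtendedScorer whenever wrap_score is True.
scorer = check_scoring(pipeline, "r2", wrap_score=True)
scorers = check_scoring(pipeline, ["r2", "r2_corr"], wrap_score=True)
```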

This is where check_scoring is called in run_cross_validation:

julearn/julearn/api.py, lines 348 to 350 at dba3071:

```python
scoring = check_scoring(pipeline, scoring,
                        wrap_score=wrap_score)
```

Here are the two lines that set wrap_score to True, based on the presence of a target transformer:

```python
wrap_score = expanded_models[-1]._added_target_transformer
```

```python
wrap_score = pipeline_creator._added_target_transformer
```

So we always use the extended scorer, even if the y transformer is reversible. And in this specific case, scikit-learn inverse-transforms y_pred back into the original space while julearn transforms y_true into the transformed space, comparing bananas with potatoes.
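
One possible direction for a fix, purely as a sketch (the helper name `can_inverse_transform` is made up for illustration and is not julearn's actual API): only move y_true into the transformed space when the target transformer is not invertible, since in the invertible case scikit-learn already brings y_pred back to the original space.

```python
# Sketch only: `can_inverse_transform` is a hypothetical helper.
last_step = estimator.steps[-1][-1]  # last est, as in the snippet above
if last_step.can_inverse_transform():
    # Reversible target transformer: scikit-learn inverse-transforms
    # y_pred, so score against the original ground truth.
    y_true = y
else:
    # One-way transformer: both sides must live in the transformed space.
    y_true = last_step.transform_target(X_trans, y)
```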

@harveybi commented Oct 4, 2023

I also want to report something I observed earlier: although I get wrongly scaled metrics when z-scoring the target, the Pearson correlation values are the same with and without z-scoring the target. Is that expected? When I z-score the target myself, the metrics always come out different. See also this example: https://chat.openai.com/share/f625997a-eb50-40af-9cbb-89d450cdb364
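
(Note: the matching correlation is expected. Pearson's r is invariant under positive affine transformations, and z-scoring is one: r(a*y + b, y_pred) = r(y, y_pred) for any a > 0. Scale-sensitive metrics such as R² or MSE do change when only one side is rescaled, which is exactly the symptom reported above.)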
