feature request: save Report.comparisons as JSON #4

Closed
PaulLerner opened this issue Jan 10, 2022 · 13 comments
Labels
enhancement New feature or request

Comments

@PaulLerner

Hi,

It’d be nice to be able to save a Report’s comparisons as a JSON file.
However, since it uses frozenset as keys, it is not JSON serializable.

Maybe you could add a method in https://github.com/AmenRa/ranx/blob/master/ranx/frozenset_dict.py to convert the _map to a JSON serializable dict, i.e. with str keys?
The str keys could be converted from the frozenset like: ', '.join(frozenset({'foo', 'bar'}))
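
A minimal sketch of that conversion, assuming the mapping is a plain dict keyed by frozensets of model names. `stringify_keys` and the sample data are hypothetical and not part of ranx, and the real `_map` layout may differ. The members are sorted before joining because set iteration order is arbitrary, so an unsorted join could give a different key from one run to the next.

```python
import json


def stringify_keys(mapping, sep=", "):
    """Return a copy of `mapping` whose frozenset keys are joined into strings."""
    # Sorting makes the resulting key deterministic regardless of set order.
    return {sep.join(sorted(key)): value for key, value in mapping.items()}


# Hypothetical comparison data keyed by an unordered pair of names.
comparisons = {frozenset({"foo", "bar"}): {"p_value": 0.5}}

serializable = stringify_keys(comparisons)
text = json.dumps(serializable)  # no TypeError: every key is now a str
```

Note that a name containing the separator itself would make the joined key ambiguous, so a different separator may be needed in that case.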

@AmenRa
Owner

AmenRa commented Jan 10, 2022

Hi, an export option for the Report class is already on my to-do list! :)

I will come back with a proposal so that we can discuss it before I implement the functionality.

@AmenRa AmenRa added the enhancement New feature or request label Jan 13, 2022
@AmenRa
Owner

AmenRa commented Jan 14, 2022

Hey, sorry for the delay.

This is my proposal for the Report.to_dict function (I can add a Report.save_as_json function for convenience too):

{
    # "metrics" and "model_names" let you read the report without
    # inspecting the rest of the JSON to discover which metrics were
    # used and which models were compared
    "metrics": ["metric_1", "metric_2", ...],
    "model_names": ["model_1", "model_2", ...],
    #
    "model_1": {
        "scores": {
            "metric_1": ...,
            "metric_2": ...,
            ...
        },
        "comparisons": {
            "model_2": {
                "metric_1": ...,  # p-value
                "metric_2": ...,  # p-value
                ...
            },
            ...
        },
        "win_tie_loss": {
            "model_2": {
                "W": ...,
                "T": ...,
                "L": ...,
            },
            ...
        },
    },
    ...
}
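
To illustrate how a consumer might read this layout, here is a small sketch. The report below is hypothetical: every number is a placeholder, and it assumes each model lists its comparisons against the other models symmetrically, which the proposal does not spell out. `p_values` is an illustrative helper, not a ranx function.

```python
import json

# Hypothetical report following the proposed layout; all values are placeholders.
report = {
    "metrics": ["metric_1"],
    "model_names": ["model_1", "model_2"],
    "model_1": {
        "scores": {"metric_1": 0.5},
        "comparisons": {"model_2": {"metric_1": 0.25}},
        "win_tie_loss": {"model_2": {"W": 3, "T": 1, "L": 2}},
    },
    "model_2": {
        "scores": {"metric_1": 0.75},
        "comparisons": {"model_1": {"metric_1": 0.25}},
        "win_tie_loss": {"model_1": {"W": 2, "T": 1, "L": 3}},
    },
}


def p_values(report, metric):
    """Collect {(model, other_model): p-value} for a single metric."""
    pairs = {}
    for model in report["model_names"]:
        for other, by_metric in report[model]["comparisons"].items():
            pairs[(model, other)] = by_metric[metric]
    return pairs


# Because every key is a string, the structure survives a JSON round trip.
restored = json.loads(json.dumps(report))
result = p_values(restored, "metric_1")
```

Having "metrics" and "model_names" at the top level is what lets the loop run without guessing which top-level keys are model entries.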

Let me know what you think. :)

@PaulLerner
Author

Looks great (and there was not so much delay 😅)!

@AmenRa
Owner

AmenRa commented Jan 14, 2022

I added Report.to_dict and Report.save.
I updated ranx on PyPI with these new features.

Closing.

@AmenRa AmenRa closed this as completed Jan 14, 2022
@PaulLerner
Author

I’m getting a "TypeError: Object of type int64 is not JSON serializable", which is probably coming from Numba or NumPy.

@AmenRa
Owner

AmenRa commented Jan 25, 2022

Yeah, I know about that issue. I will look into it soon.

As a workaround, you can call report.to_dict() and save the dictionary as JSON yourself, with the exact same code I wrote for the report.save function.

That issue is kinda weird.
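
One way to make such a manual save tolerant of NumPy scalars is the `default` hook of `json.dumps`, which is called for any object the encoder cannot handle. This is only a sketch under assumptions: it is not the code in report.save, `numpy_safe` is a hypothetical helper, `FakeInt64` stands in for `numpy.int64` so the example runs without NumPy, and where the int64 actually appears in the report is not stated in this thread.

```python
import json


def numpy_safe(obj):
    """Fallback for json.dumps: unwrap scalar-like objects through .item()."""
    item = getattr(obj, "item", None)
    if callable(item):
        return item()  # NumPy scalars return a native Python value here
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


class FakeInt64:
    """Hypothetical stand-in for numpy.int64, which also exposes .item()."""

    def __init__(self, value):
        self.value = value

    def item(self):
        return self.value


# Hypothetical fragment of what report.to_dict() might return.
report_dict = {"model_1": {"win_tie_loss": {"model_2": {"W": FakeInt64(3)}}}}

text = json.dumps(report_dict, default=numpy_safe, indent=4)
```

The same `default=numpy_safe` argument works with `json.dump` when writing straight to a file.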

@PaulLerner
Author

Don’t you need to convert int64 to int in to_dict?

@PaulLerner
Author

For example, in transformers they use:

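import numpy as np
from transformers import is_torch_available

# torch is only imported when it is installed
if is_torch_available():
    import torch

def denumpify_detensorize(metrics):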
    """
    Recursively calls `.item()` on the element of the dictionary passed
    """
    if isinstance(metrics, (list, tuple)):
        return type(metrics)(denumpify_detensorize(m) for m in metrics)
    elif isinstance(metrics, dict):
        return type(metrics)({k: denumpify_detensorize(v) for k, v in metrics.items()})
    elif isinstance(metrics, np.generic):
        return metrics.item()
    elif is_torch_available() and isinstance(metrics, torch.Tensor) and metrics.numel() == 1:
        return metrics.item()
    return metrics

PaulLerner added a commit to PaulLerner/ranx that referenced this issue Jan 26, 2022
@PaulLerner
Author

This commit fixes it, but you probably want to deal with it some other way? If not, I can open a PR: PaulLerner@7e2218d

@AmenRa
Owner

AmenRa commented Feb 1, 2022

I will look into it soon.

@AmenRa AmenRa reopened this Feb 1, 2022
@AmenRa
Owner

AmenRa commented Feb 2, 2022

Fixed in 0.1.10. Sorry for the inconvenience.

@AmenRa AmenRa closed this as completed Feb 2, 2022
@PaulLerner
Author

I’m still getting "TypeError: Object of type int64 is not JSON serializable".

@PaulLerner
Author

Oops, looks like I was on the wrong branch, sorry about that.
