New quality metrics overwrite old ones #3256

@jonahpearl

Hi all, I recently decided to add the nearest_neighbors quality metric to my pipeline, but when I computed just that metric, I was annoyed to find that it overwrote all the previously calculated metrics. This seems suboptimal: if a user computes slow quality metric X and then wants to add slow quality metric Y the next day, they have to re-compute X as well, or else manually edit the saved CSVs.

This behavior seems to be implemented here. Instead of creating a new df each time, why not check for an existing one and merge it with any newly computed metrics? I understand that overwriting is perhaps the better default, in case the user has curated the units or otherwise changed the pre-processing and wants to compute metrics de novo, but there could be a "keep_existing" kwarg or similar that lets the user choose not to overwrite what's already there.
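To make the suggestion concrete, here is a rough sketch of the merge I have in mind. It is not the library's actual code: `merge_metrics`, the `keep_existing` argument, and the column names `slow_metric_x` / `slow_metric_y` are all hypothetical, and I am assuming the metrics live in a pandas DataFrame indexed by unit id with one column per metric.

```python
import pandas as pd


def merge_metrics(existing, new, keep_existing=True):
    """Combine previously computed metrics with newly computed ones.

    Both frames are assumed to be indexed by unit id, one column per metric.
    `keep_existing` is the hypothetical kwarg proposed in this issue.
    """
    if existing is None or not keep_existing:
        # Overwrite: the new table simply replaces the old one.
        return new.copy()
    # Keep old columns that were not recomputed; any recomputed column
    # takes its values from `new`.
    untouched = existing.drop(columns=new.columns, errors="ignore")
    # Align on the new table's units, so units that no longer exist
    # (e.g. removed by curation) are dropped from the old columns.
    return untouched.reindex(new.index).join(new)


# Illustrative data only: metric X computed yesterday, metric Y today.
old = pd.DataFrame({"slow_metric_x": [5.1, 3.2]}, index=[0, 1])
new = pd.DataFrame({"slow_metric_y": [0.9, 0.7]}, index=[0, 1])

merged = merge_metrics(old, new)
print(list(merged.columns))
```

With `keep_existing=False` the function returns only the new table, which matches the overwrite behavior described above, so the existing default would not have to change.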

Thanks!

Labels: qualitymetrics (deprecated: use "metrics" instead; related to the qualitymetrics module)
