Use metadata to validate inputs to BinaryClassifierEfficacy precision and recall metrics #727

@frances-h

Problem Description

Currently, if a datetime column is passed to either of the BinaryClassifierEfficacy metrics without first being converted to a pandas datetime dtype, the metric raises an error, even when the metadata provides a datetime format for that column. In addition, the metric does not filter out columns that are absent from the metadata, and it does not warn when columns specified in the metadata are missing from the data.

Expected behavior

We should update the metrics so that we convert datetime columns to timestamps. If a datetime format is provided in the metadata, we should use it to convert the column. If a datetime format is not provided, we should still try to convert but show a warning that no datetime format was provided for that column.
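
The conversion described above could look roughly like the following. This is a minimal sketch, not the library's implementation: the helper name `convert_datetime_columns`, the wording of the warning, and the metadata layout (a `columns` dict whose entries carry `sdtype` and an optional `datetime_format`) are all assumptions made for illustration. It converts to a pandas datetime dtype rather than to numeric timestamps.

```python
import warnings

import pandas as pd


def convert_datetime_columns(data, metadata):
    """Return a copy of ``data`` with datetime columns converted.

    Hypothetical helper. ``metadata`` is assumed to look like
    {'columns': {'col': {'sdtype': 'datetime', 'datetime_format': '%Y-%m-%d'}}}.
    """
    data = data.copy()
    for name, column_meta in metadata['columns'].items():
        # Only touch datetime columns that are actually present in the data.
        if column_meta.get('sdtype') != 'datetime' or name not in data.columns:
            continue
        # Nothing to do if the column already has a datetime dtype.
        if pd.api.types.is_datetime64_any_dtype(data[name]):
            continue

        datetime_format = column_meta.get('datetime_format')
        if datetime_format is None:
            # Assumed warning text; the issue does not specify the wording.
            warnings.warn(
                f"No 'datetime_format' provided for column '{name}'; "
                'attempting to infer the format.',
                UserWarning,
            )

        # With format=None, pandas tries to infer the format itself.
        data[name] = pd.to_datetime(data[name], format=datetime_format)

    return data
```

The input frame is copied so the caller's data is left unchanged; whether the real metric should convert in place is a separate design decision.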

Additionally, when data is passed to the metric, any columns that are not in the metadata should be dropped before the computation proceeds, and a warning should be shown:
UserWarning: Some columns ('age', 'salary', 'capital-gains') are not in the metadata. These columns will not be included in this metric's computation.

Conversely, if the data is missing columns that are specified in the metadata, we should also show a warning:
UserWarning: Some columns ('age', 'salary', 'capital-gains') are in the metadata but they are not present in the data.
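
The two column checks above could be sketched as follows. The function name `validate_columns` and the metadata layout (a dict with a `columns` mapping) are assumptions for illustration; the warning messages follow the wording proposed in this issue.

```python
import warnings

import pandas as pd


def validate_columns(data, metadata):
    """Drop columns missing from the metadata and warn about mismatches.

    Hypothetical helper. ``metadata`` is assumed to be a dict with a
    'columns' mapping of column name to column description.
    """
    metadata_columns = list(metadata['columns'])
    extra = [col for col in data.columns if col not in metadata['columns']]
    missing = [col for col in metadata_columns if col not in data.columns]

    if extra:
        names = ', '.join(f"'{col}'" for col in extra)
        warnings.warn(
            f'Some columns ({names}) are not in the metadata. '
            "These columns will not be included in this metric's computation.",
            UserWarning,
        )
        # drop() returns a new DataFrame; the caller's data is not modified.
        data = data.drop(columns=extra)

    if missing:
        names = ', '.join(f"'{col}'" for col in missing)
        warnings.warn(
            f'Some columns ({names}) are in the metadata '
            'but they are not present in the data.',
            UserWarning,
        )

    return data
```

In this sketch, missing columns only trigger a warning and the metric continues with whatever columns remain; the issue does not say whether some missing columns, such as the prediction target, should instead raise an error.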
