Understanding Pandas validation #60

schlich · 2022-07-12T17:28:23Z

Hello, apologies if this is the wrong place to ask this question.

I am stumped on how datatest's validation mechanism is passing the following example:

dt.validate(pd.DataFrame(), pd.DataFrame({"A": [1]}))

The documentation states:

For validation, DataFrame objects using the default index type are treated as sequences.

Shouldn't I be getting the same result as dt.validate([], [1])? What am I missing?


shawnbrown · 2022-07-13T03:43:49Z

Ah, thanks for posting this. Your confusion is entirely warranted: datatest should be raising an error in this case.

I will look to get a fix pushed out in the next couple of days. There are some logical corner cases that arise when comparing against empty containers (where it's not always obvious what error/difference should be raised), but this is clearly undesirable behavior.

In the meantime, if you are using datatest for something and need an immediate workaround, you can add a preceding check for column names. See below:

import pandas as pd
import datatest as dt

data = pd.DataFrame()
requirement = pd.DataFrame({"A": [1]})

dt.validate(data.columns, requirement.columns)  # <- Add this.
dt.validate(data, requirement)
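
For anyone who wants the same guard without relying on how datatest handles column indexes, here is a minimal plain-pandas sketch of the idea. Note that `check_columns` is a hypothetical helper written for illustration; it is not part of datatest, and it only checks that every required column is present before the real validation runs.

```python
import pandas as pd


def check_columns(data, requirement):
    """Raise ValueError if `data` lacks a column that `requirement` has.

    Hypothetical helper for illustration; not part of datatest.
    """
    missing = [col for col in requirement.columns if col not in data.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")


data = pd.DataFrame()
requirement = pd.DataFrame({"A": [1]})

try:
    check_columns(data, requirement)  # the empty frame has no "A" column
except ValueError as err:
    message = str(err)  # keep the text so it can be logged or re-raised
```

Calling `check_columns(data, requirement)` just before `dt.validate(data, requirement)` would catch the empty-DataFrame case described above.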
