Validation does not check column types #43

Open
Pennycook opened this issue Apr 25, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@Pennycook
Contributor

Expected behavior

We should either:

  • Check that columns which we expect to be numeric (like fom) are actually numeric, and raise an exception if the condition does not hold; or
  • Try to cast columns which we expect to be numeric to numeric (with to_numeric) and raise an exception if the conversion fails.
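The two options above could look roughly like the following. This is only a sketch, assuming the data lives in a pandas DataFrame; the helper names `check_numeric` and `cast_numeric` are hypothetical and not part of the project.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_numeric(df: pd.DataFrame, columns: list[str]) -> None:
    """Option 1 (hypothetical helper): raise if an expected-numeric column is not numeric."""
    for col in columns:
        if not is_numeric_dtype(df[col]):
            raise TypeError(f"column '{col}' must be numeric, got dtype {df[col].dtype}")


def cast_numeric(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Option 2 (hypothetical helper): cast columns, raising if a value cannot be converted."""
    out = df.copy()
    for col in columns:
        # errors="raise" makes to_numeric raise ValueError on unparseable input.
        out[col] = pd.to_numeric(out[col], errors="raise")
    return out
```

Option 1 rejects string-typed input outright, while option 2 accepts it as long as every value parses; which is preferable depends on how strict the loader should be.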

Actual behavior

Loading data from an outside source (like JSON) that stores FOM information as strings can lead to strange results, because:

  • The maximum FOM is selected by lexicographic (string) comparison rather than by numeric value.
  • Efficiency is calculated relative to this incorrect maximum, resulting in efficiencies outside of [0, 1].
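The two effects above can be illustrated in plain Python. This is not the project's code; the sample values are made up, and the explicit `float()` calls are an assumption added only to make the arithmetic visible.

```python
# FOM values as they might arrive from a JSON file that stores them as strings.
foms = ["9.5", "10.0", "100.0"]

# max() on strings compares character by character, so "9.5" wins because "9" > "1".
best_as_string = max(foms)
bad_efficiencies = [float(f) / float(best_as_string) for f in foms]

# After converting to numbers, the maximum and the efficiencies behave as expected.
numeric_foms = [float(f) for f in foms]
best_as_number = max(numeric_foms)
good_efficiencies = [f / best_as_number for f in numeric_foms]
```

Here `best_as_string` is `"9.5"`, so the largest entry in `bad_efficiencies` is greater than 10, whereas every entry in `good_efficiencies` lies in [0, 1].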

Steps to reproduce the problem

Load data in which the FOM column holds string representations of the values instead of numeric ones.

Specifications

Tested with the tip of main.

@Pennycook Pennycook added the bug Something isn't working label Apr 25, 2024
@Pennycook
Contributor Author

After some more investigation, it's now clear that we validate that columns are convertible to numeric values (with _require_numeric) but don't actually perform the conversion.
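The difference between validating and converting can be sketched as follows. Both functions are hypothetical stand-ins written for illustration; they are not the actual implementation of `_require_numeric`, and a pandas DataFrame is assumed.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def validate_only(df: pd.DataFrame, column: str) -> None:
    """Hypothetical check: confirms the column is convertible, then discards the result."""
    pd.to_numeric(df[column])  # raises on bad input, but df is left unchanged


def validate_and_convert(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Hypothetical fix: keep the converted values instead of throwing them away."""
    out = df.copy()
    out[column] = pd.to_numeric(out[column])
    return out


df = pd.DataFrame({"fom": ["9.5", "100.0"]})
validate_only(df, "fom")                      # passes, yet the column still holds strings
converted = validate_and_convert(df, "fom")   # the column is now numeric
```

After `validate_only`, later comparisons on `df["fom"]` are still string comparisons; only `converted` has a numeric column.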
