Check input data consistency #401

Open
bstabler opened this issue Mar 30, 2021 · 3 comments
Labels
Usability: Changes that improve usability

Comments

@bstabler
Contributor

| Idea | Level-of-effort | Notes | Priority |
| --- | --- | --- | --- |
| Check input data consistency | Days | Plan to check all the key relationships. Check primary key table joins across input tables: HH home zone vs. TAZ/MAZ land use file, MAZtoTAP file TAPs vs. TAP skims, TAZs in the land use file vs. TAZ skims, etc. | High |

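
To make the primary key join checks concrete, here is a minimal sketch in plain Python. The table and column names (`households`, `home_zone_id`, `land_use`, `zone_id`, `skim_zone_ids`) are hypothetical stand-ins, not the framework's actual names, and the helper `find_orphan_keys` is illustrative only.

```python
def find_orphan_keys(child_rows, child_key, parent_rows, parent_key):
    """Return the sorted child key values that have no match in the parent table."""
    parent_ids = {row[parent_key] for row in parent_rows}
    child_ids = {row[child_key] for row in child_rows}
    return sorted(child_ids - parent_ids)


# Toy input tables; all names and values are assumptions for illustration.
households = [
    {"household_id": 1, "home_zone_id": 101},
    {"household_id": 2, "home_zone_id": 102},
    {"household_id": 3, "home_zone_id": 999},  # zone not in the land use file
]
land_use = [{"zone_id": 101}, {"zone_id": 102}, {"zone_id": 103}]
skim_zone_ids = [101, 102]  # zones present in the (hypothetical) skim matrix

# Check 1: every household home zone exists in the land use file.
orphans = find_orphan_keys(households, "home_zone_id", land_use, "zone_id")

# Check 2: every land use zone exists in the skims.
missing_from_skims = sorted({row["zone_id"] for row in land_use} - set(skim_zone_ids))

print("Household zones missing from land use:", orphans)
print("Land use zones missing from skims:", missing_from_skims)
```

The same set-difference pattern would apply to the other relationships listed in the table, such as MAZtoTAP TAPs vs. TAP skims.
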
@bettinardi

I'm supportive of this, provided that it is a configurable, expandable, and customizable list of checks, similar to what ODOT has developed. I would not support a hard-coded set of checks.

@bstabler
Contributor Author

bstabler commented Mar 31, 2021

For reference, see what ODOT developed, https://github.com/RSGInc/SOABM/blob/master/template/inputChecker/config/inputs_checks.csv

I think we could do that. It will probably require the entire task budget, which I think would be OK since this is an important item. I think the solution would be both a user-configurable input validator using expressions and primary key / table join checks for the data structures already built into the framework. The expressions could be used to check input data consistency against what's required by the downstream submodel expressions; for example, that every household has an acceptable household.type value, say one in [1,2,3,4]. The codebook of valid input data values could be defined in a settings file such as input_validation.yaml and then made available to the input validator.

The reason for hard-wiring some of the checks, such as requiring that every household be in a zone and that the zone be in the skims, is that this relationship is already assumed in the code, so it should be checked. Put another way, why would a user change it? It doesn't make sense to change it, given that it's required by the code.
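
As a rough illustration of the codebook idea in this comment, the sketch below validates column values against a user-editable mapping. The `CODEBOOK` dict stands in for whatever a file such as input_validation.yaml might contain once parsed; its schema (table, then column, then allowed values), the function name `check_codebook`, and the toy data are all assumptions, not an existing format or API.

```python
# Hypothetical parsed contents of a settings file such as input_validation.yaml.
# The schema (table -> column -> list of allowed values) is an assumption.
CODEBOOK = {
    "households": {"type": [1, 2, 3, 4]},
}


def check_codebook(tables, codebook):
    """Return (table, column, row_index, value) for each value not in the codebook."""
    problems = []
    for table_name, columns in codebook.items():
        rows = tables.get(table_name, [])
        for column, allowed in columns.items():
            allowed_set = set(allowed)
            for i, row in enumerate(rows):
                value = row.get(column)  # a missing column yields None
                if value not in allowed_set:
                    problems.append((table_name, column, i, value))
    return problems


# Toy input data; values are made up for illustration.
tables = {
    "households": [
        {"household_id": 1, "type": 1},
        {"household_id": 2, "type": 7},  # not an acceptable type
        {"household_id": 3},             # type missing entirely
    ]
}

problems = check_codebook(tables, CODEBOOK)
for table_name, column, i, value in problems:
    print(f"{table_name}.{column}, row {i}: unexpected value {value!r}")
```

Because the rules live in data rather than code, a user could add tables, columns, or allowed values without touching the validator, while relationships the code itself assumes could still be checked separately.
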

@bettinardi

Well... Just to be clear on my thumbs up - it only applies to a non-hard coded solution. I give a grumpy face thumbs down to using money on hard coding this check when it could be lumped into a flexible / configurable option.

I would / could support (thumbs-up) a flexible solution that eats the whole existing budget, but only if the rest of the team agreed. In summary, my vote is likely to skip this item and knock off some other input issues and come back to addressing this with a scope specific to setting up a flexible/configurable option.

@jfdman added the Usability label on Dec 9, 2021
