
Doc perf #202

Draft
wants to merge 52 commits into base: dev

Conversation

zargot
Collaborator

@zargot zargot commented Jun 9, 2023

No description provided.

zargot and others added 30 commits April 17, 2023 20:48
- use the 'typer' package for proper param handling
- add optional outdir parameter
fixes #149 

The second part of the task, where valid tables should say "zero errors"
can be deferred to #74 & #160, which will be used in the validation tool
as well.
v1 table-attr mapping was naive and didn't account for multiple table
values in the version1Table field. Table/attr values are now being
matched with a set-intersection.

The previous fix/hack to compensate for this error is also being removed.
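
The set-intersection matching described above might look roughly like the sketch below. It is an illustration only, not the code from this PR: the helper names `parse_tables` and `matches_table`, the `;` separator, and the sample field value are all assumptions.

```python
from __future__ import annotations


def parse_tables(version1_table: str, sep: str = ";") -> set[str]:
    """Split a multi-valued version1Table field into a set of table names.

    The separator is an assumption; adjust it to the real field format.
    """
    return {name.strip() for name in version1_table.split(sep) if name.strip()}


def matches_table(version1_table: str, wanted: set[str]) -> bool:
    """Return True if any table listed in the field is one of the wanted tables."""
    return bool(parse_tables(version1_table) & wanted)


# A naive equality check fails on a field that lists several tables,
# while the set intersection still finds the match.
field = "WWMeasure; SiteMeasure"
print(field == "SiteMeasure")
print(matches_table(field, {"SiteMeasure"}))
```

Using a set on both sides keeps the check independent of the order and number of tables listed in the field.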
changes in schema v1.1.0:
- add CovidPublicHealthData table
- add SiteMeasure columns:
    - accessToAllOrg
    - accessToDetails
    - accessToLocalHA
    - accessToOtherProv
    - accessToPHAC
    - accessToProvHA
    - accessToPublic
    - accessToSelf
    - analysisDate
    - assayID
    - cphdID
    - date
    - dateType
    - fractionAnalyzed
    - index
    - reportDate
    - siteMeasureID
    - uWwMeasureID
- add WWMeasure columns:
    - accessToAllOrg
    - accessToDetails
    - accessToLocalHA
    - accessToOtherProv
    - accessToPHAC
    - accessToProvHA
    - accessToPublic
    - accessToSelf
    - assayID
    - cphdID
    - date
    - dateTime
    - dateType
    - siteMeasureID
fixes #161 

- fixes v1 table-attr mapping bug which happened due to naive
version1Table parsing
- updated schemas
fixes #116 

this is only done for v2, since v1 doesn't have any official boolean
parts right now
- add RuleId enum to rules module
- change rule id values to enum values where possible

The decision to put the enum in the rules module was made to make it
easier to make new (and keep track of old) rules, since it's all in the
same place. This does cause a minor issue where I have to use string
values instead of the enum in a few extra places to avoid circular
imports (since the rules module already depends on other modules), but
that should be worth it.
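
As a rough sketch of the trade-off described above: an enum whose values are plain strings lets modules that cannot import the rules module keep using the string form. The member names and the `rule_id_str` helper below are placeholders invented for illustration, not the actual rule ids or API in rules.py.

```python
from enum import Enum


class RuleId(Enum):
    """Placeholder rule ids; the real members are defined in the rules module."""

    example_rule_a = "example_rule_a"
    example_rule_b = "example_rule_b"


def rule_id_str(rule_id) -> str:
    """Accept either a RuleId member or its plain string value.

    Hypothetical helper: code that would cause a circular import by
    importing RuleId can pass the string value instead.
    """
    return rule_id.value if isinstance(rule_id, Enum) else str(rule_id)


print(rule_id_str(RuleId.example_rule_a))
print(rule_id_str("example_rule_a"))
```

Because each value equals its member name, the string used in the few modules that avoid the import stays easy to match against the enum.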
the set-op is unnecessary, and the delete op is bound to go out of sync
due to not being in reverse order
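
To illustrate why deletion order matters: removing positions in ascending order shifts every later element, so the remaining indices no longer point at the intended items. The sketch below is a generic example, with a hypothetical `delete_indices` helper, and is not taken from the code this commit touches.

```python
def delete_indices(items: list, indices) -> list:
    """Delete the given positions from a copy of items, highest index first.

    Going in reverse order means each deletion only shifts elements that
    have already been handled, so the remaining indices stay valid.
    """
    result = list(items)
    for i in sorted(set(indices), reverse=True):
        del result[i]
    return result


rows = ["a", "b", "c", "d"]
print(delete_indices(rows, [1, 2]))  # ['a', 'd']
```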
fixes #126 

- added code change to rule_primitives.py
- updated validation-rule assets to reflect it
- updated schemas with the new meta
fixes #48 

- added RuleId enum to rules.py and replaced most rule id strings with
it
- made a refactor, see commit msg
add function suffix to make it clear that this spec is about a function
the column/row counts will help the user see what was actually validated
- add summarize-report-function.qmd
- add summarize_report section to module-functions.qmd
fixes #74 

adds spec:
- summary-report.qmd
- summarize-report-function.qmd, with new section in module-functions.qmd
- summarize-tool.qmd
- validate-tool.qmd
a new summarize function & tool will replace this
- encapsulate rule counts in the overview
- rename _total to _all
reports.py:
- add table_info to validation report
- add error_kind helpers for summarize_report

rules.py:
- add _all rule for total counts in summary

summarization.py:
- add summarize_report function

validation.py:
- populate table_info for validation report

test_summarization.py:
- add tests for summarize_report
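
A minimal sketch of the per-rule counting idea with an `_all` total, under stated assumptions: the report entries are modelled here as dicts with a `rule_id` key, and `count_by_rule` is a hypothetical helper. The real `summarize_report` in summarization.py may be structured quite differently.

```python
from collections import Counter

ALL_KEY = "_all"  # key for the total across all rules (renamed from `_total`)


def count_by_rule(errors) -> dict:
    """Count report entries per rule id and add an `_all` total.

    `errors` is assumed to be an iterable of dicts with a "rule_id" key.
    """
    counts = Counter(entry["rule_id"] for entry in errors)
    counts[ALL_KEY] = sum(counts.values())
    return dict(counts)


sample = [{"rule_id": "a"}, {"rule_id": "a"}, {"rule_id": "b"}]
print(count_by_rule(sample))
```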
fixes #160 

see commits and their messages.
it will now support csv as well, and be more general
- assets:
    - update error messages
- reports:
    - add join_reports
    - rewrite error templates with multiple verbosity levels
- validation:
    - add verbosity level constant
- tests:
    - update error messages
- validate:
    - rewrite
zargot and others added 22 commits June 7, 2023 11:53
Implements the new spec for the validate tool.
 
See commit messages.
this is how the 'typer' package expects multiple values to be passed to
the same param (to form a list)
to conform with the python standard
tests:
- add a simple integration test for using the validate and summarize
  tools together with pipes

tools:
- factor out common functionality from validate.py to reportutils.py
- add summarize.py

summarization.py:
- use text for summarykey values so that it can be used as a CLI
  argument
see commit messages.