-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Covariates implementation and interface #209
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall.
Some comments/changes in the code.
Also, we need some usage examples.
@@ -11,4 +11,5 @@ twine==3.2.0 | |||
# varspark dependencies | |||
pandas==1.1.4 | |||
typedecorator==0.0.5 | |||
Jinja2==3.0.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also update python/requirements.txt
with this dependency
dev/py-test.sh
Outdated
@@ -11,4 +11,5 @@ cd "$FWDIR" | |||
pushd python | |||
pytest -s -m spark | |||
pytest -s -m hail | |||
pytest -s -m covariates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is not need to separate covariates into a new test category - they are in essence 'hail' tests.
Let's add them to hail tests.
dev/py-test.sh
Outdated
@@ -11,4 +11,5 @@ cd "$FWDIR" | |||
pushd python | |||
pytest -s -m spark | |||
pytest -s -m hail | |||
pytest -s -m covariates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is not need to separate covariates into a new test category - they are in essence 'hail' tests.
Let's add them to hail tests.
row_exprs=dict(), | ||
col_key=[], | ||
entry_exprs=dict(e=x)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc string should be moved to the top of the function (it's being used to generate automatic documentation).
Also the description of ':param covariates' needs to be updated.
row_exprs=dict(), | ||
col_key=[], | ||
entry_exprs=dict(e=x)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc string should be moved to the top of the function (it's being used to generate automatic documentation).
Also the description of ':param covariates' needs to be updated.
# self.assertTrue(top_five_ranking) | ||
|
||
if __name__ == '__main__': | ||
unittest.main(verbosity=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do just one test (on data/chr22_1000_full_pheno.csv
) and move it to test_hail.py
Note that setupClass
has special function in the test and should normally be only used to initialise resources common for all the tests.
} | ||
|
||
it.filter(tf => !tf.label.contains("cov__")) | ||
.map { tf => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract "cov__" to a static constants.
} | ||
|
||
it.filter(tf => !tf.label.contains("cov__")) | ||
.map { tf => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract "cov__" to a static constants.
val splitCount = brSplitCount.value | ||
|
||
it.filter { tf => tf.label.contains("cov__") } | ||
.map { tf => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking it should be tf.label.startsWith
val splitCount = brSplitCount.value | ||
|
||
it.filter { tf => tf.label.contains("cov__") } | ||
.map { tf => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly speaking it should be tf.label.startsWith
Implements the interface and usage by the model of the covariates.
Resolves #193.