-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feat] Deprecate nextclade json parser, add github actions, and update task names #2
Conversation
23e7e4d
to
48f0d4a
Compare
scripts/summary.py
Outdated
def make_wgs_horizon_output(results_df: pd.DataFrame, project_name: str) -> None: | ||
"""Make wgs horizon report.""" | ||
results_df["report_to_epi"] = "" | ||
results_df["Run_Date"] = str(date.today()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to be "Run_Date" instead of "run_date"?
ref_genome_length = len(ref_genome.seq) | ||
|
||
Ns = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added initialization to prevent mypy type errors. Also, good practice to declare it in the outer scope here since we use it outside the following if
statements.
num_non_ambiguous_bases = np.NaN | ||
coverage = np.NaN | ||
aligned_bases = np.NaN # type: ignore | ||
Ns = np.NaN # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how to handle the type issues where we assign some_variable = np.NaN
. I get an error from mypy error: Incompatible types in assignment (expression has type "float", variable has type "int")
.
This pull request now also partially addresses #3 where |
…:CDPHE-bioinformatics/RSV into feat/deprecate_nextclade_json_parser/dlp
@@ -17,156 +18,39 @@ workflow RSV_illumina_pe_summary { | |||
File concat_seq_results_py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be concatenate_seq_results_py or maybe should we shorten it everwhere?
@molly-hetheringtonrauth , I tested this in Terra and it works and results match what we expect. I don't think we need a individual meeting about these changes and can just merge them in if that works for you. I'll be making more changes after this pull request to add versioning and fix any Horizon issues anyway. |
This PR:
Aim, context, and functionality 🎯
We are removing the nextclade json parser in favor of including the Nextclade results in the summary file. We are also improving WDL task names and variables.
Workflow Changes ✅
Upstream effects
None.
Input changes
None.
Output changes
nextclade_json_parser.py
has been removed and the the output file it produced as well.summary.py
now includes the columns needed that were originally output bynextclade_json_parser.py
.Downstream effects
None yet as we haven't included RSV results in production datasets or Horizon.
Testing 🛠️
None at the moment.
Developer Checklist 👷♀️
Reviewer Checklist 🔍