ingest: Upload Nextclade TSV output
Keep a copy of the full Nextclade TSV output from ingest on S3
since we won't necessarily join all columns with the metadata output.
joverlee521 committed Jun 24, 2024
1 parent 10e9d40 commit f33157f
Showing 3 changed files with 20 additions and 0 deletions.
1 change: 1 addition & 0 deletions ingest/Snakefile
@@ -20,6 +20,7 @@ rule upload_all:
     input:
         sequences=expand("fauna/s3/sequences_{segment}.done", segment=config["segments"]),
         metadata="fauna/s3/metadata.done",
+        nextclade="fauna/s3/nextclade.done",

 include: "rules/ingest_fauna.smk"
 include: "rules/merge_segment_metadata.smk"
1 change: 1 addition & 0 deletions ingest/build-configs/ncbi/Snakefile
@@ -40,6 +40,7 @@ rule upload_all_ncbi:
         expand([
             "{data_source}/s3/sequences_{segment}.done",
             "{data_source}/s3/metadata.done",
+            "{data_source}/s3/nextclade.done",
         ], data_source=NCBI_DATA_SOURCES, segment=config["segments"]),

18 changes: 18 additions & 0 deletions ingest/rules/upload_to_s3.smk
@@ -37,3 +37,21 @@ rule upload_metadata:
             {params.s3_dst:q}/metadata.tsv.zst \
             {params.cloudfront_domain} 2>&1 | tee {output.flag}
         """
+
+
+rule upload_nextclade_tsv:
+    input:
+        nextclade="{data_source}/results/nextclade.tsv",
+    output:
+        flag="{data_source}/s3/nextclade.done",
+    params:
+        s3_dst=lambda wildcards: config["s3_dst"][wildcards.data_source],
+        cloudfront_domain=config.get("cloudfront_domain", ""),
+    shell:
+        """
+        ./vendored/upload-to-s3 \
+            --quiet \
+            {input.nextclade:q} \
+            {params.s3_dst:q}/nextclade.tsv.zst \
+            {params.cloudfront_domain} 2>&1 | tee {output.flag}
+        """
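
To make the wildcard handling in `upload_nextclade_tsv` easier to follow, here is a minimal Python sketch of how a requested flag file such as `ncbi/s3/nextclade.done` could map to the rule's input and S3 destination. It is not Snakemake's own resolution logic. The helper `resolve_nextclade_upload`, the `EXAMPLE_CONFIG` values, and the bucket names are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical config values for illustration only; the real "s3_dst" entries
# come from the workflow's config and are not shown in this commit.
EXAMPLE_CONFIG = {
    "s3_dst": {
        "fauna": "s3://example-bucket/fauna",
        "ncbi": "s3://example-bucket/ncbi",
    },
}

# Simplification: Snakemake wildcards can match across path separators by
# default; this sketch limits {data_source} to a single path segment.
FLAG_PATTERN = re.compile(r"^(?P<data_source>[^/]+)/s3/nextclade\.done$")


def resolve_nextclade_upload(target, config):
    """Return the input path, S3 destination, and CloudFront domain for a flag file."""
    match = FLAG_PATTERN.match(target)
    if match is None:
        raise ValueError(f"{target!r} is not a nextclade upload flag")
    data_source = match.group("data_source")
    return {
        # Mirrors input.nextclade in the rule.
        "input": f"{data_source}/results/nextclade.tsv",
        # Mirrors {params.s3_dst}/nextclade.tsv.zst in the shell command.
        "destination": f"{config['s3_dst'][data_source]}/nextclade.tsv.zst",
        # Mirrors config.get("cloudfront_domain", "") in the rule's params.
        "cloudfront_domain": config.get("cloudfront_domain", ""),
    }


if __name__ == "__main__":
    for flag in ("fauna/s3/nextclade.done", "ncbi/s3/nextclade.done"):
        print(flag, "->", resolve_nextclade_upload(flag, EXAMPLE_CONFIG))
```

Because `s3_dst` is looked up by the `data_source` wildcard, the one rule serves both the fauna and NCBI targets added to `upload_all` and `upload_all_ncbi`, provided each data source has an entry under `config["s3_dst"]`.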
