Conversation
This reverts commit 2f6edb8.
I have been unable to find any fasta files despite searching all of our samples. Am I searching incorrectly?
README.md
Outdated
@@ -232,6 +232,9 @@ Version numbers for this repo take the form X.Y.Z.
- We increase X for a paradigm shift in how the pipeline is conceived. Example: adding a de-novo assembly step and then reassigning hits based on the assembled contigs.
Changes to X or Y force recomputation of all results when a sample is rerun using idseq-web. Changes to Z do not force recomputation when the sample is rerun - the pipeline will lazily reuse existing outputs in AWS S3.

- 3.15.5
This should be 3.15.0, no?
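For reference, the recomputation rule quoted in the README hunk above can be sketched roughly as follows. This is only an illustration of the X.Y.Z rule as the README states it; `parse_version` and `forces_recompute` are hypothetical names, not functions from idseq-dag.

```python
def parse_version(version):
    """Split an 'X.Y.Z' string into a tuple of three ints."""
    x, y, z = (int(part) for part in version.split("."))
    return x, y, z


def forces_recompute(old, new):
    """Return True when X or Y differs between the two versions.

    Per the README text, only a change to X or Y forces recomputation;
    a change to Z alone lets the pipeline reuse existing outputs.
    """
    return parse_version(old)[:2] != parse_version(new)[:2]
```

Under this reading, going from 3.14.5 to either 3.15.0 or 3.15.5 changes Y and so forces recomputation, while a later 3.15.0 to 3.15.1 bump would not.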
Good work! Please fix the version number, and I made some requests for clarification.
Also, it's not clear how you tested this in RNA and DNA modes. Are both represented in examples/generic_test_dag.json?
idseq_dag/__init__.py
Outdated
@@ -1,2 +1,2 @@
''' idseq_dag '''
__version__ = "3.14.5"
__version__ = "3.15.5"
3.15.0
still hasn't changed
Thanks for catching this
…dag into rna-insert-metrics
Good comments but version number still hasn't fully changed.
idseq_dag/__init__.py
Outdated
@@ -1,2 +1,2 @@
''' idseq_dag '''
__version__ = "3.14.5"
__version__ = "3.15.5"
still hasn't changed
def list_s3_keys(s3_path_prefix):
    with botolock:
        rate_limit_boto()
        return _list_s3_keys(s3_path_prefix)
Sorry I am late to the party here, but I have a number of concerns about the use of boto3: it tends to implicitly use a global session object that cannot be shared between threads.
Is there a way to rewrite this without using boto3? If not, would you mind creating your own dedicated boto session for every client and paginator creation?
You could look, for instance, at https://stackoverflow.com/questions/52820971/is-boto3-client-thread-safe but that is by no means the only place where boto3's thread safety is questioned. The concern that the implicit global boto session should not be used from multiple threads is documented in the official boto docs. All code in s3.py is meant to be thread safe, and this example is even more important because the paginator is a long-lived object (a generator).
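To make the request concrete, here is a minimal sketch of what I have in mind, assuming boto3 stays. The names `_new_session` and `list_keys_with_own_session` are hypothetical and the signature differs from the one in s3.py; the point is only that each call builds its own session, client, and paginator instead of touching the implicit global session.

```python
def _new_session():
    # Imported lazily so this sketch can be exercised without boto3 installed.
    import boto3
    return boto3.session.Session()


def list_keys_with_own_session(bucket, prefix, session_factory=_new_session):
    """Yield every key under s3://bucket/prefix.

    A dedicated session is created on first iteration, so the client and
    paginator never depend on boto3's implicit default session.
    """
    session = session_factory()
    client = session.client("s3")
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix carry no "Contents" entry.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

Because this is a generator, the session is created in whichever thread first iterates it, and the generator should then be consumed entirely from that thread. Creating a session per call has a cost, so caching one session per thread may be a reasonable alternative.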
If you must use boto3, it seems to me that botolock and rate_limit_boto() are in the wrong place. They should protect lines 92 and 93 above, where the first boto3 operations occur, not this line. It's also unclear to me if the paginator
Description
This change computes insert size metrics for all paired-end DNA samples, and for all paired-end RNA samples provided we have a GTF file for the host genome.
Version
Tests