-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add example of raw->spacetx format conversion for ISS Breast Data #464
Conversation
Codecov Report
@@ Coverage Diff @@
## master #464 +/- ##
=========================================
Coverage ? 85.17%
=========================================
Files ? 73
Lines ? 3009
Branches ? 0
=========================================
Hits ? 2563
Misses ? 446
Partials ? 0
Continue to review full report at Codecov.
|
examples/get_iss_breast_data.py
Outdated
return dict(zip(range(4), hyb_str)) | ||
|
||
def get_tile(self, fov: int, hyb: int, ch: int, z: int) -> FetchedTile: | ||
filename = 'slideA_' + str(fov + 1) + '_' + \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
string concatenation is a huge mess and very hard to read. consider switching to f-style python strings.
examples/get_iss_breast_data.py
Outdated
|
||
@property | ||
def ch_dict(self): | ||
ch_str = ['Cy3 5', 'Cy3', 'Cy5', 'FITC'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why this is better than just a literal dict.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who cares.
|
||
@property | ||
def hyb_dict(self): | ||
hyb_str = ['1st', '2nd', '3rd', '4th'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ummmm, dict(enumerate(hyb_str))
?
examples/get_iss_breast_data.py
Outdated
elif self.aux_type == 'dots': | ||
filename = 'slideA_' + str(fov + 1) + '_DO_' + 'Cy3.TIF' | ||
else: | ||
msg = 'invalid aux type: {}'.format(self.aux_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahhh, string concatenation!
examples/get_iss_breast_data.py
Outdated
|
||
|
||
def format_data(input_dir, output_dir): | ||
if not input_dir.endswith("/"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should not be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lol i copied this from your code in get_iss.py. i can nuke it.
examples/get_iss_breast_data.py
Outdated
if not output_dir.endswith("/"): | ||
output_dir += "/" | ||
|
||
def add_codebook(experiment_json_doc): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmmm, i think we need a better way to build codebooks programmatically....
examples/get_iss_breast_data.py
Outdated
parser.add_argument("output_dir", type=FsExistsType()) | ||
|
||
args = parser.parse_args() | ||
s3_bucket = "s3://czi.starfish.data.public/browse/raw/20180820/iss_breast/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't get this. who's going to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ambrose is going to do this. josh is going to read this code. this is also an example to help collaborators understand how to convert their data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this message is confusing. i would just have the --help
output tell people where to get the data.
Rationale: the only time this message prints is if the command is running, and to do that, you kind of need the data already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Is it possible to generate a full experiment manifest from this? |
@ttung I think this is a WIP that I'm supposed to take over. I was looking through the code yesterday and your question was my main question: "how does one pass a directory with multiple fields of view and expect them to be parsed separately? Do I write A toy example of that would be very helpful, if you know how it's supposed to function. If that functionality is not supported, that would also be great to know, as I will try to add it. When I'm finished using this tool to upload the s3 data, I will add a README to it. |
@ttung what do you mean is it possible to generate a full experiment manifest from this. That is the entire point. Unless you have a different definition of experiment manifest than I do? The problem, is that we don't have a way of loading this full experiment manifest. That's why we need the multi-fov API ASAP! @ambrosejcarr parsing images that live in separate directories is easy; however, that's super unimportant to worry about now. regardless, the way you would handle it, is in TileFetcher.get_tile(...). You would implement the logic needed os.path.join self.input_dir to the right thing based on fov_num, hyb, ch, etc. a more interesting question, is what if you'd like build_experiment_json to write hybridization_fov_xxx.jsons to different directories. |
@ttung i addressed all ur comments in the PR. thx! |
@dganguli Ok, so if I wanted to use this tool as is, I should put all the existing images that need to be formatted into the same directory, and it will write a series of hybridization.json files into a single output directory? I was for some reason stuck on the prior that each FOV would live in its own directory and couldn't reconcile how this would work in that case. I think I'm straightened out. |
@ttung can u approve and merge? i'd like to get the corresponding merfish work done today as well. |
examples/get_iss_breast_data.py
Outdated
crp = img[40:1084, 20:1410] | ||
return crp | ||
|
||
def tile_data_handle(self) -> IO: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this is supposed to be a @property
. it worked before due to a fluke, but the new slicedimage api will require it.
you can make the change now or i can make it when i land the slicedimage change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix
examples/get_iss_breast_data.py
Outdated
parser.add_argument("output_dir", type=FsExistsType()) | ||
|
||
args = parser.parse_args() | ||
s3_bucket = "s3://czi.starfish.data.public/browse/raw/20180820/iss_breast/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this message is confusing. i would just have the --help
output tell people where to get the data.
Rationale: the only time this message prints is if the command is running, and to do that, you kind of need the data already.
This PR:
Introduces get_iss_breast_data.py, which converts raw data from s3://czi.starfish.data.public/browse/raw/20180820/iss_breast/ to spaceTx formatted data. This will be our first real multi-fov dataset.
re-names things. image -> tile, image_fetcher -> tile_fetcher
I tested this PR in a notebook locally. I downloaded the 1 FOV from he breast data formatter and ran the ISS breast notebook against it. It worked.
Remaining work for @ambrosejcarr :
Remaining work for @ttung :
Remaining work for Deep: