Download sample images #35

Closed
shntnu opened this issue Apr 1, 2020 · 17 comments

Comments

@shntnu
Collaborator

shntnu commented Apr 1, 2020

@jatinarora-upmc @sasgari
The notebook 1.profile-cell-lines/7.select_images_to_print.Rmd shows how to download sample images. Have a look and LMK if you have any questions.

@shntnu
Collaborator Author

shntnu commented Apr 1, 2020

I have uploaded sample images here (13 GB), but you may want to resample to get more or different images.

@shntnu
Collaborator Author

shntnu commented Apr 2, 2020

@jatinarora-upmc The file data/gwas_images.csv generated by 7.select_images_to_print.md has the information you need about cell lines

In #36 I updated the code so that all this information is readily available in two CSV files.
For example, take this row from sample_images.csv:

| Metadata_Plate | Metadata_Well | Metadata_Channel | filename | URL |
| --- | --- | --- | --- | --- |
| cmqtlpl1.5-31-2019-mt | A12 | URL_OrigDNA | r01c12f05p01-ch5sk1fk1fl1.tiff | https://s3.amazonaws.com/imaging-platform/projects/2018_06_05_cmQTL/2019_06_10_Batch3/images/cmqtlpl1.5-31-2019-mt__2019-06-10T16_42_36-Measurement2/Images/r01c12f05p01-ch5sk1fk1fl1.tiff |

It shows that the file cmqtlpl1.5-31-2019-mt/r01c12f05p01-ch5sk1fk1fl1.tiff comes from plate cmqtlpl1.5-31-2019-mt and well A12.

We can join with sample_images_metadata.csv to look up the metadata for plate cmqtlpl1.5-31-2019-mt and well A12:

| Metadata_Plate | Metadata_Well | Metadata_Row | Metadata_FieldID | Metadata_Assay_Plate_Barcode | Metadata_Plate_Map_Name | Metadata_well_position | Metadata_plating_density | Metadata_line_ID |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| cmqtlpl1.5-31-2019-mt | A12 | 1 | 5 | cmqtlpl1.5-31-2019-mt | cmQTL_plate1_5.31.2019 | A12 | 10000 | 34 |

Specifically, the cell line ID is 34.
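
The join described above can be sketched in shell. This is only a sketch under stated assumptions: both files are plain comma-separated text with one header row and no quoted commas, Metadata_Plate and Metadata_Well are the first two columns of each file, and Metadata_line_ID is the ninth column of sample_images_metadata.csv, as in the rows shown above. The two example rows are written to a scratch directory so that the snippet runs on its own, and the function name lookup_line_ids is made up for illustration.

```shell
set -eu

# Scratch directory so the sketch does not touch the real data/ folder.
workdir=$(mktemp -d)

# Example row from sample_images.csv (copied from this comment).
cat > "$workdir/sample_images.csv" <<'EOF'
Metadata_Plate,Metadata_Well,Metadata_Channel,filename,URL
cmqtlpl1.5-31-2019-mt,A12,URL_OrigDNA,r01c12f05p01-ch5sk1fk1fl1.tiff,https://s3.amazonaws.com/imaging-platform/projects/2018_06_05_cmQTL/2019_06_10_Batch3/images/cmqtlpl1.5-31-2019-mt__2019-06-10T16_42_36-Measurement2/Images/r01c12f05p01-ch5sk1fk1fl1.tiff
EOF

# Matching row from sample_images_metadata.csv (copied from this comment).
cat > "$workdir/sample_images_metadata.csv" <<'EOF'
Metadata_Plate,Metadata_Well,Metadata_Row,Metadata_FieldID,Metadata_Assay_Plate_Barcode,Metadata_Plate_Map_Name,Metadata_well_position,Metadata_plating_density,Metadata_line_ID
cmqtlpl1.5-31-2019-mt,A12,1,5,cmqtlpl1.5-31-2019-mt,cmQTL_plate1_5.31.2019,A12,10000,34
EOF

# lookup_line_ids METADATA_CSV IMAGES_CSV   (hypothetical helper)
# Prints "plate,well,filename,line_id" for every image row whose
# (plate, well) pair is present in the metadata file.
lookup_line_ids() {
  awk -F',' '
    FNR == 1  { next }                          # skip the header of each file
    NR == FNR { line_id[$1 FS $2] = $9; next }  # metadata: remember line ID per plate,well
    {
      key = $1 FS $2                            # images: look up the same plate,well key
      if (key in line_id) print $1 FS $2 FS $4 FS line_id[key]
    }
  ' "$1" "$2"
}

joined=$(lookup_line_ids "$workdir/sample_images_metadata.csv" "$workdir/sample_images.csv")
printf '%s\n' "$joined"

rm -r "$workdir"
```

With the real files, the same function would be called as lookup_line_ids data/sample_images_metadata.csv data/sample_images.csv from the 1.profile-cell-lines/ folder, provided the column positions assumed here hold.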

@jatinarora-upmc
Collaborator

I think there is an issue here. Many image identifiers map to more than one cell line. For example, the image r14c11f05p01-ch5sk1fk1fl1.tiff maps to two cell lines on two plates (98 on BR00106709 and 236 on BR00107338). Could you check, or am I looking at this the wrong way?
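
A quick way to check this, as a sketch: count how often a file name repeats on its own, and how often it repeats once the plate is included. The three-column table below is a hypothetical reduced layout that holds only the two cases named above (the real CSV files have more columns), and the idea that plate plus file name identifies an image is an assumption suggested by the plate/filename paths used elsewhere in this thread.

```shell
set -eu

# Hypothetical reduced table: plate, file name, cell line ID.
# The two data rows are the example given in this comment.
rows='Metadata_Plate,filename,Metadata_line_ID
BR00106709,r14c11f05p01-ch5sk1fk1fl1.tiff,98
BR00107338,r14c11f05p01-ch5sk1fk1fl1.tiff,236'

# File names that occur in more than one row.
repeated_names=$(printf '%s\n' "$rows" |
  awk -F',' 'NR > 1 { n[$2]++ } END { for (f in n) if (n[f] > 1) print f }')

# (plate, file name) pairs that occur in more than one row.
repeated_pairs=$(printf '%s\n' "$rows" |
  awk -F',' 'NR > 1 { n[$1 FS $2]++ } END { for (k in n) if (n[k] > 1) print k }')

printf 'repeated file names: %s\n' "${repeated_names:-none}"
printf 'repeated plate,file pairs: %s\n' "${repeated_pairs:-none}"
```

For these two rows the file name repeats but the plate and file name pair does not, so any lookup of a cell line would need to key on both.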

@shntnu
Collaborator Author

shntnu commented Apr 2, 2020 via email

@shntnu shntnu closed this as completed Apr 23, 2020
@shntnu shntnu reopened this Jul 9, 2020
@shntnu
Collaborator Author

shntnu commented Jul 9, 2020

@jatinarora-upmc I'm following up on your Slack message here. Can you review this thread and LMK if you are able to figure out how to get example images for a cell line ID?

@jatinarora-upmc
Collaborator

@shntnu thanks much for reminding me of this thread.
Does the file sample_images_metadata.csv cover all images across all plates (the ones at the link in your second message here, 13 GB)?

@shntnu
Collaborator Author

shntnu commented Jul 10, 2020

The notebook referred to in #35 (comment) was used to produce sample_images_metadata.csv and sample_images.csv. It samples one well per cell line and then a single, fixed field of view (a.k.a. site) from each well; it always picks Metadata_Site = 5.
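
The fixed site also appears to be visible in the file names. The sketch below assumes a naming pattern of the form rRRcCCfSSp..., inferred only from the example row earlier in this thread (r01c12f05 alongside Metadata_Row 1, well A12, and Metadata_FieldID 5); it is not a documented format, and the helper name site_from_filename is made up for illustration.

```shell
set -eu

# site_from_filename NAME   (hypothetical helper)
# Prints "row=R col=C site=S" for names of the assumed form rRRcCCfSSp...,
# with leading zeros dropped. Prints nothing if the name does not match.
site_from_filename() {
  printf '%s\n' "$1" |
    sed -n 's/^r0*\([0-9][0-9]*\)c0*\([0-9][0-9]*\)f0*\([0-9][0-9]*\)p.*$/row=\1 col=\2 site=\3/p'
}

parsed=$(site_from_filename "r01c12f05p01-ch5sk1fk1fl1.tiff")
printf '%s\n' "$parsed"
```

If the pattern holds, this gives a quick sanity check that a downloaded set really comes from the intended site.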

@jatinarora-upmc
Collaborator

Thanks @shntnu. Would it be possible to generate another set of images with a different Metadata_Site, let's say 3? I guess the images are stored on your side, so you might have to re-run the script?

@shntnu
Collaborator Author

shntnu commented Jul 15, 2020

Now available via #45

@shntnu
Collaborator Author

shntnu commented Jul 15, 2020

@jatinarora-upmc this notebook has details on how to download; I am copying it below. The sample_images.csv file referred to below is produced by that notebook.

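# Where the images will be saved
IMAGE_DIR=/tmp/cmqtl

mkdir -p "$IMAGE_DIR"

# List the unique plates (column 1), skipping the header row
cut -d"," -f1 data/sample_images.csv | grep -v Metadata_Plate | sort -u > /tmp/plates.txt

# Create one subfolder per plate
parallel -a /tmp/plates.txt --no-run-if-empty mkdir -p "$IMAGE_DIR"/{}

# Download every image: {1} = Metadata_Plate, {4} = filename, {5} = URL
parallel \
 --header ".*\n" \
 -C "," \
 -a data/sample_images.csv \
 --eta \
 --joblog "${IMAGE_DIR}/download.log" \
 wget -q -O "${IMAGE_DIR}"/{1}/{4} {5}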
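
Before starting a large download, it can help to preview what would be fetched. The following is a sketch, not part of the notebook: it prints one wget command per row instead of running it, using the same column positions as the snippet above (1 = Metadata_Plate, 4 = filename, 5 = URL). It assumes a plain comma-separated file with one header row and no quoted commas, and it uses a scratch copy holding the single example row from earlier in this thread so that it runs without the repository.

```shell
set -eu

IMAGE_DIR=/tmp/cmqtl

# Scratch CSV with the example row from this thread; with the repository
# checked out, point csv at data/sample_images.csv instead.
csv=$(mktemp)
cat > "$csv" <<'EOF'
Metadata_Plate,Metadata_Well,Metadata_Channel,filename,URL
cmqtlpl1.5-31-2019-mt,A12,URL_OrigDNA,r01c12f05p01-ch5sk1fk1fl1.tiff,https://s3.amazonaws.com/imaging-platform/projects/2018_06_05_cmQTL/2019_06_10_Batch3/images/cmqtlpl1.5-31-2019-mt__2019-06-10T16_42_36-Measurement2/Images/r01c12f05p01-ch5sk1fk1fl1.tiff
EOF

# Build (but do not run) one download command per data row.
preview=$(awk -F',' -v dir="$IMAGE_DIR" \
  'NR > 1 { printf "wget -q -O %s/%s/%s %s\n", dir, $1, $4, $5 }' "$csv")
printf '%s\n' "$preview"

rm -f "$csv"
```

Nothing is downloaded here; the output only shows the target path (plate folder plus file name) and source URL for each image.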

@jatinarora-upmc
Collaborator

@shntnu I saw this code previously, but I could not figure out where the new sample_images.csv file is. #45 redirects me to #35, and I got lost going in circles. Sorry to bug you again, I am not used to GitHub at all, so I think a direct link to sample_images.csv would be very helpful.

@shntnu
Collaborator Author

shntnu commented Jul 15, 2020

Sure thing.
This is the file https://github.com/broadinstitute/cmQTL/blob/bcef95625d964d10ad8d81e31b453ab21f09f969/1.profile-cell-lines/data/sample_images.csv

In that snippet, I used a relative path to it (data/sample_images.csv) because the notebook is in the folder 1.profile-cell-lines/.

@jatinarora-upmc
Collaborator

all set now, thanks so much @shntnu

@jatinarora-upmc
Collaborator

@shntnu I am going through the comments I got in today's meeting. May I ask for the images for plate 7 as well?

@shntnu
Collaborator Author

shntnu commented Sep 7, 2020

I have updated sample_images.csv to include the new version of plate 7
https://github.com/broadinstitute/cmQTL/blob/master/1.profile-cell-lines/data/sample_images.csv

@jatinarora-upmc see #35 (comment) for what to do next (everything is the same as before, except that I have now replaced the old plate 7 images with the new ones)

@jatinarora-upmc
Collaborator

@shntnu it seems there are fewer lines in this updated sample_images.csv, and the plate cmQTLplate7-7-22-20 does not appear anywhere in the Metadata_Plate column. Could you please check?

@shntnu
Collaborator Author

shntnu commented Sep 16, 2020

@jatinarora-upmc
Now fixed in #60, which updated https://github.com/broadinstitute/cmQTL/blob/master/1.profile-cell-lines/data/sample_images.csv
