Implementation #16

Open
codingcode111 opened this issue Dec 13, 2020 · 21 comments

@codingcode111

Hi,
I would like to thank you for sharing this interesting work. However, when I try to apply this experiment to the same data that you used, it tells me that the tumor type and mode are required. Could you kindly tell me how to fix this issue? Your reply and help would be highly appreciated.

Kind regards,

@haranrk
Collaborator

haranrk commented Dec 13, 2020

Hi @codingcode111, could you share which script you are running and what parameters you are using?

@codingcode111
Author

I am running points_extractor.py on the PAIP2019 training dataset. However, I am getting the following error:

@codingcode111
Author

(Screenshot of the error attached: Screenshot from 2020-12-13 19-25-28)

@haranrk
Collaborator

haranrk commented Dec 15, 2020

The points_extractor.py script requires two arguments:

  • Mode - train or valid
  • Tumor Type - viable or whole, the two tumor types present in the PAIP dataset

You can run point_extract_script.sh, present in the same directory, to automatically run points_extractor.py for each combination of arguments. You can run the bash script with the following command:

bash point_extract_script.sh
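
For reference, a minimal sketch of what such a wrapper amounts to, assuming points_extractor.py takes the mode and tumor type as positional arguments (the actual point_extract_script.sh in the repository may differ):

# Illustrative sketch only, not the repository's point_extract_script.sh.
# Assumes points_extractor.py accepts the mode and tumor type as positional arguments.
for mode in train valid; do
    for tumor_type in viable whole; do
        python points_extractor.py "$mode" "$tumor_type"
    done
done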

Don't forget to pull the latest changes before proceeding.
Kindly write to us if you have any further issues.

@codingcode111
Author

codingcode111 commented Dec 16, 2020 via email

@haranrk
Collaborator

haranrk commented Dec 17, 2020

The error in question seems to occur because the script is unable to find the files. The data directory is hardcoded into the script; you may have to change it based on where your files are.

Also, don't forget to refer to this issue to resolve file format errors.

@codingcode111
Author

Hi Haran,

Thank you so much for your reply and help. I was extremely busy last week. Regarding the project, I tried changing the directory and put the data files in the project directory, so both are in the same directory. However, I am still getting the same error. Do you have any suggestions to resolve this issue?
Your suggestions and support would be highly appreciated.

@haranrk
Collaborator

haranrk commented Dec 24, 2020

  1. Can you post the entire output after you run the program?
  2. Did you convert the mask files to pyramidal tiff as outlined in this issue?

@codingcode111
Author

Hi Haran,

Thank you for your help. I tried to convert the TIFF files into pyramidal TIFF using the command you provided, but I am not sure where exactly I need to run it. Does it need to be run in the same directory that contains the data folder, or can I point it to the folder that contains all the TIFF images? I am new to histopathology images and struggling to sort out the initial steps to run this experiment successfully. Your help and reply would be highly appreciated.

Best regards,

@haranrk
Collaborator

haranrk commented Dec 28, 2020

convert input -compress jpeg -quality 90 -define tiff:tile-geometry=256x256 ptif:output

  • input - Complete file path to input image
  • output - Complete file path to output image

@codingcode111
Author

Hi Haran,

I should let you know that I only downloaded the 50 PAIP2019 training folders and unzipped them. There are 50 folders; each folder represents a training phase number and contains an svs, an xml, a viable tif and a whole tif. For example, Training_phase_1_006 contains svs, xml, viable_tif and whole_tif files. I also extracted the 4 images of each folder into a new folder and got 200 items in it. I am not sure whether you are using the same dataset format that I am using. I hope this clarifies what I mean and that we can sort out this problem. Your guidance and support would be highly appreciated.

Best regards,

@haranrk
Collaborator

haranrk commented Dec 29, 2020

Hi,
That is the exact same dataset that I used as well. The viable and whole mask tiffs are the images that are not in OpenSlide format, so you need to convert those images to an OpenSlide-readable format. To do that, you can use the command mentioned in the previous comment. For example, if you had a file titled Training_phase_1_006.tif, use the following command in the folder it is present in:

convert Training_phase_1_006.tif -compress jpeg -quality 90 -define tiff:tile-geometry=256x256 ptif:Training_phase_1_006.tiff
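
If you want to convert all 50 folders in one pass rather than file by file, a small loop along these lines should work. This is only a sketch: it assumes you run it from the directory containing the Training_phase_* folders and that the mask files are the only .tif files in each folder (the slides are .svs); it writes each pyramidal copy next to the original with a .tiff extension.

# Sketch: convert every mask .tif under the training folders to pyramidal TIFF.
# Adjust the glob if your masks are named differently.
for f in Training_phase_*/*.tif; do
    convert "$f" -compress jpeg -quality 90 -define tiff:tile-geometry=256x256 "ptif:${f%.tif}.tiff"
done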

@codingcode111
Author

codingcode111 commented Jan 1, 2021

Hi Haran,
Thank you so much for your reply. I converted the images into pyramidal TIFF. Now I have a folder named data, and there is a raw-data folder inside it. Where should I place the pyramidal TIFF images? When I tried placing the TIFF images under the data folder and under the raw-data folder, I got the following error both times:

File "points_extractor.py", line 454, in batch_patch_gen
    image_path = glob.glob(os.path.join(data_path,id,'*.svs'))[0]
IndexError: list index out of range

Please let me know where I should put the svs files, and the exact layout of the data and images, including the SVS files inside each of the 50 training folders available from the challenge website. After converting the images into pyramidal images, I have 6 images inside each training folder. Should I put the 50 training folders under the data folder, or should I only use the pyramidal images? I am a bit confused, as it is not clear where the data should be placed and where each type of file and folder should go. We need more clarification, please, to apply this work successfully. Thank you again, Haran, for your support and help.
(Screenshot of the error attached: Screenshot from 2021-01-01 12-44-10)

@haranrk
Collaborator

haranrk commented Jan 2, 2021

Did you change the data_path variable in the points_extractor.py script to point to where your data is currently located?

@codingcode111
Author

I put the data under the data folder in the same directory. In the script it is written as (data, raw-data, etc.), so I put the tiff files in the data folder. I also tried putting them in raw-data. I have not made any changes to the script, as I think there is no full path in the script, only the data and raw-data folders, which are already present in the same directory under the DigitalHistoPath folder.

@heba9004

heba9004 commented Jan 4, 2021

After converting the binary mask images into pyramidal format and starting to run the "points_extractor" script, I am facing a memory error when deep copy is used. Before throwing this error, it only saves the coordinates of the patches extracted from the first training svs file. Can you please advise me whether the problem is with my machine or not? I am using 32 GB of RAM. Thank you in advance.

@haranrk
Collaborator

haranrk commented Jan 4, 2021

@codingcode111
In the script I have the following declaration

data_path = os.path.join('..','..','data','raw-data','train')

The .. represents the parent directory, so in my setup, I have the following structure.

DigitalHistoPath/code_cm17/patch_extraction/points_extractor.py
DigitalHistoPath/data/raw-data/train

However, I advise you to change the data_path and out_path and other such variables to suit your needs.
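
If the IndexError from your traceback persists, it means the glob for *.svs returned an empty list for some sample directory, i.e. the script is not looking where the slides actually are. A purely illustrative check you can run from the patch_extraction directory (adjust DATA_PATH to whatever you set data_path to):

# Sanity check: every sample folder under the train directory should report 1 svs file.
DATA_PATH=../../data/raw-data/train
for d in "$DATA_PATH"/*/; do
    echo "$d: $(ls "$d"*.svs 2>/dev/null | wc -l) svs file(s)"
done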

@haranrk
Collaborator

haranrk commented Jan 4, 2021

@heba9004
This issue warrants more attention. Can you post the error message on the new issue page here (#18 (comment))? Specifically, I would like to know at which line of the code the memory error occurred.

@codingcode111
Author

Thank you so much, Haran, for your reply. Unfortunately, there are no clear instructions or layout for the dataset. Since we are working with WSIs and have several formats in our dataset (svs, tiff, xml), we need more clarification on the data organization before working on the project. In the following image, for example, the file order and exactly how the data is organized are clearly stated. Is it possible to share a similar layout for this project, please?
(Attached image showing an example dataset layout)

Also, if @heba9004 has applied this project successfully, could you kindly share your layout and initial steps with me? Your reply and help would be highly appreciated.

@haranrk
Collaborator

haranrk commented Jan 6, 2021

The file below depicts the directory structure of our repository as it was at the time of submission to the PAIP 2019 grand challenge.

dir-structure.txt

The important directory is data/raw-data/train. The listing shows the files of only one sample (Training_phase_1_004), but the other sample directories follow the same format.

I have also included a new script convert_to_pyramidal.py under the patch_extraction folder which would help you with converting the mask files to pyramidal format.

@heba9004

@codingcode111, sorry, I just noticed your question. I converted the images and placed them in their original folders, so each training folder has both the original and converted tif images. However, I am still facing a memory error that I am trying to fix. Hope this helps you.
