Preprocessing of files #2

Closed
mabbasi6 opened this issue Jun 2, 2022 · 5 comments
@mabbasi6

mabbasi6 commented Jun 2, 2022

I have been trying to use the preprocessing code and keep getting the errors below.
When I enter <data/Vestibular-Schwannoma-SEG>:

Traceback (most recent call last):
  File "TCIA_data_convert_into_convenient_folder_structure.py", line 125, in <module>
    assert(all(found)), f"Not all required files found"
AssertionError: Not all required files found

When I point to the folder <data_path>:

Traceback (most recent call last):
  File "TCIA_data_convert_into_convenient_folder_structure.py", line 42, in <module>
    dd = pydicom.read_file(first_file)
  File "/Users/mabbasi6/opt/anaconda3/envs/momo_seg/lib/python3.6/site-packages/pydicom/filereader.py", line 993, in dcmread
    fp = open(fp, 'rb')
IsADirectoryError: [Errno 21] Is a directory: '/Users/mabbasi6/Downloads/VS_Seg/data/new/manifest-1614264588831/Vestibular-Schwannoma-SEG/VS-SEG-061/03-17-1996-NA-Avanto RoutineImage Guidance-11244'

Could you please let me know how I can solve this? Or has the data changed in a way that causes these errors?

@aaronkujawa
Collaborator

Hi,
the data has not changed.
Did you download the complete dataset from TCIA in "Descriptive Directory Name" format?
The --input path should be the Vestibular-Schwannoma-SEG folder (in your case, /Users/mabbasi6/Downloads/VS_Seg/data/new/manifest-1614264588831/Vestibular-Schwannoma-SEG).

What does your full command look like?
Does the error appear straight away or are some files copied to the output folder?
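
If it helps to narrow things down, a quick check of what sits directly under the --input folder can be sketched as below. This is only an illustration and not part of the conversion script; the helper name summarize_input_folder is hypothetical, and it assumes each case is its own sub-folder (such as VS-SEG-061) directly under the input folder.

```python
from pathlib import Path


def summarize_input_folder(input_dir):
    """Return (case_dirs, stray_files) found directly under input_dir.

    Hypothetical helper for diagnosis only; it does not read any DICOM data.
    """
    root = Path(input_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"--input is not a folder: {root}")
    entries = sorted(root.iterdir())
    # Sub-folders are assumed to be cases; anything else is unexpected.
    case_dirs = [p.name for p in entries if p.is_dir()]
    stray_files = [p.name for p in entries if p.is_file()]
    return case_dirs, stray_files
```

Calling it on the Vestibular-Schwannoma-SEG folder and printing both lists shows whether anything other than case folders is present at the top level.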

@mabbasi6
Author

mabbasi6 commented Jun 2, 2022

Hi,
I downloaded the complete dataset from TCIA in "Descriptive Directory Name" format.
Using the directory you recommended, the command is:
python3 TCIA_data_convert_into_convenient_folder_structure.py --input /Users/mabbasi6/Downloads/VS_Seg/data/new/manifest-1614264588831/Vestibular-Schwannoma-SEG --output /Users/mabbasi6/Downloads/VS_Seg/prepped/

and the error is:

Traceback (most recent call last):
  File "TCIA_data_convert_into_convenient_folder_structure.py", line 125, in <module>
    assert(all(found)), f"Not all required files found"
AssertionError: Not all required files found

No data is written into the output folder.

@aaronkujawa
Collaborator

Looks like you are doing everything right.
Sorry, I'll have to download the data again from TCIA to run this on my side. Will get back to you when I can confirm that it is still running as expected.

@aaronkujawa
Collaborator

Sorry for the delay. I couldn't immediately download the dataset because TCIA has added a new license to it, which required me to request access first. The newly added LICENSE file in the Vestibular-Schwannoma-SEG folder is also the reason the script fails: the script assumes every entry in that folder is a case, so it treats the LICENSE file as another case. For now you can just delete the LICENSE file, and then the script will work. I'll also update the script to ignore the file. Thanks for raising the issue.
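
One way to make a case listing robust to stray files like this is to keep only sub-folders. The snippet below is a minimal sketch of that idea, not the actual change made to the script; the function name list_case_folders is hypothetical.

```python
import os


def list_case_folders(input_dir):
    """Return sorted names of sub-folders of input_dir.

    Plain files such as LICENSE are skipped, so they are never
    mistaken for a case. Hypothetical helper for illustration.
    """
    return sorted(
        name
        for name in os.listdir(input_dir)
        if os.path.isdir(os.path.join(input_dir, name))
    )
```

Filtering on "is a directory" rather than on the file name LICENSE also covers any other top-level files that might be added to the download later.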

@aaronkujawa
Collaborator

I updated the script to ignore the file. Thanks again.
