
Loading shows different behaviour on different datasets #6

Closed
Sins-code opened this issue May 27, 2021 · 23 comments

Comments

@Sins-code

Hello all,
I'm doing scientific research on coil compression possibilities. To keep everything in a single programming language, I decided to use this module instead of importing some MATLAB functionality into Python.
While working I found an error I can't really explain to myself:
I work with two different .dat MRI files, 1) with dimensions (256, 20, 261, 1, 20, 1, 1, 1, 3, 1, 9, 1, 1, 1, 1, 1) and 2) with dimensions (256, 20, 208, 1, 44, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1) after removing oversampling. Both times I want to load the whole dataset at the start with data = twixObj.image[''] . But this works only for dataset 1. If I use it on the second dataset, I get this error message:

line 23, in data = twixObj.image['']
line 646, in getitem out = self.readData(mem, ixToTarg, ixToRaw, selRange, selRangeSz, outSize)
line 748, in readData fid.seek(mem[k] + szScanHeader, 0)
OSError: [Errno 22] Invalid argument

If I use, for example, the command data = twixObj.image[:,:,:,:,20] , the loading works... but I can never load the whole dataset 2!
Any ideas why this happens? If you want, I can share the dataset that does not work well with the module on some platform, so you can reproduce the error.
Many thanks

@wtclarke
Owner

wtclarke commented Jul 8, 2021

Hi @Sins-code Are there any odd characters in the filename of the file you pass in for the second scan? In pdb, can you step through and see where the error occurs and what the value of mem[k] + szScanHeader is?

@Sins-code
Author

Greetings @wexeee, I'm glad you found the time to answer my question, many thanks! The filenames I'm passing are, for example: meas_MID00102_FID31136_t2_tse11_tra_256_4mm.dat -> works and meas_MID00040_FID08032_REF_EPI1_256x244x44_TE6_TR580.dat -> does not work!
I can't see any odd characters at first glance!
I will now debug the loading and post the values of mem and szScanHeader!

@Sins-code
Author

The error happens right after executing twixObj.image[''], when calling fid.seek(mem[k] + szScanHeader, 0) in the method readData. Values are: k=0, kmax=27456, mem[0] has the value -2147439168, and szScanHeader has the value 192.

@wtclarke
Owner

wtclarke commented Jul 9, 2021

I think mem[0] being < 0 doesn't make sense: it is trying to set the position to before the start of the file. Can you trace that back through the code to see why that occurs?
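The failure mode is easy to reproduce in isolation. This is a minimal sketch (not from the library) showing that seeking to a negative absolute offset on a real file raises the same OSError (Errno 22 on most platforms) seen in the traceback; the offset values are taken from the report above.

```python
import tempfile

# Seeking to a negative absolute position on a real file raises OSError,
# just like fid.seek(mem[k] + szScanHeader, 0) when mem[k] is the
# overflowed value -2147439168.
with tempfile.TemporaryFile() as fid:
    fid.write(b"\x00" * 192)
    try:
        fid.seek(-2147439168 + 192, 0)  # values from the report
        raised = False
    except (OSError, ValueError):
        raised = True
print(raised)  # True
```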

@wtclarke
Owner

wtclarke commented Jul 9, 2021

Perhaps see if it is an artefact of this line: https://github.com/wexeee/pymapvbvd/blob/3300290f3d626dc4c9305e326c0ea453ff5c90f6/mapvbvd/twix_map_obj.py#L681 where there is some casting.

@Sins-code
Author

Sins-code commented Jul 9, 2021

I checked getitem just now:

1 mem = self.memPos[ixToRaw]
2 # sort mem for quicker access, sort cIxToTarg/Raw accordingly
3 ix = np.argsort(mem)
4 mem = mem[ix]
5 ixToTarg = ixToTarg[ix]
6 ixToRaw = ixToRaw[ix]
7 # import pdb; pdb.set_trace()
8 out = self.readData(mem, ixToTarg, ixToRaw, selRange, selRangeSz, outSize)

This is the first time readData is called! But already in line 1 there are some big negative values stored near the end of the array mem, which get moved to the front after being sorted in line 3.

I will check memPos now!
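The sorting behaviour described above can be shown with a toy stand-in for memPos (the negative value is the overflowed offset from the earlier traceback; the other offsets are made up):

```python
import numpy as np

# Toy stand-in for memPos: one overflowed (negative) offset among
# otherwise valid byte offsets.
mem = np.array([1024, 2048, -2147439168, 4096], dtype=np.int64)
ix = np.argsort(mem)
mem_sorted = mem[ix]
print(mem_sorted[0])  # the negative offset sorts to the front,
                      # so readData hits it on the very first seek
```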

@wtclarke
Owner

wtclarke commented Jul 9, 2021

Thanks. Yes, that is calculated via the filePos variable set here. https://github.com/wexeee/pymapvbvd/blob/3300290f3d626dc4c9305e326c0ea453ff5c90f6/mapvbvd/mapVBVD.py#L353

@Sins-code
Author

Sins-code commented Jul 9, 2021

The error happens somewhere in here: after this while loop there are some negative values in the filePos array!
https://github.com/wexeee/pymapvbvd/blob/3300290f3d626dc4c9305e326c0ea453ff5c90f6/mapvbvd/mapVBVD.py#L74-L154

The negative values start to appear in iteration 25942; before that everything is normal. The maximum positive value is in iteration 25941 with the value 2147445376.00, then everything goes negative; the value in iteration 25942 is -2147439168.00! It looks to me like a NumPy int32 overflow!
This would also explain why the error only occurs in some cases! The file that produces the error is nearly twice as big as the other one: 2.3 GB!
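The two reported values are consistent with a 32-bit wrap-around. A quick sketch, where the block length 82752 is inferred from the difference between the reported numbers (not read from the file):

```python
import numpy as np

# 2147445376 + 82752 = 2147528128, which exceeds the int32 maximum
# (2147483647) and wraps around to exactly the reported negative value.
last_good = np.array([2147445376], dtype=np.int32)
block_len = np.array([82752], dtype=np.int32)  # inferred, hypothetical
wrapped = last_good + block_len  # NumPy integer arrays wrap silently
print(int(wrapped[0]))  # -2147439168, the value seen in mem[0]
```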

@Sins-code
Author

@wexeee What is your opinion on my explanation for the problem?

@wtclarke
Owner

That sounds possible (hence my incorrect guess with the explicit cast above). What type is being used here? I thought Python was fairly good at handling things like this.

I've also had issues with (matlab) mapVBVD before where mdh loop parameters were centred around zero rather than just incrementing from zero. That isn't happening here is it?

@Sins-code
Author

Seems like I was on the wrong path... filePos has dtype float64, which is big enough ^^ I should have checked that first. I'm looking for other errors now!
To your second question: I don't know; I'll try to check that as well.
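For reference, float64 does rule itself out here: it represents every integer up to 2**53 exactly, far beyond any byte offset in a 2.3 GB file. A one-liner check:

```python
import numpy as np

# float64 has a 53-bit significand, so integers up to 2**53 (about 9e15)
# round-trip exactly -- a 2.3 GB offset is nowhere near that limit.
offset = 2_300_000_000
exact = np.float64(offset) == offset
print(exact, 2**53)
```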

@Sins-code
Author

Seems like I found the error now! My debugger tells me that the variable cPos, which is used to fill filePos, is of type int32!

@wtclarke
Owner

So fid.tell() returns an int32?

What OS are you on?

@Sins-code
Author

Sins-code commented Jul 12, 2021

Yeah exactly. Windows 10! Using Python 3.8.10.

@wtclarke
Owner

And does the negative number come from fid.tell() or some calculation applied to the cpos variable?

@Sins-code
Author

I'm trying to find out; running some more tests, give me a second!

@Sins-code
Author

Sins-code commented Jul 12, 2021

It happens in between, right in the first iteration! cPos starts as a plain (arbitrary-precision) Python int, then becomes an int32 exactly here, at the end of the while loop:
https://github.com/wexeee/pymapvbvd/blob/3300290f3d626dc4c9305e326c0ea453ff5c90f6/mapvbvd/mapVBVD.py#L154
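A minimal sketch of that mechanism (the DMA length value is hypothetical): adding a NumPy integer scalar to a plain Python int yields a fixed-width NumPy integer, which is what can then overflow on later iterations.

```python
import numpy as np

# cPos starts as a plain Python int (arbitrary precision) ...
cPos = 0
# ... but ulDMALength comes out of the header parsing as a NumPy
# integer (hypothetical value here), and the sum inherits that type.
ulDMALength = np.int32(82752)
cPos = cPos + ulDMALength
print(type(cPos))  # a fixed-width NumPy integer, no longer a Python int
```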

@wtclarke
Owner

Ah, probably something to do with the NumPy types of the variables feeding into ulDMALength. Does cPos = cPos + int(ulDMALength) fix the issue?
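A quick check of the proposed cast, using the last good offset from the report and a hypothetical DMA block length: converting ulDMALength to a Python int before accumulating keeps cPos at arbitrary precision, so it can grow past the int32 limit without wrapping.

```python
import numpy as np

cPos = 2147445376                 # last good offset from the report
ulDMALength = np.int32(82752)     # hypothetical DMA block length
cPos = cPos + int(ulDMALength)    # cast keeps cPos a Python int
print(cPos, isinstance(cPos, int))  # past 2147483647, still correct
```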

@Sins-code
Author

Yes it does! 👍

@wtclarke
Owner

Great. Could you make a PR with this fix in? I can then check out some of the tests with it.

@Sins-code
Author

I still have one general question:
Why is the loading process sometimes so much slower, depending on the datasets?

For example here the loading goes super fast:

pymapVBVD version 0.4.1
Software version: VD
Scan 1/1, read all mdhs: 98%|█████████▊| 1.18G/1.21G [00:01<00:00, 1.27GB/s]
Software: vd
Number of acquisitions read 15660
Data size is [512, 20,261, 1, 20, 1, 1, 1, 3, 1, 9, 1, 1, 1, 1, 1]
Squeezed data size is [256,20,261,20,3,9] (['Col', 'Cha', 'Lin', 'Sli', 'Rep', 'Seg'])
NCol = 512
NCha = 20
NLin = 261
NAve = 1
NSli = 20
NPar = 1
NEco = 1
NPhs = 1
NRep = 3
NSet = 1
NSeg = 9
NIda = 1
NIdb = 1
NIdc = 1
NIdd = 1
NIde = 1

read data: 0%| | 0/15660 [00:00<?, ?it/s]
read data: 0%| | 62/15660 [00:00<00:27, 568.67it/s]
read data: 2%|▏ | 254/15660 [00:00<00:18, 836.64it/s]
read data: 2%|▏ | 382/15660 [00:00<00:17, 892.67it/s]
read data: 3%|▎ | 510/15660 [00:00<00:15, 977.49it/s]
read data: 4%|▍ | 625/15660 [00:00<00:14, 1028.39it/s]
read data: 5%|▍ | 730/15660 [00:00<00:14, 1004.91it/s]
read data: 5%|▌ | 832/15660 [00:00<00:16, 878.54it/s]
read data: 6%|▌ | 958/15660 [00:01<00:17, 854.85it/s]
read data: 7%|▋ | 1086/15660 [00:01<00:16, 889.76it/s]
read data: 8%|▊ | 1214/15660 [00:01<00:15, 916.03it/s]
read data: 9%|▊ | 1340/15660 [00:01<00:14, 1001.08it/s]
read data: 9%|▉ | 1470/15660 [00:01<00:14, 961.09it/s]
read data: 10%|█ | 1626/15660 [00:01<00:12, 1110.39it/s]

Whereas here it goes really slow:

pymapVBVD version 0.4.1
Software version: VD
Scan 1/1, read all mdhs: 95%|█████████▌| 2.02G/2.12G [00:01<00:00, 1.25GB/s]
Software: vd
Number of acquisitions read 27456
Data size is [512, 20,208, 1, 44, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1]
Squeezed data size is [256,20,208,44,3] (['Col', 'Cha', 'Lin', 'Sli', 'Rep'])
NCol = 512
NCha = 20
NLin = 208
NAve = 1
NSli = 44
NPar = 1
NEco = 1
NPhs = 1
NRep = 3
NSet = 1
NSeg = 1
NIda = 1
NIdb = 1
NIdc = 1
NIdd = 1
NIde = 1

read data: 0%| | 0/27456 [00:00<?, ?it/s]
read data: 0%| | 2/27456 [00:00<3:12:53, 2.37it/s]
read data: 0%| | 6/27456 [00:02<3:10:39, 2.40it/s]
read data: 0%| | 14/27456 [00:05<3:12:33, 2.38it/s]
Scan 1/1, read all mdhs: 100%|█████████▉| 2.12G/2.12G [00:20<00:00, 1.25GB/s]
read data: 0%| | 62/27456 [00:27<3:21:16, 2.27it/s]
read data: 0%| | 126/27456 [00:55<3:22:18, 2.25it/s]

Do you have an explanation for this? :D

@Sins-code
Author

> Great. Could you make a PR with this fix in? I can then check out some of the tests with it.

Of course I can. But I will have to learn how to do it first, since I'm new to GitHub!

@wtclarke
Owner

I have now merged this fix.
