numpy hypnogram format #15

alexblnn · 2021-03-11T21:51:01Z

Hi and sorry to bother you again Mathias!

I can't get training to launch with numpy hypnograms (in my case, they represent binary labels).

My np arrays contain 3 sub-arrays of equal length corresponding to start, duration and label (I tried to follow the conventions mentioned in utime/hypnogram/formats.py), but I get an error when trying to train:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/utime/dataset/sleep_study.py", line 595, in load
    self._load()
  File "/opt/conda/lib/python3.7/site-packages/utime/dataset/sleep_study.py", line 555, in _load
    sample_rate=header["sample_rate"])
  File "/opt/conda/lib/python3.7/site-packages/utime/io/high_level_file_loaders.py", line 106, in load_hypnogram
    sample_rate=sample_rate)
  File "/opt/conda/lib/python3.7/site-packages/utime/io/hypnogram/hyp_extractors.py", line 144, in extract_hyp_data
    ann_to_class=annotation_dict
  File "/opt/conda/lib/python3.7/site-packages/utime/hypnogram/utils.py", line 280, in sparse_hypnogram_from_ids_format
    ann_class_ints = [ann_to_class[a] for a in annotations]
  File "/opt/conda/lib/python3.7/site-packages/utime/hypnogram/utils.py", line 280, in <listcomp>
    ann_class_ints = [ann_to_class[a] for a in annotations]
KeyError: 0.0

Am I formatting my data wrong?

Best regards,

perslev · 2021-03-12T09:18:39Z

Hi! Sorry, I have been busy and completely forgot to get back to you. Please keep letting me know if you experience issues.

The numpy format is a bit special (and not well documented): it expects a flat, dense array of stages, that is, one integer stage label for each period/segment in your input. For instance, if your input signal is 10 minutes long and your period length is 30 seconds, then the npz file should store just a single array of shape [20,]. Finally, the data type should be integer, not float. Have a look at:

U-Time/utime/io/hypnogram/hyp_extractors.py, line 82 in 10d11e1:

    def extract_from_np(file_path, period_length_sec, sample_rate):

and

U-Time/utime/hypnogram/utils.py, line 217 in 10d11e1:

    def ndarray_to_ids_format(array, period_length_sec, sample_rate):
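
For readers starting from (start, duration, label) triplets, the dense layout described above could be produced roughly as follows. This is only a sketch: `sparse_to_dense` is a hypothetical helper, not part of utime, and it assumes the events are sorted, contiguous and aligned to period boundaries.

```python
import numpy as np


def sparse_to_dense(starts_sec, durations_sec, labels, period_length_sec):
    """Expand (start, duration, label) triplets into one integer label per period.

    Hypothetical helper for illustration only; not part of utime.
    Assumes events are sorted, contiguous and aligned to period boundaries.
    """
    dense = []
    for start, duration, label in zip(starts_sec, durations_sec, labels):
        if start % period_length_sec or duration % period_length_sec:
            raise ValueError("events must align to period boundaries")
        if start != len(dense) * period_length_sec:
            raise ValueError("events must be sorted and contiguous")
        n_periods = int(duration // period_length_sec)
        # Cast to int so float labels such as 0.0 become integer labels.
        dense.extend([int(label)] * n_periods)
    return np.asarray(dense, dtype=np.int64)


# 10 minutes of signal with 30-second periods gives 20 labels.
starts = [0, 300]
durations = [300, 300]
labels = [0.0, 1.0]
dense = sparse_to_dense(starts, durations, labels, period_length_sec=30)
print(dense.shape, dense.dtype)  # (20,) int64
```

The resulting array can then be written with `np.save` (or `np.savez` for an npz file).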

Let me know if you need further help.

Cheers,
Mathias

alexblnn · 2021-03-12T10:32:44Z

EDIT: solved by removing the "sleep_stage_annotations" part in the yaml dataset file. I've finally been able to start training :)

Hi Mathias, thanks for your quick answer!

Each of my input signals is 90 seconds long and I have a label for each second. I have set "period_length_sec" to 1 second in the yaml dataset file. My PSG data is sampled at 100 Hz.

I modified my .npy files so that each contains a (90,) array of dtype int64, but I still get the same error. By the looks of it, the code doesn't use "extract_from_np" as I would expect. Maybe utime is not recognizing that my hypnograms are in the numpy format? [ann_to_class[a] for a in annotations] doesn't seem consistent with the numpy format.
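
As a quick way to rule out a shape or dtype problem in a setup like this one (90 s of signal at 100 Hz, one label per second), a check along these lines can help. `check_dense_labels` is a hypothetical function written for this illustration; it is not a utime function and does not reproduce what utime validates internally.

```python
import numpy as np


def check_dense_labels(labels, n_samples, sample_rate, period_length_sec):
    """Check that labels are flat, integer-typed and cover the whole signal.

    Hypothetical sanity check for illustration; not part of utime.
    Returns the number of periods on success.
    """
    labels = np.asarray(labels)
    if labels.ndim != 1:
        raise ValueError(f"expected a flat array, got shape {labels.shape}")
    if not np.issubdtype(labels.dtype, np.integer):
        raise TypeError(f"expected an integer dtype, got {labels.dtype}")
    samples_per_period = sample_rate * period_length_sec
    expected = n_samples // samples_per_period
    if len(labels) != expected:
        raise ValueError(f"expected {expected} labels, got {len(labels)}")
    return expected


# 90 s at 100 Hz is 9000 samples; with 1-second periods that is 90 labels.
n_periods = check_dense_labels(
    np.zeros(90, dtype=np.int64),
    n_samples=9000,
    sample_rate=100,
    period_length_sec=1,
)
print(n_periods)  # 90
```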

Here is the error:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/utime/dataset/sleep_study.py", line 595, in load
    self._load()
  File "/opt/conda/lib/python3.7/site-packages/utime/dataset/sleep_study.py", line 555, in _load
    sample_rate=header["sample_rate"])
  File "/opt/conda/lib/python3.7/site-packages/utime/io/high_level_file_loaders.py", line 106, in load_hypnogram
    sample_rate=sample_rate)
  File "/opt/conda/lib/python3.7/site-packages/utime/io/hypnogram/hyp_extractors.py", line 144, in extract_hyp_data
    ann_to_class=annotation_dict
  File "/opt/conda/lib/python3.7/site-packages/utime/hypnogram/utils.py", line 280, in <listcomp>
    ann_class_ints = [ann_to_class[a] for a in annotations]
KeyError: 0
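
The failing line in the traceback is a plain dictionary lookup, so a KeyError of this kind appears whenever a label value is not a key of the mapping. The snippet below reproduces the pattern in isolation. The mapping shown is hypothetical (keys are made-up annotation strings); the mapping utime actually builds is not shown in this thread.

```python
# Hypothetical mapping keyed by annotation strings (an assumption for
# illustration; not the mapping utime builds from the dataset config).
ann_to_class = {"negative": 0, "positive": 1}

# Labels that are already class integers are not keys of that mapping.
annotations = [0, 0, 1]

missing_key = None
try:
    # Same lookup pattern as the line shown in the traceback.
    ann_class_ints = [ann_to_class[a] for a in annotations]
except KeyError as err:
    missing_key = err.args[0]
    print(f"KeyError: {missing_key!r}")  # KeyError: 0
```

Note that in Python `0.0 == 0` and both hash identically, so a mapping with an integer key `0` would accept a float `0.0` as well. Seeing `KeyError: 0.0` and then `KeyError: 0` therefore suggests the mapping had no key equal to zero at all.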

Best regards,
Alexandre

alexblnn · 2021-03-12T10:42:54Z

I have one last question: it seems that by default utime uses the sparse categorical cross-entropy loss. Is it possible to use the sparse generalized dice loss you mentioned in the paper? My dataset is highly imbalanced (the "1" class makes up only 7% of the labels) and I believe it could be useful.
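
To make the motivation concrete, here is a minimal NumPy sketch of a generic soft Dice loss on a 7%/93% label split. It is an assumption-laden illustration: `soft_dice_loss` is a hypothetical function, it is not utime's loss implementation, and it is not necessarily the generalized formulation used in the paper.

```python
import numpy as np


def soft_dice_loss(y_true, y_prob, n_classes, smooth=1.0):
    """Mean of (1 - Dice) over classes, for integer labels and class probabilities.

    Hypothetical illustration of a generic soft Dice loss; not utime's code.
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=np.float64)
    one_hot = np.eye(n_classes)[y_true]  # shape (n_labels, n_classes)
    intersection = (one_hot * y_prob).sum(axis=0)
    denominator = one_hot.sum(axis=0) + y_prob.sum(axis=0)
    dice = (2.0 * intersection + smooth) / (denominator + smooth)
    return float(1.0 - dice.mean())


# 100 labels, 7 of which belong to the minority class "1".
y_true = np.array([1] * 7 + [0] * 93)
always_zero = np.tile([1.0, 0.0], (100, 1))  # always predicts class 0
perfect = np.eye(2)[y_true]                  # predicts every label correctly

accuracy = float((always_zero.argmax(axis=1) == y_true).mean())
loss_always_zero = soft_dice_loss(y_true, always_zero, n_classes=2)
loss_perfect = soft_dice_loss(y_true, perfect, n_classes=2)
print(accuracy, loss_always_zero, loss_perfect)
```

Because each class contributes equally to the mean, a model that always predicts the majority class reaches 93% accuracy here but still has a Dice loss well above the perfect-prediction value of zero.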

Thanks again for your help!

perslev · 2021-03-12T10:46:46Z

Great that you made it work! :)

You can use the dice loss by replacing 'SparseCategoricalCrossentropy' with 'SparseDiceLoss' in the hyperparameter file.

alexblnn · 2021-03-12T14:57:57Z

Thanks for the tip! Weirdly enough, the dice loss doesn't seem to handle the class imbalance. After a few epochs the model always predicts 0 (the majority class). Do you have any idea what could cause this?

perslev · 2021-03-15T08:07:54Z

Hmm, that is strange indeed. Did you try letting it run a bit longer to see if it picks up on the minority class? And how does the learning curve look? Is the loss decreasing at all?

alexblnn · 2021-04-15T22:23:20Z

Hi Mathias, sorry for not updating you on this earlier, I forgot about this issue. I had a problem with GCP and lost my UTime notebook, so I wasn't able to do any more testing. Anyway, thanks for your help on this and good luck in the future :)

alexblnn closed this as completed Apr 15, 2021
