data read #2

Open
banchuangliangyue opened this issue Aug 23, 2019 · 6 comments

Comments


banchuangliangyue commented Aug 23, 2019

First of all, thank you very much for your code! While reading it, I found that the ECG sequences are generated as follows.
First, the heartbeats segmented from the ECG records in the training set are stored in a two-dimensional array. Then the heartbeats of each class (one of N, S, V) are selected and shuffled, giving three smaller two-dimensional arrays. Next, the three arrays are concatenated row-wise into one larger two-dimensional array. Finally, that array is split into sequences of max_time rows each (the default is 10).
So I would like to ask: why group all the heartbeats by class in advance? Done this way, the labels within each sequence are almost always identical, and, because of the shuffle, the heartbeats in a sequence do not come from the same record. This seems very different from the real setting, where a sequence should consist of heartbeats from the same ECG record.
Is this a hidden data-processing trick? Why is it done this way? Thank you very much for your reply! Here is some of the code that puzzles me.
data = np.asarray(data)
shape_v = data.shape
data = np.reshape(data, [shape_v[0], -1])
t_lables = np.array(t_lables)
_data = np.asarray([], dtype=np.float64).reshape(0, shape_v[1])
_labels = np.asarray([], dtype=np.dtype('|S1')).reshape(0,)
# Per class: take up to max_nlabel shuffled beats and append them,
# so the rows of _data end up grouped by class.
for cl in classes:
    _label = np.where(t_lables == cl)
    permute = np.random.permutation(len(_label[0]))
    _label = _label[0][permute[:max_nlabel]]
    _data = np.concatenate((_data, data[_label]))
    _labels = np.concatenate((_labels, t_lables[_label]))

# Drop the remainder so the row count is a multiple of max_time.
data = _data[:(len(_data) // max_time) * max_time, :]
_labels = _labels[:(len(_data) // max_time) * max_time]
# Split into sequences of max_time consecutive rows.
data = [data[i:i + max_time] for i in range(0, len(data), max_time)]
labels = [_labels[i:i + max_time] for i in range(0, len(_labels), max_time)]
# Shuffle the order of the sequences (not the beats inside them).
permute = np.random.permutation(len(labels))
data = np.asarray(data)
labels = np.asarray(labels)
data = data[permute]
labels = labels[permute]
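
For comparison, here is a minimal sketch of the alternative raised above, where each sequence is cut from consecutive heartbeats of a single record. It is not the repository's method. The function name `sequences_per_record`, the `records` dictionary layout, and the record ids `rec_a` and `rec_b` are assumptions made purely for illustration.

```python
def sequences_per_record(records, max_time=10):
    """Cut each record into non-overlapping sequences of max_time beats.

    records: hypothetical dict mapping a record id to (beats, labels),
    both lists in temporal order and of equal length.
    """
    seq_data, seq_labels, seq_record = [], [], []
    for rec_id, (beats, labels) in records.items():
        if len(beats) != len(labels):
            raise ValueError(f"beats/labels length mismatch in {rec_id}")
        # Drop the trailing beats that do not fill a whole sequence.
        usable = (len(beats) // max_time) * max_time
        for start in range(0, usable, max_time):
            seq_data.append(beats[start:start + max_time])
            seq_labels.append(labels[start:start + max_time])
            seq_record.append(rec_id)
    return seq_data, seq_labels, seq_record


# Toy data: one feature per beat, made-up labels.
records = {
    "rec_a": ([[0.1 * i] for i in range(25)], ["N", "N", "V", "N", "S"] * 5),
    "rec_b": ([[0.2 * i] for i in range(12)], ["V"] * 12),
}
seq_data, seq_labels, seq_record = sequences_per_record(records, max_time=10)
print(len(seq_data), seq_record)
```

With this layout every sequence keeps the temporal order and the natural label mix of its source record, and the record id per sequence makes a record-wise train/test split straightforward.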
@banchuangliangyue
Author

@sajadmo

@MousaviSajad
Owner

MousaviSajad commented Sep 2, 2019 via email

@banchuangliangyue
Author

Thanks for your reply. I am looking forward to your updated code!

@jrvmalik

@banchuangliangyue is correct: both the training and testing sequences contain only one label type.
[image]
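
The observation can be checked without the dataset. The sketch below mimics only the label handling of the snippet in the first comment (group by class, shuffle within each class, concatenate, cut into chunks of `max_time`) using the standard library and synthetic labels. The helper names and the class counts are assumptions for illustration, not values from the repository.

```python
import random


def grouped_label_sequences(labels, classes, max_time=10, max_nlabel=None, seed=0):
    """Group labels by class, shuffle within each class, then chunk."""
    rng = random.Random(seed)
    grouped = []
    for cl in classes:
        idx = [i for i, lab in enumerate(labels) if lab == cl]
        rng.shuffle(idx)
        # max_nlabel=None keeps every beat of the class.
        grouped.extend(labels[i] for i in idx[:max_nlabel])
    usable = (len(grouped) // max_time) * max_time
    return [grouped[i:i + max_time] for i in range(0, usable, max_time)]


def single_class_fraction(sequences):
    """Fraction of sequences whose labels are all identical."""
    pure = sum(1 for seq in sequences if len(set(seq)) == 1)
    return pure / len(sequences)


# Synthetic, made-up class counts; the input order is irrelevant
# because the beats are regrouped by class anyway.
labels = ["N"] * 705 + ["S"] * 203 + ["V"] * 92
random.Random(1).shuffle(labels)
seqs = grouped_label_sequences(labels, classes=["N", "S", "V"])
print(len(seqs), single_class_fraction(seqs))
```

Because the classes are concatenated as contiguous blocks, at most `len(classes) - 1` sequences can straddle a class boundary; every other sequence carries a single label.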

@MousaviSajad
Owner

MousaviSajad commented Dec 31, 2019 via email

@banchuangliangyue
Author

> @banchuangliangyue is correct, both the training and testing sequences contain one label type only.
> [image]

So I think this amounts to knowing the labels of the test heartbeats in advance, which is unrealistic.
