Dataset #7
Comments
Just call the |
I downloaded the CinC 2021 dataset from https://physionet.org/content/challenge-2021/#files and want to run trainer.py from benchmarks/cinc2021. I also added ds_train and ds_val.
I am getting the error below: File "trainer.py", line 423, in |
It's a typo in this file, which probably happened during copy-paste (from torch_ecg/databases/datasets/cinc2021/cinc2021_dataset.py). The right bracket of this |
Hi, I'm trying to run trainer.py for train_hybrid_cpsc2020. I have downloaded the CPSC 2020 dataset and specified the data path inside cfg.py like this: File "C:\Users\AK\miniconda3\envs\cpsc\Lib\site-packages\torch\utils\data\dataloader.py", line 350, in init |
It seems that the data reader did not find the recording files. The

def _ls_rec(self) -> None:
    """Find all records in the database directory
    and store them (path, metadata, etc.) in some private attributes.
    """
    self._df_records = pd.DataFrame()
    n_records = 10
    all_records = [f"A{i:02d}" for i in range(1, 1 + n_records)]
    self._df_records["path"] = [path for path in self.db_dir.rglob(f"*.{self.rec_ext}") if path.stem in all_records]
    self._df_records["record"] = self._df_records["path"].apply(lambda x: x.stem)
    self._df_records.set_index("record", inplace=True)
    all_annotations = [f"R{i:02d}" for i in range(1, 1 + n_records)]
    df_ann = pd.DataFrame()
    df_ann["ann_path"] = [path for path in self.db_dir.rglob(f"*.{self.ann_ext}") if path.stem in all_annotations]
    df_ann["record"] = df_ann["ann_path"].apply(lambda x: x.stem.replace("R", "A"))
    df_ann.set_index("record", inplace=True)
    # take the intersection by the index of `df_ann` and `self._df_records`
    self._df_records = self._df_records.join(df_ann, how="inner")
    if len(self._df_records) > 0:
        if self._subsample is not None:
            size = min(
                len(self._df_records),
                max(1, int(round(self._subsample * len(self._df_records)))),
            )
            self._df_records = self._df_records.sample(n=size, random_state=DEFAULTS.SEED, replace=False)
    self._all_records = self._df_records.index.tolist()
    self._all_annotations = self._df_records["ann_path"].apply(lambda x: x.stem).tolist()

Theoretically, you can pass any of its parents because pathlib.Path.rglob is used. |
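To check which directory to pass, the lookup above can be tried outside the reader class. The following is only a rough sketch: `find_records` is a hypothetical helper, not part of torch_ecg; it swaps the DataFrame inner join for a plain dict intersection, and the `mat` extensions and the `TrainingSet/data` and `TrainingSet/ref` layout are assumptions made for the demo.

```python
import tempfile
from pathlib import Path


def find_records(db_dir, rec_ext="mat", ann_ext="mat", n_records=10):
    """Return {record: (data_path, ann_path)} for records that have both files.

    Hypothetical stand-in for the lookup in `_ls_rec`; the extensions are assumed.
    """
    db_dir = Path(db_dir)
    rec_names = {f"A{i:02d}" for i in range(1, 1 + n_records)}
    ann_names = {f"R{i:02d}" for i in range(1, 1 + n_records)}
    # rglob searches recursively, so files in any subdirectory of db_dir are found
    recs = {p.stem: p for p in db_dir.rglob(f"*.{rec_ext}") if p.stem in rec_names}
    anns = {
        p.stem.replace("R", "A"): p
        for p in db_dir.rglob(f"*.{ann_ext}")
        if p.stem in ann_names
    }
    # keep only records that have both a data file and an annotation file
    return {name: (recs[name], anns[name]) for name in sorted(recs.keys() & anns.keys())}


# Throwaway directory tree with an assumed layout, just for the demo
root = Path(tempfile.mkdtemp())
(root / "TrainingSet" / "data").mkdir(parents=True)
(root / "TrainingSet" / "ref").mkdir(parents=True)
for name in ("A01", "A02"):
    (root / "TrainingSet" / "data" / f"{name}.mat").touch()
(root / "TrainingSet" / "ref" / "R01.mat").touch()

found_from_root = find_records(root)
found_from_data = find_records(root / "TrainingSet" / "data")
print(sorted(found_from_root))  # A02 has no matching R02 annotation, so it is dropped
print(len(found_from_data))     # the annotation folder is a sibling, so nothing matches
```

In this sketch a parent directory works, but a directory that is too deep (one containing only the data files) yields no records, which would leave the dataset empty.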
I think I know the reason now. The CPSC2020 dataset uses sliced recordings since the original recordings are fairly long. So, you should call the persistence method first, which takes quite a long time to slice the recordings. |
Thank you for your guidance. It seems that training requires CNN.h5 and CRNN.h5 files located in the signal_processing/ecg_rpeaks_dl_models directory, but I only have the corresponding JSON files. Note that I have only run trainer.py. Should I do anything before running trainer.py? Could you please help me with this one as well? |
I added automatic downloading of these models, which you can find at https://opensz.oss-cn-beijing.aliyuncs.com/ICBEB2020/file/CPSC2019-opensource.zip. However, these models were trained with a much older version of Keras, so one might have trouble loading them. I also removed the auto-loading of deep learning models in the signal_processing module. The changes are currently in the dev branch and will be merged into the master branch soon. |
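For anyone still on a version without the automatic download, a quick check of which weight files are missing might look like the sketch below. `missing_weight_files` is a hypothetical helper, not part of torch_ecg; the file names CNN.h5 and CRNN.h5 come from the comment above, and the temporary directory only stands in for signal_processing/ecg_rpeaks_dl_models.

```python
import tempfile
from pathlib import Path


def missing_weight_files(model_dir, names=("CNN", "CRNN")):
    """List the expected .h5 weight files that are absent from model_dir (hypothetical helper)."""
    model_dir = Path(model_dir)
    return [f"{name}.h5" for name in names if not (model_dir / f"{name}.h5").is_file()]


# Demo on a temporary directory standing in for the real model folder
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "CNN.json").touch()
(demo_dir / "CRNN.json").touch()
(demo_dir / "CNN.h5").touch()

missing = missing_weight_files(demo_dir)
print(missing)  # only the JSON file is present for CRNN, so CRNN.h5 is reported
```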
How do I get the CinC2021 dataset? How do I download the dataset from the URL you provided in benchmarks? I could not find prepare_dataset.py here, but I found it in the original repo.