list index out of range #13

yuehua-Song666 · 2024-05-01T17:41:20Z

Dear authors,

Thank you so much for such great work. I'm really interested in it.
I got an issue here, after getting processed.pt, I tried to run main.py. It uses das_split.pt to split the data into train, val and test, right? But I got an "index out of range" error. I wonder if you have any clues why this happened? By the way, I saw under the data folder, you have three '_split.pt' files, can you please tell me the difference between them?

Error log:
Traceback (most recent call last):
  File "/home/yjwang/geometric-rna-design/main.py", line 246, in <module>
    main(config, device)
  File "/home/yjwang/geometric-rna-design/main.py", line 39, in main
    train_list, val_list, test_list = get_data_splits(config, split_type=config.split)
  File "/home/yjwang/geometric-rna-design/main.py", line 119, in get_data_splits
    train_list = index_list_by_indices(data_list, train_idx_list)
  File "/home/yjwang/geometric-rna-design/main.py", line 113, in index_list_by_indices
    return [lst[index] for index in indices]
  File "/home/yjwang/geometric-rna-design/main.py", line 113, in <listcomp>
    return [lst[index] for index in indices]
IndexError: list index out of range

Thanks in advance,
yuehua

chaitjo · 2024-05-21T23:34:53Z

Hi @yuehua-Song666, many thanks for your interest! And apologies for this very delayed response.

> I got an issue here, after getting processed.pt, I tried to run main.py.

Have you created the processed dataset yourself from the raw RNAsolo PDB files? Or have you downloaded it from our link: https://drive.google.com/file/d/1gcUUaRxbGZnGMkLdtVwAILWVerVCbu4Y/view?usp=sharing

> It uses das_split.pt to split the data into train, val and test, right? But I got an "index out of range" error. I wonder if you have any clues why this happened?

I think the index error could be happening if you have created the processed dataset by yourself and there are fewer entries/samples in the new processed dataset than there were when I created the splits. Could you check whether this is the case?

Essentially, "index out of range" means that the index lists in das_split.pt contain one or more indices that are too large for the processed data list. If the processed data list has length N, the valid indices are 0 to N - 1, so any index of N or greater raises this error.
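
One way to test this hypothesis is to compare the indices in each split against the length of the processed data list. The sketch below is a minimal, hypothetical check, not code from the repository: `find_out_of_range` is an illustrative helper, and it assumes the split file holds lists of integer indices for train, val and test, which is consistent with how `get_data_splits` appears in the traceback but is not confirmed here.

```python
from typing import Dict, List, Sequence


def find_out_of_range(
    splits: Dict[str, Sequence[int]], num_samples: int
) -> Dict[str, List[int]]:
    """Return, per split, the indices that cannot index a list of length num_samples."""
    bad = {}
    for name, indices in splits.items():
        # Valid Python list indices run from -num_samples to num_samples - 1.
        invalid = [i for i in indices if i >= num_samples or i < -num_samples]
        if invalid:
            bad[name] = invalid
    return bad


# Toy example: a processed list with 5 entries, but splits built for 7.
data_list = ["rna0", "rna1", "rna2", "rna3", "rna4"]
splits = {"train": [0, 1, 2, 5], "val": [3], "test": [4, 6]}

report = find_out_of_range(splits, len(data_list))
print(report)  # {'train': [5], 'test': [6]}
```

With the real files, the index lists and the data list would presumably come from loading data/das_split.pt and processed.pt; an empty report would mean the split is compatible with the dataset size.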

> By the way, I saw under the data folder, you have three '_split.pt' files, can you please tell me the difference between them?

We have provided two splits used in our experiments in the data/ directory:

- Single-state split from Das et al., 2010: data/das_split.pt (called the Das split for compatibility with older code)
  - This split is used to fairly evaluate gRNAde for single-state design on a set of RNA structures of interest from the PDB identified by the Das et al. paper, which mainly includes riboswitches, aptamers, and ribozymes.
  - We identify the structural clusters belonging to the RNAs identified in Das et al. and add all the RNAs in these clusters to the test set (100 samples).
  - The remaining clusters are randomly added to the training and validation splits.

- Multi-state split of structurally flexible RNAs: data/structsim_split.pt
  - This split is used to test gRNAde's ability to design RNA with multiple distinct conformational states.
  - We order the structural clusters based on median intra-sequence RMSD among available structures within the cluster.
  - The top 100 samples from clusters with the highest median intra-sequence RMSD are added to the test set. The next 100 samples are added to the validation set and all remaining samples are used for training.
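
The multi-state procedure described above can be sketched roughly as follows. This is an illustrative reconstruction, not the repository's code: the function name, the input layout (a mapping from cluster id to sample indices, plus one intra-sequence RMSD value per sample), and the choice to cut at exactly the requested number of samples even if that splits a cluster are all assumptions.

```python
from statistics import median
from typing import Dict, List, Tuple


def structsim_style_split(
    clusters: Dict[int, List[int]],
    rmsd: Dict[int, float],
    n_test: int = 100,
    n_val: int = 100,
) -> Tuple[List[int], List[int], List[int]]:
    """Order clusters by median RMSD (highest first) and slice into splits."""
    # Median intra-sequence RMSD over the samples in each cluster.
    cluster_median = {
        cid: median(rmsd[i] for i in members) for cid, members in clusters.items()
    }
    # Most structurally flexible clusters come first.
    ordered = sorted(clusters, key=lambda cid: cluster_median[cid], reverse=True)
    # Flatten to a sample ordering that follows the cluster ranking.
    samples = [i for cid in ordered for i in clusters[cid]]
    test = samples[:n_test]
    val = samples[n_test:n_test + n_val]
    train = samples[n_test + n_val:]
    return train, val, test


# Toy example with 3 clusters and tiny split sizes.
clusters = {0: [0, 1], 1: [2, 3], 2: [4, 5]}
rmsd = {0: 0.5, 1: 0.7, 2: 4.0, 3: 6.0, 4: 2.0, 5: 2.2}
train, val, test = structsim_style_split(clusters, rmsd, n_test=2, n_val=2)
print(train, val, test)  # [0, 1] [4, 5] [2, 3]
```

Because the resulting splits are stored as positions in the processed data list, they are only valid for a dataset with the same contents and ordering as the one they were built from.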

Let me know if this is helpful.

chaitjo · 2024-06-04T22:46:28Z

Hi @yuehua-Song666, I recently updated the instructions for preparing the data and for reproducing our splits for benchmarking: #16

Somebody else told me that RNAsolo no longer allows downloading older versions based on date cutoffs, and I suspect the issue you were facing could be due to the same reason. If you try the new data instructions in the README, I think it should work.

Let me know how it goes!

yuehua-Song666 · 2024-06-05T03:28:57Z

Hi authors,

Thank you so much for such a useful reply! I figured it out. =)

Thanks a lot,
Yuehua

yuehua-Song666 closed this as completed Jun 5, 2024
