Process mmcif structures #27

@Tlijun821

Hello. While preprocessing data with the "Process mmcif structures" step, I found that running process_mmcif.py generates only the structures and records. However, your datamodule expects the following directory layout:
• target_dir_for_dataset_A/
    ◦ structures/
        ▪ {record_id}.npz
• tokenized_dir_for_dataset_A/
    ◦ tokens/
        ▪ {record_id}.pkl
    ◦ records/
        ▪ {record_id}.json
    ◦ manifest.json

The preprocessing step appears to be missing the part that generates the tokens files ({record_id}.pkl under tokens/). How should these files be generated?
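
To make the gap concrete, here is a minimal sketch that lists the record IDs that have a structure file but no matching tokens file. It assumes only the directory layout described in this issue; the function name `find_missing_tokens` is hypothetical and not part of the repository, and the two directory paths are placeholders.

```python
from pathlib import Path


def find_missing_tokens(target_dir, tokenized_dir):
    """Return sorted record IDs with a structures/*.npz file but no tokens/*.pkl file.

    Hypothetical helper: it only inspects file names under the layout
    described in this issue and does not load or validate any file.
    """
    target_dir, tokenized_dir = Path(target_dir), Path(tokenized_dir)
    # Record IDs are taken from the file names without their extensions.
    structure_ids = {p.stem for p in (target_dir / "structures").glob("*.npz")}
    token_ids = {p.stem for p in (tokenized_dir / "tokens").glob("*.pkl")}
    return sorted(structure_ids - token_ids)
```

Calling `find_missing_tokens("target_dir_for_dataset_A", "tokenized_dir_for_dataset_A")` after running process_mmcif.py would return every record ID if no tokens files were produced at all.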


Labels: documentation, enhancement, question
