longer compound IDs coming #1754

Open
jamesmkrieger opened this issue Sep 19, 2023 · 1 comment
Labels
Feature request Adding a feature that does not yet exist.

Comments

@jamesmkrieger
Contributor

At current growth rates, we anticipate running out of three-character Chemical Component IDs by the end of 2023. After this point, the wwPDB will issue five-character alphanumeric accession codes for CCD IDs in the OneDep system. To avoid confusion with current four-character PDB IDs, four-character codes will not be used. Owing to limitations of the legacy PDB file format, PDB entries containing the new five-character ID codes will only be distributed in PDBx/mmCIF and PDBML formats (see previous announcement).
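As a rough illustration of the length rule described above, a check along these lines could replace any hard-coded three-character assumption. This is only a sketch: the function name `classify_ccd_id` is hypothetical (it is not an existing ProDy or wwPDB API), and treating one- to three-character codes as "legacy" and uppercasing the input are assumptions, not part of the announcement.

```python
def classify_ccd_id(ccd_id: str) -> str:
    """Return 'legacy', 'extended', or 'invalid' for a CCD ID string.

    Hypothetical helper: the categories follow the announcement
    (up to three characters, or five characters, never four).
    """
    code = ccd_id.strip().upper()
    # Assumption: IDs are plain ASCII letters and digits only.
    if not code.isascii() or not code.isalnum():
        return "invalid"
    if 1 <= len(code) <= 3:
        return "legacy"
    if len(code) == 5:
        return "extended"
    # Four-character codes are not issued, to avoid clashing with PDB IDs.
    return "invalid"
```

For example, `classify_ccd_id("HOH")` gives `"legacy"`, while a four-character string such as `"1ABC"` is rejected.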

In addition, wwPDB has reserved a set of CCD IDs: 01 - 99, DRG, INH, LIG that will never be used in the PDB. These reserved codes can be used for new ligands during structure determination so that they can be identified as new upon deposition and added to the CCD during biocuration.
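The reserved set listed above could be recognised with something like the following. This is a sketch under assumptions: `RESERVED_CCD_IDS` and `is_reserved_ccd_id` are hypothetical names, and reading "01 - 99" as the zero-padded two-digit strings `01` through `99` is an interpretation of the announcement rather than something it spells out.

```python
# Assumption: "01 - 99" means the two-digit strings "01", "02", ..., "99".
RESERVED_CCD_IDS = {f"{n:02d}" for n in range(1, 100)} | {"DRG", "INH", "LIG"}


def is_reserved_ccd_id(ccd_id: str) -> bool:
    """True if the ID is one of the codes wwPDB has set aside for new ligands."""
    return ccd_id.strip().upper() in RESERVED_CCD_IDS
```

A structure that uses one of these IDs can then be flagged as containing a ligand that has not yet been added to the CCD.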

wwPDB asks users and software developers to review code to remove any current limitations on CCD ID lengths, and to enable use of PDBx/mmCIF format files. Example files with extended CCD IDs are available via GitHub to assist code revisions. Information about the PDBx/mmCIF dictionary and file format is provided at mmcif.wwpdb.org.
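One practical consequence for code review: PDBx/mmCIF rows are whitespace-delimited, so component IDs should be read as whole tokens rather than as fixed-width slices. The toy reader below illustrates the idea. It is not a full mmCIF parser (it ignores quoting, multi-line values, and multiple loops), the `SAMPLE` text is made up for illustration, and `A1LU6` is only a placeholder five-character ID.

```python
# Made-up, minimal atom_site loop for illustration only.
SAMPLE = """\
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.Cartn_x
HETATM 1 C1 A1LU6 10.000
HETATM 2 O1 HOH 12.500
"""


def comp_ids(loop_text: str) -> list[str]:
    """Collect label_comp_id values from a single simple atom_site loop."""
    headers: list[str] = []
    ids: list[str] = []
    for raw in loop_text.splitlines():
        line = raw.strip()
        if not line or line == "loop_":
            continue
        if line.startswith("_atom_site."):
            headers.append(line)
        else:
            # Split on whitespace; no assumption about the ID's width.
            fields = line.split()
            ids.append(fields[headers.index("_atom_site.label_comp_id")])
    return ids
```

Because the ID is taken as a token, three- and five-character codes are handled identically.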

For any further information please contact us at info@wwpdb.org.

jamesmkrieger added the "Feature request" label on Nov 5, 2023
@jamesmkrieger
Contributor Author

The PDB three-character Chemical Component IDs have been exhausted, and the wwPDB has begun issuing five-character alphanumeric accession codes for CCD IDs in the OneDep system. To avoid confusion with current four-character PDB IDs, four-character codes are not used. Owing to limitations of the legacy PDB file format, PDB entries containing the new five-character ID codes are distributed only in PDBx/mmCIF and PDBML formats and will not be supported by the legacy PDB file format (see previous announcement).

In addition, wwPDB has reserved a set of CCD IDs: 01 - 99, DRG, INH, LIG that will never be used in the PDB. These reserved codes can be used for new ligands during structure determination so that they can be identified as new upon deposition and added to the CCD during biocuration.

wwPDB is asking users and software developers to review their code and remove any current limitations on PDB and CCD ID lengths, and to enable use of PDBx/mmCIF format files. Example files with extended PDB and/or CCD IDs are available via GitHub to assist with code revisions.

To learn about PDBx/mmCIF, please visit https://mmcif.wwpdb.org/.

For any further information, please contact us at info@wwpdb.org.

jamesmkrieger changed the title from "longer ligand IDs coming" to "longer compound IDs coming" on Dec 13, 2023