1.0.0 (2025-08-18)

BREAKING CHANGE: update to cifutils 2.0 (#50) (77dd6fd)

Bug Fixes

3to1 (ab6b4b2)
adapt naming of regression tests to match new names (c44b387)
add 'overwrite' option to view_pymol to avoid updating existing structures (#64) (ac0f12d)
add make to apptainer (7cba23e)
add back readme (#1) (831bc23)
add back stacking msas by recycle (#2) (fbe0c32)
add conda init (2e0a0c2)
add current data to fail log for ease of analysis (da4bdf7)
add links to the ccd & pdb mirrors (430ae71)
add missing default (4f020cf)
add missing test files for local test (68f2e0a)
add missing transforms in AF3 pipeline (39a465d)
add new logo and changes of urls to public url (b753e57)
add test (80b6113)
add test cases (11dbb61)
add test coverage bit (57166f5)
add testpypi setup (7ded1bf)
add tests for fix_formal_charge, ruff (0a4072d)
adding badges (4588955)
address minor pipeline issues in af3 (3943cd7)
adjust error type on transform history tracking (a468233)
af3 parsing (#130) (37c6791)
allow remove_unsupported_chain_types to work without specified query_pn_unit_iids. Implement functional API while we're at it. (126b846)
allow AddRFTemplates to proceed when no pdb_id given (c63f10e)
Allow compatibility with newer rdkit version. (#122) (e6ecbac)
allow more general covalent bonds (1ef9858)
allow parsing entries with multiple methods (e.g. 5e5j) (28ad455)
allow passing on boolean annotations, allowing distogram bins to be a list (9253102)
allow processing to continue in the case of covalent bonds between... (88036e4)
allow saving of failed examples to error, default to a user-based failures path on scratch (c3160de)
allow unknown users for CI (fa14dda)
apptainer creation to expose /net (24b8be4)
apptainer spec (a0c3294)
arg_fixing: swap coordinates of nh1/nh2 instead of renaming when resolving ARG naming ambiguity, since otherwise charges & bond order are inconsistent (NH2 carries positive charge & double bond by convention) (#41) (8d4b0a6)
argument error (9c5daba)
atom level embeddings (#159) (ebaaf51)
automorphisms (#36) (7cd6ad2)
avoid building covalent bonds with water or crystallization aids (951a12c)
bad ligands, new test dataset (0234fab)
bonds (#125) (2b1a714)
bug fixes for inference (#46) (e5254d9)
bug in initializing chain info (7c89186)
bugfix when using get_residue_starts and general annot_start_stop_idxs, which incorrectly used len() instead of .array_length() to determine the size of an AtomArrayStack (#65) (9b2cc83)
Bugfixes in get_within_group_res_idx and get_within_poly_res_idx (#121) (4955d19)
bugs in tests (6b72a3f)
bugs in using MSAs for inference, supporting MSAs with # headers (f7c2c44)
build apptainer (ce3c4d6)
build assembly arguments (905e6b9)
by default cast aromatic bonds to same order when comparing atom arrays for graph hashes (4587d10)
cached conformers with chirals (#149) (cec9f83)
calculate rf2aa chirals off af3 centers (so they are correct) (#114) (64bfca9)
categories: keep residues not in the CCD instead of converting to UNL (#47) (6a9b0a1)
chain type miss (0099133)
chain_id to _iid in Frank's hotfix (9fe6186)
chains with all resolved tokens (886ffc3)
changing chain_iid to pn_unit_iid in AF3 features (181467e)
changing inference ligand residue names to use non-conflicting characters (641f1e6)
charges (d730b8a)
chirals (#105) (732af76)
ci (119b5fa)
ci (7720ee7)
cif files for inference (#79) (f552453)
cif: remove automatic writing of 2d categories to cif (20bbb1f)
clash enum (f2f2613)
correct bug with assume_residues_all_resolved when parsing pdb file (#22) (33838cb)
correct for bonds from nucleophilic additions (#100) (eb65472)
correct handling of dative bonds, improve timeout error handling in conformer generation (e66c03b)
correct ligand filtering expressions to deal with None/NaN values, remove superfluous transform history test assertion (35cf5fa)
correct usage of fix_formal_charges to only work on inter-residue bonds (0cf5e10)
corrected bond adding in get_structure (#84) (6d69f4f)
correctly resolve atoms to closest resolved residue in sequence (#88) (0de26f1)
create feats dict only if it doesn't exist yet (0f01c8d)
datasets (8f128f0)
dealing with sequence heterogeneity (e.g. 3nez) (79f60cb)
decouple linting of biotite and legacy cifutils (3c8c535)
default to original atom id if renaming cannot be performed (4d1efa3)
dna tests (09dcb86)
documentation (483abb9)
downgrade cluster size < len(df) assertion to warning, fix rare issue with templates having inf confidence (0d6d4d7)
dynamic template paths (#104) (bb12509)
dynamically generate cifutils version & track any committed or uncommitted changes (151fa37)
embedding dim, FilePath dataset generalization (#156) (ef867be)
empty struct_conn (3b02789)
enable datahub version extraction even when symlinked (718b023)
enable looking for alt_atom_id's in parsing struct_conn connections as well (96b06cd)
enable version extraction when cifutils is sym-linked (fa24093)
enforce correct numpy shapes (3e4b02a)
ensure /squash gets mounted in ci (e7e519c)
ensure dataloader wrappers & encoding definitions are pickle-able for use in spawn multiprocessing (a9e414e)
ensure debug path is read/writeable by others (c74fbeb)
ensure element is given as str (4529045)
ensure element is str even if atomic number is given (83ec2d3)
env token (4031d8f)
env token (350c2de)
explicitly specify NA values (4fe2121)
fallback CCD coordinates (#101) (d6c797b)
first try installing via BIOTITE_INSTALL_TOKEN (e0903af)
fix pad_dna tests (1ca0367)
fix accidental nesting (c0430e6)
fix arginine ambiguity resolving function (1b96ae1)
fix CI paths & add stage for slow tests (0cf3679)
fix conftest import (075545e)
fix file type test (4ad62b9)
fix flaky pad_dna tests (0019c2a)
fix matching of bond atom ids and names (6437d26)
fix old tests which broke due to function signature change (eb64dee)
fix operation expression parsing for rare exceptions (a398911)
fix ref_space_uid to res_id, not token_id (#97) (ae0ac04)
fix regression tests to have nan coords for unoccupied atoms (d964165)
fix remaining issues with conformer generation transform (e59a65e)
fixes for data preprocessing (#124) (ed10245)
force pip upgrade of biotite (c2b6de7)
formats, subset to keys (670736d)
further CI improvements (e9b3476)
further test fixes from refactor & logging improvements (625cfc9)
handle annotation carry over from atom_array to full_atom_array by id matching instead of via ordering. This resolves the remaining matching problems (36a1a0b)
handle CIF files with no resolved atoms (19ae934)
hydrogen addition placement in parser affecting resolving residu… (#74) (86a2973)
hydrogen policy (97efd06)
import error (801165e)
improve error message (4266666)
improve error message (b8046d7)
improvements for PadDNA (#110) (c35443f)
in case of unknown atom names, do not rename (fix to allow testing against assigning unknown atom names 0 occupancy in test_parser) (0a900cf)
include apptainer build in ci (#8) (05546e2)
include chembl smirks in package (4f48f45)
include nucleic acids when masking residues with unresolved back… (#96) (9bbc326)
increase stringency on inferred polymer bond creation. Only create bonds between AA-AA or NA-NA like residues automatically. Everything else must be defined in struct_conn (1f362f9)
inference: fixes for inference (#59) (e6a62da)
inferred sequence (c74946c)
io_utils: allow writing scalars (707ef8a)
io_utils: also allow passing pathlib.Path to read_any (7c887fb)
io_utils: do not write empty CIF categories, which otherwise causes an error (7c49f15)
io: backwards compatibility bug (2965ee3)
issues with RDKit pickling that lead to information loss for inferring atom names (4940fbe)
leaving_groups: fix leaving group computation for edge case of only hydrogens, add further tests (fd749ee)
loading entity from spoofed cif (#131) (c1b0abb)
make PadDNA optional (2dcc610)
make arginine renaming work (b3c55dd)
make cif parsing more robust to non-existent fields (0220241)
make compatible with cifutils hydrogens (#77) (68f321b)
make conformer generation timeout more lenient (79e7b08)
make IPython part optional (CI containers don't have ipython) (23c1712)
make leaving group identification insensitive to hydrogens for robustness (2e6a889)
make openbabel dependency optional by try-excepting import (c66639c)
make rdkit-dependent regression tests pass regardless of operating system (e5f0f0b)
make test import function properly with pytest without requiring module to be installed via pip (a4e8235)
make type hints in tests backward compatible with python <3.10 (5d169a4)
make typehints backwards compatible with python versions <3.10 (8fbe3ac)
metadata check in test parser (495ba44)
migrate viz_utils to cifutils (f9a9eb9)
minor bugfixes to tests (c7c214c)
minor fixes to tests (e841dfc)
minor improvements & generalisations (removing unnecessary reliance on extra_info, allowing templates to work with chain types as enums/ints/strings, ...) (1887e5d)
minor integration bugs from renaming, template masking (4f4f38c)
minor test fixes (d967d8f)
minor test issues, make ARG renaming optional to compare to legacy parser (159fb59)
minor updates for production (#158) (64ca81b)
misc bug fixes (e98792d)
missing residues get nan coords (fa877a8)
missing test PDB IDs (b0fc6f4)
molecule_iids (064a69a)
more test bugs (7d9846f)
more workers for CI (21ecddc)
MR comments, tests (383f24a)
msa bug (69ad2c8)
MSA caching (eefb05f)
MSAs with NCAA (#101) (813862f)
mse tests (a54c7dc)
multiple ligands during inference (581b23c)
multiple ligands during inference (54b840c)
name parameter in tests (360a373)
names (fdaf9b0)
naming (b6753a3)
new env token for pypi (#3) (1948c74)
non-update of AtomArrayStack and missing fields in added H atoms (#62) (65afa5d), closes #63 #64 #65
np.full (b2823ed)
only log if heavy atoms failed to match (b1659f0)
PadDNA and rdkit bugs (#124) (19f80d9)
PadDNA asserts to warnings (#102) (e9848da)
parsers and warnings (#129) (71a54a0)
pass args to parse_from_cif/pdb (2d001d2)
patch biotite array (#107) (5375829)
patch biotite's get_residue_starts function to differentiate between residues of different transformation ids (dbef1c4)
patch error where leaving groups were overwritten by the latest found group instead of accumulated (c13d795)
paths (b3c5258)
paths to biotite (91878b9)
PDBs with polymers and NPs on same chain (7f5fdf1)
peptides as polymers during inference (#82) (bc915aa)
peptides in AF3 validation splits (70ece97)
per default, set output of atom_array_from_rdkit to hetero atoms (28a432c)
permissions for caching (e3843e9)
place unresolved atoms (#86) (1af2f20)
rdkit from smiles (#127) (d5d1e3c)
rdkit: utf-8 encoding (#20) (f69f683)
re-use transform for hydrogen removal, set default hydrogen policy to 'keep' when transforming to atom arrays (#61) (28672ee)
readability improvements (fbe9d28)
refactoring to not change ground truth (e237a4a)
reintroduce _get_matching_atom for error handling (57926ef)
reintroduce masking of residues that had a heavy atom mismatch (c13c36e)
relax tolerance (dc0718c)
remaining path changes (2296fe1)
remove needs from .gitlab-ci.yml file (d112a3a)
remove automorphisms from rdkit (6272b3a)
remove close pn units column (0dea67c)
remove cuda (6b087ed)
remove custom error type wrapping due to pickling issues (a4587f8)
remove deprecated only statement (64e4ae1)
remove deprecated remove_hydrogens argument (6b90a72)
remove duplicate test (a36afad)
Remove erroneous assert statement in GenericDFParser (#120) (8f950cf)
remove legacy parser samples that do not match up anymore due to parser improvements (42588cb)
remove query seq from extra msa first row (1f43f74)
remove redundant #TODO (110bc9d)
remove spurious argument (c380e88)
remove unnecessary dataclass which causes errors downstream (58710fd)
remove unused imports, add missing imports, clean up whitespaces (e8a5254)
rename msa to msa_stack in AF-3 pipeline (2c76c09)
replace README (#1) (cbe76c0)
resolve atom_order mismatch bug (bc82059)
resolve occasional duplicate indices (e.g. 4xkw) by also specifying res_name (5ccd8f0)
revert CI (763a295)
run apptainer job on worker (c982a56)
safeguard coordinate extraction from ideal rdkit conformers in case no ideal rdkit conformer coordinates are provided in the cif (33fa08b)
samplers (#111) (ae9cf6f)
selection strings, utils (#85) (bf9157f)
semantic release as single source of truth for versions (cd467f2)
set crop_center_atom_id, ..._atom_idx and ...token_idx even when not cropping for forward compatibility with further atom-level cropping (#155) (cc39788)
set CI to fail if CI job gets killed (12f1256)
set conformer default to not use forcefield optimization (b40bc5e)
set log levels from warning to info (5c7479a)
setup testpypi (#2) (63b8e95)
simplify and fix CI (1b0d453)
skip bind/nobind tests (c1f73d8)
skip pad DNA tests (#163) (a9cd7db)
sort imports, clean up parse API (a5271c8)
specify biotite internal dtypes for matching to avoid rare cases of long chain ID's etc (2455d37)
speed up parse by 3x by vectorizing various subroutines, reducing the amount of subsetting and adding a cache hierarchy (#44) (d3b7f54)
speed up tests (8070409)
standardize heavy atom naming for matching rare cases where alt atom id's are used (fe69d4b)
subset atom id renaming to heavy atoms (8cea0eb)
support AF-3-style CIFs (#67) (947d1cd)
switch to patched get_residue_starts function to avoid rare bugs where, after cropping, two residues that only differ by transformation_id are sequential in the atom array and get misinterpreted as a single residue (593eebb)
sym center trans id type (cea2377)
template: correct usage of fix_formal_charges (89b7ab0)
template: keep hydrogens around until fixing formal charges (f774b4b)
test bug (3fbffe8)
test for updated inference with multiple ligands (a8fc1ca)
test restricting CI to merge requests onto main and non-drafts only (1a26c34)
testing speed (435ce48)
tests (193a9cd)
tests (b4116df)
tests (4e19c9a)
tests (4220698)
tests (b994834)
tests for AF3 pipeline (c93dc4c)
tests for build assembly arguments (1bb9ff2)
tests refactored transforms (de4c7b5)
tests: fix import error (99ce9b4)
transforms/base: treat edge case of 0 probabilities in RandomRoute (7cd74fd)
try/except ccd loading (9c43eec)
try/except ccd loading for inference (6f3fc99)
type-cast for older torch versions (c83eccd)
type-cast issue (dae0479)
typo (e269513)
typo (25d9fc7)
typo (ee5f811)
typo (00e1d5e)
typo fix in ruff config (c65895f)
typo in list o_O (a52193f)
unconditional support, try/except metrics (#116) (19bc1dc)
undo accidental too deep nesting (e5795c7)
unnest bucketize (ea69aeb)
unwanted ions in validation (#41) (0926ec9)
update caching to save as .pkl.gz and fix path creation (dc34984)
update ci (8fd01c6)
update ci for github (#10) (1fe9024)
update dependencies to align with rf3 (cd9e8b7)
update description (f376f09)
update deserialization checks (3101a77)
update encodings to work with AF3SequenceEncoding (a62feba)
update outdated escape pattern (6a85641)
update paths to new datahub repo (e98c490)
update pipeline regression tests (7d6f57b)
update RDKit version (8386c69)
update regression tests (0a12fca)
update regression tests to ignore hydrogens, fix typos, add debug code to regression tests (9a599e5)
update standardize heavy atom id renaming to deal with elements as integers and strings (12c807b)
update test coverage (2939f46)
update tests to reload general CIFParser object for test speed (d2be606)
update to biotite main, remove hack for NaN coordinates (#54) (48a49b3)
update to latest cifutils (fde3f3f)
update to latest cifutils apptainer (73bb693)
upgrade > force-reinstall to enforce updating even on non-version bump commits (c1629e6)
use extra info dictionary with all_pn_unit_iids (d1a8380)
validation: update to biotite main, fix msa bug, add LoadBalancedDistributedSampler (#74) (4711256)
various bug fixes / refactors (e989c3e)
various minor fixes to get pipeline profiling running again & extend to af3 (4b7547f)
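
The arg_fixing entry above (8d4b0a6) swaps the coordinates of NH1 and NH2 rather than renaming the atoms, so that the charge and bond order attached to the name NH2 stay where the convention expects them. The following is only a minimal sketch of that idea, not the project's implementation. The function name resolve_arg_ambiguity, the plain-dict data layout, and the distance test (treating NH1 as the nitrogen closer to CD) are all assumptions made for illustration.

```python
import math


def resolve_arg_ambiguity(coords):
    """Hypothetical helper: return a copy of one ARG residue's coordinates
    with NH1/NH2 positions swapped if they violate the assumed convention.

    coords maps atom names to (x, y, z) tuples. Atom names (and anything
    keyed on them, such as charges or bond orders) are left untouched.
    """
    dist_nh1 = math.dist(coords["CD"], coords["NH1"])
    dist_nh2 = math.dist(coords["CD"], coords["NH2"])

    fixed = dict(coords)
    # Assumption: NH1 should be the terminal nitrogen closer to CD.
    if dist_nh1 > dist_nh2:
        # Swap positions only; the names NH1/NH2 keep their annotations.
        fixed["NH1"], fixed["NH2"] = coords["NH2"], coords["NH1"]
    return fixed


residue = {
    "CD": (0.0, 0.0, 0.0),
    "NH1": (3.0, 0.0, 0.0),  # mislabeled: farther from CD
    "NH2": (1.0, 0.0, 0.0),
}
fixed_residue = resolve_arg_ambiguity(residue)
```

Because only the positions move, any per-atom annotation keyed by name (for example a formal charge on NH2) remains consistent after the fix, which is the motivation the entry gives.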

chore

update cifutils (7474b0d)

Features

add .bcif parsing test (951a0c2)
add category_to_dict util (9aa4c9f)
add query, mask and idxs functionality (#111) (73247d8)
add show_cartoon argument to view (#63) (f66071e)
add sum_string_arrays util to sum string arrays with dynamic dtype resizing (e185fa6)
add view_pymol functionality and improve to_cif exports (#38) (05df1c6)
add AF2 FB Distillation dataset & corresponding tests (b27cbee)
add af3 inference pipeline (d842d05)
add AF3 token level features (96af662)
add arginine renaming tests (61ae5a5)
add atomworks CLI (dc84b1b)
add autoformatting commands and make script (682119f)
add automorphisms to AF3 pipeline (45b01f9)
add capabilities to slice atom array by segments (e.g. ResIdx / ChainIdx segments) (#72) (576a16d)
add centering and principal components to atomselectionstack (#114) (9f3eb39)
add chain types for 'water', 'branched' & 'macrolide' (44ac820)
add CI apptainer building stage, improve test speed, fix minor CI bugs, add CI secrets, bump environments, add testmon & xdist pytest plugins for speeding up tests (9344fad)
add code for plotting pipeline performance (46a1344)
add confidence head processing to af3 pipeline (#56) (2d590aa), closes #41 #45 #44
add conformer generation for smiles to keep stereochemical annotation (#78) (6952251)
add constants, remove old assets (55f2266)
add convenience API for ChainType enums (0bc5dc6)
add convenience readability utils (0ea9676)
add crystallization aid & ligands to remove data (4b3248b)
add custom context for handling errors (#90) (09e9642)
add dynamical string size resizing to get minimum length (a615926)
add encoding to pipeline (a03f3d4)
add environment specification (b3edcb5)
add fixing of formal charges for atom arrays for inference (dd7fc18)
add flag to fix formal charges (a4eafae)
add full fledged AF3 Encoding (3cf571c)
add functional API for remove hydrogens (7d43400)
add functional API for spatial cropping (c07475a)
add functionality to remove crystallization aids, including a test (2efc191)
add further functional API (beeb6fb)
add further rf2aa assumption check that ensures that no individual chain can ever be entirely unresolved (as can happen e.g. with chain AB in 3rj1) (fa04418)
add geometry utils (0adf3c7)
add ground truth ref_pos through new track (#115) (2bcf1bb)
add group scatter utils (#119) (baa1739)
add hydrogens via biotite supported hydride library (#56) (9df9801)
add immutable_lru_cache, add mapping of chem_comp_types to their corresponding UNKNOWN ccd (c00bbd8)
add inference utils, add rdkit utils and clean up base cifutils (enums, constants) (#18) (5071084)
add is_same_in_segment convenience function (#91) (8e7aa7b)
add ligand of interest information (840a3b0)
add mapping of noncanonicals to canonical residues (60fddc4)
add metal elements as constants (c036af5)
add missing ChEMBL rules for fixing (b006ddf)
add MSA paths into chain info (#52) (615bba3)
add offset-slope timeout for rdkit conformer generation (a824994)
add pdb example from FB distillation set (93f3633)
add prior bugs as test cases (e15ca01)
add reference molecule feature transforms (3c66009)
add RemovePolymersWithTooFewResolvedResidues Transform to pipelines (c6b756d)
add scaffold for environment, apptainer spec, readme, add test coverage (2cc9912)
add scaffold test for fixing operations (14552a3)
add script for fast ColabFold-style MSA generation with MMseqs-GPU (#71) (06ba64b), closes #76 #95 #99
add scripts for convenient IPD specific setup, add documentation (2e86cdd)
add scripts to get the ccd & pdb mirrors and replace digs-specific paths through the corresponding mirror paths (753485c)
add standard NAs & AAs, remove old assets (9d648a7)
add support for bcif & pdb filetypes, add universal loading, improve to_cif functionalities to allow outputting arbitrary metadata (f380319)
add support for rf2aa inference pipeline (8b48008)
add test for selection utils (3dd5e4f)
add tests for geometry utils (9a428db)
add tests for timeout utils (aeb4c24)
add tests for visualize (ce6674e)
add tests for writing out cifs (292c8ad)
add the ability to randomly pad DNA (#84) (5cd4088), closes baker-laboratory/cifutils#72 #85 #87
add timeout context manager (25beacc)
add timeout decorator (72ed6a2)
add tipatom constants (9f7fe70)
add to_pdb_string tests (c638ee9)
add tools for nested dictionaries (2ccd3ad)
add tools to fix partially corrupted molecules, streamline atomarray <> rdkit interconversions (999dc39)
add transform to compute spatial k-nn masks useful for spatially local attention (#60) (23665ff)
add transform to further shrink a crop if the crop at token level would result in a crop that exceeds a specified max number of atoms (#9) (0769a8a)
add unresolved residue handling to pipeline (09cc692)
add util to compute rng hash for convenient debugging & easy random state comparisons (#98) (d2d9d63)
add utility to get RDKit conformers from res names with timeout & fallback to idealized coords (b6cc8e3)
add utils for writing cif files regardless of where they come from, enable view_pymol to visualize CIFBlock & BinaryCIFBlocks (#66) (d03df1a)
add utils to get automorphisms from rdkit (2561644)
add utils to get idxs and masks for representative tokens in AF3 (87c390a)
add utils to go directly from res_name to rdkit molecules, fix capitalization of element lookup (0c44352)
add utils to patch metals at symmetry centers (bfca0f8)
add utils to standardize atom id's to the standard atom id instead of alternative atom id (44da2a3)
add visualization utils for atom arrays (043c263)
address bad conformer id issue for molecules with many rotatable bonds (0caefe3)
af-3 validation dataset loaders initial commit (920bbfa)
allow token_starts re-use in token utils, add safeguards for ensuring each token has a representative atom (causes dataloader failures instead of model failures downstream) (3359279)
arbitrary nested datasets (11cafee)
assign stereo-chemistry when converting an atom_array to rdkit based on the coordinates (if possible). Make ccd_code_to_rdkit caching immutable. Add nan-coord utils for AtomArrays & Stacks (#77) (4992c7e)
atom-level embeddings (#151) (58f10c3)
AtomArrayPlus, AtomArrayPlusStack (#109) (7c6622e)
bump biotite version, fix CI, speed up test collection, clean up pyproject.toml (96e8dc3)
bump ruff version (41a9b6f)
caching stores parameters (#45) (ec714ee)
chiral center processing bugfix (#103) (631ab8e)
ci improvements for post-merge pipeline, adding auto-coverage and releases (5fe345d)
convert_af3_model_output_to_atom_array (5d517b5)
database utils for bind/no-bind project (#154) (2f298eb)
disentangle AF3 token representative and token center definitions, bump cifutils version (b61f466)
doc updates (c5772d5)
docs and release setup (#126) (357a681)
enable parallel tests to worksteal (f7f96ca)
enable reading 'all' extra_fields in parser (#76) (a5e70bc)
expose altloc specification during loading, fix saving of altloc id when none specified, to avoid biotite parsing issues (7214edd)
expose datetime and user when logging failed examples, only log per default if user is given (c39ef16)
expose option to choose RDKit conformer generation method (e0760bd)
expose pdb id (640010a)
extract RF2AA assumptions check into its own Transform, generalize PDBDataset (3285be0)
featurization of unresolved residues to avoid distribution shifts (e78da27)
final splits (7e9b2b6)
first attempt at github ci (#2) (a64bfc0)
from_pymol_str for AtomSelection (0a66dad)
generalized-preprocessing (#44) (f20c85c)
ground truth reference conformer (#83) (c5f9a89)
gzip cifs by default (#53) (4fdd6e8)
implement to_pdb methods (8728781)
implement WorkStealDataLoader to de-bottleneck dataloading when parsing from files with highly variable runtimes (bcf5868)
implement AF3 template featurization and harmonize RF2AA & AF3 names (07c5867)
implement automatic semantic versioning release (ae2d775)
implement automorphism features for AF3 (f1bf992)
implement building multiple assemblies (e591844)
implement fixing formal charges after bond formation (e06e39d)
implement MSE to MET conversion (cf8c0d6)
implement proper passing on of random seeds to RDKit (so random… (#93) (92f5cc8)
implement resolving of ARG naming ambiguities (391fbff)
implement standard to alternate atom id translation (37cd4e7)
implement test for bioassembly building (5846d2e)
implement user error when trying to save files with ambiguous bond information (498ad64)
improve timeout error messages (cf945c3)
improve base transforms by avoiding error masking by TransformPipelineError, implement + operator for transforms, implement basic ApplyFunction transform (f8eb1e0)
improve ci with auto-coverage tests (9bd6fe8)
improve ruff rules & add option to configure number of cores to run pytest on (7a9589f)
include an apptainer build stage in CI (2e5b087)
include atom order tests with PDBs that violate atom ordering (af587fb)
inference bugs (#63) (ee49a31)
inference like AF3 (#104) (526ae95)
initial split notebook (3e65ed5)
integrate pdb parsing into CIF Parser (a26451d)
integrate templates into AF3 pipeline (3dcd50e)
integrate timeouts in rdkit conformer generation (9310df1)
interface splits (ccddbcf)
io_utils,visualize: add bcif output capabilities, generalize view_pymol (#60) (4216f16)
load cached reference conformers (#145) (257845c)
mask residues with unresolved backbone atoms (84ba100)
md5 hash for hash_atom_array (ec4bd85)
migrate setup.py instructions into pyproject.toml (b530685)
more robust incrementing of chain ids (4199bfd)
move init arguments to parse, integrate ARG ambiguity resolving (6069770)
MSAs from multiple directories (015a75f)
mse_met conversion test improvements, tightening of typing (9efca8a)
NCAA for inference (#86) (c0c6977)
nested datasets bug fixes (48f36ef)
networkx automorphisms, no tests (cedf164)
parse AtomArrays directly, introduce PDBOrCIFFileComponents (#94) (0ff8a0a)
patch symmetry centers (875a44c)
peptide sampling (127bbc9)
pH info metadata (#122) (9e8fea4)
pipe: add flag to return atom array from af3 pipeline (#76) (a79abe3)
random remove ligands (#144) (775853b)
remove polymer chains with too few resolved residues; refactor filters (5a93d8f)
script to count AF3 tokens initial draft (dbdd489)
selection: add n_body='all' option to get_annotation_categories() (#116) (da5f09f)
separate peptides (be894b4)
SequenceSelection utils (#80) (406cb87)
set up CI (774f271)
specify covalent bonds during inference (#27) (e815a67)
standardize atom ordering within each residue to CCD order (as some PDBs have incorrect ordering) (44c8d48)
start adding AF3 pipeline (47e8355)
subsample templates and rotate conformers (#35) (3c65428)
support buffers in parse (#48) (68406fa)
support models as af3 outputs (#39) (7d269df)
support MSAs during inference (6643f68)
support UNL for inference (#54) (ef8e75d)
switch from black,isort,autoflake > ruff (d2038b6)
switch from internal biotite to public biotite 1.1 (98eae71)
take first chiral subordering (#125) (c383977)
template: implement template creation and matching (9e04e5e)
update AF3 pipeline to include reference features (660ee92)
update residue library creation from CCD to enable idealized coordinates... (9238b6f)
update scripts to count tokens (37e5ab8)
update timeout utils to support both signal & subprocessing based timeouts (subprocess strategy needed for RDKit timeouts) (f651ba4)
update to local biotite installation (5d3ba27)
update to public biotite (v1.1.0) and make apptainer digs independent (#55) (ddfe62f)
updates from 2d conditioning (#139) (3f1b34a), closes #142
upgrade CI to use git clone instead of apptainer rebuilding, fix broken chirals (#147) (41b6c80)
upweight LOI (92df8e2)
use element for atom name char of atomized tokens (#67) (e35e2b0)
user-friendly MSA generation entrypoint (#138) (d55b9c4)
versioning: Include automatic versioning of datahub version, in… (#59) (a8636ad)
visualize: add slot capabilities for view_pymol (#115) (06f6c16)
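
Several entries above mention an immutable_lru_cache (c00bbd8) and making ccd_code_to_rdkit caching immutable (#77). The changelog does not say how this is done, so the following is only a sketch of one common approach: wrap functools.lru_cache and hand out a deep copy on every call, so callers cannot mutate the cached object. The decorator body and the load_template example function are assumptions for illustration, not the project's code.

```python
import copy
import functools


def immutable_lru_cache(maxsize=128):
    """Hypothetical sketch: an LRU cache whose results cannot be mutated
    in place by callers, because each call returns a deep copy."""

    def decorator(func):
        cached = functools.lru_cache(maxsize=maxsize)(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Copy on the way out; the cached original stays pristine.
            return copy.deepcopy(cached(*args, **kwargs))

        # Expose the usual lru_cache introspection helpers.
        wrapper.cache_info = cached.cache_info
        wrapper.cache_clear = cached.cache_clear
        return wrapper

    return decorator


@immutable_lru_cache(maxsize=8)
def load_template(res_name):
    # Stand-in for an expensive lookup (illustrative only).
    return {"res_name": res_name, "atoms": ["N", "CA", "C", "O"]}


first = load_template("GLY")
first["atoms"].append("XX")      # mutate the returned copy
second = load_template("GLY")    # served from cache, unaffected
```

The trade-off is a copy per call, which is usually cheap relative to the cached computation but worth measuring for large objects.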

Performance Improvements

significantly speed up representative coordinate fetching (9326e0c)
vectorize custom inter-residue bond removal (3d8a6ab)

BREAKING CHANGES

Renamed utils files in cifutils
feat: add ipd setup utils, fix missing numpy import
fix(ci): add missing files for CI
chore(ci): group lint & test together again to avoid buggy github ci
feat: add af2 distillation dataset path for ipd setup
fix: combine fast & slow tests in CI, expose number of CPU cores to run on
fix(ci): update time limits
chore: clean up ci & add documentation
chore(ci): robustness improvements in setup.sh
refactor: implement src-architecture for datahub repo
The old export paths to datahub no longer work; /src must be appended to them
chore: cleanup for release
chore: format
fix: preprocessing pipelines
chore: update validation scripts, new test dataframes
chore: update validation notebook
fix: tests broken by using new test datasets
chore: bump cifutils version
update cifutils to version 1.1.0, which includes a significant refactor and breaking changes
Switch to updated cifutils
