All notable changes to this project will be documented in this file.

v0.5.1 (2023-08-15)


  • Edit default openmp num threads behavior (use OMP_NUM_THREADS when set). #404 @lbluque
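
A minimal sketch of the kind of thread-count resolution described in the entry above, assuming the rule is "an explicit request wins, then OMP_NUM_THREADS if set, then a machine default". The helper name `resolve_num_threads` and the fallback to `os.cpu_count()` are assumptions made for illustration; they are not taken from the smol source.

```python
import os


def resolve_num_threads(requested=None):
    """Pick a thread count: explicit request, then OMP_NUM_THREADS, then CPU count.

    Hypothetical helper for illustration only; not part of the smol API.
    """
    if requested is not None:
        return max(1, int(requested))
    env_value = os.environ.get("OMP_NUM_THREADS", "").strip()
    # Only a plain positive integer is honored; anything else is ignored.
    if env_value.isdigit() and int(env_value) > 0:
        return int(env_value)
    # Fall back to the number of CPUs the interpreter can see (at least 1).
    return os.cpu_count() or 1
```

With `OMP_NUM_THREADS=4` exported in the shell, `resolve_num_threads()` would return 4 under these assumptions, while `resolve_num_threads(2)` would still return 2.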

v0.5.0 (2023-08-15)


  • SQSGenerator.from_processors to create a generator from a list of processors. #398 @lbluque
  • Periodic ground state solver (upper-bound problem formulation) #343 @qchempku2017

v0.4.1 (2023-07-21)


  • Fix #386 by making cython extension classes pickleable. #387 @lbluque
  • Fix #385 data type issues on Windows. #392 @qchempku2017

v0.4.0 (2023-06-09)


  • openmp parallelization when computing correlation and cluster interaction vectors. #338 @lbluque
  • stochastic SQS generation functionality. #344 @lbluque
  • enumeration of symmetrically distinct supercell matrices. #344 @lbluque


  • Fix #334 getting subclasses of non-abstract classes. #335 @lbluque
  • Fix #353 appending in Sublattice.restricted_sites property. #355 @qchempku2017
  • Use jsanitize to serialize dicts/lists of msonables. #354 @lbluque
  • Use site_mappings in StructureWrangler.process_entry. #363 @lbluque

v0.3.1 (2023-02-07)


  • Save ensemble as attribute in SampleContainer and add get_sampled_structures. #326 @lbluque


  • ClusterExpansion.cluster_interaction_tensors as a cached property and reset when pruning. #330 @lbluque
  • Fix flakey unit tests. #328 @lbluque & @qchempku2017

v0.3.0 (2023-01-13)


  • Cluster decomposition analysis and sampling functionality. #306 @lbluque


  • Keep ensemble as attribute in MCKernels. #304 @lbluque
  • Change default processor into ClusterDecompositionProcessor when initializing Ensemble. #309 @qchempku2017
  • Use -mcpu=native compile option to build successfully on newer macs. #310 @lbluque


  • Add polytope and cvxpy to test requirements. #304 @lbluque

v0.2.0 (2022-12-11)


  • WangLandau kernel class for density of states sampling. #294 @qchempku2017 & @lbluque
  • Metadata class to record specifications of MC simulations. #297 @lbluque


  • Remove size key when all orbits of a certain size have been removed. #292 @lbluque

v0.1.0 (2022-10-20)


  • Charge neutral semi grand canonical sampling. #271 @qchempku2017
  • MultiStep and Composite mcushers for more flexible sampling. #279 @lbluque

v0.0.7 (2022-09-26)


  • Additional tests to ensure correlation vectors are consistent across equivalent supercell matrices. #262 @qchempku2017
  • Unit-test updates. #269 @lbluque


  • Improved orbit alias detection. #262 @qchempku2017

v0.0.6 (2022-09-02)

⚠️ This version introduced updates that change the order in which orbits are sorted in a ClusterSubspace. This means that the order in which correlation functions appear in a correlation vector will be different when generating ClusterSubspaces compared to previous versions. However, a ClusterSubspace loaded from a JSON file that was created with a previous version will still have its original order.


  • Include number of corr functions when sorting orbits. #256 @lbluque
  • Use max distance of centroid to sites in unit cell in cluster search. #256 @lbluque


  • Fixed search of clusters by correctly using centroid of unit cell. #255 @kamronald


  • Removed CanonicalEnsemble and SemigrandEnsemble. #257 @lbluque

v0.0.5 (2022-08-10)


  • Data centering example notebook. #238 @kamronald


  • Single sampler multiple kernels. #245 @qchempku2017


  • Fix returning all sub_orbit_mappings. #249 @lbluque

v0.0.4 (2022-06-23)


  • Allow streaming to h5 in simulated annealing. #216 @lbluque


  • Fix recording sampled traces for nwalkers > 1. #219 @lbluque
  • Fix minor error in ClusterSubspace.str #226 @lbluque

v0.0.3 (2022-06-03)


  • Developing section of docs. #215 @lbluque
  • Single Ensemble class for canonical and semi-grand canonical sampling. #210 @lbluque


  • Package name properly smol instead of statmech-on-lattices.


  • Fixed #213 metadata serialization for saving of SampleContainers. #214 @lbluque


  • SemiGrandEnsemble and CanonicalEnsemble. Use Ensemble with or without setting chemical potentials instead. #210 @lbluque

v0.0.2 (2022-05-22)


  • version dunder with pypi project rename.
  • use of np.random.default_rng for reproducibility. #206 (lbluque)
  • Fix passing seed explicitly in Sampler.from_ensemble

v0.0.1 (2022-04-26)


  • Method to detect and identify orbit degeneracies based on supercell shape. #184 (kamronald)
  • Automatic github release.
  • PyPi install as statmech-on-lattices (arghhh)


  • Moved cython code for computing correlations to smol/correlations.pyx and imports as smol.correlations #190 (lbluque)


  • Fix importing numpy in

v0.0.0 (2022-04-13)


  • Cluster as pymatgen.SiteCollection, str and repr methods for Cluster, Orbit, ClusterSubspace and ClusterExpansion akin to pymatgen, and functionality to render Clusters with crystal-toolkit. #181 (lbluque)
  • Sublattice splitting. #179 (qchempku2017)
  • StructureWrangler.get_similarity_matrix to get similarity fractions between correlation vectors of training set. #153 (kamronald)
  • ClusterSubspace with no point terms using {1: None}. #158 (lbluque)
  • MCBias implementation for biased sampling, Trace objects for general state saving during sampling. #154 (lbluque)
  • Active and inactive sublattices for MC sampling. #152 (lbluque)
  • SamplerContainer.to_hdf5 to save MC sample containers #151 (lbluque)
  • PottsSubspace class to generate redundant frame expansions. #146 (lbluque)
  • Methods is_suborbit and sub_orbit_mappings in Orbit and related function_hierarchy and orbit_hierarchy in ClusterSubspace. #141 (lbluque)
  • UniformlyRandomKernel for high temperature/random limit sampling. ThermalKernel ABC class for all temperature based MC Kernels. #134 (lbluque)
  • structure selection functions. #133 (lbluque)
  • RegressionData dataclass to save regression details in ClusterExpansions #132 (lbluque)
  • rotate method in SiteBasis class. #130 (lbluque)


  • StructureWrangler based on pymatgen ComputedStructureEntry. #189 (lbluque)
  • unittests for smol.cofe using pytest. #159 (lbluque)
  • New, faster and cleaner implementations of corr_from_occupancy and delta_corr. Renamed CEProcessor to ClusterExpansionProcessor. #156 (lbluque)
  • Dropped "er" endings for MCUsher names. Renamed MuSemigrandEnsemble to SemigrandEnsemble. #154 (lbluque)
  • Changed ClusterSubspace.supercell_orbit_mappings to only include cluster site indices. #145 (lbluque)
  • Enable setting cluster cutoffs for duplicate searching. #142 (lbluque)
  • Methods orbits_from_cutoffs and function_inds_from_cutoffs now allow a dictionary as input to pick out orbits with different cluster diameter cutoffs. #135 (lbluque)


  • Allow Ewald only MC. #141 (kamronald)
  • Fix 141 corrected implementation of correlation function hierarchy. #141 (lbluque)
  • Fix 129 saving bit_combos in Orbit.as_dict when pruning has been done. #130 (qchempku2017)
  • Fix orbit generation to play nicely with changes in pymatgen Structure.sites_in_sphere return value. #125 (lbluque)
  • Fix cluster searching issue #104 when generating orbits from cutoffs. #138 (qchempku2017)


  • optimize_indicator in ClusterExpansionProcessor and corresponding cython function. #156 (lbluque)
  • FuSemiGrandEnsemble now FugacityBias. #154 (lbluque)
  • Numerical conversion of coefficients between bases ClusterExpansion.convert_coefs #149 (lbluque)

alpha1.0.1 (2021-03-03)


  • Method in StructureWrangler to get structure matching duplicates #122 (lbluque)
  • Include tolerance when detecting duplicate correlation vectors. #121 (lbluque)
  • Convenience method to get feature matrix orbit ranks. #117 (lbluque)
  • bit combo hierarchy in ClusterSubspace for fitting hierarchy constraints. #106 (qchempku2017)
  • data indices in StructureWrangler to keep track of training/test splits, duplicate sets, etc. #108 (lbluque)
  • ClusterSubspace.cutoffs property to obtain tight cutoffs of included orbits. #108 (lbluque)
  • Added properties to get orbit and ordering multiplicities of corr functions. #102 (lbluque)
    • function_ordering_multiplicities, function_total_multiplicities
  • Added helpful methods/properties to ClusterSubspace to get corr function indices based on cluster diameter cutoffs and/or cluster sizes. #102 (lbluque)
    • orbits_by_cutoffs, function_inds_by_cutoffs, function_inds_by_size.


  • Allow using external term values when detecting duplicate corr vectors. #124 (lbluque)
  • Warn instead of printing when structure matching fails. #124 (lbluque)
  • filter functions in smol.wrangling replaced with functions returning indices corresponding to structures to keep. These indices can be saved with StructureWrangler.add_data_indices. #102 (lbluque)
  • Cleanup of sites, active sites and restricted sites in Sublattice #95 (juliayang)
  • Make feature matrix optional when constructing a ClusterExpansion. #102 (lbluque)
  • Renamed ncorr_functions_per_orbit -> num_functions_per_orbit in ClusterSubspace and ClusterExpansion.convert_eci -> ClusterExpansion.convert_coefs #102 (lbluque)
  • Changed orthonormalization of site basis to use np.linalg.qr. #102 (lbluque)
  • Changed cython corr functions to reduce Python interaction in loop (~1.5x faster) #102 (lbluque)


  • Raise error in StructureWrangler.append_data_items when item properties are missing keys already included. #117 (lbluque)
  • Correctly recreate coefs in CompositeProcessor.from_dict #116 (lbluque)
  • Disallow setting chemical potentials/fugacities with duplicate string/species in dictionary. #114 (lbluque)
  • Fixed loading ClusterSubspace with polynomial basis from dict. #112 (lbluque)
  • Fixed Sublattice serialization, saving/loading SiteSpaces. #96 (lbluque)
  • Fix json serialization when saving ClusterSubspaces with orthonormal basis sets. #90 (lbluque)

alpha1.0.0 (2020-10-27)


  • Completely new smol.moca module. The design is generally set up as follows:
    • Processor classes used to compute features, properties and their local changes from site flips for fixed supercell sizes.
      • ClusterExpansionProcessor to handle cluster expansions.
      • EwaldProcessor to handle Ewald electrostatic energy.
      • CompositeProcessor to mix energy models. Currently only the ones above.
    • Ensemble classes to represent the corresponding statistical ensemble (probability space). These classes no longer run Monte Carlo; they only compute the corresponding relative Boltzmann probabilities.
      • CanonicalEnsemble for fixed compositions.
      • MuSemigrandEnsemble for fixed chemical potentials.
      • FuSemigrandEnsemble for fixed fugacity fractions.
    • Sublattice class to encapsulate previous implementation using dictionaries.
    • Sampler class to run MCMC trajectories based on the given kernel using a specific ensemble. #80 (lbluque)
    • MCKernel classes used to implement specific MCMC algorithms. Metropolis is currently the only kernel implemented; it runs a single-site Metropolis random walk.
    • MCUsher classes to handle specific MCMC step proposals (e.g. single swaps to preserve composition, single flips, single constrained flips, multisite flips, local flips, etc.).
    • SampleContainer class to hold MCMC samples and pertinent information for post-processing and analysis (improvement on previous implementation using lists).
  • smol.moca unit-tests all now using pytest instead of unittest.
  • Vacancy class, inherits from pymatgen.DummySpecie.
  • SiteSpace class to encapsulate prior site space implementation using OrderedDicts.
  • get_species function to mimic get_el_sp from pymatgen but correctly handle Vacancy.
  • MCMC sample streaming functionality using hdf5 files. #84 (lbluque)
  • Initial Sphinx code documentation, currently hosted here.
  • StructureWrangler warning when adding structures with duplicate correlation vectors. #85 (lbluque)
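
The single-site Metropolis random walk mentioned in the smol.moca entries above can be illustrated with a toy model. The sketch below uses a periodic 1D Ising chain and only the Python standard library. It does not use any smol classes, and the names `chain_energy` and `metropolis_walk` are hypothetical. It only mirrors the division of labor described above: a proposal step (pick a site to flip), a local energy change, and an acceptance test based on the relative Boltzmann probability.

```python
import math
import random


def chain_energy(spins, coupling=1.0):
    """Energy of a periodic 1D Ising chain: -J * sum_i s_i * s_{i+1}."""
    n = len(spins)
    return -coupling * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def metropolis_walk(spins, temperature, n_steps, coupling=1.0, seed=None):
    """Run single-site Metropolis flips; return final spins and energy trace.

    Assumes a chain of at least 3 sites and k_B = 1.
    """
    rng = random.Random(seed)
    spins = list(spins)
    n = len(spins)
    energy = chain_energy(spins, coupling)
    trace = []
    for _ in range(n_steps):
        # Proposal: pick one site uniformly at random and consider flipping it.
        site = rng.randrange(n)
        # Local energy change: only the two bonds touching the site change.
        neighbors = spins[site - 1] + spins[(site + 1) % n]
        delta = 2.0 * coupling * spins[site] * neighbors
        # Accept with probability min(1, exp(-delta / T)).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            spins[site] = -spins[site]
            energy += delta
        trace.append(energy)
    return spins, trace
```

For example, `metropolis_walk([1, -1] * 4, temperature=1.5, n_steps=200, seed=0)` returns the final configuration and the energy after each step. Computing the change from a flip locally instead of re-evaluating the full energy is the same idea the Processor entries describe for site flips.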


  • Ensemble classes are temperature agnostic. Temperature is set when sampling. #83 (lbluque)
  • Refactored smol.cofe.configspace ->
  • A few method name changes in ClusterSubspace to be more precise and appropriate. Most notably, the from_radii classmethod is now from_cutoffs (since the distances used, the maximum distance between two points, are more like a diameter than a radius).
  • filtering functions no longer methods in StructureWrangler, now defined as functions in cofe.wrangling.filter. #85 (lbluque)
  • Species in site spaces and occupancy strings are now pymatgen Specie or inherited classes instead of string names. (This is to allow keeping additional properties for species, such as oxidation state, magnetization, the sky is the limit.)
  • Single StandardBasis site basis class that is constructed using a basis function iterator for specific basis sets.
  • Example notebooks updated accordingly.


  • smol.learn and all regression estimators have been removed.


  • Proper calculation of ECIs in ClusterExpansion using both random crystallographic symmetry multiplicity and function decoration multiplicity. (credits to qchempku2017 for pointing this out.)
  • Fixed MSONable serialization of cluster subspaces with orthonormal basis sets by making StandardBasis MSONable and saving corresponding arrays. #90

alpha0.0.0 (2020-10-08)

Initial relatively stable version of the code.