A MoleculeSet (MSet) is an object that holds a collection of geometries of a single molecule. MSets can be saved to and loaded from JSON.

In [1]:
from tensorchem.dataset.molecule import MoleculeSet as MSet

In [2]:
mset = MSet()
mset.filename = "../data/17940.mset"
mset.load()
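
The on-disk layout of an .mset file is not shown in this notebook. Purely to illustrate what a JSON round trip looks like, here is a minimal sketch using Python's standard json module. The "atoms" and "geometries" keys and the three-atom coordinates are assumptions made up for this example, not the actual tensorchem schema.

```python
import json
import os
import tempfile

# Hypothetical layout, NOT the real .mset schema: one shared list of
# atomic numbers plus one coordinate list per geometry.
payload = {
    "atoms": [8, 1, 1],
    "geometries": [
        {
            "xyz": [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]],
            "labels": {},
        },
    ],
}

path = os.path.join(tempfile.mkdtemp(), "example.json")

# Save the structure to disk as JSON.
with open(path, "w") as handle:
    json.dump(payload, handle)

# Load it back and compare with what was written.
with open(path) as handle:
    restored = json.load(handle)

print(restored == payload)
```

Keeping the atomic numbers in one shared list, as assumed here, avoids repeating them for every geometry, but the real file may be organised differently.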

An MSet contains a list of internal atom objects. A minimal atom object has only its atomic number (at_num) set; its coordinates (xyz) are None and its labels are empty. An MSet generally holds one such list of minimal atoms, and each geometry is built by filling in the coordinates and properties of those atoms.

In [3]:
print(type(mset.atoms))
print(type(mset.atoms[0]))
print([atom.at_num for atom in mset.atoms])
mset.atoms[0].__dict__

<class 'list'>
<class 'tensorchem.dataset.molecule.Atom'>
[8, 8, 8, 7, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]


{'at_num': 8, 'xyz': None, 'labels': {}}
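
To make the idea of a minimal atom concrete, the sketch below defines a stand-in record with the same three fields seen in the output above and shows one way coordinates might be filled in to form a geometry. SketchAtom and fill_coordinates are hypothetical names invented for this illustration; they are not part of tensorchem, and the coordinates are illustrative values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchAtom:
    """Hypothetical stand-in for an atom record; not the tensorchem class."""
    at_num: int                          # atomic number, always present
    xyz: Optional[List[float]] = None    # set only once the atom is in a geometry
    labels: Dict[str, object] = field(default_factory=dict)


def fill_coordinates(template, coords):
    """Return new atoms that copy the template's atomic numbers and add coordinates."""
    if len(template) != len(coords):
        raise ValueError("need exactly one coordinate triple per atom")
    return [SketchAtom(atom.at_num, list(c)) for atom, c in zip(template, coords)]


# Minimal atoms for a three-atom molecule: only at_num is set.
template = [SketchAtom(8), SketchAtom(1), SketchAtom(1)]

# Build one geometry by attaching coordinates; the template is left untouched.
geometry = fill_coordinates(
    template,
    [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]],
)

print(vars(template[0]))
print(geometry[1].at_num, geometry[1].xyz)
```

Returning new atom objects rather than mutating the template means many geometries can be derived from the same list of minimal atoms.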

MSets typically contain a list of geometries. Each geometry represents the same molecule but has its own coordinates and labels.

In [5]:
print(len(mset))
print(mset[0])
print(mset[0].labels)

41
38

8     0.6799850303     -1.0162504038     0.6944058435
8     -5.3341583476     0.5673792481     0.3917185381
8     -4.9015001071     0.1510446691     -1.6678625395
7     4.1469728392     0.2133908322     -0.1357977544
6     3.0679719814     -0.5753224815     0.5563754881
6     5.5273723788     -0.3290186453     0.0894755463
6     4.0400095946     1.7039492424     0.0437890492
6     1.6165309479     -0.2011373522     0.1829018674
6     5.8045882601     -1.7876072442     -0.3438279318
6     4.9454721244     2.6297031134     -0.8000466618
6     -0.5598646186     -0.8113910714     0.5115651699
6     -1.2591432098     0.132925334     1.2901799132
6     -1.2136451174     -1.4614225848     -0.5510083057
6     -3.2377245291     -0.1187515254     -0.1483290027
6     -2.5819991332     0.477359111     0.9561932082
6     -2.5309808772     -1.1047647594     -0.8832060351
6     -4.5382301371     0.2227726866     -0.4990407736
6     -6.6670481767     0.6340892126     0.4142532697
1     3.182617
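
The printed geometry above appears to follow an XYZ-like layout: the atom count, an empty line, then one line per atom with its atomic number and coordinates. A minimal sketch of producing text in that shape follows. The function format_xyz_like is a hypothetical helper written for this example, and it is not claimed to match how tensorchem formats a geometry internally.

```python
def format_xyz_like(at_nums, coords, sep="     "):
    """Hypothetical helper: atom count, an empty line, then one atom per line."""
    if len(at_nums) != len(coords):
        raise ValueError("need exactly one coordinate triple per atom")
    lines = [str(len(at_nums)), ""]
    for num, (x, y, z) in zip(at_nums, coords):
        # Join the atomic number and the three coordinates with the separator.
        lines.append(sep.join([str(num), str(x), str(y), str(z)]))
    return "\n".join(lines)


# Two atoms taken from the geometry printed above.
text = format_xyz_like(
    [8, 7],
    [
        [0.6799850303, -1.0162504038, 0.6944058435],
        [4.1469728392, 0.2133908322, -0.1357977544],
    ],
)
print(text)
```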

A geometry is itself a list of atom objects, but within a geometry each atom has a set of coordinates.

In [11]:
geom = mset[0]
for atom in geom.atoms:
    print(atom.at_num, atom.xyz)

8 [0.6799850303, -1.0162504038, 0.6944058435]
8 [-5.3341583476, 0.5673792481, 0.3917185381]
8 [-4.9015001071, 0.1510446691, -1.6678625395]
7 [4.1469728392, 0.2133908322, -0.1357977544]
6 [3.0679719814, -0.5753224815, 0.5563754881]
6 [5.5273723788, -0.3290186453, 0.0894755463]
6 [4.0400095946, 1.7039492424, 0.0437890492]
6 [1.6165309479, -0.2011373522, 0.1829018674]
6 [5.8045882601, -1.7876072442, -0.3438279318]
6 [4.9454721244, 2.6297031134, -0.8000466618]
6 [-0.5598646186, -0.8113910714, 0.5115651699]
6 [-1.2591432098, 0.132925334, 1.2901799132]
6 [-1.2136451174, -1.4614225848, -0.5510083057]
6 [-3.2377245291, -0.1187515254, -0.1483290027]
6 [-2.5819991332, 0.477359111, 0.9561932082]
6 [-2.5309808772, -1.1047647594, -0.8832060351]
6 [-4.5382301371, 0.2227726866, -0.4990407736]
6 [-6.6670481767, 0.6340892126, 0.4142532697]
1 [3.1826173234, -0.4332432478, 1.6106858785]
1 [3.1981814311, -1.5881449243, 0.2367876313]
1 [5.7298457864, -0.2628778468, 1.1380568584]
1 [6.1578905611, 0.26490423