A PHP 8.4+ package that parses Chemical Structure Markup Language (CSML) documents and renders them as SVG images.
CSML is an XML-based markup language for describing molecular structures with full topological, geometric, and repeating-unit information. This package provides the reference PHP implementation for parsing and visualizing CSML documents.
composer require chemstrucml/csml-parser

- PHP 8.4 or higher
- ext-simplexml
- ext-dom
- ext-libxml
use ChemStrucML\Csml\CsmlRenderer;
$renderer = new CsmlRenderer();
// Render from a file
$svg = $renderer->renderFile('benzene.csml');
// Render from a string
$svg = $renderer->renderString('<csml version="0.1">
<molecule id="water" name="Water">
<atom id="O1" element="O" />
<atom id="H1" element="H" />
<atom id="H2" element="H" />
<bond from="O1" to="H1" order="single" />
<bond from="O1" to="H2" order="single" />
</molecule>
</csml>');
// Save to file
file_put_contents('molecule.svg', $svg);

use ChemStrucML\Csml\CsmlRenderer;
$renderer = new CsmlRenderer();
// Parse to inspect the document model
// ($csmlString holds CSML markup, here a benzene document)
$document = $renderer->parse($csmlString);
echo $document->version; // "0.1"
echo $document->molecules[0]->name; // "Benzene"
echo count($document->molecules[0]->atoms); // 6
// Render to SVG
$svg = $renderer->render($document);

use ChemStrucML\Csml\CsmlRenderer;
use ChemStrucML\Csml\Config\RenderConfig;
$config = new RenderConfig(
bondLength: 80.0, // Bond length in SVG pixels (default: 60.0)
bondWidth: 2.0, // Stroke width (default: 1.5)
fontSize: 16.0, // Atom label font size (default: 14.0)
fontFamily: 'monospace', // Font family (default: 'Arial, Helvetica, sans-serif')
padding: 30.0, // SVG padding (default: 20.0)
showAllCarbons: false, // Show carbon labels in skeletal mode (default: false)
showImplicitHydrogens: true, // Show implicit H labels (default: true)
useColoredAtoms: true, // CPK coloring for atoms (default: true)
backgroundColor: 'white', // SVG background (default: 'transparent')
);
$renderer = new CsmlRenderer(config: $config);
$svg = $renderer->renderFile('molecule.csml');

use ChemStrucML\Csml\Parser\XmlParser;
$parser = new XmlParser();
$document = $parser->parseFile('molecule.csml');
foreach ($document->molecules as $molecule) {
echo "Molecule: {$molecule->name}\n";
echo " Atoms: " . count($molecule->atoms) . "\n";
echo " Bonds: " . count($molecule->bonds) . "\n";
echo " Rings: " . count($molecule->rings) . "\n";
}

| Feature | Elements | Status |
|---|---|---|
| Atoms | `<atom>`, `<atom-list>` | Supported |
| Bonds | `<bond>`, `<bond-chain>` | Supported |
| Ring systems | `<ring>`, `<fused-ring>` | Supported |
| Groups & fragments | `<group>`, `<group-ref>`, `<anchor>`, `<attach>` | Supported |
| Repeat units | `<repeat>`, `<connector>`, `<cap>` | Supported |
| Branching | `<branch>` | Supported |
| Copolymers | `<copolymer>` | Supported |
| Coordinates | `<coordinates>`, `<point>` | Supported |
| Metadata | `<meta>` | Supported |
| Implicit hydrogens | `implicit-h="auto"` | Supported |
| Stereochemistry | `chirality`, `stereo` attributes | Parsed |
| Geometry constraints | `<angle>`, `<torsion>`, `<length>` | Parsed |
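
The `<atom-list>` and `<bond-chain>` shorthands can be read as compact forms of repeated `<atom>` and `<bond>` elements. The following is a minimal sketch of that expansion using plain SimpleXML, not the package's own parser. The helper names `expandAtomLists` and `expandBonds` are hypothetical, and the rule that atom ids are formed as `prefix` followed by the index is an assumption inferred from the examples in this README.

```php
<?php
declare(strict_types=1);

/**
 * Expand <atom-list> shorthand and explicit <atom> elements into
 * an [id => element] map. Assumes ids are built as prefix . index.
 */
function expandAtomLists(SimpleXMLElement $molecule): array
{
    $atoms = [];
    foreach ($molecule->{'atom-list'} as $list) {
        $prefix = (string) $list['prefix'];
        $element = (string) $list['element'];
        for ($i = (int) $list['from']; $i <= (int) $list['to']; $i++) {
            $atoms[$prefix . $i] = $element;
        }
    }
    foreach ($molecule->atom as $atom) {
        $atoms[(string) $atom['id']] = (string) $atom['element'];
    }
    return $atoms;
}

/**
 * Expand <bond-chain> into consecutive pairwise bonds and append
 * explicit <bond> elements. Each bond is [from, to, order].
 */
function expandBonds(SimpleXMLElement $molecule): array
{
    $bonds = [];
    foreach ($molecule->{'bond-chain'} as $chain) {
        $ids = preg_split('/\s+/', trim((string) $chain['atoms']));
        $order = (string) $chain['order'];
        for ($i = 0; $i < count($ids) - 1; $i++) {
            $bonds[] = [$ids[$i], $ids[$i + 1], $order];
        }
    }
    foreach ($molecule->bond as $bond) {
        $bonds[] = [(string) $bond['from'], (string) $bond['to'], (string) $bond['order']];
    }
    return $bonds;
}

$xml = <<<'XML'
<csml version="0.1">
  <molecule id="1-hexanol" name="1-Hexanol">
    <atom-list element="C" prefix="C" from="1" to="6" />
    <atom id="O1" element="O" />
    <bond-chain atoms="C1 C2 C3 C4 C5 C6" order="single" />
    <bond from="C6" to="O1" order="single" />
  </molecule>
</csml>
XML;

$molecule = simplexml_load_string($xml)->molecule;
$atoms = expandAtomLists($molecule);
$bonds = expandBonds($molecule);
echo count($atoms), " atoms, ", count($bonds), " bonds\n";
```

For the 1-hexanol document this yields seven atoms (six carbons plus `O1`) and six bonds (five from the chain plus the explicit C-O bond).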
- Single, double, triple bonds with proper parallel line offset
- Aromatic rings with inscribed circle
- Wedge bonds (filled triangle for stereo-up)
- Hatch bonds (dashed lines for stereo-down)
- Dashed bonds (hydrogen bonds, partial bonds)
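
The parallel line offset for multiple bonds can be illustrated with a little vector arithmetic: each line of a double bond is the bond axis shifted along its unit normal. This is a sketch of the general technique, not the renderer's actual code. The function names `doubleBondLines` and `toSvgLine` and the 4-pixel gap are assumptions. The 60-pixel bond length and 1.5 stroke width mirror the `RenderConfig` defaults listed above.

```php
<?php
declare(strict_types=1);

/**
 * Return the two parallel segments of a double bond between
 * ($x1, $y1) and ($x2, $y2). Each segment is the bond axis shifted
 * by $gap / 2 along the unit normal, once on each side.
 * The default gap of 4.0 pixels is an assumed value.
 */
function doubleBondLines(float $x1, float $y1, float $x2, float $y2, float $gap = 4.0): array
{
    $dx = $x2 - $x1;
    $dy = $y2 - $y1;
    $length = hypot($dx, $dy);
    if ($length == 0.0) {
        throw new InvalidArgumentException('Bond endpoints must differ.');
    }
    // Unit vector perpendicular to the bond direction.
    $nx = -$dy / $length;
    $ny = $dx / $length;
    $half = $gap / 2.0;

    $lines = [];
    foreach ([1, -1] as $side) {
        $ox = $nx * $half * $side;
        $oy = $ny * $half * $side;
        $lines[] = [$x1 + $ox, $y1 + $oy, $x2 + $ox, $y2 + $oy];
    }
    return $lines;
}

/** Format one segment [x1, y1, x2, y2] as an SVG <line> element. */
function toSvgLine(array $l, float $strokeWidth = 1.5): string
{
    return sprintf(
        '<line x1="%.2F" y1="%.2F" x2="%.2F" y2="%.2F" stroke="black" stroke-width="%.1F" />',
        $l[0], $l[1], $l[2], $l[3], $strokeWidth
    );
}

// A horizontal bond of the default length, drawn as two offset lines.
foreach (doubleBondLines(0.0, 0.0, 60.0, 0.0) as $line) {
    echo toSvgLine($line), "\n";
}
```

A triple bond could follow the same pattern with a third, unshifted line along the bond axis.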
The renderer follows skeletal formula conventions by default:
- Carbon atoms are not labeled (unless they carry a charge, isotope, or explicit label)
- Non-carbon atoms display their element symbol with CPK coloring (O = red, N = blue, S = yellow, etc.)
- Implicit hydrogens are appended to the element symbol, with counts above one shown as subscripts (e.g. NH, OH, NH₂)
- Charges are rendered as superscripts (e.g. O⁻, NH₃⁺)
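
The labeling conventions above can be sketched as a small function that builds the label text for one atom. This is an illustration only, not the package's implementation: `atomLabel`, `atomColor`, and `toScript` are hypothetical names, the sketch ignores isotopes and explicit labels, and the colors are SVG keywords for the three elements named above, with black assumed for everything else.

```php
<?php
declare(strict_types=1);

const SUBSCRIPT_DIGITS = ['₀', '₁', '₂', '₃', '₄', '₅', '₆', '₇', '₈', '₉'];
const SUPERSCRIPT_DIGITS = ['⁰', '¹', '²', '³', '⁴', '⁵', '⁶', '⁷', '⁸', '⁹'];

// Colors for the elements named in the conventions; black is an assumed fallback.
const CPK_COLORS = ['O' => 'red', 'N' => 'blue', 'S' => 'yellow'];

/** Render a non-negative integer using the given digit glyphs. */
function toScript(int $n, array $digits): string
{
    $out = '';
    foreach (str_split((string) $n) as $digit) {
        $out .= $digits[(int) $digit];
    }
    return $out;
}

/**
 * Build the label for one atom, or return null when the atom is an
 * uncharged carbon, which skeletal formulas leave unlabeled.
 */
function atomLabel(string $element, int $hydrogens = 0, int $charge = 0): ?string
{
    if ($element === 'C' && $charge === 0) {
        return null;
    }
    $label = $element;
    if ($hydrogens > 0) {
        // A single hydrogen gets no count; higher counts are subscripted.
        $label .= 'H' . ($hydrogens > 1 ? toScript($hydrogens, SUBSCRIPT_DIGITS) : '');
    }
    if ($charge !== 0) {
        $magnitude = abs($charge);
        $label .= ($magnitude > 1 ? toScript($magnitude, SUPERSCRIPT_DIGITS) : '')
            . ($charge > 0 ? '⁺' : '⁻');
    }
    return $label;
}

function atomColor(string $element): string
{
    return CPK_COLORS[$element] ?? 'black';
}

foreach ([['N', 2, 0], ['O', 1, 0], ['N', 3, 1], ['O', 0, -1]] as [$el, $h, $q]) {
    echo atomLabel($el, $h, $q), ' (', atomColor($el), ")\n";
}
```

Returning `null` for plain carbons keeps the "do not draw a label" case distinct from an empty string, so a caller can skip the text element entirely.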
<csml version="0.1">
<molecule id="benzene" name="Benzene">
<atom-list element="C" prefix="C" from="1" to="6" />
<ring id="benz" size="6" aromatic="true">
<member atom="C1" /><member atom="C2" /><member atom="C3" />
<member atom="C4" /><member atom="C5" /><member atom="C6" />
</ring>
</molecule>
</csml>

<csml version="0.1">
<molecule id="1-hexanol" name="1-Hexanol">
<atom-list element="C" prefix="C" from="1" to="6" />
<atom id="O1" element="O" />
<bond-chain atoms="C1 C2 C3 C4 C5 C6" order="single" />
<bond from="C6" to="O1" order="single" />
</molecule>
</csml>

<csml version="0.1">
<molecule id="naphthalene" name="Naphthalene">
<atom-list element="C" prefix="C" from="1" to="10" />
<ring id="ring_a" size="6" aromatic="true">
<member atom="C1" /><member atom="C2" /><member atom="C3" />
<member atom="C4" /><member atom="C5" /><member atom="C6" />
</ring>
<ring id="ring_b" size="6" aromatic="true">
<member atom="C5" /><member atom="C6" /><member atom="C7" />
<member atom="C8" /><member atom="C9" /><member atom="C10" />
</ring>
<fused-ring rings="ring_a ring_b" shared-atoms="C5 C6" />
</molecule>
</csml>

| Feature | CSML | SMILES | MOL/SDF | CML | InChI |
|---|---|---|---|---|---|
| Human-readable | Yes | Yes | No | Yes | No |
| Polymer repeat units | Yes | No | Partial | No | No |
| Copolymer patterns | Yes | No | No | No | No |
| Reusable fragments | Yes | No | No | Partial | No |
| Extensible (namespaces) | Yes | No | No | Yes | No |
| End-group specification | Yes | No | No | No | No |
CSML separates topology (what is connected to what) from geometry (angles, lengths) and from presentation (how to draw it). This makes it well suited to polymer science, materials engineering, and structural chemistry applications that need repeat units, copolymer patterns, or end-group specification, which the other formats in the table above support only partially or not at all.
MIT