ChemStrucML/CSML-PHP

CSML Parser for PHP

A PHP 8.4+ package that parses Chemical Structure Markup Language (CSML) documents and renders them as SVG images.

CSML is an XML-based markup language for describing molecular structures with full topological, geometric, and repeating-unit information. This package provides the reference PHP implementation for parsing and visualizing CSML documents.

Installation

composer require chemstrucml/csml-parser

Requirements

  • PHP 8.4 or higher
  • ext-simplexml
  • ext-dom
  • ext-libxml
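
The requirements above can be checked from a script before installing. The following is a minimal sketch that uses only built-in PHP functions; `missingRequirements()` is a hypothetical helper name and is not part of the package.

```php
<?php
// Sketch: report which of the README's stated requirements are missing.
// missingRequirements() is a hypothetical helper, not part of the package.
function missingRequirements(): array
{
    $missing = [];

    // PHP_VERSION_ID encodes version 8.4.0 as 80400.
    if (PHP_VERSION_ID < 80400) {
        $missing[] = 'PHP 8.4 or higher (running ' . PHP_VERSION . ')';
    }

    // extension_loaded() expects the extension name without the "ext-" prefix.
    foreach (['simplexml', 'dom', 'libxml'] as $extension) {
        if (!extension_loaded($extension)) {
            $missing[] = "ext-{$extension}";
        }
    }

    return $missing;
}

$missing = missingRequirements();
echo $missing === []
    ? "All requirements met\n"
    : 'Missing: ' . implode(', ', $missing) . "\n";
```

Composer performs an equivalent check during `composer require`, so this is mainly useful for diagnosing a deployment environment.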

Quick Start

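require __DIR__ . '/vendor/autoload.php'; // Composer autoloader; adjust the path to your project layout

use ChemStrucML\Csml\CsmlRenderer;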

$renderer = new CsmlRenderer();

// Render from a file
$svg = $renderer->renderFile('benzene.csml');

// Render from a string
$svg = $renderer->renderString('<csml version="0.1">
  <molecule id="water" name="Water">
    <atom id="O1" element="O" />
    <atom id="H1" element="H" />
    <atom id="H2" element="H" />
    <bond from="O1" to="H1" order="single" />
    <bond from="O1" to="H2" order="single" />
  </molecule>
</csml>');

// Save to file
file_put_contents('molecule.svg', $svg);

Usage

Parse and Render Separately

use ChemStrucML\Csml\CsmlRenderer;

$renderer = new CsmlRenderer();

// Parse to inspect the document model.
// $csmlString is assumed to hold a CSML document, e.g. the Benzene example.
$document = $renderer->parse($csmlString);

echo $document->version;                    // e.g. "0.1"
echo $document->molecules[0]->name;         // e.g. "Benzene"
echo count($document->molecules[0]->atoms); // e.g. 6

// Render to SVG
$svg = $renderer->render($document);

Custom Rendering Configuration

use ChemStrucML\Csml\CsmlRenderer;
use ChemStrucML\Csml\Config\RenderConfig;

$config = new RenderConfig(
    bondLength: 80.0,           // Bond length in SVG pixels (default: 60.0)
    bondWidth: 2.0,             // Stroke width (default: 1.5)
    fontSize: 16.0,             // Atom label font size (default: 14.0)
    fontFamily: 'monospace',    // Font family (default: 'Arial, Helvetica, sans-serif')
    padding: 30.0,              // SVG padding (default: 20.0)
    showAllCarbons: false,      // Show carbon labels in skeletal mode (default: false)
    showImplicitHydrogens: true, // Show implicit H labels (default: true)
    useColoredAtoms: true,      // CPK coloring for atoms (default: true)
    backgroundColor: 'white',   // SVG background (default: 'transparent')
);

$renderer = new CsmlRenderer(config: $config);
$svg = $renderer->renderFile('molecule.csml');
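
The configuration above relies on PHP named arguments, so only the options that differ from the defaults need to be passed. The sketch below illustrates that behavior with a stand-in class. `SketchRenderConfig` is hypothetical and is not the package's `RenderConfig` source; it only mirrors three of the documented defaults, on the assumption that the real class declares them as constructor defaults.

```php
<?php
// Stand-in for illustration only: this is NOT the package's RenderConfig source.
// It mirrors three of the documented defaults to show how named arguments behave.
final class SketchRenderConfig
{
    public function __construct(
        public readonly float $bondLength = 60.0,
        public readonly float $bondWidth = 1.5,
        public readonly string $backgroundColor = 'transparent',
    ) {
    }
}

// Only the named options change; every other option keeps its default.
$config = new SketchRenderConfig(bondLength: 80.0, backgroundColor: 'white');

echo $config->bondLength, "\n";      // overridden value
echo $config->bondWidth, "\n";       // default kept
echo $config->backgroundColor, "\n"; // overridden value
```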

Using the Parser Directly

use ChemStrucML\Csml\Parser\XmlParser;

$parser = new XmlParser();
$document = $parser->parseFile('molecule.csml');

foreach ($document->molecules as $molecule) {
    echo "Molecule: {$molecule->name}\n";
    echo "  Atoms: " . count($molecule->atoms) . "\n";
    echo "  Bonds: " . count($molecule->bonds) . "\n";
    echo "  Rings: " . count($molecule->rings) . "\n";
}

Supported CSML Features

| Feature | Elements | Status |
|---|---|---|
| Atoms | <atom>, <atom-list> | Supported |
| Bonds | <bond>, <bond-chain> | Supported |
| Ring systems | <ring>, <fused-ring> | Supported |
| Groups & fragments | <group>, <group-ref>, <anchor>, <attach> | Supported |
| Repeat units | <repeat>, <connector>, <cap> | Supported |
| Branching | <branch> | Supported |
| Copolymers | <copolymer> | Supported |
| Coordinates | <coordinates>, <point> | Supported |
| Metadata | <meta> | Supported |
| Implicit hydrogens | implicit-h="auto" | Supported |
| Stereochemistry | chirality, stereo attributes | Parsed |
| Geometry constraints | <angle>, <torsion>, <length> | Parsed |

Bond Rendering Styles

  • Single, double, triple bonds with proper parallel line offset
  • Aromatic rings with inscribed circle
  • Wedge bonds (filled triangle for stereo-up)
  • Hatch bonds (dashed lines for stereo-down)
  • Dashed bonds (hydrogen bonds, partial bonds)
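
The "parallel line offset" for double bonds can be sketched as plain vector geometry: shift the bond line by half the gap along its unit normal, once in each direction. This is an illustration of the general technique, not the package's rendering code; `doubleBondLines()` and the 3.0 px default gap are assumptions made for the example.

```php
<?php
// Sketch: compute the two parallel segments of a double bond.
// doubleBondLines() and the 3.0 px default gap are illustrative assumptions.
function doubleBondLines(float $x1, float $y1, float $x2, float $y2, float $gap = 3.0): array
{
    $dx = $x2 - $x1;
    $dy = $y2 - $y1;
    $length = hypot($dx, $dy);
    if ($length == 0.0) {
        throw new InvalidArgumentException('Bond endpoints must differ.');
    }

    // Unit vector perpendicular to the bond, scaled to half the gap.
    $ox = -$dy / $length * $gap / 2;
    $oy = $dx / $length * $gap / 2;

    // One segment on each side of the original bond axis.
    return [
        [$x1 + $ox, $y1 + $oy, $x2 + $ox, $y2 + $oy],
        [$x1 - $ox, $y1 - $oy, $x2 - $ox, $y2 - $oy],
    ];
}

// Emit the two segments of a horizontal 60 px bond as SVG <line> elements.
foreach (doubleBondLines(0.0, 0.0, 60.0, 0.0) as [$ax, $ay, $bx, $by]) {
    printf('<line x1="%.1f" y1="%.1f" x2="%.1f" y2="%.1f" />' . "\n", $ax, $ay, $bx, $by);
}
```

A triple bond follows the same idea: keep the center line and add the two offset segments with a larger gap.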

Atom Rendering

The renderer follows skeletal formula conventions by default:

  • Carbon atoms are not labeled (unless they carry a charge, isotope, or explicit label)
  • Non-carbon atoms display their element symbol with CPK coloring (O = red, N = blue, S = yellow, etc.)
  • Implicit hydrogens are appended to the element symbol, with counts above one shown as subscripts (e.g. NH, OH, NH₂)
  • Charges are rendered as superscripts (e.g. O⁻, NH₃⁺)
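
The labeling conventions above can be sketched as a small formatting function. This is a simplified illustration, not the package's renderer: `atomLabel()` and `toScript()` are hypothetical helpers, isotopes and explicit labels are ignored, and coloring is left out.

```php
<?php
// Sketch of skeletal-formula labeling; atomLabel() and toScript() are
// hypothetical helpers, not the package's API.

// Convert a non-negative integer to Unicode subscript or superscript digits.
function toScript(int $number, array $digits): string
{
    $out = '';
    foreach (str_split((string) $number) as $digit) {
        $out .= $digits[(int) $digit];
    }
    return $out;
}

function atomLabel(string $element, int $implicitH = 0, int $charge = 0): string
{
    // Uncharged carbon gets no label in skeletal mode (simplified rule).
    if ($element === 'C' && $charge === 0) {
        return '';
    }

    $subscripts   = ['₀', '₁', '₂', '₃', '₄', '₅', '₆', '₇', '₈', '₉'];
    $superscripts = ['⁰', '¹', '²', '³', '⁴', '⁵', '⁶', '⁷', '⁸', '⁹'];

    $label = $element;

    // Append H, with a subscript count only when there is more than one.
    if ($implicitH > 0) {
        $label .= 'H';
        if ($implicitH > 1) {
            $label .= toScript($implicitH, $subscripts);
        }
    }

    // Append the charge as a superscript; the magnitude is shown only above 1.
    if ($charge !== 0) {
        if (abs($charge) > 1) {
            $label .= toScript(abs($charge), $superscripts);
        }
        $label .= $charge > 0 ? '⁺' : '⁻';
    }

    return $label;
}

echo atomLabel('N', 2), "\n";
echo atomLabel('O', 0, -1), "\n";
echo atomLabel('N', 3, 1), "\n";
```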

CSML Examples

Benzene

<csml version="0.1">
  <molecule id="benzene" name="Benzene">
    <atom-list element="C" prefix="C" from="1" to="6" />
    <ring id="benz" size="6" aromatic="true">
      <member atom="C1" /><member atom="C2" /><member atom="C3" />
      <member atom="C4" /><member atom="C5" /><member atom="C6" />
    </ring>
  </molecule>
</csml>
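
In the Benzene example, `<atom-list>` stands in for six `<atom>` elements whose IDs match the ring members C1 to C6. The sketch below shows one way such an expansion could work, assuming each ID is the `prefix` followed by the index; that rule is inferred from the example rather than taken from a specification, and `expandAtomList()` is a hypothetical helper.

```php
<?php
// Sketch: expand <atom-list> into atom IDs, assuming each ID is prefix + index
// (inferred from the Benzene example). expandAtomList() is hypothetical.
function expandAtomList(SimpleXMLElement $atomList): array
{
    $prefix = (string) $atomList['prefix'];
    $from = (int) $atomList['from'];
    $to = (int) $atomList['to'];

    $ids = [];
    for ($i = $from; $i <= $to; $i++) {
        $ids[] = $prefix . $i;
    }

    return $ids;
}

$atomList = simplexml_load_string('<atom-list element="C" prefix="C" from="1" to="6" />');
echo implode(' ', expandAtomList($atomList)), "\n"; // prints "C1 C2 C3 C4 C5 C6"
```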

1-Hexanol

<csml version="0.1">
  <molecule id="1-hexanol" name="1-Hexanol">
    <atom-list element="C" prefix="C" from="1" to="6" />
    <atom id="O1" element="O" />
    <bond-chain atoms="C1 C2 C3 C4 C5 C6" order="single" />
    <bond from="C6" to="O1" order="single" />
  </molecule>
</csml>
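
The `<bond-chain>` element in this example replaces five individual `<bond>` elements. A minimal sketch of that expansion follows, assuming the chain links each atom to the next one in the `atoms` attribute with the same bond order; `expandBondChain()` is a hypothetical helper, not the package's API.

```php
<?php
// Sketch: expand a bond-chain "atoms" attribute into pairwise bonds, assuming
// consecutive atoms are linked with one shared order. expandBondChain() is hypothetical.
function expandBondChain(string $atoms, string $order = 'single'): array
{
    // Split on any whitespace and ignore empty pieces.
    $ids = preg_split('/\s+/', trim($atoms), -1, PREG_SPLIT_NO_EMPTY);

    $bonds = [];
    for ($i = 0; $i + 1 < count($ids); $i++) {
        $bonds[] = ['from' => $ids[$i], 'to' => $ids[$i + 1], 'order' => $order];
    }

    return $bonds;
}

foreach (expandBondChain('C1 C2 C3 C4 C5 C6') as $bond) {
    echo "{$bond['from']}-{$bond['to']} ({$bond['order']})\n";
}
```

A chain of n atoms yields n - 1 bonds, so the six carbons above produce five single bonds.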

Naphthalene (Fused Rings)

<csml version="0.1">
  <molecule id="naphthalene" name="Naphthalene">
    <atom-list element="C" prefix="C" from="1" to="10" />
    <ring id="ring_a" size="6" aromatic="true">
      <member atom="C1" /><member atom="C2" /><member atom="C3" />
      <member atom="C4" /><member atom="C5" /><member atom="C6" />
    </ring>
    <ring id="ring_b" size="6" aromatic="true">
      <member atom="C5" /><member atom="C6" /><member atom="C7" />
      <member atom="C8" /><member atom="C9" /><member atom="C10" />
    </ring>
    <fused-ring rings="ring_a ring_b" shared-atoms="C5 C6" />
  </molecule>
</csml>
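
In the Naphthalene example, the `shared-atoms` attribute should name exactly the atoms that both rings list as members. The sketch below checks this with SimpleXML. It is an illustration under stated assumptions, not part of the package: `ringMembers()` and `sharedAtomsAreConsistent()` are hypothetical helpers, and only the first `<fused-ring>` joining exactly two rings is examined.

```php
<?php
// Sketch: verify that shared-atoms equals the intersection of the two rings'
// members. Both helpers are hypothetical, not part of the package.
$csml = <<<'XML'
<molecule id="naphthalene" name="Naphthalene">
  <ring id="ring_a" size="6" aromatic="true">
    <member atom="C1" /><member atom="C2" /><member atom="C3" />
    <member atom="C4" /><member atom="C5" /><member atom="C6" />
  </ring>
  <ring id="ring_b" size="6" aromatic="true">
    <member atom="C5" /><member atom="C6" /><member atom="C7" />
    <member atom="C8" /><member atom="C9" /><member atom="C10" />
  </ring>
  <fused-ring rings="ring_a ring_b" shared-atoms="C5 C6" />
</molecule>
XML;

// Collect the member atom IDs of the ring with the given id.
function ringMembers(SimpleXMLElement $molecule, string $ringId): array
{
    foreach ($molecule->ring as $ring) {
        if ((string) $ring['id'] === $ringId) {
            $atoms = [];
            foreach ($ring->member as $member) {
                $atoms[] = (string) $member['atom'];
            }
            return $atoms;
        }
    }
    return [];
}

// Compare the declared shared atoms with the atoms both rings contain.
function sharedAtomsAreConsistent(SimpleXMLElement $molecule): bool
{
    $fused = $molecule->{'fused-ring'};
    [$ringA, $ringB] = explode(' ', (string) $fused['rings']);
    $declared = explode(' ', (string) $fused['shared-atoms']);
    $actual = array_values(array_intersect(
        ringMembers($molecule, $ringA),
        ringMembers($molecule, $ringB),
    ));

    sort($declared);
    sort($actual);

    return $declared === $actual;
}

var_dump(sharedAtomsAreConsistent(simplexml_load_string($csml))); // bool(true)
```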

Why CSML?

| Feature | CSML | SMILES | MOL/SDF | CML | InChI |
|---|---|---|---|---|---|
| Human-readable | Yes | Yes | No | Yes | No |
| Polymer repeat units | Yes | No | Partial | No | No |
| Copolymer patterns | Yes | No | No | No | No |
| Reusable fragments | Yes | No | No | Partial | No |
| Extensible (namespaces) | Yes | No | No | Yes | No |
| End-group specification | Yes | No | No | No | No |

CSML separates topology (what is connected to what) from geometry (angles, lengths) and from presentation (how to draw it). This separation suits polymer science, materials engineering, and structural chemistry applications where existing formats fall short.

License

MIT

Generated from renfordt/php-skeleton