Creating XML Schemas From the Harmonized Model Using ShapeChange

Ted Habermann edited this page Jul 30, 2018 · 30 revisions

To use ISO TC211 UML models in practical applications, they must be implemented in a representation that can be serialized and transmitted between information processing agents. Most current implementations use XML as the serialization scheme, with XML schema documents defining the structure of interchange documents. Producing schema that implement the conceptual models automatically requires a software process that generates XML schema from the UML models. The process described here was used for several recent standards.

Software Requirements

This section provides a high level overview; for details see the Harmonized Model Management Group wiki.

Enterprise Architect (EA): A commercial software package used to create, update and manage the ISO TC211 Harmonized UML Model. Descriptions here are based on version 12.1.1230; as of 2017-07 the current version of Enterprise Architect is 13.5, but procedures here have not been tested with new versions.

ISO TC211 Harmonized Model: The ISO TC211 Harmonized Model is stored in a web-accessible directory that contains files containing all UML models under the stewardship of TC211. Models are exported from Enterprise Architect using the XMI XML interchange format for UML models, and these XMI files are maintained under version control (see Subversion below) in the online directory. UML modeling software packages other than Enterprise Architect should be able to use these XMI files, but in practice, the format is not standardized enough to reliably translate all aspects of a model (notes, tags, color coding etc.). The XMI documents in the TC 211 harmonized model repository use the following namespaces: xmi.version="1.1" xmlns:UML="omg.org/UML1.3", with Enterprise Architect exporter version 2.5, and windows-1252 character encoding.

Subversion: The TC211 Harmonized Model repository is managed using Subversion software, an open source package from the Apache Software Foundation. In order to check out model components and check in changes, you must have a Subversion client on your machine that is connected to the model repository.

ShapeChange: An open source software package from Interactive Instruments GmbH that implements the UML to XML transformation to generate XML schema for model implementation.

Solid Ground: A software extension for EA, originated by CSIRO, that is very useful for testing UML models before running ShapeChange. The software is no longer maintained by CSIRO. A copy of the installer for Windows (Solid_Ground_Setup.msi) is in the XML Maintenance Group repository. Some documentation is available in a file in that repository.

Model Setup

In order to generate XML schema for TC211, a specific UML profile must be followed in constructing the UML model. The online ShapeChange documentation is recommended if you have any questions. A few key stereotypes and tags are described here.

| Stereotype | Scope | Interpretation |
| --- | --- | --- |
| Application Schema | UML package | Classes contained in the package will be implemented in a single XML namespace. |
| Abstract | Class | The UML class will be implemented as an abstract element in XML. The class must be the target of one or more concrete classes in the UML model. |
| Union | Class | Stereotype indicates that one, and only one, of the attributes in the UML class is allowed in each instance of the class, implemented as an xsd:choice in the XML schema. |
| CodeList | Class | Stereotype indicates that the UML class is implemented as a codelist as defined in the gco namespace, and the attributes in the class provide the allowed values. |

Each model module that will be implemented as a separate XML namespace must be in a separate UML package with the stereotype <<Application Schema>>. The application schema package must have tags as shown in Table 1 and Figure 1.

| Tag | Example Value | Notes |
| --- | --- | --- |
| targetNamespace | http://standards.iso.org/iso/19115/-3/cit/2.0 | The namespace URI that will be declared in the XML schema |
| version | 1.0 | |
| xmlns | cit | Three letter abbreviation, unique within scope of TC211, that will be used to prefix element names in the xml schema |
| xsdDocument | ISO19115-3/cit/1.0/cit.xsd | Path fragment and file name for generated XML schema. This file path will be appended on the location where the schema generation process is executed. See discussion below. |
| xsdEncodingRule | iso19139_2007 | Identifier for the UML to XML transformation rule set. Currently this should always have the value iso19139_2007. |

Figure 1. Enterprise Architect dialog for assigning tag values.

The application schema package must contain one or more child packages, each of which will be implemented in a separate xml schema file. These may have the stereotype <<Leaf>> assigned for clarity, but this is not a requirement.

One XML schema file, the name of which is specified by the xsdDocument tag on that package (cit.xsd in Figure 1), will be created at the application schema level. This schema will <xsd:include> the schema files generated from sub packages within the application schema package. If you want to have separate schema files (recommended), the packages within the application schema package require xsdDocument and xsdEncodingRule tags. Note that the xsdDocument path should match the document path for the root xsd so the <xsd:include> paths are simple (see Post Processing section below).

Tags for attributes in UML classes

Each attribute and navigable association end from a UML class should have a sequenceNumber tag, and the sequence numbers must be unique within the scope of an individual UML class. The sequence numbers are not mandatory; ShapeChange will generate XML schema without them. If they are not present, XML elements for properties inside a class will follow the order in which they appear in the class, but if the class has associations to other classes, the sequence of those elements in the generated XML schema is unpredictable. It's OK to leave gaps between sequence numbers, which can be handy for classes with many attributes and relationships if the sequence order is changed or new attributes are added at a later time.
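
As a minimal sketch of the ordering rule described above, the following Python checks that sequence numbers are unique within one class and sorts the properties by them. The `Property` type and the sample property names and numbers are hypothetical, chosen only for illustration; nothing here is read from EA or ShapeChange.

```python
from collections import Counter
from typing import NamedTuple


class Property(NamedTuple):
    """One attribute or navigable association end (hypothetical structure)."""
    name: str
    sequence_number: int


def duplicate_sequence_numbers(props):
    """Return the sequence numbers used more than once, in ascending order."""
    counts = Counter(p.sequence_number for p in props)
    return sorted(n for n, c in counts.items() if c > 1)


def schema_order(props):
    """Return property names ordered by sequence number; reject duplicates."""
    dupes = duplicate_sequence_numbers(props)
    if dupes:
        raise ValueError(f"duplicate sequenceNumber values: {dupes}")
    return [p.name for p in sorted(props, key=lambda p: p.sequence_number)]


# Gaps (10, 20, 30) leave room to insert properties later.
example = [
    Property("title", 10),
    Property("citedResponsibleParty", 30),
    Property("date", 20),
]
print(schema_order(example))
```

Numbering in steps of ten, as in the sample data, is one way to leave the gaps mentioned above.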

Linkage between packages

Classes from other TC211 application schema in the EA model must be included by linking to that class using the EA interface. Typing in the name of the class does not create the necessary internal linkages. These linkages allow EA to generate dependency diagrams and to import the necessary namespaces into the xml schema when they are generated by the UML to XML processor.

Organization of schema files

The output schema are loaded into a directory structure for various ISO schemas (Figure 2), and the paths for importing schema from other models or including schema implementing various parts of a single model are based on this directory structure. There is a folder for each TC211 standard (e.g. ISO19115, ISO19110…) within the top-level folder. These folders contain subfolders for each xml namespace defined in the model, using the namespace abbreviations as folder names. The next level of subfolders is named using version numbers for implementations of that namespace, and the actual schema files are within the folders with the version number for those schema.

Figure 2. Folder naming for the ISO TC211 XML schema directories.

Schema are loaded into this structure by specifying 'xsdDocument' tagged values that provide paths (see Figure 1) to the location where the generated schema will be saved. The path locations are relative to the root directory where ShapeChange is executed.
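
To make the folder convention concrete, here is a small Python sketch that builds a relative schema path from a standard folder, a namespace abbreviation, and a version, then appends it to a working directory. The function names and the working directory `/work/schemas` are assumptions for illustration only; the `ISO19115-3/cit/1.0/cit.xsd` result matches the example tag value in Table 1.

```python
from pathlib import PurePosixPath


def xsd_document(standard: str, abbreviation: str, version: str) -> str:
    """Build <standard>/<namespace abbreviation>/<version>/<abbreviation>.xsd."""
    return str(PurePosixPath(standard, abbreviation, version, f"{abbreviation}.xsd"))


def output_location(working_dir: str, xsd_doc: str) -> str:
    """Append the relative schema path to the directory ShapeChange is run from."""
    return str(PurePosixPath(working_dir) / xsd_doc)


relative_path = xsd_document("ISO19115-3", "cit", "1.0")
print(relative_path)
# "/work/schemas" is a hypothetical working directory.
print(output_location("/work/schemas", relative_path))
```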

Workflow

The UML models in the main TC211 harmonized model repository have not been edited and configured to work directly with ShapeChange. The 'implementation-XML' subdirectory in this repository should be accessed as a separate repository. It contains copies of the UML models from the main harmonized model that have been edited to enable xml schema generation. These copies were generated by exporting XMI from the main repository and importing into the implementation repository with new identifiers assigned to the model elements. Thus, cross model linkages can be established in the implementation model independent of the main harmonized model.

Here’s the process. For more detailed information see the HMMG wiki pages on Connecting to the Harmonized Model Repository.

  1. Check out the implementation-XML Subversion directory [accessed 2018-03-10, rev 3333] to your local environment (this will require a username and password).
  2. Set up an EA version control configuration named 'implementation-XML' linked to this directory.
  3. Get packages in EA ('package control/get package' on context menu for root of the EA project table of contents), and select the 'implementation-XML' version control configuration, and then the 'implementation-XML' package. This should load the directory structure for the workspace in the EA project.
  4. On the context menu for the model root, select 'Package Control/Get All Latest', and the models will load. You're ready to go.

To check, right click on the root of the package in Enterprise Architect and select 'Package Control → File properties'. You should see something like this:

Figure 3. File properties from Enterprise Architect.

If checking out the packages from EA fails, you may need to resort to Subversion command line functions. In the command line interface (from a Windows command prompt window), navigate to the directory where Subversion is installed (unless you have put that in your PATH environment variable), e.g.
>cd "C:\Program Files (x86)\Subversion\bin"

Run

>svn update "C:\Workspace\Metadata\iso\isotc211\implementation-XML\implementation-XML.xml"

If this is your first checkout, you will be asked for your username and password to access the TC211 repository. The software should remember this for future access. If a repository is locked and shouldn't be (usually because an svn operation failed or Windows crashed for one of the users), you might need to unlock it, e.g.:

>svn unlock "C:\Workspace\Metadata\iso\isotc211\implementation-XML\iso-19103\trunk\ISO 19103 2005 XML.xml"

'Package control/Check out' and Check-in should work now.

SolidGround operations.

With the model loaded in EA, and the SolidGround extension working, you can now do some checking on the model to look for problems that can interfere with generating XML. Right click on a package in the EA table of contents, at the top of the context menu select:
'Extensions/Solid Ground/Standards Conformance'

Figure 4. Stacked context menus to access SolidGround functions.

Select the package you wish to check, and select 'Run Conformance Tests'. Run the conformance test on all the packages you will be using to make the XML schema. In the resulting reports, you generally don't have to worry about warnings, but they often flag things that should be changed. Errors need to be fixed; ShapeChange gags when it runs into one of these (for the most part; you might be lucky).

The Standards Conformance context menu has some other useful options:

Figure 5. Context menu for conformance tests.

'Generate Package Dependency Diagram' adds a dependency diagram in the package.
'Run Conformance Tests' looks for tags, links etc.
'Verify Package Dependencies' makes sure all referenced packages are in the model you're using.
'Assign Sequence Numbers' will assign sequence numbers to attributes and associations that determine the order in the generated XML schema. ShapeChange uses these to order elements (see Tags for attributes in UML classes), but you have to make sure the order is correct. Attributes in a class will be ordered as they appear in the UML, but the ordering of associations is not obvious from the diagrams. The sequence numbers show up as tags on the attributes inside the classes, or on the associations, accessed through the EA property dialogs.

Running ShapeChange

According to the ShapeChange documentation, EA compatibility was developed on EA versions 7.0 and 7.1, and runs unchanged for later versions. ShapeChange has been used with EA up to version 12.0. Note: The Norwegian company Arkitektum has developed an EA extension for ShapeChange; this has not been tested for this tutorial, but might be useful. See https://kartverket.no/globalassets/standard/programmer-og-verktoy/shapechange_setup_2_0.zip

Run test.bat in the ShapeChange installation directory to verify that the software is working correctly.

One common problem is the error ‘Exception in thread "main" java.lang.UnsatisfiedLinkError: no SSJavaCOM in java.library.path’. This generally indicates a compatibility problem with 32 bit software running on a 64 bit machine. From http://shapechange.net/app-schemas/ea/: 'To process Enterprise Architect models with ShapeChange, copy the file SSJavaCom.dll located in /Java API to /System32 (on a 32-bit machine) or to /SysWOW64 (on a 64-bit machine). On the 64-bit machine use the java.exe in the /SysWOW64 folder since Enterprise Architect is a 32-bit application.' The bottom line is that ShapeChange must run on a 32 bit version of java.exe, and SSJavaCom.dll has to be on the Java library path (java.library.path); I put it in both Windows/System32 and Windows/SysWOW64. ShapeChange is executed from the command line, and if the default Java path does not point to a 32 bit JVM, then the command to execute ShapeChange has to include a full path to a 32 bit java.exe installation:

[full path to 32 bit java.exe] -jar [ShapeChange jar location] -Dfile.encoding=UTF-8 -c [configuration file location]

The bracketed strings are placeholders that depend on your configuration. I generally set up the command with full paths for all files so I don't have to worry about the current working directory when I run it. The configuration file sets up the paths to locate the Enterprise Architect project containing the UML model with properly constructed and tagged packages to generate xml schema from UML.
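
If you prefer to script the call, the command pattern above can be assembled as an argument list, which avoids quoting problems with paths that contain spaces. This is only a sketch: the function name and all three paths are hypothetical placeholders, and the argument order simply mirrors the pattern given above.

```python
def shapechange_command(java_exe, shapechange_jar, config_file):
    """Return the argument list in the order given by the command pattern above."""
    return [
        java_exe,
        "-jar", shapechange_jar,
        "-Dfile.encoding=UTF-8",
        "-c", config_file,
    ]


# All three paths are hypothetical placeholders for a local installation.
command = shapechange_command(
    r"C:\Java32\bin\java.exe",
    r"C:\ShapeChange\ShapeChange.jar",
    r"C:\Workspace\config\tc211-config.xml",
)
print(command)
# To launch it: import subprocess; subprocess.run(command, check=True)
```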

Getting the ShapeChange configuration file ready

The ShapeChange configuration file gives you great control over details of the conversion from UML to XSD. A few settings are critical. In the <input> section of the configuration:

Input file name:
<parameter name="inputFile" value="path to your Enterprise Architect project file with the TC211 model"/>

ShapeChange uses a regular expression to identify the application schema that will be processed. Only packages with the stereotype "Application Schema" and a targetNamespace tag with a value matching this regular expression will be processed. In the configuration file the regular expression is specified by the appSchemaNamespaceRegex parameter. The regular expression shown here works for ISO namespaces with three letter abbreviations:
<parameter name="appSchemaNamespaceRegex" value="^http://standards.iso.org/iso/19115/-3/[a-z,0-9]{3}/\d.\d"/>

There is a targetParameter 'outputDirectory' that is supposed to be used to construct the output directory path for schema produced by ShapeChange, but the behavior is apparently not what is expected. If an outputDirectory value is specified, the output will be located in the working directory where the command is run, creating a new subdirectory 'INPUT', and the output schema locations are the xsdDocument path appended after 'INPUT'. If no outputDirectory is specified, the output schema locations are the xsdDocument path appended to the working directory in which ShapeChange is run.

To work around this behavior, comment out the outputDirectory targetParameter, and run ShapeChange from a directory that contains the subdirectory structure matching the ISO xml directory structure described above.
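
Before running ShapeChange, it can help to try the appSchemaNamespaceRegex value against the targetNamespace values in your model. The sketch below does this in Python. ShapeChange itself evaluates the expression in Java, so treating a match at the start of the string as "selected" is an assumption here, and the `mdb` namespace in the sample list is included purely for illustration.

```python
import re

# Pattern copied from the appSchemaNamespaceRegex parameter in this section.
APP_SCHEMA_NAMESPACE_REGEX = (
    r"^http://standards.iso.org/iso/19115/-3/[a-z,0-9]{3}/\d.\d"
)


def selected(namespaces, pattern=APP_SCHEMA_NAMESPACE_REGEX):
    """Return the namespaces that the pattern matches from the start."""
    compiled = re.compile(pattern)
    return [ns for ns in namespaces if compiled.match(ns)]


candidates = [
    "http://standards.iso.org/iso/19115/-3/cit/2.0",
    "http://standards.iso.org/iso/19115/-3/mdb/1.0",  # illustrative value
    "http://www.isotc211.org/2014/cit/1.0",
]
print(selected(candidates))
```

Note that the unescaped '.' characters in the pattern match any character, so the expression is slightly more permissive than it looks; writing them as `\.` would tighten it.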

Setting up directories for output schema

The schemas produced by ShapeChange need to be distributed into the directory structure described in the Organization of schema files section (above). ShapeChange will not create the directories for you, so the entire standards.iso.org/iso/191nn directory structure needs to be in place in the target output directory, which will be a child of the working directory from which the ShapeChange command is run. This can be set up by copying the contents of …GitHub\ISOTC211\XML\standards.iso.org\iso and deleting all the xsd files recursively. Existing schema files will be overwritten, but removing the existing ones makes it easier to see that new ones have been generated.

The import statements in the schema are generated correctly by including a 'StandardNamespaces.xml' file in the ShapeChange configuration:
<xi:include href="[path]/StandardNamespaces.xml"/>
The full path should be included. The contents of this file look like:

<xmlNamespaces xmlns="http://www.interactive-instruments.de/ShapeChange/Configuration/1.1">
<xmlNamespace nsabr="cat" ns="http://www.isotc211.org/2014/cat/1.0" location="../../../ISO19115-3/cat/1.0/cat.xsd"/>
<xmlNamespace nsabr="cit" ns="http://www.isotc211.org/2014/cit/1.0" location="../../../ISO19115-3/cit/1.0/cit.xsd"/>

… for all namespaces. These relative paths work because of the standardized directory structure. Although the '../../../' pattern is not necessary for imports from the same ISO model, it works for internal imports as well as imports from other models. Using the same pattern for the relative paths makes maintenance easier.
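
The directory preparation described at the start of this section (copy the existing tree, leave out the xsd files) can be sketched in Python as follows. The function name is hypothetical, and the demonstration runs on a throwaway temporary tree rather than a real checkout; substitute your own source and working directories.

```python
import shutil
import tempfile
from pathlib import Path


def copy_tree_without_xsd(source_root: Path, target_root: Path) -> None:
    """Copy the directory structure and every file except *.xsd.

    target_root must not exist yet; shutil.copytree creates it.
    """
    shutil.copytree(source_root, target_root,
                    ignore=shutil.ignore_patterns("*.xsd"))


# Demonstration on a temporary tree that mimics standard/namespace/version.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "source"
    version_dir = src / "ISO19115-3" / "cit" / "1.0"
    version_dir.mkdir(parents=True)
    (version_dir / "cit.xsd").write_text("<schema/>")

    dst = Path(tmp) / "work"
    copy_tree_without_xsd(src, dst)

    directory_kept = (dst / "ISO19115-3" / "cit" / "1.0").is_dir()
    xsd_files_copied = list(dst.rglob("*.xsd"))

print(directory_kept, xsd_files_copied)
```

Copying without the xsd files gives the same result as copying everything and deleting them afterwards, in a single step.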

Post Processing

There are several problems in the output generated by ShapeChange (these might be fixed in newer versions).

Issue 1.

Abstract classes have a ':' prefix in the generated XML

Figure 6. Buggy implementation for abstract element name.

This is easily fixed by a global search and replace 'name=":Abstract' with 'name="Abstract'. I use Notepad++ search and replace in files to do it all at once.
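
For anyone without Notepad++, the same search and replace can be scripted. The following is a sketch, not a tested part of the workflow: the function names are hypothetical, the sample element name is made up for the demonstration, and the files are handled as bytes so that encoding and line endings are left untouched.

```python
from pathlib import Path

BAD_PREFIX = b'name=":Abstract'
GOOD_PREFIX = b'name="Abstract'


def fix_abstract_names(data: bytes) -> bytes:
    """Apply the search and replace described above to one file's content."""
    return data.replace(BAD_PREFIX, GOOD_PREFIX)


def fix_schema_files(root: Path) -> int:
    """Rewrite every .xsd file under root that needs the fix; return the count."""
    changed = 0
    for xsd in root.rglob("*.xsd"):
        original = xsd.read_bytes()
        fixed = fix_abstract_names(original)
        if fixed != original:
            xsd.write_bytes(fixed)
            changed += 1
    return changed


# Hypothetical sample line showing the faulty prefix.
sample = b'<element name=":AbstractCT_Catalogue" abstract="true"/>'
sample_fixed = fix_abstract_names(sample)
print(sample_fixed.decode("ascii"))
```

Run `fix_schema_files` on the root of the generated schema tree, ideally on a copy or a version-controlled checkout so the changes can be reviewed.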

Issue 2

Include statements in the schema use the paths from the xsdDocument tagged value, not the correct relative path for how they're actually loaded in the directory structure. The import paths are constructed correctly from the StandardNamespaces.xml file, but include paths are not. These must be manually adjusted by going through all schema and removing the path part.

Figure 7. Example of incorrect schema location path for an included schema.

In the example above, the include should be <include schemaLocation="catalogues.xsd"/>
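
Removing the path part by hand gets tedious across many schema files, so a scripted version may be useful. The sketch below rests on assumptions: the regular expression expects schemaLocation to be the first attribute of the include element and paths that use forward slashes, and the input path in the sample is hypothetical (modeled on the xsdDocument pattern), not copied from Figure 7.

```python
import re

# Matches <include schemaLocation="some/path/file.xsd" with an optional
# namespace prefix on the element name; import elements are not matched.
INCLUDE_LOCATION = re.compile(
    r'(<(?:\w+:)?include\s+schemaLocation=")([^"]*/)?([^"/]+")'
)


def strip_include_paths(text: str) -> str:
    """Keep only the file name in every include schemaLocation."""
    return INCLUDE_LOCATION.sub(r"\1\3", text)


# Hypothetical generated lines: one include, one import.
sample = (
    '<include schemaLocation="ISO19115-3/cat/1.0/catalogues.xsd"/>\n'
    '<import namespace="http://www.isotc211.org/2014/cit/1.0" '
    'schemaLocation="../../../ISO19115-3/cit/1.0/cit.xsd"/>'
)
result = strip_include_paths(sample)
print(result)
```

Import statements are left alone on purpose, since their relative paths are already generated correctly.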
