Examples for XML

Here you can find some examples for the Extensible Markup Language (XML) and related standards.

1. course.xml

course.xml is a simple XML file which describes the basic data of a course, such as the number of units, the teachers, and the participating students. It shows how data can be arranged hierarchically in an XML file and uses both elements and attributes.

  1. course.xml
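
To illustrate how such hierarchical data can be read by a program, here is a minimal Java sketch using the standard DOM API. The embedded document is hypothetical: the element and attribute names (`title`, `units`, `teacher`, `name`) are assumptions made for illustration and may differ from the actual course.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CourseDomSketch {
  // Hypothetical course document; names and values are assumptions.
  static final String XML =
      "<course title=\"Distributed Computing\" units=\"16\">"
    + "<teachers><teacher name=\"Alice\"/><teacher name=\"Bob\"/></teachers>"
    + "<students><student name=\"Carol\"/></students>"
    + "</course>";

  // Parse the XML text and return its root element.
  static Element parseRoot(String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("could not parse XML", e);
    }
  }

  // Collect the name attribute of every teacher element below the root.
  static List<String> teacherNames(Element course) {
    List<String> names = new ArrayList<>();
    NodeList teachers = course.getElementsByTagName("teacher");
    for (int i = 0; i < teachers.getLength(); i++) {
      names.add(((Element) teachers.item(i)).getAttribute("name"));
    }
    return names;
  }

  public static void main(String[] args) {
    Element course = parseRoot(XML);
    // Attributes hang directly on an element ...
    System.out.println("units: " + course.getAttribute("units"));
    // ... while nested elements form the hierarchy.
    System.out.println("teachers: " + teacherNames(course));
  }
}
```

Note that nothing in this sketch checks whether the document has the expected structure; a missing attribute would simply be read as an empty string.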

2. courses.xsd

The example above showed a basic use case for XML, but left out a few crucial issues. For instance, how does someone who receives the course.xml file know whether it is valid? In other words, how does she know that the teachers element is allowed to contain arbitrarily many teacher elements? If a person interprets the data, that might not be an issue. But we want to interpret data automatically. We may assume that hundreds of course XML files are generated and processed. If we want to do that, we need some sort of mechanism to perform a sanity check, ideally already when loading the files, before feeding their data into the actual processing step.

For this purpose, XML Schemas exist. With an XML Schema, we can specify a blueprint for a certain type of XML file, namely for the XML files belonging to a certain namespace. You can think of an XML Schema as corresponding to a typedef of a struct in C, while an XML file is an actual variable instance of it.

After having defined an XML namespace via a Schema, we can now use it in an XML file by declaring it via xmlns.

However, this declaration by itself does not ensure that the document is actually validated. For this purpose, an XML parser needs to know where to find the schema. This can be done by adding a schemaLocation attribute to an element (typically the root element) of the XML file. Notice that an XML namespace is identified by a URI. This URI may or may not be a URL, and even if it is a URL, it does not necessarily point to an existing resource. In our example, I intentionally did not use a URL but ustc:courses as namespace. Even if I had used http://www.test.com/xyz.xsd, there would have been no guarantee that this file actually existed. With the schemaLocation, we can map the namespace URI ustc:courses to the location of the schema on the internet, right here on GitHub: https://raw.githubusercontent.com/thomasWeise/distributedComputingExamples/master/xml/xml/courses.xsd. Now a validating XML parser can download the schema and use it to check the document.

  1. courses.xsd
  2. courseWithNamespace.xml
  3. courseWithNamespaceAndSchemaLocation.xml
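
The relation between the namespace URI and the schema location can be made visible with a small Java sketch. It assumes a hypothetical root element named `course` (the actual files may look different) and relies on the fact that a schemaLocation value is a whitespace-separated list of namespace/location pairs. The parser here is namespace-aware but not validating, so nothing is downloaded.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class SchemaLocationSketch {
  // Hypothetical document: only the namespace URI and the schema URL
  // are taken from the text above, the root element name is an assumption.
  static final String XML =
      "<course xmlns=\"ustc:courses\""
    + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
    + " xsi:schemaLocation=\"ustc:courses"
    + " https://raw.githubusercontent.com/thomasWeise/distributedComputingExamples/master/xml/xml/courses.xsd\"/>";

  // Parse with namespace support switched on and return the root element.
  static Element root(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      return factory.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)))
          .getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException("could not parse XML", e);
    }
  }

  // Split the xsi:schemaLocation value into (namespace URI -> location) pairs.
  static Map<String, String> schemaLocations(Element element) {
    Map<String, String> result = new LinkedHashMap<>();
    String value = element.getAttributeNS(
        XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI, "schemaLocation").trim();
    if (value.isEmpty()) {
      return result; // attribute absent: the namespace is declared, nothing more
    }
    String[] tokens = value.split("\\s+");
    for (int i = 0; i + 1 < tokens.length; i += 2) {
      result.put(tokens[i], tokens[i + 1]);
    }
    return result;
  }

  public static void main(String[] args) {
    Element course = root(XML);
    System.out.println("namespace: " + course.getNamespaceURI());
    System.out.println("schema locations: " + schemaLocations(course));
  }
}
```

The namespace URI `ustc:courses` is only an identifier; it is the second entry of each pair that tells a validating parser where a schema document can be fetched.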

The Java program SAXReaderExampleValidating.java demonstrates how a parser can validate an XML document while parsing it.
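
As a rough, self-contained illustration of schema validation in Java, the sketch below uses the standard `javax.xml.validation` API. This is not necessarily the approach taken in SAXReaderExampleValidating.java. The schema and both documents are deliberately tiny, hypothetical stand-ins: they are not the contents of courses.xsd, and they are embedded as strings so that nothing has to be downloaded.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ValidationSketch {
  // Hypothetical schema for the namespace ustc:courses: a course has a
  // numeric units attribute and one or more teacher elements.
  static final String XSD =
      "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\""
    + " targetNamespace=\"ustc:courses\" elementFormDefault=\"qualified\">"
    + "<xs:element name=\"course\"><xs:complexType><xs:sequence>"
    + "<xs:element name=\"teachers\"><xs:complexType><xs:sequence>"
    + "<xs:element name=\"teacher\" type=\"xs:string\""
    + " minOccurs=\"1\" maxOccurs=\"unbounded\"/>"
    + "</xs:sequence></xs:complexType></xs:element>"
    + "</xs:sequence>"
    + "<xs:attribute name=\"units\" type=\"xs:positiveInteger\" use=\"required\"/>"
    + "</xs:complexType></xs:element></xs:schema>";

  static final String VALID =
      "<course xmlns=\"ustc:courses\" units=\"16\"><teachers>"
    + "<teacher>Alice</teacher><teacher>Bob</teacher></teachers></course>";

  // Same structure, but units is not a number.
  static final String INVALID =
      "<course xmlns=\"ustc:courses\" units=\"many\"><teachers>"
    + "<teacher>Alice</teacher></teachers></course>";

  // Compile the schema text into a reusable Schema object.
  static Schema compileSchema() {
    try {
      return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
          .newSchema(new StreamSource(new StringReader(XSD)));
    } catch (SAXException e) {
      throw new IllegalStateException("schema itself is broken", e);
    }
  }

  // Return true if the document conforms to the schema, false otherwise.
  static boolean isValid(String xml) {
    try {
      compileSchema().newValidator()
          .validate(new StreamSource(new StringReader(xml)));
      return true;
    } catch (SAXException e) {
      return false; // malformed or schema-violating document
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("valid document accepted:   " + isValid(VALID));
    System.out.println("invalid document accepted: " + isValid(INVALID));
  }
}
```

This is the kind of sanity check described above: a document with a wrong attribute type or a missing element is rejected before any processing code sees its data.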

3. courses2html.xslt

With Extensible Stylesheet Language Transformations ([XSLT](https://en.wikipedia.org/wiki/XSLT)), we can transform an XML document into another document, which does not even need to be an XML document. XSLT is basically a language which tells a transformation process what output to produce for which element and attribute. Sometimes, there might be two different XML dialects / schemas / namespaces for similar domains, e.g., there could be a tsinghua:courses namespace with a purpose similar to that of our declaration in the example above. With XSLT, we could translate a document from ustc:courses to tsinghua:courses. Or we could translate our document to HTML, i.e., turn our raw data into a web page. This is what we do in this example:

  1. courses2html.xslt
  2. courseWithNamespace.xml
  3. course.html (the result of the transformation)

The actual transformation needs to be performed by a program, and XSLTTransform.java is such a program.
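
How such a program might look can be sketched with the standard `javax.xml.transform` API. The stylesheet and the input document below are hypothetical miniatures, not the contents of courses2html.xslt or courseWithNamespace.xml, and the sketch is not claimed to match XSLTTransform.java; it only shows the general pattern of compiling a stylesheet and applying it to a document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
  // Hypothetical stylesheet: turns a course in namespace ustc:courses
  // into a small HTML page listing its teachers.
  static final String XSLT =
      "<xsl:stylesheet version=\"1.0\""
    + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
    + " xmlns:c=\"ustc:courses\" exclude-result-prefixes=\"c\">"
    + "<xsl:output method=\"html\" indent=\"no\"/>"
    + "<xsl:template match=\"/c:course\">"
    + "<html><body><h1><xsl:value-of select=\"@title\"/></h1><ul>"
    + "<xsl:for-each select=\"c:teachers/c:teacher\">"
    + "<li><xsl:value-of select=\".\"/></li>"
    + "</xsl:for-each>"
    + "</ul></body></html>"
    + "</xsl:template></xsl:stylesheet>";

  // Hypothetical input document.
  static final String XML =
      "<course xmlns=\"ustc:courses\" title=\"Distributed Computing\">"
    + "<teachers><teacher>Alice</teacher><teacher>Bob</teacher></teachers>"
    + "</course>";

  // Apply the given stylesheet to the given document and return the output text.
  static String transform(String xslt, String xml) {
    try {
      Transformer transformer = TransformerFactory.newInstance()
          .newTransformer(new StreamSource(new StringReader(xslt)));
      StringWriter out = new StringWriter();
      transformer.transform(
          new StreamSource(new StringReader(xml)), new StreamResult(out));
      return out.toString();
    } catch (TransformerException e) {
      throw new IllegalStateException("transformation failed", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(transform(XSLT, XML));
  }
}
```

Because the input elements live in the namespace ustc:courses, the stylesheet has to bind a prefix (here `c`) to that namespace and use it in its match and select expressions; unprefixed names would not match the elements.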