The Saxon XSLT Processor using accurate decimal-based floating-point arithmetic and half-up rounding for VAT rounding according to EU law and the European Norm EN16931:2024.

svanteschubert/Saxon-HE-enhanced-accuracy

Saxon Home e-Commerce Edition

Purpose

This is a fork of the XSLT processor (SAXON Home) to provide accuracy and legal conformity in commercial calculations.

  1. Accuracy is achieved by using Java's decimal-based implementation of IEEE 754 floating-point numbers instead of the inaccurate binary floating-point.

  2. In some EU countries, such as Germany, VAT has to be rounded half-up away from zero (0,5 becomes 1 and -0,5 becomes -1). XML (the XPath round() function), however, rounds half-up towards positive infinity (0,5 becomes 1 and -0,5 becomes 0). Therefore the rounding legally required for VAT in Germany has been added to SAXON HE.
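The difference between the two roundings can be sketched in plain Java. This is only an illustration and not the code of this fork; the class and method names are hypothetical, and only java.math.BigDecimal and java.math.RoundingMode from the JDK are used.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative sketch only: class and method names are hypothetical, not part of Saxon.
class VatRoundingDemo {

    // Round half away from zero (Java calls this HALF_UP): 0.5 -> 1 and -0.5 -> -1.
    static BigDecimal roundHalfAwayFromZero(BigDecimal value, int scale) {
        return value.setScale(scale, RoundingMode.HALF_UP);
    }

    // Round half towards positive infinity, like XPath round(): 0.5 -> 1 and -0.5 -> 0.
    // Implemented as floor(value + half a unit of the target scale).
    static BigDecimal roundHalfTowardsPositiveInfinity(BigDecimal value, int scale) {
        BigDecimal half = BigDecimal.ONE.movePointLeft(scale + 1).multiply(BigDecimal.valueOf(5));
        return value.add(half).setScale(scale, RoundingMode.FLOOR);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0.5", "-0.5", "1.5", "-1.5"}) {
            BigDecimal v = new BigDecimal(s);
            System.out.println(s
                    + " -> away from zero: " + roundHalfAwayFromZero(v, 0)
                    + ", towards +infinity: " + roundHalfTowardsPositiveInfinity(v, 0));
        }
    }
}
```

The two functions only disagree on negative values that lie exactly halfway between two candidates, which is the case that matters for negative VAT amounts such as credit notes.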

This temporary fork of Michael Kay's Saxon is just a showcase of using Saxon in the e-commerce domain requiring the best numeric accuracy.

After convincing CEN TC 434 WG1 to add decimal-based floating-point support as a recommendation of the EU e-invoice standard (EN16931), this project aims to enhance the EN16931 XSLT Schematron validation reference implementation with support for decimal-based floating-point.

Background

In the context of EU e-invoice standardisation, the CEN Technical Committee 434 discussed for weeks how invoices created by different software could be made identical in all data fields; the calculated amounts in particular often varied. Spreadsheets with various scenarios were exchanged, and the tendency was towards a simple workaround using slack (accepting the variations and allowing a level of inaccuracy).

In the end, there were only three points to be taken care of:

  1. No calculation with rounded values (e.g. no addition of line gross values, even if allowed by law as in the Netherlands)
  2. Agree on a single rounding (there are more than a dozen different roundings; XML came up with its own, but Germany requires by law for VAT the "half-up rounding away from zero" (or "kaufmännisches Runden"), which differs from the XML default rounding)
  3. Instead of the commonly used binary floating-point, use the accurate decimal floating-point (part of IEEE 754 since 2008)
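Why the first point matters can be shown with a small Java sketch. The amounts and all names below are made up for illustration and do not come from the standard or from this fork: three unrounded line amounts of 0.125 are rounded to cents either once after summing or individually before summing.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: names and amounts are hypothetical.
class RoundOnceDemo {

    // Round to cents, half away from zero (HALF_UP in Java).
    static BigDecimal roundCent(BigDecimal value) {
        return value.setScale(2, RoundingMode.HALF_UP);
    }

    // Recommended: add the unrounded amounts and round the total once.
    static BigDecimal sumThenRound(List<BigDecimal> amounts) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal amount : amounts) {
            sum = sum.add(amount);
        }
        return roundCent(sum);
    }

    // Discouraged: round every amount first and add the rounded values.
    static BigDecimal roundThenSum(List<BigDecimal> amounts) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal amount : amounts) {
            sum = sum.add(roundCent(amount));
        }
        return sum;
    }

    public static void main(String[] args) {
        List<BigDecimal> lines = Arrays.asList(
                new BigDecimal("0.125"), new BigDecimal("0.125"), new BigDecimal("0.125"));
        System.out.println("sum then round: " + sumThenRound(lines)); // 0.375 rounds to 0.38
        System.out.println("round then sum: " + roundThenSum(lines)); // 0.13 * 3 = 0.39
    }
}
```

Both results are computed exactly, yet they differ by one cent, so two programs that merely round at different steps cannot produce identical invoices.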

These recommendations are part of the EN16931 amendments. They are not mandatory requirements, as the members were afraid that the standard would not be accepted by the e-receipt industry because its software is "too weak"!

Decimal-based floating-point

Decimal-based floating-point was invented for the commercial sector. It missed the early IEEE 754 standard of the 1980s, and it took another 20 years until it was embraced by IEEE 754 in 2008. It is now part of all major platforms and libraries such as Java, .NET, Intel, etc.

Invoice Example

quantity = 1000000000.0 
priceAmount = 1.0 
baseQuantity = 3

XSLT using binary floating-point

($quantity * ($priceAmount div $baseQuantity)) = (1000000000.0 * (1.0 div 3)) = 333333333.333333333333333333
($quantity *  $priceAmount div $baseQuantity)  = (1000000000.0 *  1.0 div 3)  = 333333333.3333333333333333333333333333333333

Accuracy Problem

The above values should be the same, but differ by 0.0000000000000003333333333333333. In the energy & pharma sector, prices with 6 to 9 decimal places are common and often go along with high quantities. As a result, these errors easily show up at cent level.

XSLT using decimal-based floating-point (IEEE 754:2008 or later)

($quantity * ($priceAmount div $baseQuantity)) = (1000000000.0 * (1.0 div 3)) = 333333333.3333333333333333333333333
($quantity *  $priceAmount div $baseQuantity)  = (1000000000.0 *  1.0 div 3)  = 333333333.3333333333333333333333333
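The decimal calculation can be reproduced outside of XSLT with plain Java. The sketch below is not the code of this fork; it assumes that MathContext.DECIMAL128 (34 significant digits, the IEEE 754-2008 decimal128 format) is an adequate stand-in for the decimal floating-point described here, and the class and method names are hypothetical. The input values are those of the invoice example.

```java
import java.math.BigDecimal;
import java.math.MathContext;

// Illustrative sketch only: class and method names are hypothetical, not part of Saxon.
class DecimalInvoiceDemo {

    // Assumption: decimal128 precision (34 significant digits, rounding mode HALF_EVEN).
    static final MathContext MC = MathContext.DECIMAL128;

    // $quantity * ($priceAmount div $baseQuantity)
    static BigDecimal divideFirst(BigDecimal quantity, BigDecimal priceAmount, BigDecimal baseQuantity) {
        return quantity.multiply(priceAmount.divide(baseQuantity, MC), MC);
    }

    // $quantity * $priceAmount div $baseQuantity
    static BigDecimal multiplyFirst(BigDecimal quantity, BigDecimal priceAmount, BigDecimal baseQuantity) {
        return quantity.multiply(priceAmount, MC).divide(baseQuantity, MC);
    }

    public static void main(String[] args) {
        BigDecimal quantity = new BigDecimal("1000000000.0");
        BigDecimal priceAmount = new BigDecimal("1.0");
        BigDecimal baseQuantity = new BigDecimal("3");

        BigDecimal a = divideFirst(quantity, priceAmount, baseQuantity);
        BigDecimal b = multiplyFirst(quantity, priceAmount, baseQuantity);

        System.out.println(a.toPlainString());
        System.out.println(b.toPlainString());
        System.out.println("same value: " + (a.compareTo(b) == 0));
    }
}
```

For these inputs both evaluation orders yield the same 34-digit value. Decimal floating-point still rounds, so this is not a guarantee for every combination of inputs, but the rounding happens on decimal digits and is therefore predictable for commercial amounts.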

For further information on decimal-based floating-point

How Accuracy was improved in Saxon

This Saxon update is achieved by several minor enhancements:

  1. Using solely decimal-based floating-point instead of binary floating-point. The fix was to disable Double creation in NumericValue.
  2. Extending the existing BigDecimal implementation to full floating-point support.
  3. Adding decimal-based floating-point support with the highest Java precision to the multiplication and division of BigDecimals.
  4. Adding a round-half-away-from-zero() function (HALF_UP in Java) as an integrated extension function of SAXON. Half-away-from-zero rounding is the default rounding in EU e-commerce, the rounding most of us learned in school, and it has now also been added as default rounding to the EN16931 specification. The W3C XPath round() function differs by always rounding halves towards positive infinity, e.g. -1.5 becomes -1.
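Why the first enhancement avoids creating a Double can be illustrated with plain Java. This is a sketch with hypothetical names and not the NumericValue code of Saxon: the same lexical value is turned into a BigDecimal once via a binary double and once directly from its decimal text.

```java
import java.math.BigDecimal;

// Illustrative sketch only: class and method names are hypothetical, not part of Saxon.
class LexicalDecimalDemo {

    // Detour via a binary double: the value is already inexact before any arithmetic happens.
    static BigDecimal viaDouble(String lexical) {
        return new BigDecimal(Double.parseDouble(lexical));
    }

    // Direct decimal construction: the value is exactly what was written.
    static BigDecimal viaDecimal(String lexical) {
        return new BigDecimal(lexical);
    }

    public static void main(String[] args) {
        System.out.println("via double:  " + viaDouble("0.1"));
        System.out.println("via decimal: " + viaDecimal("0.1"));

        // 0.1 + 0.2 is exactly 0.3 in decimal arithmetic, but not in binary double arithmetic.
        System.out.println(viaDecimal("0.1").add(viaDecimal("0.2")).compareTo(viaDecimal("0.3")) == 0);
        System.out.println(0.1 + 0.2 == 0.3);
    }
}
```

Once a value has passed through a double, later decimal arithmetic cannot recover the lost exactness, which is why the conversion has to be avoided at the point where numbers are created.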

Building Saxon from latest Sources

As the Saxon HE sources do not exist on GitHub, I downloaded the sources and the pom.xml from the Maven Repository into a Maven directory structure. To make the JAR usable, further artifacts had to be copied from the published SAXON JAR:

  • META-INF folder - (but removing all signature information)
  • src/main/resources/net/sf/saxon/data/

I have added a smoke test case to ease debugging from the IDE. The output XML file is generated as target/generated-sources/out.xml. JDK 1.8 (as required by the original Saxon of Saxonica) and Maven are needed as build environment. Build & smoke test can be executed from the command line by calling: mvn clean install

Updating Saxon Version

There is a bash script 'saxon-update.sh', which downloads the specified Saxon-HE version from Maven and rebases our changes on top of it.

  1. The two variables for the next & current version of Saxon need to be adapted (see Maven for the latest version). In addition, this change to 'saxon-update.sh' must first be committed on the accuracy-feature branch, otherwise the script will not start.
  2. Sometimes there may be merge conflicts if Saxon changed a line we are adapting (the script will stop). In this case, first resolve the rebase conflicts manually, then perform the last three lines of the script (change of version in pom.xml and its commit) manually.
  3. Test that the sources build & our test runs without error (the errors in JavaDoc can be ignored).
  4. Manually tag the latest commit to trigger the automatic GitHub release deployment, see chapter GitHub Actions below.
    git tag -sm <TAG_MESSAGE> <TAG_LABEL>
    e.g. "git tag -sm v12.4 v12.4" # -s signs the tag & -m takes the next parameter as message

Git Branches

  1. accuracy-feature (our feature branch - our feature on top of the Saxon functionality) - we only commit to this branch!
    Our feature branch, which is continuously updated. It contains the script and everything on top of existing Saxon. It is rebased on top of the SAXON sources (saxon-upstream).
  2. saxon-upstream (automatically generated - don't touch)
    As Saxon is not available on GitHub, we need to create the sources from the Maven source & binary JARs (downloaded, extracted and normalized (dos2unix) via our bash script). This branch contains only the Java sources of Saxon, without the pom.xml, which would result in continuous merge conflicts (e.g. version number changes). The required parts of the Maven Saxon sources and binaries JARs are added on top of this branch.
  3. SAXON-HE-v<VERSION> (original Saxon sources - could be used for other features on top of Saxon)
    Branch with the original Saxon functionality. It forks saxon-upstream and first adds the original Saxon pom.xml, also downloaded from Maven. An additional commit overwrites this pom.xml with the pom.xml of our feature branch so that the Saxon sources can be built.
  4. SAXON-HE-accuracy-v<VERSION> (our feature-enriched Saxon sources - could be used for maintenance)
    This is the maintenance branch of our enriched Saxon sources. As we always rebase our feature branch (accuracy-feature) on top of the Saxon sources (saxon-upstream), all commits get new hashes and the original branch (and its commits) would otherwise be lost.
  5. prototyping (deprecated (purely historical) - don't touch)
    Initial work before it was refactored to be automated by the bash script 'saxon-update.sh'.
  6. basics (deprecated (purely historical) - don't touch)
    Branch on which the initial bash script was started.

GitHub Actions

There are two GitHub Actions:

  1. Build: Triggered by every push or pull request on the default branch.
  2. Deployment: Triggered whenever a tag is pushed. A GitHub release is then created automatically using the version number extracted from the pom.xml file, for instance:
    1. git tag -sm <TAG_MESSAGE> <TAG_LABEL>
      e.g. "git tag -sm v12.4 v12.4" # -s signs the tag & -m takes the next parameter as message
    2. git push --force --follow-tags --all origin # pushing with force (as we rebased our feature branch "accuracy-feature") with all tags & all branches to origin (this repo). Note: Overwriting a release does not work; an existing release for the same version from pom.xml has to be deleted manually!

Reports to Saxonica
