Implement Templates for doing XSLT transformations #67

Open
michaelwechner opened this issue Mar 3, 2014 · 5 comments

Comments

@michaelwechner
Member

The class

src/impl/java/org/wyona/yanel/impl/resources/BasicXMLResource.java

is currently re-parsing XSLTs with every request. To improve performance, it would probably make sense to use javax.xml.transform.Templates, which holds a compiled stylesheet that can be reused across requests.

See for example

http://www.javaworld.com/article/2073394/java-xml/transparently-cache-xsl-transformations-with-jaxp.html
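
To illustrate the idea, here is a minimal sketch of compiling a stylesheet once into a Templates object and creating a fresh Transformer from it for each request. The class name, the stylesheet, and the input documents are made up for this illustration and are not taken from Yanel.

```java
import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplatesReuseSketch {
    // Hypothetical stylesheet, used only for this illustration.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/greeting'>Hello, <xsl:value-of select='@name'/>!</xsl:template>"
      + "</xsl:stylesheet>";

    // Expensive step: parse and compile the stylesheet. Do this once and keep the result.
    static Templates compile(String xslt) {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException("Stylesheet does not compile", e);
        }
    }

    // Cheap step: a Transformer is not thread-safe, so create one per request
    // from the shared, thread-safe Templates object.
    static String transform(Templates templates, String xml) {
        try {
            Transformer transformer = templates.newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        Templates templates = compile(XSLT); // compiled once
        System.out.println(transform(templates, "<greeting name='Yanel'/>")); // prints "Hello, Yanel!"
        System.out.println(transform(templates, "<greeting name='JAXP'/>"));  // prints "Hello, JAXP!"
    }
}
```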

@baszero
Contributor

baszero commented Mar 3, 2014

Thanks for creating this issue.
The discussed approach with the template cache sounds interesting and makes complete sense.

Should this approach find its way into Yanel, it would make sense to also add a control flag that can be overridden per realm, e.g.
xsltMode = [default, templates]

With "default", today's algorithm is used; with "templates", the new algorithm is used. The flag would be read when the realm starts up.
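
A possible shape for such a flag, as a rough sketch only: the enum, the class name, and the parsing rules below are assumptions for illustration. How the value would actually be read from a realm configuration is not shown, because no Yanel API for it is given in this thread.

```java
import java.util.Locale;

class XsltModeSketch {
    // Hypothetical values mirroring "xsltMode = [default, templates]".
    enum XsltMode { DEFAULT, TEMPLATES }

    // Parse the configured string once at startup. Missing or unknown values
    // fall back to DEFAULT so that existing realms keep today's behaviour.
    static XsltMode parse(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return XsltMode.DEFAULT;
        }
        try {
            return XsltMode.valueOf(configured.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return XsltMode.DEFAULT;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("templates")); // prints "TEMPLATES"
        System.out.println(parse(null));        // prints "DEFAULT"
        System.out.println(parse("unknown"));   // prints "DEFAULT"
    }
}
```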

@baszero
Contributor

baszero commented Apr 10, 2014

This is a short update on how I introduced XSL Templates objects (and caching) into my realm. Performance improved dramatically in our case: XSL rendering is up to 10 times faster than with the current Yanel implementation.

BasicXMLResource.getTransformedInputStream()

I replaced this line
xsltHandlers[i] = tf.newTransformerHandler(source);

by
xsltHandlers[i] = getTransformerHandler(source, tf);

In the same class I added this protected method:

protected TransformerHandler getTransformerHandler(Source source, SAXTransformerFactory tf) throws TransformerConfigurationException {
    return tf.newTransformerHandler(source);
}

So the change above does not alter any behaviour; it only makes it possible to override the getTransformerHandler() method.

A new class extending BasicXMLResource with caching capabilities
As a next step I created a new class that extends BasicXMLResource and overrides the method above. The class does nothing other than override that method:

@Override
protected TransformerHandler getTransformerHandler(Source source, SAXTransformerFactory tf) throws TransformerConfigurationException {
    // Caching is opt-in; if reading the flag fails, fall back to the uncached path.
    boolean useCaching = false;
    try {
        useCaching = retrieveFlagFromWhereYouWant();
    } catch (Exception e) {
        log.error(e, e);
    }

    // The system ID is the cache key; without one the stylesheet cannot be cached.
    String sourceId = source.getSystemId();
    if (!useCaching || sourceId == null) {
        return tf.newTransformerHandler(source); // the normal way
    }

    // Compile the stylesheet only on a cache miss, then reuse the Templates object.
    // Errors propagate to the caller instead of returning a null handler.
    Templates template = XslTemplatesCache.get(sourceId);
    if (template == null) {
        template = tf.newTemplates(source);
        XslTemplatesCache.put(sourceId, template);
    }
    return tf.newTransformerHandler(template);
}

As you can see, if caching is enabled (you have to implement the flag on your own, at resource level or globally), it uses the cached Templates.

The Templates Cache Class
And here is the class implementing the cache. Note that it uses the java.util.concurrent.locks package, in particular ReentrantReadWriteLock. This lock guarantees the following:

  • Multiple readers can read from the cache simultaneously as long as no thread holds the write lock.
  • A writer can only write to the cache once all readers have released the read lock.

Using ReentrantReadWriteLock matters for a read-heavy cache like this: lookups happen on every request, while writes only happen when a stylesheet is compiled for the first time.

package com.zwischengas.jaxp;

import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

import javax.xml.transform.Templates;

import org.apache.log4j.Logger;

/**
 * This cache uses the ReentrantReadWriteLock from the java.util.concurrent.locks package instead of using synchronized code sections. 
 * Reason to use ReentrantReadWriteLock : we expect many concurrent Read Threads and only very few write threads. 
 * So we want to allow multiple read threads in parallel in order to improve performance.
 * 
 * @author baszero
 */
public class XslTemplatesCache {
    protected static Logger log = Logger.getLogger(XslTemplatesCache.class);

    private static final Map<String, Templates> templatesCache = new HashMap<String, Templates>();
    private static final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock(true); // true = fair policy: locks are granted in approximately arrival order
    private static final Lock r = rwl.readLock();
    private static final Lock w = rwl.writeLock();

    public static Templates get(String key) {
        r.lock(); // here it only waits if another thread is writing to the cache
        try {
            return templatesCache.get(key);

        } finally {
            r.unlock();
        }
    }

    public static Templates put(String key, Templates value) {
        w.lock(); // this thread waits until all preceding read threads have finished
        try {
            return templatesCache.put(key, value);

        } finally {
            w.unlock();
        }
    }

    public static void clear() {
        w.lock();
        try {
            templatesCache.clear();
        } finally {
            w.unlock();
        }
    }

    /**
     * @return number of entries
     */
    public static int size() {
        int result = -1;
        r.lock();
        try {
            result = templatesCache.size();
        } finally {
            r.unlock();
        }
        return result;
    }

    public static Set<String> getKeys() {
        r.lock();
        try {
            // Return a snapshot: the live keySet() view must not be used outside the lock.
            return new java.util.HashSet<String>(templatesCache.keySet());
        } finally {
            r.unlock();
        }
    }

}
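
As a possible alternative to the explicit read/write lock above, the same compile-once behaviour can be sketched with ConcurrentHashMap.computeIfAbsent, which runs the compile step at most once per key. This is not the implementation described in this comment. The class name, the COMPILATIONS counter, and the temporary stylesheet are hypothetical and exist only to make the example runnable.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamSource;

class ConcurrentTemplatesCacheSketch {
    private static final Map<String, Templates> CACHE = new ConcurrentHashMap<>();
    // Counts stylesheet compilations, for demonstration only.
    static final AtomicInteger COMPILATIONS = new AtomicInteger();

    // Returns a new handler per call; the Templates behind it is compiled once per system ID.
    static TransformerHandler newHandler(String systemId, SAXTransformerFactory tf) {
        Templates templates = CACHE.computeIfAbsent(systemId, id -> {
            try {
                COMPILATIONS.incrementAndGet();
                return tf.newTemplates(new StreamSource(id));
            } catch (TransformerConfigurationException e) {
                throw new IllegalStateException("Cannot compile " + id, e);
            }
        });
        try {
            return tf.newTransformerHandler(templates);
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException("Cannot create handler for " + systemId, e);
        }
    }

    // Writes a tiny hypothetical stylesheet to a temp file and returns its system ID.
    static String writeTempStylesheet() {
        String xslt = "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
                + "<xsl:template match='/'><out/></xsl:template></xsl:stylesheet>";
        try {
            File file = File.createTempFile("sketch", ".xsl");
            file.deleteOnExit();
            Files.write(file.toPath(), xslt.getBytes(StandardCharsets.UTF_8));
            return file.toURI().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SAXTransformerFactory tf = (SAXTransformerFactory) TransformerFactory.newInstance();
        String systemId = writeTempStylesheet();
        TransformerHandler first = newHandler(systemId, tf);
        TransformerHandler second = newHandler(systemId, tf);
        System.out.println("distinct handlers: " + (first != second));
        System.out.println("compilations: " + COMPILATIONS.get());
    }
}
```

The demo uses the factory from a single thread. TransformerFactory instances are not guaranteed to be thread-safe, so a real integration would need to decide how the factory itself is shared.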

Final remarks

If you use the cached templates as described above, you will get the following behaviour:

  • Today you can quickly modify an XSL and instantly see the changes on the page.
  • With caching enabled, this live editing no longer works. However, I implemented an admin section in my application where I can clear all caches (just call XslTemplatesCache.clear()). This way I can still change an XSL on the fly and make the change live immediately, but only upon explicit request.
  • Performance improves in any case. If you use MANY includes (e.g. <xsl:import href="include.snippets.xsl" />), the gain is large. In my case, the XSL rendering part became 10 times faster.
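
If clearing the cache by hand becomes tedious, one conceivable variation is to store a version stamp with each entry and reload when the stamp changes. The sketch below is an assumption-laden illustration, not part of the implementation above: the class is hypothetical and generic, and in real use the value type would be Templates, the stamp could be the stylesheet file's last-modified time, and the loader would compile the stylesheet. Stylesheets pulled in via xsl:import or xsl:include would not be tracked by such a stamp, so an explicit clear() remains useful.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class StampedCacheSketch<V> {
    // A cached value together with the stamp it was loaded under.
    private static final class Entry<V> {
        final long stamp;
        final V value;
        Entry(long stamp, V value) { this.stamp = stamp; this.value = value; }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();

    // Returns the cached value if the stamp is unchanged, otherwise reloads it.
    // compute() runs atomically per key, so concurrent callers do not load twice.
    V get(String key, long currentStamp, Function<String, V> loader) {
        return entries.compute(key, (k, old) ->
                old != null && old.stamp == currentStamp
                        ? old
                        : new Entry<V>(currentStamp, loader.apply(k))).value;
    }

    // Explicit invalidation, comparable to the admin action described above.
    void clear() {
        entries.clear();
    }

    public static void main(String[] args) {
        StampedCacheSketch<String> cache = new StampedCacheSketch<>();
        int[] loads = {0};
        Function<String, String> loader = key -> "compiled#" + (++loads[0]);
        System.out.println(cache.get("page.xsl", 1000L, loader)); // prints "compiled#1"
        System.out.println(cache.get("page.xsl", 1000L, loader)); // prints "compiled#1" (cached)
        System.out.println(cache.get("page.xsl", 2000L, loader)); // prints "compiled#2" (stamp changed)
        cache.clear();
        System.out.println(cache.get("page.xsl", 2000L, loader)); // prints "compiled#3" (after clear)
    }
}
```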

@michaelwechner
Member Author

Dear Balz

Great :-), thanks very much. Will try to integrate it shortly.

All the best

Michael

@baszero
Contributor

baszero commented May 20, 2016

I see that this has still not found its way into Yanel, which is really a pity! The template cache boosts a Yanel app considerably if you have really large XSL files.

I will send a pull request, hoping that it will find its way into Yanel.

For the moment I will just overwrite the BasicXMLResource.java in my realm.

baszero added a commit to baszero/yanel that referenced this issue May 20, 2016
backwards compatible).
- new class XslTemplatesCache with high performance cache
- How to use the cache: extend BasicXMLResource and overwrite the method
getTransformerHandler()
- Example for how to use Cache:
wyona#67 (comment)
@baszero
Contributor

baszero commented May 20, 2016

Please close this issue and just process this pull request: #77
