Skip to content

JENA-2356: Fix race in QueryEngineRegistry, UpdateEngineRegistry init#2047

Merged
afs merged 1 commit intoapache:mainfrom
shawnsmith:factory-init
Oct 21, 2023
Merged

JENA-2356: Fix race in QueryEngineRegistry, UpdateEngineRegistry init#2047
afs merged 1 commit intoapache:mainfrom
shawnsmith:factory-init

Conversation

@shawnsmith
Copy link
Copy Markdown
Contributor

Pull request Description:

Fix the following exception observed in production:

java.util.ConcurrentModificationException: null
	at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1013)
	at java.base/java.util.ArrayList$Itr.next(ArrayList.java:967)
	at org.apache.jena.sparql.engine.QueryEngineRegistry.find(QueryEngineRegistry.java:94)
	at org.apache.jena.query.QueryExecutionBuilder.build(QueryExecutionBuilder.java:98)
	at org.apache.jena.query.QueryExecutionFactory.make(QueryExecutionFactory.java:622)
	at org.apache.jena.query.QueryExecutionFactory.make(QueryExecutionFactory.java:602)
	at org.apache.jena.query.QueryExecutionFactory.create(QueryExecutionFactory.java:146)
	at org.apache.jena.query.QueryExecutionFactory.create(QueryExecutionFactory.java:158)
        ...

The root cause is that two threads simultaneously called QueryEngineRegistry.get() for the first time.

At first glance, the old initialization code in QueryEngineRegistry looks ok because init() is called during class loading:

public class QueryEngineRegistry
{
    List<QueryEngineFactory> factories = new ArrayList<>();
    static { init(); }        <========== THIS LOOKS LIKE IT SAFELY INITIALIZES THE REGISTRY

    // Singleton
    static QueryEngineRegistry registry = null; <===== EXCEPT THIS EXECUTES AFTER init() AND CLEARS THE REGISTRY!
    static public QueryEngineRegistry get()
    {
        if ( registry == null )
            init();
        return registry;  <===== THIS PATTERN IS NOT THREAD-SAFE, CAN EXPOSE PARTIALLY-INITIALIZED REGISTRY
    }

    private QueryEngineRegistry() { }

    private static synchronized void init()
    {
        registry = new QueryEngineRegistry();
        registry.add(QueryEngineMain.getFactory());
        registry.add(QueryEngineFactoryWrapper.get());
    }

But if you run the static initialize blocks sequentially, you'll see that things can progress in the following order:

  1. Class initialization calls init() and sets registry to a valid object.
  2. Class initialization executes registry = null;
  3. The first call to QueryEngineFactory.get() sees registry == null and makes a second call to init()
  4. The first line of init() sets registry to an empty object.
  5. Before the second line init() executes, a concurrent call to QueryEngineFactory.get() sees registry != null and tries to use an empty registry.

This PR fixes the static call to init() and removes unnecessary lazy initialization logic from get().

  • As far as I can tell, the only risk with simplifying get() would be if some other class did QueryEngineRegistry.registry = null, forcing the class to re-initialize itself. None of the existing Jena code does that but, to be sure, this PR changes the static QueryEngineRegistry registry field to private.
  • It would also be nice to make the field final. But that's safe only if we can guarantee that QueryEngineRegistry.init() will never call back into QueryEngineRegistry.get() via the method calls to QueryEngineMain or QueryEngineFactoryWrapper, and I didn't want to make that assumption. If it's ok to assume that, I'd be happy to update the PR to use final.

  • Tests are included.
  • Documentation change and updates are provided for the Apache Jena website
  • Commits have been squashed to remove intermediate development commit messages.
  • Key commit messages start with the issue number (GH-xxxx, or if in JIRA, JENA-xxxx)

By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.


See the Apache Jena "Contributing" guide.

@afs afs self-assigned this Oct 20, 2023
@afs
Copy link
Copy Markdown
Member

afs commented Oct 21, 2023

@shawnsmith Looks good to me. Because initialization is now only in class initialization, it will be thread-safe.

Thank you!

@afs afs merged commit 9bb24d2 into apache:main Oct 21, 2023
@shawnsmith shawnsmith deleted the factory-init branch October 23, 2023 02:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants