Description
Change
NodeValue's _setByValue method only handles XSD datatypes, yet it eagerly materializes the lexical form even for datatypes outside the XSD namespace. This introduces a noticeable performance overhead when dealing with datatype extensions such as geometries or JSON objects that are only used as intermediate values.
With my current workload of many small JSON objects, the overhead is around 5-10%.
NodeValue itself bears the following comment:
- Conversely, delaying turning a value into a graph node is
valuable because intermediates, like the result of 2+3, will not
be needed as nodes unless assignment (and there is no assignment
in SPARQL even if there is for ARQ).
Node level operations like str() don't need a full node.
The simple solution is to defer materialization of the lexical form until after the given Node has been confirmed to have a datatype in the XSD namespace.
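The ordering can be illustrated with a minimal, self-contained sketch. It does not use Jena's actual classes: `DeferredLexicalSketch`, its nested `Literal` type, the `lexicalFormCalls` counter and the simplified `setByValue` method are all hypothetical stand-ins, and only two XSD datatypes are handled for brevity. The point is just that the cheap namespace check runs before the potentially expensive lexical form is requested.

```java
import java.math.BigInteger;
import java.util.function.Supplier;

/** Hypothetical stand-ins for illustration; these are not Jena's actual classes. */
final class DeferredLexicalSketch {
    static final String XSD_NS = "http://www.w3.org/2001/XMLSchema#";

    /** A typed literal whose lexical form may be expensive to produce (e.g. JSON serialization). */
    static final class Literal {
        private final String datatypeUri;
        private final Supplier<String> lexicalFormSupplier;
        int lexicalFormCalls = 0; // counts how often the lexical form was materialized

        Literal(String datatypeUri, Supplier<String> lexicalFormSupplier) {
            this.datatypeUri = datatypeUri;
            this.lexicalFormSupplier = lexicalFormSupplier;
        }

        String getDatatypeUri() {
            return datatypeUri;
        }

        String getLexicalForm() {
            lexicalFormCalls++;
            return lexicalFormSupplier.get();
        }
    }

    /** Returns a parsed value for the XSD datatypes handled here, or null otherwise. */
    static Object setByValue(Literal lit) {
        String dt = lit.getDatatypeUri();

        // Cheap check first: anything outside the XSD namespace is not handled,
        // so its lexical form is never requested.
        if (dt == null || !dt.startsWith(XSD_NS)) {
            return null;
        }

        // Only now pay for the lexical form.
        String lex = lit.getLexicalForm();
        if (dt.equals(XSD_NS + "integer")) {
            return new BigInteger(lex.trim());
        }
        if (dt.equals(XSD_NS + "boolean")) {
            return Boolean.valueOf(lex.trim());
        }
        return null; // other XSD datatypes omitted from this sketch
    }
}
```

With this ordering, a literal typed with a non-XSD datatype (for example a JSON extension datatype) passes through `setByValue` without its lexical form ever being computed, while XSD literals behave as before.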
As a follow-up question, I wonder whether it is really necessary for _setByValue to always go via the lexical form for all XSD types, or whether, as a future improvement, it would be possible to reuse the LiteralLabel's Java object.
Profile with the enhancement. Note that JsonWriter.string() no longer appears:

Are you interested in contributing a pull request for this task?
Yes
