-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-4952 - Introduce FedXConfig overrides for FedXRepositoryConfig #4953
GH-4952 - Introduce FedXConfig overrides for FedXRepositoryConfig #4953
Conversation
…oryConfig - Provides a subset of options
@@ -187,6 +203,14 @@ public void parse(Model m, Resource implNode) throws RepositoryConfigException { | |||
Models.objectLiteral(m.getStatements(implNode, DATA_CONFIG, null)) | |||
.ifPresent(value -> setDataConfig(value.stringValue())); | |||
|
|||
Models.objectResource(m.getStatements(implNode, FEDX_CONFIG, null)) | |||
.ifPresent(res -> { | |||
if (getConfig() == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I chose to do this so the previous behaviour is preserved:
- If no
FedXConfig
has been provided (programmatically or via template) then it will benull
- When exporting,
FedXConfig
is not included (whennull
) rather than being expanded into defaults (seeexport
above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good considerations! Please see my comments for the exporting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed with the other suggestion: 5a160df
Note: Integration test pipeline has failed due to an unrelated reason:
|
I've set it to rerun the integration test. I've not seen this test being flaky before, but maybe it is. |
It's not an issue with your code. It's a test that is marked as disabled, but has somehow started to run again. Not sure how that happened. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the initiative and contribution. I see the value of making certain settings of the federation easier configurable via the repository configuration file (e.g. managed from the RDF4J workbench). Actually I was also looking for such feature.
Overall the functional change looks clean and good. Thanks for the unit test coverage.
I have one comment though inline on compatibility (at least with the use FedX in our product). Can you please have a look at that?
In addition, users would benefit from a small documentation adjustment of https://rdf4j.org/documentation/programming/federation/#fedx-configuration (stating with a small example that settings can be added in the repository configuration). Likely the example from the javadoc is sufficient already. As far as I remember the documentation is also built from git, the location should be rdf4j/site/content/documentation/programming/federation.md
Regarding the contribution procedures I'd also like to see a confirmation from @hmottestad
/** | ||
* IRI of the property populating {@link FedXConfig#getEnforceMaxQueryTime()} | ||
*/ | ||
public static final IRI CONFIG_ENFORCE_MAX_QUERY_TIME = vf.createIRI(NAMESPACE, "enforceMaxQueryTime"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f.y.i: nowadays in RDF4J instead of ValueFactory
one could simply use Values.iri(NAMESPACE, "enforceMaxQueryTime")
For me OK to stick with the ValueFactory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know 👍
Happy to update or, based on your comment, do you prefer it "looks" the same as FedXRepositoryConfig
does?
Edit: Right - with the other change this will be N/A since they'll all use the ValueFactory
approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -152,6 +163,11 @@ public Resource export(Model m) { | |||
m.add(implNode, DATA_CONFIG, vf.createLiteral(getDataConfig())); | |||
} | |||
|
|||
if (getConfig() != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will mean: any externally set FedXConfig
(e.g. via code) will be serialized into the repository configuration, e.g. when saving it.
I think overall (from history) there is a bit of technical design issue in the FedXRepositoryConfig
and FedXConfig
: the FedXConfig
mixes "string" like options and things that can only be set via code (e.g. the TaskWrapper).
For our product use-case: we are having an extension of FedXConfig (with some proprietary options, and also a TaskWrapper) and we are setting this config via code. We generally would not want to see it exported to the repository config.
So maybe to retain compatibility on that level, we can use one of the following options:
a) introduce another boolean field (say isExternalConfig
) and only export if it is not externally provided
b) put the handling of the FedXConfig for both parsing and exporting to a protected method, which can be overridden in customizations of the repository config
I think option b) is cleaner an will allow most flexibility and still retain compatibility.
So the idea would be to move your new exporting/parsing part of the code from this class to new protected methods
protected void exportFedXConfig(Model model, Resource implNode)
protected void parseFedXConfig(Model model, Resource implNode)
and call those in the current places.
Does this make sense to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - that sounds good. I'll have a go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought - the proposed method signatures allow sub-classes to do whatever they want (be it config-related or not). Would it not make more sense to pass them the node which contains the config (like is done in the separate class) - separation of concerns and all that.
And e.g. allow new FedXConfig(ConfigReader)
to inject a custom or sub-classed implementation?
(But I also appreciate that this probably doesn't need to be over-engineered.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -187,6 +203,14 @@ public void parse(Model m, Resource implNode) throws RepositoryConfigException { | |||
Models.objectLiteral(m.getStatements(implNode, DATA_CONFIG, null)) | |||
.ifPresent(value -> setDataConfig(value.stringValue())); | |||
|
|||
Models.objectResource(m.getStatements(implNode, FEDX_CONFIG, null)) | |||
.ifPresent(res -> { | |||
if (getConfig() == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good considerations! Please see my comments for the exporting
@vtermanis did you have a chance to look at the above feedback? I think it would be ready with a minimal technical change of moving some pieces of code to protected methods. |
@aschwarte10 - sorry about the delay, we've been busy. I will update the PR based your recommendations. Thank you for the initial review. |
…RepositoryConfig
…RepositoryConfig
I've added the remaining conf options: f72c49a |
I've added that in a separate commit - see what you think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the comments.
Overall the change now looks clean and solid. I am fine to expose all configuration settings also as part of the repository config.
I have one comment / question inline, where I'd like to hear @hmottestad 's feedback.
|
||
BNode confNode = Values.bnode(); | ||
|
||
model.add(confNode, CONFIG_JOIN_WORKER_THREADS, vf.createLiteral(config.getJoinWorkerThreads())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the downside of the current implementation is that exporting will also write the FedXConfig default values, i.e. values that may not have been present in the repository configuration ttl before.
Ideally the parse / export logic would be consistent, i.e. running parse and export produces the same result.
I think I could live with this, but I would like hear @hmottestad's opinion whether this is in line with the RDF4J design and other repo configurations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally the parse / export logic would be consistent, i.e. running parse and export produces the same result.
One could check against the default values for each option, but ...
If the user has specified explicit configuration including default values in the input RDF and we choose to ignore defaults - then the export will not be the same since it'll be missing options previously included that were set to default values. (that information is lost on import.)
.. so unless all options were Optional
to show whether they've been set at any point we can't be sure (unless I'm mistaken).
So I'm not sure whether it's worth going for a halfway house. (But if you decide you'd like the check-whether-default-before-write - I can add it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave it up to @hmottestad as he has likely the broadest overview at this point.
As a side note: we are usually solving the parsing / export consistency by having "null" state for the field (i.e. Boolean, Integer, ...). In the current design this does not really fit as it uses the API level FedXConfig. To achieve this properly something similar one would need a repository level FedXConfig and some translation to the api level FedXConfig.
@@ -299,9 +299,10 @@ FedX provides various means for configuration. Configuration settings can be def | |||
|Property | Description | | |||
|---------|-------------| | |||
|prefixDeclarations | Path to prefix declarations file, see [PREFIX Declarations](#prefix-declarations) | | |||
|cacheLocation | Location where the memory cache gets persisted at shutdown, default _cache.db_ | | |||
|sourceSelectionCacheSpec | Cache specification for the `SourceSelectionMemoryCache`, default _maximumSize=1000,expireAfterWrite=6h_ | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing includeInferredDefault / consumingIterationMax
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…ration documentation
@aschwarte10 I've created a CQ for the license issue, but you don't need to wait for that before you merge this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the chance to have a conversation with @hmottestad and we are both fine with the exporting of defaults. Sorry for the delay
Hence, for me the change looks good to me. Thanks for the clean and concise contribution.
We will see this change as part of the 5.0 release of rdf4j.
No, problem. Thanks for the review! |
GitHub issue resolved: #4952
Briefly describe the changes proposed in this PR:
Introduce
FedXConfig
option overrides forFedXRepositoryConfig
so that repo templates can also access options which previously were only available programmatically.We've raised the PR against the
main
branch because from our understanding it's not clear yet when v5.0.0 will be released and we'd be looking to use this feature in production ourselves. Looking at what's changed between 4 latest & 5 (against the files affected by this change) I think it should be trivial to port.PR Author Checklist (see the contributor guidelines for more details):
mvn process-resources
to format from the command line)fixup
commits once review is complete.