Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add list intersection processor #269

Merged
merged 4 commits into from
Mar 2, 2022
Merged

Conversation

DanMelman
Copy link
Collaborator

@DanMelman DanMelman commented Mar 1, 2022

This PR introduces a new processor that allows intersecting two list fields.
The result of the intersection is saved to a chosen target field.
If the intersection is empty, we just add an empty list to that field.

Iterable<Object> listB = doc.getField(sourceFieldB);

Set<Object> intersection = new HashSet<>(intersection(listA, listB));
if (isNotEmpty(intersection)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not create the field and leave the value empty?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be easier for querying later (using exists), but can be changed (or even be configurable) if we want

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems counter-intuitive to me. But I am not sure what is the common behavior in other processors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it makes more sense to add an empty list. I just hope this doesn't produce more complexity later when querying it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Note that the automatic build currently does not run all unit tests - this is something @alexpalchuk started working on. So make sure to run all unit tests locally.
Similarly, the automatic upload to the maven repository is also broken...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I have already run all tests locally using mvn

@DanMelman DanMelman requested a review from barakm March 1, 2022 15:14
Comment on lines +18 to +19
@ProcessorProvider(type = "listIntersect", factory = ListIntersectProcessor.Factory.class)
public class ListIntersectProcessor implements Processor {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to call it arraysIntersect and ArraysIntersectProcessor, this way it's aligned with ElasticSearch type names. (there is no list in ES / OS).

plural form is probably better because we need 2 arrays to do the intersect

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shit. I missed this comment @matvey-mtn.

@DanMelman DanMelman merged commit b397b8e into master Mar 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants