Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NIFI-10067 enable use of script, dynamic_templates and _bulk headers for PutElasticsearchJson/Record processors #6628

Closed
wants to merge 2 commits into from

Conversation

ChrisSamo632
Copy link
Contributor

@ChrisSamo632 ChrisSamo632 commented Nov 7, 2022

Summary

NIFI-10067 enable use of script when updating documents in Elasticsearch via PutElasticsearchJson or PutElasticsearchRecord

NIFI-3262 enable use of dynamic_templates in Elasticsearch _bulk operations

NIFI-3262 enable use of bulk header fields in Elasticsearch _bulk operations via BULK: dynamic headers in PutElasticsearchJson and PutElasticsearchRecord

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, as such NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 8
    • JDK 11
    • JDK 17

Licensing

  • [ ] New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • [ ] New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

@MikeThomsen
Copy link
Contributor

Are you planning to update the UpdateByQueryElasticsearch processor to support scripts? I was planning to do that and didn't see it getting updated in your PR.

@ChrisSamo632
Copy link
Contributor Author

Are you planning to update the UpdateByQueryElasticsearch processor to support scripts? I was planning to do that and didn't see it getting updated in your PR.

I don't believe there's anything that needs doing - the processor accepts arbitrary json as the "query" and sends that to elasticsearch, so you can already provide a script object within that

Or have you tried it and found something not working/you think could be improved?

Hopefully all that's left for this PR now is a rebase and finishing the additional docs for the PUT processors to describe the new user of dynamic properties to populate the bulk api request headers

@MikeThomsen
Copy link
Contributor

Or have you tried it and found something not working/you think could be improved?

My vision for that processor would be a query builder that allows you to break down the process. Ex specifying the script or script id and then the update query. That combined with the ability to just use the flowfile body as a raw document to punt to update_by_query.

@ChrisSamo632
Copy link
Contributor Author

Or have you tried it and found something not working/you think could be improved?

My vision for that processor would be a query builder that allows you to break down the process. Ex specifying the script or script id and then the update query. That combined with the ability to just use the flowfile body as a raw document to punt to update_by_query.

That (to me) sounds like a (potentially breaking) change to the processor's API, best left to another ticket/discussion

I'm happy to update the existing processor's documentation to clarify that the query property for UpdateByQuery is currently the request body that's sent to Elasticsearch for the _update_by_query endpoint, i.e. if you want to run a Script, write your JSON like {"script": {"source": "..."}, "query": {"match_all":{}}}

@ChrisSamo632
Copy link
Contributor Author

My vision for that processor...

@MikeThomsen FYI I raised NIFI-11016 to cover your future update to the UpdateByQueryElasticsearch processor if you wanted to do that in future

@ChrisSamo632 ChrisSamo632 force-pushed the NIFI-10067 branch 2 times, most recently from 9e6ff72 to 2de7ff5 Compare January 6, 2023 09:43
@ChrisSamo632 ChrisSamo632 changed the title NIFI-10067 enable use of script for Elasticsearch updates NIFI-10067 enable use of script, dynamic_templates and _bulk headers for PutElasticsearchJson/Record processors Feb 13, 2023
Copy link
Contributor

@gresockj gresockj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this improvement, @ChrisSamo632. A couple minor naming suggestions, and then I'd like to test this out.


static final PropertyDescriptor DYNAMIC_TEMPLATES = new PropertyDescriptor.Builder()
.name("put-es-json-dynamic_templates")
.displayName("Script")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a copy/paste error

@@ -190,6 +201,26 @@ public class PutElasticsearchRecord extends AbstractPutElasticsearch {
.expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)
.build();

static final PropertyDescriptor SCRIPT_RECORD_PATH = new PropertyDescriptor.Builder()
.name("put-es-record-script-path")
.displayName("script Record Path")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps capital S

@ChrisSamo632
Copy link
Contributor Author

ChrisSamo632 commented Mar 31, 2023

A community member has asked on Slack whether this change can also incorporate the option for scripted_upsert.

Options to set this flag for the PutElasticsearchRecord and PutElasticsearchJson processors should be added as part of this PR (it would make sense to do so, should be simple and hopefully completes the current set of Elasticsearch Update API options available to clients), to set the scripted_upsert: true flag in the Bulk API operation when wanted.

@ChrisSamo632
Copy link
Contributor Author

scripted_upsert option added to the PutElasticsearchJson and PutElasticsearchRecord processors (with associated Integration Tests in the ElasticsearchClientServiceImpl

ChrisSamo632 and others added 2 commits April 25, 2023 20:05
…rch via PutElasticsearchJson or PutElasticsearchRecord

NIFI-3262 enable use of dynamic_templates in Elasticsearch _bulk operations

NIFI-3262 enable use of bulk header fields in Elasticsearch _bulk operations via BULK: dynamic headers in PutElasticsearchJson and PutElasticsearchRecord
Copy link
Contributor

@gresockj gresockj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I tested out the latest changes -- thanks for the updates! I'll go ahead and merge.

@asfgit asfgit closed this in 3a3b9ed May 2, 2023
@ChrisSamo632 ChrisSamo632 deleted the NIFI-10067 branch May 2, 2023 16:01
ChrisSamo632 added a commit to ChrisSamo632/nifi that referenced this pull request May 2, 2023
…arch via PutElasticsearchJson or PutElasticsearchRecord

NIFI-3262: enable use of dynamic_templates in Elasticsearch _bulk operations

Signed-off-by: Joe Gresock <jgresock@gmail.com>

This closes apache#6628.
ChrisSamo632 added a commit to ChrisSamo632/nifi that referenced this pull request May 2, 2023
…arch via PutElasticsearchJson or PutElasticsearchRecord

NIFI-3262: enable use of dynamic_templates in Elasticsearch _bulk operations

Signed-off-by: Joe Gresock <jgresock@gmail.com>

This closes apache#6628.
asfgit pushed a commit that referenced this pull request May 3, 2023
…arch via PutElasticsearchJson or PutElasticsearchRecord

NIFI-3262: enable use of dynamic_templates in Elasticsearch _bulk operations

Signed-off-by: Joe Gresock <jgresock@gmail.com>

This closes #6628.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants