Extract more standard metadata from binary files (#78754) #81106
Until now, we have extracted only a small number of fields from the binary files sent to the ingest attachment plugin:
content, title, author, keywords, date, content_type, content_length, language.

Tika has a list of more standard properties which can be extracted:
modified, format, identifier, contributor, coverage, modifier, creator_tool, publisher, relation, rights, source, type, description, print_date, metadata_date, latitude, longitude, altitude, rating, comments.

This commit exposes those new fields.
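
As a rough illustration of how the new fields might be requested, here is a minimal sketch that builds an ingest pipeline body for the attachment processor. It assumes the new fields can be selected through the processor's existing `properties` option using the names listed above; the helper `build_attachment_pipeline`, the source field `data`, and the chosen subset of properties are hypothetical and only for illustration.

```python
import json

# Field names copied from the description above.
EXISTING_FIELDS = [
    "content", "title", "author", "keywords", "date",
    "content_type", "content_length", "language",
]
NEW_FIELDS = [
    "modified", "format", "identifier", "contributor", "coverage",
    "modifier", "creator_tool", "publisher", "relation", "rights",
    "source", "type", "description", "print_date", "metadata_date",
    "latitude", "longitude", "altitude", "rating", "comments",
]


def build_attachment_pipeline(source_field="data", properties=None):
    """Return a pipeline body with a single attachment processor.

    Hypothetical helper: `properties` is assumed to accept the new
    field names in the same way as the existing ones.
    """
    known = set(EXISTING_FIELDS) | set(NEW_FIELDS)
    processor = {"attachment": {"field": source_field}}
    if properties is not None:
        unknown = [p for p in properties if p not in known]
        if unknown:
            raise ValueError(f"unknown properties: {unknown}")
        processor["attachment"]["properties"] = list(properties)
    return {
        "description": "Extract attachment metadata",
        "processors": [processor],
    }


# Request a mix of previously available and newly exposed fields.
pipeline = build_attachment_pipeline(
    properties=["content", "title", "modified", "publisher", "latitude", "longitude"]
)
print(json.dumps(pipeline, indent=2))
```

The printed JSON could then be sent as the body of a pipeline creation request; leaving `properties` unset keeps the processor's default selection.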
Related to #22339.
Co-authored-by: Keith Massey <keith.massey@elastic.co>
gradle check?