-
Notifications
You must be signed in to change notification settings - Fork 8.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rollup index patterns with live data #21187
Rollup index patterns with live data #21187
Conversation
💔 Build Failed |
If you want to play around with creating rollup jobs and index patterns, generate some makelogs data. Then create a rollup job and run it. The configuration I've been using which rolls up
Start job:
Let job run for about a minute, or until you see that it has been triggered and documents have been processed. You can get these stats at this endpoint:
I always stop my jobs after the documents have been processed since I'm not feeding in continuous data:
|
- Timelion `index` suggestions - ML - Graph
8d32307
to
c7f8131
Compare
💔 Build Failed |
💔 Build Failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 Looks awesome! Just had a few suggestions, mostly about readability.
$scope.editingId = $route.current.params.indexPatternId; | ||
$scope.$watch('defaultIndex', () => renderList()); | ||
config.bindToScope($scope, 'defaultIndex'); | ||
$scope.$apply(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what motivated these changes? And why call $scope.$apply()
? I think it's generally considered a bad practice because it can cause a $digest already in progress
error. The safer way would be to wrap some code inside of $evalAsync
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a change on top of a previous change that converted the sidebar to react and I ran into some issues with rendering the react component. I agree $scope.$apply()
should be avoided, will take a closer look at this next pass with my work for scripted fields
|
||
illegalCharacters = (characters = []) => { | ||
return ['*'].concat(characters); | ||
isRollupIndex = (indexName) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is pretty simple and straightforward, but because it's so far removed from the context in which it's consumed, maybe some unit tests would be useful for establishing intention and expected use cases? I'm mostly thinking for the getIndexPatternCreationOption
, isRollupIndex
, getIndexTags
, and checkIndicesForErrors
methods. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 will add tests next pass too
const jobs = indexPattern.typeMeta && indexPattern.typeMeta.jobs; | ||
const capabilities = jobs && indexPattern.typeMeta.capabilities && indexPattern.typeMeta.capabilities[jobs[0]].fields[field]; | ||
const allAggs = indexPattern.typeMeta && indexPattern.typeMeta.aggs; | ||
const fieldAggs = allAggs && Object.keys(allAggs).filter(agg => allAggs[agg][field]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would 0
or an empty string be valid values for allAggs[agg][field]
here? In which case we'd need to check against null/undefined instead of falsiness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, allAggs[agg][field]
should be an array - will add an array check here
// date histogram field. | ||
if( | ||
!allAggs[aggName] || | ||
(!allAggs[aggName][fieldName] && aggName !== 'date_histogram') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about extracting these into variables to make the code more self-descriptive?
const aggDoesntExist = !allAggs[aggName];
const fieldDoesntExist = !allAggs[aggName][fieldName];
const aggIsntDateHistogram = aggName !== 'date_histogram';
if(
aggDoesntExist ||
(fieldDoesntExist && aggIsntDateHistogram)
) {
allAggs[aggName] = allAggs[aggName] || {};
allAggs[aggName][fieldName] = { ...agg };
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if allAggs[aggName]
is undefined, const fieldDoesntExist = !allAggs[aggName][fieldName];
will throw an error. will change this to const fieldDoesntExist = allAggs[aggName] && !allAggs[aggName][fieldName];
// For date histograms, if it is on the same field, check that the configuration is identical, | ||
// otherwise reject. If not the same field, reject; | ||
case 'date_histogram': | ||
if(!allAggs[aggName][fieldName] || !isEqual(allAggs[aggName][fieldName], agg)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can reuse one of the variables here:
const isDifferentDateHistogram = !isEqual(allAggs[aggName][fieldName], agg);
if(fieldDoesntExist || isDifferentDateHistogram) {
throw new Error('Multiple date histograms configured');
}
} catch(err) { | ||
if (isEsError(err)) { | ||
return reply(wrapEsError(err)); | ||
switch(agg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check my logic on this, but it looks like we're duplicating some code here? Also, maybe we could add a comment on why we're caching the filtered fields?
const regularFields = fields.filter(field => !rollupFieldNames.includes(field));
// Cache this for later for some reason?
rollupFieldNames.push(...regularFields);
switch(agg) {
case 'date_histogram':
const fieldsWithRollupCapabilities = map(field => {
return {
...fieldsFromFieldCapsApi[`${field}.${agg}.timestamp`],
...defaultField,
name: field,
};
});
rollupFields.push(...fieldsWithRollupCapabilities);
break;
default:
const fieldsWithRollupCapabilities = map(field => {
return {
...fieldsFromFieldCapsApi[`${field}.${agg}.value`],
...defaultField,
name: field,
};
});
rollupFields.push(...fieldsWithRollupCapabilities);
break;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I de-duped the code here, got rid of the switch altogether. the only thing that actually needed to be different is ${field}.${agg}.timestamp
vs ${field}.${agg}.value
so thanks for catching that.
the rationale behind this code is:
- We get rollup aggregation capabilities information back in this kind of format. Note that
bytes
belongs to more than one aggregation, and we don't have any information as what type fields are (numeric, keyword, etc):
{
logstash_rollup: {
aggs: {
avg: { bytes: {agg: "avg"}, memory: {agg: "avg"}},
min: { bytes: {agg: "min"} }
...
}
},
...
}
- We get fields information back via regular field capabilities (
getFieldCapabilities()
) that tells us type, but they are keyed with internal rollup field names:
{
fields: {
"bytes.min.value": {
name: "bytes.min.value",
type: "number",
searchable: true,
aggregatable: true,
readFromDocValues: true
},
...
}
}
- Logic in this endpoint looks through each aggregation, and for each field in the aggregation, pulls the relevant information from field capabilities by using the key
${field}.${agg}.value
(or${field}.${agg}.date_histogram
for date histogram aggs) - Since a field can have multiple aggregations, if we don't cache and check the rollup fields we have already collected, we end up with duplicate field information
return { name: savedObject.attributes.title }; | ||
}); | ||
return resp.savedObjects | ||
.filter(o => !o.get('type')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename o
to savedObject
?
x-pack/plugins/graph/public/app.js
Outdated
@@ -661,7 +661,7 @@ app.controller('graphuiPlugin', function ($scope, $route, $interval, $http, kbnU | |||
} | |||
} | |||
|
|||
$scope.indices = $route.current.locals.indexPatterns; | |||
$scope.indices = $route.current.locals.indexPatterns.filter(o => !o.get('type')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename it to indexPattern
here?
@@ -48,7 +48,7 @@ module.controller('MlNewJobStepIndexOrSearch', | |||
timefilter.disableTimeRangeSelector(); // remove time picker from top of page | |||
timefilter.disableAutoRefreshSelector(); // remove time picker from top of page | |||
|
|||
$scope.indexPatterns = getIndexPatterns(); | |||
$scope.indexPatterns = getIndexPatterns().filter(o => !o.get('type')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here?
💔 Build Failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment about the filter used for the ml index patterns edit.
@@ -48,7 +48,7 @@ module.controller('MlNewJobStepIndexOrSearch', | |||
timefilter.disableTimeRangeSelector(); // remove time picker from top of page | |||
timefilter.disableAutoRefreshSelector(); // remove time picker from top of page | |||
|
|||
$scope.indexPatterns = getIndexPatterns(); | |||
$scope.indexPatterns = getIndexPatterns().filter(indexPattern => !indexPattern.get('type')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to explicitly filter out index patterns why type==='rollup'
, rather than just anything where type
is set? Are rollups the only ones that curently set type
? In the future, if any new types were added they would get filtered out too, which not be the behavior we want. There is a constants file, x-pack/plugins/ml/common/constants/index_patterns.js
, which could have a TYPE_ROLLUP
const added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@peteharverson type
is a new field on index patterns, so rollup ones are the only ones that will have a value. I don't foresee any other types in the near future. This way of filtering is to avoid rollup references in OSS code, but I suppose it is different here since ML is x-pack 😃Thanks for the constants file path. I'll think on this a little more, including our policies around referencing x-pack plugins from another x-pack plugin, and may adjust this filtering in my next pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to bring this up later, but I would like to reopen the discussion around avoiding referencing rollup index patterns from within OSS Kibana (and from within X-Pack plugins by extension).
I think that it makes sense to avoid referencing X-Pack features from within OSS (e.g. monitoring, reporting), because those things literally cannot exist unless X-Pack is installed. An OSS installation of the stack will never encounter those things. But rollups are different, because they generate artifacts in the form of rollup indices, which can exist within OSS installations. I think it makes sense for OSS Kibana to have awareness of these artifacts and the capability of dealing with them intelligently. So I think this is one very good case for making an exception to the rule.
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
* Use new rollup capabilities keyed by rollup index name ES api, remove wildcard restriction * Adjust plugins to work with new typeMeta information on rollup index patterns * Disable rollup index patterns from: - Timelion `index` suggestions - ML - Graph * Set Discover to use default search strategy
Part of #20004
Rollup index pattern creation changes:
getIndexPatternCreationQuery()
from registry, since we no longer need to filter list of indices in wizardgetAllowWildcards()
from registry, since we now allow wildcardsillegalCharacters()
from registry, this was only used to exclude*
for wildcardsgetFetchForWildcardOptions ()
to registry to allow additional information to be passed to/api/index_patterns/rollup/_fields_for_wildcard
endpointrollup.capabilitiesByRollupIndex
to JS libmergeJobConfigurations()
andareJobsCompatible()
helpers in case a rollup index has more than one job configuredtypeMeta
of a rollup index pattern has been changed to be friendlier for Kibana usage (keyed by agg then fields), it will now look something like like:Disable rollup index patterns from:
index
suggestionsDiscover
setPreferredSearchStrategyId()
to search source registry to force Discover to use_msearch
_msearch
/_search
not being able to handle range queries on rollup date histogram fieldsize: 0
limitation in_rollup_search
? Or revisit the necessity of supporting rollup index patterns in Discover.