Migrations: Don't auto-create temp index (elastic#158182)
## Summary

Attempts to fix elastic#156117 (comment).

## Release notes
Fixes a race condition that could cause intermittent upgrade migration
failures when Kibana connects to a single node Elasticsearch cluster.

### Checklist

Delete any items that are not applicable to this PR.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] Any UI touched in this PR is usable by keyboard only (learn more
about [keyboard accessibility](https://webaim.org/techniques/keyboard/))
- [ ] Any UI touched in this PR does not create any new axe failures
(run axe in browser:
[FF](https://addons.mozilla.org/en-US/firefox/addon/axe-devtools/),
[Chrome](https://chrome.google.com/webstore/detail/axe-web-accessibility-tes/lhdoppojpmngadmnindnejefpokejbdd?hl=en-US))
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This renders correctly on smaller devices using a responsive
layout. (You can test this [in your
browser](https://www.browserstack.com/guide/responsive-testing-on-local-server))
- [ ] This was checked for [cross-browser
compatibility](https://www.elastic.co/support/matrix#matrix_browsers)

### Risk Matrix

Delete this section if it is not applicable to this PR.

Before closing this PR, invite QA, stakeholders, and other developers to
identify risks that should be tested prior to the change/feature
release.

When forming the risk matrix, consider some of the following examples
and how they may potentially impact the change:

| Risk | Probability | Severity | Mitigation/Notes |
|------|-------------|----------|------------------|
| Multiple Spaces: unexpected behavior in non-default Kibana Space. | Low | High | Integration tests will verify that all features are still supported in non-default Kibana Space and when user switches between spaces. |
| Multiple nodes: Elasticsearch polling might have race conditions when multiple Kibana nodes are polling for the same tasks. | High | Low | Tasks are idempotent, so executing them multiple times will not result in logical error, but will degrade performance. To test for this case we add plenty of unit tests around this logic and document manual testing procedure. |
| Code should gracefully handle cases when feature X or plugin Y are disabled. | Medium | High | Unit tests will verify that any feature flag or plugin combination still results in our service operational. |

[See more potential risk examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx)

### For maintainers

- [ ] This was checked for breaking API changes and was [labeled
appropriately](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

(cherry picked from commit 8e7e263)
rudolf committed Jun 4, 2023
1 parent a82751c commit fc7e1ad
Showing 12 changed files with 75 additions and 15 deletions.


@@ -26,6 +26,12 @@ export interface BulkOverwriteTransformedDocumentsParams {
index: string;
operations: BulkOperation[];
refresh?: estypes.Refresh;
+ /**
+ * If true, we prevent Elasticsearch from auto-creating the index if it
+ * doesn't exist. We use the ES parameter require_alias: true so `index`
+ * must be an alias, otherwise the bulk index will fail.
+ */
+ useAliasToPreventAutoCreate?: boolean;
}

/**
@@ -38,6 +44,7 @@ export const bulkOverwriteTransformedDocuments =
index,
operations,
refresh = false,
+ useAliasToPreventAutoCreate = false,
}: BulkOverwriteTransformedDocumentsParams): TaskEither.TaskEither<
| RetryableEsClientError
| TargetIndexHadWriteBlock
@@ -56,7 +63,7 @@ export const bulkOverwriteTransformedDocuments =
// probably unlikely so for now we'll accept this risk and wait till
// system indices puts in place a hard control.
index,
- require_alias: false,
+ require_alias: useAliasToPreventAutoCreate,
wait_for_active_shards: WAIT_FOR_ALL_SHARDS_TO_BE_ACTIVE,
refresh,
filter_path: ['items.*.error'],
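
To make the effect of `require_alias` concrete, here is a minimal sketch of the race this commit targets. `ToyCluster` is a hypothetical in-memory stand-in, not the Elasticsearch client, and it models only two behaviors: a bulk write to a missing index auto-creates it, while a bulk write with `require_alias: true` fails with `index_not_found_exception` when the target is not an existing alias (the behavior asserted by the integration test added in this commit).

```typescript
// Toy model of index auto-creation. Hypothetical; NOT the Elasticsearch client.
type BulkResult =
  | { ok: true; writtenTo: string }
  | { ok: false; type: 'index_not_found_exception'; index: string };

class ToyCluster {
  private indices = new Set<string>();
  private aliases = new Map<string, string>(); // alias name -> index name

  createIndex(name: string, aliases: string[] = []): void {
    this.indices.add(name);
    aliases.forEach((alias) => this.aliases.set(alias, name));
  }

  // Deleting an index also removes every alias that pointed at it.
  deleteIndex(name: string): void {
    this.indices.delete(name);
    this.aliases.forEach((index, alias) => {
      if (index === name) this.aliases.delete(alias);
    });
  }

  hasIndex(name: string): boolean {
    return this.indices.has(name);
  }

  bulkIndex(target: string, requireAlias: boolean): BulkResult {
    const aliasedIndex = this.aliases.get(target);
    if (aliasedIndex !== undefined) return { ok: true, writtenTo: aliasedIndex };
    if (requireAlias) {
      // The target is not an alias, so nothing is created.
      return { ok: false, type: 'index_not_found_exception', index: target };
    }
    // Without require_alias, a missing index is silently auto-created.
    this.indices.add(target);
    return { ok: true, writtenTo: target };
  }
}

const tempIndex = '.kibana_8.8.0_reindex_temp';
const tempIndexAlias = tempIndex + '_alias';

const cluster = new ToyCluster();
cluster.createIndex(tempIndex, [tempIndexAlias]);
cluster.deleteIndex(tempIndex); // another node finishes and cleans up

// New behavior: writing through the alias fails and leaves nothing behind.
const viaAlias = cluster.bulkIndex(tempIndexAlias, true);
const recreatedByAliasWrite = cluster.hasIndex(tempIndex);

// Old behavior: writing to the index name recreates it by auto-creation.
const viaIndex = cluster.bulkIndex(tempIndex, false);
const recreatedByIndexWrite = cluster.hasIndex(tempIndex);
```

In this model the alias write is rejected and the deleted temp index stays deleted, whereas the plain index write brings it back.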
@@ -265,6 +265,7 @@ describe('createInitialState', () => {
},
},
"tempIndex": ".kibana_task_manager_8.1.0_reindex_temp",
+ "tempIndexAlias": ".kibana_task_manager_8.1.0_reindex_temp_alias",
"tempIndexMappings": Object {
"dynamic": false,
"properties": Object {
@@ -142,6 +142,7 @@ export const createInitialState = ({
versionAlias: `${indexPrefix}_${kibanaVersion}`,
versionIndex: `${indexPrefix}_${kibanaVersion}_001`,
tempIndex: getTempIndexName(indexPrefix, kibanaVersion),
+ tempIndexAlias: getTempIndexName(indexPrefix, kibanaVersion) + '_alias',
kibanaVersion,
preMigrationScript: Option.fromNullable(preMigrationScript),
targetIndexMappings,
@@ -92,7 +92,7 @@ describe('createBatches', () => {
expect(
createBatches({
documents,
- maxBatchSizeBytes: (DOCUMENT_SIZE_BYTES + 43) * 2, // add extra length for 'index' property
+ maxBatchSizeBytes: (DOCUMENT_SIZE_BYTES + 49) * 2, // add extra length for 'index' property
typeIndexMap: buildTempIndexMap(
{
'.kibana': ['dashboard'],
@@ -108,7 +108,7 @@ describe('createBatches', () => {
{
index: {
_id: '',
- _index: '.kibana_8.8.0_reindex_temp',
+ _index: '.kibana_8.8.0_reindex_temp_alias',
},
},
{ type: 'dashboard', title: 'my saved object title ¹' },
@@ -117,7 +117,7 @@ describe('createBatches', () => {
{
index: {
_id: '',
- _index: '.kibana_8.8.0_reindex_temp',
+ _index: '.kibana_8.8.0_reindex_temp_alias',
},
},
{ type: 'dashboard', title: 'my saved object title ²' },
@@ -128,7 +128,7 @@ describe('createBatches', () => {
{
index: {
_id: '',
- _index: '.kibana_cases_8.8.0_reindex_temp',
+ _index: '.kibana_cases_8.8.0_reindex_temp_alias',
},
},
{ type: 'cases', title: 'a case' },
@@ -137,7 +137,7 @@ describe('createBatches', () => {
{
index: {
_id: '',
- _index: '.kibana_cases_8.8.0_reindex_temp',
+ _index: '.kibana_cases_8.8.0_reindex_temp_alias',
},
},
{ type: 'cases-comments', title: 'a case comment #1' },
@@ -148,7 +148,7 @@ describe('createBatches', () => {
{
index: {
_id: '',
- _index: '.kibana_cases_8.8.0_reindex_temp',
+ _index: '.kibana_cases_8.8.0_reindex_temp_alias',
},
},
{ type: 'cases-user-actions', title: 'a case user action' },
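
The changed size constants in these tests (43 to 49 here, and 1715312 to 1715318 in the batch-size error message further down) both differ by 6, which matches the length of the `_alias` suffix now carried in every bulk operation's `_index`. The arithmetic can be checked with a short sketch; the assumption that the per-operation overhead grows by exactly the length of the suffix is inferred from the diff, not stated in it.

```typescript
// Index names from the createBatches test. They are ASCII, so string
// length equals UTF-8 byte length.
const oldTarget = '.kibana_8.8.0_reindex_temp';
const newTarget = oldTarget + '_alias';

// Extra bytes each bulk operation carries once `_index` names the alias.
const extraBytesPerOperation = newTarget.length - oldTarget.length;

// Per-document overhead used in the unit test, before and after.
const oldOverheadBytes = 43;
const newOverheadBytes = oldOverheadBytes + extraBytesPerOperation;

// Document size reported by the integration test, before and after.
const oldReportedDocBytes = 1715312;
const newReportedDocBytes = oldReportedDocBytes + extraBytesPerOperation;
```

Under that assumption, `newOverheadBytes` is 49 and `newReportedDocBytes` is 1715318, the values that appear in the updated tests.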
@@ -56,7 +56,7 @@ export function buildTempIndexMap(
): Record<string, string> {
return Object.entries(indexTypesMap || {}).reduce<Record<string, string>>(
(acc, [indexAlias, types]) => {
- const tempIndex = getTempIndexName(indexAlias, kibanaVersion!);
+ const tempIndex = getTempIndexName(indexAlias, kibanaVersion!) + '_alias';

types.forEach((type) => {
acc[type] = tempIndex;
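
As a rough illustration of what this mapping produces after the change, the sketch below rebuilds it in isolation. `getTempIndexNameSketch` and `buildTempIndexMapSketch` are hypothetical stand-ins: the naming pattern is inferred from the snapshot values in this commit (for example `.kibana_8.8.0_reindex_temp`), and the `.kibana_cases` input entry is assumed from the expected test output, since the test's full input is not visible in the diff.

```typescript
// Hypothetical stand-in for getTempIndexName, inferred from snapshot values.
const getTempIndexNameSketch = (indexPrefix: string, kibanaVersion: string): string =>
  `${indexPrefix}_${kibanaVersion}_reindex_temp`;

// Maps each saved object type to the temp index ALIAS it should be written to.
function buildTempIndexMapSketch(
  indexTypesMap: Record<string, string[]>,
  kibanaVersion: string
): Record<string, string> {
  const typeToAlias: Record<string, string> = {};
  Object.keys(indexTypesMap).forEach((indexAlias) => {
    const tempIndexAlias = getTempIndexNameSketch(indexAlias, kibanaVersion) + '_alias';
    indexTypesMap[indexAlias].forEach((type) => {
      typeToAlias[type] = tempIndexAlias;
    });
  });
  return typeToAlias;
}

const typeIndexMap = buildTempIndexMapSketch(
  {
    '.kibana': ['dashboard'],
    // Assumed input, reconstructed from the expected bulk operations.
    '.kibana_cases': ['cases', 'cases-comments', 'cases-user-actions'],
  },
  '8.8.0'
);
```

Because every type now resolves to an alias, each bulk operation's `_index` names an alias, which is what allows `require_alias: true` to be applied to the whole request.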
@@ -252,7 +252,9 @@ export const createBulkIndexOperationTuple = (
{
index: {
_id: doc._id,
- ...(typeIndexMap[doc._source.type] && { _index: typeIndexMap[doc._source.type] }),
+ ...(typeIndexMap[doc._source.type] && {
+ _index: typeIndexMap[doc._source.type],
+ }),
// use optimistic concurrency control to ensure that outdated
// documents are only overwritten once with the latest version
...(typeof doc._seq_no !== 'undefined' && { if_seq_no: doc._seq_no }),
@@ -102,6 +102,7 @@ describe('migrations v2 model', () => {
versionAlias: '.kibana_7.11.0',
versionIndex: '.kibana_7.11.0_001',
tempIndex: '.kibana_7.11.0_reindex_temp',
+ tempIndexAlias: '.kibana_7.11.0_reindex_temp_alias',
excludeOnUpgradeQuery: {
bool: {
must_not: [
@@ -139,6 +139,7 @@ export const nextActionMap = (
Actions.createIndex({
client,
indexName: state.tempIndex,
+ aliases: [state.tempIndexAlias],
mappings: state.tempIndexMappings,
}),
READY_TO_REINDEX_SYNC: () => Actions.synchronizeMigrators(readyToReindex),
@@ -163,7 +164,13 @@ export const nextActionMap = (
REINDEX_SOURCE_TO_TEMP_INDEX_BULK: (state: ReindexSourceToTempIndexBulk) =>
Actions.bulkOverwriteTransformedDocuments({
client,
- index: state.tempIndex,
+ /*
+ * Since other nodes can delete the temp index while we're busy writing
+ * to it, we use the alias to prevent the auto-creation of the index if
+ * it doesn't exist.
+ */
+ index: state.tempIndexAlias,
+ useAliasToPreventAutoCreate: true,
operations: state.bulkOperationBatches[state.currentBatch],
/**
* Since we don't run a search against the target index, we disable "refresh" to speed up
@@ -132,10 +132,16 @@ export interface BaseState extends ControlState {
*/
readonly versionIndex: string;
/**
- * An alias on the target index used as part of an "reindex block" that
+ * A temporary index used as part of a "reindex block" that
* prevents lost deletes e.g. `.kibana_7.11.0_reindex`.
*/
readonly tempIndex: string;
+ /**
+ * An alias to the tempIndex used to prevent ES from auto-creating the temp
+ * index if one node deletes it while another writes to it
+ * e.g. `.kibana_7.11.0_reindex_temp_alias`.
+ */
+ readonly tempIndexAlias: string;
/**
* When upgrading to a more recent kibana version, some saved object types
* might be conflicting or no longer used.
@@ -118,7 +118,7 @@ describe('migration v2', () => {
await root.preboot();
await root.setup();
await expect(root.start()).rejects.toMatchInlineSnapshot(
- `[Error: Unable to complete saved object migrations for the [.kibana] index: The document with _id "canvas-workpad-template:workpad-template-061d7868-2b4e-4dc8-8bf7-3772b52926e5" is 1715312 bytes which exceeds the configured maximum batch size of 1015275 bytes. To proceed, please increase the 'migrations.maxBatchSizeBytes' Kibana configuration option and ensure that the Elasticsearch 'http.max_content_length' configuration option is set to an equal or larger value.]`
+ `[Error: Unable to complete saved object migrations for the [.kibana] index: The document with _id "canvas-workpad-template:workpad-template-061d7868-2b4e-4dc8-8bf7-3772b52926e5" is 1715318 bytes which exceeds the configured maximum batch size of 1015275 bytes. To proceed, please increase the 'migrations.maxBatchSizeBytes' Kibana configuration option and ensure that the Elasticsearch 'http.max_content_length' configuration option is set to an equal or larger value.]`
);

await retryAsync(
@@ -131,7 +131,7 @@ describe('migration v2', () => {
expect(
records.find((rec) =>
rec.message.startsWith(
- `Unable to complete saved object migrations for the [.kibana] index: The document with _id "canvas-workpad-template:workpad-template-061d7868-2b4e-4dc8-8bf7-3772b52926e5" is 1715312 bytes which exceeds the configured maximum batch size of 1015275 bytes. To proceed, please increase the 'migrations.maxBatchSizeBytes' Kibana configuration option and ensure that the Elasticsearch 'http.max_content_length' configuration option is set to an equal or larger value.`
+ `Unable to complete saved object migrations for the [.kibana] index: The document with _id "canvas-workpad-template:workpad-template-061d7868-2b4e-4dc8-8bf7-3772b52926e5" is 1715318 bytes which exceeds the configured maximum batch size of 1015275 bytes. To proceed, please increase the 'migrations.maxBatchSizeBytes' Kibana configuration option and ensure that the Elasticsearch 'http.max_content_length' configuration option is set to an equal or larger value.`
)
)
).toBeDefined();
@@ -68,6 +68,7 @@ describe('migration actions', () => {
await createIndex({
client,
indexName: 'existing_index_with_docs',
+ aliases: ['existing_index_with_docs_alias'],
mappings: {
dynamic: true,
properties: {
@@ -151,7 +152,9 @@ describe('migration actions', () => {
expect(res.right).toEqual(
expect.objectContaining({
existing_index_with_docs: {
- aliases: {},
+ aliases: {
+ existing_index_with_docs_alias: {},
+ },
mappings: expect.anything(),
settings: expect.anything(),
},
@@ -168,7 +171,9 @@ describe('migration actions', () => {
expect(res.right).toEqual(
expect.objectContaining({
existing_index_with_docs: {
- aliases: {},
+ aliases: {
+ existing_index_with_docs_alias: {},
+ },
mappings: {
// FIXME https://github.com/elastic/elasticsearch-js/issues/1796
dynamic: 'true',
@@ -1947,6 +1952,30 @@ describe('migration actions', () => {
}
`);
});
+ it('resolves left index_not_found_exception if the index does not exist and useAliasToPreventAutoCreate=true', async () => {
+ const newDocs = [
+ { _source: { title: 'doc 5' } },
+ { _source: { title: 'doc 6' } },
+ { _source: { title: 'doc 7' } },
+ ] as unknown as SavedObjectsRawDoc[];
+ await expect(
+ bulkOverwriteTransformedDocuments({
+ client,
+ index: 'existing_index_with_docs_alias_that_does_not_exist',
+ useAliasToPreventAutoCreate: true,
+ operations: newDocs.map((doc) => createBulkIndexOperationTuple(doc)),
+ refresh: 'wait_for',
+ })()
+ ).resolves.toMatchInlineSnapshot(`
+ Object {
+ "_tag": "Left",
+ "left": Object {
+ "index": "existing_index_with_docs_alias_that_does_not_exist",
+ "type": "index_not_found_exception",
+ },
+ }
+ `);
+ });
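
The snapshot above shows the shape a caller receives when the alias is missing. As a minimal sketch of how calling code might branch on that shape, the example below uses a hand-written tagged union instead of the real `TaskEither` types. `BulkOutcome` and `describeOutcome` are hypothetical, the `Right` payload is a placeholder string, and the commit does not show how the migration state machine actually reacts to this `Left`.

```typescript
// Hypothetical, simplified mirror of the Either shape in the snapshot.
type BulkLeft = { type: 'index_not_found_exception'; index: string };
type BulkOutcome =
  | { _tag: 'Right'; right: string } // placeholder success payload
  | { _tag: 'Left'; left: BulkLeft };

// Turns an outcome into a human-readable summary for logging.
function describeOutcome(outcome: BulkOutcome): string {
  if (outcome._tag === 'Right') {
    return 'batch written';
  }
  return `alias [${outcome.left.index}] not found; no index was auto-created`;
}

const missingAlias: BulkOutcome = {
  _tag: 'Left',
  left: {
    type: 'index_not_found_exception',
    index: 'existing_index_with_docs_alias_that_does_not_exist',
  },
};

const summary = describeOutcome(missingAlias);
```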
it('resolves left target_index_had_write_block if there are write_block errors', async () => {
const newDocs = [
{ _source: { title: 'doc 5' } },