index rollover cronjob fails on openshift-logging operator #859

tucsolo · 2022-03-11T14:23:16Z

As the documentation suggests, we installed version 5.3.5-20 of the OpenShift Logging and Elasticsearch operators on our OKD cluster (4.9.0-0.okd-2022-02-12-140851). Unfortunately, during a routine oc get pods check we noticed a failing pod, elasticsearch-im-app-xxx, created by the elasticsearch-im-app cronjob. In the delete-then-rollover script, the delete step completes fine, but the rollover step then gets stuck on a seemingly random index:

OK Process:

Index management rollover process starting for app-.orphaned

Current write index for app-.orphaned-write: app-openshift-config-000047
/tmp/scripts/indexManagementClient.py:74: ElasticsearchDeprecationWarning: [types removal] The parameter include_type_name should be explicitly specified in rollover requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means requests must omit the type name in mapping definitions.
  response = es_client.indices.rollover(alias=alias, body=decoded)
Checking results from _rollover call
Next write index for app-.orphaned-write: app-openshift-config-000047
Checking if app-openshift-config-000047 exists
/tmp/scripts/indexManagementClient.py:86: ElasticsearchDeprecationWarning: [types removal] The parameter include_type_name should be explicitly specified in get indices requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means responses will omit the type name in mapping definitions.
  return es_client.indices.exists(index=index)
Checking if app-openshift-config-000047 is the write index for app-.orphaned-write
Done!

Faulty process:

Index management rollover process starting for app-openshift-config

Current write index for app-openshift-config-write: app-openshift-config-000046
/tmp/scripts/indexManagementClient.py:74: ElasticsearchDeprecationWarning: [types removal] The parameter include_type_name should be explicitly specified in rollover requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means requests must omit the type name in mapping definitions.
  response = es_client.indices.rollover(alias=alias, body=decoded)
Checking results from _rollover call
Calculating next write index based on current write index...
/tmp/scripts/indexManagement: line 94: openshift: unbound variable
Next write index for app-openshift-config-write: app-
Checking if app- exists
/tmp/scripts/indexManagementClient.py:86: ElasticsearchDeprecationWarning: [types removal] The parameter include_type_name should be explicitly specified in get indices requests to prepare for 7.0. In 7.0 include_type_name will default to 'false', which means responses will omit the type name in mapping definitions.
  return es_client.indices.exists(index=index)
{"acknowledged": false, "shards_acknowledged": false, "old_index": "app-openshift-config-000047", "new_index": "app-openshift-config-000048", "rolled_over": false, "dry_run": false, "conditions": {"[max_age: 8h]": false, "[max_size: 120gb]": false, "[max_docs: 122880000]": false}}

In particular, we noticed these lines:

/tmp/scripts/indexManagement: line 94: openshift: unbound variable
Next write index for app-openshift-config-write: app-
Checking if app- exists

I then noticed that the problem is in lines 346-347 of the index management scripts: we have indices named like app-openshift-something-012345, and the cut command does not split such a name into "app-openshift-something" and "012345" but returns "app" and "openshift" instead, so incrementing the counter to 012346 fails.
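To make the failure concrete, here is a minimal sketch of the splitting problem. It is not the operator's actual code: the variable names and the field numbers passed to cut are assumptions chosen to match the behaviour described above. The last line runs in a subshell with set -u and tries to increment the extracted "counter"; in bash arithmetic the value openshift is treated as a variable name, which is most likely where the "openshift: unbound variable" message comes from.

```shell
#!/usr/bin/env bash
# Hypothetical index name following the pattern described in this issue.
index="app-openshift-something-012345"

# Assumption: the script splits on '-' and picks fixed field positions,
# which only works when the name has exactly one '-' before the counter.
prefix="$(echo "$index" | cut -d'-' -f1)"
counter="$(echo "$index" | cut -d'-' -f2)"

echo "prefix=$prefix"    # prefix=app
echo "counter=$counter"  # counter=openshift

# Incrementing a non-numeric "counter" under 'set -u' is expected to fail;
# the subshell keeps that failure from ending this demo.
( set -u; echo "next=$((counter + 1))" ) || echo "increment failed"
```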

What should I do? Do I have to notify someone else? On the OKD GitHub they told me to open an issue at https://issues.redhat.com/projects/LOG/, but I am not able to open an issue there.


tucsolo · 2022-03-11T14:35:41Z

A possible solution is to replace the cut -d'-' -f2 command with grep -o '[0-9]\+$'

Or maybe replace the entire line with shell parameter expansion, which splits on the last hyphen only:

$ echo $test
app-test-000342-aa-we-000044123

$ echo ${test%-*}
app-test-000342-aa-we

$ echo ${test##*-}
000044123
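Putting the two expansions together, the next write index could be computed roughly as follows. This is only a sketch: next_write_index is a hypothetical helper, not a function from the operator's scripts, and the six-digit zero padding is an assumption based on the index names in the logs above. The 10# prefix forces base-10 arithmetic, since bash would otherwise read a zero-padded counter such as 000048 as an (invalid) octal number.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: derive the next index name from the current one.
next_write_index() {
  local current="$1"
  local base="${current%-*}"      # everything before the last '-'
  local counter="${current##*-}"  # everything after the last '-'
  # 10# forces base 10 so leading zeros are not interpreted as octal.
  printf '%s-%06d\n' "$base" "$((10#$counter + 1))"
}

next_write_index "app-openshift-config-000047"   # app-openshift-config-000048
next_write_index "app-test-000342-aa-we-000099"  # app-test-000342-aa-we-000100
```

Because only the last hyphen is used as the separator, any number of hyphens in the base name is handled the same way.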

tucsolo · 2022-05-19T09:33:47Z

Is there any update on this? We updated the operator to the latest version, but the issue is still there.

btaani · 2022-05-19T16:19:31Z

Hi @tucsolo, thanks for reporting this.
We will take a look at the issue.

xperimental · 2022-05-30T17:58:10Z

The issue has been migrated to JIRA: https://issues.redhat.com/browse/LOG-2644

xperimental · 2022-05-30T17:59:03Z

/close

openshift-ci · 2022-05-30T17:59:41Z

@xperimental: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tucsolo · 2022-05-30T18:02:42Z

Just so you know, I tried to open an issue on JIRA, as someone else had suggested, but there wasn't any way to do it.

openshift-ci bot closed this as completed May 30, 2022

btaani mentioned this issue Jun 20, 2022

fix IM rollover name generation #900

Merged
