CI Failure (tsh ssh command failure) in RollingRestartTest.test_rolling_restart #18065

Closed
vbotbuildovich opened this issue Apr 24, 2024 · 1 comment
Labels
area/htt, auto-triaged (used to know which issues have been opened from a CI job), ci-failure, ci-rca/infra (CI Root Cause Analysis - Infrastructure Issue), team/devprod (display on zenhub workspace for devprod team)

Comments

vbotbuildovich (Collaborator) commented Apr 24, 2024

https://buildkite.com/redpanda/vtools/builds/13144

Module: rptest.redpanda_cloud_tests.rolling_restart_test
Class: RollingRestartTest
Method: test_rolling_restart
test_id:    RollingRestartTest.test_rolling_restart
status:     FAIL
run time:   1033.944 seconds

CalledProcessError(1, ['tsh', 'ssh', '--proxy=proxy.tp.redpanda.com:443', '--auth=okta', '--identity=/tmp/machine-id/identity', 'redpanda@cok8qgc8m3o00sumpqi0-agent', 'kubectl', 'get', 'pods', '-n', 'redpanda', '-o', 'json'])
Traceback (most recent call last):
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 103, in wrapped
    r = f(self, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/redpanda_cloud_tests/rolling_restart_test.py", line 35, in test_rolling_restart
    self.redpanda.rolling_restart_pods()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1750, in rolling_restart_pods
    self.restart_pod(pod_name, pod_timeout)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1721, in restart_pod
    self.kubectl.cmd(delete_cmd)
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 179, in cmd
    return self._cmd(cmd)
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 160, in _cmd
    return subprocess.check_output(remote_cmd, stderr=subprocess.PIPE)
  File "/usr/lib/python3.10/subprocess.py", line 421, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['tsh', 'ssh', '--proxy=proxy.tp.redpanda.com:443', '--auth=okta', '--identity=/tmp/machine-id/identity', 'redpanda@cok8qgc8m3o00sumpqi0-agent', 'kubectl', 'delete', 'pod', 'rp-cok8qgc8m3o00sumpqi0-2', '-n=redpanda']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/opt/.ducktape-venv/lib/python3.10/site-packages/ducktape/tests/runner_client.py", line 276, in run_test
    return self.test_context.function(self.test)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 126, in wrapped
    redpanda.raise_on_crash(log_allow_list=log_allow_list)
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 2057, in raise_on_crash
    active, _, _ = self.get_redpanda_pods_presorted()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1640, in get_redpanda_pods_presorted
    all_pods = self.get_redpanda_pods()
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1669, in get_redpanda_pods
    self.kubectl.cmd('get pods -n redpanda -o json').decode())
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 179, in cmd
    return self._cmd(cmd)
  File "/home/ubuntu/redpanda/tests/rptest/clients/kubectl.py", line 160, in _cmd
    return subprocess.check_output(remote_cmd, stderr=subprocess.PIPE)
  File "/usr/lib/python3.10/subprocess.py", line 421, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['tsh', 'ssh', '--proxy=proxy.tp.redpanda.com:443', '--auth=okta', '--identity=/tmp/machine-id/identity', 'redpanda@cok8qgc8m3o00sumpqi0-agent', 'kubectl', 'get', 'pods', '-n', 'redpanda', '-o', 'json']' returned non-zero exit status 1.

JIRA Link: CORE-2656
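
Both tracebacks end in `subprocess.check_output(remote_cmd, stderr=subprocess.PIPE)`, so the stderr of the failing `tsh ssh` call is captured but never appears in the report. The following is a minimal sketch of how that stderr could be surfaced, with a bounded retry. `run_with_stderr` and `RemoteCommandError` are hypothetical names, not part of rptest, and the retry count and delay are assumptions; nothing in this report establishes that the failure is transient.

```python
import subprocess
import sys
import time


class RemoteCommandError(RuntimeError):
    """Hypothetical error type: raised when the command keeps failing."""


def run_with_stderr(cmd, attempts=3, delay_s=0.0):
    """Run an argv list, retry on non-zero exit, and report stderr on failure.

    `cmd` is an argv list such as the `tsh ssh ... kubectl ...` list shown in
    the traceback. `attempts` and `delay_s` are illustrative assumptions.
    """
    last_err = None
    for attempt in range(1, attempts + 1):
        try:
            # Same call shape as in the traceback: stdout is returned,
            # stderr is captured and attached to CalledProcessError.
            return subprocess.check_output(cmd, stderr=subprocess.PIPE)
        except subprocess.CalledProcessError as e:
            last_err = e
            if attempt < attempts:
                time.sleep(delay_s)
    # Decode the captured stderr so it shows up in the exception message.
    stderr_text = (last_err.stderr or b"").decode(errors="replace").strip()
    raise RemoteCommandError(
        f"{cmd!r} failed {attempts} time(s), last exit status "
        f"{last_err.returncode}; stderr: {stderr_text}") from last_err
```

With a wrapper along these lines, a failed call would report the remote error text next to the exit status instead of only `returned non-zero exit status 1`.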

vbotbuildovich added the auto-triaged and ci-failure labels Apr 24, 2024
michael-redpanda added the ci-rca/infra and team/devprod labels Apr 25, 2024
michael-redpanda changed the title from "CI Failure (key symptom) in RollingRestartTest.test_rolling_restart" to "CI Failure (tsh ssh command failure) in RollingRestartTest.test_rolling_restart" Apr 25, 2024
rpdevmp (Contributor) commented May 16, 2024

Duplicate of #18513

rpdevmp marked this as a duplicate of #18513 May 16, 2024
rpdevmp closed this as completed May 16, 2024