CI Failure (RpkException - SASL authentication failed) in SchemaRegistryAutoAuthTest.test_schema_id_validation #11203

Closed
andijcr opened this issue Jun 5, 2023 · 8 comments · Fixed by #11390
Labels: area/kafka, ci-failure, kind/bug, sev/low

Comments

andijcr commented Jun 5, 2023

https://buildkite.com/redpanda/vtools/builds/7916#0188856f-e4e9-4ee3-944e-95a01dd43a36

Module: rptest.tests.schema_registry_test
Class:  SchemaRegistryAutoAuthTest
Method: test_schema_id_validation
Arguments:
{
  "client_type": 1,
  "payload_class": "com.redpanda.A.B.C.D.NestedPayload",
  "protocol": 2,
  "subject_name_strategy": "io.confluent.kafka.serializers.subject.TopicRecordNameStrategy",
  "validate_schema_id": true
}
====================================================================================================
test_id:    rptest.tests.schema_registry_test.SchemaRegistryAutoAuthTest.test_schema_id_validation.protocol=PROTOBUF.client_type=Python.validate_schema_id=True.subject_name_strategy=SubjectNameStrategyCompat.TOPIC_RECORD_NAME.payload_class=com.redpanda.A.B.C.D.NestedPayload
status:     FAIL
run time:   7.094 seconds


    RpkException('command /opt/redpanda/bin/rpk topic --brokers ip-172-31-8-110:9092,ip-172-31-4-136:9092,ip-172-31-11-200:9092 --user admin --password admin --sasl-mechanism SCRAM-SHA-256 create serde-topic-PROTOBUF-Python --partitions 1 --replicas 1 --topic-config confluent.value.schema.validation:true --topic-config confluent.value.subject.name.strategy:io.confluent.kafka.serializers.subject.TopicNameStrategy returned 1, output: ', 'unable to create topics [serde-topic-PROTOBUF-Python]: SASL authentication failed: security: Invalid credentials: SASL_AUTHENTICATION_FAILED: SASL Authentication failed.\n', 1)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 135, in run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 227, in run_test
    return self.test_context.function(self.test)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/mark/_mark.py", line 481, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 79, in wrapped
    r = f(self, *args, **kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/tests/schema_registry_test.py", line 1295, in test_schema_id_validation
    self._create_topic(
  File "/home/ubuntu/redpanda/tests/rptest/tests/schema_registry_test.py", line 228, in _create_topic
    rpk_tools.create_topic(topic=topic,
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 232, in create_topic
    wait_until_result(create_topic,
  File "/home/ubuntu/redpanda/tests/rptest/util.py", line 88, in wait_until_result
    wait_until(wrapped_condition, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 53, in wait_until
    raise e
  File "/usr/local/lib/python3.10/dist-packages/ducktape/utils/util.py", line 44, in wait_until
    if condition():
  File "/home/ubuntu/redpanda/tests/rptest/util.py", line 75, in wrapped_condition
    cond = condition()
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 229, in create_topic
    raise e
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 223, in create_topic
    output = self._run_topic(cmd)
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 739, in _run_topic
    return self._execute(cmd, stdin=stdin, timeout=timeout)
  File "/home/ubuntu/redpanda/tests/rptest/clients/rpk.py", line 857, in _execute
    raise RpkException(
rptest.clients.rpk.RpkException: RpkException<command /opt/redpanda/bin/rpk topic --brokers ip-172-31-8-110:9092,ip-172-31-4-136:9092,ip-172-31-11-200:9092 --user admin --password admin --sasl-mechanism SCRAM-SHA-256 create serde-topic-PROTOBUF-Python --partitions 1 --replicas 1 --topic-config confluent.value.schema.validation:true --topic-config confluent.value.subject.name.strategy:io.confluent.kafka.serializers.subject.TopicNameStrategy returned 1, output:  error: unable to create topics [serde-topic-PROTOBUF-Python]: SASL authentication failed: security: Invalid credentials: SASL_AUTHENTICATION_FAILED: SASL Authentication failed.
 returncode: 1>

andijcr added the kind/bug and ci-failure labels Jun 5, 2023

andijcr commented Jun 5, 2023


andijcr commented Jun 9, 2023


andijcr commented Jun 9, 2023

NyaliaLui self-assigned this Jun 9, 2023
@NyaliaLui

May be related to #11141


BenPope commented Jun 9, 2023

Similar errors exist in SchemaRegistryAutoAuthTest.test_serde_client


NyaliaLui commented Jun 12, 2023

I synced with @BenPope and we agreed that this failure is not caused by a problem with AutoAuth or ephemeral credentials. Instead, it is caused by a timing issue between RedpandaService auto-creating the admin user and RPK issuing a CreateTopics request.

To check whether a different client avoids the problem, I created the topics with KafkaCliTools instead. All test runs passed.

I then forced RpkTool to retry on SASL_AUTHENTICATION_FAILED. Again, all test runs passed.
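
For illustration, a minimal sketch of a retry of this kind. It is not the change that was made to RpkTool: `CommandError` and `retry_on_sasl_failure` are hypothetical names, and `CommandError` only stands in for an exception that carries the command's stderr, on the assumption that the error code can be matched as a substring of that output.

```python
import time


class CommandError(Exception):
    """Hypothetical stand-in for an exception that carries a CLI's stderr."""

    def __init__(self, msg, stderr=""):
        super().__init__(msg)
        self.stderr = stderr


def retry_on_sasl_failure(fn, timeout_sec=10.0, backoff_sec=0.5,
                          sleep=time.sleep, clock=time.monotonic):
    """Call fn() until it succeeds, retrying only on SASL auth failures.

    Any other error, or an auth failure after the deadline has passed,
    is re-raised to the caller unchanged.
    """
    deadline = clock() + timeout_sec
    while True:
        try:
            return fn()
        except CommandError as e:
            # Only the transient "credentials not there yet" case is retried.
            if "SASL_AUTHENTICATION_FAILED" not in e.stderr:
                raise
            if clock() >= deadline:
                raise
            sleep(backoff_sec)
```

Matching on the specific error code keeps genuine failures (bad topic config, wrong password that never becomes valid) from being hidden for longer than the timeout.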

This suggests that RpkTool may be issuing a CreateTopics request before the admin user has finished replicating to the other nodes.

Nevertheless, this looks like a test issue. I am therefore assigning the sev/low label and looking into ways to make our tests wait to issue Kafka requests until the admin user has replicated across the cluster. A simple solution would be to merge the retry on CreateTopics; however, I would like to see whether there is a more robust solution.
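
One possible shape for such a wait, as a rough sketch only. `wait_for_user_on_all_nodes` and the caller-supplied `list_users(node)` callable are hypothetical, and the sketch assumes there is some per-node way to list the users that node currently knows about; whether that visibility is sufficient for SASL authentication to succeed on that node is also an assumption.

```python
import time


def wait_for_user_on_all_nodes(nodes, list_users, username,
                               timeout_sec=30.0, backoff_sec=0.5,
                               sleep=time.sleep, clock=time.monotonic):
    """Block until `username` is reported by every node, or time out.

    list_users(node) is a hypothetical callable returning the usernames
    visible on that node.
    """
    deadline = clock() + timeout_sec
    pending = list(nodes)
    while True:
        # Re-check only the nodes that have not yet reported the user.
        pending = [n for n in pending if username not in list_users(n)]
        if not pending:
            return
        if clock() >= deadline:
            raise TimeoutError(
                f"user {username!r} not visible on: {pending}")
        sleep(backoff_sec)
```

A test would call this once after cluster startup and before its first Kafka request, which keeps the wait in one place instead of adding retries to each client call.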

NyaliaLui added the sev/low label Jun 12, 2023
NyaliaLui added a commit to NyaliaLui/redpanda that referenced this issue Jun 13, 2023
In tests that enable SASL auth, it is possible for RPK to issue a
CreateTopics request before the user has replicated across the cluster.
The solution is to retry the request.

Fixes redpanda-data#11203