Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

controller: fix Redpanda 22.3 failing to start when license was loaded with 22.2 Redpanda #7312

Merged
merged 2 commits into from
Nov 16, 2022

Conversation

graphcareful
Copy link
Contributor

@graphcareful graphcareful commented Nov 16, 2022

Backport Required

  • not a bug fix
  • issue does not exist in previous branches
  • papercut/not impactful enough to backport
  • v22.3.x
  • v22.2.x
  • v22.1.x

release notes

  • Fix for bug that would prevent the start of redpanda if a license was loaded in a previous version

# Upload a license
self.admin.put_license(license)

# Update all nodes to newest version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need a followup to fix the version combinations: this is assuming that upgrades from 22.2->HEAD is what is wanted (i.e. on dev it will be skipping 22.3.1)).

jcsp
jcsp previously approved these changes Nov 16, 2022
@jcsp jcsp changed the title Fix for bug where license could not be interpreted by newer versions of redpanda controller: fix Redpanda 22.3 failing to start when license was loaded with 22.2 Redpanda Nov 16, 2022
@jcsp jcsp mentioned this pull request Nov 16, 2022
6 tasks
- Ensures licenses can be read/written by cluster between versions
@graphcareful
Copy link
Contributor Author

graphcareful commented Nov 16, 2022

One unrelated test failure detected #6903:

test_id:    rptest.tests.schema_registry_test.SchemaRegistryAutoAuthTest.test_delete_subject
  | status:     FAIL
  | run time:   10.935 seconds
  |  
  |  
  | <NodeCrash docker-rp-8: ==25284==ERROR: AddressSanitizer: SEGV on unknown address (pc 0x7f7282e57976 bp 0x7fff54148ae0 sp 0x7fff54148840 T0)
  | >
  | Traceback (most recent call last):
  | File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 670, in urlopen
  | httplib_response = self._make_request(
  | File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 426, in _make_request
  | six.raise_from(e, None)
  | File "<string>", line 3, in raise_from
  | File "/usr/local/lib/python3.10/dist-packages/urllib3/connectionpool.py", line 421, in _make_request
  | httplib_response = conn.getresponse()
  | File "/usr/lib/python3.10/http/client.py", line 1374, in getresponse
  | response.begin()
  | File "/usr/lib/python3.10/http/client.py", line 318, in begin
  | version, status, reason = self._read_status()
  | File "/usr/lib/python3.10/http/client.py", line 287, in _read_status
  | raise RemoteDisconnected("Remote end closed connection without"
  | http.client.RemoteDisconnected: Remote end closed connection without response

@jcsp jcsp merged commit 9db9127 into redpanda-data:dev Nov 16, 2022
@BenPope
Copy link
Member

BenPope commented Nov 16, 2022

/backport v22.3.x

@jcsp
Copy link
Contributor

jcsp commented Nov 16, 2022

Was manually backported in #7318

@vbotbuildovich
Copy link
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x 84fa18eb59e5f779b8a131810d5408932ddb2202 1efea4c56872a6c0d7fe32e27d197613a8ea8c79

Workflow run logs.

@@ -108,3 +108,41 @@ def test_basic_upgrade(self):

# Install license
assert self.admin.put_license(license).status_code == 200


class UpgradeMigratingLicenseVersion(RedpandaTest):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have a follow here on a Matrix annotation that does a to-from upgrades of the last 12 months upgrades to latest so ~4 or 5 upgrades.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emaxerrno : AFAIK, we have upgrades from oldest supported release v21.11.x to intervening jumps to latest major's latest minor for the overall upgrade test.

I think what we might need here is to ensure that each such jump is upgrading from all-features-configured cluster to all-feature-in-next-version-cluster and so on.

And as @dotnwat pointed out having the binary compat testing would have caught this as well.

@@ -62,7 +62,8 @@ inline std::ostream& operator<<(std::ostream& os, license_type lt) {
return os;
}

struct license : serde::envelope<license, serde::version<1>> {
struct license
: serde::envelope<license, serde::version<1>, serde::compat_version<0>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a cool fix.

@jcsp , @dotnwat and @mmaslankaprv - should we make compat_version<> explicit. No implicit versioning, in fact code should fail without compat-version?

maybe we check in with @felixguendling - seems like the issue was it compiled without serde::compat*

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the issue was that the compat version had the wrong default. i think it is now explicit, iiuc.

@jcsp
Copy link
Contributor

jcsp commented Nov 16, 2022

@jcsp , @dotnwat and @mmaslankaprv - should we make compat_version<> explicit. No implicit versioning, in fact code should fail without compat-version?

Yes, that was the immediate mitigation that I mentioned on slack, it is happening in #7314

can we have a follow here on a Matrix annotation that does a to-from upgrades of the last 12 months upgrades to latest so ~4 or 5 upgrades.

The need to avoid skipping intermediate versions is tracked in #7310

What we need in addition for that is a "fully loaded" upgrade test that tries to configure/enable all the features we can think of, and then traverses that series of supported upgrade steps. We have a bunch of upgrade tests but they tend to focus on a feature at as time, and some of them skip straight from some older version straight to HEAD.

@emaxerrno
Copy link
Contributor

gosh you are all so smart and competent. thanks for already filing tix, etc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants