Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't consider deleted schema when checking compatibility #4669

Merged
merged 4 commits into from
Jul 8, 2019

Conversation

ivankelly
Copy link
Contributor

@ivankelly ivankelly commented Jul 3, 2019

feca5bb changed topic delete logic to delete the schema when the
topic is deleted (though this only seems to be enabled for idle topic
GC). This exposed a bug in compatibility checking whereby if the a
subscription tries to attach to the topic, even if using the same
schema as had been used previously, a compatibility exception will be
thrown.

This is because the topic still appears to have a schema, even though
there is no actual schema data, just a tombstone. I've changed the logic
to return no schema if the schema read back is a tombstone.

The issue doesn't affect producers because the check was already
correct there.

I've also updated the check for transitive compatibility to remove the
prefix of schemas before the deleted schema. Previously this was
throwing an NPE on the broker as it couldn't decode the deleted
schema.

This issue was discovered by failures in the healthcheck. The check
period (5 minutes) was longer than the GC period (60 seconds). I would
expect it to hit quite often in other scenarios also.

feca5bb changed topic delete logic to delete the schema when the
topic is deleted (though this only seems to be enabled for idle topic
GC). This exposed a bug in compatibility checking whereby if the a
subscription tries to attach to the topic, even if using the same
schema as had been used previously, a compatibility exception will be
thrown.

The root cause was a logic bomb in the compatibility check.
!(A || B) && C where it should have been !(A || B) || C.

The issue doesn't affect producers because the check was already
correct there.

I've also updated the check for transitive compatibility to remove the
prefix of schemas before the deleted schema. Previously this was
throwing an NPE on the broker as it couldn't decode the deleted
schema.

This issue was discovered by failures in the healthcheck. The check
period (5 minutes) was longer than the GC period (60 seconds). I would
expect it to hit quite often in other scenarios also.
@ivankelly ivankelly added type/bug The PR fixed a bug or issue reported a bug component/schemaregistry area/broker labels Jul 3, 2019
@ivankelly ivankelly added this to the 2.5.0 milestone Jul 3, 2019
@ivankelly ivankelly self-assigned this Jul 3, 2019
@ivankelly
Copy link
Contributor Author

CI won't pass until #4667 is merged.

@merlimat
Copy link
Contributor

merlimat commented Jul 3, 2019

retest this please

Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@merlimat merlimat modified the milestones: 2.5.0, 2.4.1 Jul 3, 2019
@merlimat
Copy link
Contributor

merlimat commented Jul 3, 2019

run cpp tests

@ivankelly
Copy link
Contributor Author

rerun java8 tests

Copy link
Member

@sijie sijie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. the problem was introduced by my change at 8997375 . I fixed the problem at producer side (putSchemaIfAbsent), but I made the mistake at the consumer side (checkCompatibilityWithLatest).

Nice work!

@codelipenghui
Copy link
Contributor

rerun java8 tests

@jiazhai
Copy link
Member

jiazhai commented Jul 4, 2019

rerun java8 tests

@sijie
Copy link
Member

sijie commented Jul 7, 2019

run java8 tests

The previous fix didn't take into account the case where if a producer
is actively publishing to a topic, a consumer should not be allowed to
attach with a schema.
@ivankelly
Copy link
Contributor Author

@sijie @jiazhai @codelipenghui @merlimat you you take another look? I had to change the approach to strip the deleted schemas in getSchema, because a consumer should be prevented from attaching with a schema if there is a producer currently producing without a schema.

@ivankelly
Copy link
Contributor Author

rerun java8 tests

@ivankelly
Copy link
Contributor Author

rerun integration tests

Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@ivankelly ivankelly merged commit d77980d into apache:master Jul 8, 2019
easyfan pushed a commit to easyfan/pulsar that referenced this pull request Jul 26, 2019
feca5bb changed topic delete logic to delete the schema when the
topic is deleted (though this only seems to be enabled for idle topic
GC). This exposed a bug in compatibility checking whereby if the a
subscription tries to attach to the topic, even if using the same
schema as had been used previously, a compatibility exception will be
thrown.

This is because the topic still appears to have a schema, even though
there is no actual schema data, just a tombstone. I've changed the logic
to return no schema if the schema read back is a tombstone.

The issue doesn't affect producers because the check was already
correct there.

I've also updated the check for transitive compatibility to remove the
prefix of schemas before the deleted schema. Previously this was
throwing an NPE on the broker as it couldn't decode the deleted
schema.

This issue was discovered by failures in the healthcheck. The check
period (5 minutes) was longer than the GC period (60 seconds). I would
expect it to hit quite often in other scenarios also.
jiazhai pushed a commit that referenced this pull request Aug 28, 2019
feca5bb changed topic delete logic to delete the schema when the
topic is deleted (though this only seems to be enabled for idle topic
GC). This exposed a bug in compatibility checking whereby if the a
subscription tries to attach to the topic, even if using the same
schema as had been used previously, a compatibility exception will be
thrown.

This is because the topic still appears to have a schema, even though
there is no actual schema data, just a tombstone. I've changed the logic
to return no schema if the schema read back is a tombstone.

The issue doesn't affect producers because the check was already
correct there.

I've also updated the check for transitive compatibility to remove the
prefix of schemas before the deleted schema. Previously this was
throwing an NPE on the broker as it couldn't decode the deleted
schema.

This issue was discovered by failures in the healthcheck. The check
period (5 minutes) was longer than the GC period (60 seconds). I would
expect it to hit quite often in other scenarios also.
(cherry picked from commit d77980d)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants