Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix race condition caused by lazy population of r.descriptors #40

Merged
merged 3 commits into from
Sep 21, 2022

Conversation

jhump
Copy link
Member

@jhump jhump commented Sep 20, 2022

Because of the possibly cyclic nature of some references, we can't easily construct all descriptors in a simple top-down pass over the descriptor proto hierarchy.

For this reason, the current linker result implementation is lazy: it only constructs resolved descriptors as they are accessed, and then they are cached in a map r.descriptors.

But that means that it's possible for some descriptors to remain unpopulated/uncached at the time the result is handed off to a consumer, for possible use from another goroutine. If that other goroutine then access an unpopulated descriptor, causing it to be lazily populated, this results in unsafe concurrent access to r.descriptors and data races. Similarly, the lazy resolution can cause the r.usedImports field to be written to from concurrent goroutines (which is a fatal error).

This overhauls the implementation so it is no longer lazy/"just in time". It now does two passes (to handle cyclic references) -- the first just computes the entire hierarchy, and the third one does mutations as needed to resolve the non-tree edges of the graph.

@@ -1,9 +1,15 @@
module github.com/bufbuild/protocompile

go 1.16
go 1.18
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a generic function, to handle reserved ranges for both enums and messages. So that required 1.18.

// its `source_code_info` field populated. This is typically a post-process
// step separate from linking, because computing source code info requires
// interpreting options (which is done after linking).
PopulateSourceCodeInfo()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we were populating source code info lazily on first access. But since that poses thread-safety issues, we now explicitly populate it after computing the underlying source code info protos.

@@ -477,7 +380,7 @@ func (r *result) resolveFieldTypes(handler *reporter.Handler, s *Symbols, fqn pr
// need to validate that it's not an illegal reference. To be valid, the field
// must be repeated and the entry type must be nested in the same enclosing
// message as the field.
isValid = dsc.FullName() == fqn && fld.GetLabel() == descriptorpb.FieldDescriptorProto_LABEL_REPEATED
isValid = dsc.FullName() == f.FullName() && fld.GetLabel() == descriptorpb.FieldDescriptorProto_LABEL_REPEATED
Copy link
Member Author

@jhump jhump Sep 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is not right. I've just updated fqn here for the altered context. But this condition will never be true. It should be trying to verify that the field and the message belong to the same parent message. I will address in a subsequent PR as well as add a test that would have caught this.

@jhump
Copy link
Member Author

jhump commented Sep 20, 2022

@pkwarren, this is pretty dang hairy. So apologies for the large and complicated diff.

I was apparently mistaken when I thought this had a net reduction of ~60 LOC. So it's not as simplifying as I had hoped.

But it does overhaul the construction to be entirely up-front -- nothing lazy/just-in-time which might trigger under-the-hood book-keeping mutations after the things has already been published to other goroutines.

imps[int(publicIndex)].IsPublic = true
}
for _, weakIndex := range fd.WeakDependency {
imps[int(weakIndex)].IsPublic = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be IsWeak = true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YES! Good catch. I'm glad you have eagle eyes. Some of the characteristics of these descriptor impls are clearly not fully tested.

for _, r := range e.s {
if r.GetStart() <= int32(n) && r.GetEnd() >= int32(n) {
for _, r := range e.ranges {
if r[0] <= n && r[0] >= n {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if r[0] <= n && r[0] >= n {
if r[0] <= n && r[1] >= n {

I think this logic isn't correct - r[1] should be the equivalent of r.GetEnd() from before, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right again. Thanks for spotting that!

}

func (f *fldDescriptor) Enum() protoreflect.EnumDescriptor {
if f.proto.GetType() != descriptorpb.FieldDescriptorProto_TYPE_ENUM {
return nil
}
return f.file.ResolveEnumType(protoreflect.FullName(f.proto.GetTypeName()))
if f.enumType == nil {
f.enumType = f.file.ResolveEnumType(protoreflect.FullName(f.proto.GetTypeName()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we don't initialize fields like this up front as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh! We do initialize them up front. I just forgot to update these methods (here, below in Message(), and further below for method's input and output types). Fixed.

@@ -1759,11 +1830,17 @@ func (m *mtdDescriptor) Options() protoreflect.ProtoMessage {
}

func (m *mtdDescriptor) Input() protoreflect.MessageDescriptor {
return m.file.ResolveMessageType(protoreflect.FullName(m.proto.GetInputType()))
if m.inputType == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here - can we initialize these up front in createMethodDescriptor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, my bad.

This change was from the first commit, before I did a big refactor. And I forgot to come back and update some of these spots. Got 'em now.

if d.Options != nil {
if err := r.resolveOptions(handler, "one-of", fqn, d.Options.UninterpretedOption, scopes); err != nil {
case *fldDescriptor:
elemType := "field"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Inconsistent between case statements on assigning this separately vs. just passing it as an argument below. I think if it is only used in one place we might just pass it in the function call instead of creating a separate variable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. This was a hold-over from when extension and field were in the same case.

@jhump jhump merged commit 90f3889 into main Sep 21, 2022
@jhump jhump deleted the jh/fix-race branch September 21, 2022 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants