Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Message Validators #55

Merged
merged 27 commits into from
Jan 24, 2018
Merged

Message Validators #55

merged 27 commits into from
Jan 24, 2018

Conversation

vyzo
Copy link
Collaborator

@vyzo vyzo commented Jan 13, 2018

Implements asynchronous message validators.
Continuation of #45 which was merged and rebased in a feature branch so that I can continue work on it.
Closes #46.
Closes #56.

cc @whyrusleeping @keks

keks and others added 11 commits January 13, 2018 12:12
used for range variable inside goroutine, now passed as argument
- make validators time out after 100ms
  - add context param to validator functions
  - add type Validator func(context.Context, *Message) bool
- drop message if more than 10 messages are already being validated
@ghost ghost assigned vyzo Jan 13, 2018
@ghost ghost added the in progress label Jan 13, 2018
@vyzo
Copy link
Collaborator Author

vyzo commented Jan 13, 2018

Removed the global validation throttle and cleaned up the subscription validation logic.
Let's reopen review?

floodsub.go Outdated
return
}

go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrm... spawning a goroutine for each message being sent is a bit wonky. Maybe:

select {
case p.sendMsg <- .....:
default:
  go func() {
    p.sendMsg <- ....
  }()
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, we only have to spawn a new goroutine if it's a publish, in which case the push happens within the event loop.
I'll update.

Copy link
Collaborator Author

@vyzo vyzo Jan 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are always in the event loop actually. I added a buffer to sendMsg so that we almost always avoid the goroutine.

floodsub.go Outdated
@@ -54,6 +58,9 @@ type PubSub struct {
// topics tracks which topics each of our peers are subscribed to
topics map[string]map[peer.ID]struct{}

// sendMsg handles messages that have been validated
sendMsg chan sendReq
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking that this should be a chan *sendReq for consistency with the other channels.

// NewFloodSub returns a new FloodSub management object
func NewFloodSub(ctx context.Context, h host.Host) *PubSub {
func NewFloodSub(ctx context.Context, h host.Host, opts ...Option) (*PubSub, error) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could avoid the interface change if we don't use any options -- do we want this for future proofing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep it, it's a nice interface; also we may want to pass an option for global validation throttle.

subscription.go Outdated
vctx, cancel := context.WithTimeout(ctx, sub.validateTimeout)
defer cancel()

go func() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

annoying that we have to use an extra goroutine for this, but probably okay.

Copy link
Collaborator Author

@vyzo vyzo Jan 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's how it go es :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not just trust that validate will respect its context?

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

subscription.go Outdated
vctx, cancel := context.WithTimeout(ctx, sub.validateTimeout)
defer cancel()

go func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not just trust that validate will respect its context?

floodsub.go Outdated
select {
case p.sendMsg <- sreq:
default:
go func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of this pattern (too easy to open a bunch of go routines). Could we just call maybePublishMessage directly and rename the sendMsg channel name to validatedMsgs or something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a buffer to sendMsg so that we should almost never overflow.
But you are right, we can just call maybePublishMessage!
I'll keep the sendMsg channel named as it is, because it indicates a message that is ready to send.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skipped the dance, and called maybePublishMessage directly .

floodsub.go Outdated
<-p.validateThrottle
}()
default:
log.Warningf("message validation throttled; dropping message from %s", src)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll probably want per-peer or per-topic throttles in the future but, for now, this is probably fine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already have those! That was part of the improvements in this PR (although most of the stuff got rewritten).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see... I'm worried that it's still possible to starve a topic by maxing out the global validator first. Ideally, we'd take the per-topic ticket first, then the global one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be hard to do, because a throttled subscription would release the semaphore quickly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I improved the comments around the throttle to more clearly describe the intention and existence of per-topic throttles.

floodsub.go Outdated
continue
}

rch := make(chan bool, 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just use a single channel and count results from it. Allocating channels isn't that expensive but it's certainly not free.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough; it was really convenient to write this way though :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright, implemented!

@Stebalien Stebalien mentioned this pull request Jan 13, 2018
@vyzo
Copy link
Collaborator Author

vyzo commented Jan 13, 2018

so f1be0f1, which removed the external goroutine enforcing the timeout bombs two of keks' tests -- they explicitly test the external goroutine timing out.

@vyzo
Copy link
Collaborator Author

vyzo commented Jan 14, 2018

Removed the two offending tests, as they test faulty behaviour.

More to the point, per discussion with @Stebalien on irc: if we timeout externally, the faulty validator is still running, while we unthrottle and leak a goroutine. It is preferable to have the faulty validator to cause the topic to throttle, at which point it can be identified as a bug and be debuged, rather than having a silent goroutine leak.

@vyzo
Copy link
Collaborator Author

vyzo commented Jan 17, 2018

@Stebalien ping -- let's move this forward?

@Stebalien
Copy link
Member

I don't see how this PR is a step in the right direction.

Validators should either be a property of the topic (registered at initialization, preferably) or validators should be a property of the subscription in which case a message's validity for subscription A shouldn't affect the message's validity for subscription B. Given that making validators a property of the subscription makes it difficult to determine if we should forward the message, validators should be a property of the topic.

@vyzo
Copy link
Collaborator Author

vyzo commented Jan 18, 2018

Ok, I will rewrite to implement per topic validators then.

@vyzo
Copy link
Collaborator Author

vyzo commented Jan 18, 2018

I have removed the per-subscription validators and implemented per-topic validations.
Only a single validator per topic is allowed, registered with RegisterTopicValidator; supporting multiple validators per topic would unnecessarily complicate the api and implementation and didn't seem worth it for now.

@vyzo
Copy link
Collaborator Author

vyzo commented Jan 18, 2018

summoning @Stebalien @whyrusleeping

// pushMsg pushes a message performing validation as necessary
func (p *PubSub) pushMsg(vals []*topicVal, src peer.ID, msg *Message) {
if len(vals) > 0 {
// validation is asynchronous and globally throttled with the throttleValidate semaphore.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment still true?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is still true.
I can remove the global throttle if we deem so, I just like to have a limit on how much goncurrency we can spawn as a result of incoming network traffic.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although, subscription reference below! that's not true, I will fix the comment.

Copy link
Contributor

@whyrusleeping whyrusleeping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current understanding is that we still have global validators, but no global throttling right? I still see a pubsub level validate throttle though, so theres still some global throttling?

@whyrusleeping
Copy link
Contributor

Oh wait, I was thinking backwards, no global validators, but still global throttling.

@vyzo
Copy link
Collaborator Author

vyzo commented Jan 18, 2018

My current understanding is that we still have global validators, but no global throttling right? I still see a pubsub level validate throttle though, so theres still some global throttling?
Oh wait, I was thinking backwards, no global validators, but still global throttling.

Right, we have a global throttle that applies to all validators, that's set to a reasonably high limit (8192 active goroutines). This is to protect us from blowing our resource usage from big bursts of traffic that trigger validations.

Each individual topic validator has its own throttle, that limits the resource usage per topic. This throttle by default is set much lower, to a mere 100 active goroutines.

As I said we can remove the global throttle if we deem so, and we can also tweak the defaults accordingly.

@Stebalien Stebalien mentioned this pull request Jan 18, 2018
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be fine accepting this if @whyrusleeping is as I consider it a step in the right direction this time but I'm still not happy with it (see #58). Why are you so opposed to having validators decide which topics they care about? The way this currently works, subscribing to the same topic twice will be painful.

@vyzo
Copy link
Collaborator Author

vyzo commented Jan 19, 2018

Why are you so opposed to having validators decide which topics they care about?

Hrm, I am not terminally opposed to them -- I just find the interface a little clunky.

The way this currently works, subscribing to the same topic twice will be painful.

Why? The validator registration is decoupled from the subscription and it's something we do at initialization.

@jvsteiner
Copy link

jvsteiner commented Jan 20, 2018

Hey, I'm a brand-newbie in looking at this project, but for me, in the use cases I have in mind, I'd like to be able to validate by topic (although I can see why "by subscription" would also be useful) - but perhaps more importantly, I'd like not to propagate messages to other peers, when they do not validate. I might even want to drop peers who send invalid messages. I guess peer attribution might be a bit outside the scope of what's currently possible - but that's my $0.02.

@vyzo
Copy link
Collaborator Author

vyzo commented Jan 20, 2018

@jvsteiner with this pr, we do exactly that: we don't broadcast messages that don't validate on a per-topic basis. We have rejected subscription-based validators because of incoherent semantics.

The global validators will simply add an additional layer below per topic validators: they will allow us to define validator classes for pubsub namespaces and will be implemented in follow up work.

@Stebalien
Copy link
Member

@whyrusleeping I believe we're waiting on you for the final signoff.

@Stebalien Stebalien merged commit c82e67d into master Jan 24, 2018
@Stebalien Stebalien deleted the feat/validators branch January 24, 2018 06:03
@ghost ghost removed the in progress label Jan 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants