Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decide NULL object handling #2839

Closed
nigoroll opened this issue Nov 21, 2018 · 20 comments
Closed

decide NULL object handling #2839

nigoroll opened this issue Nov 21, 2018 · 20 comments

Comments

@nigoroll
Copy link
Member

nigoroll commented Nov 21, 2018

while working on #2830, I thought it might be about time to make a decision on how to handle NULL (uninitialized) objects (for example, because a new statement was not reached or the constructor failed). I see two basic options

a) advise vmod authors to not assert on null and instead call VRT_fail() or otherwise handle the case of the object pointer being NULL safely
b) require that all objects be initialized when vcl_init {} returns or fail the vcl load

@rezan
Copy link
Member

rezan commented Nov 21, 2018

Is a null object an object which skipped its init call? If so, then I will chime in :) Init calls cannot be placed in subs, if statements, or be a part of an expression. Vcc can detect and throw an error on unsafe init usage. This would make it impossible for init to be skipped. This is exactly how other languages handle this, at compile time: https://en.m.wikipedia.org/wiki/Uninitialized_variable#Use_in_languages

@nigoroll
Copy link
Member Author

@rezan I agree and think that you are basically describing a variant of option b). But a static check at VCC time still leaves a window for an initialization to fail at vcl load time and will probably be harder to implement than simply checking after vcl_init completion if all declared objects were initialized.
From the user's perspective, vcl.load would fail in both cases, so the net effect should be identical.

@rezan
Copy link
Member

rezan commented Nov 21, 2018

Not sure I understand this?

If you eject out of VCL, only call fini on objects where init was called. Part of initialization is registering the fini (see my patch), so in theory this can baked into the implementation.

How exactly do you plan on checking if init was called on all variables after the fact? You will do this at compile time or runtime?

@nigoroll
Copy link
Member Author

First of all the only call fini on objects where init was called is broken at the moment, see #2297
I did not make detailed plans, but VCC knows which objects were declared so it can generate the code to check if they are all non-NULL after vcl_init succeeded otherwise.
As explained before, I think vcl.load time checking would suffice.

@slimhazard
Copy link
Contributor

slimhazard commented Nov 21, 2018

What exactly is the problem we're trying to solve here? Is there internal code that works with VMODs that might crash if an object is NULL? That doesn't happen for method invocations, since the object pointer is passed into the C function that implements the method.

If the rest of Varnish is not necessarily safe when an object pointer is NULL, then I can see the rationale in one of these solutions. But if that's not the problem, then I don't understand why we need to do anything about it at all.

Why shouldn't a VMOD author choose to work with NULL with some sort of significance? Say, here's what the VMOD does with an uninitialized object. How can we know that that won't ever be an appropriate design choice for a VMOD? Why shouldn't VCL authors have the freedom to make the initialization of an object dependent on a condition?

To be sure, a NULL object pointer could lead to a segfault or an assertion failure, but this is C, after all, which famously gives you enough rope to hang yourself. There are numerous ways that a VMOD can get Varnish to crash, whether or not objects are NULL. So it has always seemed obvious to me that VMODs only exist on the understanding that the programmers have to know how to do it right. I don't think that programmer hand-holding should be the responsibility of this project.

@rezan
Copy link
Member

rezan commented Nov 21, 2018

@slimhazard totally agree!

@nigoroll
Copy link
Member Author

nigoroll commented Nov 21, 2018

@slimhazard @rezan I apologize for apparently having been too terse in an initial issue description again, I should try to never again assume that anything is so obvious that anyone from the dev team will be able to make sense of a brief description.

The current doctrine in vmods is to CHECK_OBJ_NOTNULL(obj) in vmod objects, see https://github.com/varnishcache/varnish-cache/blob/master/lib/libvmod_blob/vmod_blob.c#L240 for one example of probably dozens of cases in various vmods.

So right now, a

vcl_init {
  if (false) {
    new xblob = blob.blob(...);
  }
}
vcl_recv {
  ... = xblob.get();
}

crashes varnish for all practical purposes. This contradicts the design requirement that pure VCL should be safe.

So my question for clarification is if we should

a) handle NULL and not panic, iow require vmod authors to not assert on null and instead call VRT_fail()
b) make sure that objects are never null, iow require that all objects be initialized when vcl_init {} returns or fail the vcl load

@slimhazard

Why shouldn't a VMOD author choose to work with NULL with some sort of significance?

in the java world, what would be the semantics of NULL.get() (if anything like this existed, but you'll get the point).

Calling a method on a NULL object does not seem to be a valid concept to me, but, sure, we're free to define NULL as a valid object in vcl, if we were inclined to do so. That's option a).

I don't think that programmer hand-holding should be the responsibility of this project.

I agree. But our coding paradigm should be a safe example, and right now it is not.

@Dridi
Copy link
Member

Dridi commented Nov 21, 2018

c) have varnish keep track of objects initialization to ensure that constructors are called exactly once, otherwise fail vcl_init.

@rezan
Copy link
Member

rezan commented Nov 21, 2018

@Dridi I believe that is a runtime solution. We should always opt for compile time solutions for perf reasons. We compile once, but "run" many orders of magnitude more times. Over time, these runtime checks will add overhead, accumulate, and then leave us with no options but to accept slowness as a fact of the Varnish runtime. Unless you want to introduce a JIT into the runtime which would optimize out these checks under certain conditions :)

@nigoroll
Copy link
Member Author

nigoroll commented Nov 21, 2018

@rezan I don't think your comment applies. vcl_init is not runtime in that sense.

@rezan
Copy link
Member

rezan commented Nov 21, 2018

@nigoroll I guess we disagree on the definition of runtime and compile time then...

@nigoroll
Copy link
Member Author

@rezan I fail to follow in any way. Can you explain please why you consider vcl_init runtime? It's called after the vcl is compiled to C by vcc, compiled into a shared object and loaded during the warmup.

You said:

We compile once, but "run" many orders of magnitude more times

This would only apply if we transitioned the vcl from cold to warm many orders of magnitude more often than loading it.

In the context of performance, this argument does not click with me. In particular not with regard to a check which should take <1us per object.

@rezan
Copy link
Member

rezan commented Nov 21, 2018

Right, so your saying that the runtime penalty for checking for NULL objects is low. Might be true.

But remember, this is software. Our decisions needs to be forward looking, otherwise, we lose the much needed ability to keep evolving, competing, and solving problems.

What happens when we introduce request scoped objects? Are we still in a situation where the runtime overhead for NULL checking is low?

This is exactly runtime vs compile time!

@nigoroll
Copy link
Member Author

I feel I'm the wrong student for the lecture on performance fundamentals. ;) Please let's focus on the initial question and maybe reach a conclusion during the next bugwash

@slimhazard
Copy link
Contributor

I suggest that we look into making the bundled VMODs robust in the case of an uninitialized object, using existing means, such as calling VRT_fail() in the NULL case.

Other than that, IMO the rationale given here is indeed hand-holding, and I don't think it should be the business of the project -- in particular, we shouldn't be imposing technical restrictions on VMODs, or forbidding otherwise legal language syntax (such as initializations within an if in vcl_init). So I'm sort of for (a), except I would replace "require" with "advise", and instead of insisting on "call VRT_fail()", I'd say "handle the NULL case as is fitting and proper for your VMOD, which might be to call VRT_fail()".

Every time I try to articulate my reasons for that, my bad habit of getting long-winded and philosophical gets in the way. So @nigoroll, instead of me writing an enormous comment that's too much work for anyone to read, I suggest that we discuss it the next time we see each other in person.

@Dridi
Copy link
Member

Dridi commented Nov 21, 2018

I don't think vcl_init runs more than once, more specifically I don't think it runs on every VCL warmup. On the contrary, once vcl_init succeeds it receives its first WARM event by default and then may receive future COLD/WARM events to get a chance to release/reacquire relevant resources post-initialization.

Keeping track of constructor calls and enforcing exactly one object instantiation per vcl_init not only solves @nigoroll's example of a new statement in a dead branch but also multiple new statements for a single symbol:

import foo;

sub vcl_init {
    if (logic) {
        new bar = foo.baz("once");
    }
    if (broken-logic) {
        new bar = foo.baz("again");
    }
}

That gives the choice to VMODs to handle NULL as they wish for example to turn an object into a no-op under certain circumstances and solves the current lack of "static analysis" of arbitrary VCL code misusing a native VCL construct regardless of the underlying VMOD!

We already have a "VRT_fail'd" check after _every single VCL statement_ so arguing about null checks or keeping track of exactly-once object instantiations for performance reasons is a fruitless discussion.

edit: the systematic VRT_fail'd check also happens at run time, during proper transactions.

@Dridi
Copy link
Member

Dridi commented Nov 21, 2018

I wrote a test case:

varnishtest "fuzzy object"

server s1 "" -start

varnish v1 -vcl+backend {
	import directors;

	sub vcl_init {
		if (true) {
			new rr = directors.round_robin();
		}
		if (true) {
			new rr = directors.round_robin();
		}
	}
} -start

And it yields an error:

**** v1    0.1 CLI RX|Message from VCC-compiler:
**** v1    0.1 CLI RX|Instance 'rr' redefined.
**** v1    0.1 CLI RX|('<vcl.inline>' Line 12 Pos 29)
**** v1    0.1 CLI RX|                        new rr = directors.round_robin();
**** v1    0.1 CLI RX|----------------------------##---------------------------
**** v1    0.1 CLI RX|
**** v1    0.1 CLI RX|First definition:
**** v1    0.1 CLI RX|('<vcl.inline>' Line 9 Pos 29)
**** v1    0.1 CLI RX|                        new rr = directors.round_robin();
**** v1    0.1 CLI RX|----------------------------##---------------------------
**** v1    0.1 CLI RX|
**** v1    0.1 CLI RX|Running VCC-compiler failed, exited with 2
**** v1    0.1 CLI RX|VCL compilation failed

So good news, we can't have the broken logic described previously. The bad news is that we may still have a symbol not going through a proper initialization. The "bad" news is we can't have alternate instantiations.

Assuming option c) where we enforce that all objects must run their constructors, either we use the actual object's pointer to do the exactly-once check at the end of vcl_init or we add another variable in the generated code. VMOD developers can always use a sentinel static pointer to get a null/no-op object to effectively disable an object.

I understand the appeal of being able to skip a constructor in VCL, but it doesn't make sense to use a symbol at runtime that didn't go through its new statement at init time, conceptually.

@nigoroll
Copy link
Member Author

@slimhazard your interpretation of a) is what I originally intended, so I've changed the description (it's always good to have a native speaker on the team :) ) The only relevant point was not to assert for option a), I did not mean to dictate the exact means of error handlig.

@Dridi thank you: yes I was wrong about vcl_init being called for a warm event - it does not get called for a vcl warm. But the context of that misinformation was the performance argument, so that's even less relevant.

@nigoroll
Copy link
Member Author

bugwash: b), but a) can be activated in the vcc file. @bsdphk to write it up

@bsdphk
Copy link
Contributor

bsdphk commented Nov 26, 2018

Bugwash conclusion:

We will annotate $Object in .vcc files to allow them to be NULL. If not so marked, vcl_init{} will fail if they are not initialized.

@bsdphk bsdphk closed this as completed in 9ceeed8 Feb 5, 2019
Dridi pushed a commit that referenced this issue Jul 10, 2019
Allow VMOD writers to permit with NULL_OK flag.

Only call object destructor on initialized objects.

Fixes #2839

Conflicts:
	lib/libvmod_debug/vmod.vcc
hermunn pushed a commit to hermunn/varnish-cache that referenced this issue Nov 3, 2020
Allow VMOD writers to permit with NULL_OK flag.

Only call object destructor on initialized objects.

Fixes varnishcache#2839

Conflicts:
	lib/libvmod_debug/vmod.vcc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants