New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'struct rtrie' has no member named 'lock' #245

Closed
ydahhrk opened this Issue May 16, 2017 · 0 comments

Comments

Projects
None yet
1 participant
@ydahhrk
Member

ydahhrk commented May 16, 2017

Compilation error on ARM (apparently):

rtrie.c:17:54: error: 'struct rtrie' has no member named 'lock'

If anyone gets this, the temporal workaround is to replace lockdep_is_held(&trie->lock) for true. It should look like this:

#define deref_reader(node) \
	rcu_dereference_bh(node)
#define deref_updater(trie, node) \
	rcu_dereference_protected(node, true)
#define deref_both(trie, node) \
	rcu_dereference_bh_check(node, true)

Here's my explanation:

The second argument to the rcu_dereference functions is intended to validate the "dereference" is happening with some kind of concurrency protection enabled. If the condition (2nd argument) evaluates to true, then the function assumes that the code is protected and carries on normally. If it evaluates to false, the function assumes there's some kind of programming error and throws a kernel panic.

This is desired because we'd rather crash the kernel (which would be an immediately identifiable bug) than let a concurrence problem slip by (which would be a serious error that could be extremely difficult to even identify, let alone track).

The thing is, depending on compilation options, the condition might or might not be evaluated.

And here is the catch: &trie->lock is not even valid code because the "trie" structure does not contain a "lock" field. &trie->lock is a remnant from an old implementation of rtrie, and it should always crash because it's no longer valid. It is perfectly normal to be getting this error. The fact that it's not crashing for most people is what is actually puzzling.

The fact that it's not crashing means that the condition is very rarely being evaluated, which in turn means that concurrence is very rarely being validated during runtime. This is the core problem that needs fixing.

Replacing lockdep_is_held(&trie->lock) into an immediate true should remove the compilation stopper. It will also remove the validation, but since clearly nobody is using it anyway, it should not have any repercussions other than ones we might already have. I claimed to have "reviewed the [calling] code", so there should not be any problems.

(The check validates code, not input, and the code seems fine, so there's not much a user can do to break something.)

If anyone is paranoid, all one needs to do to completely zeroize any chances of a concurrence problem is to never edit the EAM table while traffic is being translated. Preventing these two things from happening at the same time is the only purpose of these protections.

Update: This was reported by someone who's messaging via the dev mailing list, so credits will remain anonymous until further notice.

@ydahhrk ydahhrk added this to the 3.5.4 milestone Jun 5, 2017

ydahhrk added a commit that referenced this issue Jul 20, 2017

Fix a few indentation-related compilation warnings
It appears that whoever wrote the JSON parser had very misled
formatting conventions. I cannot picture anyone indenting code like
this having good intentions:

	if (<condition>) <action-if-true>; <action-always>

Not sure what to make of this.

In other news, I finally managed to properly test #245. Though the
EAMT itself might err (as this was outside of the scope of 245),
the code that reviews it should detect certain types of improper
RCU API usage.

@ydahhrk ydahhrk closed this in cde934f Jul 20, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment