New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers/adxl345: fix ADXL345 driver #10088

Merged
merged 6 commits into from Oct 16, 2018

Conversation

@olegart
Copy link
Contributor

olegart commented Oct 1, 2018

Contribution description

Fixes the broken ADXL345 driver.

  1. Scale factor can be 4, not 3.9. 3.9 came from "typical scale factor is 3.9" from datasheets' Table 1, but accelerometer accuracy is never so good to really demand floating-point precision (adding hundreds bytes to firmware footprint on FPU-less controllers). Even datasheet itself states "maintaining 4 mg/LSB scale factor in all g ranges".

  2. There's no need to change scale factor. It's always 4.

  3. Byte reordering was never supposed to be done with arithmetic +

  4. Byte reordering is not needed on BE platforms (BTW, BYTE_ORDER macros used in RIOT is not exactly portable, are we ok with this?)

EDIT: more fixes

  • full_res settings fixed

EDIT 2:

  • scale calculation for 10-bit mode fixed
  • bloody mess with variously named #defines fixed

Testing

Will test the driver on an actual devices later this week.

The main question is: how this driver, being completely broken and unnecessary bloated, got into OS release?

@olegart olegart force-pushed the unwireddevices:riot-fix-adxl345 branch from 56fdebd to 0f1d817 Oct 1, 2018

@dylad

This comment has been minimized.

Copy link
Member

dylad commented Oct 1, 2018

Scale factor can be 4, not 3.9. 3.9 came from "typical scale factor is 3.9" from datasheets' Table 1, but accelerometer accuracy is never so good to really demand floating-point precision (adding hundreds bytes to firmware footprint on FPU-less controllers). Even datasheet itself states "maintaining 4 mg/LSB scale factor in all g ranges".

I'm okay with that.

There's no need to change scale factor. It's always 4.

This is not actually true, the 4 mg is maintained ONLY if FULL_RES bit is set. User may want to have a FULL_RES to 0.

Byte reordering was never supposed to be done with arithmetic +

Where does this assumption come from ?

@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 1, 2018

Where does this assumption come from ?

Textbooks?.. You'll have all kinds of a mess dealing this way with signed numbers, because compiler will interpret each byte as a separate signed number.

Assume you have 0xFFFF 16-bit signed two's complement big endian. 0xFFFF = -1 decimal.

Let's convert it to two's complement little endian using arithmetic operations:
(int16_t)0xFF00 + (int8_t)0xFF = 255+ (-1) = 254 = 0xFE00

Which is wrong, so don't use arithmetic operations unless you want compiler to treat each byte of a multibyte number as a separate and independent single-byte number - which it is not.

User may want to have a FULL_RES to 0

Yes, but why? I can't see any benefits of using 10-bit mode, I believe it was added to the sensor for some backward compatibility or data processing optimizatons on lowest-end systems. To comply with KISS, it's better to remove this setting too, as even when fully implemented, it wouldn't change actual data even a bit.

@dylad

This comment has been minimized.

Copy link
Member

dylad commented Oct 1, 2018

with signed numbers

I didn't see I used signed int8, sorry I got your point.

Yes, but why? I can't see any benefits of using 10-bit mode,

The feature is here and at this point I still see no reason to drop it. One solution could be splitting the driver into two submodules one minimalist, the other more complete but I think this will create too much overhead.

@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 1, 2018

I didn't see I used signed int8, sorry I got your point.

You should never use arithmetic operations to shift bytes around, signed or unsigned. Period.

The feature is here and at this point I still see no reason to drop it

Meaninglessness of the feature is not a reason anymore?..

One solution could be splitting the driver into two submodules one minimalist, the other more complete but I think this will create too much overhead

Especially considering there will be no any actual difference between those submodules.

@dylad

This comment has been minimized.

Copy link
Member

dylad commented Oct 1, 2018

Meaninglessness of the feature is not a reason anymore?..

No, the feature is meaningless to you. Analog has its reason to put it there (backward compatibility or other I leave this point to specialists). You will not convince me by saying "I can't see benefits".

If other people want this feature to be removed and there is a consensus on it I won't oppose to it.

@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 1, 2018

No, the feature is meaningless to you

Ok. So it's meaningful to you. And what actually changes on the workings of your driver with this setting?

@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 1, 2018

Oh.

/* Basic device setup */
    reg = (dev->params->full_res | dev->params->range);
    i2c_write_reg(BUS, ADDR, ACCEL_ADXL345_DATA_FORMAT, reg, 0);

And dev->params->full_res equals ADXL345_PARAM_FULL_RES equals 1, which means it sets bit 0 to 1. The only problem is, according to datasheet, FULL_RES of the DATA_FORMAT register is bit 3.

Did you test the driver on an actual device? Did anyone? Before pushing this non-working piece of art to release version of RIOT?

@dylad

This comment has been minimized.

Copy link
Member

dylad commented Oct 1, 2018

And dev->params->full_res equals ADXL345_PARAM_FULL_RES equals 1, which means it sets bit 0 to 1. The only problem is, according to datasheet, FULL_RES of the DATA_FORMAT register is bit 3.

You spotted a bug, good catch.

But I'm still no convinced we should drop the FULL_RES params from this driver.

@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 1, 2018

You spotted a bug, good catch.

Question remains: how this driver got to release OS version if no one actually tested it - and, surprisingly, it doesn't work?

@dylad

This comment has been minimized.

Copy link
Member

dylad commented Oct 1, 2018

Question remains: how this driver got to release OS version if no one actually tested it - and, surprisingly, it doesn't work?

We checked that the sensor provide output, detect TAP and/or DOUBLETAP. We didn't go into depth data analyze.

@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 1, 2018

By "deep data analysis" you mean you didn't check if it shows 1g on some axis if left lying on the table.

Seriously, it's a shame.

@aabadie

This comment has been minimized.

Copy link
Contributor

aabadie commented Oct 1, 2018

Seriously, it's a shame.

Nobody is perfect and that kind of things can happen, so please be respectful regarding others work.

@dylad

This comment has been minimized.

Copy link
Member

dylad commented Oct 1, 2018

@olegart What I propose to you :
You fix all the current issues with this driver with this PR but without removing the scale_factor thing since we agree to disagree on this point so we can upstream this fix quickly. Then you (or me, as you want) can open another PR to discuss about the scale_factor thing.

What do you think ?

@olegart olegart force-pushed the unwireddevices:riot-fix-adxl345 branch from a12d8b3 to ad8ecd7 Oct 1, 2018

@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 1, 2018

@dylad I'm not actually interested in discussing this topic as it is clear you have nothing to say on it, but just to get rid of it I've changed PR.

Also it's a common maintainers' mistake to think I'm somehow interested in getting my PR to mainline. I'm not. We have our own RIOT fork, if you prefer to stay with your faulty code - you are welcome, I don't care.

@aabadie this driver is plainly and clearly broken. Broken from the day 0 and until now. It doesn't work. It doesn't provide the most basic functionaly. Nobody ever tested it. Its code is written with textbook errors. Its author can't explain why he wrote it this way.

While in other PRs there are months-long discussions about vague Codacy warnings or proper placement of spaces (but mostly silence, as nobody cares), this piece of art goes straight into release.

I wonder how many other RIOT internals have the same code quality and review thoroughness?

#endif
#ifndef ADXL345_PARAM_OFFSET
#define ADXL345_PARAM_OFFSET { 0, 0, 0 }
#endif
#ifndef ADXL345_PARAM_SCALE_FACTOR

This comment has been minimized.

@dylad

dylad Oct 1, 2018

Member

Thanks for your fix. Could you also readd this #ifndef please ?
We should be good to close this afterward.

@@ -159,7 +159,7 @@ typedef struct {
typedef struct {
adxl345_params_t *params; /**< Device configuration */
adxl345_interrupt_t interrupt; /**< Interrupts configuration */
float scale_factor; /**< Scale factor for converting value to mg */
int16_t scale_factor; /**< Scale factor for converting value to mg */

This comment has been minimized.

@dylad

dylad Oct 1, 2018

Member

This must also stay in float for now.

This comment has been minimized.

@olegart

olegart Oct 1, 2018

Author Contributor

"Must"?

Ah, screw it. Fix your shitcode by yourself.

All the more reasons to maintain own fork instead of participating in nonsense like this.

This comment has been minimized.

@aabadie

aabadie Oct 1, 2018

Contributor

Well, I think your reaction is not appropriate and could be considered as a violation of your code of conduct. Again, please calm down and keep this kind of comment for yourself.
I (and I'm sure others as well) appreciate your interest in RIOT and the effort you are putting to make it better.

This comment has been minimized.

@olegart

olegart Oct 1, 2018

Author Contributor

@aabadie otherwise what?

You'll never accept my PR again and will stay with shitcode like this? Take a look at my last commit here, it's absolutely marvellous.

BTW, is there a way to list all commits made by @dylad? I'm dead serious to check them all or at least mark as potentially dangerous in our fork. That's not production code, that's not even beta code, that's shamefull code - so badly its written.

This comment has been minimized.

@gebart

gebart Oct 2, 2018

Member

@olegart of course there is, did you try reading the fine manual for git log? https://git-scm.com/docs/git-log#git-log---authorltpatterngt Do whatever you like with that info.

This comment has been minimized.

@olegart

olegart Oct 2, 2018

Author Contributor

Oh, I know exactly what to do with that info.

With that info I'll return in a year to see if someone has bothered to fix the shitcode here.

I'm kind of know what I see then, but I'm pretty sure Code of Conduct will be vastly improved.

This comment has been minimized.

@topin89

topin89 Oct 2, 2018

@dylad
I'm kinda curious, why it must stay float and for how long?
It's clearly not about precision, driver was not able to show 1g correctly.
Is it for backwards compatibility?

@kYc0o kYc0o self-assigned this Oct 1, 2018

@olegart olegart force-pushed the unwireddevices:riot-fix-adxl345 branch from 6befbe4 to 4fa5b8e Oct 1, 2018

@tcschmidt

This comment has been minimized.

Copy link

tcschmidt commented Oct 2, 2018

@olegart thanks for identify this bug and contributing a fix. This helps to improve RIOT, and we value your technical sharpness. There is no need to believe that we don't recognize your arguments or just push back from merging. Having a review discussion is a normal procedure in the RIOT open source community, and it may happen that implementation choices and algorithmic details only become apparent through the exchange of arguments.
We also value your sharp arguments. On the other hand, please understand that others may get upset if they sense an aggressive voice. Even if you didn't mean to be aggressive, your correspondents here took it that way.
The reviewing process in RIOT seeks a consensus for solution between author and reviewer(s) at the highest reachable quality. This does not always work out successfully. If you feel that a reviewer does not understand your contribution nor recognizes your arguments, you are always free to ask for a review from another person.
But please don't attack people for their arguments: They may have reasons, they may be also wrong - but the constructive spirit of the community and also the RIOT code of conduct guides all of us to collaborate with you as well as with all others. This makes the productive community happen and is - finally - a lot of fun.

@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 2, 2018

@tcschmidt the problem is review process is flawed. I'm not talking about this specific discussion, I'm talking about how this code was accepted at all. Obviously, nobody ever reviewed it - and it seems nobody cares too.

For a year and a half, you had non-working badly written code in all the release versions. And nobody cares.

Are you familiar in any way with the process of scientific paper review process? It's far from perfect, but if the author was caught on some real bullshit, they don't "seek a consensus". They don't "politely ask". They don't "care about his feelings". They don't talk about "constructive spirit". They fire him, they revoke his paper, they recheck his older papers and revoke it too.

Did you hear, for example, about Jonah Lehrer and, recently, Brian Wansink? They may had their reasons too, but they are out of community now an nobody is going to collaborate. Lehrer wrote weekly scientific columns in Wired and New Yorker once, now all he's writing is quarterly post in his own blog.

But, still, may be you care about constructive spirit, not about code quality at all. That will explain.

@gebart

This comment has been minimized.

Copy link
Member

gebart commented Oct 2, 2018

@olegart I know you want to provoke a reaction from the community here, but when you express yourself the way you are doing in this and other threads, you are really only hurting your own standing. Others will not listen to you or respect you more because you are angry, but rather disregard your technical statements and only focus on the emotional content of your posts, which is not helpful at all.
Why are you angry? Do you want praise for finding a bug in a device driver? I'm sure this is not the first software development project you have been involved in and I am sure that you have seen other bugs being fixed in code, there usually isn't more than a short "thanks" after a bugfix has been merged. There is definitely no need to start attacking the person who introduced a problematic line/commit. They can still contribute other useful code and valuable knowledge and opinions.

This driver has some quality issues, but seem to have been working well enough in the use case that @dylad was using it for. If you need the exact amplitude, then the old implementation gives you some value which may or may not be accurate, and the proposed change gives you another value which may or may not be accurate, but wouldn't you need to calibrate the individual device in the environment that it is meant to run in if you need the exact value? The data sheet lists the scale factor as 3.5 < s < 4.3, so it's quite a big spread.

Using floating point here looks to me like an unnecessary complication of the implementation, but it would have been more obvious that this is the purpose of the change if it were not for your abuse and self-praise that steal focus from the factual content of this PR.

@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 2, 2018

This driver has some quality issues, but seem to have been working well enough in the use case

You mean:

  • drivers' most basic functionality doesn't work at all
  • drivers' author didn't bother to test the most basic functionality
  • drivers' output data is complete garbage
  • drivers' code has dozens of textbook errors

Who, exactly, put this in the release? Who, exactly, reviewed this? What have been done to prevent such screwups from happening in the future? When other code submitted by @dylad and/or reviewed by the maintainer who put it into release are scheduled for additional reviews?

Is there anyone who can answer these simple questions?

Or everyone is so obsessed with the Code of Conduct so nobody cares about the code, I mean, programming code anymore?

but wouldn't you need to calibrate the individual device

And now you are starting to invent reasons in the area you know precisely nothing about.

@topin89

This comment has been minimized.

Copy link

topin89 commented Oct 3, 2018

@olegart,
First:
I don't think "how this driver, being completely broken and unnecessary bloated, got into OS release?" is the most important question. It's more "how come no one noticed before?"
There is a hint, actually:

We checked that the sensor provide output, detect TAP and/or DOUBLETAP. We didn't go into depth data analyze.

The way I see it, not many really used it for any precise measurement. Or no one used FULL_RES, at least I did not found issues about it not working.
I want to be clear, the code is still bad, and your points are still valid.

they don't "seek a consensus". They don't "politely ask". They don't "care about his feelings". They don't talk about "constructive spirit". They fire him

You see, that is only for forgery. Or for constant screw up. Or maybe for denial of a screw up. One time mistake is a reason to say: "This code is bad because . Change it and don't do that again". And if there are no quality improvements, then fire.

Or everyone is so obsessed with the Code of Conduct so nobody cares about the code, I mean, programming code anymore?

The Code exist for one simple reason: so no one shout at each other and no one take defensive position after a round of shouting. Our brains are hard-wired to protect our position if they feel attacked, and they are hard-wired to feel attacked on direct and indirect insults.
It's OK to be mad seeing bad code. I think this is a reason why you are so good at your job. But please don't attack people, this won't help neither you nor anyone else.

Next part, please, at least name these Textbooks. It's a sad fact, but yes, some people don't really understand that float in kernel will lead to either software emulation or increased interrupt latencies or something else (actually, I would be glad to know what else).
And there is this:

And now you are starting to invent reasons in the area you know precisely nothing about.

This is indirect insult without clarification. It would be more productive to say "The scale is dynamic and depends on things like temperature or direction or actual force". Correct me if I am wrong, btw. People can't know everything, but you can help them know more. If they deliberately ignorant, then you can insult'em to your to your heart's content.

User may want to have a FULL_RES to 0

Yes, but why?

This maybe some hardcoded expectation of 10 bits. Maybe every example is for 10 bits and no one changes it because it works anyway. Given no one noticed FULL_RES isn't working correctly, maybe people are don't need it.

But I will stress it again, tone aside, your points are correct and should not be ignored.

@dylad

Byte reordering was never supposed to be done with arithmetic +
Aside from practical issues with signed ints, there is typical expectation. When people see "+", they immediately think of it as of addition, for some offset or some math. Using "+" for bytes manipulation is instant +1 wtf to a code.

@olegart is correct, there is no point in 3.9 scale. Quote from the datasheet:

When this bit is set to a value of 1, the device is in full resolution mode, where the output resolution increases with the g range set by the range bits to maintain a 4 mg/LSB scale factor.

Manufacturer itself said 4, not 3.9. But if you really need 3.9, please no float. You can use fixed point arithmetic, it's not that hard is it seems.

  1. go here and convert 0.9 to binary fraction, and copy first, say, 16 bits somewhere (it's 1110011001100110)
  2. then "|" with 3 << 16:
    int32_t scale = ((int32_t)3<<16) | (int32_t)0b1110011001100110;
  3. to get integer part after rescale:
    int32_t result = (input*scale)>>16;
  4. to convert to float (if you really really must):
    float result = (input * scale)*(1.0f/( (int32_t)1 <<16 ));

To everyone here, these are good questions:

  • Is there some code standards (like no float in kernel or anything)?
  • Do reviewer should actually test functionality with actual hardware?
  • Should code author at least describe tests and their results, so it was known that not all modes were tested?
@olegart

This comment has been minimized.

Copy link
Contributor Author

olegart commented Oct 3, 2018

@topin89

The way I see it, not many really used it for any precise measurement

You don't need any precision measurements for that. Fisrt, code is obviously broken, byte reordring with arithmetic operations, tons of unused and differently named defines, etc. Second, there's the most basic accelerometer test ever - lying flat on the table it must give you 1g on z and 0g on x/y. Flip it, see -1g on z, accelerometer is ok. Takes 5 seconds to check.

Actually we made some devices with ADXL345 like a year ago, with the original driver it shows something like 2g or 4g or some other nonsense.

So nobody ever tested even the most basic thing (Earths' gravity...) and nobody really looked into code.

Or no one used FULL_RES, at least I did not found issues about it not working

As I said, no one used anything. They just put it into release. Both resolution and scaling logic are completely broken in the driver, I didn't even bother to check what actual state accelerometer was into by default, it's a mess.

This is indirect insult without clarification

Because at this point the main point is not "is there real need to use float?".

The main point is "is RIOT worth using and safe to use"?

As I see it, the answer is the bold NO unless you put your own effort (time and money) into getting it straight and reviewing its code. That means RIOT may have some usefulness, but if you are starting a new project and looking for a RTOS, I'm strongly recommend to check other options and avoid using RIOT if possible.

  • RIOT review process is flawed and allows to completely broken, badly written code come to release without anyone raising a question
  • fixing it requires clearer review procedures and decisive actions to be taken if some contributor and/or maintainer caught on submitting broken code

And it seems nobody is willing to take any decisive actions ever. I'm not even sure there's anyone who's able to do it. Its community all over, isn't it?

@leandrolanzieri

This comment has been minimized.

Copy link
Contributor

leandrolanzieri commented Oct 9, 2018

I tested this PR on nucleo-f410rb, samr21-xpro and arduino-mega-2560 boards. Looks good. However I found a minor issue while reading the data (it is being swapped on little endian platforms, although this should be done only on big endian ones). Fix provided here unwireddevices#6

@cgundogan
Copy link
Member

cgundogan left a comment

ACK. Measurement values on current master are broken with several boards. The fixes provided by @olegart and @leandrolanzieri yield accurate values for the X, Y and Z dimensions. Let's see what murdock says.

@aabadie
Copy link
Contributor

aabadie left a comment

Codewise, I found one last thing that should be changed. There's also a merge commit in the commit history of the PR. Please rebase to drop it before we merge this into master.

#define ADXL345_PARAM_SCALE_FACTOR (3.9)
#endif

#define ADXL345_PARAM_SCALE_FACTOR (4)

This comment has been minimized.

@aabadie

aabadie Oct 12, 2018

Contributor

If this is not parametrizable anymore, this define should move somewhere else, in adxl345.c

This comment has been minimized.

@tcschmidt

tcschmidt Oct 12, 2018

@cgundogan can you complete the code review (incl. related label steps)?

@aabadie aabadie force-pushed the unwireddevices:riot-fix-adxl345 branch from 015f460 to f2c1355 Oct 15, 2018

@aabadie
Copy link
Contributor

aabadie left a comment

I (forced) pushed a rebased version (e.g without the merge commit) of this branch and containing an extra commit that moves ADXL345_PARAM_SCALE_FACTOR define in adxl345.c.

For me this PR is now OK. Thanks for the work @olegart.

ACK

@cgundogan

This comment has been minimized.

Copy link
Member

cgundogan commented Oct 15, 2018

@aabadie okay, there seems to be an unrelated build issue with cpp now

@jia200x

This comment has been minimized.

Copy link
Member

jia200x commented Oct 16, 2018

All green. 2 ACK.. Let's GO.

@jia200x jia200x merged commit b3ec93d into RIOT-OS:master Oct 16, 2018

3 checks passed

Codacy/PR Quality Review Up to standards. A positive pull request.
Details
Murdock The build succeeded. runtime: 25m:50s
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment