GPIO Timing Issues for Bit Banging, e.g., for i2c (Wire Library) #1536

Closed
pasko-zh opened this issue Jan 30, 2016 · 6 comments

Comments

@pasko-zh

I observed that sometimes the wire library did not work as expected. It is---probably---a deeper, more fundamental issue related to GPIO timing. My issue description is thus a bit longer...

Hardware Setup, Versions
I'm using an Adafruit HUZZAH (esp-12) connected to an MCP23017, with 1.5K pullups on both lines. I'm using version 1.6.5-r5 of the Arduino IDE and version 2.0 of the esp8266 board manager package on Win7/64.

To get closer to the problem I've decided not to use the Wire library and its related core code; instead I've created a very simple basic sketch.
It contains two functions with the same functionality to toggle the two pins, written in C and Assembler, toggle_blue_yellow__C() and toggle_blue_yellow__asm().
The pins are labeled yellow and blue to match the scope pics. The expected result would look like this: expected result

The high and low periods are produced by a simple delay loop containing some NOPs. With a CPU clock of 80 MHz it should give ca. 360 kHz and with 160 MHz ca. 720 kHz (the yellow line on the scope pic). I deliberately did not adapt the delay loop to the CPU clock frequency. Also, the functions in C and assembler are written to be as similar as possible.
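As a rough cross-check of these numbers: if one half period of the loop costs a fixed number of CPU cycles, the output frequency scales linearly with the clock. The figure of about 111 cycles per half period used below is not measured; it is simply what 360 kHz at 80 MHz implies (80e6 / (2 * 360e3) ≈ 111). The helper names are hypothetical.

```c
/* Hypothetical helper: square-wave frequency in Hz produced by a toggle loop
   whose half period costs a fixed number of CPU cycles. */
static double toggle_freq_hz(double cpu_hz, double cycles_per_half_period)
{
    return cpu_hz / (2.0 * cycles_per_half_period);
}

/* Hypothetical helper: duration of one half period in nanoseconds. */
static double half_period_ns(double cpu_hz, double cycles_per_half_period)
{
    return cycles_per_half_period * 1e9 / cpu_hz;
}
```

With 111 cycles this gives a half period of roughly 1.39 usec at 80 MHz and roughly 0.69 usec at 160 MHz, which is in line with the unstretched half cycles reported below.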

I've tested different variations of the basic sketch:

  • Compilation for 80 and 160 MHz, file name contains ..._160mhz_... and ..._80mhz_...
  • Versions with a declaration to a dummy variable, ..._dummy_... and without ..._x_...

For each test I've included the Rigol scope pics, the object file and a disassembly of it.

Results for C Versions
Repo-Folder

80 MHz with dummy variable, C_dummy_80mhz.* files:

enter image description here
enter image description here

First half cycle is OK. But then it should have the falling edge on the dotted position, lower pic.
But there's also a stretch on the blue one of ca. 840 nsec:
enter image description here

Also, I've noted that the stretching is about 2.08 usec (3.52 - 1.44):
enter image description here

160 MHz with dummy variable, C_dummy_160mhz.* files:
Bingo! Perfecto! :-) Correct timing, no stretching...

enter image description here

80 MHz without dummy variable, C_x_80mhz.* files:
First cycle is OK (ca. 370 kHz), then the low one is far too long!
enter image description here
enter image description here

160 MHz without dummy variable, C_x_160mhz.* files:
Bingo, perfecto again:
enter image description here
enter image description here

Results for Assembler Versions
Repo-Folder

80 MHz with dummy variable, asm_dummy_80mhz.* files:
Oookey, dobro :-)
enter image description here
enter image description here

160 MHz with dummy variable, asm_dummy_160mhz.* files:
Perfecto again...
enter image description here

80 MHz without dummy variable, asm_x_80mhz.* files:
Looks OK at first, but notice that all the cycles are too long! A half cycle is around 3.6 usec, i.e. stretched by ca. 2.2 usec (3.6 - 1.4).
enter image description here
enter image description here

160 MHz without dummy variable, asm_x_160mhz.* files:
First half cycle is OK, then again stretches of ca. 2.14 usec (2.86 - 0.72):
enter image description here
enter image description here


Summary
Versions producing the expected results:

  • C Version
    • With dummy variable and 160 MHz
    • Without dummy variable and 160 MHz
  • Assembler Version
    • With dummy variable and 80 MHz
    • With dummy variable and 160 MHz

As you can see in the ino sketches, I also tried to disable the interrupts (I didn't try the assembler rsil stuff) and the watchdog, but it didn't change any of the results. Therefore, I've just commented those lines out.
Looking at the disassembly files, I could imagine compiler optimizations causing these stretches of ca. 2.1 usec. However, I am a bit confused that the CPU clock frequency could affect the results; note that the disassembly files are identical.
My second thought is that the issues are caused by non-aligned memory access, because there were cases, which I cannot reproduce, where the esp-12 crashed with exception 9 (LoadStoreAlignment cause). But I only got the exception when those stretches were present.

Now, depending on how lucky you are, you won't face any issue with for instance the wire library. But, if you are unlucky and your compiled sketch is in one of the above bad cases, you will encounter "strange" issues with regard to i2c communication, because those stretches may introduce timing issues on SDA and SCL lines. (btw., I am aware of bus stalls but this is another topic).

My Questions...

  1. Can you reproduce this behaviour?
  2. Did you encounter such timing issues, too?
  3. How could I improve my code (C or inline assembler) so that this no longer happens?
  4. How could we dig deeper to find out what is happening during these 2.1 usec stalls?
  5. Since I suspect it is due to compiler optimizations, can I change something in those settings, or exclude parts of the code from, e.g., memory optimizations?
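One possible way to approach question 4, sketched under assumptions: sample the CPU cycle counter once per half cycle, keep the samples in RAM, and afterwards flag the intervals that took longer than expected. On the ESP8266 Arduino core the samples could come from `ESP.getCycleCount()`, as far as I recall; the helper below is hypothetical and hardware independent so it can be checked on a host. At 80 MHz, a 2.1 usec stretch corresponds to about 168 extra cycles.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the intervals between consecutive cycle-counter
   samples that exceed expected_cycles + tolerance_cycles. Unsigned
   subtraction keeps each difference correct across one 32-bit wraparound. */
size_t count_stalls(const uint32_t *samples, size_t n,
                    uint32_t expected_cycles, uint32_t tolerance_cycles)
{
    size_t stalls = 0;
    for (size_t i = 1; i < n; i++) {
        uint32_t delta = samples[i] - samples[i - 1];
        if (delta > expected_cycles + tolerance_cycles) {
            stalls++;
        }
    }
    return stalls;
}
```

Counting cycles rather than reading the scope would also show whether the stall length scales with the CPU clock or stays constant in time.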

Thanks for your thoughts, help, improvements :-)
paško

@Links2004
Collaborator

Great analysis 👍
If I find time I will run some tests too.

For 5. I can give you an example:

#pragma GCC push_options
#pragma GCC optimize ("O0")

// your code

#pragma GCC pop_options

or

void __attribute__((optimize("O0"))) foo(unsigned char data) {
    // code the compiler must not optimize
}

Note: the attribute sometimes causes problems in GCC with function declarations,
so it is better to use the pragma style.

I am not sure whether ETS_INTR_LOCK really disables all interrupts; I dimly remember that there was something wrong with it.

Maybe try:

#define interrupts() xt_rsil(0)
#define noInterrupts() xt_rsil(15)

@pasko-zh
Author

Markus, thanks for that quick reply! I will try it out ASAP.

@pasko-zh
Author

pasko-zh commented Feb 3, 2016

I did some more tests... and it still confuses me :-/ My further observations are:

  • I flashed and flashed and reflashed (poor esp flash memory ;) ... and sometimes the issue was gone when I changed the serial upload speed (!)
  • Then it was there when I recompiled for 80 MHz.... then changed to 160 MHz and it was ok after flashing and reflashing and ...

So, really strange!! wtF ...

My only workaround so far is to repeat the steps above until the waveforms look OK.

PS: I've tried Markus' disable-interrupts hint and the pragma style. It didn't change the behaviour noticeably, or sometimes it did, but again, I think something goes a bit wrong sometimes while flashing the esp...

@Links2004
Collaborator

Maybe it has something to do with the flash-to-RAM mirroring.
The first 32 KB of the image are always in RAM.
Everything else is loaded dynamically into another 32 KB "buffer" in RAM.
The dynamic buffer always holds the most recently used functions (memory addresses).
Maybe the 2.14 us you see is the loading from flash to RAM;
the code needs to be in RAM to be executed.

Have you tried to force all functions into RAM?

void ICACHE_RAM_ATTR always_in_ram() {

}

void in_flash_and_may_mirrord_in_ram() {

}
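Applied to the toggle test, this suggestion could look roughly like the sketch below. It is not the code from the repository: `toggle_pins_iram`, its parameters, and the empty host fallback for `ICACHE_RAM_ATTR` are assumptions made for illustration. On the target, `out_reg` would point at the GPIO output register (the core's `esp8266_peri.h` exposes it as `GPO`, as far as I recall); on a host it can be any `uint32_t`, which makes the logic checkable without hardware.

```c
#include <stdint.h>

/* The ESP8266 Arduino core defines ICACHE_RAM_ATTR. This empty fallback is an
   assumption that only lets the sketch compile off-target. */
#ifndef ICACHE_RAM_ATTR
#define ICACHE_RAM_ATTR
#endif

/* Hypothetical helper: toggle every pin in `mask` `half_cycles` times, with a
   busy-wait of `delay_iters` loop iterations after each edge. The attribute
   asks the linker to place the function in IRAM, so instruction fetches do
   not depend on the flash cache. */
void ICACHE_RAM_ATTR toggle_pins_iram(volatile uint32_t *out_reg, uint32_t mask,
                                      uint32_t half_cycles, uint32_t delay_iters)
{
    for (uint32_t i = 0; i < half_cycles; i++) {
        *out_reg ^= mask; /* one edge on every pin in mask */
        for (volatile uint32_t d = 0; d < delay_iters; d++) {
            /* volatile counter keeps the compiler from removing the delay */
        }
    }
}
```

Anything called from such a function would need the same attribute, otherwise the call itself can still trigger a fetch from flash.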

@pasko-zh
Author

pasko-zh commented Feb 3, 2016

Excellent! I owe you a beer :-) So, if you are around Zurich, drop me a message!

THAT DID IT!

However, I have a question of understanding, because up to now I always thought the following (from the Espressif FAQ): functions decorated with ICACHE_FLASH_ATTR are compiled to the irom section, and the CPU reads the function code out of the flash chip when needed, so it is loaded into the cache and run only when it is called.
Functions without ICACHE_FLASH_ATTR are loaded into IRAM at power-on.

So, is it the other way around with the Arduino toolchain? I.e., if I do not put ICACHE_RAM_ATTR before a function, it will be placed in flash memory and loaded into RAM when needed?

@Links2004
Collaborator

Yes, on the Arduino port it's the other way around; the default is flash.
It would be a pain if you needed to add ICACHE_FLASH_ATTR to all your functions,
not to mention cross-platform compatibility with Arduino code or libraries.
