Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HostTests stalling on esp8266 Waiting for timer1 callback test to complete #2763

Closed
mikee47 opened this issue Apr 12, 2024 · 4 comments · Fixed by #2764
Closed

HostTests stalling on esp8266 Waiting for timer1 callback test to complete #2763

mikee47 opened this issue Apr 12, 2024 · 4 comments · Fixed by #2764
Labels

Comments

@mikee47
Copy link
Contributor

mikee47 commented Apr 12, 2024

This problem has been tracked back to PR #2712. Prior to that, the test code didn't actually wait for the timer callback to fire before continuing, but it still doesn't fire there.

Simplest way to verify this issue is to remove all tests from include/modules.h except Timers. Can also build with DISABLE_NETWORK=1 - setting makes no difference to the issue but reduces code size.

With the following changes and the problem goes away:

diff --git a/tests/HostTests/modules/Timers.cpp b/tests/HostTests/modules/Timers.cpp
 void REGISTER_TEST(Timers)
 {
-	registerGroup<CallbackTimerApiTest<Timer1TestApi>>();
+	// registerGroup<CallbackTimerApiTest<Timer1TestApi>>();
 	registerGroup<CallbackTimerApiTest<OsTimerApi>>();
 	registerGroup<CallbackTimerApiTest<OsTimer64Api<Timer>>>();
 
-	registerGroup<CallbackTimerSpeedTest<HardwareTimerTest>>();
+	// registerGroup<CallbackTimerSpeedTest<HardwareTimerTest>>();
 	registerGroup<CallbackTimerSpeedTest<SimpleTimer>>();
 	registerGroup<CallbackTimerSpeedTest<Timer>>();
 
 	registerGroup<CallbackTimerTest>();
 }

The issue appears to occur when the hardware timer is disarmed, then rearmed multiple times.
I don't know whether this is a hardware problem with my specific device, but unlikely.

Investigations continue...

@mikee47
Copy link
Contributor Author

mikee47 commented Apr 12, 2024

Though a bit of trial and error I discovered that this fixes the problem:

__forceinline void IRAM_ATTR hw_timer1_detach_interrupt(void)
{
	hw_timer1_disable();
	// ETS_FRC_TIMER1_NMI_INTR_ATTACH(NULL);
	ETS_FRC_TIMER1_INTR_ATTACH(NULL, NULL);
}

The ETS_FRC_TIMER1_NMI_INTR_ATTACH macro calls the SDK function NmiTimSetFunc. It does two things:

  1. Sets some bits in a DPORT register.
  2. Stores the interrupt routine address

Passing a NULL value doesn't actually block interrupts. Espressif's RTOS SDK uses NMI for tick handling and it provides a definition for the relevant register NMI_INT_ENABLE_REG. To disable NMI interrupts all we need to do is clear that register.

@mikee47
Copy link
Contributor Author

mikee47 commented Apr 12, 2024

Also found a reference to this bug in the SDK "that Timer can not be used with non-NMI interrupt once NMI has been used". esp8266/Arduino#4598 (comment).

@mikee47
Copy link
Contributor Author

mikee47 commented Apr 13, 2024

Here is a modified Basic_Blink application which I used to test this at a low level.

Note that if the hw_timer1_detach_interrupt call is removed the LED continues to blink.
However, the rate doesn't change as expected: that means the NMI interrupt is still the active one.

#include <SmingCore.h>
#include <HardwareTimer.h>

/**
 * Demonstrate bug switching timer1 from NMI to FRC interrupts.
 * The LED will flash 4 times, then stop.
 * The 'ALIVE' messages continue to indicate system hasn't actually crashed.
 * What happened is that the NMI is still enabled and blocks FRC callbacks.
 */

#define LED_PIN 2

namespace
{
using Timer1TestApi = Timer1Api<TIMER_CLKDIV_16, eHWT_NonMaskable>;

SimpleTimer statusTimer;
bool createMaskedTimer;
const unsigned interval{500};
unsigned ticks;
unsigned count;

void initTimer();

void IRAM_ATTR blink1(void*)
{
	digitalWrite(LED_PIN, count & 1);
	++count;
	if(count % 8 == 0) {
		System.queueCallback(initTimer);
	}
}

// Blink LED at twice normal rate
void IRAM_ATTR blink2(void*)
{
	blink1(nullptr);
	hw_timer1_write(ticks / 2);
}

void initTimer()
{
	ticks = HardwareTimer::Millis::timeToTicks(interval);
	Serial << _F("Initialising timer, source = ") << (createMaskedTimer ? "FRC1" : "NMI") << _F(", Interval = ")
		   << interval << _F("ms, ticks = ") << ticks << endl;
	auto source = createMaskedTimer ? TIMER_FRC1_SOURCE : TIMER_NMI_SOURCE;

	// >>>> This is the call which breaks switching from NMI to FRC
	hw_timer1_detach_interrupt();
	//<<<<

	hw_timer1_attach_interrupt(source, createMaskedTimer ? blink1 : blink2, nullptr);

	hw_timer1_enable(TIMER_CLKDIV_16, TIMER_EDGE_INT, true);
	hw_timer1_write(ticks);

	createMaskedTimer = !createMaskedTimer;
}

} // namespace

void init()
{
	Serial.begin(COM_SPEED_SERIAL);
	pinMode(LED_PIN, OUTPUT);

	statusTimer.initializeMs<1000>([]() { Serial << SystemClock.now() << " ALIVE!" << endl; });
	statusTimer.start();

	initTimer();
}

@mikee47 mikee47 added the Bug label Apr 13, 2024
@mikee47
Copy link
Contributor Author

mikee47 commented Apr 13, 2024

Tested the above application against various updated SDK versions:

tag v3.0.4 b1c14cdb : no change
tag v3.0.5 7b5b35da : no change
branch release/v3.0.5 38fda5f9 : bootloop crashing
branch master : doesn't boot

Inspection shows all of them have the same NmiTimSetFunc implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant