A lightweight threading library for microcontrollers

CopyThreads: A lightweight threading library (not only for microcontrollers)

tl;dr: Create threads with Scheduler.start(start_func, arg), start concurrent loops with Scheduler.startLoop(loopX), and call Scheduler.yield() instead of blocking.

The main idea of CopyThreads is to copy the stack to the heap when switching to another thread. This is where the name "CopyThreads" comes from.
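The idea can be illustrated without touching the real CPU stack. The following is a minimal sketch, not the library's actual implementation: it treats a byte array as a downward-growing stack and copies only the used part (from the current stack pointer up to the stack base) to the heap and back. All names (saved_stack, stack_save, stack_restore) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Heap copy of the used part of a stack (hypothetical type, for illustration). */
struct saved_stack {
	unsigned char *data;	/* heap buffer holding the copied bytes */
	size_t size;		/* number of stack bytes in use at save time */
};

/* Copy the used region [sp, base) of a downward-growing stack to the heap.
 * Returns 0 on success, -1 if malloc fails. */
int stack_save(struct saved_stack *s, const unsigned char *sp,
               const unsigned char *base)
{
	assert(sp <= base);
	s->size = (size_t)(base - sp);
	s->data = malloc(s->size ? s->size : 1);
	if (!s->data)
		return -1;
	memcpy(s->data, sp, s->size);
	return 0;
}

/* Copy the saved bytes back so that they end at base again, then free
 * the heap copy. Returns the restored stack pointer. */
unsigned char *stack_restore(struct saved_stack *s, unsigned char *base)
{
	unsigned char *sp = base - s->size;

	memcpy(sp, s->data, s->size);
	free(s->data);
	s->data = NULL;
	return sp;
}
```

A thread that uses only a few bytes of stack at the moment it yields therefore occupies only a few bytes of heap while it is suspended. A real implementation additionally has to save and restore the CPU registers, which this sketch leaves out.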

CopyThreads is open source and hosted on GitHub. While it is at a very early stage with much room for improvement, it already seems to work reliably. Still, use at your own risk! It should work on every architecture with a stack growing downwards. So far it has been tested on x86_64 and atmega328 (Arduino) only. Comments, bug reports, improvements, and pull requests are very welcome!


Usually, every thread has its own stack. Because the maximum stack size is mostly known only roughly, you have to choose a conservative size plus a safety margin for every thread, and that size has to fit the worst-case thread. This means that tiny threads and even stackless threads consume much more stack memory than necessary. In RAM-constrained environments such as microcontrollers, this mostly prevents the use of threads for concurrent programming.
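The trade-off can be put into numbers. The sketch below compares the RAM needed for one fixed worst-case stack per thread with a rough upper bound for the copy approach: one shared stack plus, per thread, the bytes actually in use at yield time and a fixed bookkeeping overhead. The function names and the cost model are assumptions made for illustration, not measurements of CopyThreads.

```c
#include <assert.h>
#include <stddef.h>

/* RAM needed when every thread gets its own stack sized for the worst case. */
size_t ram_fixed_stacks(size_t nthreads, size_t worst_case)
{
	return nthreads * worst_case;
}

/* Rough upper bound for the copy approach (assumed model): one shared stack
 * of worst-case size, plus for each thread the stack bytes in use when it
 * yields and a fixed per-thread overhead. */
size_t ram_copied_stacks(const size_t *used, size_t nthreads,
                         size_t worst_case, size_t overhead)
{
	size_t total = worst_case;	/* the single shared stack */
	size_t i;

	assert(used != NULL || nthreads == 0);
	for (i = 0; i < nthreads; i++)
		total += used[i] + overhead;
	return total;
}
```

With hypothetical values of 4 threads, a 256-byte worst case, 16, 24, 32 and 48 bytes in use at yield time, and the roughly 60 bytes of per-thread overhead quoted for AVR MCUs in this README, the fixed layout needs 1024 bytes, while the copy approach stays at or below 616 bytes.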

Keywords: Green Threads, cooperative threads, cooperative multitasking, non-preempted

Arduino usage

Include the CopyThreads header in your project:

#include <Cth.h>

Start your concurrent loop() functions from within setup() with Scheduler.startLoop(loopX). If defined, the traditional loop() is also called in a loop without an explicit startLoop(loop).

void setup() {
	Scheduler.startLoop(loopX);
}

Run-once functions can be implemented as threads. A thread entry is a void function with at most one argument that fits inside a void pointer. It is started with Scheduler.start(athread, some_arg) and runs until the function returns. You can start additional threads at any time.

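/* Will print:
 * "Now and then."
 */
void loopY() {
	// print in 1sec:
	Scheduler.start(threadPrintDelayed, "then.");
	Serial.print("Now and ");
	// pause before the next round (assumed value):
	Scheduler.delay(2000);
}

void threadPrintDelayed(const char *text) {
	Scheduler.delay(1000);
	Serial.println(text);
}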

Inside loopX() and inside threads, call ONLY non-blocking functions (e.g. Stream.available(), but NOT Stream.readString() without setTimeout(0)). Time-consuming or busy-waiting loops should call Scheduler.yield() to give CPU time to other threads. delay() is allowed, as it implicitly calls Scheduler.yield(). However, prefer Scheduler.delay(), because the overhead of switching between waiting threads is much smaller with it.

Here is an example of a blinking LED loop:

void loopBlink(void) {
	digitalWrite(LED_PIN, HIGH);

	// Progress with other threads,
	// and continue in 1000ms:
	Scheduler.delay(1000);

	digitalWrite(LED_PIN, LOW);
	// keep the LED off for the same time (assumed):
	Scheduler.delay(1000);
}


Waiting for input from the serial interface, that is, until Serial.available() returns a non-zero value:

void loopSerial(void) {
	// Wait for Serial.available(), letting other threads run meanwhile.
	// Scheduler.wait_available() serves the same purpose:
	while (!Serial.available()) Scheduler.yield();

	// Print as numeric value
	int ch = Serial.read();
	Serial.println(ch);
}

In general, use Scheduler.wait() to wait for something:

unsigned long something = 0;

int compareWith(unsigned long arg) {
	return something == arg;
}

int buttonXPressed(void) {
	return !digitalRead(BUTTONX);
}

void LoopOrThread(void) {
	// Wait with a C function, e.g. until a global variable reaches 5:
	Scheduler.wait(compareWith, 5);
	// Wait with a C function, e.g. until a button is pressed:
	Scheduler.wait(buttonXPressed);
	// Don't wait, but give other threads the chance to proceed,
	// e.g. busy wait for input:
	while (!Serial.available()) Scheduler.yield();
}
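How Scheduler.wait() works internally is not shown here. Conceptually, waiting for a condition in a cooperative system amounts to yielding until a predicate returns non-zero. The following plain C sketch works under that assumption, with a stub yield function standing in for the real context switch. The names wait_until, predicate_fn, yield_fn, compare_with and stub_yield are hypothetical and not part of the CopyThreads API.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*predicate_fn)(unsigned long arg);
typedef void (*yield_fn)(void);

/* Yield repeatedly until pred(arg) returns non-zero.
 * Returns how many times the caller had to yield. */
unsigned wait_until(predicate_fn pred, unsigned long arg, yield_fn yield)
{
	unsigned yields = 0;

	assert(pred != NULL && yield != NULL);
	while (!pred(arg)) {
		yield();
		yields++;
	}
	return yields;
}

/* Demo state, modelled on the compareWith() example. */
unsigned long something = 0;

int compare_with(unsigned long arg)
{
	return something == arg;
}

/* Stub for the context switch: pretends that another thread ran
 * and advanced the shared counter by one. */
void stub_yield(void)
{
	something++;
}
```

Calling wait_until(compare_with, 5, stub_yield) returns once something has reached 5. In a real program the yield function would switch to the other threads, and one of them would change the value.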


Plain C

Include the CopyThreads header cthread.h in your project. Define a thread as a void function with one void * argument. Yield with cth_yield() to switch to the next thread. Start a thread with cth_start(athread, an_arg). Run all threads with cth_run().

#include <stdio.h>
#include <cthread.h>

void athread(void *arg) {
	char *name = (char *)arg;
	unsigned i;

	for (i = 0; i < 4; i++) {
		printf("Thread %s loop %u\n", name, i);
		/* let the next thread run */
		cth_yield();
	}
}

int main(int argc, char **argv)
{
	cth_start(athread, "th1");
	cth_start(athread, "th2");

	/* returns when all threads have finished */
	cth_run();

	printf("All threads are done\n");
	return 0;
}



Features

  • Pure C core.
  • C++ API intended for Arduino projects.
  • Small RAM overhead: about 60 bytes per thread on AVR MCUs (ToDo: measure).
  • Threads with large stacks are allowed if needed, but should be avoided.
  • Only the used part of the stack is copied when switching the context. Tiny threads require only a little heap memory.
  • Yield from anywhere in the call stack.


Drawbacks

  • A context switch might be expensive (in CPU time) for "big" threads.
  • References to stack memory (local variables) are only valid in the current thread.
  • Overall memory usage is hard to predict. The heap might get very fragmented.


Recommendations

  • Avoid too many local variables. They live on the stack.
  • Avoid yielding (context switching) deep inside a call tree.
  • Save complex state on the heap and hold only a reference to it on the stack.
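The last point can be sketched in plain C. Instead of declaring a large buffer as a local variable, a thread allocates its state once on the heap and keeps only a pointer on the stack, so a context switch has to copy a pointer rather than the whole buffer. The type and function names below (line_state, line_state_new, line_state_put, line_state_free) are hypothetical examples.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define LINE_MAX_LEN 128

/* Large per-thread state kept on the heap (hypothetical example type). */
struct line_state {
	char buf[LINE_MAX_LEN];	/* NUL-terminated text collected so far */
	size_t len;		/* number of characters in buf */
};

/* Allocate zero-initialised state; returns NULL if out of memory. */
struct line_state *line_state_new(void)
{
	return calloc(1, sizeof(struct line_state));
}

/* Append one character. Returns 0 on success, -1 if the buffer is full. */
int line_state_put(struct line_state *st, char c)
{
	assert(st != NULL);
	if (st->len + 1 >= LINE_MAX_LEN)
		return -1;
	st->buf[st->len++] = c;
	st->buf[st->len] = '\0';
	return 0;
}

void line_state_free(struct line_state *st)
{
	free(st);
}
```

A thread function would hold just `struct line_state *st = line_state_new();` as a local variable and release it with line_state_free(st) before returning.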

Missing features / ToDos

  • A scheduler. Currently all threads in state "running" are resumed round robin.
  • Direct context switch from current thread to the next thread without an intermediate longjmp to cth_run().
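Round-robin resumption, as mentioned in the first item, can be sketched as a search for the next runnable entry in a thread table. This is an illustration only and not the library's code: the state names other than "running" and the function next_running are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Thread states; only "running" is named in this README, the others
 * are hypothetical. */
enum th_state { TH_UNUSED, TH_RUNNING, TH_WAITING };

/* Return the index of the next thread in state TH_RUNNING after `current`,
 * wrapping around the table. The current thread itself is checked last.
 * Returns -1 if no thread is runnable. */
int next_running(const enum th_state *states, size_t n, size_t current)
{
	size_t step;

	assert(states != NULL || n == 0);
	for (step = 1; step <= n; step++) {
		size_t i = (current + step) % n;

		if (states[i] == TH_RUNNING)
			return (int)i;
	}
	return -1;
}
```

Every runnable thread gets its turn in table order, without priorities. A fuller scheduler could replace this function while leaving the rest of the run loop unchanged.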

Alternative approaches for concurrent programming
