The purpose of this exercise is to explore both the world of embedded development and the Zig programming language.
I chose the bottom-up approach, so the exercises become incrementally more complicated.
Each exercise contains the information I discovered during its implementation and the problems I faced.
- https://github.com/haydenridd/stm32-baremetal-zig - example project
- https://blog.thea.codes/the-most-thoroughly-commented-linker-script/ - about linker script
arm-none-eabi-objdump -D main.elf > objdump
OR
llvm-objdump-15 -D main.elf > objdump
openocd -f board/st_nucleo_f4.cfg
to connect to the board
lldb-15 zig-out/bin/main.elf
to start lldb with the selected executable
(lldb) gdb-remote localhost:3333
to connect to the remote board
- 001_asm_led_minimal
- 002_asm_blink
- 003_asm_led_button
- 004_asm_blink_button
- 011_led_minimal
- 012_blinky_minimal
- 021_led_registers
- 022_led_library
- 023_blink
- 031_usart
- 032_usart_writer
- 041_adc
- 042_adc_fraction
- 051_tim_blink
- 052_tim_output
- 100_regs_blink
- 101_regs_usart
- 200_build_blinky
- 201_build_openocd
- 202_build_lib
- 210_build_protobuf
A good place to start is to implement the minimal possible program. It uses assembly, to begin with the simplest language; assembly will also still be used for startup files later on. At this point I will also try to remove any unused sections, flags etc. They will be added back in later examples if needed.
Files used:
- main.s - contains both the vector table and the code to blink the LED
- linker.ld - linker script file
Before using the build system, the program is built with a command line call:
zig build-exe main.s -target thumb-freestanding-none -mcpu cortex_m4 -O ReleaseSafe -Tlinker.ld --name main.elf --verbose-link --verbose-cc -fstrip -fno-compiler-rt
- -target and -mcpu select the architecture and CPU the code is compiled for
- --verbose-link and --verbose-cc print the compiler and linker invocations (--verbose-cc produces no output if the compilation is cached):
zig clang -fno-caret-diagnostics -target thumb-unknown-unknown-unknown -mcpu=cortex-m4 -ffreestanding -c -o main.o main.s
ld.lld -error-limit=0 --lto-O3 -O3 -z stack-size=16777216 -T linker.ld --gc-sections -m armelf_linux_eabi -Bstatic -o main.elf main.o libc.a --as-needed --allow-shlib-undefined
Both verbose parameters will be omitted in future examples.
- -fstrip omits debug info from the ELF file
- -fno-compiler-rt removes the lazily loaded compiler-rt.a
openocd -f board/st_nucleo_f4.cfg -c "program build/main.elf verify reset exit"
to flash on device
- _start section disappeared
When disassembling the ELF file with llvm-objdump -D main.elf, the .text section was missing from the disassembly. Due to optimization (the linker is invoked with --gc-sections, as the verbose output shows), the code from the main.s file was removed. To solve this problem, the KEEP() keyword needs to be added to prevent this part from being optimized away, so we have KEEP(*(.text)) in the linker script. This caught me off guard because gcc wasn't making such an optimization, probably due to different flags provided to the linker.
- +1 address offset
Cortex-M has a quirk when it uses the Thumb instruction set: a reference to a function label must have its lowest bit set, i.e. be one higher than the actual address. A simple way to do this is to add +1 in the assembly code, e.g. .word _start +1. A more systematic approach is to put the .thumb_func directive before the label of the called function instead of adding +1:
.thumb_func
_start:
- _start not exposed
ld.lld: warning: cannot find entry symbol _start; not setting start address
occurs when building the code. The _start function should be exposed by adding .global in the assembly file:
.global _start
- binary size
While trying to solve the first problem (the disappearing _start section) I dug into the .elf file. While investigating, a question arose: 'why are ELF files so huge in comparison to the actual work being done?'. The file contains .comment, .symtab, .shstrtab and .strtab sections. Apparently only some of the sections are flashed onto the device (I still need to figure out what rules are used for it).
- reset sequence
One last thing to know when starting is the reset sequence. It explains the need for .word _start when writing the code. It is best described by a paragraph from the Cortex-M3/M4 book:
After reset and before the processor starts executing the program, the Cortex-M processors read the first two words from the memory. The beginning of the memory space contains the vector table, and the first two words in the vector table are the initial value for the Main Stack Pointer (MSP), and the reset vector, which is the starting address of the reset handler. After these two words are read by the processor, the processor then sets up the MSP and the Program Counter (PC) with these values.
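The quoted reset sequence can be sketched on the host as a small decoder for the first two words of a flash image. This is only an illustration: the helper names and the example image bytes are made up, and the code assumes little-endian words and a reasonably recent Zig (0.13).

```zig
const std = @import("std");

/// Reads a little-endian 32-bit word from `bytes` starting at `offset`.
fn readWord(bytes: []const u8, offset: usize) u32 {
    return @as(u32, bytes[offset]) |
        (@as(u32, bytes[offset + 1]) << 8) |
        (@as(u32, bytes[offset + 2]) << 16) |
        (@as(u32, bytes[offset + 3]) << 24);
}

/// What the processor picks up from the vector table after reset.
const ResetInfo = struct {
    initial_msp: u32, // first word: initial Main Stack Pointer
    reset_handler: u32, // second word with the Thumb bit cleared
    thumb_bit_set: bool, // lowest bit of the second word
};

fn decodeVectorTable(image: []const u8) ResetInfo {
    const reset_vector = readWord(image, 4);
    return .{
        .initial_msp = readWord(image, 0),
        .reset_handler = reset_vector & ~@as(u32, 1),
        .thumb_bit_set = (reset_vector & 1) == 1,
    };
}

pub fn main() void {
    // Hypothetical image: MSP = 0x20020000, reset vector = 0x08000009.
    const image = [_]u8{ 0x00, 0x00, 0x02, 0x20, 0x09, 0x00, 0x00, 0x08 };
    const info = decodeVectorTable(&image);
    std.debug.print("MSP=0x{x:0>8} reset handler=0x{x:0>8} thumb bit set={}\n", .{
        info.initial_msp,
        info.reset_handler,
        info.thumb_bit_set,
    });
}
```

The second word is the value that .word _start (with the Thumb bit set) places in the vector table.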
- .ARM.attributes
The .ARM.attributes section holds the ARM-specific information objdump needs to decode the instructions. If .ARM.attributes is removed from the binary, the disassembled instructions will be shown as invalid ones.
- sections of objdump that can be omitted:
.comment
.symtab
.strtab
Blinking the onboard LED.
The period is defined by the delay 0xFFFFF, which is about 1 million iterations, and each loop iteration takes a few cycles.
With the 16 MHz default clock speed, this gives us a few blinks per second.
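A rough host-side estimate of the delay, using the numbers above. The cycle count per loop iteration is an assumption (I did not measure it), so treat the result as an order of magnitude only.

```zig
const std = @import("std");

/// Default clock speed after reset, as stated in the text.
const clock_hz: u64 = 16_000_000;
/// Delay counter value used in the exercise.
const delay_count: u64 = 0xFFFFF;

/// Approximate time in milliseconds spent in one delay loop.
/// `cycles_per_iteration` is an assumed value, not a measured one.
fn delayMillis(cycles_per_iteration: u64) u64 {
    return delay_count * cycles_per_iteration * 1000 / clock_hz;
}

pub fn main() void {
    // Assuming 4 cycles per iteration (subtract, compare, taken branch).
    std.debug.print("about {d} ms per LED toggle\n", .{delayMillis(4)});
}
```

With 4 assumed cycles per iteration this is roughly a quarter of a second per toggle, which matches the "few blinks per second" observation.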
- Compare instructions
Compare instructions such as cmp update the condition flags of the program status register (cpsr; called APSR in the Cortex-M documentation), which lets branch instructions such as bne check the result of the comparison and make a decision.
Turn on LED if onboard button is pushed.
- Button State
If button is not pushed, IDR register outputs 1, otherwise 0.
LED starts blinking when onboard button is pushed.
Now, to use Zig, I'll add a main.zig file that holds the logic for turning on the LED.
An important thing is to add export to the main function, as this exposes it to the linker, the same logic as with .global in assembly.
Files used:
- main.s - contains the vector table
- main.zig - code to blink the LED
- linker.ld - linker script file
commands to build and flash program:
zig build-exe main.zig startup.s -target thumb-freestanding-none -mcpu cortex_m4 -O ReleaseSafe -Tlinker.ld --name main.elf --verbose-link --verbose-cc --strip -fno-compiler-rt
openocd -f board/st_nucleo_f4.cfg -c "program main.elf verify reset exit"
- .ARM.exidx missing region
When compiling, the linker fails with the message no memory region specified for section '.ARM.exidx'. This section is needed for 'unwinding the stack', a procedure used for handling exceptions. Given that Zig has no exceptions, the .ARM.exidx section is useless. Nevertheless, for some reason it has to be accounted for in the linker script. To work around this issue, add -fno-unwind-tables to the build command.
- Code from startup file not appearing in disassembly
Either the linker or the compiler removes the startup file part from the object file. Changing the section name from .text to .isr_vector fixed the issue.
Probably the startup file requires a different section name than the main one.
- .c alternative
Ways to access memory in both languages:
.c version:
(*(volatile unsigned int *) (0x12345678)) |= 0x1;
.zig version (old):
@intToPtr(*volatile u32, 0x12345678).* |= 0x1;
.zig version (0.13.0):
(@as(*volatile u32, @ptrFromInt(0x12345678))).* |= 0x1;
Same as 012_blinky_minimal but with blinky led
This exercise adds complexity by introducing memory-mapped structures. In previous examples I was using raw values when accessing memory regions. The usual approach is to map memory into a structure, so that everything is organized and in one place.
Files used:
- main.s - contains the vector table
- main.zig - code to blink the LED
- registers.zig - file with memory-mapped structures
- linker.ld - linker script file
Also added _estack = ORIGIN(SRAM) + LENGTH(SRAM); to the linker script. _estack is a required symbol, and if it is unavailable we get the error error: ld.lld: undefined symbol: _estack.
_estack is the initial stack pointer value, pointing to the end of SRAM memory.
- packed struct
zig 0.10 still has some issues with packed structs, specifically when nesting them together. For this reason registers will have a flat structure for now.
- bit-banding
Accessing a single bit of MMIO is called bit-banding.
Cortex-M4 does support this (Arm Cortex-M4 Processor: About bit-banding), but when Zig (or LLVM) compiles the code, it does so in a way that bit-banding does not work.
So this will not work:
MODER: packed struct {
MODER0: u1,
MODER1: u1,
MODER2: u1,
MODER3: u1,
...
},
The minimum field width should be u8:
MODER: packed struct {
MODER0: u8,
MODER1: u8,
MODER2: u8,
MODER3: u8,
...
},
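For reference, the hardware bit-banding that the Cortex-M4 documentation describes works through an alias region: every bit of the peripheral bit-band region has its own word address. Below is a small sketch of that address calculation. The function name is made up, and the example register (GPIOA ODR at 0x40020014, bit 5 for the onboard LED on PA5) is my assumption for this board; the code does not touch any hardware and assumes `addr` lies inside the bit-band region.

```zig
const std = @import("std");

/// Start of the peripheral bit-band region.
const periph_base: u32 = 0x4000_0000;
/// Start of the alias region that maps to it.
const periph_alias_base: u32 = 0x4200_0000;

/// Address of the alias word for bit `bit` of the 32-bit register at `addr`.
/// Each bit gets its own word: 32 alias bytes per register byte, 4 per bit.
fn bitBandAlias(addr: u32, bit: u5) u32 {
    const byte_offset = addr - periph_base;
    return periph_alias_base + byte_offset * 32 + @as(u32, bit) * 4;
}

pub fn main() void {
    // Assumed example: GPIOA ODR, bit 5.
    std.debug.print("alias word: 0x{x:0>8}\n", .{bitBandAlias(0x4002_0014, 5)});
}
```

On the device, writing 1 or 0 to that alias word through a *volatile u32 would set or clear only that bit, without a read-modify-write in software.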
Move every hex literal and bit shift from right side of =
to registers.zig
to have cleaner code in main:
regs.RCC.AHB1ENR |= 0x1;
--> regs.RCC.AHB1ENR |= regs.RCC_AHB1ENR_GPIOAEN;
In this exercise I'll explore the concept of using variables and loops to count down the blink time.
- Optimization
When a simple loop while (count > 0) : (count -= 1) {} is compiled, the disassembly looks weird and the program does not perform as expected: the empty loop is optimized away and only back-to-back LED toggles remain. Objdump:
8000040: 6801 ldr r1, [r0]
8000042: f081 0120 eor r1, r1, #32
8000046: 6001 str r1, [r0]
8000048: 6801 ldr r1, [r0]
800004a: f081 0120 eor r1, r1, #32
800004e: 6001 str r1, [r0]
8000050: 6801 ldr r1, [r0]
8000052: f081 0120 eor r1, r1, #32
8000056: 6001 str r1, [r0]
8000058: 6801 ldr r1, [r0]
800005a: f081 0120 eor r1, r1, #32
800005e: 6001 str r1, [r0]
To resolve this issue, a line that prevents the optimization has to be added:
while (count > 0) : (count -= 1) {
@import("std").mem.doNotOptimizeAway(count);
}
And the trick to turn these 2 variables into an array (slice) of bytes:
const base: [*]u8 = @ptrFromInt(_heap_start);
const heap = base[0.._heap_end - _heap_start];
Where [*]u8 is a many-item pointer to an unknown number of items.
- Heap Allocation Strategies
Apparently heap allocation is a complex topic which does not have one clear solution: an allocator is either simple but very specific, or broad (general, able to fit different problems) but quite complex in its implementation and consequently slower than a simple solution.
USART/UART (Universal Synchronous/Asynchronous Receiver-Transmitter)
there are 6 available in STM32F446RE (USART1, USART2, USART3, USART6, UART4, UART5)
Sending char (u8) through USART.
Data is received in a terminal via screen /dev/ttyACM0 115200, where 115200 is the baud rate.
This example is limited to sending only one char (u8).
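The baud rate also has to be configured on the device side through the BRR register. A minimal sketch of how that value could be derived, assuming the peripheral runs from the 16 MHz default clock and uses oversampling by 16 (in that mode the BRR value is simply clock divided by baud rate); the function name is made up:

```zig
const std = @import("std");

/// BRR value for oversampling by 16: clock / baud, rounded to nearest.
fn brrFor(clock_hz: u32, baud: u32) u32 {
    return (clock_hz + baud / 2) / baud;
}

pub fn main() void {
    // Assumed setup: 16 MHz peripheral clock, 115200 baud.
    std.debug.print("BRR = 0x{x}\n", .{brrFor(16_000_000, 115_200)});
}
```

For 16 MHz and 115200 baud this gives 0x8B (mantissa 8, fraction 11), i.e. a divider of about 8.69.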
Files used:
main.s
main.zig
registers.zig
linker.ld
The point of this example is to show how to send any type of data through USART.
To achieve this goal, the writer from the standard library (std.io.Writer) can be used.
It allows using the print method to convert any type to a string and send it through.
Raw values are also replaced with constants from the registers file.
- Sending String
The writer function is required to return the number of bytes written. It is very important to return the actual number of bytes sent, otherwise it will mess up the sending.
- USART type
Data is sent as []u8, so a transformation from u32 to []u8 is required. It is done via the print function.
Convert input analog signal to digital, result is sent to USART.
Files used:
main.s
main.zig
registers.zig
linker.ld
- building this program requires the compiler-rt libraries.
The command will look like this after removing -fno-compiler-rt:
zig build-exe main.zig startup.s -target thumb-freestanding-none -mcpu cortex_m4 -O ReleaseSafe -Tlinker.ld --name main.elf -fstrip
Convert input analog signal to digital, result is sent to USART. Before sending, the result from the register is scaled to the range [0..1000] instead of [0..4095].
Files used:
main.s
main.zig
registers.zig
linker.ld
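The scaling from [0..4095] to [0..1000] can be sketched with plain integer arithmetic. This is a host-side illustration with a made-up function name, not necessarily the exact code in main.zig:

```zig
const std = @import("std");

/// Scales a 12-bit ADC reading (0..4095) to the range 0..1000.
/// Multiplies first, then divides, so no precision is lost before the division.
fn scaleAdc(raw: u12) u32 {
    return @as(u32, raw) * 1000 / 4095;
}

pub fn main() void {
    std.debug.print("{d} {d} {d}\n", .{ scaleAdc(0), scaleAdc(2048), scaleAdc(4095) });
}
```

Widening to u32 before multiplying matters here: 4095 * 1000 does not fit into 12 or 16 bits.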
Using general purpose timers, instead of a loop, to blink the onboard LED. The timer is configured by the prescaler and auto reload registers.
- prescaler (PSC) - value which is used to divide the clock speed (16MHz/(PSC + 1))
- auto reload register (ARR) - the value the counter counts up to (when counting down, the start value)
After the timer has counted from 0 to ARR, the update flag has to be cleared by writing the SR register (first bit).
Files used:
main.s
main.zig
registers.zig
linker.ld
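How PSC and ARR together define the blink rate can be sketched as a small calculation. It assumes an up-counting timer fed by the 16 MHz default clock; the function name and the example register values are made up for illustration.

```zig
const std = @import("std");

/// Update (overflow) frequency of an up-counting timer in millihertz.
/// The hardware divides by PSC + 1 and then counts ARR + 1 ticks per period.
fn updateFrequencyMilliHz(clock_hz: u64, psc: u16, arr: u32) u64 {
    return clock_hz * 1000 / ((@as(u64, psc) + 1) * (@as(u64, arr) + 1));
}

pub fn main() void {
    // Assumed values: 16 MHz / 16000 = 1 kHz tick, 1000 ticks per period.
    const mhz = updateFrequencyMilliHz(16_000_000, 15_999, 999);
    std.debug.print("update frequency: {d} mHz\n", .{mhz});
}
```

With these example values the update event fires once per second, so toggling the LED on every update gives a 0.5 Hz blink.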
Configure TIM to blink the LED not by software, but automatically by outputting the TIM signal straight to pin 5 (PA5). PA5 needs to be configured as an alternate function for TIM2. This can also be considered PWM mode.
Changing the pulse width with a potentiometer.
Move ADC and USART enabling to separate functions.
For this example, the approach with a generated mmio file is taken.
To generate the file we need a source file and a tool. An .svd file is used as the source;
one can be found in this repository: https://github.com/cmsis-svd/cmsis-svd-data.
As the parser, use regs from microzig: https://github.com/ZigEmbeddedGroup/microzig.
The generated file also requires microzig as a dependency; this can be avoided by adding the Mmio function
to the generated file:
const mmio = struct {
pub fn Mmio(comptime PackedT: type) type {
...
...
}
...
...
};
pub const types = struct {
pub const peripherals = struct {
/// Digital camera interface
pub const DCMI = extern struct {
/// control register 1
CR: mmio.Mmio(packed struct(u32) {
/// Capture enable
CAPTURE: u1,
...
}
...
}
...
}
...
};
The 0.13.0 version of mmio also had a bugged .toggle, which can be fixed by changing the for loop,
where the commented lines are the old ones.
pub inline fn toggle(addr: *volatile Self, fields: anytype) void {
var val = read(addr);
inline for (fields) |field| {
// inline for (@typeInfo(@TypeOf(fields)).Struct.fields) |field| {
@field(val, @tagName(field)) = if (@field(val, @tagName(field)) == 1) 0 else 1;
// @field(val, @tagName(field.default_value.?)) = !@field(val, @tagName(field.default_value.?));
}
write(addr, val);
}
same as 032_usart_writer but with using generated regs
while(periph.USART2.SR.read().TXE == 0) {}
periph.USART2.DR.modify(.{ .DR = byte });
// while((regs.USART2.SR & regs.USART_SR_TXE) != regs.USART_SR_TXE) {}
// regs.USART2.DR = byte;
The read() method of MMIO returns the full struct with the actual bits:
pub inline fn read(addr: *volatile Self) PackedT {
return @bitCast(addr.raw);
}
...
AHB2RSTR: Mmio(packed struct(u32) {
DCMIRST: u1,
reserved7: u6,
OTGFSRST: u1,
padding: u24,
}),
- The equivalent of regs.USART2.BRR = value is periph.USART2.BRR.write_raw(value)
Introduce Zig's build system into our flow, starting with a simple blinky program.
The build.zig is derived from the command:
zig build-exe main.zig startup.s -target thumb-freestanding-none -mcpu cortex_m4 -O ReleaseSafe -Tlinker.ld --name main.elf -fstrip -fno-compiler-rt
build-exe main.zig
b.addExecutable(.{
.root_source_file = b.path("main.zig"),
});
--name main.elf
b.addExecutable(.{
.name = "main.elf",
});
-target thumb-freestanding-none -mcpu cortex_m4
const target = b.resolveTargetQuery(.{
.cpu_arch = .thumb,
.os_tag = .freestanding,
.abi = .eabi,
.cpu_model = std.zig.CrossTarget.CpuModel{
.explicit = &std.Target.arm.cpu.cortex_m4
},
});
-O ReleaseSafe
const exe = b.addExecutable(.{
.optimize = .ReleaseSafe
});
startup.s
exe.addAssemblyFile(b.path("startup.s"));
-Tlinker.ld
exe.setLinkerScript(b.path("linker.ld"));
-fstrip
b.addExecutable(.{
.strip = true,
});
-fno-compiler-rt
exe.bundle_compiler_rt = false;
And the final result will be:
pub fn build(b: *std.Build) void {
const target = b.resolveTargetQuery(.{
.cpu_arch = .thumb,
.os_tag = .freestanding,
.abi = .eabi,
.cpu_model = std.zig.CrossTarget.CpuModel{
.explicit = &std.Target.arm.cpu.cortex_m4,
},
});
const exe = b.addExecutable(.{
.name = "main.elf",
.root_source_file = b.path("main.zig"),
.target = target,
.optimize = .ReleaseSafe,
.strip = true,
});
exe.addAssemblyFile(b.path("startup.s"));
exe.setLinkerScript(b.path("linker.ld"));
exe.bundle_compiler_rt = false;
b.install_prefix = "";
b.installArtifact(exe);
}
Added a step to the build system that lets the user call the openocd
command: openocd -f board/st_nucleo_f4.cfg -c "program zig-out/bin/main.elf verify reset exit"
zig build
zig build openocd
const run_cmd = b.addSystemCommand(&[_][]const u8{"openocd", "-f", "board/st_nucleo_f4.cfg", "-c", "program zig-out/bin/main.elf verify reset exit"});
b.step("openocd", "runs openocd to flash file into the board").dependOn(&run_cmd.step);
This requires defining every argument of openocd one by one.
Instead of using the STM32F446.zig file inside the directory of the example, use the one from the parent directory.
This requires adding a module in the build system.
const mmio_mod = b.addModule("mmio", .{
.root_source_file = b.path("../STM32F446.zig"),
});
exe.root_module.addImport("mmio", mmio_mod);
Structure:
/202_build_lib
- main.zig
- build.zig
STM32F446.zig
Sending protobuf structure over USART
Project structure
- 210_build_protobuf
- build.zig
- build.zig.zon // external dependencies
- linker.ld
- main.zig
- simple.pb.zig // generated protobuf schema
- lib
- cobs.zig // algorithm for encoding data
- protobuf
- simple
- simple.proto // protobuf schema
For this example a build.zig.zon file is added for external dependencies.
Protobuf is a good protocol for communication between embedded devices, as its binary nature
saves work by sending less data.
The downside of using protobuf is that it requires a shared schema.
Also, protobuf has no way to mark the start or end of a message,
so when sending it over USART we need some sort of framing around each message.
To deal with this problem we can use COBS (Consistent Overhead Byte Stuffing);
it is implemented as a library, libs/cobs, so it can be reused in other examples.
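To show what COBS does to a message, here is a minimal encoder sketch. It is not the code from libs/cobs; the function name and buffer handling are my own choices for illustration. The encoded output contains no zero bytes, so a single 0x00 sent after it can serve as the end-of-message marker.

```zig
const std = @import("std");

/// Encodes `input` into `output` and returns the encoded slice
/// (without the trailing 0x00 delimiter).
/// `output` must hold at least input.len + input.len / 254 + 1 bytes.
fn cobsEncode(input: []const u8, output: []u8) []u8 {
    var code_index: usize = 0; // position of the current code byte
    var write_index: usize = 1; // next free position in output
    var code: u8 = 1; // distance from the code byte to the next zero

    for (input) |byte| {
        if (byte == 0) {
            // Close the current block; the zero itself is not copied.
            output[code_index] = code;
            code_index = write_index;
            write_index += 1;
            code = 1;
        } else {
            output[write_index] = byte;
            write_index += 1;
            code += 1;
            if (code == 0xFF) {
                // Block is full (254 data bytes), start a new one.
                output[code_index] = code;
                code_index = write_index;
                write_index += 1;
                code = 1;
            }
        }
    }
    output[code_index] = code;
    return output[0..write_index];
}

pub fn main() void {
    const message = [_]u8{ 0x11, 0x22, 0x00, 0x33 };
    var buffer: [8]u8 = undefined;
    for (cobsEncode(&message, &buffer)) |byte| {
        std.debug.print("{x:0>2} ", .{byte});
    }
    std.debug.print("\n", .{});
}
```

On the wire the frame would be the encoded bytes followed by one 0x00; the receiver collects bytes until it sees that zero and then decodes.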
https://github.com/Arwalk/zig-protobuf is used for generating and serializing.
To generate the protobuf schema simple.pb.zig we need to add a script to build.zig:
// make 'protobuf' dependency as a module
const protobuf_dep = b.dependency("protobuf", .{
.target = target,
.optimize = optimize,
});
exe.root_module.addImport("protobuf", protobuf_dep.module("protobuf"));
// add protobuf generation command
const gen_proto = b.step("gen-proto", "generates zig files from protocol buffer definitions");
const protoc_step = protobuf.RunProtocStep.create(b, protobuf_dep.builder, b.standardTargetOptions(.{}), .{
// out directory for the generated zig files
.destination_directory = b.path("."),
.source_files = &.{
"../protobuf/simple/simple.proto",
},
.include_directories = &.{},
});
gen_proto.dependOn(&protoc_step.step);
To add libs/cobs.zig:
// add 'cobs' as module
const cobs_mod = b.addModule("cobs", .{
.root_source_file = b.path("../libs/cobs.zig"),
});
exe.root_module.addImport("cobs", cobs_mod);
var fba = std.heap.FixedBufferAllocator.init(heap);
const allocator = fba.allocator();
is a way to initialize the allocator. FixedBufferAllocator is a good fit for simple tasks, as it is simple itself and has no overhead in its allocation logic.
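A small host-side sketch of how FixedBufferAllocator behaves: it hands out memory from the given buffer and fails once the buffer is exhausted. A plain array stands in for the heap slice here, and the helper function is made up for the demonstration.

```zig
const std = @import("std");

/// Returns how many of the requested allocations succeed from `buffer`.
fn countSuccessful(buffer: []u8, sizes: []const usize) usize {
    var fba = std.heap.FixedBufferAllocator.init(buffer);
    const allocator = fba.allocator();
    var ok: usize = 0;
    for (sizes) |size| {
        // Fails with error.OutOfMemory when the buffer is used up.
        _ = allocator.alloc(u8, size) catch break;
        ok += 1;
    }
    return ok;
}

pub fn main() void {
    // Stand-in for the heap region defined in the linker script.
    var backing: [64]u8 = undefined;
    const sizes = [_]usize{ 16, 32, 32 };
    std.debug.print("{d} of {d} allocations fit\n", .{ countSuccessful(&backing, &sizes), sizes.len });
}
```

The third request does not fit into the remaining 16 bytes, so the allocator reports an error instead of growing.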
To retrieve the data on the other end of USART,
- exporting symbols from linker script
It is possible to pass variables from the linker script (.ld) to our program (.zig).
Keep in mind that a symbol exported from the linker script holds an address, not a value.
_heap_start = ORIGIN(SRAM);
_heap_end = _heap_start + 0x8000;
extern var _heap_start: anyopaque;
extern var _heap_end: anyopaque;
pub export fn _start() void {
const heap_start = @intFromPtr(&_heap_start);
const heap_size = @intFromPtr(&_heap_end);
You can also do it in the SECTIONS region.
.bss : {
*(.bss)
*(.bss*)
. = ALIGN(4);
PROVIDE(_heap_start = .);
PROVIDE(_heap_size = 0x8000);
} > SRAM
- error: ld.lld: undefined symbol: __aeabi_memset when building the executable.
Caused by exe.bundle_compiler_rt = false; in the build file; removing that line solves the problem.
- When defining _heap_start and _heap_end I put them into SECTIONS, but outside of the .bss output section:
.bss : {
*(.bss)
*(.bss*)
} > SRAM
. = ALIGN(4);
PROVIDE(_heap_start = .);
PROVIDE(_heap_size = 0x8000);
The result was that the heap memory was placed in FLASH and couldn't be modified, as FLASH memory is rx (read, execute) only.
- floating point
@intToPtr(*volatile u32, 0xE000ED88).* = ((3 << 10*2)|(3 << 11*2));
Send protobuf data over USART using microzig regs.
The writer requires an update for sending data.