hasu@tmk edited this page May 12, 2021 · 16 revisions

Delay

https://www.nongnu.org/avr-libc/user-manual/group__util__delay.html

_delay_ms(double __ms)

The maximal possible delay is 262.14 ms / F_CPU in MHz. When the user requests a delay that exceeds this maximum, _delay_ms() falls back to a reduced resolution. In this mode _delay_ms() works with a resolution of 1/10 ms, providing delays up to 6.5535 seconds (independent of the CPU frequency). The user is not informed about the decreased resolution.

When F_CPU is 16000000(16MHz):

  • 16.38375ms at max with full resolution (262.14 ms / 16)
  • 6553.5ms at max with 1/10 ms resolution

This describes the old implementation, which is kept for backward compatibility.
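To make the two legacy modes concrete, here is a minimal host-side sketch that mirrors the branch structure of the backward-compatible implementation. It is not avr-libc code: `legacy_delay_ms_plan`, `struct legacy_plan`, `legacy_full_res_max_ms` and the explicit `f_cpu_hz` parameter are hypothetical names introduced here so the arithmetic can be checked on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of the legacy _delay_ms() arithmetic.
 * It only computes what the AVR code would ask _delay_loop_2() to do. */
enum legacy_mode { LEGACY_FULL_RES, LEGACY_TENTH_MS };

struct legacy_plan {
    enum legacy_mode mode;
    uint16_t ticks;   /* loop count (full res) or number of 0.1 ms steps */
};

struct legacy_plan legacy_delay_ms_plan(double f_cpu_hz, double ms)
{
    assert(f_cpu_hz > 0.0 && ms >= 0.0);
    struct legacy_plan p = { LEGACY_FULL_RES, 1 };
    double tmp = (f_cpu_hz / 4e3) * ms;    /* 4-cycle loop iterations */

    if (tmp < 1.0) {
        p.ticks = 1;                       /* shortest possible delay */
    } else if (tmp > 65535.0) {
        /* This model stops at the documented 6553.5 ms limit. */
        assert(ms * 10.0 <= 65535.0);
        p.mode = LEGACY_TENTH_MS;
        p.ticks = (uint16_t)(ms * 10.0);   /* truncated to 0.1 ms steps */
    } else {
        p.ticks = (uint16_t)tmp;
    }
    return p;
}

/* Longest request that still uses full resolution, in milliseconds. */
double legacy_full_res_max_ms(double f_cpu_hz)
{
    assert(f_cpu_hz > 0.0);
    return 65535.0 / (f_cpu_hz / 4e3);
}
```

With `f_cpu_hz = 16000000.0`, a 10 ms request stays in full resolution (40000 loop iterations), while a 20 ms request falls into the 1/10 ms mode (200 steps).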

If the avr-gcc toolchain has __builtin_avr_delay_cycles() support, the maximal possible delay is 4294967.295 ms / F_CPU in MHz. For values greater than the maximal possible delay, the overflow results in no delay, i.e. 0ms.

When F_CPU is 16000000(16MHz):

  • 268435.4559375ms at max

This is the default now unless the __DELAY_BACKWARD_COMPATIBLE__ macro is defined.
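Because an out-of-range request silently becomes no delay, it can help to check the range before relying on a long delay. The sketch below is a host-side illustration of that limit. `builtin_delay_max_ms` and `builtin_delay_ms_in_range` are hypothetical helpers, not part of avr-libc, and they take the clock frequency as a parameter instead of using the F_CPU macro.

```c
#include <assert.h>
#include <stdint.h>

/* Longest _delay_ms() argument whose cycle count still fits in 32 bits. */
double builtin_delay_max_ms(double f_cpu_hz)
{
    assert(f_cpu_hz > 0.0);
    return (double)UINT32_MAX / (f_cpu_hz / 1e3);
}

/* Returns 1 if the request fits in a 32-bit cycle count, 0 otherwise. */
int builtin_delay_ms_in_range(double f_cpu_hz, double ms)
{
    assert(f_cpu_hz > 0.0);
    double abs_ms = (ms < 0.0) ? -ms : ms;       /* the real code uses fabs() */
    double cycles = (f_cpu_hz / 1e3) * abs_ms;   /* cycles per ms times ms */
    return cycles <= (double)UINT32_MAX;
}
```

At 16 MHz the first function evaluates to roughly 268435.456 ms, so a 1000 ms request is in range and a 300000 ms request is not.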

void
_delay_ms(double __ms)
{
        double __tmp ;
#if __HAS_DELAY_CYCLES && defined(__OPTIMIZE__) && \
  !defined(__DELAY_BACKWARD_COMPATIBLE__) &&       \
  __STDC_HOSTED__
        uint32_t __ticks_dc;
        extern void __builtin_avr_delay_cycles(unsigned long);
        __tmp = ((F_CPU) / 1e3) * __ms;

        #if defined(__DELAY_ROUND_DOWN__)
                __ticks_dc = (uint32_t)fabs(__tmp);

        #elif defined(__DELAY_ROUND_CLOSEST__)
                __ticks_dc = (uint32_t)(fabs(__tmp)+0.5);

        #else
                //round up by default
                __ticks_dc = (uint32_t)(ceil(fabs(__tmp)));
        #endif

        __builtin_avr_delay_cycles(__ticks_dc);

#else
        uint16_t __ticks;
        __tmp = ((F_CPU) / 4e3) * __ms;
        if (__tmp < 1.0)
                __ticks = 1;
        else if (__tmp > 65535)
        {
                //      __ticks = requested delay in 1/10 ms
                __ticks = (uint16_t) (__ms * 10.0);
                while(__ticks)
                {
                        // wait 1/10 ms
                        _delay_loop_2(((F_CPU) / 4e3) / 10);
                        __ticks --;
                }
                return;
        }
        else
                __ticks = (uint16_t)__tmp;
        _delay_loop_2(__ticks);
#endif
}
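The three rounding modes in the listing above differ only in how a fractional cycle count becomes an integer. A minimal host-side sketch of that step follows. `delay_cycles` and `enum delay_round` are hypothetical names, the frequency is passed as a parameter instead of F_CPU, and the ceiling is computed by hand so the example does not depend on the math library.

```c
#include <assert.h>
#include <stdint.h>

enum delay_round { ROUND_DOWN, ROUND_CLOSEST, ROUND_UP };

/* Host-side model of the cycle count handed to __builtin_avr_delay_cycles().
 * Assumes the result fits in 32 bits. */
uint32_t delay_cycles(double f_cpu_hz, double ms, enum delay_round mode)
{
    assert(f_cpu_hz > 0.0);
    double tmp = (f_cpu_hz / 1e3) * ms;
    if (tmp < 0.0)
        tmp = -tmp;                        /* same effect as fabs() */
    assert(tmp < 4294967295.0);

    uint32_t down = (uint32_t)tmp;         /* truncation toward zero */
    switch (mode) {
    case ROUND_DOWN:
        return down;                       /* __DELAY_ROUND_DOWN__ */
    case ROUND_CLOSEST:
        return (uint32_t)(tmp + 0.5);      /* __DELAY_ROUND_CLOSEST__ */
    default:
        /* default: round up, same effect as ceil() */
        return ((double)down < tmp) ? down + 1u : down;
    }
}
```

At 16 MHz, a request of 0.00008 ms corresponds to 1.28 cycles, which becomes 1 cycle when rounding down or to the closest value and 2 cycles with the default round-up.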

_delay_us(double __us)

The maximal possible delay is 768 us / F_CPU in MHz.

When F_CPU is 16000000(16MHz):

  • 48us at max

This describes the old implementation, which is kept for backward compatibility.

If the avr-gcc toolchain has __builtin_avr_delay_cycles() support, the maximal possible delay is 4294967.295 us / F_CPU in MHz. For values greater than the maximal possible delay, the overflow results in no delay, i.e. 0us.

When F_CPU is 16000000(16MHz):

  • 268435.4559375us at max

This is the default now unless the __DELAY_BACKWARD_COMPATIBLE__ macro is defined.

void
_delay_us(double __us)
{
        double __tmp ;
#if __HAS_DELAY_CYCLES && defined(__OPTIMIZE__) && \
  !defined(__DELAY_BACKWARD_COMPATIBLE__) &&       \
  __STDC_HOSTED__
        uint32_t __ticks_dc;
        extern void __builtin_avr_delay_cycles(unsigned long);
        __tmp = ((F_CPU) / 1e6) * __us;

        #if defined(__DELAY_ROUND_DOWN__)
                __ticks_dc = (uint32_t)fabs(__tmp);

        #elif defined(__DELAY_ROUND_CLOSEST__)
                __ticks_dc = (uint32_t)(fabs(__tmp)+0.5);

        #else
                //round up by default
                __ticks_dc = (uint32_t)(ceil(fabs(__tmp)));
        #endif

        __builtin_avr_delay_cycles(__ticks_dc);

#else
        uint8_t __ticks;
        double __tmp2 ;
        __tmp = ((F_CPU) / 3e6) * __us;
        __tmp2 = ((F_CPU) / 4e6) * __us;
        if (__tmp < 1.0)
                __ticks = 1;
        else if (__tmp2 > 65535)
        {
                _delay_ms(__us / 1000.0);
        }
        else if (__tmp > 255)
        {
                uint16_t __ticks=(uint16_t)__tmp2;
                _delay_loop_2(__ticks);
                return;
        }
        else
                __ticks = (uint8_t)__tmp;
        _delay_loop_1(__ticks);
#endif
}
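The legacy branch of the listing above picks one of three primitives depending on the requested time. The following host-side sketch reproduces only that decision. `legacy_delay_us_plan`, `struct us_plan` and `enum us_path` are hypothetical names, and the clock frequency is a parameter instead of the F_CPU macro.

```c
#include <assert.h>

enum us_path { US_LOOP_1, US_LOOP_2, US_VIA_DELAY_MS };

struct us_plan {
    enum us_path path;
    unsigned ticks;   /* loop count; 0 and unused for US_VIA_DELAY_MS */
};

/* Host-side model of which primitive the legacy _delay_us() selects. */
struct us_plan legacy_delay_us_plan(double f_cpu_hz, double us)
{
    assert(f_cpu_hz > 0.0 && us >= 0.0);
    struct us_plan p = { US_LOOP_1, 1 };
    double t3 = (f_cpu_hz / 3e6) * us;   /* iterations of the 3-cycle loop */
    double t4 = (f_cpu_hz / 4e6) * us;   /* iterations of the 4-cycle loop */

    if (t3 < 1.0) {
        p.ticks = 1;                     /* shortest possible delay */
    } else if (t4 > 65535.0) {
        p.path = US_VIA_DELAY_MS;        /* too long even for _delay_loop_2() */
        p.ticks = 0;
    } else if (t3 > 255.0) {
        p.path = US_LOOP_2;              /* too long for the 8-bit counter */
        p.ticks = (unsigned)t4;
    } else {
        p.ticks = (unsigned)t3;
    }
    return p;
}
```

At 16 MHz, 10 us maps to `_delay_loop_1` with 53 iterations, 100 us maps to `_delay_loop_2` with 400 iterations, and 20000 us is forwarded to `_delay_ms`.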

_delay_loop_1(uint8_t __count)

Delay loop using an 8-bit counter __count, so up to 256 iterations are possible. (The value 256 would have to be passed as 0.) The loop executes three CPU cycles per iteration, not including the overhead the compiler needs to set up the counter register. Thus, at a CPU speed of 1 MHz, delays of up to 768 microseconds can be achieved.

void
_delay_loop_1(uint8_t __count)
{
        __asm__ volatile (
                "1: dec %0" "\n\t"
                "brne 1b"
                : "=r" (__count)
                : "0" (__count)
        );
}

_delay_loop_2(uint16_t __count)

Delay loop using a 16-bit counter __count, so up to 65536 iterations are possible. (The value 65536 would have to be passed as 0.) The loop executes four CPU cycles per iteration, not including the overhead the compiler requires to set up the counter register pair. Thus, at a CPU speed of 1 MHz, delays of up to about 262.1 milliseconds can be achieved.

void
_delay_loop_2(uint16_t __count)
{
        __asm__ volatile (
                "1: sbiw %0,1" "\n\t"
                "brne 1b"
                : "=w" (__count)
                : "0" (__count)
        );
}
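The durations quoted for both loops follow directly from the iteration count and the cycles per iteration. Here is a small host-side sketch of that arithmetic. The function names are hypothetical, and the counts are nominal: they ignore the register setup overhead and the fact that the final, not-taken BRNE is one cycle shorter than a taken one.

```c
#include <assert.h>
#include <stdint.h>

/* Nominal cycles of _delay_loop_1(): 3 per iteration, count 0 means 256. */
uint32_t delay_loop_1_cycles(uint8_t count)
{
    uint32_t iterations = count ? count : 256u;
    return iterations * 3u;
}

/* Nominal cycles of _delay_loop_2(): 4 per iteration, count 0 means 65536. */
uint32_t delay_loop_2_cycles(uint16_t count)
{
    uint32_t iterations = count ? count : 65536u;
    return iterations * 4u;
}

/* Convert a cycle count to microseconds for a given clock frequency. */
double cycles_to_us(uint32_t cycles, double f_cpu_hz)
{
    assert(f_cpu_hz > 0.0);
    return (double)cycles * 1e6 / f_cpu_hz;
}
```

For a 1 MHz clock this gives 768 us for `delay_loop_1_cycles(0)` and 262144 us (about 262.1 ms) for `delay_loop_2_cycles(0)`, matching the figures in the descriptions above.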

void __builtin_avr_delay_cycles (unsigned long ticks)

https://gcc.gnu.org/onlinedocs/gcc/AVR-Built-in-Functions.html

Instruction Set

http://ww1.microchip.com/downloads/en/devicedoc/atmel-0856-avr-instruction-set-manual.pdf

Status Register (SREG)

SREG Status Register
C Carry Flag
Z Zero Flag
N Negative Flag
V Two’s complement overflow indicator
S N ⊕ V, for signed tests
H Half Carry Flag
T Transfer bit used by BLD and BST instructions
I Global Interrupt Enable/Disable Flag
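As an illustration of how these flags relate, the sketch below models SREG as an 8-bit value on the host and recomputes S from N and V. The bit positions (C in bit 0 up to I in bit 7) follow the AVR instruction set manual. The `FLAG_*` names and `sreg_with_sign_flag` are hypothetical and chosen so they do not clash with the avr-libc definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Bit masks for the eight SREG flags, bit 0 (C) to bit 7 (I). */
enum sreg_flag {
    FLAG_C = 1 << 0,   /* Carry */
    FLAG_Z = 1 << 1,   /* Zero */
    FLAG_N = 1 << 2,   /* Negative */
    FLAG_V = 1 << 3,   /* Two's complement overflow */
    FLAG_S = 1 << 4,   /* Sign, N xor V */
    FLAG_H = 1 << 5,   /* Half carry */
    FLAG_T = 1 << 6,   /* Transfer bit for BLD and BST */
    FLAG_I = 1 << 7    /* Global interrupt enable */
};

static_assert(FLAG_I == 0x80, "SREG is eight bits wide");

/* Return sreg with the S flag recomputed as N xor V. */
uint8_t sreg_with_sign_flag(uint8_t sreg)
{
    int n = (sreg & FLAG_N) != 0;
    int v = (sreg & FLAG_V) != 0;
    if (n ^ v)
        return (uint8_t)(sreg | FLAG_S);
    return (uint8_t)(sreg & ~FLAG_S);
}
```

S is set when exactly one of N and V is set, which is why signed branches test S instead of N alone.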

Registers and Operands

Rd: Destination (and source) register in the Register File
Rr: Source register in the Register File
R: Result after instruction is executed
K: Constant data
k: Constant address
b: Bit in the Register File or I/O Register (3-bit)
s: Bit in the Status Register (3-bit)
X,Y,Z: Indirect Address Register (X=R27:R26, Y=R29:R28, and Z=R31:R30)
A: I/O location address
q: Displacement for direct addressing (6-bit)

Interrupt Latency

Interrupt latency is the time that elapses from when an interrupt is generated to when the source of the interrupt is serviced.

https://en.wikipedia.org/wiki/Interrupt_latency

A Beginner’s Guide on Interrupt Latency - ARM: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/beginner-guide-on-interrupt-latency-and-interrupt-latency-of-the-arm-cortex-m-processors

AVR Interrupts

https://microchipdeveloper.com/8avr:int

http://ww1.microchip.com/downloads/en/Appnotes/Atmel-8468-Using-External-Interrupts-for-megaAVR-Devices_ApplicationNote_AVR1200.pdf

Interrupt Response Time

The interrupt execution response for all the enabled AVR interrupts is five clock cycles minimum. After five clock cycles the program vector address for the actual interrupt handling routine is executed. During this five-clock-cycle period, the Program Counter is pushed onto the Stack. The vector is normally a jump to the interrupt routine, and this jump takes three clock cycles. If an interrupt occurs during execution of a multi-cycle instruction, this instruction is completed before the interrupt is served. If an interrupt occurs when the MCU is in sleep mode, the interrupt execution response time is increased by five clock cycles. This increase comes in addition to the start-up time from the selected sleep mode.

These five clock cycles include:

  • Three cycles for pushing the Program Counter (PC) value onto the stack
  • One cycle for updating the stack pointer
  • One cycle for clearing the Global Interrupt Enable (I) bit

A return from an interrupt handling routine takes five clock cycles. During these five clock cycles, the Program Counter (three bytes) is popped back from the Stack, the Stack Pointer is incremented by three, and the I-bit in SREG is set.
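The cycle counts quoted above can be added up to estimate the fixed hardware overhead around an interrupt handler. The sketch below is host-side arithmetic only, using the figures from the text. The names are hypothetical, and the start-up time of the selected sleep mode is deliberately left out because it depends on the device configuration.

```c
#include <assert.h>

enum {
    ISR_RESPONSE_CYCLES    = 5,   /* push PC, update SP, clear I bit */
    ISR_VECTOR_JUMP_CYCLES = 3,   /* jump from the vector to the handler */
    ISR_SLEEP_EXTRA_CYCLES = 5,   /* extra cycles when waking from sleep */
    ISR_RETURN_CYCLES      = 5    /* return from the handler */
};

/* Cycles from the interrupt until the handler's first instruction.
 * pending_instr_cycles: cycles left in a multi-cycle instruction that
 * must finish first (0 if none). Sleep start-up time is not included. */
unsigned isr_entry_cycles(int from_sleep, unsigned pending_instr_cycles)
{
    unsigned cycles = ISR_RESPONSE_CYCLES + ISR_VECTOR_JUMP_CYCLES
                    + pending_instr_cycles;
    if (from_sleep)
        cycles += ISR_SLEEP_EXTRA_CYCLES;
    return cycles;
}

/* Convert a cycle count to microseconds for a given clock frequency. */
double cycles_to_microseconds(unsigned cycles, double f_cpu_hz)
{
    assert(f_cpu_hz > 0.0);
    return (double)cycles * 1e6 / f_cpu_hz;
}
```

Under these assumptions the best case is 8 cycles to reach the handler, or 0.5 us at 16 MHz, and entry plus return cost at least 13 cycles before any handler code is counted.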

Minimizing Interrupt Response Time:

http://web.engr.oregonstate.edu/~traylor/ece473/pdfs/minimize_interrupt_response_time.pdf

GCC Inline Assembly

https://www.nongnu.org/avr-libc/user-manual/inline_asm.html

https://gcc.gnu.org/onlinedocs/gcc/Using-Assembly-Language-with-C.html#Using-Assembly-Language-with-C

Constraints

https://gcc.gnu.org/onlinedocs/gcc/Constraints.html#Constraints

AVR GCC ABI

https://gcc.gnu.org/wiki/avr-gcc

Register Layout

Values that occupy more than one 8-bit register start in an even register.

Fixed Registers

Fixed Registers are registers that won't be allocated by GCC's register allocator. Registers R0 and R1 are fixed and used implicitly while printing out assembler instructions:

R0

is used as a scratch register that need not be restored after use. It must be saved and restored in an interrupt service routine's (ISR) prologue and epilogue. In inline assembler you can use __tmp_reg__ for the scratch register.

R1

always contains zero. During an insn the content might be destroyed, e.g. by a MUL instruction that uses R0/R1 as implicit output register. If an insn destroys R1, the insn must restore R1 to zero afterwards. This register must be saved in ISR prologues and must then be set to zero because R1 might contain values other than zero. The ISR epilogue restores the value. In inline assembler you can use __zero_reg__ for the zero register.

T

The T flag in the status register (SREG) is used in the same way as the temporary scratch register R0. User-defined global registers, declared by means of a global register asm and/or -ffixed-n, won't be saved or restored in function prologues and epilogues.

Call-Used Registers

The call-used or call-clobbered general purpose registers (GPRs) are registers that might be destroyed (clobbered) by a function call.

R18–R27, R30, R31

These GPRs are call clobbered. An ordinary function may use them without restoring the contents. Interrupt service routines (ISRs) must save and restore each register they use.

R0, T-Flag

The temporary register and the T-flag in SREG are also call-clobbered, but this knowledge is not exposed explicitly to the compiler (R0 is a fixed register).

Call-Saved Registers

R2–R17, R28, R29

The remaining GPRs are call-saved, i.e. a function that uses such a register must restore its original content. This applies even if the register is used to pass a function argument.

R1

The zero-register is implicitly call-saved (implicit because R1 is a fixed register).

Frame Layout

During compilation the compiler may come up with an arbitrary number of "pseudo registers" which will be allocated to "hard registers" during register allocation.

  • Pseudos that don't get a hard register will be put into a stack slot and loaded / stored as needed.
  • In order to access stack locations, avr-gcc will set up a 16-bit frame pointer in R29:R28 (Y) because the stack pointer (SP) cannot be used to access stack slots.
  • The stack grows downwards. Smaller addresses are at the bottom of the drawing below.
  • Stack pointer and frame pointer are not aligned, i.e. 1-byte aligned.
  • After the function prologue, the frame pointer will point one byte below the stack frame, i.e. Y+1 points to the bottom of the stack frame.
  • Any of "incoming arguments", "saved registers" or "stack slots" in the drawing below may be empty.
  • Even "return address" may be empty which happens for functions that are tail-called.

Frame Layout after Function Prologue:

start address/top
:
:
+-------------------------------------+
|incoming arguments                   |
+-------------------------------------+
|return address (2–3 bytes)           |
+-------------------------------------+
|saved registers                      |
+-------------------------------------+
|stack slots                          |
+-------------------------------------+ < Y+1
:
:
last address/bottom

Calling Convention

  • An argument is passed either completely in registers or completely in memory.
  • To find the register where a function argument is passed, initialize the register number Rn with R26 and follow this procedure:
  1. If the argument size is an odd number of bytes, round up the size to the next even number.
  2. Subtract the rounded size from the register number Rn.
  3. If the new Rn is at least R8 and the size of the object is non-zero, then the low-byte of the argument is passed in Rn. Subsequent bytes of the argument are passed in the subsequent registers, i.e. in increasing register numbers.
  4. If the new register number Rn is smaller than R8 or the size of the argument is zero, the argument will be passed in memory.
  5. If the current argument is passed in memory, stop the procedure: All subsequent arguments will also be passed in memory.
  6. If there are arguments left, goto 1. and proceed with the next argument.
  • Return values with a size of 1 byte up to and including a size of 8 bytes will be returned in registers. Return values whose size is outside that range will be returned in memory.
  • If a return value cannot be returned in registers, the caller will allocate stack space and pass the address as implicit first pointer argument to the callee. The callee will put the return value into the space provided by the caller.
  • If the return value of a function is returned in registers, the same registers are used as if the value was the first parameter of a non-varargs function. For example, an 8-bit value is returned in R24 and a 32-bit value is returned in R22...R25.
  • Arguments of varargs functions are passed on the stack. This applies even to the named arguments.

For example, suppose a function with the following prototype:

int func (char a, long b);

then

  • a will be passed in R24.
  • b will be passed in R20, R21, R22 and R23 with the LSB in R20 and the MSB in R23.
  • The result is returned in R24 (LSB) and R25 (MSB).
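The register assignment procedure above can be sketched as a small host-side routine that takes the argument sizes in bytes and reports the register holding each argument's low byte. This is only an illustration of the listed steps for non-varargs functions, not code taken from GCC. `avr_assign_arg_regs`, `avr_return_reg` and `ARG_IN_MEMORY` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define ARG_IN_MEMORY  (-1)
#define FIRST_FREE_REG 26   /* Rn starts at R26 */
#define LOWEST_ARG_REG 8    /* arguments never go below R8 */

/* For each argument size, store the register number of its low byte in
 * first_reg[i], or ARG_IN_MEMORY if it is passed in memory. */
void avr_assign_arg_regs(const size_t *sizes, size_t n, int *first_reg)
{
    assert(sizes != NULL && first_reg != NULL);
    int rn = FIRST_FREE_REG;
    int in_memory = 0;   /* once set, all later arguments go to memory */

    for (size_t i = 0; i < n; i++) {
        if (!in_memory) {
            size_t room = (size_t)(rn - LOWEST_ARG_REG);  /* bytes down to R8 */
            if (sizes[i] == 0 || sizes[i] > room) {
                in_memory = 1;
            } else {
                size_t rounded = sizes[i] + (sizes[i] & 1u);  /* odd rounds up */
                if (rounded > room)
                    in_memory = 1;
                else
                    rn -= (int)rounded;
            }
        }
        first_reg[i] = in_memory ? ARG_IN_MEMORY : rn;
    }
}

/* Register of the low byte of a return value, or ARG_IN_MEMORY.
 * Sizes 1..8 use the same register as a first argument of that size. */
int avr_return_reg(size_t size)
{
    if (size < 1 || size > 8)
        return ARG_IN_MEMORY;
    return FIRST_FREE_REG - (int)(size + (size & 1u));
}
```

For the `func` example, sizes `{1, 4}` yield R24 for `a` and R20 for `b`, and a 2-byte return value starts at R24. Note how step 5 shows up in the model: once one argument goes to memory, later ones do too, even if they would still fit in registers.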

Reduced Tiny

On the Reduced Tiny cores (16 GPRs only) several modifications to the ABI above apply:

  • Call-saved registers are: R18–R19, R28–R29.
  • Fixed Registers are R16 (__tmp_reg__) and R17 (__zero_reg__).
  • Registers used to pass arguments to functions and return values from functions are R25...R18 (instead of R25...R8).

There is only limited library support both from libgcc and AVR-LibC, for example there is no float support and no printf support.
