[ARMv8] Some interesting features of the C language #203

Open
carloscn opened this issue Nov 24, 2023 · 2 comments
@carloscn
Owner

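Where to define variables

In C, where variables are defined depends on the coding style and on the language standard. C89 (ANSI C) requires all declarations to come at the start of a block, which in practice usually means the top of the function. C99 and later standards allow a variable to be declared anywhere inside a function, at the point where it is needed.

C89 standard

In C89, every variable must be defined before the first executable statement of its enclosing block. In practice all declarations sit at the top of the function, ahead of any logic or computation.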

#include <stdio.h>

int main(void)
{
    int ret = 0;
    int a = 9, b = 0, c = 7, d = 6;

    a = 1;
    for (b = 0; b < 88; b ++) {
        c = 7;
        d += a * c * b;
        a += 3;
    }

    return ret;
}

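C99 and later standards

C99 introduced declaring variables where they are needed: a declaration may appear anywhere in a function, typically just before the variable's first use, including in the initializer of a for loop.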

#include <stdio.h>

int main(void)
{
    int ret = 0;
    int a = 9, d = 6;

    a = 1;
    for (int b = 0; b < 88; b ++) {
        int c = 7;
        d += a * c * b;
        a += 3;
    }

    return ret;
}
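
To check that the two styles are interchangeable, the two listings above can be wrapped as functions and compared. This is only a minimal sketch: the names accumulate_c89, accumulate_c99 and check_styles_agree are made up for illustration, and the functions return d instead of ret so that there is a result to compare.

```c
#include <assert.h>

/* C89 style: every declaration sits at the top of the function. */
int accumulate_c89(void)
{
    int a = 9, b = 0, c = 7, d = 6;

    a = 1;
    for (b = 0; b < 88; b++) {
        c = 7;
        d += a * c * b;
        a += 3;
    }
    return d;
}

/* C99 style: b and c are declared where they are first needed. */
int accumulate_c99(void)
{
    int a = 9, d = 6;

    a = 1;
    for (int b = 0; b < 88; b++) {
        int c = 7;
        d += a * c * b;
        a += 3;
    }
    /* b and c are out of scope here; using them would not compile. */
    return d;
}

/* Hypothetical self-check: both styles must compute the same value. */
void check_styles_agree(void)
{
    assert(accumulate_c89() == accumulate_c99());
}
```

The only observable difference is scope: in the C99 version the compiler rejects any use of b or c after the loop, which is what keeps their lifetime minimal.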

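Best practices

  • Readability and maintainability: declaring variables where they are needed improves readability and maintainability, because it is easier to track what a variable is for and how long it lives.
  • Performance: with modern compilers, the position of a declaration has a negligible effect on performance. Compiler optimizations usually handle declarations in different positions equally well.
  • Code style and consistency: following the coding style of the project or team matters. If the rest of the code base follows a particular style, stay consistent with it.

Some discussion of performance

We ran objdump on the two programs above to get their disassembly: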

image

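The disassembly shows that the C89-style version, which defines its variables up front, makes the processor execute a few more instructions (presumably the initializations of b and c at their declarations, which are then repeated in the loop); everything else is almost identical. It is therefore best to give each variable the smallest lifetime it needs.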

@carloscn carloscn self-assigned this Nov 24, 2023
@carloscn
Owner Author

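Does ARM endianness depend on the CPU or on the compiler?

Conclusion: ARM endianness depends on both the CPU and the compiler.

ARM cores are configured as little-endian by default, and the compiler also targets little-endian when no mode is specified. Some ARM cores can, however, be configured for big-endian mode. For example: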

  • ARMv7-A: In ARMv7-A, the mapping of instruction memory is always little-endian.
  • ARMv7-R: SCTLR.IE, bit[31], indicates the instruction endianness configuration.
  • ARMv7-M: Armv7-M supports a selectable endian model in which, on a reset, a control input determines whether the endianness is big endian (BE) or little endian (LE). This endian mapping has the following restrictions: The endianness setting only applies to data accesses. Instruction fetches are always little endian. All accesses to the SCS are little endian, see System Control Space (SCS) on page B3-595.
  • ARMv8-AArch32: When using AArch32, having the CPSR.E bit have a different value to the equivalent System Control register EE bit when in EL1, EL2, or EL3 is now deprecated.
  • ARMv8-AArch64: Data endianness is controlled independently for each Exception level. For EL3, EL2 and EL1, the EE bit of the relevant SCTLR_ELn register sets the endianness.

If the ARM core is configured for big-endian mode, GCC needs the additional option -mbig-endian:

-mlittle-endian
Generate code for a processor running in little-endian mode. This is the default for all standard configurations.
-mbig-endian
Generate code for a processor running in big-endian mode; the default is to compile code for a little-endian processor.

The compiler option must match the CPU configuration.

References:
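
One way to confirm that the compiler setting and the running CPU agree is to compare the byte order the compiler assumed with the byte order actually observed in memory. The sketch below is an illustration under assumptions: the function names are made up, and the __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__ macros are predefined by GCC and Clang but may be missing in other compilers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the running program stores the least significant byte first. */
int is_little_endian_runtime(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];

    memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0x04;
}

/* Returns 1 if the compiler was told to generate little-endian code. */
int is_little_endian_compile_time(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
    return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#else
    /* Assumption: no macro available, fall back to the runtime probe. */
    return is_little_endian_runtime();
#endif
}

/* Hypothetical self-check: the two views of the byte order must agree. */
void check_endianness_consistent(void)
{
    assert(is_little_endian_runtime() == is_little_endian_compile_time());
}
```

Building with -mbig-endian should flip the compile-time answer, so the check would be expected to fail only when the binary runs on a core whose data endianness is configured differently from what the compiler was told.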

https://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/ARM-Options.html
https://developer.arm.com/documentation/den0024/a/ARMv8-Registers/Endianness?lang=en
https://developer.arm.com/documentation/ddi0406/c/Application-Level-Architecture/Application-Level-Memory-Model/Endian-support/Instruction-endianness

@carloscn
Owner Author

carloscn commented Nov 24, 2023

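The volatile keyword

In interviews, many interviewers will, um, ask: what is volatile actually for? So today let's look at what volatile really means on the ARMv7/v8 architectures.

When the compiler is set to a high optimization level, a program can show problems that are not obvious at lower levels. The volatile qualifier tells the compiler not to optimize accesses to that variable. The effects of such optimization typically show up in the following situations:

  • Code gets stuck in a loop while polling
  • Multithreaded code behaves strangely
  • Hand-written delay code is optimized away.

The compiler recognizes the volatile qualifier as a statement that the variable can be modified at any time from outside the implementation, for example by the operating system, by another thread of execution (an interrupt or signal handler), or by hardware. This indirectly protects the variable: because its value can change behind the code's back, every reference to it in the source must read it from memory rather than reuse a copy cached in a register. (Keeping a variable in a register is an optimization performed by the compiler.) When implementing sleep or timing delays, declaring the variable volatile tells the compiler that this specific behavior is required.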
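
The signal-handler case can be reproduced on a hosted system with nothing but the standard library. The following is a minimal sketch, not code from the ARM documentation: the names signal_seen, on_signal and wait_for_signal are hypothetical, and SIGINT is raised by the program itself so the example terminates on its own.

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler, read by the loop: must be volatile. */
static volatile sig_atomic_t signal_seen = 0;

static void on_signal(int signo)
{
    (void)signo;
    signal_seen = 1;
}

/* Installs the handler, raises SIGINT at ourselves and polls the flag.
 * Returns 1 once the flag was observed, -1 if the handler could not
 * be installed. */
int wait_for_signal(void)
{
    signal_seen = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;

    raise(SIGINT);              /* handler runs before raise() returns */

    while (!signal_seen) {
        /* each test re-reads signal_seen from memory */
    }
    assert(signal_seen == 1);

    signal(SIGINT, SIG_DFL);    /* restore the default action */
    return (int)signal_seen;
}
```

Because the flag is volatile, the compiler has to emit a load for every loop test. Without the qualifier it would be free to read the flag once and keep the result in a register.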

The two versions of the routine differ only in the way that buffer_full is declared. The first routine version is incorrect. Notice that the variable buffer_full is not qualified as volatile in this version. In contrast, the second version of the routine shows the same loop where buffer_full is correctly qualified as volatile.

Nonvolatile version of buffer loop: image-20220413213827285
Volatile version of buffer loop: image-20220413213842311

Table 13 shows the corresponding disassembly of the machine code produced by the compiler for each of the sample versions in Table 8, where the C code for each implementation has been compiled using the option -O2.

image-20220413214136517

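At optimization level -O2 the difference is visible. In the nonvolatile version on the left, the value is loaded into r1 once and never loaded again, so the code then loops forever in the branch at L1.12. In the volatile version, the value is reloaded with LDR on every iteration.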
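
Since the screenshots of the two routines did not survive, here is a sketch of what such a pair might look like. It is a reconstruction under assumptions, not the listing from the ARM documentation: only buffer_full is named in the text above, and the suffixed variable names and the read_stream_* function names are invented so that both versions fit in one file.

```c
#include <assert.h>

int buffer_full_plain;               /* not volatile: may be read only once */
volatile int buffer_full_volatile;   /* volatile: re-read on every loop test */

/* At -O2 the compiler may hoist the load out of the loop. If the flag is 0
 * on entry, the loop then never ends, whatever an interrupt does later. */
int read_stream_nonvolatile(void)
{
    int count = 0;
    while (!buffer_full_plain) {
        count++;
    }
    return count;
}

/* The volatile qualifier forces a fresh load for each loop test. */
int read_stream_volatile(void)
{
    int count = 0;
    while (!buffer_full_volatile) {
        count++;
    }
    return count;
}

/* Hypothetical self-check: with the flag already set, both return at once. */
void check_flag_already_set(void)
{
    buffer_full_plain = 1;
    buffer_full_volatile = 1;
    assert(read_stream_nonvolatile() == 0);
    assert(read_stream_volatile() == 0);
}
```

The two functions differ only in the declaration of the flag, which is exactly the difference the disassembly makes visible: one LDR before the loop versus one LDR per iteration.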

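volatile is needed in the following cases:

  • Accessing memory-mapped peripherals
  • Global variables shared between multiple threads (volatile alone provides neither atomicity nor memory ordering)
  • Global variables accessed from an interrupt routine or a signal handler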
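
For the memory-mapped peripheral case, a common pattern is to describe the register block as a struct of volatile fields and poll it through a pointer. The sketch below assumes a made-up UART-like device: the uart_regs_t layout, the UART_STATUS_RX_READY bit and the function names are all hypothetical and do not describe any real part. A struct in ordinary RAM stands in for the hardware so the code can run anywhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block of a UART-like device (not a real part). */
typedef struct {
    volatile uint32_t status;   /* bit 0: receive data ready */
    volatile uint32_t data;     /* low 8 bits: received byte */
} uart_regs_t;

#define UART_STATUS_RX_READY 0x1u

/* Polls the status register up to max_polls times.
 * Returns 0 and stores the byte in *out on success, -1 on timeout. */
int uart_read_byte(uart_regs_t *uart, uint32_t max_polls, uint8_t *out)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        /* volatile: the status register is read again on every pass */
        if (uart->status & UART_STATUS_RX_READY) {
            *out = (uint8_t)(uart->data & 0xFFu);
            return 0;
        }
    }
    return -1;
}

/* Usage with simulated registers in RAM instead of a device address. */
void check_with_fake_registers(void)
{
    uart_regs_t fake = { UART_STATUS_RX_READY, 0x41u };
    uint8_t byte = 0;

    assert(uart_read_byte(&fake, 10u, &byte) == 0);
    assert(byte == 0x41);
}
```

On real hardware the pointer would come from the device's base address in the memory map. Without volatile on the fields, the compiler could read status once and turn the polling loop into a single test.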

Ref

https://developer.arm.com/documentation/dui0472/c/compiler-coding-practices/compiler-optimization-and-the-volatile-keyword
