Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible bug in xtensa-esp32-elf-g++ (crosstool-NG esp-2021r2-patch5) 8.4.0 #1375

Closed
calint opened this issue May 10, 2024 · 4 comments
Closed

Comments

@calint
Copy link

calint commented May 10, 2024

I am a novice at reporting issues, forgive me if I am not precise.

The compiler generates code resulting in a bug which is solved by placing an asm("nop") between two lines of code. Below is the generated code for the version without the nop exhibiting the reproducible bug and the version generated with the nop which fixes the issue. The comments in the code contain further explanation.

Tested on 2 different ESP32s with same result: ESP32-D0WD-V3 revision 3.1 and ESP32-S3 revision 0.2
OS: Ubuntu 24.04 LTS

Note that in Arduino IDE 2.3.2 the bug does not happen.

version with re-producible bug

auto allocate_instance() -> Type * {
    if (free_ptr_ >= free_end_) {
   3:   0000a1          l32r    a10, fffc0004 <_ZN6bullet21on_death_by_collisionEv+0xfffc0004>
   6:   2a98            l32i.n  a9, a10, 8
   8:   3a88            l32i.n  a8, a10, 12
   a:   7bb987          bgeu    a9, a8, 89 <_ZN6bullet21on_death_by_collisionEv+0x89>
      return nullptr;
    }
    Type *inst = *free_ptr_;
   d:   0988            l32i.n  a8, a9, 0
    free_ptr_++;
    *alloc_ptr_ = inst;
   f:   5ab8            l32i.n  a11, a10, 20
    free_ptr_++;
  11:   994b            addi.n  a9, a9, 4
  13:   2a99            s32i.n  a9, a10, 8
    *alloc_ptr_ = inst;
  15:   0b89            s32i.n  a8, a11, 0
  explicit game_object(object_class c) : cls{c} {}
  17:   000091          l32r    a9, fffc0018 <_ZN6bullet21on_death_by_collisionEv+0xfffc0018>
    inst->alloc_ptr = alloc_ptr_;
    // asm("nop"); // !! fixes bug
    // !! bug does not set inst->alloc_ptr in -O3
    // !! print statement between these 2 lines also fixes the bug
    // !! possible UB code somewhere else?
    alloc_ptr_++;
  1a:   bb4b            addi.n  a11, a11, 4
  1c:   5ab9            s32i.n  a11, a10, 20
  1e:   5899            s32i.n  a9, a8, 20
  20:   6899            s32i.n  a9, a8, 24
  22:   7899            s32i.n  a9, a8, 28
  24:   8899            s32i.n  a9, a8, 32
  26:   9899            s32i.n  a9, a8, 36
  28:   a899            s32i.n  a9, a8, 40
  2a:   390c            movi.n  a9, 3
    if (free_ptr_ >= free_end_) {
  2c:   0000c1          l32r    a12, fffc002c <_ZN6bullet21on_death_by_collisionEv+0xfffc002c>
  2f:   304892          s8i     a9, a8, 48

It seems that the buggy version does not generate anything for inst->alloc_ptr = alloc_ptr_;. Does the compiler falsely assume that it is irrelevant?

version with bug fixed by a nop:

auto allocate_instance() -> Type * {
    if (free_ptr_ >= free_end_) {
   3:   000091          l32r    a9, fffc0004 <_ZN6bullet21on_death_by_collisionEv+0xfffc0004>
   6:   29a8            l32i.n  a10, a9, 8
   8:   3988            l32i.n  a8, a9, 12
   a:   023a87          bltu    a10, a8, 10 <_ZN6bullet21on_death_by_collisionEv+0x10>
   d:   002146          j       96 <_ZN6bullet21on_death_by_collisionEv+0x96>
      return nullptr;
    }
    Type *inst = *free_ptr_;
  10:   0a88            l32i.n  a8, a10, 0
    free_ptr_++;
    *alloc_ptr_ = inst;
  12:   59b8            l32i.n  a11, a9, 20
    free_ptr_++;
  14:   aa4b            addi.n  a10, a10, 4
  16:   29a9            s32i.n  a10, a9, 8
    *alloc_ptr_ = inst;
  18:   0b89            s32i.n  a8, a11, 0
    inst->alloc_ptr = alloc_ptr_;
  1a:   18b9            s32i.n  a11, a8, 4
    asm("nop"); // !! fixes bug
  1c:   f03d            nop.n
    // !! bug does not set inst->alloc_ptr in -O3
    // !! print statement between these 2 lines also fixes the bug
    // !! possible UB code somewhere else?
    alloc_ptr_++;
  1e:   59c8            l32i.n  a12, a9, 20
  explicit game_object(object_class c) : cls{c} {}
  20:   0000b1          l32r    a11, fffc0020 <_ZN6bullet21on_death_by_collisionEv+0xfffc0020>
  23:   cc4b            addi.n  a12, a12, 4
  25:   59c9            s32i.n  a12, a9, 20
  27:   58b9            s32i.n  a11, a8, 20
  29:   68b9            s32i.n  a11, a8, 24
  2b:   78b9            s32i.n  a11, a8, 28
  2d:   88b9            s32i.n  a11, a8, 32
  2f:   98b9            s32i.n  a11, a8, 36
  31:   a8b9            s32i.n  a11, a8, 40
  33:   3b0c            movi.n  a11, 3
    if (free_ptr_ >= free_end_) {
  35:   0000a1          l32r    a10, fffc0038 <_ZN6bullet21on_death_by_collisionEv+0xfffc0038>
  38:   3048b2          s8i     a11, a8, 48

The source of the class containing the code:

#pragma once
//
// implements a O(1) store of objects
//
// template parameters:
// * 'Type' is object type. 'Type' must contain public field 'Type **alloc_ptr'
// * Size is number of preallocated objects
// * StoreId used for debugging
// * InstanceSizeInBytes is custom size of instance to fit largest object in an
//   object hierarchy or 0 if 'Type' sizeof is used
//
// note. no destructor since life-time is program life-time
//

// reviewed: 2024-05-01

template <typename Type, const int Size, const int StoreId = 0,
          const int InstanceSizeInBytes = 0>
class o1store {
  Type *all_ = nullptr;
  Type **free_bgn_ = nullptr;
  Type **free_ptr_ = nullptr;
  Type **free_end_ = nullptr;
  Type **alloc_bgn_ = nullptr;
  Type **alloc_ptr_ = nullptr;
  Type **del_bgn_ = nullptr;
  Type **del_ptr_ = nullptr;
  Type **del_end_ = nullptr;

public:
  o1store() {
    if (InstanceSizeInBytes) {
      all_ = static_cast<Type *>(calloc(Size, InstanceSizeInBytes));
    } else {
      all_ = static_cast<Type *>(calloc(Size, sizeof(Type)));
    }
    free_ptr_ = free_bgn_ = static_cast<Type **>(calloc(Size, sizeof(Type *)));
    alloc_ptr_ = alloc_bgn_ =
        static_cast<Type **>(calloc(Size, sizeof(Type *)));
    del_ptr_ = del_bgn_ = static_cast<Type **>(calloc(Size, sizeof(Type *)));

    if (!all_ || !free_bgn_ || !alloc_bgn_ || !del_bgn_) {
      printf("!!! o1store %d: could not allocate arrays\n", StoreId);
      exit(1);
    }

    free_end_ = free_bgn_ + Size;
    del_end_ = del_bgn_ + Size;

    // write pointers to instances in the 'free' list
    Type *all_it = all_;
    for (Type **free_it = free_bgn_; free_it < free_end_; free_it++) {
      *free_it = all_it;
      if (InstanceSizeInBytes) {
        all_it = reinterpret_cast<Type *>(reinterpret_cast<char *>(all_it) +
                                          InstanceSizeInBytes);
      } else {
        all_it++;
      }
    }
  }

  // allocates an instance
  // returns nullptr if instance could not be allocated
  // !! note.
  // !! issue when built in platformio 6.1.15
  // !! not an issue when built in arduino ide 2.3.1
  // !! __attribute__((optimize("O3"))) fixes the bug below
  // !! __attribute__((always_inline)) triggers bug
  // !! __attribute__((noinline)) fixes bug
  auto allocate_instance() -> Type * {
    if (free_ptr_ >= free_end_) {
      return nullptr;
    }
    Type *inst = *free_ptr_;
    free_ptr_++;
    *alloc_ptr_ = inst;
    inst->alloc_ptr = alloc_ptr_;
    asm("nop"); // !! fixes bug
    // !! bug does not set inst->alloc_ptr in -O3
    // !! print statement between these 2 lines also fixes the bug
    // !! possible UB code somewhere else?
    alloc_ptr_++;
    return inst;
  }

  // adds instance to list of instances to be freed with 'apply_free()'
  void free_instance(Type *inst) {
    if (del_ptr_ >= del_end_) {
      printf("!!! o1store %d: free overrun\n", StoreId);
      exit(1);
    }
    *del_ptr_ = inst;
    del_ptr_++;
  }

  // deallocates the instances that have been freed
  void apply_free() {
    for (Type **it = del_bgn_; it < del_ptr_; it++) {
      Type *inst_deleted = *it;
      alloc_ptr_--;
      Type *inst_to_move = *alloc_ptr_;
      inst_to_move->alloc_ptr = inst_deleted->alloc_ptr;
      *(inst_deleted->alloc_ptr) = inst_to_move;
      free_ptr_--;
      *free_ptr_ = inst_deleted;
    }
    del_ptr_ = del_bgn_;
  }

  // returns list of allocated instances
  inline auto allocated_list() const -> Type ** { return alloc_bgn_; }

  // returns length of list of allocated instances
  inline auto allocated_list_len() const -> int {
    return alloc_ptr_ - alloc_bgn_;
  }

  // returns one past the end of allocated instances list
  inline auto allocated_list_end() const -> Type ** { return alloc_ptr_; }

  // returns the list with all preallocated instances
  inline auto all_list() const -> Type * { return all_; }

  // returns the length of 'all' list
  constexpr auto all_list_len() const -> int { return Size; }

  // returns instance at index 'ix' from 'all' list
  inline auto instance(int ix) const -> Type * {
    if (!InstanceSizeInBytes) {
      return &all_[ix];
    }
    // note. if instance size is specified do pointer shenanigans
    return reinterpret_cast<Type *>(reinterpret_cast<char *>(all_) +
                                    InstanceSizeInBytes * ix);
  }

  // returns the size of allocated heap memory in bytes
  constexpr auto allocated_data_size_B() const -> int {
    return InstanceSizeInBytes
               ? (Size * InstanceSizeInBytes + 3 * Size * sizeof(Type *))
               : (Size * sizeof(Type) + 3 * Size * sizeof(Type *));
  }
};

Kind regards

@Jason2866
Copy link
Contributor

Jason2866 commented May 10, 2024

Are you sure using the same version? The Arduino IDE 2.3.2 tells nothing about which Arduino core and compiler version used.
It is highly not the case since the used toolchains for ArduinoIDE and Platformio are the same when using the same core.

@calint
Copy link
Author

calint commented May 10, 2024

I looked at the verbose build output in Arduino and it seems to be the same compiler version and patch.

I am starting to think that the optimizer removes it due to some assumption that is false. It might be a bug.

@Jason2866
Copy link
Contributor

I would open an issue in this repo https://github.com/espressif/crosstool-NG
It is not a Platformio issue at all.

@calint
Copy link
Author

calint commented May 10, 2024

Ok. Thanks.

The solution is compiler flag "-flifetime-dse=1"

The compiler assumes something falsely so technically it is a bug.

@calint calint closed this as completed May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants