Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic in Result::unwrap() due to "InvalidShiftAmount" when analyzing STM32 Firmware #469

Closed
ljungnickel opened this issue May 28, 2024 · 3 comments · Fixed by #470
Closed
Assignees
Labels
bug Something isn't working

Comments

@ljungnickel
Copy link

cwe_checker --verbose PLC.elf thread 'main' panicked at src/cwe_checker_lib/src/intermediate_representation/bitvector.rs:74:14: called Result::unwrap() on an Err value: Error { kind: InvalidShiftAmount { shift_amount: ShiftAmount(32), width: BitWidth(32) }, message: "Encountered invalid shift amount of ShiftAmount(32) on bit-width with BitWidth(32) bits.", annotation: None } stack backtrace: 0: rust_begin_unwind 1: core::panicking::panic_fmt 2: core::result::unwrap_failed 3: <apint::apint::ApInt as cwe_checker_lib::intermediate_representation::bitvector::BitvectorExtended>::subpiece 4: cwe_checker_lib::abstract_domain::interval::simple_interval::Interval::subpiece_higher 5: <cwe_checker_lib::abstract_domain::interval::IntervalDomain as cwe_checker_lib::abstract_domain::RegisterDomain>::subpiece 6: cwe_checker_lib::abstract_domain::data::arithmetics::<impl cwe_checker_lib::abstract_domain::RegisterDomain for cwe_checker_lib::abstract_domain::data::DataDomain<T>>::subpiece 7: cwe_checker_lib::analysis::pointer_inference::state::access_handling::<impl cwe_checker_lib::analysis::pointer_inference::state::State>::eval_recursive 8: cwe_checker_lib::analysis::pointer_inference::state::access_handling::<impl cwe_checker_lib::analysis::pointer_inference::state::State>::eval_recursive 9: cwe_checker_lib::analysis::pointer_inference::context::trait_impls::<impl cwe_checker_lib::analysis::forward_interprocedural_fixpoint::Context for cwe_checker_lib::analysis::pointer_inference::context::Context>::update_def 10: cwe_checker_lib::analysis::fixpoint::Computation<T>::compute_with_max_steps 11: cwe_checker_lib::analysis::pointer_inference::PointerInference::compute 12: cwe_checker_lib::analysis::pointer_inference::run 13: cwe_checker_lib::pipeline::results::AnalysisResults::compute_pointer_inference 14: cwe_checker::main note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.

The firmware under test is from here: https://github.com/fuzzware-fuzzer/fuzzware/tree/main/examples/P2IM/PLC

My setup is working flawlessly for other ELF / BIN + firmware-config files, I guess there is something weird going on in this ELF.
PLC.zip

@Enkelmann
Copy link
Contributor

Well, shifting a 32-bit value by 32 bits is a weird way of computing zero. So this is either a very weird assembly instruction or a bug during disassembly of some particular instruction. Currently, I do not have time to look at it, but maybe @vobst has some time?

@vobst
Copy link
Collaborator

vobst commented Jun 3, 2024

@Enkelmann I will take a look :)

@vobst
Copy link
Collaborator

vobst commented Jun 13, 2024

tl;dr: Piecing together a full register assignment from a subregister assignment produces a semantically invalid expression if the subregister has the same name as the full register.

Bug (long version)

The offending CPU instruction and its low pcode are:

  08000272 85 fa     uadd8  r5,r5,r7
           47 f5
                     $Ud0700:4 = INT_ZEXT r5:1
                     $Ud0780:4 = INT_ZEXT r7:1
                     $Ud0880:4 = INT_ADD $Ud0700:4, $Ud0780:4
                     $Ud0900:4 = INT_ZEXT *[register]0x35:1
                     $Ud0980:4 = INT_ZEXT *[register]0x3d:1
                     $Ud0a80:4 = INT_ADD $Ud0900:4, $Ud0980:4
                     $Ud0b00:4 = INT_ZEXT *[register]0x36:1
                     $Ud0b80:4 = INT_ZEXT *[register]0x3e:1
                     $Ud0c80:4 = INT_ADD $Ud0b00:4, $Ud0b80:4
                     $Ud0d00:4 = INT_ZEXT *[register]0x37:1
                     $Ud0d80:4 = INT_ZEXT *[register]0x3f:1
                     $Ud0e80:4 = INT_ADD $Ud0d00:4, $Ud0d80:4
                     GE1 = INT_CARRY r5:1, r7:1
                     GE2 = INT_CARRY *[register]0x35:1, *[register]0x3d:1
                     GE3 = INT_CARRY *[register]0x36:1, *[register]0x3e:1
                     GE4 = INT_CARRY *[register]0x37:1, *[register]0x3f:1
                     r5:1 = SUBPIECE $Ud0880:4, 0:4
                     *[register]0x35:1 = SUBPIECE $Ud0a80:4, 0:4
                     *[register]0x36:1 = SUBPIECE $Ud0c80:4, 0:4
                     *[register]0x37:1 = SUBPIECE $Ud0e80:4, 0:4

For reference, here are also the semantics according to the ARM (ARM Reference Manual).

if ConditionPassed() then
    EncodingSpecificOperations();
    sum1 = UInt(R[n]<7:0>) + UInt(R[m]<7:0>);
    sum2 = UInt(R[n]<15:8>) + UInt(R[m]<15:8>);
    sum3 = UInt(R[n]<23:16>) + UInt(R[m]<23:16>);
    sum4 = UInt(R[n]<31:24>) + UInt(R[m]<31:24>);
    R[d]<7:0>   = sum1<7:0>;
    R[d]<15:8>  = sum2<7:0>;
    R[d]<23:16> = sum3<7:0>;
    R[d]<31:24> = sum4<7:0>;
    PSTATE.GE<0>  = if sum1 >= 0x100 then '1' else '0';
    PSTATE.GE<1>  = if sum2 >= 0x100 then '1' else '0';
    PSTATE.GE<2>  = if sum3 >= 0x100 then '1' else '0';
    PSTATE.GE<3>  = if sum4 >= 0x100 then '1' else '0';

The instruction essentially adds the four individual bytes of the registers separately.

Now this is what we get out of the Ghidra plugin

    DEF [id:instr_08000272_0 a:08000272] R<n:$Ucf780 s:4 v:1> = INT_ZEXT(R<n:r5 s:1 v:0>)
    DEF [id:instr_08000272_1 a:08000272] R<n:$Ucf800 s:4 v:1> = INT_ZEXT(R<n:r7 s:1 v:0>)
    DEF [id:instr_08000272_2 a:08000272] R<n:$Ucf900 s:4 v:1> = INT_ADD(R<n:$Ucf780 s:4 v:1>, R<n:$Ucf800 s:4 v:1>)
    DEF [id:instr_08000272_3 a:08000272] R<n:$Ucf980 s:4 v:1> = INT_ZEXT(R<n:$U{}_unnamed s:1 v:1>)
    DEF [id:instr_08000272_4 a:08000272] R<n:$Ucfa00 s:4 v:1> = INT_ZEXT(R<n:$U{}_unnamed s:1 v:1>)
    DEF [id:instr_08000272_5 a:08000272] R<n:$Ucfb00 s:4 v:1> = INT_ADD(R<n:$Ucf980 s:4 v:1>, R<n:$Ucfa00 s:4 v:1>)
    DEF [id:instr_08000272_6 a:08000272] R<n:$Ucfb80 s:4 v:1> = INT_ZEXT(R<n:$U{}_unnamed s:1 v:1>)
    DEF [id:instr_08000272_7 a:08000272] R<n:$Ucfc00 s:4 v:1> = INT_ZEXT(R<n:$U{}_unnamed s:1 v:1>)
    DEF [id:instr_08000272_8 a:08000272] R<n:$Ucfd00 s:4 v:1> = INT_ADD(R<n:$Ucfb80 s:4 v:1>, R<n:$Ucfc00 s:4 v:1>)
    DEF [id:instr_08000272_9 a:08000272] R<n:$Ucfd80 s:4 v:1> = INT_ZEXT(R<n:$U{}_unnamed s:1 v:1>)
    DEF [id:instr_08000272_10 a:08000272] R<n:$Ucfe00 s:4 v:1> = INT_ZEXT(R<n:$U{}_unnamed s:1 v:1>)
    DEF [id:instr_08000272_11 a:08000272] R<n:$Ucff00 s:4 v:1> = INT_ADD(R<n:$Ucfd80 s:4 v:1>, R<n:$Ucfe00 s:4 v:1>)
    DEF [id:instr_08000272_12 a:08000272] R<n:GE1 s:1 v:0> = INT_CARRY(R<n:r5 s:1 v:0>, R<n:r7 s:1 v:0>)
    DEF [id:instr_08000272_13 a:08000272] R<n:GE2 s:1 v:0> = INT_CARRY(R<n:$U{}_unnamed s:1 v:1>, R<n:$U{}_unnamed s:1 v:1>)
    DEF [id:instr_08000272_14 a:08000272] R<n:GE3 s:1 v:0> = INT_CARRY(R<n:$U{}_unnamed s:1 v:1>, R<n:$U{}_unnamed s:1 v:1>)
    DEF [id:instr_08000272_15 a:08000272] R<n:GE4 s:1 v:0> = INT_CARRY(R<n:$U{}_unnamed s:1 v:1>, R<n:$U{}_unnamed s:1 v:1>)
    DEF [id:instr_08000272_16 a:08000272] R<n:r5 s:1 v:0> = SUBPIECE(R<n:$Ucf900 s:4 v:1>, C<v:00000000 s:4 v:0>)
    DEF [id:instr_08000272_17 a:08000272] R<n:$U{}_unnamed s:1 v:1> = SUBPIECE(R<n:$Ucfb00 s:4 v:1>, C<v:00000000 s:4 v:0>)
    DEF [id:instr_08000272_18 a:08000272] R<n:$U{}_unnamed s:1 v:1> = SUBPIECE(R<n:$Ucfd00 s:4 v:1>, C<v:00000000 s:4 v:0>)
    DEF [id:instr_08000272_19 a:08000272] R<n:$U{}_unnamed s:1 v:1> = SUBPIECE(R<n:$Ucff00 s:4 v:1>, C<v:00000000 s:4 v:0>)

Here we also see an unrelated issue: Reading single bytes of registers is accomplished in pcode by directly accessing the Register Address Space via offsets, e.g., *[register]0x35:1 for reading the 2nd byte of r5, and we are (intentionally) mishandling this case already in our Ghidra plugin.

Now the unnormalized IR looks like this:

    DEF [instr_08000272_0] $Ucf780:4(temp) = IntZExt((r5:4)[0-0])
    DEF [instr_08000272_1] $Ucf800:4(temp) = IntZExt((r7:4)[0-0])
    DEF [instr_08000272_2] $Ucf900:4(temp) = ($Ucf780:4(temp) + $Ucf800:4(temp))
    DEF [instr_08000272_3] $Ucf980:4(temp) = IntZExt($U{}_unnamed:1(temp))
    DEF [instr_08000272_4] $Ucfa00:4(temp) = IntZExt($U{}_unnamed:1(temp))
    DEF [instr_08000272_5] $Ucfb00:4(temp) = ($Ucf980:4(temp) + $Ucfa00:4(temp))
    DEF [instr_08000272_6] $Ucfb80:4(temp) = IntZExt($U{}_unnamed:1(temp))
    DEF [instr_08000272_7] $Ucfc00:4(temp) = IntZExt($U{}_unnamed:1(temp))
    DEF [instr_08000272_8] $Ucfd00:4(temp) = ($Ucfb80:4(temp) + $Ucfc00:4(temp))
    DEF [instr_08000272_9] $Ucfd80:4(temp) = IntZExt($U{}_unnamed:1(temp))
    DEF [instr_08000272_10] $Ucfe00:4(temp) = IntZExt($U{}_unnamed:1(temp))
    DEF [instr_08000272_11] $Ucff00:4(temp) = ($Ucfd80:4(temp) + $Ucfe00:4(temp))
    DEF [instr_08000272_12] GE1:1 = ((r5:4)[0-0] IntCarry (r7:4)[0-0])
    DEF [instr_08000272_13] GE2:1 = ($U{}_unnamed:1(temp) IntCarry $U{}_unnamed:1(temp))
    DEF [instr_08000272_14] GE3:1 = ($U{}_unnamed:1(temp) IntCarry $U{}_unnamed:1(temp))
    DEF [instr_08000272_15] GE4:1 = ($U{}_unnamed:1(temp) IntCarry $U{}_unnamed:1(temp))
    DEF [instr_08000272_16] r5:4 = ((r5:4)[4-3] Piece ($Ucf900:4(temp))[0-0])
    DEF [instr_08000272_17] $U{}_unnamed:1(temp) = ($Ucfb00:4(temp))[0-0]
    DEF [instr_08000272_18] $U{}_unnamed:1(temp) = ($Ucfd00:4(temp))[0-0]
    DEF [instr_08000272_19] $U{}_unnamed:1(temp) = ($Ucff00:4(temp))[0-0]

Here we can see that:

    DEF [id:instr_08000272_16 a:08000272] R<n:r5 s:1 v:0> = SUBPIECE(R<n:$Ucf900 s:4 v:1>, C<v:00000000 s:4 v:0>)

got changed to

    DEF [instr_08000272_16] r5:4 = ((r5:4)[4-3] Piece ($Ucf900:4(temp))[0-0])

Which happens since we convert assignments to subregisters to assignments to the full registers by piecing together a large value from the current register value and the small value that is being assigned. However, here clearly something went wrong as (r5:4)[4-3] is supposed to be the upper 3 bytes of r5 but is in fact a subpiece of size 0. Subpieces of size 0 are illegal which is why the code panics much further down the line when we try to evaluate the expression over the PI abstract domain. Now, the reason why we produce the slice of size 0 is since in

fn replace_output_subregister(&mut self, def: Term<Def>) {
        match &def.term {
            Def::Assign { var, value } => {
                if let Some(register) = self.register_map.get(&var.name) {
                    // var = Variable { name: "r5", size: ByteSize(1), is_temp: false }
                    // value = ($Ucf900:4(temp))[0-0]
                    // register = RegisterProperties { register: "r5", base_register: "r5", lsb: ByteSize(0), size: ByteSize(4) }
                    if var.name != register.base_register || var.size < register.size { // false || true
                        if self.is_next_def_cast_to_base_register(var) { // false
                        ...
                        } else {
                            // base_register = RegisterProperties { register: "r5", base_register: "r5", lsb: ByteSize(0), size: ByteSize(4) }
                            let base_register: &RegisterProperties =
                                self.register_map.get(&register.base_register).unwrap();
                            let output_var: Variable = base_register.into();
                            let output_expression =
                                piece_base_register_assignment_expression_together( // panic in here
                                    value,
                                    base_register,
                                    register,
                                );
...

So the problem is that register (the subregister being assigned) and base_register (the full register) are the same. Of course this is wrong but it happens since they both share the name r5. This leads to the required piece of the full register needed to patch the value for a full-sized assignment being of length zero. Consequently piece_base_register_assignment_expression_together produces a semantically invalid expression.

Fix

As far as I can tell the code in piece_base_register_assignment_expression_together is correct under the following preconditions

register.size < base_register.size AND value.size == register.size

both of which are violated there. I guess an obvious way to fix this would be to set register.size = value.size unconditionally. Probably adding an assertion that value.size == var.size also would not hurt.

Discussion

  • The accidental finding that our treatment of the uadd8 instruction is unsound is quite worrying.
  • The underlying bug is in a part of the code that will likely be part of the big pcode-to-ir-translation-process refactoring. Thus, I'd not put too much love into it now.

@vobst vobst self-assigned this Jun 13, 2024
@vobst vobst added the bug Something isn't working label Jun 13, 2024
vobst pushed a commit to vobst/cwe_checker that referenced this issue Jun 14, 2024
Assignments to subregisters that have the same name as the full register
may lead to expressions that have SUBPIECE operations that would result
in zero-sized values. In general, the subregister substitution produces
semantically incorrect IR code in those cases.

I observed that this occurs when assignments are made to the lower
bytes of a larger register. Those cases should be handled by this
patch. If this may also happen for assignments to other subregisters,
then this patch is incomplete (see FIXME).

Reported-by: https://github.com/ljungnickel
Closes: fkie-cad#469
Signed-off-by: Valentin Obst <valentin.obst@fkie.fraunhofer.de>
@vobst vobst closed this as completed in 3868877 Jun 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants