

A Swiss Army knife of integer data type conversions

The SHUF instruction can be used for many different kinds of integer data type conversions, such as sign extension from smaller to larger data types, byte swizzling and unpacking of packed small data types.

The operation of SHUF is controlled by a 13-bit control word, which can either be given as an immediate value or as a register value.

    dest ← SHUF(src, ctrl)

For each sub-byte of the destination word, any sub-byte of the source word (index 0-3, where 0 is the least significant byte) can be selected freely. Furthermore, each sub-byte of the destination register can either be copied from the selected source sub-byte, or be filled with zeros or with the sign bit (bit 7) of the selected source sub-byte.

Whether a sub-byte should be filled or not is controlled by a single bit per sub-byte in the control word.

Whether a filled sub-byte should be zero- or sign-filled is selected by the sign mode bit (bit 12) in the control word.

As an example, the least significant signed byte of register S1 can be sign-extended to a 32-bit word (stored in S2) using the following instruction (the control word fields are explained below):

    shuf s2, s1, #0b1100100100000
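
The behaviour described above can be sketched as a small software model. This is only an illustration, not the reference implementation: the Python function name `shuf` is an assumption, and the exact bit layout (a 3-bit group per destination sub-byte, with the fill bit above the two index bits, and sub-byte 0 in the lowest bits) is inferred from the sign mode bit position and from the example immediate above.

```python
def shuf(src: int, ctrl: int) -> int:
    """Hypothetical software model of SHUF on a 32-bit word."""
    sign_mode = (ctrl >> 12) & 1               # bit 12: 0 = zero fill, 1 = sign fill
    dest = 0
    for n in range(4):                         # destination sub-byte n (0 = least significant)
        field = (ctrl >> (3 * n)) & 0b111      # assumed layout: fill bit, then 2 index bits
        fill = (field >> 2) & 1
        index = field & 0b11
        byte = (src >> (8 * index)) & 0xFF     # selected source sub-byte
        if fill:
            # Replace the sub-byte with all zeros, or with copies of its bit 7.
            byte = 0xFF if (sign_mode and (byte & 0x80)) else 0x00
        dest |= byte << (8 * n)
    return dest


# The sign extension example from the text.
print(hex(shuf(0x12349ABC, 0b1100100100000)))  # 0xffffffbc
```

Under these assumptions the model reproduces the instruction shown above: the low byte 0xBC is copied, and the three upper sub-bytes are filled with its sign bit.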

More examples of different operations are given below.

Control word legend

| Short | Description | Values |
|-------|-------------|--------|
| S | Sign mode | 0: zero fill, 1: sign extend |
| Fn | Copy / fill mode n | 0: copy byte, 1: fill byte |
| In | Source byte index n | 0-3, "--"=don't care |
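
To show how the fields in the legend combine into a 13-bit control word, here is a minimal sketch of an encoder. The helper name `make_ctrl` and its argument order are hypothetical, and the packing order (S in bit 12, followed by F3 I3 down to F0 I0) is an assumption read off the control word tables in this document. A "don't care" index can be passed as 0.

```python
def make_ctrl(sign, fills, indices):
    """Pack S, (F3, F2, F1, F0) and (I3, I2, I1, I0) into a 13-bit control word.

    Hypothetical helper: the field order is assumed from the tables in this document.
    """
    ctrl = (sign & 1) << 12
    # Walk from sub-byte 0 upwards; each sub-byte gets a 3-bit group (Fn, In).
    for n, (f, i) in enumerate(zip(reversed(fills), reversed(indices))):
        ctrl |= (((f & 1) << 2) | (i & 0b11)) << (3 * n)
    return ctrl


# Signed byte to word: S=1, F3..F0 = 1,1,1,0 and I3..I0 = 0,0,0,0.
print(bin(make_ctrl(1, (1, 1, 1, 0), (0, 0, 0, 0))))  # 0b1100100100000
```

The printed value matches the immediate used in the sign extension instruction earlier in the text.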

Sign extension

Signed byte to word

| S | F3 | I3 | F2 | I2 | F1 | I1 | F0 | I0 |
|---|----|----|----|----|----|----|----|----|
| 1 | 1  | 00 | 1  | 00 | 1  | 00 | 0  | 00 |


  • 0x12349ABC → 0xFFFFFFBC
  • 0xDEF05678 → 0x00000078

Signed half-word to word

| S | F3 | I3 | F2 | I2 | F1 | I1 | F0 | I0 |
|---|----|----|----|----|----|----|----|----|
| 1 | 1  | 01 | 1  | 01 | 0  | 01 | 0  | 00 |


  • 0x12349ABC → 0xFFFF9ABC
  • 0xDEF05678 → 0x00005678
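
For comparison, the two sign extension results can be checked against plain integer arithmetic. This is a sketch only: `sign_extend` is a hypothetical helper that uses the common xor-and-subtract idiom, not a description of how SHUF is implemented.

```python
def sign_extend(value: int, bits: int) -> int:
    """Sign extend the low `bits` bits of `value` to a 32-bit word."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    low = value & mask
    # Flipping and then subtracting the sign bit makes negative inputs wrap around.
    return ((low ^ sign) - sign) & 0xFFFFFFFF


print(hex(sign_extend(0x12349ABC, 8)))   # 0xffffffbc
print(hex(sign_extend(0x12349ABC, 16)))  # 0xffff9abc
```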

Extract packed data

Extract the most significant, unsigned byte

| S | F3 | I3 | F2 | I2 | F1 | I1 | F0 | I0 |
|---|----|----|----|----|----|----|----|----|
| 0 | 1  | -- | 1  | -- | 1  | -- | 0  | 11 |


  • 0x12349ABC → 0x00000012
  • 0xDEF05678 → 0x000000DE

Extract the most significant, signed half-word

| S | F3 | I3 | F2 | I2 | F1 | I1 | F0 | I0 |
|---|----|----|----|----|----|----|----|----|
| 1 | 1  | 11 | 1  | 11 | 0  | 11 | 0  | 10 |


  • 0x12349ABC → 0x00001234
  • 0xDEF05678 → 0xFFFFDEF0

Reverse endianness

Reverse byte order

| S | F3 | I3 | F2 | I2 | F1 | I1 | F0 | I0 |
|---|----|----|----|----|----|----|----|----|
| 0 | 0  | 00 | 0  | 01 | 0  | 10 | 0  | 11 |


  • 0x12349ABC → 0xBC9A3412
  • 0xDEF05678 → 0x7856F0DE

Reverse half-word order

| S | F3 | I3 | F2 | I2 | F1 | I1 | F0 | I0 |
|---|----|----|----|----|----|----|----|----|
| 0 | 0  | 01 | 0  | 00 | 0  | 11 | 0  | 10 |


  • 0x12349ABC → 0x9ABC1234
  • 0xDEF05678 → 0x5678DEF0
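
The two reordering results above can be cross-checked with ordinary integer operations. A minimal sketch, assuming 32-bit unsigned words; the function names are hypothetical and only serve as an independent check of the listed values.

```python
def reverse_bytes(word: int) -> int:
    """Reverse the byte order of a 32-bit word."""
    # Serialise as little-endian, then read the same four bytes back as big-endian.
    return int.from_bytes(word.to_bytes(4, "little"), "big")


def reverse_halfwords(word: int) -> int:
    """Swap the upper and lower 16-bit halves of a 32-bit word."""
    return ((word << 16) | (word >> 16)) & 0xFFFFFFFF


print(hex(reverse_bytes(0x12349ABC)))      # 0xbc9a3412
print(hex(reverse_halfwords(0x12349ABC)))  # 0x9abc1234
```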


Duplicate the least significant byte

| S | F3 | I3 | F2 | I2 | F1 | I1 | F0 | I0 |
|---|----|----|----|----|----|----|----|----|
| 0 | 0  | 00 | 0  | 00 | 0  | 00 | 0  | 00 |


  • 0x12349ABC → 0xBCBCBCBC
  • 0xDEF05678 → 0x78787878

Convert RGBA to ARGB (32-bit color)

| S | F3 | I3 | F2 | I2 | F1 | I1 | F0 | I0 |
|---|----|----|----|----|----|----|----|----|
| 0 | 0  | 00 | 0  | 11 | 0  | 10 | 0  | 01 |


  • 0x12349ABC → 0xBC12349A
  • 0xDEF05678 → 0x78DEF056
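
The last two examples also have simple arithmetic equivalents, sketched here as a sanity check. The function names are hypothetical, and the pixel layout (R in the most significant byte of the RGBA word, A in the most significant byte of the ARGB word) is an assumption inferred from the control word above.

```python
def duplicate_low_byte(word: int) -> int:
    """Copy the least significant byte into all four byte positions."""
    return (word & 0xFF) * 0x01010101


def rgba_to_argb(pixel: int) -> int:
    """Rotate a 32-bit word right by 8 bits (assumed 0xRRGGBBAA to 0xAARRGGBB)."""
    return ((pixel >> 8) | (pixel << 24)) & 0xFFFFFFFF


print(hex(duplicate_low_byte(0x12349ABC)))  # 0xbcbcbcbc
print(hex(rgba_to_argb(0x12349ABC)))        # 0xbc12349a
```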