Intrinsics are high-level functions implemented in C and based on several ISAs. Their main purpose is to simulate these architectures in SiNUCA (Simulator of Non-Uniform Caches).

ascordeiro/intrinsics

--------------------------------------------------------------------------------
-
- Aline Santana Cordeiro - ascordeiro@inf.ufpr.br
- LSE - Embedded Systems Laboratory - 2018
- PPGInf - Federal University of Paraná
-
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-                               Intrinsics Library                             -
--------------------------------------------------------------------------------
-
-   The Intrinsics-HMC, Intrinsics-MIPS, and Intrinsics-VIMA libraries were
- developed to simulate the execution of three ISAs: the Hybrid Memory Cube
- (HMC) ISA described in the HMC 2.1 specification
- (http://www.hybridmemorycube.org), MIPS, and our Vector-in-Memory
- architecture (VIMA), which is based on the Processing-in-Memory architecture
- but also allows the execution of vector instructions. The main purpose is to
- write a C or C++ program using these libraries and generate a trace that
- simulates the program's execution on one of these architectures. To do so,
- pass the program's binary file as input to SiNUCA-tracer, which interprets
- the library instructions and generates the traces in a simulator-specific
- format.
-
--------------------------------------------------------------------------------
-
- DATA TYPES -
-
- HMC:
- __h16u1:  16-bit unsigned integer;
- __h64u1:  64-bit unsigned integer;
- __h64u2:  Two 64-bit unsigned integers in a vector;
- __h128u1: 128-bit unsigned integer;
-
- MIPS:
- __m32s1: 32-bit signed integer;
- __m32u1: 32-bit unsigned integer;
- __m64s1: 64-bit signed integer;
- __m64u1: 64-bit unsigned integer;
-
- VIMA:
- __v32s: 32-bit signed integer;
- __v32u: 32-bit unsigned integer;
- __v64s: 64-bit signed integer;
- __v64u: 64-bit unsigned integer;
- __v32f: 32-bit floating point;
-
-   The data type names follow the pattern used by Intel Intrinsics. Every
- type name starts with two underscore characters. The third character
- identifies the chosen architecture (h for HMC, m for MIPS, v for VIMA). The
- following digits give the variable size in bits. The next character indicates
- whether the number is signed (s) or unsigned (u); for VIMA, float (f) is also
- an option. For HMC and MIPS, the last digit is the number of variables held
- by the type. VIMA types omit it because VIMA instructions are vectorized and
- the vector length is encoded in the instruction name instead.
-
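- As an illustration of the naming scheme above, the sketch below maps a few
- of the names onto fixed-width C types. The sk_ names are hypothetical
- stand-ins: the library's real definitions are not shown in this README and
- may differ. Modeling the 128-bit and two-element types as structs is also an
- assumption.
-
```c
#include <stdint.h>

/* Hypothetical stand-ins for the documented type names. */
typedef uint16_t sk_h16u1;                       /* like __h16u1  */
typedef uint64_t sk_h64u1;                       /* like __h64u1  */
typedef struct { uint64_t v[2]; } sk_h64u2;      /* like __h64u2  */
typedef struct { uint64_t lo, hi; } sk_h128u1;   /* like __h128u1 */
typedef int32_t  sk_m32s1;                       /* like __m32s1  */
typedef uint32_t sk_m32u1;                       /* like __m32u1  */
typedef int32_t  sk_v32s;                        /* like __v32s   */
typedef float    sk_v32f;                        /* like __v32f   */

/* Size in bits of a type, to compare against the digits in its name. */
#define SK_BITS(T) (sizeof(T) * 8u)
```
-
- With these stand-ins, SK_BITS(sk_h64u2) is 128 because the type holds two
- 64-bit values, even though its name carries the per-variable size 64.
-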
--------------------------------------------------------------------------------
-
-------------------------------- HMC FUNCTIONS ---------------------------------
-
- The implementation is based on the HMC 2.1 specification but does not
- exactly reproduce the behavior described in it.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- ARITHMETIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Dual 8-Byte Signed Add Immediate:
- __h64u2 *_hmc64_saddimm_d(__h64u2 *mem_op, __h64u2 *imm_op)
- Adds two memory operands (8-byte each) to two immediate operands. Each
- immediate operand must be a 4-byte two's-complement value padded on the left
- with zeros (4-byte). The result is returned to the caller.
-
- Single 16-Byte Signed Add Immediate:
- __h128u1 _hmc128_saddimm_s(__h128u1 *mem_op, __h128u1 *imm_op)
- Adds a memory operand (16-byte) to an immediate operand in two's complement,
- padded on the left with zeros (16-byte). The result is returned to the
- caller.
-
- 8-Byte Increment:
- __h64u1 _hmc64_incr_s(__h64u1 *mem_op)
- Increments a memory operand (8-byte) by one. The result is returned to the
- caller.
-
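- The arithmetic operations above can be sketched in plain C as follows. This
- is not the library's implementation: the sk_ names and the sk_u64x2 struct
- are hypothetical, the pair is returned by value rather than through a
- pointer, and nothing is written back to memory because the descriptions only
- mention a returned result.
-
```c
#include <stdint.h>

typedef struct { uint64_t v[2]; } sk_u64x2;   /* stand-in for __h64u2 */

/* Adds each immediate to the matching memory operand and returns both
 * sums. Unsigned arithmetic wraps modulo 2^64, which gives the same bit
 * pattern as two's-complement addition. */
sk_u64x2 sk_dual_add(const sk_u64x2 *mem_op, const sk_u64x2 *imm_op)
{
    sk_u64x2 r;
    r.v[0] = mem_op->v[0] + imm_op->v[0];
    r.v[1] = mem_op->v[1] + imm_op->v[1];
    return r;
}

/* Returns the memory operand plus one; the operand itself is untouched. */
uint64_t sk_incr(const uint64_t *mem_op)
{
    return *mem_op + 1u;
}
```
-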
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- LOGIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 8-Byte Bit Write:
- __h64u1 _hmc64_bwrite_s(__h64u1 mem_op, __h64u1 imm_op, __h64u1 mask)
- The mask field (8-byte) selects which bits of the immediate operand (8-byte)
- are written to the same positions of the memory operand (8-byte). The result
- is returned to the caller.
-
- 16-Byte Swap:
- __h128u1 _hmc128_bswap_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores the immediate operand value into the memory operand address (16-Byte).
- Returns the original memory operand value (16-Byte).
-
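- The bit-write selection described for _hmc64_bwrite_s can be expressed with
- two masks and an OR. The function below is a hypothetical sketch of that
- rule on a plain 64-bit integer, not the library's code.
-
```c
#include <stdint.h>

/* Takes imm_op bits where mask is 1 and keeps mem_op bits where mask is 0. */
uint64_t sk_bit_write(uint64_t mem_op, uint64_t imm_op, uint64_t mask)
{
    return (mem_op & ~mask) | (imm_op & mask);
}
```
-
- For example, with mem_op = 0xFF00, imm_op = 0x00AB and mask = 0x00FF, only
- the low byte is replaced and the result is 0xFFAB.
-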
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- BOOLEAN:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 16-Byte AND:
- __h128u1 _hmc128_and_s(__h128u1 mem_op, __h128u1 imm_op)
- Computes the AND of the memory operand (16-byte) and the immediate operand
- (16-byte) and stores the result at the memory operand address. Returns the
- original memory operand value.
-
- 16-Byte NAND:
- __h128u1 _hmc128_nand_s(__h128u1 mem_op, __h128u1 imm_op)
- Computes the NAND of the memory operand (16-byte) and the immediate operand
- (16-byte) and stores the result at the memory operand address. Returns the
- original memory operand value.
-
- 16-Byte NOR:
- __h128u1 _hmc128_nor_s(__h128u1 mem_op, __h128u1 imm_op)
- Computes the NOR of the memory operand (16-byte) and the immediate operand
- (16-byte) and stores the result at the memory operand address. Returns the
- original memory operand value.
-
- 16-Byte OR:
- __h128u1 _hmc128_or_s(__h128u1 mem_op, __h128u1 imm_op)
- Computes the OR of the memory operand (16-byte) and the immediate operand
- (16-byte) and stores the result at the memory operand address. Returns the
- original memory operand value.
-
- 16-Byte XOR:
- __h128u1 _hmc128_xor_s(__h128u1 mem_op, __h128u1 imm_op)
- Computes the XOR of the memory operand (16-byte) and the immediate operand
- (16-byte) and stores the result at the memory operand address. Returns the
- original memory operand value.
-
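- All five Boolean operations share one pattern: combine, store, and return
- the previous value. The sketch below shows it for NAND. It is only an
- illustration: sk_u128 and sk_nand128 are hypothetical names, and a pointer
- is used so the store is visible to the caller, whereas the documented
- signatures take mem_op by value.
-
```c
#include <stdint.h>

typedef struct { uint64_t lo, hi; } sk_u128;   /* stand-in for __h128u1 */

/* Writes ~(mem & imm) to *mem_op and returns the value held before. */
sk_u128 sk_nand128(sk_u128 *mem_op, sk_u128 imm_op)
{
    sk_u128 old = *mem_op;
    mem_op->lo = ~(old.lo & imm_op.lo);
    mem_op->hi = ~(old.hi & imm_op.hi);
    return old;
}
```
-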
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- COMPARISON:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 8-Byte Compare and Swap if Greater Than:
- __h128u1 _hmc64_cmpswapgt_s(__h64u1 mem_op, __h64u1 imm_op)
- Stores the greater of the memory operand (8-byte) and the immediate operand
- (8-byte) at the memory operand address. Returns the original memory operand
- value.
-
- 8-Byte Compare and Swap if Less Than:
- __h128u1 _hmc64_cmpswaplt_s(__h64u1 mem_op, __h64u1 imm_op)
- Stores the smaller of the memory operand (8-byte) and the immediate operand
- (8-byte) at the memory operand address. Returns the original memory operand
- value.
-
- 16-Byte Compare and Swap if Greater Than:
- __h128u1 _hmc128_cmpswapgt_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores the greater of the memory operand (16-byte) and the immediate operand
- (16-byte) at the memory operand address. Returns the original memory operand
- value.
-
- 16-Byte Compare and Swap if Less Than:
- __h128u1 _hmc128_cmpswaplt_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores the smaller of the memory operand (16-byte) and the immediate operand
- (16-byte) at the memory operand address. Returns the original memory operand
- value.
-
- 8-Byte Compare and Swap if Equal:
- __h64u1 _hmc64_cmpswapeq_s(__h64u1 mem_op, __h64u1 imm_op, __h64u1 cmp_field)
- Compares the cmp_field (8-byte) with memory operand value (8-byte). If equal,
- stores the immediate operand value (8-byte) into the memory operand address.
- Returns the original memory operand value.
-
- 16-Byte Compare and Swap if Zero:
- __h128u1 _hmc128_cmpswapz_s(__h128u1 mem_op, __h128u1 imm_op)
- Compares the memory operand value (16-byte) with zero. If equal, stores the
- immediate operand value (16-byte) into the memory operand address. Returns
- the original memory operand value.
-
- 8-Byte Equal To:
- __h16u1 _hmc64_cmpswapgt_s(__h64u1 mem_op, __h64u1 imm_op)
- Checks whether the memory operand value (8-byte) is equal to the immediate
- operand value (8-byte). Returns 1 if equal, 0 if not.
-
- 16-Byte Equal To:
- __h16u1 _hmc128_cmpswapgt_s(__h128u1 mem_op, __h128u1 imm_op)
- Checks whether the memory operand value (16-byte) is equal to the immediate
- operand value (16-byte). Returns 1 if equal, 0 if not.
-
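- The compare-and-swap descriptions above can be sketched on 64-bit values as
- shown below. The sk_ functions are hypothetical, compare the operands as
- unsigned integers (an assumption based on the unsigned types), and take a
- pointer so the conditional store is observable.
-
```c
#include <stdint.h>

/* Keeps the larger of *mem_op and imm_op in memory; returns the old value. */
uint64_t sk_cmpswapgt(uint64_t *mem_op, uint64_t imm_op)
{
    uint64_t old = *mem_op;
    if (imm_op > old)
        *mem_op = imm_op;
    return old;
}

/* Stores imm_op only when the memory value equals cmp_field; always
 * returns the value held before the call. */
uint64_t sk_cmpswapeq(uint64_t *mem_op, uint64_t imm_op, uint64_t cmp_field)
{
    uint64_t old = *mem_op;
    if (old == cmp_field)
        *mem_op = imm_op;
    return old;
}
```
-
- Because the old value is always returned, the caller can tell whether the
- swap happened by comparing the return value with what it expected.
-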
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- CUSTOMIZED:
- These instructions were created to better fit common programming problems.
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 8-Byte Compare if Greater Than or Equal:
- __h16u1 _hmc64_cmpgteq_s(__h64u1 *mem_op, __h64u1 imm_op)
- Checks whether the memory operand value is less than or equal to the
- immediate operand value. If true, returns 0 to the caller; otherwise,
- returns 1.
-
- 8-Byte Compare if Less Than or Equal:
- __h16u1 _hmc64_cmplteq_s(__h64u1 *mem_op, __h64u1 imm_op)
- Checks whether the memory operand value is greater than or equal to the
- immediate operand value. If true, returns 0 to the caller; otherwise,
- returns 1.
-
- 8-Byte Compare if Less Than:
- __h16u1 _hmc64_cmplt_s(__h64u1 *mem_op, __h64u1 imm_op)
- Checks whether the memory operand value is greater than the immediate
- operand value. If true, returns 0 to the caller; otherwise, returns 1.
-
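- Read literally, each customized comparison tests a condition on the memory
- operand and returns 0 when that condition holds. The sketch below follows
- the wording for two of them. The sk_ names are hypothetical and the unsigned
- comparison is an assumption.
-
```c
#include <stdint.h>

/* Returns 0 when the memory value is <= the immediate, 1 otherwise. */
uint16_t sk_cmpgteq(const uint64_t *mem_op, uint64_t imm_op)
{
    return (*mem_op <= imm_op) ? 0u : 1u;
}

/* Returns 0 when the memory value is > the immediate, 1 otherwise. */
uint16_t sk_cmplt(const uint64_t *mem_op, uint64_t imm_op)
{
    return (*mem_op > imm_op) ? 0u : 1u;
}
```
-
- Callers that expect the usual C convention (nonzero means true) need to
- invert the result.
-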
--------------------------------------------------------------------------------
-
-------------------------------- MIPS FUNCTIONS --------------------------------
-
- The implementation is based on the MIPS ISA but does not exactly reproduce
- the behavior of the architecture. All instructions were implemented except
- memory and floating-point operations.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- ARITHMETIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Add:
- __m32s1 _mips_add(__m32s1 rs, __m32s1 rt)
- Adds the rs and rt registers and returns the result to the caller.
-
- Add Unsigned:
- __m32u1 _mips_addu(__m32u1 rs, __m32u1 rt)
- Adds the rs and rt registers and returns the result to the caller.
-
- Subtract:
- __m32s1 _mips_sub(__m32s1 rs, __m32s1 rt)
- Subtracts rt from rs and returns the result to the caller.
-
- Subtract Unsigned:
- __m32u1 _mips_subu(__m32u1 rs, __m32u1 rt)
- Subtracts rt from rs and returns the result to the caller.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- IMMEDIATE ARITHMETIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Add Immediate:
- __m32s1 _mips_addi(__m32s1 rs, __m32s1 imm_op)
- Adds the rs register and the immediate operand and returns the result to the
- caller.
-
- Add Immediate Unsigned:
- __m32u1 _mips_addiu(__m32u1 rs, __m32u1 imm_op)
- Adds the rs register and the immediate operand and returns the result to the
- caller.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- LOGIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- And:
- __m32u1 _mips_and(__m32u1 rs, __m32u1 rt)
- Applies the AND operation to the rs and rt registers and returns the result
- to the caller.
-
- Nor:
- __m32u1 _mips_nor(__m32u1 rs, __m32u1 rt)
- Applies the NOR operation to the rs and rt registers and returns the result
- to the caller.
-
- Or:
- __m32u1 _mips_or(__m32u1 rs, __m32u1 rt)
- Applies the OR operation to the rs and rt registers and returns the result
- to the caller.
-
- Xor:
- __m32u1 _mips_xor(__m32u1 rs, __m32u1 rt)
- Applies the XOR operation to the rs and rt registers and returns the result
- to the caller.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- IMMEDIATE LOGIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- And Immediate:
- __m32u1 _mips_andi(__m32u1 rs, __m32u1 imm_op)
- Applies the AND operation to the rs register and the immediate operand and
- returns the result to the caller.
-
- Or Immediate:
- __m32u1 _mips_ori(__m32u1 rs, __m32u1 imm_op)
- Applies the OR operation to the rs register and the immediate operand and
- returns the result to the caller.
-
- Xor Immediate:
- __m32u1 _mips_xori(__m32u1 rs, __m32u1 imm_op)
- Applies the XOR operation to the rs register and the immediate operand and
- returns the result to the caller.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- COMPARISON:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Set Less Than:
- __m32s1 _mips_slt(__m32s1 rs, __m32s1 rt)
- Compares the rs and rt registers. Returns 1 if rs is less than rt, 0
- otherwise.
-
- Set Less Than Unsigned:
- __m32u1 _mips_sltu(__m32u1 rs, __m32u1 rt)
- Compares the rs and rt registers. Returns 1 if rs is less than rt, 0
- otherwise.
-
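- The signed and unsigned comparisons differ only in how the same 32 bits are
- interpreted. A minimal sketch with hypothetical sk_ functions (not the
- library's code):
-
```c
#include <stdint.h>

/* Signed set-less-than: operands are two's-complement integers. */
int32_t sk_slt(int32_t rs, int32_t rt)
{
    return rs < rt ? 1 : 0;
}

/* Unsigned set-less-than: the same bits are read as magnitudes. */
uint32_t sk_sltu(uint32_t rs, uint32_t rt)
{
    return rs < rt ? 1u : 0u;
}
```
-
- The bit pattern 0xFFFFFFFF is -1 when signed, so sk_slt(-1, 1) returns 1,
- but it is 4294967295 when unsigned, so sk_sltu(0xFFFFFFFF, 1) returns 0.
-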
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- IMMEDIATE COMPARISON:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Set Less Than Immediate:
- __m32s1 _mips_slti(__m32s1 rs, __m32s1 imm_op)
- Compares the rs register with the immediate operand. Returns 1 if rs is less
- than the immediate operand, 0 otherwise.
-
- Set Less Than Immediate Unsigned:
- __m32u1 _mips_sltiu(__m32u1 rs, __m32u1 imm_op)
- Compares the rs register with the immediate operand. Returns 1 if rs is less
- than the immediate operand, 0 otherwise.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SHIFT:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Shift Left Logical:
- __m32u1 _mips_sll(__m32u1 rt, __m32u1 shamt)
- Shifts the rt register left by shamt bits. Returns the result to the caller.
-
- Shift Right Logical:
- __m32u1 _mips_srl(__m32u1 rt, __m32u1 shamt)
- Shifts the rt register right by shamt bits. Returns the result to the
- caller.
-
- Shift Right Arithmetic:
- __m32s1 _mips_sra(__m32s1 rt, __m32s1 shamt)
- Shifts the rt register right by shamt bits, preserving the sign of rt.
- Returns the result to the caller.
-
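- The three shifts can be sketched portably in C as below. The sk_ names are
- hypothetical, and masking shamt to 5 bits is an assumption (it mirrors the
- 5-bit shift field of MIPS and avoids undefined behavior in C for shifts of
- 32 or more). The arithmetic shift avoids right-shifting a negative value,
- which is implementation-defined in C.
-
```c
#include <stdint.h>

/* Logical left shift: vacated low bits become 0. */
uint32_t sk_sll(uint32_t rt, uint32_t shamt)
{
    return rt << (shamt & 31u);
}

/* Logical right shift: vacated high bits become 0. */
uint32_t sk_srl(uint32_t rt, uint32_t shamt)
{
    return rt >> (shamt & 31u);
}

/* Arithmetic right shift: vacated high bits copy the sign bit.
 * For negative rt, ~rt is non-negative, so shifting it is well defined. */
int32_t sk_sra(int32_t rt, uint32_t shamt)
{
    uint32_t s = shamt & 31u;
    return rt < 0 ? ~(~rt >> s) : rt >> s;
}
```
-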
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- MULTIPLICATION/DIVISION:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Divide Only:
- __m32s1 _mips_div(__m32s1 rs, __m32s1 rt)
- Returns to the caller the quotient of rs divided by rt.
-
- Divide Only Unsigned:
- __m32u1 _mips_divu(__m32u1 rs, __m32u1 rt)
- Returns to the caller the quotient of rs divided by rt.
-
- Modulo Only:
- __m32s1 _mips_mod(__m32s1 rs, __m32s1 rt)
- Returns to the caller the remainder of rs divided by rt.
-
- Modulo Only Unsigned:
- __m32u1 _mips_modu(__m32u1 rs, __m32u1 rt)
- Returns to the caller the remainder of rs divided by rt.
-
- Multiply 32-bits:
- __m32s1 _mips_mult32(__m32s1 rs, __m32s1 rt)
- Multiplies the rs and rt registers and returns the 32-bit result to the
- caller.
-
- Multiply 32-bits Unsigned:
- __m32u1 _mips_multu32(__m32u1 rs, __m32u1 rt)
- Multiplies the rs and rt registers and returns the 32-bit result to the
- caller.
-
- Multiply 64-bits:
- __m64s1 _mips_mult64(__m32s1 rs, __m32s1 rt)
- Multiplies the rs and rt registers and returns the 64-bit result to the
- caller.
-
- Multiply 64-bits Unsigned:
- __m64u1 _mips_multu64(__m32u1 rs, __m32u1 rt)
- Multiplies the rs and rt registers and returns the 64-bit result to the
- caller.
-
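- The difference between the 32-bit and 64-bit multiplies, and between divide
- and modulo, can be sketched as below. The sk_ functions are hypothetical
- and assume a 32-bit unsigned int. How the library treats overflow or a zero
- divisor is not stated in this README, so the sketch leaves a zero divisor to
- the caller.
-
```c
#include <stdint.h>

/* 32-bit unsigned product: only the low 32 bits are kept (wraps). */
uint32_t sk_multu32(uint32_t rs, uint32_t rt)
{
    return rs * rt;
}

/* 64-bit products: operands are widened first, so nothing is lost. */
int64_t sk_mult64(int32_t rs, int32_t rt)
{
    return (int64_t)rs * (int64_t)rt;
}

uint64_t sk_multu64(uint32_t rs, uint32_t rt)
{
    return (uint64_t)rs * (uint64_t)rt;
}

/* Quotient and remainder; C truncates toward zero. rt must not be 0. */
int32_t sk_div(int32_t rs, int32_t rt) { return rs / rt; }
int32_t sk_mod(int32_t rs, int32_t rt) { return rs % rt; }
```
-
- For instance, 100000 * 100000 is 10000000000 as a 64-bit product but wraps
- to 1410065408 when only 32 bits are kept.
-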
--------------------------------------------------------------------------------
-
-------------------------------- VIMA FUNCTIONS -------------------------------
-
- The implementation is based on the MIPS and ARM NEON specifications, and the
- model is inspired by the HIVE module for HMC in order to vectorize data
- transfers inside the HMC. It does not exactly reproduce the behavior
- described in these specifications. Because VIMA implements vectorized
- instructions, the vector sizes are specified below.
-
- VM64I: 256-byte array (64 elements) for integer types;
- VM2KI: 8-Kbyte array (2048 elements) for integer types;
- VM32L: 256-byte array (32 elements) for long integer types;
- VM1KL: 8-Kbyte array (1024 elements) for long integer types;
-
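- The byte sizes above follow from the element count times the element width
- (4 bytes for 32-bit integers, 8 bytes for 64-bit long integers). The
- constants and helpers below are hypothetical names used only to check that
- arithmetic.
-
```c
#include <stddef.h>
#include <stdint.h>

/* Element counts implied by the vector size names. */
enum {
    SK_VM64I_ELEMS = 64,
    SK_VM2KI_ELEMS = 2048,
    SK_VM32L_ELEMS = 32,
    SK_VM1KL_ELEMS = 1024
};

/* Bytes occupied by a vector of 32-bit or 64-bit elements. */
size_t sk_bytes32(size_t elems) { return elems * sizeof(int32_t); }
size_t sk_bytes64(size_t elems) { return elems * sizeof(int64_t); }
```
-
- Both short vectors occupy 256 bytes and both long vectors occupy 8192 bytes,
- so a buffer passed to a VIMA function must be at least that large.
-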
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- ARITHMETIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Add (64 integers):
- _vim64_iadds(__v32s *a, __v32s *b, __v32s *c)
- Performs signed addition between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit Add (2048 integers):
- _vim2K_iadds(__v32s *a, __v32s *b, __v32s *c)
- Performs signed addition between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Add Unsigned (64 integers):
- _vim64_iaddu(__v32u *a, __v32u *b, __v32u *c)
- Performs unsigned addition between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit Add Unsigned (2048 integers):
- _vim2K_iaddu(__v32u *a, __v32u *b, __v32u *c)
- Performs unsigned addition between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Subtract (64 integers):
- _vim64_isubs(__v32s *a, __v32s *b, __v32s *c)
- Performs signed subtraction between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit Subtract (2048 integers):
- _vim2K_isubs(__v32s *a, __v32s *b, __v32s *c)
- Performs signed subtraction between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Subtract Unsigned (64 integers):
- _vim64_isubu(__v32u *a, __v32u *b, __v32u *c)
- Performs unsigned subtraction between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit Subtract Unsigned (2048 integers):
- _vim2K_isubu(__v32u *a, __v32u *b, __v32u *c)
- Performs unsigned subtraction between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Abs (64 integers):
- _vim64_iabss(__v32s *a, __v32s *b)
- Takes the absolute value of each 32-bit element of source vector A[0:63]
- and stores it in the destination vector B[0:63].
-
- 32-bit Abs (2048 integers):
- _vim2K_iabss(__v32s *a, __v32s *b)
- Takes the absolute value of each 32-bit element of source vector A[0:2047]
- and stores it in the destination vector B[0:2047].
-
- 32-bit Max (64 integers):
- _vim64_imaxs(__v32s *a, __v32s *b, __v32s *c)
- Finds the maximum of each pair of 32-bit elements of source vectors A[0:63]
- and B[0:63] and stores it in the destination vector C[0:63].
-
- 32-bit Max (2048 integers):
- _vim2K_imaxs(__v32s *a, __v32s *b, __v32s *c)
- Finds the maximum of each pair of 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores it in the destination vector C[0:2047].
-
- 32-bit Min (64 integers):
- _vim64_imins(__v32s *a, __v32s *b, __v32s *c)
- Finds the minimum of each pair of 32-bit elements of source vectors A[0:63]
- and B[0:63] and stores it in the destination vector C[0:63].
-
- 32-bit Min (2048 integers):
- _vim2K_imins(__v32s *a, __v32s *b, __v32s *c)
- Finds the minimum of each pair of 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores it in the destination vector C[0:2047].
-
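- Every VIMA arithmetic function applies one scalar operation to each element
- position. The loops below sketch that behavior for the 64-element unsigned
- add and signed max. The sk_ names are hypothetical, and a real VIMA
- instruction would be executed as a single vector operation in the simulator
- rather than as a C loop.
-
```c
#include <stdint.h>

#define SK_N 64   /* element count of the 64-integer variants */

/* c[i] = a[i] + b[i]; unsigned addition wraps modulo 2^32. */
void sk_iaddu(const uint32_t *a, const uint32_t *b, uint32_t *c)
{
    for (int i = 0; i < SK_N; ++i)
        c[i] = a[i] + b[i];
}

/* c[i] = the larger of a[i] and b[i], compared as signed values. */
void sk_imaxs(const int32_t *a, const int32_t *b, int32_t *c)
{
    for (int i = 0; i < SK_N; ++i)
        c[i] = a[i] > b[i] ? a[i] : b[i];
}
```
-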
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- LOGIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit And (64 integers):
- _vim64_iandu(__v32u *a, __v32u *b, __v32u *c)
- Performs the AND operation between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit And (2048 integers):
- _vim2K_iandu(__v32u *a, __v32u *b, __v32u *c)
- Performs the AND operation between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Or (64 integers):
- _vim64_iorun(__v32u *a, __v32u *b, __v32u *c)
- Performs the OR operation between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit Or (2048 integers):
- _vim2K_iorun(__v32u *a, __v32u *b, __v32u *c)
- Performs the OR operation between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Xor (64 integers):
- _vim64_ixoru(__v32u *a, __v32u *b, __v32u *c)
- Performs the XOR operation between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit Xor (2048 integers):
- _vim2K_ixoru(__v32u *a, __v32u *b, __v32u *c)
- Performs the XOR operation between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Not (64 integers):
- _vim64_inots(__v32s *a, __v32s *b)
- Performs the NOT operation on the 32-bit elements of source vector A[0:63]
- and stores the result in the destination vector B[0:63].
-
- 32-bit Not (2048 integers):
- _vim2K_inots(__v32s *a, __v32s *b)
- Performs the NOT operation on the 32-bit elements of source vector A[0:2047]
- and stores the result in the destination vector B[0:2047].
-
- 32-bit Mask (64 integers):
- _vim64_imsks(__v32s *a, __v32s *b, __v32s *c)
- Inserts each signed 32-bit element of source vector A[0:63] into the
- destination vector C[0:63] if the corresponding 32-bit element of source
- vector B[0:63] is 0; otherwise, it leaves the destination element unchanged.
-
- 32-bit Mask (2048 integers):
- _vim2K_imsks(__v32s *a, __v32s *b, __v32s *c)
- Inserts each signed 32-bit element of source vector A[0:2047] into the
- destination vector C[0:2047] if the corresponding 32-bit element of source
- vector B[0:2047] is 0; otherwise, it leaves the destination element
- unchanged.
-
- 32-bit Masku (64 integers):
- _vim64_imsku(__v32u *a, __v32u *b, __v32u *c)
- Inserts each unsigned 32-bit element of source vector A[0:63] into the
- destination vector C[0:63] if the corresponding 32-bit element of source
- vector B[0:63] is 0; otherwise, it leaves the destination element unchanged.
-
- 32-bit Masku (2048 integers):
- Inserts each unsigned 32-bit element of source vector A[0:2047] into the
- destination vector C[0:2047] if the corresponding 32-bit element of source
- vector B[0:2047] is 0; otherwise, it leaves the destination element
- unchanged.
-
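- The mask rule is easy to misread, since an element is copied when the mask
- value is 0, not when it is nonzero. A hypothetical sketch of the 64-element
- signed variant, written as a plain loop:
-
```c
#include <stdint.h>

#define SK_N 64   /* element count of the 64-integer variants */

/* Copies a[i] into c[i] where b[i] == 0; other c[i] keep their value. */
void sk_imsks(const int32_t *a, const int32_t *b, int32_t *c)
{
    for (int i = 0; i < SK_N; ++i) {
        if (b[i] == 0)
            c[i] = a[i];
    }
}
```
-
- Because untouched positions keep their previous contents, the destination
- vector should be initialized before the call.
-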
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- COMPARISON:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Set Less Than (64 integers):
- _vim64_islts(__v32s *a, __v32s *b, __v32s *c)
- Compares each pair of signed 32-bit elements of source vectors A[0:63] and
- B[0:63]. If the element of A is less than the element of B, the destination
- vector C[0:63] stores 1 in the same position; otherwise, it stores 0.
-
- 32-bit Set Less Than (2048 integers):
- _vim2K_islts(__v32s *a, __v32s *b, __v32s *c)
- Compares each pair of signed 32-bit elements of source vectors A[0:2047] and
- B[0:2047]. If the element of A is less than the element of B, the
- destination vector C[0:2047] stores 1 in the same position; otherwise, it
- stores 0.
-
- 32-bit Set Less Than Unsigned (64 integers):
- _vim64_isltu(__v32u *a, __v32u *b, __v32u *c)
- Compares each pair of unsigned 32-bit elements of source vectors A[0:63] and
- B[0:63]. If the element of A is less than the element of B, the destination
- vector C[0:63] stores 1 in the same position; otherwise, it stores 0.
-
- 32-bit Set Less Than Unsigned (2048 integers):
- _vim2K_isltu(__v32u *a, __v32u *b, __v32u *c)
- Compares each pair of unsigned 32-bit elements of source vectors A[0:2047]
- and B[0:2047]. If the element of A is less than the element of B, the
- destination vector C[0:2047] stores 1 in the same position; otherwise, it
- stores 0.
-
- 32-bit Compare if Equal (64 integers):
- _vim64_icmqs(__v32s *a, __v32s *b, __v32s *c)
- Compares each pair of signed 32-bit elements of source vectors A[0:63] and
- B[0:63]. If they are equal, the destination vector C[0:63] stores 1 in the
- same position; otherwise, it stores 0.
-
- 32-bit Compare if Equal (2048 integers):
- _vim2K_icmqs(__v32s *a, __v32s *b, __v32s *c)
- Compares each pair of signed 32-bit elements of source vectors A[0:2047] and
- B[0:2047]. If they are equal, the destination vector C[0:2047] stores 1 in
- the same position; otherwise, it stores 0.
-
- 32-bit Compare if Equal Unsigned (64 integers):
- _vim64_icmqu(__v32u *a, __v32u *b, __v32u *c)
- Compares each pair of unsigned 32-bit elements of source vectors A[0:63] and
- B[0:63]. If they are equal, the destination vector C[0:63] stores 1 in the
- same position; otherwise, it stores 0.
-
- 32-bit Compare if Equal Unsigned (2048 integers):
- _vim2K_icmqu(__v32u *a, __v32u *b, __v32u *c)
- Compares each pair of unsigned 32-bit elements of source vectors A[0:2047]
- and B[0:2047]. If they are equal, the destination vector C[0:2047] stores 1
- in the same position; otherwise, it stores 0.
-
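- The vector comparisons produce a vector of 0/1 flags, one per element
- position. The hypothetical loops below sketch the 64-element signed
- variants.
-
```c
#include <stdint.h>

#define SK_N 64   /* element count of the 64-integer variants */

/* c[i] = 1 where a[i] < b[i], else 0. */
void sk_islts(const int32_t *a, const int32_t *b, int32_t *c)
{
    for (int i = 0; i < SK_N; ++i)
        c[i] = a[i] < b[i] ? 1 : 0;
}

/* c[i] = 1 where a[i] == b[i], else 0. */
void sk_icmqs(const int32_t *a, const int32_t *b, int32_t *c)
{
    for (int i = 0; i < SK_N; ++i)
        c[i] = a[i] == b[i] ? 1 : 0;
}
```
-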
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SHIFT:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Shift Left Logical (64 integers):
- _vim64_isllu(__v32u *a, __v32u *b, __v32u *c)
- Left-shifts each 32-bit element of source vector A[0:63] by the amount
- specified in source vector B[0:63] and stores the result in the destination
- vector C[0:63]. This operation does not preserve the sign.
-
- 32-bit Shift Left Logical (2048 integers):
- _vim2K_isllu(__v32u *a, __v32u *b, __v32u *c)
- Left-shifts each 32-bit element of source vector A[0:2047] by the amount
- specified in source vector B[0:2047] and stores the result in the
- destination vector C[0:2047]. This operation does not preserve the sign.
-
- 32-bit Shift Right Logical (64 integers):
- _vim64_isrlu(__v32u *a, __v32u *b, __v32u *c)
- Right-shifts each 32-bit element of source vector A[0:63] by the amount
- specified in source vector B[0:63] and stores the result in the destination
- vector C[0:63]. This operation does not preserve the sign.
-
- 32-bit Shift Right Logical (2048 integers):
- _vim2K_isrlu(__v32u *a, __v32u *b, __v32u *c)
- Right-shifts each 32-bit element of source vector A[0:2047] by the amount
- specified in source vector B[0:2047] and stores the result in the
- destination vector C[0:2047]. This operation does not preserve the sign.
-
- 32-bit Shift Right Arithmetic (64 integers):
- _vim64_isras(__v32s *a, __v32s *b, __v32s *c)
- Right-shifts each 32-bit element of source vector A[0:63] by the amount
- specified in source vector B[0:63] and stores the result in the destination
- vector C[0:63]. This operation preserves the sign.
-
- 32-bit Shift Right Arithmetic (2048 integers):
- _vim2K_isras(__v32s *a, __v32s *b, __v32s *c)
- Right-shifts each 32-bit element of source vector A[0:2047] by the amount
- specified in source vector B[0:2047] and stores the result in the
- destination vector C[0:2047]. This operation preserves the sign.
-
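- The sketch below contrasts the logical and arithmetic right shifts on
- 64-element vectors. It assumes that B holds one shift amount per element
- and masks each amount to 5 bits to stay within defined C behavior. Both are
- assumptions, and the sk_ names are hypothetical.
-
```c
#include <stdint.h>

#define SK_N 64   /* element count of the 64-integer variants */

/* Logical right shift: vacated high bits become 0. */
void sk_isrlu(const uint32_t *a, const uint32_t *b, uint32_t *c)
{
    for (int i = 0; i < SK_N; ++i)
        c[i] = a[i] >> (b[i] & 31u);
}

/* Arithmetic right shift: vacated high bits copy the sign bit.
 * Negative inputs are complemented first so the shift is well defined. */
void sk_isras(const int32_t *a, const int32_t *b, int32_t *c)
{
    for (int i = 0; i < SK_N; ++i) {
        uint32_t s = (uint32_t)b[i] & 31u;
        c[i] = a[i] < 0 ? ~(~a[i] >> s) : a[i] >> s;
    }
}
```
-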
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- MULTIPLICATION/DIVISION:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Divide Only (64 integers):
- _vim64_idivs(__v32s *a, __v32s *b, __v32s *c)
- Performs a signed division between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit Divide Only (2048 integers):
- _vim2K_idivs(__v32s *a, __v32s *b, __v32s *c)
- Performs a signed division between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Divide Only Unsigned (64 integers):
- _vim64_idivu(__v32u *a, __v32u *b, __v32u *c)
- Performs an unsigned division between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit Divide Only Unsigned (2048 integers):
- _vim2K_idivu(__v32u *a, __v32u *b, __v32u *c)
- Performs an unsigned division between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Modulo Only (64 integers):
- _vim64_imods(__v32s *a, __v32s *b, __v32s *c)
- Performs a signed modulo operation between the 32-bit elements of source
- vectors A[0:63] and B[0:63] and stores the result in the destination vector
- C[0:63].
-
- 32-bit Modulo Only (2048 integers):
- _vim2K_imods(__v32s *a, __v32s *b, __v32s *c)
- Performs a signed modulo operation between the 32-bit elements of source
- vectors A[0:2047] and B[0:2047] and stores the result in the destination
- vector C[0:2047].
-
- 32-bit Modulo Only Unsigned (64 integers):
- _vim64_imodu(__v32u *a, __v32u *b, __v32u *c)
- Performs an unsigned modulo operation between the 32-bit elements of source
- vectors A[0:63] and B[0:63] and stores the result in the destination vector
- C[0:63].
-
- 32-bit Modulo Only Unsigned (2048 integers):
- _vim2K_imodu(__v32u *a, __v32u *b, __v32u *c)
- Performs an unsigned modulo operation between the 32-bit elements of source
- vectors A[0:2047] and B[0:2047] and stores the result in the destination
- vector C[0:2047].
-
- 32-bit Multiply (64 integers):
- _vim64_imuls(__v32s *a, __v32s *b, __v32s *c)
- Performs signed multiplication between the 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].
-
- 32-bit Multiply (2048 integers):
- _vim2K_imuls(__v32s *a, __v32s *b, __v32s *c)
- Performs signed multiplication between the 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result in the destination vector
- C[0:2047].
-
- 32-bit Multiply Unsigned (64 integers):
- _vim64_imulu(__v32u *a, __v32u *b, __v32u *c)
- Performs unsigned multiplication between the 32-bit elements of source
- vectors A[0:63] and B[0:63] and stores the result in the destination vector
- C[0:63].
-
- 32-bit Multiply Unsigned (2048 integers):
- _vim2K_imulu(__v32u *a, __v32u *b, __v32u *c)
- Performs unsigned multiplication between the 32-bit elements of source
- vectors A[0:2047] and B[0:2047] and stores the result in the destination
- vector C[0:2047].
-
- 64-bit Multiply (32 integers):
- _vim32_imuls(__v64s *a, __v64s *b, __v64s *c)
- Performs signed multiplication between the 64-bit elements of source vectors
- A[0:31] and B[0:31] and stores the result in the destination vector C[0:31].
-
- 64-bit Multiply (1024 integers):
- _vim1K_imuls(__v64s *a, __v64s *b, __v64s *c)
- Performs signed multiplication between the 64-bit elements of source vectors
- A[0:1023] and B[0:1023] and stores the result in the destination vector
- C[0:1023].
-
- 64-bit Multiply Unsigned (32 integers):
- _vim32_imulu(__v64u *a, __v64u *b, __v64u *c)
- Performs unsigned multiplication between the 64-bit elements of source
- vectors A[0:31] and B[0:31] and stores the result in the destination vector
- C[0:31].
-
- 64-bit Multiply Unsigned (1024 integers):
- _vim1K_imulu(__v64u *a, __v64u *b, __v64u *c)
- Performs unsigned multiplication between the 64-bit elements of source
- vectors A[0:1023] and B[0:1023] and stores the result in the destination
- vector C[0:1023].
-
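- The 64-bit multiply variants take __v64s or __v64u pointers and cover 32 or
- 1024 elements. The hypothetical loop below sketches the 32-element unsigned
- case, assuming each element is a full 64-bit value and that the product
- wraps modulo 2^64.
-
```c
#include <stdint.h>

#define SK_N64 32   /* element count of the 32-element 64-bit variants */

/* c[i] = a[i] * b[i] on 64-bit unsigned elements (wraps on overflow). */
void sk_imulu64(const uint64_t *a, const uint64_t *b, uint64_t *c)
{
    for (int i = 0; i < SK_N64; ++i)
        c[i] = a[i] * b[i];
}
```
-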
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- IMMEDIATE:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Move Immediate Data (64 integers):
- _vim64_imovs(__v32s *a, __v32s b)
- Replicates a signed 32-bit immediate b into the vector A[0:63].
-
- 32-bit Move Immediate Data (2048 integers):
- _vim2K_imovs(__v32s *a, __v32s b)
- Replicates a signed 32-bit immediate b into the vector A[0:2047].
-
- 32-bit Move Immediate Data Unsigned (64 integers):
- _vim64_imovu(__v32u *a, __v32u b)
- Replicates an unsigned 32-bit immediate b into the vector A[0:63].
-
- 32-bit Move Immediate Data Unsigned (2048 integers):
- _vim2K_imovu(__v32u *a, __v32u b)
- Replicates an unsigned 32-bit immediate b into the vector A[0:2047].
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- kNN FLOAT INSTRUCTIONS:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Move Immediate Data (64 floats):
- _vim64_fmovs(__v32f *a, __v32f b)
- Replicates a signed 32-bit floating-point immediate b into the vector
- A[0:63].
-
- 32-bit Subtract (64 floats):
- _vim64_fsubs(__v32f *a, __v32f *b, __v32f *c)
- Performs signed subtraction between the 32-bit floating-point elements of
- source vectors A[0:63] and B[0:63] and stores the result in the destination
- vector C[0:63].
-
- 32-bit Multiply (64 floats):
- _vim64_fmuls(__v32f *a, __v32f *b, __v32f *c)
- Performs signed multiplication between the 32-bit floating-point elements of
- source vectors A[0:63] and B[0:63] and stores the result in the destination
- vector C[0:63].
-
- 32-bit Cumulative Sum (64 floats):
- _vim64_fcsum(__v32f *a, __v32f *b)
- Computes the cumulative sum of the 32-bit floating-point elements of source
- vector A[0:63] and stores it in the variable b.
-
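- Taken together, the four float instructions are enough to compute a squared
- Euclidean distance, which is presumably why they are grouped under kNN
- (k-nearest neighbors). The sketch below chains plain-C stand-ins for them.
- All sk_ names are hypothetical, and having the sum overwrite *b rather than
- add to it is an assumption.
-
```c
#define SK_N 64   /* element count of the 64-float variants */

/* Fills a[0..63] with the immediate b. */
void sk_fmovs(float *a, float b)
{
    for (int i = 0; i < SK_N; ++i)
        a[i] = b;
}

/* c[i] = a[i] - b[i]. */
void sk_fsubs(const float *a, const float *b, float *c)
{
    for (int i = 0; i < SK_N; ++i)
        c[i] = a[i] - b[i];
}

/* c[i] = a[i] * b[i]. */
void sk_fmuls(const float *a, const float *b, float *c)
{
    for (int i = 0; i < SK_N; ++i)
        c[i] = a[i] * b[i];
}

/* *b = a[0] + a[1] + ... + a[63]. */
void sk_fcsum(const float *a, float *b)
{
    float sum = 0.0f;
    for (int i = 0; i < SK_N; ++i)
        sum += a[i];
    *b = sum;
}

/* Squared Euclidean distance between two 64-element points. */
float sk_sq_dist(const float *p, const float *q)
{
    float diff[SK_N], sq[SK_N], dist;
    sk_fsubs(p, q, diff);     /* per-dimension difference  */
    sk_fmuls(diff, diff, sq); /* square each difference    */
    sk_fcsum(sq, &dist);      /* reduce to a single scalar */
    return dist;
}
```
-
- For two points whose coordinates are all 3.0 and all 1.0, every squared
- difference is 4.0, so the distance over 64 dimensions is 256.0.
-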
