Intrinsics are high-level functions implemented in C and based on several ISAs. Their main purpose is to simulate these architectures in SiNUCA (Simulator of Non-Uniform Cache Architectures).
ascordeiro/intrinsics
--------------------------------------------------------------------------------

Aline Santana Cordeiro
ascordeiro@inf.ufpr.br
LSE - Embedded Systems Laboratory
2018 - PPGInf - Federal University of Paraná

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Intrinsics Library
--------------------------------------------------------------------------------

The Intrinsics-HMC, Intrinsics-MIPS, and Intrinsics-VIMA libraries were
developed to simulate the execution of three ISAs: Hybrid Memory Cube (HMC),
as described in the HMC 2.1 specification (http://www.hybridmemorycube.org);
MIPS; and our Vector-in-Memory architecture, which is based on the
Processing-in-Memory architecture but allows the execution of vector
instructions. The main purpose is to write a C or C++ program using these
libraries and generate a simulation trace of the program's execution on one of
these architectures. This is done by giving the program's binary file as input
to the SiNUCA-tracer, which interprets the library instructions and generates
the traces in a simulation-specific format.

--------------------------------------------------------------------------------

DATA TYPES

HMC:
  __h16u1: 16-bit unsigned integer;
  __h64u1: 64-bit unsigned integer;
  __h64u2: Two 64-bit unsigned integers in a vector;
  __h128u1: 128-bit unsigned integer;

MIPS:
  __m32s1: 32-bit signed integer;
  __m32u1: 32-bit unsigned integer;
  __m64s1: 64-bit signed integer;
  __m64u1: 64-bit unsigned integer;

VIMA:
  __v32s: 32-bit signed integer;
  __v32u: 32-bit unsigned integer;
  __v64s: 64-bit signed integer;
  __v64u: 64-bit unsigned integer;
  __v32f: 32-bit float;

The data type names are inspired by the Intel Intrinsics naming pattern.
Data types are identified by two leading underscore characters.
The third character identifies the chosen architecture (h, m, or v). The
following digits (two or three characters) indicate the variable size in bits.
The next character indicates whether the number is signed or unsigned (for
VIMA, float is also an option). For HMC and MIPS, the last character indicates
the number of variables assigned to that type (because VIMA instructions are
vectorized, the vector length is part of the instruction name instead).

--------------------------------------------------------------------------------
-------------------------------- HMC FUNCTIONS ---------------------------------

The implementation is based on the HMC 2.1 specification and does not exactly
follow the real behavior described in the specification.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ARITHMETIC:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Dual 8-Byte Signed Add Immediate:
__h64u2 *_hmc64_saddimm_d(__h64u2 *mem_op, __h64u2 *imm_op)
Adds two immediate operands to two memory operands (8 bytes each). The
immediate operands must be 4 bytes wide, in two's complement, and left-padded
with zeros (4 bytes). The result is returned to the caller.

Single 16-Byte Signed Add Immediate:
__h128u1 _hmc128_saddimm_s(__h128u1 *mem_op, __h128u1 *imm_op)
Adds an immediate operand, in two's complement and left-padded with zeros
(16 bytes), to a memory operand (16 bytes). The result is returned to the
caller.

8-Byte Increment:
__h64u1 _hmc64_incr_s(__h64u1 *mem_op)
Increments a memory operand (8 bytes) by one. The result is returned to the
caller.
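
The two arithmetic operations above can be illustrated with a small sketch in
plain C. It is not the library's implementation: the names sketch_incr64,
sketch_saddimm_dual, sketch_sext32, and the h64u2_sketch type are hypothetical
stand-ins, and it assumes that the memory operand is updated in place and that
the low 4 bytes of each immediate are sign-extended before the addition.

```c
#include <stdint.h>

/* Hypothetical stand-in for __h64u2: two 64-bit unsigned lanes. */
typedef struct { uint64_t v[2]; } h64u2_sketch;

/* Assumption: the immediate keeps a 4-byte two's complement value in its
   low half and zeros in its high half; sign-extend it to 64 bits. */
static uint64_t sketch_sext32(uint64_t padded) {
    uint64_t low = padded & 0xFFFFFFFFull;
    return (low & 0x80000000ull) ? (low | 0xFFFFFFFF00000000ull) : low;
}

/* Sketch of an 8-byte increment: adds one in place, returns the result. */
static uint64_t sketch_incr64(uint64_t *mem_op) {
    *mem_op += 1;
    return *mem_op;
}

/* Sketch of a dual 8-byte signed add immediate: each lane of mem_op gets
   the matching sign-extended immediate added (wrapping modulo 2^64). */
static h64u2_sketch sketch_saddimm_dual(h64u2_sketch *mem_op,
                                        const h64u2_sketch *imm_op) {
    for (int i = 0; i < 2; i++) {
        mem_op->v[i] += sketch_sext32(imm_op->v[i]);
    }
    return *mem_op;
}
```

With these assumptions, an immediate lane of 0x00000000FFFFFFFF is treated as
-1, so adding it to a memory lane holding 5 leaves 4 in that lane.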

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LOGIC:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

8-Byte Bit Write:
__h64u1 _hmc64_bwrite_s(__h64u1 mem_op, __h64u1 imm_op, __h64u1 mask)
The mask field (8 bytes) selects which bits of the immediate operand (8 bytes)
are written to the same positions of the memory operand (8 bytes). The result
is returned to the caller.

16-Byte Swap:
__h128u1 _hmc128_bswap_s(__h128u1 mem_op, __h128u1 imm_op)
Stores the immediate operand value at the memory operand address (16 bytes).
Returns the original memory operand value (16 bytes).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
BOOLEAN:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

16-Byte AND:
__h128u1 _hmc128_and_s(__h128u1 mem_op, __h128u1 imm_op)
Stores the result of the AND operation between the memory operand (16 bytes)
and the immediate operand (16 bytes) at the memory operand address. Returns
the original memory operand value.

16-Byte NAND:
__h128u1 _hmc128_nand_s(__h128u1 mem_op, __h128u1 imm_op)
Stores the result of the NAND operation between the memory operand (16 bytes)
and the immediate operand (16 bytes) at the memory operand address. Returns
the original memory operand value.

16-Byte NOR:
__h128u1 _hmc128_nor_s(__h128u1 mem_op, __h128u1 imm_op)
Stores the result of the NOR operation between the memory operand (16 bytes)
and the immediate operand (16 bytes) at the memory operand address. Returns
the original memory operand value.

16-Byte OR:
__h128u1 _hmc128_or_s(__h128u1 mem_op, __h128u1 imm_op)
Stores the result of the OR operation between the memory operand (16 bytes)
and the immediate operand (16 bytes) at the memory operand address. Returns
the original memory operand value.

16-Byte XOR:
__h128u1 _hmc128_xor_s(__h128u1 mem_op, __h128u1 imm_op)
Stores the result of the XOR operation between the memory operand (16 bytes)
and the immediate operand (16 bytes) at the memory operand address. Returns
the original memory operand value.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
COMPARISON:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

8-Byte Compare and Swap if Greater Than:
__h128u1 _hmc64_cmpswapgt_s(__h64u1 mem_op, __h64u1 imm_op)
Stores at the memory operand address the greater of the memory operand
(8 bytes) and the immediate operand (8 bytes). Returns the original memory
operand value.

8-Byte Compare and Swap if Less Than:
__h128u1 _hmc64_cmpswaplt_s(__h64u1 mem_op, __h64u1 imm_op)
Stores at the memory operand address the smaller of the memory operand
(8 bytes) and the immediate operand (8 bytes). Returns the original memory
operand value.

16-Byte Compare and Swap if Greater Than:
__h128u1 _hmc128_cmpswapgt_s(__h128u1 mem_op, __h128u1 imm_op)
Stores at the memory operand address the greater of the memory operand
(16 bytes) and the immediate operand (16 bytes). Returns the original memory
operand value.

16-Byte Compare and Swap if Less Than:
__h128u1 _hmc128_cmpswaplt_s(__h128u1 mem_op, __h128u1 imm_op)
Stores at the memory operand address the smaller of the memory operand
(16 bytes) and the immediate operand (16 bytes). Returns the original memory
operand value.

8-Byte Compare and Swap if Equal:
__h64u1 _hmc64_cmpswapeq_s(__h64u1 mem_op, __h64u1 imm_op, __h64u1 cmp_field)
Compares cmp_field (8 bytes) with the memory operand value (8 bytes). If they
are equal, stores the immediate operand value (8 bytes) at the memory operand
address. Returns the original memory operand value.

16-Byte Compare and Swap if Zero:
__h128u1 _hmc128_cmpswapz_s(__h128u1 mem_op, __h128u1 imm_op)
Compares the memory operand value (16 bytes) with zero. If they are equal,
stores the immediate operand value (16 bytes) at the memory operand address.
Returns the original memory operand value.

8-Byte Equal To:
__h16u1 _hmc64_cmpswapgt_s(__h64u1 mem_op, __h64u1 imm_op)
Checks whether the memory operand value (8 bytes) is equal to the immediate
operand value (8 bytes). Returns 1 if equal, 0 if not.

16-Byte Equal To:
__h16u1 _hmc128_cmpswapgt_s(__h128u1 mem_op, __h128u1 imm_op)
Checks whether the memory operand value (16 bytes) is equal to the immediate
operand value (16 bytes). Returns 1 if equal, 0 if not.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
CUSTOMIZED:
These instructions were created to better fit common programming problems.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

8-Byte Compare if Greater Than or Equal:
__h16u1 _hmc64_cmpgteq_s(__h64u1 *mem_op, __h64u1 imm_op)
Checks whether the memory operand value is less than or equal to the immediate
operand value. If true, returns 0 to the caller; otherwise, returns 1.

8-Byte Compare if Less Than or Equal:
__h16u1 _hmc64_cmplteq_s(__h64u1 *mem_op, __h64u1 imm_op)
Checks whether the memory operand value is greater than or equal to the
immediate operand value. If true, returns 0 to the caller; otherwise,
returns 1.

8-Byte Compare if Less Than:
__h16u1 _hmc64_cmplt_s(__h64u1 *mem_op, __h64u1 imm_op)
Checks whether the memory operand value is greater than the immediate operand
value. If true, returns 0 to the caller; otherwise, returns 1.
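
The read-modify-write behavior shared by the HMC logic and comparison
operations (update the memory operand, hand a value back to the caller) can be
sketched in plain C as follows. The function names are hypothetical, the
sketch takes the memory operand by pointer so that the update is visible to
the caller, and it compares values as unsigned because the documented types
are unsigned. All three choices are assumptions, not the library's code.

```c
#include <stdint.h>

/* Sketch of compare and swap if greater: keeps the larger value in memory
   and returns what was there before. */
static uint64_t sketch_cmpswapgt64(uint64_t *mem_op, uint64_t imm_op) {
    uint64_t old = *mem_op;
    if (imm_op > old) {
        *mem_op = imm_op;
    }
    return old;
}

/* Sketch of compare and swap if equal: writes imm_op only when the memory
   value matches cmp_field, and returns the original memory value. */
static uint64_t sketch_cmpswapeq64(uint64_t *mem_op, uint64_t imm_op,
                                   uint64_t cmp_field) {
    uint64_t old = *mem_op;
    if (old == cmp_field) {
        *mem_op = imm_op;
    }
    return old;
}

/* Sketch of bit write: bits set in mask are taken from imm_op, all other
   bits keep their memory value; the merged result is returned. */
static uint64_t sketch_bwrite64(uint64_t *mem_op, uint64_t imm_op,
                                uint64_t mask) {
    *mem_op = (*mem_op & ~mask) | (imm_op & mask);
    return *mem_op;
}
```

Returning the original value from the compare-and-swap sketches lets the
caller tell whether the swap happened by comparing it with the operand that
was passed in.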
--------------------------------------------------------------------------------
-------------------------------- MIPS FUNCTIONS --------------------------------

The implementation is based on the MIPS ISA and does not exactly follow the
real behavior of the architecture. All instructions were implemented except
memory and floating-point operations.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ARITHMETIC:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Add:
__m32s1 _mips_add(__m32s1 rs, __m32s1 rt)
Adds the rs and rt registers and returns the result to the caller.

Add Unsigned:
__m32u1 _mips_addu(__m32u1 rs, __m32u1 rt)
Adds the rs and rt registers and returns the result to the caller.

Subtract:
__m32s1 _mips_sub(__m32s1 rs, __m32s1 rt)
Subtracts the rs and rt registers and returns the result to the caller.

Subtract Unsigned:
__m32u1 _mips_subu(__m32u1 rs, __m32u1 rt)
Subtracts the rs and rt registers and returns the result to the caller.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMMEDIATE ARITHMETIC:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Add Immediate:
__m32s1 _mips_addi(__m32s1 rs, __m32s1 imm_op)
Adds the rs register and the immediate operand and returns the result to the
caller.

Add Immediate Unsigned:
__m32u1 _mips_addiu(__m32u1 rs, __m32u1 imm_op)
Adds the rs register and the immediate operand and returns the result to the
caller.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LOGIC:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

And:
__m32u1 _mips_and(__m32u1 rs, __m32u1 rt)
Applies the AND operation between the rs and rt registers and returns the
result to the caller.

Nor:
__m32u1 _mips_nor(__m32u1 rs, __m32u1 rt)
Applies the NOR operation between the rs and rt registers and returns the
result to the caller.

Or:
__m32u1 _mips_or(__m32u1 rs, __m32u1 rt)
Applies the OR operation between the rs and rt registers and returns the
result to the caller.

Xor:
__m32u1 _mips_xor(__m32u1 rs, __m32u1 rt)
Applies the XOR operation between the rs and rt registers and returns the
result to the caller.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMMEDIATE LOGIC:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

And Immediate:
__m32u1 _mips_andi(__m32u1 rs, __m32u1 imm_op)
Applies the AND operation between the rs register and the immediate operand
and returns the result to the caller.

Or Immediate:
__m32u1 _mips_ori(__m32u1 rs, __m32u1 imm_op)
Applies the OR operation between the rs register and the immediate operand
and returns the result to the caller.

Xor Immediate:
__m32u1 _mips_xori(__m32u1 rs, __m32u1 imm_op)
Applies the XOR operation between the rs register and the immediate operand
and returns the result to the caller.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
COMPARISON:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Set Less Than:
__m32s1 _mips_slt(__m32s1 rs, __m32s1 rt)
Compares the rs and rt registers. Returns 1 if rs is less than rt,
0 otherwise.

Set Less Than Unsigned:
__m32u1 _mips_sltu(__m32u1 rs, __m32u1 rt)
Compares the rs and rt registers. Returns 1 if rs is less than rt,
0 otherwise.
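
The difference between the signed and unsigned comparison variants, and the
meaning of NOR, can be seen in a short plain-C sketch. The function names are
hypothetical, and the sketch uses int32_t and uint32_t in place of __m32s1 and
__m32u1 on the assumption that those types are plain 32-bit integers.

```c
#include <stdint.h>

/* Sketch of NOR: the bitwise complement of (rs OR rt). */
static uint32_t sketch_nor(uint32_t rs, uint32_t rt) {
    return ~(rs | rt);
}

/* Sketch of set less than: 1 if rs < rt as signed values, 0 otherwise. */
static int32_t sketch_slt(int32_t rs, int32_t rt) {
    return (rs < rt) ? 1 : 0;
}

/* Sketch of set less than unsigned: the same test on unsigned values, so
   the bit pattern 0xFFFFFFFF counts as the largest value instead of -1. */
static uint32_t sketch_sltu(uint32_t rs, uint32_t rt) {
    return (rs < rt) ? 1u : 0u;
}
```

The same bit pattern can therefore give opposite answers: -1 is less than 1
in the signed sketch, while 0xFFFFFFFF is not less than 1 in the unsigned one.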

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMMEDIATE COMPARISON:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Set Less Than Immediate:
__m32s1 _mips_slti(__m32s1 rs, __m32s1 imm_op)
Compares the rs register with the immediate operand. Returns 1 if rs is less
than the immediate operand, 0 otherwise.

Set Less Than Immediate Unsigned:
__m32u1 _mips_sltiu(__m32u1 rs, __m32u1 imm_op)
Compares the rs register with the immediate operand. Returns 1 if rs is less
than the immediate operand, 0 otherwise.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SHIFT:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Shift Left Logical:
__m32u1 _mips_sll(__m32u1 rt, __m32u1 shamt)
Shifts the rt register left by shamt bits. Returns the result to the caller.

Shift Right Logical:
__m32u1 _mips_srl(__m32u1 rt, __m32u1 shamt)
Shifts the rt register right by shamt bits. Returns the result to the caller.

Shift Right Arithmetic:
__m32s1 _mips_sra(__m32s1 rt, __m32s1 shamt)
Shifts the rt register right by shamt bits, preserving the sign of rt. Returns
the result to the caller.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
MULTIPLICATION/DIVISION:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Divide Only:
__m32s1 _mips_div(__m32s1 rs, __m32s1 rt)
Returns to the caller the result of the division between the rs and rt
registers.

Divide Only Unsigned:
__m32u1 _mips_divu(__m32u1 rs, __m32u1 rt)
Returns to the caller the result of the division between the rs and rt
registers.

Modulo Only:
__m32s1 _mips_mod(__m32s1 rs, __m32s1 rt)
Returns to the caller the result of the modulo operation between the rs and rt
registers.

Modulo Only Unsigned:
__m32u1 _mips_modu(__m32u1 rs, __m32u1 rt)
Returns to the caller the result of the modulo operation between the rs and rt
registers.

Multiply 32-bits:
__m32s1 _mips_mult32(__m32s1 rs, __m32s1 rt)
Multiplies the rs and rt registers and returns the result to the caller.

Multiply 32-bits Unsigned:
__m32u1 _mips_multu32(__m32u1 rs, __m32u1 rt)
Multiplies the rs and rt registers and returns the result to the caller.

Multiply 64-bits:
__m64s1 _mips_mult64(__m32s1 rs, __m32s1 rt)
Multiplies the rs and rt registers and returns the result to the caller.

Multiply 64-bits Unsigned:
__m64u1 _mips_multu64(__m32u1 rs, __m32u1 rt)
Multiplies the rs and rt registers and returns the result to the caller.

--------------------------------------------------------------------------------
-------------------------------- VIMA FUNCTIONS -------------------------------

The implementation is based on the MIPS and ARM NEON specifications, and the
model is inspired by the HIVE module for HMC, which vectorizes data transfer
inside the HMC. It does not exactly follow the real behavior described in
these specifications. Because VIMA implements vectorized instructions, the
vector sizes are specified below.

VM64I: 256-byte array for integer types;
VM2KI: 8-Kbyte array for integer types;
VM32L: 256-byte array for long integer types;
VM1KL: 8-Kbyte array for long integer types;

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ARITHMETIC:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

32-bit Add (64 integers):
_vim64_iadds(__v32s *a, __v32s *b, __v32s *c)
Performs signed addition between the 32-bit elements of source vectors A[0:63]
and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Add (2048 integers):
_vim2K_iadds(__v32s *a, __v32s *b, __v32s *c)
Performs signed addition between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Add Unsigned (64 integers):
_vim64_iaddu(__v32u *a, __v32u *b, __v32u *c)
Performs unsigned addition between the 32-bit elements of source vectors
A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Add Unsigned (2048 integers):
_vim2K_iaddu(__v32u *a, __v32u *b, __v32u *c)
Performs unsigned addition between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Subtract (64 integers):
_vim64_isubs(__v32s *a, __v32s *b, __v32s *c)
Performs signed subtraction between the 32-bit elements of source vectors
A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Subtract (2048 integers):
_vim2K_isubs(__v32s *a, __v32s *b, __v32s *c)
Performs signed subtraction between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Subtract Unsigned (64 integers):
_vim64_isubu(__v32u *a, __v32u *b, __v32u *c)
Performs unsigned subtraction between the 32-bit elements of source vectors
A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Subtract Unsigned (2048 integers):
_vim2K_isubu(__v32u *a, __v32u *b, __v32u *c)
Performs unsigned subtraction between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Abs (64 integers):
_vim64_iabss(__v32s *a, __v32s *b)
Takes the absolute value of each 32-bit element of source vector A[0:63]
and stores it in the destination vector B[0:63].

32-bit Abs (2048 integers):
_vim2K_iabss(__v32s *a, __v32s *b)
Takes the absolute value of each 32-bit element of source vector A[0:2047]
and stores it in the destination vector B[0:2047].

32-bit Max (64 integers):
_vim64_imaxs(__v32s *a, __v32s *b, __v32s *c)
Finds the maximum of each pair of 32-bit elements of source vectors A[0:63]
and B[0:63] and stores it in the destination vector C[0:63].

32-bit Max (2048 integers):
_vim2K_imaxs(__v32s *a, __v32s *b, __v32s *c)
Finds the maximum of each pair of 32-bit elements of source vectors A[0:2047]
and B[0:2047] and stores it in the destination vector C[0:2047].

32-bit Min (64 integers):
_vim64_imins(__v32s *a, __v32s *b, __v32s *c)
Finds the minimum of each pair of 32-bit elements of source vectors A[0:63]
and B[0:63] and stores it in the destination vector C[0:63].

32-bit Min (2048 integers):
_vim2K_imins(__v32s *a, __v32s *b, __v32s *c)
Finds the minimum of each pair of 32-bit elements of source vectors A[0:2047]
and B[0:2047] and stores it in the destination vector C[0:2047].

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LOGIC:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

32-bit And (64 integers):
_vim64_iandu(__v32u *a, __v32u *b, __v32u *c)
Performs the AND operation between the 32-bit elements of source vectors
A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].

32-bit And (2048 integers):
_vim2K_iandu(__v32u *a, __v32u *b, __v32u *c)
Performs the AND operation between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Or (64 integers):
_vim64_iorun(__v32u *a, __v32u *b, __v32u *c)
Performs the OR operation between the 32-bit elements of source vectors
A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Or (2048 integers):
_vim2K_iorun(__v32u *a, __v32u *b, __v32u *c)
Performs the OR operation between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Xor (64 integers):
_vim64_ixoru(__v32u *a, __v32u *b, __v32u *c)
Performs the XOR operation between the 32-bit elements of source vectors
A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Xor (2048 integers):
_vim2K_ixoru(__v32u *a, __v32u *b, __v32u *c)
Performs the XOR operation between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Not (64 integers):
_vim64_inots(__v32s *a, __v32s *b)
Performs the NOT operation on the 32-bit elements of source vector A[0:63]
and stores the result in the destination vector B[0:63].

32-bit Not (2048 integers):
_vim2K_inots(__v32s *a, __v32s *b)
Performs the NOT operation on the 32-bit elements of source vector A[0:2047]
and stores the result in the destination vector B[0:2047].

32-bit Mask (64 integers):
_vim64_imsks(__v32s *a, __v32s *b, __v32s *c)
Inserts each signed 32-bit element of source vector A[0:63] into the
destination vector C[0:63] if the corresponding 32-bit element of source
vector B[0:63] is 0; otherwise, it leaves that destination element unchanged.

32-bit Mask (2048 integers):
_vim2K_imsks(__v32s *a, __v32s *b, __v32s *c)
Inserts each signed 32-bit element of source vector A[0:2047] into the
destination vector C[0:2047] if the corresponding 32-bit element of source
vector B[0:2047] is 0; otherwise, it leaves that destination element
unchanged.

32-bit Masku (64 integers):
_vim64_imsku(__v32u *a, __v32u *b, __v32u *c)
Inserts each unsigned 32-bit element of source vector A[0:63] into the
destination vector C[0:63] if the corresponding 32-bit element of source
vector B[0:63] is 0; otherwise, it leaves that destination element unchanged.

32-bit Masku (2048 integers):
_vim2K_imsku(__v32u *a, __v32u *b, __v32u *c)
Inserts each unsigned 32-bit element of source vector A[0:2047] into the
destination vector C[0:2047] if the corresponding 32-bit element of source
vector B[0:2047] is 0; otherwise, it leaves that destination element
unchanged.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
COMPARISON:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

32-bit Set Less Than (64 integers):
_vim64_islts(__v32s *a, __v32s *b, __v32s *c)
Compares the signed 32-bit elements of source vectors A[0:63] and B[0:63]
position by position. If the element of A[0:63] is smaller, the destination
vector C[0:63] stores 1 in the same position; otherwise, it stores 0.

32-bit Set Less Than (2048 integers):
_vim2K_islts(__v32s *a, __v32s *b, __v32s *c)
Compares the signed 32-bit elements of source vectors A[0:2047] and B[0:2047]
position by position. If the element of A[0:2047] is smaller, the destination
vector C[0:2047] stores 1 in the same position; otherwise, it stores 0.

32-bit Set Less Than Unsigned (64 integers):
_vim64_isltu(__v32u *a, __v32u *b, __v32u *c)
Compares the unsigned 32-bit elements of source vectors A[0:63] and B[0:63]
position by position. If the element of A[0:63] is smaller, the destination
vector C[0:63] stores 1 in the same position; otherwise, it stores 0.

32-bit Set Less Than Unsigned (2048 integers):
_vim2K_isltu(__v32u *a, __v32u *b, __v32u *c)
Compares the unsigned 32-bit elements of source vectors A[0:2047] and
B[0:2047] position by position. If the element of A[0:2047] is smaller, the
destination vector C[0:2047] stores 1 in the same position; otherwise, it
stores 0.

32-bit Compare if Equal (64 integers):
_vim64_icmqs(__v32s *a, __v32s *b, __v32s *c)
Compares the signed 32-bit elements of source vectors A[0:63] and B[0:63]
position by position. If they are equal, the destination vector C[0:63]
stores 1 in the same position; otherwise, it stores 0.

32-bit Compare if Equal (2048 integers):
_vim2K_icmqs(__v32s *a, __v32s *b, __v32s *c)
Compares the signed 32-bit elements of source vectors A[0:2047] and B[0:2047]
position by position. If they are equal, the destination vector C[0:2047]
stores 1 in the same position; otherwise, it stores 0.

32-bit Compare if Equal Unsigned (64 integers):
_vim64_icmqu(__v32u *a, __v32u *b, __v32u *c)
Compares the unsigned 32-bit elements of source vectors A[0:63] and B[0:63]
position by position. If they are equal, the destination vector C[0:63]
stores 1 in the same position; otherwise, it stores 0.

32-bit Compare if Equal Unsigned (2048 integers):
_vim2K_icmqu(__v32u *a, __v32u *b, __v32u *c)
Compares the unsigned 32-bit elements of source vectors A[0:2047] and
B[0:2047] position by position. If they are equal, the destination vector
C[0:2047] stores 1 in the same position; otherwise, it stores 0.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SHIFT:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

32-bit Shift Left Logical (64 integers):
_vim64_isllu(__v32u *a, __v32u *b, __v32u *c)
Left-shifts each 32-bit element of source vector A[0:63] by the amount
specified in source vector B[0:63] and stores the result in the destination
vector C[0:63]. This is a logical shift: the sign is not preserved.

32-bit Shift Left Logical (2048 integers):
_vim2K_isllu(__v32u *a, __v32u *b, __v32u *c)
Left-shifts each 32-bit element of source vector A[0:2047] by the amount
specified in source vector B[0:2047] and stores the result in the destination
vector C[0:2047]. This is a logical shift: the sign is not preserved.

32-bit Shift Right Logical (64 integers):
_vim64_isrlu(__v32u *a, __v32u *b, __v32u *c)
Right-shifts each 32-bit element of source vector A[0:63] by the amount
specified in source vector B[0:63] and stores the result in the destination
vector C[0:63]. This is a logical shift: the sign is not preserved.

32-bit Shift Right Logical (2048 integers):
_vim2K_isrlu(__v32u *a, __v32u *b, __v32u *c)
Right-shifts each 32-bit element of source vector A[0:2047] by the amount
specified in source vector B[0:2047] and stores the result in the destination
vector C[0:2047]. This is a logical shift: the sign is not preserved.

32-bit Shift Right Arithmetic (64 integers):
_vim64_isras(__v32s *a, __v32s *b, __v32s *c)
Right-shifts each 32-bit element of source vector A[0:63] by the amount
specified in source vector B[0:63] and stores the result in the destination
vector C[0:63]. This is an arithmetic shift: the sign is preserved.

32-bit Shift Right Arithmetic (2048 integers):
_vim2K_isras(__v32s *a, __v32s *b, __v32s *c)
Right-shifts each 32-bit element of source vector A[0:2047] by the amount
specified in source vector B[0:2047] and stores the result in the destination
vector C[0:2047]. This is an arithmetic shift: the sign is preserved.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
MULTIPLICATION/DIVISION:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

32-bit Divide Only (64 integers):
_vim64_idivs(__v32s *a, __v32s *b, __v32s *c)
Performs signed division between the 32-bit elements of source vectors A[0:63]
and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Divide Only (2048 integers):
_vim2K_idivs(__v32s *a, __v32s *b, __v32s *c)
Performs signed division between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Divide Only Unsigned (64 integers):
_vim64_idivu(__v32u *a, __v32u *b, __v32u *c)
Performs unsigned division between the 32-bit elements of source vectors
A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Divide Only Unsigned (2048 integers):
_vim2K_idivu(__v32u *a, __v32u *b, __v32u *c)
Performs unsigned division between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Modulo Only (64 integers):
_vim64_imods(__v32s *a, __v32s *b, __v32s *c)
Performs a signed modulo operation between the 32-bit elements of source
vectors A[0:63] and B[0:63] and stores the result in the destination vector
C[0:63].

32-bit Modulo Only (2048 integers):
_vim2K_imods(__v32s *a, __v32s *b, __v32s *c)
Performs a signed modulo operation between the 32-bit elements of source
vectors A[0:2047] and B[0:2047] and stores the result in the destination
vector C[0:2047].

32-bit Modulo Only Unsigned (64 integers):
_vim64_imodu(__v32u *a, __v32u *b, __v32u *c)
Performs an unsigned modulo operation between the 32-bit elements of source
vectors A[0:63] and B[0:63] and stores the result in the destination vector
C[0:63].

32-bit Modulo Only Unsigned (2048 integers):
_vim2K_imodu(__v32u *a, __v32u *b, __v32u *c)
Performs an unsigned modulo operation between the 32-bit elements of source
vectors A[0:2047] and B[0:2047] and stores the result in the destination
vector C[0:2047].

32-bit Multiply (64 integers):
_vim64_imuls(__v32s *a, __v32s *b, __v32s *c)
Performs signed multiplication between the 32-bit elements of source vectors
A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Multiply (2048 integers):
_vim2K_imuls(__v32s *a, __v32s *b, __v32s *c)
Performs signed multiplication between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

32-bit Multiply Unsigned (64 integers):
_vim64_imulu(__v32u *a, __v32u *b, __v32u *c)
Performs unsigned multiplication between the 32-bit elements of source vectors
A[0:63] and B[0:63] and stores the result in the destination vector C[0:63].

32-bit Multiply Unsigned (2048 integers):
_vim2K_imulu(__v32u *a, __v32u *b, __v32u *c)
Performs unsigned multiplication between the 32-bit elements of source vectors
A[0:2047] and B[0:2047] and stores the result in the destination vector
C[0:2047].

64-bit Multiply (32 integers):
_vim32_imuls(__v64s *a, __v64s *b, __v64s *c)
Performs signed multiplication between the 64-bit elements of source vectors
A[0:31] and B[0:31] and stores the result in the destination vector C[0:31].

64-bit Multiply (1024 integers):
_vim1K_imuls(__v64s *a, __v64s *b, __v64s *c)
Performs signed multiplication between the 64-bit elements of source vectors
A[0:1023] and B[0:1023] and stores the result in the destination vector
C[0:1023].

64-bit Multiply Unsigned (32 integers):
_vim32_imulu(__v64u *a, __v64u *b, __v64u *c)
Performs unsigned multiplication between the 64-bit elements of source vectors
A[0:31] and B[0:31] and stores the result in the destination vector C[0:31].

64-bit Multiply Unsigned (1024 integers):
_vim1K_imulu(__v64u *a, __v64u *b, __v64u *c)
Performs unsigned multiplication between the 64-bit elements of source vectors
A[0:1023] and B[0:1023] and stores the result in the destination vector
C[0:1023].

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMMEDIATE:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

32-bit Move Immediate Data (64 integers):
_vim64_imovs(__v32s *a, __v32s b)
Replicates a signed 32-bit immediate b into the vector A[0:63].

32-bit Move Immediate Data (2048 integers):
_vim2K_imovs(__v32s *a, __v32s b)
Replicates a signed 32-bit immediate b into the vector A[0:2047].

32-bit Move Immediate Data Unsigned (64 integers):
_vim64_imovu(__v32u *a, __v32u b)
Replicates an unsigned 32-bit immediate b into the vector A[0:63].

32-bit Move Immediate Data Unsigned (2048 integers):
_vim2K_imovu(__v32u *a, __v32u b)
Replicates an unsigned 32-bit immediate b into the vector A[0:2047].

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
kNN FLOAT INSTRUCTIONS:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

32-bit Move Immediate Data (64 floats):
_vim64_fmovs(__v32f *a, __v32f b)
Replicates a 32-bit floating-point immediate b into the vector A[0:63].

32-bit Subtract (64 floats):
_vim64_fsubs(__v32f *a, __v32f *b, __v32f *c)
Performs subtraction between the 32-bit floating-point elements of source
vectors A[0:63] and B[0:63] and stores the result in the destination vector
C[0:63].

32-bit Multiply (64 floats):
_vim64_fmuls(__v32f *a, __v32f *b, __v32f *c)
Performs multiplication between the 32-bit floating-point elements of source
vectors A[0:63] and B[0:63] and stores the result in the destination vector
C[0:63].

32-bit Cumulative Sum (64 floats):
_vim64_fcsum(__v32f *a, __v32f *b)
Performs the cumulative sum of the 32-bit floating-point elements of source
vector A[0:63] into variable b.
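
The float subtract, multiply, and cumulative sum operations above fit together
naturally as a squared Euclidean distance, the core step of a kNN search. The
sketch below emulates that chain with scalar loops in plain C. All names in it
are hypothetical, it uses float in place of __v32f, and it assumes that the
cumulative sum overwrites its output variable rather than adding to it.

```c
#define SKETCH_VSIZE 64  /* elements per vector in the 64-float variants */

/* Emulates an element-wise float subtract: c[i] = a[i] - b[i]. */
static void sketch_fsubs(const float *a, const float *b, float *c) {
    for (int i = 0; i < SKETCH_VSIZE; i++) {
        c[i] = a[i] - b[i];
    }
}

/* Emulates an element-wise float multiply: c[i] = a[i] * b[i]. */
static void sketch_fmuls(const float *a, const float *b, float *c) {
    for (int i = 0; i < SKETCH_VSIZE; i++) {
        c[i] = a[i] * b[i];
    }
}

/* Emulates the cumulative sum: reduces the vector into *b (assumption:
   *b is overwritten). */
static void sketch_fcsum(const float *a, float *b) {
    float sum = 0.0f;
    for (int i = 0; i < SKETCH_VSIZE; i++) {
        sum += a[i];
    }
    *b = sum;
}

/* Squared Euclidean distance between two 64-float vectors, built from the
   three emulated operations. */
static float sketch_sqdist64(const float *x, const float *y) {
    float diff[SKETCH_VSIZE];
    float sq[SKETCH_VSIZE];
    float total;
    sketch_fsubs(x, y, diff);
    sketch_fmuls(diff, diff, sq);
    sketch_fcsum(sq, &total);
    return total;
}
```

For feature vectors longer than 64 floats, a caller would repeat this per
64-element chunk and add the partial sums together.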