GitHub - ascordeiro/intrinsics: Intrinsics are high-level functions implemented in C and based on several ISAs. Their main purpose is to simulate these architectures in SiNUCA (Simulator of Non-Uniform Caches).

--------------------------------------------------------------------------------
-
- Aline Santana Cordeiro - ascordeiro@inf.ufpr.br
- LSE - Embedded Systems Laboratory - 2018
- PPGInf - Federal University of Paraná
-
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-                               Intrinsics Library                             -
--------------------------------------------------------------------------------
-
-   The Intrinsics-HMC, Intrinsics-MIPS, and Intrinsics-VIMA libraries were
- developed to simulate the execution of three ISAs: Hybrid Memory Cube (HMC),
- as described in the HMC 2.1 specification (http://www.hybridmemorycube.org),
- MIPS, and our Vector-in-Memory architecture (VIMA), which is based on the
- Processing-in-Memory architecture but also allows the execution of vector
- instructions. The main purpose is to write a C or C++ program with these
- libraries and generate a trace that simulates the program's execution on one
- of these architectures. To do so, the program's binary is given as input to
- the SiNUCA tracer, which interprets the library instructions and generates
- the traces in a simulator-specific format.
-
--------------------------------------------------------------------------------
-
- DATA TYPES -
-
- HMC:
- __h16u1:  16-bit unsigned integer;
- __h64u1:  64-bit unsigned integer;
- __h64u2:  Two 64-bit unsigned integers in a vector;
- __h128u1: 128-bit unsigned integer;
-
- MIPS:
- __m32s1: 32-bit signed integer;
- __m32u1: 32-bit unsigned integer;
- __m64s1: 64-bit signed integer;
- __m64u1: 64-bit unsigned integer;
-
- VIMA:
- __v32s: 32-bit signed integer;
- __v32u: 32-bit unsigned integer;
- __v64s: 64-bit signed integer;
- __v64u: 64-bit unsigned integer;
- __v32f: 32-bit signed float;
-
-   The data type names are inspired by the Intel Intrinsics naming pattern.
- Every name starts with two underscore characters. The third character
- identifies the chosen architecture (h, m, or v). The following digits give
- the variable size in bits. The next character indicates whether the value is
- signed or unsigned (for VIMA, float is also an option). For HMC and MIPS, the
- last digit gives the number of variables held by that type (as VIMA
- instructions are vectorized, the vector length is part of the instruction
- name instead).
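-
- As an illustration of the naming rule above, the following sketch decodes a
- type name into its parts. The helper decode_type_name and the struct
- type_info are hypothetical and are not part of the libraries; they only
- restate the rule in code.
-

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the library): decodes a type name such as
 * "__h64u2" according to the naming rule described in this README. */
struct type_info {
    char arch;   /* 'h' = HMC, 'm' = MIPS, 'v' = VIMA */
    int  bits;   /* variable size in bits */
    char kind;   /* 's' signed, 'u' unsigned, 'f' float */
    int  count;  /* number of variables; 0 when the name carries no count */
};

/* Returns 0 on success, -1 if the name does not follow the pattern. */
int decode_type_name(const char *name, struct type_info *out)
{
    char *end;
    long bits;

    if (strncmp(name, "__", 2) != 0 || name[2] == '\0')
        return -1;
    out->arch = name[2];

    /* The digits after the architecture letter give the size in bits. */
    bits = strtol(name + 3, &end, 10);
    if (end == name + 3 || bits <= 0)
        return -1;
    out->bits = (int)bits;

    if (*end != 's' && *end != 'u' && *end != 'f')
        return -1;
    out->kind = *end++;

    /* HMC and MIPS names end with a count; VIMA names do not. */
    out->count = (*end != '\0') ? atoi(end) : 0;
    return 0;
}
```

-
- For example, "__h64u2" decodes to architecture h, 64 bits, unsigned, two
- variables, while "__v32f" decodes to architecture v, 32 bits, float, with no
- count.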
-
--------------------------------------------------------------------------------
-
-------------------------------- HMC FUNCTIONS ---------------------------------
-
- The implementation is based on the HMC 2.1 specification but does not follow
- exactly the real behavior described in the specification.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- ARITHMETIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Dual 8-Byte Signed Add Immediate:
- __h64u2 *_hmc64_saddimm_d(__h64u2 *mem_op, __h64u2 *imm_op)
- Adds two memory operands to two immediate operands (8 bytes each). Each
- operand must hold a 4-byte two's-complement value, left-padded with 4 bytes
- of zeros. The result is returned to the caller.
-
- Single 16-Byte Signed Add Immediate:
- __h128u1 _hmc128_saddimm_s(__h128u1 *mem_op, __h128u1 *imm_op)
- Adds a memory operand (16-byte) to an immediate operand in two's complement,
- left-padded with zeros to 16 bytes. The result is returned to the caller.
-
- 8-Byte Increment:
- __h64u1 _hmc64_incr_s(__h64u1 *mem_op)
- Increments a memory operand (8-byte) by one. The result is returned to the
- caller.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- LOGIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 8-Byte Bit Write:
- __h64u1 _hmc64_bwrite_s(__h64u1 mem_op, __h64u1 imm_op, __h64u1 mask)
- The mask field (8-byte) selects which bits of the immediate operand (8-byte)
- are written to the same positions of the memory operand (8-byte). The result
- is returned to the caller.
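-
- The masked write described for the 8-byte bit write can be sketched as
- follows. This is only an illustration: uint64_t stands in for __h64u1 and the
- name sk_bwrite64 is hypothetical, not the library's implementation.
-

```c
#include <stdint.h>

/* Sketch only: bits of imm_op are taken where mask is 1, and bits of mem_op
 * are kept where mask is 0. */
uint64_t sk_bwrite64(uint64_t mem_op, uint64_t imm_op, uint64_t mask)
{
    return (mem_op & ~mask) | (imm_op & mask);
}
```

-
- With a mask of 0x00000000FFFFFFFF, only the lower 32 bits of the memory
- operand are replaced by the corresponding bits of the immediate operand.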
-
- 16-Byte Swap:
- __h128u1 _hmc128_bswap_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores the immediate operand value into the memory operand address (16-Byte).
- Returns the original memory operand value (16-Byte).
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- BOOLEAN:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 16-Byte AND:
- __h128u1 _hmc128_and_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores the AND operation's result between the memory operand (16-byte) and
- immediate operand (16-byte) into the memory operand address. Returns the
- original memory operand value.
-
- 16-Byte NAND:
- __h128u1 _hmc128_nand_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores the NAND operation's result between the memory operand (16-byte) and
- immediate operand (16-byte) into the memory operand address. Returns the
- original memory operand value.
-
- 16-Byte NOR:
- __h128u1 _hmc128_nor_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores the NOR operation's result between the memory operand (16-byte) and
- immediate operand (16-byte) into the memory operand address. Returns the
- original memory operand value.
-
- 16-Byte OR:
- __h128u1 _hmc128_or_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores the OR operation's result between the memory operand (16-byte) and
- immediate operand (16-byte)  into the memory operand address. Returns the
- original memory operand value.
-
- 16-Byte XOR:
- __h128u1 _hmc128_xor_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores the XOR operation's result between the memory operand (16-byte) and
- immediate operand (16-byte) into the memory operand address. Returns the
- original memory operand value.
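-
- The five boolean functions share one pattern: compute the operation, store
- the result at the memory operand address, and return the original value. A
- minimal sketch for NAND follows. It assumes a two-word struct (sk_u128) as a
- stand-in for __h128u1 and takes the memory operand by pointer so the store is
- visible to the caller; both are assumptions made for the illustration.
-

```c
#include <stdint.h>

/* Hypothetical stand-in for __h128u1: two 64-bit halves. */
typedef struct { uint64_t lo, hi; } sk_u128;

/* Sketch of the shared pattern: write (old NAND imm) to memory and return
 * the old value. The other four operations differ only in the expression. */
sk_u128 sk_fetch_nand128(sk_u128 *mem, sk_u128 imm)
{
    sk_u128 old = *mem;
    mem->lo = ~(old.lo & imm.lo);
    mem->hi = ~(old.hi & imm.hi);
    return old;
}
```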
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- COMPARISON:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 8-Byte Compare and Swap if Greater Than:
- __h128u1 _hmc64_cmpswapgt_s(__h64u1 mem_op, __h64u1 imm_op)
- Stores into the memory operand address the greater value between the memory
- operand (8-byte) and immediate operand (8-byte). Returns the original memory
- operand value.
-
- 8-Byte Compare and Swap if Less Than:
- __h128u1 _hmc64_cmpswaplt_s(__h64u1 mem_op, __h64u1 imm_op)
- Stores into the memory operand address the smaller value between the memory
- operand (8-byte) and immediate operand (8-byte). Returns the original memory
- operand value.
-
- 16-Byte Compare and Swap if Greater Than:
- __h128u1 _hmc128_cmpswapgt_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores into the memory operand address the greater value between the memory
- operand (16-byte) and immediate operand (16-byte). Returns the original memory
- operand value.
-
- 16-Byte Compare and Swap if Less Than:
- __h128u1 _hmc128_cmpswaplt_s(__h128u1 mem_op, __h128u1 imm_op)
- Stores into the memory operand address the smaller value between the memory
- operand (16-byte) and immediate operand (16-byte). Returns the original memory
- operand value.
-
- 8-Byte Compare and Swap if Equal:
- __h64u1 _hmc64_cmpswapeq_s(__h64u1 mem_op, __h64u1 imm_op, __h64u1 cmp_field)
- Compares the cmp_field (8-byte) with memory operand value (8-byte). If equal,
- stores the immediate operand value (8-byte) into the memory operand address.
- Returns the original memory operand value.
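-
- The compare-and-swap-if-equal behavior can be sketched as below. The name
- sk_cmpswapeq64 is hypothetical, uint64_t stands in for __h64u1, and the
- memory operand is passed by pointer (an assumption) so that the conditional
- store can be observed.
-

```c
#include <stdint.h>

/* Sketch only: stores imm_op when the current value equals cmp_field and
 * always returns the value that was in memory before the call. */
uint64_t sk_cmpswapeq64(uint64_t *mem, uint64_t imm_op, uint64_t cmp_field)
{
    uint64_t old = *mem;
    if (old == cmp_field)
        *mem = imm_op;
    return old;
}
```

-
- A caller can tell whether the swap happened by comparing the returned value
- with cmp_field.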
-
- 16-Byte Compare and Swap if Zero:
- __h128u1 _hmc128_cmpswapz_s(__h128u1 mem_op, __h128u1 imm_op)
- Compares the memory operand value (16-byte) with zero. If equal, stores the
- immediate operand value (16-byte) into the memory operand address. Returns
- the original memory operand value.
-
- 8-Byte Equal To:
- __h16u1 _hmc64_cmpswapgt_s(__h64u1 mem_op, __h64u1 imm_op)
- Checks whether the memory operand value (8-byte) is equal to the immediate
- operand value (8-byte). Returns 1 if equal, 0 if not.
-
- 16-Byte Equal To:
- __h16u1 _hmc128_cmpswapgt_s(__h128u1 mem_op, __h128u1 imm_op)
- Checks whether the memory operand value (16-byte) is equal to the immediate
- operand value (16-byte). Returns 1 if equal, 0 if not.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- CUSTOMIZED:
- These instructions were created to fit better in usual programming problems.
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 8-Byte Compare if Greater Than or equal:
- __h16u1 _hmc64_cmpgteq_s(__h64u1 *mem_op, __h64u1 imm_op)
- Checks whether the memory operand value is less than or equal to the
- immediate operand value. If true, returns 0 to the caller; otherwise,
- returns 1.
-
- 8-Byte Compare if Less Than or equal:
- __h16u1 _hmc64_cmplteq_s(__h64u1 *mem_op, __h64u1 imm_op)
- Checks whether the memory operand value is greater than or equal to the
- immediate operand value. If true, returns 0 to the caller; otherwise,
- returns 1.
-
- 8-Byte Compare if Less Than:
- __h16u1 _hmc64_cmplt_s(__h64u1 *mem_op, __h64u1 imm_op)
- Checks whether the memory operand value is greater than the immediate
- operand value. If true, returns 0 to the caller; otherwise, returns 1.
-
--------------------------------------------------------------------------------
-
-------------------------------- MIPS FUNCTIONS --------------------------------
-
- The implementation is based on the MIPS ISA but does not follow exactly the
- real behavior of the architecture. All instructions were implemented, except
- memory and floating-point operations.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- ARITHMETIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Add:
- __m32s1 _mips_add(__m32s1 rs, __m32s1 rt)
- Adds the rs and rt registers and returns the result to the caller.
-
- Add Unsigned:
- __m32u1 _mips_addu(__m32u1 rs, __m32u1 rt)
- Adds the rs and rt registers and returns the result to the caller.
-
- Subtract:
- __m32s1 _mips_sub(__m32s1 rs, __m32s1 rt)
- Subtracts rt from rs and returns the result to the caller.
-
- Subtract Unsigned:
- __m32u1 _mips_subu(__m32u1 rs, __m32u1 rt)
- Subtracts rt from rs and returns the result to the caller.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- IMMEDIATE ARITHMETIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Add Immediate:
- __m32s1 _mips_addi(__m32s1 rs, __m32s1 imm_op)
- Sums the rs register and immediate operand and returns the result to the call
- function.
-
- Add Immediate Unsigned:
- __m32u1 _mips_addiu(__m32u1 rs, __m32u1 imm_op)
- Sums the rs register and immediate operand and returns the result to the call
- function.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- LOGIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- And:
- __m32u1 _mips_and(__m32u1 rs, __m32u1 rt)
- Applies the AND operation between the rs and rt registers and returns the
- result to the call function.
-
- Nor:
- __m32u1 _mips_nor(__m32u1 rs, __m32u1 rt)
- Applies the NOR operation between the rs and rt registers and returns the
- result to the call function.
-
- Or:
- __m32u1 _mips_or(__m32u1 rs, __m32u1 rt)
- Applies the OR operation between the rs and rt registers and returns the
- result to the call function.
-
- Xor:
- __m32u1 _mips_xor(__m32u1 rs, __m32u1 rt)
- Applies the XOR operation between the rs and rt registers and returns the
- result to the call function.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- IMMEDIATE LOGIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- And Immediate:
- __m32u1 _mips_andi(__m32u1 rs, __m32u1 imm_op)
- Applies the AND operation between the rs register and the immediate operand
- and returns the result to the caller.
-
- Or Immediate:
- __m32u1 _mips_ori(__m32u1 rs, __m32u1 imm_op)
- Applies the OR operation between the rs register and the immediate operand
- and returns the result to the caller.
-
- Xor Immediate:
- __m32u1 _mips_xori(__m32u1 rs, __m32u1 imm_op)
- Applies the XOR operation between the rs register and the immediate operand
- and returns the result to the caller.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- COMPARISON:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Set Less Than:
- __m32s1 _mips_slt(__m32s1 rs, __m32s1 rt)
- Compares the rs and rt registers. Returns 1 if rs is less than rt;
- 0 otherwise.
-
- Set Less Than Unsigned:
- __m32u1 _mips_sltu(__m32u1 rs, __m32u1 rt)
- Compares the rs and rt registers. Returns 1 if rs is less than rt;
- 0 otherwise.
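-
- The signed and unsigned variants can give different answers for the same bit
- pattern. The sketch below uses hypothetical stand-ins (sk_slt, sk_sltu) with
- int32_t and uint32_t in place of __m32s1 and __m32u1; it is not the
- library's code.
-

```c
#include <stdint.h>

/* Signed comparison: operands are interpreted in two's complement. */
int32_t sk_slt(int32_t rs, int32_t rt)
{
    return rs < rt ? 1 : 0;
}

/* Unsigned comparison: the same bits are interpreted as magnitudes. */
uint32_t sk_sltu(uint32_t rs, uint32_t rt)
{
    return rs < rt ? 1u : 0u;
}
```

-
- The bit pattern 0xFFFFFFFF is -1 when read as signed, so sk_slt(-1, 1)
- yields 1, while sk_sltu(0xFFFFFFFF, 1) yields 0 because the same pattern is
- the largest unsigned 32-bit value.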
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- IMMEDIATE COMPARISON:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Set Less Than Immediate:
- __m32s1 _mips_slti(__m32s1 rs, __m32s1 imm_op)
- Compares the rs register with the immediate operand. Returns 1 if rs is less
- than the immediate operand; 0 otherwise.
-
- Set Less Than Immediate Unsigned:
- __m32u1 _mips_sltiu(__m32u1 rs, __m32u1 imm_op)
- Compares the rs register with the immediate operand. Returns 1 if rs is less
- than the immediate operand; 0 otherwise.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SHIFT:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Shift Left Logical:
- __m32u1 _mips_sll(__m32u1 rt, __m32u1 shamt)
- Shifts the rt register left by shamt bits. Returns the result to the
- caller.
-
- Shift Right Logical:
- __m32u1 _mips_srl(__m32u1 rt, __m32u1 shamt)
- Shifts the rt register right by shamt bits, filling with zeros. Returns the
- result to the caller.
-
- Shift Right Arithmetic:
- __m32s1 _mips_sra(__m32s1 rt, __m32s1 shamt)
- Shifts the rt register right by shamt bits, preserving the sign of rt.
- Returns the result to the caller.
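-
- The difference between the logical and arithmetic right shifts can be
- sketched as follows. The names sk_srl and sk_sra are hypothetical, the shift
- amount is taken as unsigned, and it is masked to 5 bits (as in the MIPS
- encoding) to keep the C code well defined; whether the library does the same
- is not stated here.
-

```c
#include <stdint.h>

/* Logical right shift: vacated bits are filled with zeros. */
uint32_t sk_srl(uint32_t rt, uint32_t shamt)
{
    return rt >> (shamt & 31u);
}

/* Arithmetic right shift written without relying on the
 * implementation-defined behavior of >> on negative signed values. */
int32_t sk_sra(int32_t rt, uint32_t shamt)
{
    uint32_t s = shamt & 31u;
    uint32_t shifted = (uint32_t)rt >> s;

    if (rt < 0) {
        if (s > 0)
            shifted |= ~(UINT32_MAX >> s);  /* fill vacated bits with ones */
        /* The top bit is set, so ~shifted fits in int32_t; -x - 1 == ~x. */
        return -(int32_t)(~shifted) - 1;
    }
    return (int32_t)shifted;
}
```

-
- For instance, sk_sra(-8, 1) gives -4, whereas sk_srl applied to the same
- bit pattern would produce a large positive number.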
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- MULTIPLICATION/DIVISION:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- Divide Only:
- __m32s1 _mips_div(__m32s1 rs, __m32s1 rt)
- Returns the quotient of rs divided by rt to the caller.
-
- Divide Only Unsigned:
- __m32u1 _mips_divu(__m32u1 rs, __m32u1 rt)
- Returns the quotient of rs divided by rt to the caller.
-
- Modulo Only:
- __m32s1 _mips_mod(__m32s1 rs, __m32s1 rt)
- Returns the remainder of rs divided by rt to the caller.
-
- Modulo Only Unsigned:
- __m32u1 _mips_modu(__m32u1 rs, __m32u1 rt)
- Returns the remainder of rs divided by rt to the caller.
-
- Multiply 32-bits:
- __m32s1 _mips_mult32(__m32s1 rs, __m32s1 rt)
- Multiplies the rs and rt registers and returns the 32-bit result to the
- caller.
-
- Multiply 32-bits Unsigned:
- __m32u1 _mips_multu32(__m32u1 rs, __m32u1 rt)
- Multiplies the rs and rt registers and returns the 32-bit result to the
- caller.
-
- Multiply 64-bits:
- __m64s1 _mips_mult64(__m32s1 rs, __m32s1 rt)
- Multiplies the rs and rt registers and returns the 64-bit result to the
- caller.
-
- Multiply 64-bits Unsigned:
- __m64u1 _mips_multu64(__m32u1 rs, __m32u1 rt)
- Multiplies the rs and rt registers and returns the 64-bit result to the
- caller.
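-
- The 64-bit variants matter when the product of two 32-bit registers does not
- fit in 32 bits. The sketch below uses hypothetical names (sk_mult64,
- sk_multu64, sk_multu32) and standard fixed-width types in place of the
- library types. That the 32-bit variant keeps only the low 32 bits is an
- assumption of this sketch.
-

```c
#include <stdint.h>

/* Widening signed multiply: operands are extended before multiplying. */
int64_t sk_mult64(int32_t rs, int32_t rt)
{
    return (int64_t)rs * (int64_t)rt;
}

/* Widening unsigned multiply. */
uint64_t sk_multu64(uint32_t rs, uint32_t rt)
{
    return (uint64_t)rs * (uint64_t)rt;
}

/* 32-bit unsigned multiply: only the low 32 bits of the product survive. */
uint32_t sk_multu32(uint32_t rs, uint32_t rt)
{
    return (uint32_t)((uint64_t)rs * (uint64_t)rt);
}
```

-
- Multiplying 65536 by 65536 gives 4294967296 with the 64-bit sketch and 0
- with the 32-bit one.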
-
--------------------------------------------------------------------------------
-
-------------------------------- VIMA FUNCTIONS -------------------------------
-
- The implementation is based on the MIPS and ARM NEON specifications, and the
- model is inspired by the HIVE module for HMC, to vectorize data transfer
- inside the HMC. It does not follow exactly the real behavior described in
- these specifications. As VIMA implements vectorized instructions, the vector
- sizes are specified below.
-
- VM64I: 256-byte array for integer types;
- VM2KI: 8-Kbyte array for integer types;
- VM32L: 256-byte array for long integer types;
- VM1KL: 8-Kbyte array for long integer types;
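-
- The byte sizes above follow from the element counts used in the function
- names (64, 2K, 32, 1K) multiplied by the element width. The sketch below
- checks that arithmetic. The SK_ constants and sk_vector_bytes are
- hypothetical, and reading VM64I, VM2KI, VM32L, and VM1KL as element counts
- is an assumption based on their names.
-

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed element counts, taken from the names of the vector sizes. */
enum { SK_VM64I = 64, SK_VM2KI = 2048, SK_VM32L = 32, SK_VM1KL = 1024 };

/* Size in bytes of a vector with the given element count and width. */
size_t sk_vector_bytes(size_t elements, size_t element_size)
{
    return elements * element_size;
}
```

-
- With 4-byte integers, 64 elements occupy 256 bytes and 2048 elements occupy
- 8192 bytes (8 Kbytes); with 8-byte long integers, 32 and 1024 elements give
- the same two sizes.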
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- ARITHMETIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Add (64 integers):
- _vim64_iadds(__v32s *a, __v32s *b, __v32s *c)
- Perform signed addition between 32-bit elements source vectors A[0:63]
- and B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Add (2048 integers):
- _vim2K_iadds(__v32s *a, __v32s *b, __v32s *c)
- Perform signed addition between 32-bit elements source vectors A[0:2047]
- and B[0:2047] and stores the result into the destination vector C[0:2047].
-
- 32-bit Add Unsigned (64 integers):
- _vim64_iaddu(__v32u *a, __v32u *b, __v32u *c)
- Perform unsigned addition between 32-bit elements source vectors A[0:63]
- and B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Add Unsigned (2048 integers):
- _vim2K_iaddu(__v32u *a, __v32u *b, __v32u *c)
- Perform unsigned addition between 32-bit elements source vectors A[0:2047]
- and B[0:2047] and stores the result into the destination vector C[0:2047].
-
- 32-bit Subtract (64 integers):
- _vim64_isubs(__v32s *a, __v32s *b, __v32s *c)
- Perform signed subtraction between 32-bit elements source vectors A[0:63]
- and B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Subtract (2048 integers):
- _vim2K_isubs(__v32s *a, __v32s *b, __v32s *c)
- Perform signed subtraction between 32-bit elements source vectors A[0:2047]
- and B[0:2047] and stores the result into the destination vector C[0:2047].
-
- 32-bit Subtract Unsigned (64 integers):
- _vim64_isubu(__v32u *a, __v32u *b, __v32u *c)
- Perform unsigned subtraction between 32-bit elements source vectors A[0:63]
- and B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Subtract Unsigned (2048 integers):
- _vim2K_isubu(__v32u *a, __v32u *b, __v32u *c)
- Perform unsigned subtraction between 32-bit elements source vectors A[0:2047]
- and B[0:2047] and stores the result into the destination vector C[0:2047].
-
- 32-bit Abs (64 integers):
- _vim64_iabss(__v32s *a, __v32s *b)
- Takes the absolute value of each 32-bit element in a source vector A[0:63]
- and stores it into the destination vector B[0:63].
-
- 32-bit Abs (2048 integers):
- _vim2K_iabss(__v32s *a, __v32s *b)
- Takes the absolute value of each 32-bit element in a source vector A[0:2047]
- and stores it into the destination vector B[0:2047].
-
- 32-bit Max (64 integers):
- _vim64_imaxs(__v32s *a, __v32s *b, __v32s *c)
- Find the maximal value between each 32-bit element of source vectors A[0:63]
- and B[0:63] and stores it into the destination vector C[0:63].
-
- 32-bit Max (2048 integers):
- _vim2K_imaxs(__v32s *a, __v32s *b, __v32s *c)
- Find the maximal value between each 32-bit element of source vectors A[0:2047]
- and B[0:2047] and stores it into the destination vector C[0:2047].
-
- 32-bit Min (64 integers):
- _vim64_imins(__v32s *a, __v32s *b, __v32s *c)
- Find the minimal value between each 32-bit element of source vectors A[0:63]
- and B[0:63] and stores it into the destination vector C[0:63].
-
- 32-bit Min (2048 integers):
- _vim2K_imins(__v32s *a, __v32s *b, __v32s *c)
- Find the minimal value between each 32-bit element of source vectors A[0:2047]
- and B[0:2047] and stores it into the destination vector C[0:2047].
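-
- The arithmetic functions above all apply one operation element by element.
- A minimal sketch of that pattern for the 64-element unsigned add and signed
- max follows. The sk_ names are hypothetical stand-ins, uint32_t and int32_t
- replace __v32u and __v32s, and the plain loop only models the result, not
- how the simulated hardware executes it.
-

```c
#include <stddef.h>
#include <stdint.h>

#define SK_VM64I 64  /* assumed element count for the 64-integer variants */

/* c[i] = a[i] + b[i] for each of the 64 elements (wraps on overflow). */
void sk_vim64_iaddu(const uint32_t *a, const uint32_t *b, uint32_t *c)
{
    for (size_t i = 0; i < SK_VM64I; ++i)
        c[i] = a[i] + b[i];
}

/* c[i] = max(a[i], b[i]) for each of the 64 elements. */
void sk_vim64_imaxs(const int32_t *a, const int32_t *b, int32_t *c)
{
    for (size_t i = 0; i < SK_VM64I; ++i)
        c[i] = (a[i] > b[i]) ? a[i] : b[i];
}
```

-
- The 2048-element variants would differ only in the loop bound.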
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- LOGIC:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit And (64 integers):
- _vim64_iandu(__v32u *a, __v32u *b, __v32u *c)
- Perform AND operation between 32-bit elements source vectors A[0:63] and
- B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit And (2048 integers):
- _vim2K_iandu(__v32u *a, __v32u *b, __v32u *c)
- Perform AND operation between 32-bit elements source vectors A[0:2047] and
- B[0:2047] and stores the result into the destination vector C[0:2047].
-
- 32-bit Or (64 integers):
- _vim64_iorun(__v32u *a, __v32u *b, __v32u *c)
- Perform OR operation between 32-bit elements source vectors A[0:63] and
- B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Or (2048 integers):
- _vim2K_iorun(__v32u *a, __v32u *b, __v32u *c)
- Perform OR operation between 32-bit elements of source vectors A[0:2047] and
- B[0:2047] and stores the result into the destination vector C[0:2047].
-
- 32-bit Xor (64 integers):
- _vim64_ixoru(__v32u *a, __v32u *b, __v32u *c)
- Perform XOR operation between 32-bit elements source vectors A[0:63] and
- B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Xor (2048 integers):
- _vim2K_ixoru(__v32u *a, __v32u *b, __v32u *c)
- Perform XOR operation between 32-bit elements source vectors A[0:2047] and
- B[0:2047] and stores the result into the destination vector C[0:2047].
-
- 32-bit Not (64 integers):
- _vim64_inots(__v32s *a, __v32s *b)
- Perform NOT operation in 32-bit elements source vector A[0:63] and stores
- the result into the destination vector B[0:63].
-
- 32-bit Not (2048 integers):
- _vim2K_inots(__v32s *a, __v32s *b)
- Perform NOT operation in 32-bit elements source vector A[0:2047] and stores
- the result into the destination vector B[0:2047].
-
- 32-bit Mask (64 integers):
- _vim64_imsks(__v32s *a, __v32s *b, __v32s *c)
- Inserts each signed 32-bit element of source vector A[0:63] into the
- destination vector C[0:63] if the corresponding 32-bit element of source
- vector B[0:63] is 0; otherwise, the destination element is left unchanged.
-
- 32-bit Mask (2048 integers):
- _vim2K_imsks(__v32s *a, __v32s *b, __v32s *c)
- Inserts each signed 32-bit element of source vector A[0:2047] into the
- destination vector C[0:2047] if the corresponding 32-bit element of source
- vector B[0:2047] is 0; otherwise, the destination element is left unchanged.
-
- 32-bit Masku (64 integers):
- _vim64_imsku(__v32u *a, __v32u *b, __v32u *c)
- Inserts each unsigned 32-bit element of source vector A[0:63] into the
- destination vector C[0:63] if the corresponding 32-bit element of source
- vector B[0:63] is 0; otherwise, the destination element is left unchanged.
-
- 32-bit Masku (2048 integers):
- Inserts each unsigned 32-bit element of source vector A[0:2047] into the
- destination vector C[0:2047] if the corresponding 32-bit element of source
- vector B[0:2047] is 0; otherwise, the destination element is left unchanged.
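-
- The mask functions copy an element only where the mask vector holds 0. A
- minimal sketch of the 64-element signed variant follows, with the
- hypothetical name sk_vim64_imsks and int32_t in place of __v32s.
-

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only: c[i] receives a[i] where b[i] is 0; other elements of c keep
 * the values they had before the call. */
void sk_vim64_imsks(const int32_t *a, const int32_t *b, int32_t *c)
{
    for (size_t i = 0; i < 64; ++i) {
        if (b[i] == 0)
            c[i] = a[i];
    }
}
```

-
- Because untouched elements keep their previous contents, the destination
- vector should be initialized before the call.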
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- COMPARISON:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Set Less Than (64 integers):
- _vim64_islts(__v32s *a, __v32s *b, __v32s *c)
- Compares each pair of signed 32-bit elements of source vectors A[0:63] and
- B[0:63]. If the element of A[0:63] is smaller, the destination vector
- C[0:63] stores 1 in the same position; otherwise, it stores 0.
-
- 32-bit Set Less Than (2048 integers):
- _vim2K_islts(__v32s *a, __v32s *b, __v32s *c)
- Compares each pair of signed 32-bit elements of source vectors A[0:2047] and
- B[0:2047]. If the element of A[0:2047] is smaller, the destination vector
- C[0:2047] stores 1 in the same position; otherwise, it stores 0.
-
- 32-bit Set Less Than Unsigned (64 integers):
- _vim64_isltu(__v32u *a, __v32u *b, __v32u *c)
- Compares each pair of unsigned 32-bit elements of source vectors A[0:63] and
- B[0:63]. If the element of A[0:63] is smaller, the destination vector
- C[0:63] stores 1 in the same position; otherwise, it stores 0.
-
- 32-bit Set Less Than Unsigned (2048 integers):
- _vim2K_isltu(__v32u *a, __v32u *b, __v32u *c)
- Compares each pair of unsigned 32-bit elements of source vectors A[0:2047]
- and B[0:2047]. If the element of A[0:2047] is smaller, the destination
- vector C[0:2047] stores 1 in the same position; otherwise, it stores 0.
-
- 32-bit Compare if equal (64 integers):
- _vim64_icmqs(__v32s *a, __v32s *b, __v32s *c)
- Compares each pair of signed 32-bit elements of source vectors A[0:63] and
- B[0:63]. If they are equal, the destination vector C[0:63] stores 1 in the
- same position; otherwise, it stores 0.
-
- 32-bit Compare if equal (2048 integers):
- _vim2K_icmqs(__v32s *a, __v32s *b, __v32s *c)
- Compares each pair of signed 32-bit elements of source vectors A[0:2047] and
- B[0:2047]. If they are equal, the destination vector C[0:2047] stores 1 in
- the same position; otherwise, it stores 0.
-
- 32-bit Compare if equal Unsigned (64 integers):
- _vim64_icmqu(__v32u *a, __v32u *b, __v32u *c)
- Compares each pair of unsigned 32-bit elements of source vectors A[0:63] and
- B[0:63]. If they are equal, the destination vector C[0:63] stores 1 in the
- same position; otherwise, it stores 0.
-
- 32-bit Compare if equal Unsigned (2048 integers):
- _vim2K_icmqu(__v32u *a, __v32u *b, __v32u *c)
- Compares each pair of unsigned 32-bit elements of source vectors A[0:2047]
- and B[0:2047]. If they are equal, the destination vector C[0:2047] stores 1
- in the same position; otherwise, it stores 0.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SHIFT:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Shift Left Logical (64 integers):
- _vim64_isllu(__v32u *a, __v32u *b, __v32u *c)
- Left shifts each 32-bit element of source vector A[0:63] by the amount
- specified in source vector B[0:63] and stores the result into the
- destination vector C[0:63]. This is a logical shift: the sign is not
- preserved.
-
- 32-bit Shift Left Logical (2048 integers):
- _vim2K_isllu(__v32u *a, __v32u *b, __v32u *c)
- Left shifts each 32-bit element of source vector A[0:2047] by the amount
- specified in source vector B[0:2047] and stores the result into the
- destination vector C[0:2047]. This is a logical shift: the sign is not
- preserved.
-
- 32-bit Shift Right Logical (64 integers):
- _vim64_isrlu(__v32u *a, __v32u *b, __v32u *c)
- Right shifts each 32-bit element of source vector A[0:63] by the amount
- specified in source vector B[0:63] and stores the result into the
- destination vector C[0:63]. This is a logical shift: the sign is not
- preserved.
-
- 32-bit Shift Right Logical (2048 integers):
- _vim2K_isrlu(__v32u *a, __v32u *b, __v32u *c)
- Right shifts each 32-bit element of source vector A[0:2047] by the amount
- specified in source vector B[0:2047] and stores the result into the
- destination vector C[0:2047]. This is a logical shift: the sign is not
- preserved.
-
- 32-bit Shift Right Arithmetic (64 integers):
- _vim64_isras(__v32s *a, __v32s *b, __v32s *c)
- Right shifts each 32-bit element of source vector A[0:63] by the amount
- specified in source vector B[0:63] and stores the result into the
- destination vector C[0:63]. This is an arithmetic shift: the sign is
- preserved.
-
- 32-bit Shift Right Arithmetic (2048 integers):
- _vim2K_isras(__v32s *a, __v32s *b, __v32s *c)
- Right shifts each 32-bit element of source vector A[0:2047] by the amount
- specified in source vector B[0:2047] and stores the result into the
- destination vector C[0:2047]. This is an arithmetic shift: the sign is
- preserved.
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- MULTIPLICATION/DIVISION:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Divide Only (64 integers):
- _vim64_idivs(__v32s *a, __v32s *b, __v32s *c)
- Perform a signed division between 32-bit elements from source vectors A[0:63]
- and B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Divide Only (2048 integers):
- _vim2K_idivs(__v32s *a, __v32s *b, __v32s *c)
- Perform a signed division between 32-bit elements from source vectors A[0:2047]
- and B[0:2047] and stores the result into the destination vector C[0:2047].
-
- 32-bit Divide Only Unsigned (64 integers):
- _vim64_idivu(__v32u *a, __v32u *b, __v32u *c)
- Perform an unsigned division between 32-bit elements from source vectors
- A[0:63] and B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Divide Only Unsigned (2048 integers):
- _vim2K_idivu(__v32u *a, __v32u *b, __v32u *c)
- Perform an unsigned division between 32-bit elements from source vectors
- A[0:2047] and B[0:2047] and stores the result into the destination vector
- C[0:2047].
-
- 32-bit Modulo Only (64 integers):
- _vim64_imods(__v32s *a, __v32s *b, __v32s *c)
- Performs a signed modulo operation between 32-bit elements of source vectors
- A[0:63] and B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Modulo Only (2048 integers):
- _vim2K_imods(__v32s *a, __v32s *b, __v32s *c)
- Performs a signed modulo operation between 32-bit elements of source vectors
- A[0:2047] and B[0:2047] and stores the result into the destination vector
- C[0:2047].
-
- 32-bit Modulo Only Unsigned (64 integers):
- _vim64_imodu(__v32u *a, __v32u *b, __v32u *c)
- Performs an unsigned modulo operation between 32-bit elements of source
- vectors A[0:63] and B[0:63] and stores the result into the destination vector
- C[0:63].
-
- 32-bit Modulo Only Unsigned (2048 integers):
- _vim2K_imodu(__v32u *a, __v32u *b, __v32u *c)
- Performs an unsigned modulo operation between 32-bit elements of source
- vectors A[0:2047] and B[0:2047] and stores the result into the destination
- vector C[0:2047].
-
- 32-bit Multiply (64 integers):
- _vim64_imuls(__v32s *a, __v32s *b, __v32s *c)
- Perform signed multiplication between 32-bit elements from source vectors
- A[0:63] and B[0:63] and stores the result into the destination vector
- C[0:63].
-
- 32-bit Multiply (2048 integers):
- _vim2K_imuls(__v32s *a, __v32s *b, __v32s *c)
- Perform signed multiplication between 32-bit elements from source vectors
- A[0:2047] and B[0:2047] and stores the result into the destination vector
- C[0:2047].
-
- 32-bit Multiply Unsigned (64 integers):
- _vim64_imulu(__v32u *a, __v32u *b, __v32u *c)
- Perform unsigned multiplication between 32-bit elements from source vectors
- A[0:63] and B[0:63] and stores the result into the destination vector C[0:63].
-
- 32-bit Multiply Unsigned (2048 integers):
- _vim2K_imulu(__v32u *a, __v32u *b, __v32u *c)
- Perform unsigned multiplication between 32-bit elements from source vectors
- A[0:2047] and B[0:2047] and stores the result into the destination vector
- C[0:2047].
-
- 64-bit Multiply (32 integers):
- _vim32_imuls(__v64s *a, __v64s *b, __v64s *c)
- Performs signed multiplication between 64-bit elements of source vectors
- A[0:31] and B[0:31] and stores the result into the destination vector C[0:31].
-
- 64-bit Multiply (1024 integers):
- _vim1K_imuls(__v64s *a, __v64s *b, __v64s *c)
- Performs signed multiplication between 64-bit elements of source vectors
- A[0:1023] and B[0:1023] and stores the result into the destination vector
- C[0:1023].
-
- 64-bit Multiply Unsigned (32 integers):
- _vim32_imulu(__v64u *a, __v64u *b, __v64u *c)
- Performs unsigned multiplication between 64-bit elements of source vectors
- A[0:31] and B[0:31] and stores the result into the destination vector C[0:31].
-
- 64-bit Multiply Unsigned (1024 integers):
- _vim1K_imulu(__v64u *a, __v64u *b, __v64u *c)
- Performs unsigned multiplication between 64-bit elements of source vectors
- A[0:1023] and B[0:1023] and stores the result into the destination vector
- C[0:1023].
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- IMMEDIATE:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Move Immediate Data (64 integers):
- _vim64_imovs(__v32s *a, __v32s b)
- Replicate a signed 32-bit immediate b into the vector A[0:63].
-
- 32-bit Move Immediate Data (2048 integers):
- _vim2K_imovs(__v32s *a, __v32s b)
- Replicate a signed 32-bit immediate b into the vector A[0:2047].
-
- 32-bit Move Immediate Data Unsigned (64 integers):
- _vim64_imovu(__v32u *a, __v32u b)
- Replicates an unsigned 32-bit immediate b into the vector A[0:63].
-
- 32-bit Move Immediate Data Unsigned (2048 integers):
- _vim2K_imovu(__v32u *a, __v32u b)
- Replicates an unsigned 32-bit immediate b into the vector A[0:2047].
-
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- kNN FLOAT INSTRUCTIONS:
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
- 32-bit Move Immediate Data (64 floats):
- _vim64_fmovs(__v32f *a, __v32f b)
- Replicate a signed 32-bit floating-point immediate b into the vector A[0:63].
-
- 32-bit Subtract (64 floats):
- _vim64_fsubs(__v32f *a, __v32f *b, __v32f *c)
- Perform signed subtraction between 32-bit floating-point elements source
- vectors A[0:63] and B[0:63] and stores the result into the destination vector
- C[0:63].
-
- 32-bit Multiply (64 floats):
- _vim64_fmuls(__v32f *a, __v32f *b, __v32f *c)
- Perform signed multiplication between 32-bit floating-point elements from
- source vectors A[0:63] and B[0:63] and stores the result into the destination
- vector C[0:63].
-
- 32-bit Cumulative Sum (64 floats):
- _vim64_fcsum(__v32f *a, __v32f *b)
- Perform cumulative sum of the 32-bit floating-point elements from source vector
- A[0:63] in variable b.
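-
- One plausible way to combine the float instructions above in a kNN kernel is
- a squared Euclidean distance: subtract, square, then sum. The sketch below
- models that composition with hypothetical sk_ functions over plain float
- arrays. It assumes the cumulative sum reduces the 64 elements into the
- single variable b; the distance function itself is an illustration and is
- not taken from the library.
-

```c
#include <stddef.h>

#define SK_N 64  /* assumed element count for the 64-float variants */

/* c[i] = a[i] - b[i] */
void sk_fsubs(const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < SK_N; ++i)
        c[i] = a[i] - b[i];
}

/* c[i] = a[i] * b[i] */
void sk_fmuls(const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < SK_N; ++i)
        c[i] = a[i] * b[i];
}

/* *b = a[0] + a[1] + ... + a[63] */
void sk_fcsum(const float *a, float *b)
{
    float sum = 0.0f;
    for (size_t i = 0; i < SK_N; ++i)
        sum += a[i];
    *b = sum;
}

/* Squared Euclidean distance between two 64-element points. */
float sk_squared_distance(const float *p, const float *q)
{
    float diff[SK_N], sq[SK_N], total;

    sk_fsubs(p, q, diff);     /* element-wise difference */
    sk_fmuls(diff, diff, sq); /* element-wise square */
    sk_fcsum(sq, &total);     /* reduce to a single value */
    return total;
}
```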
-