Martin Hellspong edited this page Feb 2, 2016 · 8 revisions

Contributing instruction implementations

For reference, please take a look at the commit implementing 8/16/32-bit CMP in all 12 addressing modes, and then at the commit adding CMPI, which reuses the same common implementation.

Add opcode constants

Add opcode constants for all addressing modes of your instruction in src/cpu/ops/handlers.rs.

As an example, if you were implementing CMP, one of the addressing modes is "address register indirect with displacement", abbreviated "di", which means you could add a constant for CMP.L like so:

pub const OP_CMP_32_DI: u32 = 0xb0a8;

But we don't really want such opaque magic values. Instead, please OR together other constants representing the instruction's logical parts to create the value. Constants for several such parts are already defined, but for some instructions you might need to come up with your own.

In this case, we might use the following parts:

pub const OP_CMP_32_DI: u32 = OP_CMP | LONG_SIZED | OPER_DI;

If this is the first instruction in a "family", you will need to add the constant OP_CMP (alongside OP_ADD, OP_AND etc), by looking at the bitmask for the instruction in the M68000 Programmer's Reference Manual.
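As a rough sketch of how the parts combine, the snippet below derives each part from the M68000 opcode layout for CMP: 1011 in the top four bits, the size in the opmode field (bits 8-6), and the addressing mode in bits 5-3. The numeric values given to OP_CMP, LONG_SIZED and OPER_DI are assumptions inferred from that layout so that the result matches 0xb0a8; check the definitions actually present in handlers.rs before relying on them.

```rust
// Assumed values, inferred from the CMP encoding `1011 xxx ooo mmm yyy`.
// The real definitions live in src/cpu/ops/handlers.rs and may differ.
pub const OP_CMP: u32 = 0b1011_0000_0000_0000; // instruction family, bits 15-12
pub const LONG_SIZED: u32 = 0b0000_0000_1000_0000; // opmode 010 (long), bits 8-6
pub const OPER_DI: u32 = 0b0000_0000_0010_1000; // mode 101 (d16,An), bits 5-3

// The register fields (bits 11-9 and 2-0) stay zero in the constant;
// they are the bits that the opcode mask later leaves open.
pub const OP_CMP_32_DI: u32 = OP_CMP | LONG_SIZED | OPER_DI;
```

With these assumed values, the three parts OR together to 0xb0a8, the same literal as the magic constant shown earlier.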

Also, please add a test (to handlers.rs) for at least one of your constants, with the literal value taken from the Musashi handler table, where the entry for cmp_32_di looks like this:

op_cmp_32_di           , MASK_XY, 0xb0a8, { 18,  18,   7}},

which means the test should check that OP_CMP_32_DI equals 0xb0a8.
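Such a test could look roughly like the sketch below. The test name is hypothetical, and the part constants are repeated here with assumed values only to keep the snippet self-contained; in handlers.rs the test would use the constants already defined there.

```rust
// Assumed part values, stand-ins for the constants defined in handlers.rs.
pub const OP_CMP: u32 = 0xb000;
pub const LONG_SIZED: u32 = 0x0080;
pub const OPER_DI: u32 = 0x0028;
pub const OP_CMP_32_DI: u32 = OP_CMP | LONG_SIZED | OPER_DI;

// Hypothetical test name; follow whatever naming the existing tests use.
#[test]
fn correctly_defined_op_cmp_32_di() {
    // Literal taken from the Musashi handler table entry for cmp_32_di.
    assert_eq!(0xb0a8, OP_CMP_32_DI);
}
```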

Please run the tests using cargo test.

Add common implementation

Most instructions have 8-, 16- and 32-bit versions and several addressing modes, but the actual implementation (with "side effects" such as setting the CCR flags) is the same for all, or most, variations once you have fetched the operands.

Please add the common implementation to src/cpu/ops/common.rs. Very often we need one 8-bit, one 16-bit and one 32-bit implementation.

Typically, the signatures of the common functions look like this:

pub fn cmp_8(core: &mut Core, dst: u32, src: u32) -> u32 {
    ...
}
pub fn cmp_16(core: &mut Core, dst: u32, src: u32) -> u32 {
    ...
}
pub fn cmp_32(core: &mut Core, dst: u32, src: u32) -> u32 {
    ...
}
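To give an idea of what the bodies might contain, here is a minimal sketch of CMP for all three sizes. The Core struct is a hypothetical stand-in with boolean flags (the real Core has many more fields and may represent flags differently), and the cmp_sized helper is an invention of this sketch, not something known to exist in common.rs. The flag rules are those of a subtraction whose result is discarded: N, Z, V and C are set, X is untouched.

```rust
/// Hypothetical, minimal stand-in for the emulator's `Core`.
#[derive(Debug, Default)]
pub struct Core {
    pub n_flag: bool, // negative
    pub z_flag: bool, // zero
    pub v_flag: bool, // signed overflow
    pub c_flag: bool, // carry (borrow)
}

/// Helper assumed by this sketch: compare the low `bits` bits of `dst` and
/// `src`, update N, Z, V and C, and return the masked difference.
fn cmp_sized(core: &mut Core, dst: u32, src: u32, bits: u32) -> u32 {
    let mask = if bits == 32 { u32::MAX } else { (1u32 << bits) - 1 };
    let msb = 1u32 << (bits - 1);
    let (dst, src) = (dst & mask, src & mask);
    let res = dst.wrapping_sub(src) & mask;
    core.n_flag = res & msb != 0;
    core.z_flag = res == 0;
    // Overflow: the operands differ in sign and the result's sign differs from dst's.
    core.v_flag = (src ^ dst) & (res ^ dst) & msb != 0;
    // Borrow: src is larger than dst when both are treated as unsigned.
    core.c_flag = src > dst;
    res
}

pub fn cmp_8(core: &mut Core, dst: u32, src: u32) -> u32 {
    cmp_sized(core, dst, src, 8)
}
pub fn cmp_16(core: &mut Core, dst: u32, src: u32) -> u32 {
    cmp_sized(core, dst, src, 16)
}
pub fn cmp_32(core: &mut Core, dst: u32, src: u32) -> u32 {
    cmp_sized(core, dst, src, 32)
}
```

Keeping the size-specific functions as thin wrappers preserves the signatures shown above, so the per-addressing-mode code does not need to know how the flags are computed.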

For reference, please take a look at the corresponding implementation in the Musashi sources. As Musashi makes heavy use of C preprocessor macros, in some cases it's helpful to look at the preprocessor output (try something like gcc -E -P m68kopac.c -o m68kopac.c.preprocessed). Also, if you don't find m68kopac.c (ops A-C), m68kopdm.c (ops D-M) and m68kopnz.c (ops N-Z), please note that these files are generated by Musashi, so you need to do something like gcc m68kmake.c -o generate-ops and then run the generate-ops executable.

Consider adding a test for the common implementation. This has not been done for most other ops, which is a bit of technical debt.

Add specific implementations

Using some combination of literal functions, or one of the impl_op-macros, define specific implementations of each addressing mode for your instruction in src/cpu/ops/mod.rs, referencing the common implementation and the particular addressing mode.

impl_op!(-, cmp_32, cmp_32_di,   ay_di_32, dx, 6+12);

(Sometimes we define a wrapper macro, typically called cmp_32, to be more DRY, but here we used impl_op directly.)
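To illustrate the idea behind such a macro, here is a simplified sketch that generates a specific implementation from a common function, a source addressing-mode helper, a destination helper and a cycle count. Everything in it is hypothetical: impl_op_sketch is not the real impl_op (it omits the leading argument and any error handling), and Core, ay_di_32, dx and cmp_32 are reduced to stand-ins so that the snippet compiles on its own.

```rust
/// Hypothetical stand-in for `Core`: one data register, one pre-fetched
/// memory operand, and a zero flag.
#[derive(Debug, Default)]
pub struct Core {
    pub dx_value: u32,
    pub memory_operand: u32,
    pub z_flag: bool,
}

// Stand-ins for the operand helpers named in the example above.
fn ay_di_32(core: &mut Core) -> u32 {
    core.memory_operand
}
fn dx(core: &mut Core) -> u32 {
    core.dx_value
}

// Stand-in for the common implementation (only Z is modelled here).
fn cmp_32(core: &mut Core, dst: u32, src: u32) -> u32 {
    let res = dst.wrapping_sub(src);
    core.z_flag = res == 0;
    res
}

// Generates `pub fn $name`, which fetches both operands, calls the common
// implementation, discards the result, and returns the cycle count.
macro_rules! impl_op_sketch {
    ($common:ident, $name:ident, $src:ident, $dst:ident, $cycles:expr) => {
        pub fn $name(core: &mut Core) -> u32 {
            let src = $src(core);
            let dst = $dst(core);
            let _ = $common(core, dst, src);
            $cycles
        }
    };
}

impl_op_sketch!(cmp_32, cmp_32_di, ay_di_32, dx, 6 + 12);
```

The point of the pattern is that each addressing mode becomes a one-line macro invocation, so adding the remaining modes is mostly a matter of swapping the operand helper and the cycle count.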

This process is greatly simplified if you are implementing the dual of some existing instruction. If implementing SUB, you can most likely copy-paste most of the specific ADD implementations, as the two are likely to be very symmetrical: SUB will probably have exactly the same addressing modes, and perhaps even the same cycle counts as ADD (or if not, they may differ only by a constant value). The cycle counts can be found in section 8 of the M68000 User's Manual. Please note that if you copy the cycle counts from the Musashi handler table, we don't get an independent check that Musashi does, in fact, use the correct cycle counts.

Add op_entries for your implementations

Add an op_entry! linking each opcode constant with your specific implementation and an opcode mask, in src/cpu/ops/handlers.rs:

op_entry!(MASK_OUT_X_Y, OP_CMP_32_DI,   cmp_32_di),

Where the mask-value should match the corresponding entry from the Musashi handler table (note the mask-constants have somewhat different names in the Rust code). In this case MASK_OUT_X_Y means the opcode itself specifies two registers X and Y, each of which can have the value 0-7, which means this op-entry really implements 64 different opcodes for the cmp_32_di instruction. MASK_OUT_X or MASK_OUT_Y specifies just one register (8 opcodes), and MASK_EXACT specifies a single opcode.
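The opcode counts above can be checked with a small sketch that enumerates every 16-bit opcode matching a given mask and value. The numeric mask values are assumptions inferred from the position of the X register field (bits 11-9) and the Y register field (bits 2-0); the real constants in handlers.rs may be defined differently, and matching_opcodes is a hypothetical helper.

```rust
// Assumed mask values: a 0 bit means "any value allowed in this position".
pub const MASK_EXACT: u32 = 0xffff;
pub const MASK_OUT_X: u32 = 0xf1ff; // ignores register X, bits 11-9
pub const MASK_OUT_Y: u32 = 0xfff8; // ignores register Y, bits 2-0
pub const MASK_OUT_X_Y: u32 = 0xf1f8; // ignores both register fields

pub const OP_CMP_32_DI: u32 = 0xb0a8;

/// Hypothetical helper: all 16-bit opcodes that an entry with the given
/// mask and matching value would handle.
pub fn matching_opcodes(mask: u32, matching: u32) -> Vec<u32> {
    (0..=0xffffu32)
        .filter(|op| (op & mask) == matching)
        .collect()
}
```

For example, matching_opcodes(MASK_OUT_X_Y, OP_CMP_32_DI) yields 64 opcodes, from 0xb0a8 (X = 0, Y = 0) up to 0xbeaf (X = 7, Y = 7).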

Add QuickCheck tests for your instruction

Locate the other qc-tests in src/musashi.rs, and add tests for your instruction. These can mostly be copied from the corresponding op_entry (but note that the order of op-constant and mask is reversed; this could be simplified):

qc!(OP_CMP_32_DI,   MASK_OUT_X_Y, qc_cmp_32_di);

Note that the name given is not the name of the implementing function; it is the name of the test (which is just the function name prefixed with qc_). Note also that there are both qc and qc8, where qc8 is meant to be used with byte-sized instructions. The difference is that qc8 allows odd values in address registers, whereas qc doesn't, as unaligned word or long memory operations just cause address errors, which doesn't add much value to our test of the implementation of the specific instruction.
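The reason for the qc/qc8 split can be illustrated with a small sketch. Both functions are hypothetical and are not taken from src/musashi.rs: one states the alignment rule (word and long accesses at odd addresses raise an address error on the 68000, byte accesses never do), and the other shows one way a generator could force address registers to even values for word- and long-sized tests.

```rust
/// True if a memory access of `size_in_bytes` at `address` would raise an
/// address error on the 68000 (word or long access at an odd address).
pub fn would_address_error(address: u32, size_in_bytes: u32) -> bool {
    size_in_bytes > 1 && address & 1 == 1
}

/// Hypothetical helper: clear the lowest bit of each address register so
/// that randomly generated register contents are always even.
pub fn even_address_registers(mut aregs: [u32; 8]) -> [u32; 8] {
    for a in aregs.iter_mut() {
        *a &= !1;
    }
    aregs
}
```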

Run the QC tests to compare with Musashi

Please run the tests using ./qc op-prefix, something like:

./qc qc_cmp_8 qc_cmp_16 qc_cmp_32

This will first run cargo test, executing 12 tests for cmp_8, then run a separate instance of cargo test for cmp_16, and then one for cmp_32. This will take several minutes, though it could possibly be sped up: we generate the entire handler table for each individual test and then run hundreds of tests against each opcode, and a MASK_OUT_X_Y instruction is really 64 opcodes, which are all tested separately.

As the tests use semaphores to serialize access to Musashi, a failing test panics while holding the semaphore that all other threads are waiting for (cargo tests are multithreaded). This poisons the semaphore, causing all other tests to fail before actually being run. Therefore you might want to run the tests separately, by listing them one by one:

./qc qc_cmpi_8_dn qc_cmpi_8_ai qc_cmpi_8_pi qc_cmpi_8_pd qc_cmpi_8_di qc_cmpi_8_ix qc_cmpi_8_aw qc_cmpi_8_al qc_cmpi_16_dn qc_cmpi_16_ai qc_cmpi_16_pi qc_cmpi_16_pd qc_cmpi_16_di qc_cmpi_16_ix qc_cmpi_16_aw qc_cmpi_16_al qc_cmpi_32_dn qc_cmpi_32_ai qc_cmpi_32_pi qc_cmpi_32_pd qc_cmpi_32_di qc_cmpi_32_ix qc_cmpi_32_aw qc_cmpi_32_al

However, this will run them in sequence. Using GNU Parallel, you can add some parallelism by running:

parallel --nice 10 ./qc ::: qc_cmpi_8_dn qc_cmpi_8_ai qc_cmpi_8_pi qc_cmpi_8_pd qc_cmpi_8_di qc_cmpi_8_ix qc_cmpi_8_aw qc_cmpi_8_al qc_cmpi_16_dn qc_cmpi_16_ai qc_cmpi_16_pi qc_cmpi_16_pd qc_cmpi_16_di qc_cmpi_16_ix qc_cmpi_16_aw qc_cmpi_16_al qc_cmpi_32_dn qc_cmpi_32_ai qc_cmpi_32_pi qc_cmpi_32_pd qc_cmpi_32_di qc_cmpi_32_ix qc_cmpi_32_aw qc_cmpi_32_al

This will run several tests at a time (eight on my quad-core MacBook). The --nice option lowers the priority; without it your machine will be maxed out. Please note that if you make changes to the source code and then run parallel, you will cause 8 simultaneous compilations, which will freak out rustc! Also note that if you edit the source code during a run, the next test started will compile and run the new version.

When all QC-tests pass, you are done! Thanks for contributing!