
Targeting a Compiler for Tagha and Tagha's Specifications

assyrianic edited this page Dec 31, 2017 · 24 revisions

Introduction

Although I would like Tagha to have an official compiler, Tagha is not meant to be tied to a single compiler, much as C itself has many compilers. Tagha is only meant to be a minimal runtime: it provides the bare minimum of computation needed to run C code and to use that C code in an abstracted way.

Tagha is an oddity among virtual machines: an instruction's addressing mode is stored in a second byte after the opcode, and how that byte is read depends on the opcode and the action it needs to take.

The reasoning behind this is to avoid instruction decoding overhead. A real CPU can read individual bits at almost no cost, but a virtual machine has to spend computation time to do the same thing. Fetching a second byte for the addressing mode instead of decoding bit fields should increase speed, though that remains debatable without profiling data as evidence.
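As a rough illustration of the idea, here is a minimal sketch of an interpreter loop that reads the addressing mode as a whole byte following the opcode, so no masking or shifting is needed. The opcode values, mode values, operand layout, and the name run_sketch are all hypothetical and chosen for this example only; they are not Tagha's real encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding for illustration only (NOT Tagha's real one):
 * every non-halt instruction is 4 bytes: opcode, mode, dst, src. */
enum { OP_HALT = 0, OP_MOV = 1, OP_ADD = 2 };
enum { MODE_IMMEDIATE = 1, MODE_REGISTER = 2 };
enum { NUM_REGS = 4 };

/* Executes the byte stream and returns the final value of register 0. */
int64_t run_sketch(const uint8_t *code, size_t len)
{
    int64_t regs[NUM_REGS] = {0};
    size_t ip = 0;

    while (ip < len) {
        const uint8_t op = code[ip++];
        if (op == OP_HALT)
            break;

        assert(ip + 3 <= len);            /* mode, dst, src must follow */
        const uint8_t mode = code[ip++];  /* addressing mode: a plain byte read */
        const uint8_t dst  = code[ip++];
        const uint8_t src  = code[ip++];
        assert(dst < NUM_REGS);

        /* The mode byte decides how the source operand is interpreted. */
        int64_t value;
        if (mode == MODE_IMMEDIATE) {
            value = src;                  /* src is an 8-bit immediate */
        } else {
            assert(mode == MODE_REGISTER && src < NUM_REGS);
            value = regs[src];            /* src is a register index */
        }

        switch (op) {
            case OP_MOV: regs[dst] = value;  break;
            case OP_ADD: regs[dst] += value; break;
            default:     assert(0 && "unknown opcode");
        }
    }
    return regs[0];
}
```

The trade-off in this sketch is a larger instruction stream (one extra byte per instruction) in exchange for a decode step that is a single array read. Whether that is actually faster would still need profiling.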

To be a worthy abstracted C runtime environment, Tagha must be fast at runtime; speed is a hard requirement.

Tagha C specification

  • A compiler targeting Tagha must support all or most of the runtime-related parts of the C11 standard. (Given its implementation and purpose, Tagha cannot provide multithreading as C11 demands, unless the multithreading is done through natives.)
  • double and long double should always be 8 bytes.
  • float is a 32-bit single-precision value as defined by IEEE 754.
  • Tagha's push and pop operations manipulate the stack 8 bytes at a time.
  • Local function data should be aligned to a 16-byte boundary, so a 20-byte struct requires 12 bytes of padding to reach 32 bytes.
  • long, size_t, and long long should be 8 bytes in size.
  • All binary math operations assume both operands are the same size.
  • Function arguments are passed in registers first and then on the stack: registers rds to rms hold the first 10 arguments (rds holds the 1st argument, rms the 10th), and any remaining arguments are pushed to the stack.
  • Function return values are always returned in the ras register (ras is the accumulator, though it is general purpose). This includes return values from natives. If the return data is larger than 64 bits, then ras, rbs, and rcs can be used; otherwise, optimize by having ras return a pointer.
  • To accommodate natives, Tagha might require a native or host keyword to make it easier for the compiler to generate the native information in the header and/or emit native-oriented opcodes. For example:
// example.h
native int puts(const char *);

If the compiler is smart enough, you may not need this keyword. My original idea was that the compiler would differentiate between natives and regular functions based on whether a function that is used also has an implementation: if there is no implementation, the compiler could assume the function is a native. Here's a code example explaining this:

// example.h again
void func1();    // func1 declared
void func2();    // func2 declared

// example.c
void func1()    // func1 implemented
{
}

int main()
{
    func1();    // func1 is declared, implemented, and invoked: compiles into function-call bytecode.
    func2();    // func2 is declared and invoked but not implemented: compiles into native-call bytecode.
}
  • The calling convention for exported C natives is the same as for normal bytecode functions: rds to rms contain the first 10 arguments, and remaining arguments are placed on the stack.
  • argc and argv are implemented in scripts, but the env variable is not (I see no reason to implement it currently).
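
Two of the rules above are simple arithmetic, so a compiler backend might encode them roughly as follows. This is only a sketch under stated assumptions: the function and constant names are hypothetical, the intermediate register names between rds and rms are inferred from the alphabetical sequence (d through m gives exactly 10), and the 8-byte stack slot per spilled argument is assumed from the push/pop width rather than stated by the specification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the values come from the specification list above. */
enum {
    TAGHA_FRAME_ALIGN = 16,  /* local data alignment in bytes */
    TAGHA_REG_ARGS    = 10,  /* arguments passed in registers */
    TAGHA_STACK_SLOT  = 8    /* ASSUMPTION: one 8-byte push per stack argument */
};

/* Rounds a local data size up to the next multiple of 16 bytes. */
size_t tagha_frame_size(size_t size)
{
    const size_t mask = (size_t)TAGHA_FRAME_ALIGN - 1;
    return (size + mask) & ~mask;
}

/* ASSUMPTION: register names between rds and rms follow the alphabet. */
static const char *const tagha_arg_regs[TAGHA_REG_ARGS] = {
    "rds", "res", "rfs", "rgs", "rhs", "ris", "rjs", "rks", "rls", "rms"
};

/* Returns the register name for a 0-based argument index,
 * or NULL when the argument goes on the stack instead. */
const char *tagha_arg_register(size_t index)
{
    return index < (size_t)TAGHA_REG_ARGS ? tagha_arg_regs[index] : NULL;
}

/* Returns how many bytes of stack a call with argc arguments needs
 * for the arguments that do not fit in registers. */
size_t tagha_stack_arg_bytes(size_t argc)
{
    if (argc <= (size_t)TAGHA_REG_ARGS)
        return 0;
    return (argc - (size_t)TAGHA_REG_ARGS) * (size_t)TAGHA_STACK_SLOT;
}
```

With these helpers, tagha_frame_size(20) reproduces the 20-byte struct example (32 bytes, so 12 bytes of padding), and a call with 12 arguments places the first 10 in rds through rms and needs 16 bytes of stack for the remaining two.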
