diff --git a/core.html b/core.html index 15f64f0..43605d3 100644 --- a/core.html +++ b/core.html @@ -211,13 +211,9 @@
The language specified by this document is thus a subset of the language accepted by the in-house compiler. Any source code translated by an implementation of this should be able to be translated by the original compiler. The reverse is not true. A known list of differences is presented in Appendix D.
+The language specified by this document is thus a subset of the language accepted by the in-house compiler. Any source code translated by an implementation of this should be able to be translated by the original compiler. The reverse is not true. A known list of differences is presented in Appendix E.
Terms that are used only in a small portion of this document are defined where they are used and italicized where they are defined.
behaviour
+Because of the lack of information about the original language, some terms in this document are community defined.
external appearance or action.
-behaviour, implementation-defined
-behavior specific to an implementation, where that implementation must document that behavior.
-behaviour, undefined
-behavior which is not guaranteed to produce any specific result.
-behaviour, unspecified
-behavior for which this specification provides two or more possibilities and imposes no further requirements on which is chosen in any instance.
-constraint
-restriction, either syntactic or semantic, on how language elements can be used.
-must
-describes an absolute requirement of the specification. synonymous with shall.
-must not
-describes an absolute prohibition of the specification. synonymous with shall not.
-should
-describes a recommended but not absolutely necessary requirement of the specification.
-should not
-describes an unrecommended but not absolutely necessary requirement of the specifcation.
-may
-describes an optional feature or behaviour of the specification.
-execution environment
-describes an absolute requirement. synonymous with shall.
+describes an absolute prohibition. synonymous with shall not.
+describes a recommended but not absolutely necessary requirement.
+describes an unrecommended but not absolutely prohibited requirement.
+describes an optional requirement.
+the software on which the result of translation is executed.
-translation environment
-the software on which the language is translated for use by an execution environment.
-implementation
-particular set of software, running in a particular translation environment under particular control options, that performs translation of programs for, and supports execution of commands in, a particular execution environment.
-ill-formed program
-program that is not well-formed.
-well-formed program
-program constructed according to the synctatic and semantic rules as defined by this specification.
-value
-precise meaning of the contents of an name when interpreted as having a specific type.
-argument
-program that is not well-formed.
+precise meaning of the contents of a name when interpreted as having a specific type.
+a value passed to a command that is intended to map to a corresponding parameter.
-parameter
-the value to be received in a specific argument position of a command.
-in-house compiler
-the value to be received in a specific argument of a command.
+the translation environment used by DMA Design. this document refers more specifically to Official GTA3 Script Compiler V413.
+Appendix C describes ambiguities present in the language grammar and resolutions for them.
Appendix D describes problematic elements in the in-house compiler that this specification choose not to follow.
+Appendix D lists commands that cannot happen outside of their very specific syntactical context.
+Appendix E describes problematic elements in the in-house compiler that this specification chooses not to follow.
Throughout this document there are several footnotes explaining or clarifying some decisions. Often these footnotes explain the reason for changes in the language compared to the in-house compiler. These footnotes are purely informative and are not an integral part of this document.
Some changes demand more details than a footnote permits. These are marked with the footnote [1] and further details are presented in Appendix D.
+Some changes demand more details than a footnote permits. These are marked with the footnote [1] and further details are presented in Appendix E.
A script is a unit of execution which contains its own program counter, local variables and compare flag.
+A script is a unit of execution which contains its own program counter, local variables and compare flag.
A program is a collection of scripts running concurrently in a cooperative fashion.
@@ -475,25 +443,25 @@A variable is a named storage location. This location holds a value of specific type.
There are global and local variables. Global variables are stored in a way they are accessible from any script. Local variables pertains to its particular script and is only accessible from it.
+There are global and local variables. Global variables are stored in such a way that they are accessible from any script. A local variable pertains to its particular script and is only accessible from it.
The lifetime of a global variable is the same as of the execution of all scripts. The lifetime of a local variable is the same as its script and lexical scope.
+The lifetime of a global variable is the same as of the execution of all scripts. The lifetime of a local variable is the same as its script and lexical scope.
A command is an operation to be performed by a script. Commands produces side-effects which are described by each command description.
+A command is an operation to be performed by a script. Commands produce side effects which are described by their command description.
A possible side-effect of executing a command is the updating of the compare flag. The compare flag of a command is the boolean result it produces. The compare flag of a script is the compare flag of the its last executed command. The compare flag is useful for conditionally changing the flow of control.
+A possible side effect of executing a command is the updating of the compare flag. The compare flag of a command is the boolean result it produces. The compare flag of a script is the compare flag of its last executed command.
The program counter of a script indicates its currently executing command. Unless one of the side-effects of a command is to change the program counter, the counter goes from the current command to the next sequentially. An explicit change in the program counter is said to be a change in the flow of control.
+The program counter of a script indicates its currently executing command. Unless one of the side effects of a command is to change the program counter, the counter goes from the current command to the next sequentially. An explicit change in the program counter is said to be a change in the flow of control.
A command is said to perform a jump if it changes the flow of control irreversibly.
+A command is said to perform a jump if it changes the flow of control irreversibly.
A command is said to call a subroutine if it changes the flow of control but saves the current program counter in a stack to be restored later.
+A command is said to call a subroutine if it changes the flow of control but saves the current program counter in a stack to be restored later.
A command is said to terminate a script if it halts and reclaims storage of such a script.
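As an illustration only, the notions of program counter, jump, subroutine call and termination described above can be modelled in a few lines of Python. The class and method names (`Script`, `jump`, `call`, `ret`, `terminate`) are hypothetical and not part of this specification. The sketch also assumes that the program counter has already advanced past the calling command at the moment it is saved.

```python
from dataclasses import dataclass, field


@dataclass
class Script:
    """Hypothetical model of a script: program counter, locals, compare flag."""
    pc: int = 0
    compare_flag: bool = False
    local_vars: dict = field(default_factory=dict)
    call_stack: list = field(default_factory=list)
    terminated: bool = False

    def jump(self, target: int) -> None:
        # A jump changes the flow of control irreversibly: nothing is saved.
        self.pc = target

    def call(self, target: int) -> None:
        # A subroutine call saves the program counter so it can be restored.
        self.call_stack.append(self.pc)
        self.pc = target

    def ret(self) -> None:
        # Returning restores the most recently saved program counter.
        self.pc = self.call_stack.pop()

    def terminate(self) -> None:
        # Termination halts the script and reclaims its storage.
        self.terminated = True
        self.local_vars.clear()
        self.call_stack.clear()
```

The only difference between `jump` and `call` in this model is whether the previous program counter is kept, which mirrors the distinction drawn in the text.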
@@ -511,25 +479,22 @@Other script files are required to become part of the multi-file by the means of require statements within the main script file. The main script file itself is required from the translation environment.
-Many kinds of script files can be required.
+Other script files are required to become part of the multi-file by the means of require statements within the main script file or main extension files. The main script file itself is required from the translation environment.
A main extension file (or foreign gosub file) is a script file required by the means of a GOSUB_FILE statement. Other script files can be required from main extension files.
+A main extension file (or foreign gosub file) is a script file required by the means of a GOSUB_FILE statement.
A subscript file is a script file required by the means of the LAUNCH_MISSION statement. A subscript is a script started by the same statement.
+A subscript file is a script file required by the means of a LAUNCH_MISSION statement. A subscript is a script started by the same statement.
A mission script file is a script file required by the means of the LOAD_AND_LAUNCH_MISSION statement. A mission script is a script started by the same statement. Only a single mission script can be running at once.
+A mission script file is a script file required by the means of a LOAD_AND_LAUNCH_MISSION statement. A mission script is a script started by the same statement.
An implementation may contain special features regarding the way subscripts and mission scripts are executed.
The main script file is found in a unspecified manner. The other script files are found by recursively searching a directory with the same filename (excluding extension) as the main script file. This directory is in the same path as the main script file. The search for the script files shall be case-insensitive. All script files must have a .sc extension. If multiple script files with the same name are found, which script file is chosen is unspecified.
The main script file is found in an unspecified manner. The other script files are found by recursively searching a directory with the same filename (excluding extension) as the main script file. This directory is in the same path as the main script file. The search for the script files shall be case-insensitive. All script files must have a .sc extension. If multiple script files with the same name are found, which script file is chosen is unspecified.
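The lookup rule above could be sketched as follows. This is a non-normative illustration: the function name `find_script_file` is hypothetical, the `.sc` extension check is assumed to be case-insensitive like the rest of the search, and returning the first match is just one of the choices the text leaves unspecified.

```python
from pathlib import Path
from typing import Optional


def find_script_file(main_script: Path, name: str) -> Optional[Path]:
    """Look up `name` (e.g. "mission1.sc") for the given main script file."""
    # The search directory sits next to the main script file and shares its
    # filename without the extension (main.sc -> main/).
    search_dir = main_script.with_suffix("")
    if not search_dir.is_dir():
        return None
    wanted = name.lower()
    for candidate in search_dir.rglob("*"):
        # Only .sc files qualify; names are compared case-insensitively.
        if (candidate.is_file()
                and candidate.suffix.lower() == ".sc"
                and candidate.name.lower() == wanted):
            # If several files match, which one is chosen is unspecified;
            # this sketch simply returns the first one encountered.
            return candidate
    return None
```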
A script type is said to come before another script type under the following total order:
@@ -573,7 +538,7 @@An integer is a binary signed two’s-complement integral number. It represents 32 bits of data and the range of values -2147483648 through 2147483647.
+An integer is a binary signed two’s-complement integral number. It represents 32 bits of data and the range of values -2147483648 through 2147483647.
A floating-point is a representation of a real number. Its exact representation, precision and range of values is implementation-defined.
@@ -597,12 +562,12 @@The lexical grammar of the language is context-sensitive. As such, the lexical elements and the syntactic elements are presented together.
+The lexical grammar of the language is context-sensitive. As such, the lexical elements and the syntactic elements are presented together.[2]
Source code is a stream of printable ASCII characters plus the control codes line feed (\n), horizontal tab (\t) and carriage return (\r).
Source code is a stream of printable ASCII characters plus the control codes line feed (\n), horizontal tab (\t) and carriage return (\r).[1]
Carriage returns should appear only before a line feed.
+Carriage returns should appear only before a line feed.[3]
Lowercase letters in the stream shall be interpreted as its uppercase equivalent.
Space, horizontal tab, parentheses and comma are defined as whitespace characters.
+Space, horizontal tab, parentheses and comma are defined as whitespace characters.
Each line should be interpreted as if there is no whitespaces in either ends of the line.
+Each line should be interpreted as if there were no whitespace at either end of the line.[4]
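A rough Python sketch of the whitespace rules above, for illustration only. The function name `split_line` is hypothetical, the input is assumed to be a single line with its line terminator already removed, and string literals and comments are deliberately ignored even though the real lexical grammar is context-sensitive.

```python
WHITESPACE = " \t(),"  # space, horizontal tab, parentheses and comma


def split_line(line: str) -> list[str]:
    """Split one source line into uppercase tokens (no strings, no comments)."""
    # Lowercase letters are interpreted as their uppercase equivalents.
    line = line.upper()
    # Whitespace characters at either end of the line are ignored.
    line = line.strip(WHITESPACE)
    tokens, current = [], ""
    for ch in line:
        if ch in WHITESPACE:
            # A run of whitespace characters ends the current token, if any.
            if current:
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens
```

Under these assumptions a line such as `some_command(x, 1.5)` splits into the same tokens as `SOME_COMMAND X 1.5`, because parentheses and commas count as whitespace.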
The contents of a comment shall be interpreted as if it is whitespaces in the source code. More specifically:
+The contents of a comment shall be interpreted as if they were whitespace in the source code.[4] More specifically:
Comments cannot start inside string literals.
+Comments cannot start inside string literals.[5]
A integer literal is a sequence of digits optionally preceded by a minus sign.
+An integer literal is a sequence of digits optionally preceded by a minus sign.[1]
If the literal begins with a minus, the number following it shall be negated.
@@ -744,10 +709,10 @@A floating-point literal is a nonempty sequence of digits which must contain at least one occurrence of the characters . or F.
A floating-point literal is a sequence of digits which must contain at least one occurrence of the characters . or F.
Once the F characters is found, all characters including and following it shall be ignored. The same shall happen when the character . is found a second time.
Once the F character is found, all characters including and following it shall be ignored. The same shall happen when the character . is found a second time.[6]
The literal can be preceded by a minus sign, which shall negate the floating-point number.
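The truncation rules for floating-point literals can be sketched in Python as below. This is an informal illustration under stated assumptions: the name `parse_float_literal` is hypothetical, a literal with no digits left after truncation is rejected, and Python's `float` stands in for the implementation-defined floating-point representation.

```python
def parse_float_literal(text: str) -> float:
    """Convert a floating-point literal to a value, following the rules above."""
    text = text.upper()  # lowercase letters are read as uppercase
    negative = text.startswith("-")
    if negative:
        text = text[1:]
    if "." not in text and "F" not in text:
        raise ValueError("a floating-point literal needs a '.' or an 'F'")
    # Everything from the first F onwards is ignored.
    f_pos = text.find("F")
    if f_pos != -1:
        text = text[:f_pos]
    # Everything from the second '.' onwards is ignored.
    first_dot = text.find(".")
    if first_dot != -1:
        second_dot = text.find(".", first_dot + 1)
        if second_dot != -1:
            text = text[:second_dot]
    # What remains must be ASCII digits with at most one '.'.
    digits = text.replace(".", "")
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("no digits left in floating-point literal")
    value = float(text)
    # A leading minus sign negates the number.
    return -value if negative else value
```

Under this reading, `1.5F3` and `1.5` denote the same value, as do `1.2.3` and `1.2`.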
@@ -828,6 +793,9 @@Constraints
If text label variables are not supported by the implementation, the first character of an identifier shall not be a dollar.
+An identifier shall not end with a : character.
A variable name is a sequence of token characters, except the characters [ and ] cannot happen.
A variable name is an identifier, except that the characters [ and ] cannot occur. If text label variables are not supported, the first character of a variable name shall not be a dollar.
A variable reference is a variable name optionally followed by an array subscript.
+A variable reference is a variable name optionally followed by an array subscript.[1]
The subscript uses an integer literal or another variable name of integer type for zero-based indexing.
+The subscript uses an integer literal or another variable name of integer type for zero-based indexing.[7]
The program is ill-formed if the array subscript uses a negative or out of bounds value for indexing.
The program is ill-formed if a variable name is followed by a subscript but the variable is not an array.
+The program is ill-formed if a variable name is followed by a subscript but the variable is not an array.[1]
An array variable name which is not followed by a subscript behaves as if its zero-indexed element is referenced.
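The subscript rules above might be checked as in the following Python sketch. It is illustrative only: `resolve_reference`, `IllFormedProgram` and the `array_sizes` mapping are hypothetical names, only integer literal subscripts are handled (a subscript given by a variable is not known during translation), and reporting index 0 for non-array variables is a convention of the sketch.

```python
from typing import Optional


class IllFormedProgram(Exception):
    """Raised where the text above says the program is ill-formed."""


def resolve_reference(name: str, subscript: Optional[int],
                      array_sizes: dict[str, int]) -> tuple[str, int]:
    """Return (name, index) for a variable reference with a literal subscript.

    `array_sizes` maps array variable names to their number of elements;
    names that are absent are treated as non-array variables.
    """
    size = array_sizes.get(name)
    if subscript is None:
        # An array name without a subscript refers to its zero-indexed
        # element; this sketch also reports index 0 for plain variables.
        return (name, 0)
    if size is None:
        raise IllFormedProgram(f"{name} is not an array")
    if subscript < 0 or subscript >= size:
        raise IllFormedProgram(f"index {subscript} is out of bounds for {name}")
    return (name, subscript)
```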
@@ -891,7 +859,7 @@A parameter definition is a set of definitions regarding a single parameter for a specific command.
A command must have the same amount of arguments as its amount of parameter definitions, unless the missing arguments correspond to optional parameters (defined below).
+A command must have the same amount of arguments as its amount of parameter definitions, unless the missing arguments correspond to optional parameters.
If a variable is used in the same command both as an input and as an output, the input shall be evaluated before any output is assigned to the variable.
@@ -926,10 +894,7 @@An entity can be assigned to a variable. In such case, the variable is said to be of that specific entity type from that line of code on. Previous lines of code are not affected.
If an entity type is associated with an parameter and a variable is used as argument, the variable must have the same entity type as the parameter.
-Further semantics for entities are defined along this document.
+If an entity type is associated with a parameter and a variable is used as argument, the variable must have the same entity type as the formal parameter.
A TEXT_LABEL parameter accepts an argument only if it is an identifier. If the identifier begins with a dollar character ($), its suffix must reference a variable of text label type and such a variable is the actual argument. Otherwise, the identifier is a text label.
A TEXT_LABEL parameter accepts an argument only if it is an identifier. If the identifier begins with a dollar character, its suffix must reference a variable of text label type and such a variable is the actual argument. Otherwise, the identifier is a text label.
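A small Python sketch of how an argument to a TEXT_LABEL parameter could be classified. The function name and the `text_label_vars` set are hypothetical, and array subscripts after the variable name are not handled here.

```python
def classify_text_label_argument(identifier: str,
                                 text_label_vars: set[str]) -> tuple[str, str]:
    """Return ("variable", name) or ("text_label", label) for an identifier."""
    identifier = identifier.upper()  # lowercase letters are read as uppercase
    if identifier.startswith("$"):
        # The part after the dollar must name a variable of text label type,
        # and that variable is the actual argument.
        name = identifier[1:]
        if name not in text_label_vars:
            raise ValueError(f"{name} is not a text label variable")
        return ("variable", name)
    # Otherwise the identifier itself is the text label.
    return ("text_label", identifier)
```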
INPUT_OPT
@@ -1061,7 +1026,7 @@The INPUT_OPT parameter accepts an argument only if it is an integer literal, floating-point literal, or identifier referencing a variable of integer, floating-point or text label type.
+The INPUT_OPT parameter accepts an argument only if it is an integer literal, floating-point literal, or identifier referencing a variable of integer or floating-point type. A variable of text label type may be accepted by an INPUT_OPT parameter.
A command selector consists of a name and a finite sequence of commands which are alternatives for replacement.
A command name which is the name of a selector shall behave as if its command name is rewritten as a matching alternative before any parameter checking takes place.
+A command name which is the name of a selector shall behave as if its command name is rewritten as a matching alternative before any parameter checking takes place.
A matching alternative is the first command in the alternative sequence to have the same amount of parameters as arguments in the actual command, and to obey the following rules for every argument and its corresponding parameter:
@@ -1129,7 +1094,7 @@The name of commands used to require script files (e.g. GOSUB_FILE) and its directive commands (i.e. MISSION_START and MISSION_END) cannot be on the left hand side of a expression.
The name of commands used to require script files (e.g. GOSUB_FILE) and its directive commands (i.e. MISSION_START and MISSION_END) cannot be on the left hand side of an expression.
SUB_THING_FROM_THING a c if the name a is the same as the name b.
Implementation-defined if a is the same name as c.
Implementation-defined if a is the same name as c.[8]
SET a b followed by SUB_THING_BY_THING a c otherwise.
This declares a label named after the given identifier.
The label can be referenced in certain commands to transfer (or start) control of execution to the statement it prefixes. Labels themselves do not alter the flow of control, which continues to the statement it embodies.
+The label can be referenced in certain commands to transfer (or start) control to the statement it prefixes. Labels themselves do not alter the flow of control, which continues to the statement it embodies.
The command it embodies cannot be any of the commands specified by this section (e.g. VAR_INT, ELSE, ENDWHILE, {, GOSUB_FILE, MISSION_START, etc).
The command it embodies cannot be any of the commands specified by this section (e.g. VAR_INT, ELSE, ENDWHILE, {, GOSUB_FILE, MISSION_START, etc).[1]
The commands with the VAR_ prefix declare global variables. The ones with LVAR_ declare local variables. The INT suffix declares variables capable of storing integers. The FLOAT suffix declares floating-point ones. Finally, the TEXT_LABEL one declares variables capable of storing text labels.
An implementation may not support variables of text label type. In such case, a program declaring a text label variable is ill-formed.
Constraints
@@ -1512,7 +1477,7 @@The initial value of variables is unspecified.
+The initial value of variables is unspecified.[9]
Constraints
The command it embodies cannot be any of the commands specified by this section (e.g. VAR_INT, ELSE, ENDWHILE, {, GOSUB_FILE, MISSION_START, etc).
The command it embodies cannot be any of the commands specified by this section (e.g. VAR_INT, ELSE, ENDWHILE, {, GOSUB_FILE, MISSION_START, etc).[1]
Semantics
@@ -1556,7 +1521,7 @@A conditional list shall not be short-circuit evaluated. All conditional elements are executed in order.
The behaviour is undefined if the command used in a conditional element does not cause side-effects in the compare flag.
+The behaviour is undefined if the command used in a conditional element does not cause side effects in the compare flag.
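The absence of short-circuit evaluation can be illustrated with the Python sketch below. The function name is hypothetical, and combining the individual compare flags with a logical AND or OR is an assumption of the sketch; the excerpt above only states that every element is executed in order.

```python
from typing import Callable, Iterable


def evaluate_conditional_list(conditions: Iterable[Callable[[], bool]],
                              combine_with_and: bool = True) -> bool:
    """Run every conditional element in order, then combine the results."""
    # Building the full list first guarantees that each command runs, even
    # when an earlier result already decides the outcome.
    flags = [condition() for condition in conditions]
    return all(flags) if combine_with_and else any(flags)
```

Passing a generator straight to `all` or `any` would stop at the first deciding element, which is exactly the behaviour the text rules out.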
The WHILE statement executes a set of statements while the compare flag of the conditional list holds true.
The statement executes by grabbing the compare flag of the list of conditions and transferring control to after the WHILE block if it holds false. Otherwise, it executes the given set of statements. Execution of the ENDWHILE command causes control to be transferred to beginning of the block, where the conditions are evaluated again.
The statement executes by grabbing the compare flag of the list of conditions and transferring control to after the WHILE block if it holds false. Otherwise, it executes the given set of statements. Execution of the ENDWHILE command causes control to be transferred to the beginning of the block, where the conditions are evaluated again.
repeat_statement := 'REPEAT' sep integer sep identifier eol
+repeat_statement := 'REPEAT' sep integer sep variable eol
{statement}
[label_prefix] 'ENDREPEAT' eol ;
Constraints
The first argument to REPEAT must be an integer literal.
-The second argument must be a variable of integer type.
+The associated variable must be of integer type.[10]
Semantics
@@ -1694,7 +1656,7 @@The REPEAT statement executes a set of statements until a counter variable reaches a threshold.
The REPEAT command causes the variable to be set to zero. Execution of the ENDREPEAT command causes the variable to be incremented and if it compares less than the threshold, it transfers control back to the set of statements. Otherwise, it leaves the block.
The REPEAT command causes the associated variable to be set to zero. Execution of the ENDREPEAT command causes the variable to be incremented and if it compares less than the threshold, control is transferred back to the set of statements. Otherwise, it leaves the block.
The statements are always executed at least once.
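The control flow of REPEAT can be sketched in Python as follows. The function name `run_repeat` and the `body` callback are hypothetical, and the sketch assumes the statements do not themselves modify the counter variable.

```python
from typing import Callable


def run_repeat(threshold: int, body: Callable[[int], None]) -> int:
    """Model REPEAT threshold var ... ENDREPEAT; returns the final counter."""
    counter = 0  # REPEAT sets the associated variable to zero
    while True:
        body(counter)  # the statements run before any comparison is made
        counter += 1   # ENDREPEAT increments the variable...
        if not counter < threshold:
            break      # ...and leaves the block once it is not below the threshold
    return counter
```

Because the comparison happens only at ENDREPEAT, a threshold of zero still runs the statements once, which matches the statement that they are always executed at least once.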
@@ -1716,13 +1678,13 @@Require statements request script files to become part of the multi-file being translated.
A file can be required more than once. If it is required using the same statement as the first request, the latter request is ignored. Otherwise, behaviour is undefined.
+A file can be required more than once. In such case, if it is required using the same statement as the first request, the latter request is ignored. Otherwise, behaviour is undefined.
Constraints
Require statements shall only appear in the main script file or main extension files.
+Require statements shall only appear in the main script file or main extension files.
The GOSUB_FILE command requires a main extension file to become part of the multi-file.
The GOSUB_FILE command requires a main extension file to become part of the multi-file.
It also calls the subroutine specified by label.
@@ -1755,7 +1717,7 @@The LAUNCH_MISSION command requires a subscript file to become part of the multi-file.
The LAUNCH_MISSION command requires a subscript file to become part of the multi-file.
It also starts a new subscript with the program counter at the MISSION_START directive of the specified script file.
Only a single mission script can be running at once.
+Only a single mission script can be running at once.
Semantics
The LOAD_AND_LAUNCH_MISSION command requires a mission script file to become part of the multi-file.
The LOAD_AND_LAUNCH_MISSION command requires a mission script file to become part of the multi-file.
It also starts a new mission script with the program counter at the MISSION_START directive of the specified script file.
It also starts a new mission script with the program counter at the MISSION_START directive of the specified script file.
A subscript file is a sequence of zero or more statements in a MISSION_START and MISSION_END block. More statements can follow.
A subscript file is a sequence of zero or more statements within a MISSION_START…MISSION_END block. More statements can follow.
Constraints
The MISSION_START command shall be the very first line of the subscript file and shall not be preceded by anything but ASCII spaces ( ) and horizontal tabs (\t). Even comments are disallowed.
The MISSION_START command shall be the very first line of the subscript file and shall not be preceded by anything but ASCII spaces ( ) and horizontal tabs (\t). Comments are not allowed in place of this whitespace.
Commands in subscript files shall not refer to labels in mission script files.
@@ -1867,7 +1829,7 @@A mission script file has the same structure of a subscript file.
+A mission script file has the same structure as a subscript file.
Side-effects
+Side effects
Yields control to another script. The current script is not resumed for at least the specified number of milliseconds.
This command is useful due to the cooperative multitasking nature of the execution environment.
-Side-effects
+Side effects
Side-effects
+Side effects
Side-effects
+Side effects
Returns from the last called subroutine.
@@ -1969,7 +1928,7 @@Side-effects
+Side effects
Returns true (as in any command updating the compare flag to true).
@@ -1986,7 +1945,7 @@Side-effects
+Side effects
Returns false (as in any command updating the compare flag to false).
@@ -2003,7 +1962,7 @@Side-effects
+Side effects
Associates a name to the executing script.
@@ -2018,7 +1977,7 @@It is unspecified whether a name given by a text label variable is accepted.
+It is unspecified whether a name given by a text label variable is accepted.[11]
Side-effects
+Side effects
Terminates the executing script.
@@ -2049,7 +2008,7 @@Side-effects
+Side effects
Creates a script and sets its program counter to the specified label location.
@@ -2064,7 +2023,7 @@The specified label location must be within a scope.
+The specified label location must be within a scope. Such scope may begin at the next non-empty embedded statement relative to the label location.[12]
The type of a local variable and its respective input argument must match. For instance, if an input argument is an integer literal or variable of integer type, its corresponding local variable in the target scope must be of integer type.
@@ -2073,7 +2032,7 @@If an input argument is a variable, the value assignment to the variable in the target scope shall obey the same constraints as specified in the SET alternator.
+If an input argument is a variable, the value assignment to the variable in the target scope shall obey the same constraints as specified for SET (12.1).
Not only the language but also the grammar presented in this document is ambiguous. This appendix lists every known instance of ambiguity, states which derivation is correct, and suggests ways for users to avoid being trapped by it.
IF COMMAND goto other @@ -2714,9 +2672,7 @@C.1.
-We suggest an implementation to emit an warnings to declarations of names and the use of text labels equal to
goto.
TODO list of special command names (user cannot write these, include AND/OR/NOT)
+The leaked script compiler is full of bugs. It was written for in-house use, so it’s meant to work and recognize at least the intended language. The problem is, the language is too inconsistent in this buggy superset. After constantly trying to make those bugs part of this specification, I strongly believe we shouldn’t. For the conservative, the following is a list of known things miss2 accepts that this specification does not.
@@ -2766,6 +2729,9 @@You may use custom characters (c > 127), but you may clash with the characters DMA used to tokenize string literals.
BLA BLA The in-house compiler recognizes all of the octets in the range 01 through FF almost indiscriminately. Some of these codes produce unexpected results, while others are used internally during translation (for e.g. escaping string literals). In practice, only the printable characters were used by designers. For simplicity we restrict source code to these characters. BLA BLA
+A string literal is the same as four tokens
TODO list of special command names (user cannot write these, include AND/OR/NOT) -TODO label semantics of start new script (GTA3 allows label: {}) -TODO SAN ANDREAS ALLOWS IDENTIFIERS TO BEGIN WITH UNDERSCORES -TODO remember GTASA INPUT_OPT does not accept text label vars at all (not at runtime level) -TODO timera timerb (remember, only within scope; cannot declare var with same name; global shall not be named timera/timerb) -TODO interesting NOP is not compiled -TODO creating packages and such are declarations too (not only var decls) -TODO translation limits -LIMITS -TODO gxtsema gxt key length <8 -TODO gxtsema filename (excluding extension must be) <16 -TODO label name <=38 -TODO varname <40 -TODO scriptname <=7 -TODO scriptnames <= 300 -TODO <=9216 gvar storage words -TODO <=16 lvars storage words -TODO <=255 array size -TODO <=35000 label refs -TODO <=255 line -TODO <=127 string
-"This // is a string" is a string literal, not an incomplete string.
+NEGATE for subtraction and ONEOVER for division) to perform this operation.
+REPEAT in the in-house compiler disallows local variables. Due to being a complex command, however, the compiler ignores this parametric restriction and ends up permitting local variables (which is likely the intended behaviour).
+SCRIPT_NAME, but this does not tell much because it barely supports any text label variable feature. No compiled multi-file contains script names assigned by a variable. Thus, we refrain from defining any semantics for this case.
+TODO creating packages and such are declarations too (not only var decls) +TODO interesting NOP is not compiled +TODO timera timerb (remember, only within scope; cannot declare var with same name; global shall not be named timera/timerb) +TODO SAN ANDREAS ALLOWS IDENTIFIERS TO BEGIN WITH UNDERSCORES +TODO translation limits +LIMITS +TODO gxtsema gxt key length <8 +TODO gxtsema filename (excluding extension must be) <16 +TODO label name <=38 +TODO varname <40 +TODO scriptname <=7 +TODO scriptnames <= 300 +TODO <=9216 gvar storage words +TODO <=16 lvars storage words +TODO <=255 array size +TODO <=35000 label refs +TODO <=255 line +TODO <=127 string
+diff --git a/dma.html b/dma.html new file mode 100644 index 0000000..98d29b9 --- /dev/null +++ b/dma.html @@ -0,0 +1,82 @@ + + +