Closed
Description
These options affect how RGBASM works at the lexer level, which can lead to surprising syntax errors. They're also too permissive.
- Standard digits can be changed. `b10` swaps 0 and 1, so `%101010` is 21 instead of 42. Assigning a standard digit to a nonstandard placement should be an error.
- Digits can be ambiguous. `bXX` is considered valid (and just treats `X` as the digit 0, not 1). Repeating any digit should be an error.
- Custom digits override the standard digits. `b.X` makes `%X.X.X.` lex as the value 42, but makes `%101010` lex as a separate `%` operator and `101010` number. Custom digits should be alternatives, while still lexing the standard ones (since by point 1, the standard ones cannot have their values changed).
- Some characters can be specified via the CLI that cannot via `opt`, and they significantly change how the source code is lexed. `opt b;X` lexes as `opt b` and then a comment, but `rgbasm "-b;X"` successfully makes `;` be the 0 digit, so `%X;X;X;` lexes as the number 42. There should be an allowed subset of digit characters, and special ones like `;` and `\` should not be in it.
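To illustrate the third point, here is a minimal Python sketch (not RGBASM's actual lexer) of treating custom digits as alternatives to the standard ones. `parse_binary` is a hypothetical helper, and it assumes the digit pair has already been validated, so that `0` and `1` are never remapped.

```python
def parse_binary(literal: str, custom_digits: str = "01") -> int:
    """Parse a '%'-prefixed binary literal, accepting custom digits as aliases.

    Assumption: custom_digits is a validated two-character string
    (zero digit first, one digit second).
    """
    zero, one = custom_digits
    # Standard digits always keep their meaning; custom ones are added on top.
    values = {"0": 0, "1": 1, zero: 0, one: 1}
    if not literal.startswith("%") or len(literal) < 2:
        raise ValueError(f"not a binary literal: {literal!r}")
    result = 0
    for ch in literal[1:]:
        if ch not in values:
            raise ValueError(f"invalid binary digit {ch!r}")
        result = result * 2 + values[ch]
    return result
```

Under `b.X`, this sketch reads both `%X.X.X.` and `%101010` as 42, which is the behavior the third point asks for.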
Activity
aaaaaa123456789 commented on May 12, 2025
I'm not sure about changing standard digits; maybe there is utility in that — but I do believe that, if the standard digits aren't changed (i.e., if none of them is specified in the mask), they should be allowed as alternatives. Duplicates should obviously be an error.
As for a character whitelist, it's probably better to start by listing the things that shouldn't be allowed, just so nobody forgets:
- Operators, e.g. `+` and `-` as digits; the truth of the matter is that `%++2` becomes ambiguous if you do that.
- `::` would become ambiguous if it could be part of a number, `;` would become confusing (particularly considering that some tools do parse code, which might interpret it as a comment), and `\` is essentially always a problem.
Rangi42 commented on May 12, 2025
Agreed. Note, "operators" means `+` `-` `*` `/` `%` `&` `|` `^` `<` `>` `=` `!` `~`.
Omitting those, that leaves these ASCII characters:
- `A-Z` `a-z`. Those are fine.
- `%` `&` `$` `` ` ``. Disallow those.
- `_`. Probably disallow that, since it's the goes-anywhere "digit separator" in numeric literals. (On the other hand, it might be popular as a choice for 0.)
- `.`. Should be fine, since it's valid in local labels and fixed-point literals, and is a popular choice for 0.
- `@` `#`. Should be fine.
- `,`. Disallow this, since it separates lists of numeric literals.
- `?` and single quote `'`. Probably disallow these, since `'` will plausibly gain use as a quote character (e.g. for character literals), and `?` could become a ternary operator.
That leaves the whitelist as: `A-Z` `a-z` `.` `@` `#`, maybe `_`, maybe `0-9`, and maaaybe `?` or `'`.
aaaaaa123456789 commented on May 12, 2025
You should definitely at least also allow the digits themselves, so you can specify an alternative for just some of them. (Think `opt b.1`.) I'd honestly allow them all, but at the very least they should be able to represent themselves.
EDIT: note that this allows `opt b01` as a natural way of disabling all aliases.
Rangi42 commented on May 12, 2025
Yeah, that's why I said "Assigning a standard digit to a nonstandard placement should be an error."
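The rules discussed so far (an allowed character set, no repeated digits, and standard digits only in their own position) could be checked roughly as follows. This is a minimal Python sketch, not RGBASM's implementation: `validate_binary_digits` is a hypothetical name, and the whitelist is an assumption that takes the conservative reading of the thread (letters, `.`, `@`, `#`, plus `0` and `1` standing for themselves).

```python
import string

# Assumed whitelist: the "fine" characters from the thread, plus 0 and 1.
ALLOWED = set(string.ascii_letters) | set(".@#01")

def validate_binary_digits(spec: str) -> list[str]:
    """Return a list of error messages for a -b / `opt b` argument."""
    if len(spec) != 2:
        return [f"expected exactly 2 digit characters, got {len(spec)}"]
    errors = []
    for ch in spec:
        if ch not in ALLOWED:
            errors.append(f"character {ch!r} is not allowed as a digit")
    if spec[0] == spec[1]:
        errors.append(f"digit {spec[0]!r} is repeated")
    for position, ch in enumerate(spec):
        # A standard digit may only represent itself.
        if ch in "01" and ch != "01"[position]:
            errors.append(f"standard digit {ch!r} cannot stand for {position}")
    return errors
```

With these assumptions, `.X`, `.1`, and `01` pass, while `10`, `XX`, `;X`, and `$^` are each rejected for a different reason.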
Rangi42 commented on May 12, 2025
Our current test cases:
Most of those will become errors if they weren't already, and that's okay.
Note that invalid `opt` directives -- like `opt b123` -- are currently non-fatal errors, so invalid chars like `opt b$^` should be too.
aaaaaa123456789 commented on May 12, 2025
Errors should always be the lowest category that they can be. That means that things that could be warnings should be warnings; things that can be non-fatal errors should be non-fatal. I couldn't tell you if it can be non-fatal, as I'm not implementing the feature; but if it can be, it should be.
EDIT: CLI options should fail right away, though. There's no reason to parse a file from a mistyped command line.
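The split described here (fail immediately for a bad command-line option, keep going after a bad `opt` directive) can be sketched as below. This is an illustrative Python sketch only; `FatalError`, `Diagnostics`, and `apply_digit_option` are hypothetical names, and the validity check is reduced to "two distinct characters" for brevity.

```python
class FatalError(Exception):
    """Hypothetical: raised when processing must stop immediately."""

class Diagnostics:
    """Hypothetical collector for non-fatal errors."""
    def __init__(self) -> None:
        self.error_count = 0

    def error(self, message: str) -> None:
        # Non-fatal: record the error and let assembly continue.
        self.error_count += 1
        print(f"error: {message}")

def apply_digit_option(spec: str, *, from_cli: bool, diagnostics: Diagnostics) -> bool:
    """Return True if the option was accepted, False if it was rejected."""
    if len(spec) == 2 and spec[0] != spec[1]:
        return True
    message = f"invalid binary digits {spec!r}"
    if from_cli:
        # A mistyped command line stops before any file is parsed.
        raise FatalError(message)
    # An invalid `opt` directive is reported, then ignored.
    diagnostics.error(message)
    return False
```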
Rangi42 commented on May 12, 2025
Yeah, things like `rgbasm -b123` are instant-failure errors, so `-b;%` would be too.
Rangi42 commented on May 12, 2025
But I need to do `db %😻ඞ😻ඞ😻ඞ`! (It's not ambiguous with any language syntax! :3 )
aaaaaa123456789 commented on May 12, 2025