Rangi42 (Contributor) commented Oct 8, 2020

(This PR replaces #592; I'm keeping unrelated fixes in their own branches.)

I expect there will be discussion over whether this is an appropriate fix, and whether nested MACROs should be explicitly disallowed; but at least they're technically possible.

rgbasm now handles this asm the same way as 0.4.1 and outputs the ROM 11 ff 99 99 ee:

SECTION "test", ROM0[0]
foo: \ ; fooo
 MACRO
 ; blah1: MACRO
 db $11
 ; ENDM
\1: \
 macro
  ; blah2: MACRO
  dw $9999
  ; ENDM
 endm
spam_\1_eggs \
 \ ; comment
 : macro
  ; blah3: macro
  db $ee
  ; endm
 endm
 db $ff
ENDM
 foo bar       ; outputs $11 and $ff; defines bar and spam_bar_eggs
 bar           ; outputs $99 $99
 spam_bar_eggs ; outputs $ee

I went ahead and altered the line-continuation-macro test so that it passes. Rationale in the previous PR: #592 (comment)

This alters the result of the line-continuation-macro test; I've argued that the new result is a logical one.

ISSOtm (Member) commented Oct 9, 2020

No problem with the modified test case; given the altered requirements for an ENDM to count, it makes sense. I'm actually surprised this is correctly handled; I didn't think about it :P


/* Function to read identifiers & keywords for macros within macros */

static bool startsMacroWithinMacro(int c)
Member

There must be a similar check elsewhere, and it should be merged with this one. Preferably, the factored-out function should be placed close to the function that actually performs identifier lexing, so that the character sets are kept in sync during future modifications.

Contributor Author

startsMacroWithinMacro was based on startsIdentifier. They both allow A-Za-z_, but startsIdentifier also allows . for local labels, whereas startsMacroWithinMacro also allows \\ for macro arguments or line continuations.

Member

I'd suggest having a startsIdentifier and startsScopedIdentifier, perhaps? The latter would accept ., but not the former.

return (c <= 'Z' && c >= 'A') || (c <= 'z' && c >= 'a') || c == '_' || c == '\\';
}
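
One way the character sets discussed in this review thread could be kept in sync is to build every predicate on a single base check. The following is only a sketch under assumptions: it follows the naming proposed in the review (`startsIdentifier` without `.`, `startsScopedIdentifier` with `.`), which differs from the existing `startsIdentifier` described above, and `checkCharacterSets` is a hypothetical helper added purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Base set shared by all predicates: A-Z, a-z and '_' */
static bool startsIdentifier(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
}

/* Base set plus '.', for local labels (name proposed in the review) */
static bool startsScopedIdentifier(int c)
{
    return startsIdentifier(c) || c == '.';
}

/* Base set plus '\\', for macro arguments or line continuations */
static bool startsMacroWithinMacro(int c)
{
    return startsIdentifier(c) || c == '\\';
}

/* Hypothetical self-check showing where the three sets differ */
static void checkCharacterSets(void)
{
    assert(startsIdentifier('f') && startsIdentifier('_'));
    assert(!startsIdentifier('.') && startsScopedIdentifier('.'));
    assert(!startsIdentifier('\\') && startsMacroWithinMacro('\\'));
    assert(!startsMacroWithinMacro('.') && !startsScopedIdentifier('\\'));
}
```

With this layout, a change to the base set is picked up by all three predicates automatically.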

static int readMacroWithinMacro(char firstChar)
Member

I don't like this name, because it seems to imply that the function reads the entire macro, when it only checks for a macro boundary. Maybe readMacroBoundary? Though I'm not satisfied with that one...

ISSOtm (Member) commented Oct 9, 2020

After reviewing the code, the part that I really don't like is having to manually handle macro arguments. The entire point of the new lexer is to flatten them in a single place, and this would begin introducing the old lexer's pattern of handling macro args explicitly in each place—which led to a bunch of idiosyncrasies.

As @AntonioND pointed out, the other problem with nested macros is that there is no clear distinction between the two "macro" passes, i.e. whether an argument should be expanded when evaluating the outer macro, or the inner one. The old lexer behavior was to expand nothing inside the inner when processing the outer, which is fine.

The key insight here is that there isn't much incentive to define a macro within a macro if you can't "pre-expand" anything into it. @Rangi42 instead suggested sticking the macro definition in an EQUS, which sounds like an acceptable workaround (syntax potentially made more bearable by #589).

Another problem with defining a macro in a macro is the lifetime of the underlying buffer: to help speed up macro definition, macro bodies are stored as views into buffers as much as possible, instead of creating additional copies. Allowing macros to be defined from within macros would require adding more lifetime logic (ref counting, sure, but that implies a pointer both to the buffer start and the ref-counted buffer, which starts sounding like madness).
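
To make the lifetime concern concrete, here is a rough sketch of what a ref-counted view might look like. Every name in it (`SharedBuffer`, `MacroView`, `buffer_New`, `buffer_Release`, `view_New`, `view_Release`) is hypothetical and not taken from the rgbasm sources. It only illustrates why each view would need two pointers: one to the start of its own text, and one to the ref-counted buffer that owns that text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ref-counted text buffer */
struct SharedBuffer {
    size_t refCount;
    size_t size;
    char data[]; /* flexible array member holding the text */
};

/* Hypothetical view of a macro body stored inside a SharedBuffer */
struct MacroView {
    struct SharedBuffer *owner; /* needed to drop the reference later */
    const char *start;          /* where this body begins inside owner->data */
    size_t len;
};

/* Copies `text` into a new buffer holding one reference; NULL on failure */
static struct SharedBuffer *buffer_New(const char *text)
{
    size_t size = strlen(text);
    struct SharedBuffer *buf = malloc(sizeof(*buf) + size);

    if (buf == NULL)
        return NULL;
    buf->refCount = 1;
    buf->size = size;
    memcpy(buf->data, text, size);
    return buf;
}

/* Drops one reference; frees the buffer when the last one goes away */
static void buffer_Release(struct SharedBuffer *buf)
{
    assert(buf->refCount > 0);
    if (--buf->refCount == 0)
        free(buf);
}

/* Creates a view into `buf` and takes a reference on it */
static struct MacroView view_New(struct SharedBuffer *buf, size_t offset, size_t len)
{
    assert(offset <= buf->size && len <= buf->size - offset);
    buf->refCount++;
    return (struct MacroView){ .owner = buf, .start = buf->data + offset, .len = len };
}

/* Releases the view's reference and clears the view */
static void view_Release(struct MacroView *view)
{
    buffer_Release(view->owner);
    view->owner = NULL;
    view->start = NULL;
    view->len = 0;
}
```

In this sketch, an inner macro defined while the outer macro's buffer is being read would hold a `MacroView`, keeping that buffer alive after the outer expansion has released its own reference.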

Worse still, the computations required to check for a macro definition start require a lexer-in-lexer (to expand any macro args), and a parser-in-lexer (to ensure consistent behavior with the parser), which is more madness that the new lexer is supposed to fix away, and that I don't want to have to maintain either.

I guess the best way to prohibit nested macros would be for the capture function to stop checking for macro definitions (or any kind of depth, for that matter), and have the parser check if inside a macro when reaching a macro definition line.
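
The parser-side check could look roughly like the sketch below. It assumes the assembler keeps a stack of active contexts that the parser can walk. The `Context` struct, the `ContextKind` values and the function names are hypothetical, and rejecting a definition under any macro ancestor (including through a REPT) is an assumption of this sketch rather than something settled in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical context kinds; the real context stack may differ */
enum ContextKind { CTX_FILE, CTX_MACRO, CTX_REPT };

struct Context {
    enum ContextKind kind;
    const struct Context *parent; /* NULL for the outermost file */
};

/* True if any enclosing context is a macro expansion */
static bool insideMacro(const struct Context *ctx)
{
    for (; ctx != NULL; ctx = ctx->parent) {
        if (ctx->kind == CTX_MACRO)
            return true;
    }
    return false;
}

/*
 * Called by the (hypothetical) parser when it reaches a macro definition line.
 * Reports an error and returns false if the definition is nested in a macro.
 */
static bool canDefineMacro(const struct Context *ctx, const char *name)
{
    assert(name != NULL);
    if (insideMacro(ctx)) {
        fprintf(stderr, "error: cannot define macro \"%s\" inside a macro\n", name);
        return false;
    }
    return true;
}
```

Because the check runs when the parser reaches the definition line, the capture function itself does not need to know anything about nesting.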

Rangi42 (Contributor, Author) commented Oct 9, 2020

If #590 can be fixed to allow using MACRO/ENDM (and REPT/ENDR) inside an EQUS, plus some multi-line string syntax to make it prettier, then yes, that would make MACRO-in-MACRO strictly less powerful than MACRO-in-EQUS-in-MACRO, since the former can't distinguish inner versus outer macro args like \\1 vs \1.

If lexer_CaptureMacroBody were simplified to not track a level at all, it would stop at the first ENDM, so if anyone nested one macro in another they'd get a syntax error at the second ENDM. I think that would be enough indication that nested macros aren't supported. (The alternative would be to scan for a MACRO token while capturing the macro body, but you'd still need this lexer-in-lexer behavior since MACRO isn't the first token in its line.)
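
A minimal sketch of such a level-free capture, assuming an ENDM only counts when it is the first token on its line (the thread does not spell out the exact rule). `captureMacroBody` and `lineStartsWithEndm` are hypothetical names that work on a plain NUL-terminated string; this is not the actual lexer_CaptureMacroBody.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does the line at `p` start with the ENDM keyword (any case)? */
static bool lineStartsWithEndm(const char *p)
{
    static const char keyword[] = "ENDM";

    while (*p == ' ' || *p == '\t')
        p++;
    for (size_t i = 0; i < sizeof(keyword) - 1; i++) {
        if (toupper((unsigned char)p[i]) != keyword[i])
            return false;
    }
    /* Reject longer identifiers such as "ENDMARKER" */
    unsigned char next = (unsigned char)p[sizeof(keyword) - 1];
    return !(isalnum(next) || next == '_');
}

/*
 * Returns the length of the macro body at the start of `src`, i.e. everything
 * before the line holding the first ENDM, or -1 if there is no ENDM.
 * No nesting level is tracked.
 */
static long captureMacroBody(const char *src)
{
    assert(src != NULL);
    const char *line = src;

    while (*line != '\0') {
        if (lineStartsWithEndm(line))
            return (long)(line - src);
        const char *eol = strchr(line, '\n');
        if (eol == NULL)
            break;
        line = eol + 1;
    }
    return -1;
}
```

Given an outer body that contains an inner `MACRO` ... `ENDM` pair, the capture ends at the inner ENDM. The outer macro's own ENDM is then left in the input for the parser, which is where the syntax error mentioned above would come from.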

Basically if we can take care of #590 I can modify this to be "Remove the partial level support from lexer_CaptureMacroBody". :P

ISSOtm (Member) commented Oct 9, 2020

> If lexer_CaptureMacroBody were simplified to not track a level at all, it would stop at the first ENDM, so if anyone nested one macro in another they'd get syntax error at the second ENDM. I think that would be enough indication that nested macros aren't supported. (The alternative would be to scan for a MACRO token while capturing the macro body, but you'd still need this lexer-in-lexer behavior since MACRO isn't the first token in its line.)

Beginning a macro definition would have to be checked for anyways, to avoid calling the macro capture function incorrectly. One could argue that nested macros should be checked during macro definition instead of invocation, but 1. macro args make that impossible to predict, and 2. it would lead back to the parser-in-lexer problem we're trying to avoid anyways.

Rangi42 (Contributor, Author) commented Oct 9, 2020

#388 is relevant to this issue.

ISSOtm (Member) commented Oct 10, 2020

Yes, and really, considering the interaction it could have with macro args, I have since changed my mind and don't want it anymore. What I was intending it for was a big hack, anyways...

ISSOtm closed this in 462fd75 on Dec 9, 2020
Rangi42 deleted the issue588 branch on December 9, 2020 at 13:51