Specify the sym and map file formats #483
Symfiles are generally easy to describe in terms of a couple of regexes:
It might be useful to allow banks to be optional, but RGBDS doesn't have any way of generating unbanked symbols anyway. |
What about allowing comments on non-empty lines? Should shorter bank and addresses be allowed? Also, a valid RGBASM identifier is matched by this regex, according to |
A lot harder to parse by simple tools for little gain. This is not really a problem if you're writing a full-blown parser, but if you want to use a small shell script or something, it's better to not have them.
For addresses, I'd say definitely no, since no advances in technology (e.g., new mappers) will ever change the size of an address, and every tool on the planet displays them as four hex digits anyway. For banks, it might be worth it to allow single-digit banks; I don't really mind either way. (RGBLINK currently outputs two or three digits, of course.)
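As a rough illustration of the strict reading in this comment, here is a minimal Python sketch. It assumes each symbol line has the shape `BB:AAAA Name`, with a hexadecimal bank of two or more digits and an address of exactly four hex digits; the name pattern (`\S+`) and the function name `parse_symbol_line` are placeholders for this example, not anything the thread agreed on.

```python
import re

# Assumed line shape: "BB:AAAA Name", bank and address in hexadecimal.
# Bank: two or more hex digits. Address: exactly four hex digits.
# The name is matched loosely as any run of non-whitespace (an assumption).
SYMBOL_LINE = re.compile(r"^([0-9A-Fa-f]{2,}):([0-9A-Fa-f]{4}) (\S+)$")


def parse_symbol_line(line):
    """Return (bank, address, name), or None if the line does not match."""
    match = SYMBOL_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    bank, address, name = match.groups()
    return int(bank, 16), int(address, 16), name
```

Under these assumptions a single-digit bank or a three-digit address is rejected outright, which is the trade-off being debated here.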
Probably out of scope for a format that is meant for interoperability; other tools might have their own rules for identifiers. This is an area where it pays off to be as lax as reasonably possible. |
The regex could be added to the spec as a recommendation ("MAY reject or output a diagnostic if the identifier does not match the following regex:"). Agree with the rest. |
The problem would be agreeing on what regex to use. |
I don't think we're trying to specify a cross-assembler format, just to stabilize what RGBDS is outputting. |
Honestly, comments on non-empty lines aren't that complex for simple tools. In general, you're going to read each line up until the That, or we could remove comments as a whole. There's literally one comment, and it could be considered useless. |
Comments are generally useful when editing the files by hand. But I've never found any need for inline comments. |
I think that comments should be allowed everywhere... Because it's more predictable. If empty lines allow comments, why do other lines prohibit them? It's completely arbitrary, and counter-intuitive. |
I can only see one scenario where I'd edit a .sym file by hand and it's to document a game without source. Using the .sym file as both a scratchpad with notes and a symbol list for the BGB emulator is pretty useful. That being said, with BGB being one of the primary consumers of this format, it seems useful to look into what it does. EDIT: Comments by the BGB author
In file formats where semicolon can never be part of a value, semicolon meaning "ignore to end of line" is fine. Or it can even denote that a description for the preceding label follows. But in file formats that can contain semicolon as part of a value, especially in a quoted string such as a section name, stripping comments requires implementing the entire quoted string behavior. |
Given that semicolons are not part of any valid value here (symbols don't take them), we thankfully don't have this problem. I don't think we'll support semicolons in labels either, period. :P |
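Because a semicolon can never appear inside a value in this format, stripping comments needs no quoting logic. A minimal Python sketch of that idea follows; the helper name `strip_comment` is hypothetical.

```python
def strip_comment(line):
    """Drop everything from the first semicolon to the end of the line.

    This is only safe because no valid value may contain a semicolon.
    """
    content, _, _ = line.partition(";")
    return content.strip()
```

The same function handles comment-only lines (which become empty strings) and lines with a trailing inline comment, so a simple tool would not need to treat the two cases differently.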
WLALINK specifies the .sym file that it generates. It allows inline comments set off with The .sym dialect accepted by mgbdis allows a second consecutive line with the same bank and address to specify the type of a label, so as to mark it in a disassembly as data, text, or 2bpp image.
The "second line with same address" thing would be a nightmare to parse (again, thinking of something simple like a regex here, not a full-blown program). It would also require a good definition of what types are valid and how to declare them. And it would collide with symfiles with multiple symbols on the same address, which are common. As for sections, I don't know; they might be useful, but they would impose ordering constraints on the file (right now you can list symbols in any order), and besides, isn't that what map files are for? I'd advocate for the simplest possible format — leave everything else to map files. |
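To illustrate why a "same address on the next line" convention would be ambiguous, here is a small Python sketch that indexes symbols by location. It assumes symbols have already been parsed into `(bank, address, name)` tuples; the function name and the sample symbol names are made up for the example.

```python
from collections import defaultdict


def index_by_address(symbols):
    """Group symbol names by (bank, address); several names may share one."""
    index = defaultdict(list)
    for bank, address, name in symbols:
        index[(bank, address)].append(name)
    return dict(index)


# Hypothetical input: two labels legitimately share address 00:0150.
symbols = [(0, 0x0150, "Start"), (0, 0x0150, "Main"), (1, 0x4000, "Data")]
by_address = index_by_address(symbols)
```

A consumer has to treat every entry in such a list as an ordinary label, so a second line at the same address cannot also be reserved for type information without further syntax.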
Keep in mind that one use case for sym files is reverse engineering: iteratively editing the file and reloading the sym file in BGB. Since humans are not as precise as machines, it may be appropriate to apply the principle of being conservative in what you output and liberal in what you accept. For example, I'd advocate for allowing the bank and address value to be down to one digit (BGB allows this in practice atm) but also recommending that automatic tools should output 2:4 (or longer prefix if needed.)
As for the other questions my votes are:
Allowing the bank to be optional: No.
Extending the format with other metadata: No.
Allowing comments on non-empty lines: Yes. (Useful for annotating human-generated files and already supported in BGB.)
Another thing that may be useful to specify formally is the meaning of local labels. While it doesn't necessarily matter syntactically for validating the basic format, it matters semantically. Currently, BGB allows local labels to be referenced within the scope of the global label that the local label belongs to. For example, you could enter something like |
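The "liberal in what you accept, conservative in what you output" suggestion from this comment could look roughly like the Python sketch below. It assumes a `bank:address name` line shape, accepts banks and addresses down to one hex digit plus an optional inline `;` comment, and re-emits the line in the 2:4 form. The function name, the tolerance for extra whitespace, and the choice of lowercase hex on output are assumptions made for the example.

```python
import re

# Lenient input: one or more hex digits for the bank, one to four for the
# address, any amount of whitespace before the name (all assumptions).
LENIENT_LINE = re.compile(r"^([0-9A-Fa-f]+):([0-9A-Fa-f]{1,4})\s+(\S+)$")


def normalize_symbol_line(line):
    """Return the line in 2:4 form, or None if it is not a symbol line."""
    text = line.partition(";")[0].strip()  # drop any inline comment
    match = LENIENT_LINE.match(text)
    if match is None:
        return None
    bank, address, name = match.groups()
    # Pad to at least two bank digits and exactly four address digits;
    # larger banks simply use a longer prefix.
    return f"{int(bank, 16):02x}:{int(address, 16):04x} {name}"
```

A hand-edited line such as `1:40 Foo ; note` would come back as `01:0040 Foo`, while anything that does not look like a symbol line yields `None`.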
A development as far as this issue is concerned: since we have a website, we have a place where we could publish such specifications. Now that I have some time on my hands, I will start compiling the thread above into a first draft, which I will then publish in rgbds-www on a branch (i.e. it won't be published online yet). Should the discussion continue here, or there? |
I wrote documentation on the SYM file format, but I'm less sure about the MAP one. It's currently still a moving target, and while following a somewhat systematic format, it's mostly intended for human consumption, and not scripts. I'm thinking that we should explicitly document the lack of documentation for now, and publish the SYM spec. |
Closes gbdev/rgbds#483 at least for now; as noted, the `.map` file format is intentionally not specified for the time being. Co-authored-by: Rangi <remy.oukaour+rangi42@gmail.com> Co-authored-by: aaaaaa123456789 <aaaaaa123456789@acidch.at>
The files follow no written specification, and so everyone makes their own assumptions... we should write a specification so we can decide what changes can be reliably made. This would also allow external tools (which I do know some use) to generate compatible files, and those are the reason I would like to make the specification broader than what's currently generated by RGBLINK.
To quote @mid-kid: