"Collapse to external file" #144

Closed
BacchusFLT opened this issue Feb 14, 2023 · 8 comments
Labels
enhancement New feature or request

Comments

@BacchusFLT

BacchusFLT commented Feb 14, 2023

I am not really sure I have thought this all through yet, so just see this as food for thought:

If you disassemble a big chunk of data, it will contain both code and data. If you coded this yourself, the segments of data would in many cases be inserted in the code from an external file, rather than being included in the primary file.

Let's say I have a bitmap picture, a game font, game level data or something like that. I guess I am seeking a mechanism for selecting it and defining it as a segment that should be exported to a separate file, with the main assembler file containing a reference to it.

While generating the assembler file, I am suggesting a feature saving out a selected chunk and having the assembler file contain something like .import binary "thebinaryfile.bin"

That was KickAssembler syntax, which I know isn't supported, but just as an example. For the C64, I guess we'd need a parameter to select between binary (plain raw data) and prg (which adds the load address as the first two bytes of the file), in addition to being able to select the relevant filename.
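
For illustration, the difference between the two file types can be sketched in a few lines of Python. The function name `to_prg` is hypothetical and not part of any tool discussed here; the only thing assumed is the layout described above, a two-byte little-endian load address followed by the raw data.

```python
import struct


def to_prg(data: bytes, load_address: int) -> bytes:
    """Return `data` prefixed with a 2-byte little-endian load address.

    Hypothetical helper: a "binary" export would write `data` unchanged,
    while a "prg" export would write the result of this function.
    """
    if not 0 <= load_address <= 0xFFFF:
        raise ValueError("load address must fit in 16 bits")
    # "<H" packs an unsigned 16-bit integer, low byte first.
    return struct.pack("<H", load_address) + data


raw = bytes([0x01, 0x02, 0x03])
prg = to_prg(raw, 0x2000)  # two header bytes ($00, $20), then the raw data
```

A prg export is therefore always exactly two bytes longer than the matching binary export.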

If the data block was also collapsed, this would increase the overview of the file as a bigger share of the file would be the actual code.

This would help build a proper project from the reassembly, where editing the exported segments using external tools could also start.

The part where I surely haven't thought this all through is the combination with the visualizers. There are cases where "collapse to external file" would be relevant for non-graphical elements (like music and loader), but it could also be a way to address the request to collapse data covered by visualizers.

@fadden
Owner

fadden commented Feb 14, 2023

I can think of two general ways to do something like this:

  1. Import raw binary.
  2. Import assembler code with bulk data instructions.

The limiting factor on (1) is likely assembler support. I'm not sure how many assemblers allow you to insert a fully-formed binary blob. Some assemblers might want to link binaries in rather than include them in. (2) is easier because it just requires some version of "#include", which is pretty standard.

For (1), the labels need to be in the base file. For (2), we can have a file with multiple blobs with individual labels. In fact, for (2) we can just generally put code in other files. OTOH, for (2) we have to consider nested includes.

There's a similar open item for 65816 code, for which having everything in one file is very awkward, but that's a broader issue (and you'd very much prefer linking over including in that case to help with label scope).

@BacchusFLT
Author

BacchusFLT commented Feb 14, 2023

I see the value of both options 1 and 2, so if both could be supported you would have a full framework for including external files, source included.

I am however mainly advocating option 1 here, as that would mean that the external file could be edited by an external editor. The whole point is that an external editor would be able to access it in binary format. My modus operandi today is to save the binary segments out using an emulator and then feed 6502bench only the segments containing code. I then need to create the inclusions manually. With the suggested feature I could import the entire memory dump and then sort out what is code, what is data that I want in my main file, what is data I want in separate files, and which segments are uninitialized memory or junk that I don't care about.

The assemblers whose documentation I found to have the feature:

TASS has ".binary"
ACME has "!binary"
CA65 has ".incbin"

I have no idea how Merlin works, so that would possibly be a restriction. I can't find how it should be done, and it would be sad if a least common denominator lacking one of the options prevented a beneficial feature.
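
As a rough sketch of how the directives listed above differ only in spelling, a per-assembler lookup table could look like this in Python. The dictionary keys and the function name `binary_include_line` are assumptions made for illustration, and Merlin is deliberately left out because no equivalent was identified in this thread.

```python
from typing import Optional

# Directive names as listed above; the dictionary keys are illustrative
# identifiers, not names used by any particular tool.
BINARY_INCLUDE_DIRECTIVES = {
    "64tass": ".binary",
    "acme": "!binary",
    "ca65": ".incbin",
}


def binary_include_line(assembler: str, filename: str) -> Optional[str]:
    """Return a binary-include source line, or None if unsupported."""
    directive = BINARY_INCLUDE_DIRECTIVES.get(assembler.lower())
    if directive is None:
        # Caller would fall back to emitting the data inline.
        return None
    return f'{directive} "{filename}"'


print(binary_include_line("ca65", "thebinaryfile.bin"))
```

Returning `None` for an unknown assembler leaves room for a fallback, so one unsupported assembler would not have to block the feature for the others.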

@fadden
Owner

fadden commented Feb 16, 2023

I appear to have underestimated support among assemblers. Thank you for doing the research.

We can work around non-support by just not separating the data if the assembler can't handle it. Some source generators already have situations (usually having to do with 65816 code) where they just dump raw hex.

Generating multiple output files is already supported (needed for ca65), so that's not an issue. We'd need to have the code generator output a label and an include statement in the generated source, then copy the binary data to a separate file, and skip forward.
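
The generator steps described above (output a label and an include statement, copy the bytes to a separate file, skip forward) might be sketched as follows. This is a simplified Python illustration, not 6502bench's actual generator: the `regions` mapping, the function name, the `.byte` fallback, and the output layout are all assumptions.

```python
from pathlib import Path
from typing import Dict, List, Tuple

# Hypothetical region table: file offset -> (label, filename, length).
Regions = Dict[int, Tuple[str, str, int]]


def emit_with_binary_includes(data: bytes, regions: Regions,
                              directive: str, out_dir: str) -> List[str]:
    """Emit source lines, diverting marked regions to external files."""
    lines: List[str] = []
    offset = 0
    while offset < len(data):
        region = regions.get(offset)
        if region is not None:
            label, filename, length = region
            target = Path(out_dir) / filename
            target.parent.mkdir(parents=True, exist_ok=True)
            # Copy the region's bytes to the separate binary file.
            target.write_bytes(data[offset:offset + length])
            # Output the label and the include statement.
            lines.append(label)
            lines.append(f'        {directive} "{filename}"')
            # Skip forward past the excised region.
            offset += length
        else:
            # Everything else is dumped one byte per line (illustrative only).
            lines.append(f"        .byte ${data[offset]:02x}")
            offset += 1
    return lines
```

Because the label stays in the generated source, the rest of the listing can keep referring to the region by name even though its bytes now live in another file.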

The uncertain part is handling it in the user interface. We need a way to identify the region of bytes to copy to the file. One approach would be to make it a new data operand format. Since the region can't have labels or comments in it, and ideally it doesn't show up on-screen as anything more than a "500 bytes here" sort of marker, that might work. You can't straddle a label or long comment with a single operand format, so we prevent mid-region labels automatically. We need a place to store the filename; the SymbolRef field might work for that, since it's a weak reference and nothing will break if the symbol can't be found. Reverting the binarification could be done by changing the data operand.

Visualizers should still work, because they're based purely on file offset, and do not require a label.

Another way might be to define a new kind of label. When encountered, everything that follows, up to the next label, is thrown into the binary file. The filename could be tucked into the end-of-line comment.

Generated files currently have the form "name_assembler.ext". We might not need the "_assembler" part, since the binary is not particular to any given assembler. OTOH, if multiple assemblers are in use, it might help to keep all files associated with a given assembler in a nice group.

@BacchusFLT
Author

BacchusFLT commented Feb 16, 2023 via email

@fadden
Owner

fadden commented Feb 16, 2023

> The reference to "output a label", would that mean that all references to inside the block exported would be "label+offset"? (I would very much like that!)

Yes. That would work the same way it does now.
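
A minimal Python sketch of the "label+offset" idea, assuming a single label placed at the start of the excised region; the function and parameter names are hypothetical.

```python
def format_reference(address: int, region_start: int,
                     region_length: int, label: str) -> str:
    """Express an address inside a labeled region as label or label+offset."""
    if not region_start <= address < region_start + region_length:
        raise ValueError("address is outside the labeled region")
    offset = address - region_start
    # The first byte is just the label; later bytes get a "+N" suffix.
    return label if offset == 0 else f"{label}+{offset}"


print(format_reference(0x2005, 0x2000, 500, "bitmap"))  # bitmap+5
```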

> Can I also reiterate that collapsing the area would also be much appreciated. If that is a separate option, which could also apply to other areas or just this, I can't say. General is always better if it's possible.

Collapsing this type of section is easy if we do it as a distinct operand format. Presumably any data that's worth omitting from the assembly isn't all that interesting to look at. Anyone who wants to see what's actually there can double-click on the "bytes" column to view the file hex dump.

I'm not sure what to do with the HTML export, which is really just a slightly modified copy of what's shown on screen. Appending a hex dump of the imported file might make sense.

A more general show/hide mechanism would be best integrated into the main list UI. We could add a checkbox to the data operand editor that means, "only show the first line of multi-line items", but that's inconvenient to toggle, and doesn't help with something like a sprite formatted to have one graphic line per line of code (which looks nicer in the assembler, where you can't hide anything).

@fadden fadden added the enhancement New feature or request label Feb 16, 2023
@BacchusFLT
Author

BacchusFLT commented Feb 17, 2023 via email

@fadden
Owner

fadden commented Feb 18, 2023

Added to "TO-DO" list.

@fadden fadden closed this as completed Feb 18, 2023
@fadden fadden reopened this May 20, 2024
fadden added a commit that referenced this issue Jun 1, 2024
This adds a new data format option, "binary include", that takes a
filename operand.  When assembly sources are generated, the section
of file is replaced with an appropriate pseudo-op, and binary files
are generated that hold the file contents.  This is a convenient way
to remove large binary blobs, such as music or sound samples, that
aren't useful to have in text form in the sources.

Partial pathnames are allowed, so you can output a sound blob to
"sounds/blather.bin".  For safety reasons, we don't allow the files
to be created above the project directory, and existing files will
only be overwritten if they have a matching length (so you don't
accidentally stomp on your project file).

The files are not currently shown in the GenAsm dialog, which lets
you see a preview of the generated sources.  The hex dump tool
can do this for the (presumably rare) situations where it's useful.

A new regression test, 20300-binary-include, has been added.  The
pseudo-op name can be overridden on-screen in the settings.

We don't currently do anything new for text/HTML exports.  It might
be useful to generate an optional appendix with a hex dump of the
excised sections.

(issue #144)
@fadden
Owner

fadden commented Jun 1, 2024

Available in https://github.com/fadden/6502bench/releases/tag/v1.9.0-dev2

In brief:

  • Works just like the other "bulk data" formatters.
  • Each binary include must have a unique filename.
  • Files can be stored in subdirectories (e.g. "sounds/stuff.bin"), but can't ascend to a parent of the project directory.
  • The binary files are generated at the same time as the assembly sources. Existing files with the same names will be overwritten if and only if they have the same length. This is a safety measure to avoid inadvertently overwriting other files.
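
The two safety rules in the list above (no ascending out of the project directory, overwrite only on matching length) can be sketched in Python roughly as follows. This is an independent illustration of the stated behavior, not the 6502bench implementation, and both function names are made up for the example.

```python
from pathlib import Path


def resolve_output_path(project_dir: str, relative_name: str) -> Path:
    """Resolve a partial pathname, rejecting anything outside the project."""
    project = Path(project_dir).resolve()
    target = (project / relative_name).resolve()
    # After resolving "..", the project directory must be an ancestor.
    if project not in target.parents:
        raise ValueError(f"{relative_name!r} is outside the project directory")
    return target


def write_binary_include(project_dir: str, relative_name: str,
                         data: bytes) -> bool:
    """Write the file; return False if an existing file must not be replaced."""
    target = resolve_output_path(project_dir, relative_name)
    if target.exists():
        # Overwrite if and only if the existing file has the same length.
        if not target.is_file() or target.stat().st_size != len(data):
            return False
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return True
```

With this sketch, "sounds/stuff.bin" is accepted and its subdirectory is created on demand, while a name such as "../stuff.bin" raises an error before anything is written.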

@fadden fadden closed this as completed Jun 2, 2024