Skip to content

Add -o / --output option to rgbfix to write separate output files #1666

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 20, 2025

Conversation

jmillikin
Copy link
Contributor

This allows rgbfix to be used from build systems that discourage or forbid in-place modification of input files in build steps.

  • rgbfix foo.gb and cat foo.gb | rgbfix continue to work as-is.
  • New behavior: rgbfix foo.gb -o out.gb opens foo.gb read-only and writes output to out.gb
  • Existing stdin/stdout support also composes with -o, such that cat foo.gb | rgbfix -o out.gb and rgbfix foo.gb -o - | cat > out.gb work.

@Rangi42 Rangi42 requested a review from ISSOtm March 1, 2025 08:32
@Rangi42 Rangi42 added this to the 0.9.2 milestone Mar 1, 2025
@Rangi42 Rangi42 added enhancement Typically new features; lesser priority than bugs rgbfix This affects RGBFIX labels Mar 1, 2025
@Rangi42
Copy link
Contributor

Rangi42 commented Mar 1, 2025

Hey jmillikin, thank you for this contribution! I'd like to know what @ISSOtm thinks of this feature.

On the one hand, it can currently be done with rgbfix - > out.gb < in.gb, but an explicit rgbfix -o out.gb in.gb flag might be more obvious to users.

If we do want to merge it, the man/rgbfix.1 docs and the contrib/*_compl files would also need updating.

@jmillikin jmillikin force-pushed the rgbfix-output-filename branch from 58450ef to 3255762 Compare March 1, 2025 14:10
@jmillikin
Copy link
Contributor Author

Updated man/rgbfix.1 and contrib/*_compl/ for the -o option.

The build system I'm using (Bazel) is pretty strongly file-oriented, and doesn't make it easy to use stdin/stdout for communicating data to/from subprocesses. If invoking a shell manually there's issues of path quoting, so rgbfix would need to be invoked like this:

args = ["/bin/sh", "-c", "$1 - > $2 < $3", "sh", rgbfix.path, out_gb.path, in_gb.path],

... and frankly a desire to avoid that is what inspired the creation of this patch request.

@ISSOtm
Copy link
Member

ISSOtm commented Mar 1, 2025

The problem with >final.gb is that this is handled by the shell, and thus is created empty on failure.

The problem with -o is that the output file can be only partially (over)written.

The proper fix is to write to e.g. final.gb.tmp, and then mv final.gb.tmp final.gb to atomically overwrite the result file.

Note that the “input file” problem is also solved by the above fix; and that RGBLINK and RGBFIX should be considered a single step (they are only separate programs so that RGBFIX can be reused externally... and there have been feature requests for making RGBLINK self-sufficient).

@jmillikin
Copy link
Contributor Author

jmillikin commented Mar 1, 2025

Merging the execution of two separate binaries (rgblink and rgbfix) into a single build step would require me to write a wrapper script of some sort, which is a bit of a hassle.

rgbfix seems useful on its own, since it allows patchelf-style modifications to a ROM image without having to put all of the possible modifications into rgblink.

The problem with -o is that the output file can be only partially (over)written.

The proper fix is to write to e.g. final.gb.tmp, and then mv final.gb.tmp final.gb to atomically overwrite the result file.

In this patch the output file is opened with O_CREAT, so if it already exists then rgbfix will error out instead of trying to overwrite it. And if it errors out (calls exit(1)) then the output file will be ignored and discarded by the build system, so partial writes aren't a concern.

@Rangi42
Copy link
Contributor

Rangi42 commented Mar 2, 2025

there have been feature requests for making RGBLINK self-sufficient

This could be difficult, at least with the current v0.x CLIs, because rgblink and rgbfix each have a lot of flags already, and one program that does it all would get complicated.

@jmillikin
Copy link
Contributor Author

Gentle ping?

@ISSOtm
Copy link
Member

ISSOtm commented Mar 11, 2025

So rgbfix <FILE> remains useful for manual in-place edits.

As for the functionality itself, what does this patch provide that mv <FILE>.tmp <FILE> doesn't?

@jmillikin
Copy link
Contributor Author

jmillikin commented Mar 12, 2025

As mentioned up-thread, I'm trying to use rgbds with a build system named Bazel, which defines the build as a graph that uses commands to map from input files to output files. This build system design requires that commands specify their input and output files; this is enforced by (1) making inputs read-only and (2) not allowing the same output file to be generated by different commands.

The other binaries in rgbds work perfectly in this sort of build system. rgbasm maps source file -> object file, rgblink maps object files -> .gb ROM file -- but rgbfix mutates its input file in-place, which isn't allowed. Which means I need to write a wrapper program around rgbfix to set up its filesystem, then fork() / exec() / CreateProcessW(), then move the output files around to the right place.

Since the other binaries in rgbds all open their inputs read-only and have some way to specify the output path, I figured it would be reasonable for rgbfix to have similar functionality. That allows it to be invoked from the build system, rather than requiring a manual fix-up step to set the ROM logo and padding and such.

@ISSOtm
Copy link
Member

ISSOtm commented Mar 13, 2025

I see. This is already possible using the piping functionality, which I'll express in Makefile syntax, knowing nothing better:

%.o: %.asm
    rgbasm -o $@ $< ...

%.gb.tmp: %.o %.o ...
    rgblink -o $@ $^ ...

%.gb: %.gb.tmp
    rgbfix - >$@ <$< ...

Is this sufficient?

@Rangi42
Copy link
Contributor

Rangi42 commented Mar 13, 2025

Is this sufficient?

I don't think so, since the whole point was that Bazel makes it inconvenient to work with - < > like that (see the /bin/sh Bazel syntax example above).

If the reasoning against this feature is just that the shell has features to do it manually, then I'd still be in favor of adding it. That way every program could work with in/out files:

rgbasm  -o foo.o   foo.asm
rgblink -o foo.tmp foo.o
rgbfix  -o foo.gb  foo.tmp

or with stdin/out piping:

< foo.asm rgbasm -o - - | rgblink -o - - | rgbfix - > foo.gb

and wouldn't rely on a particular execution environment to work. (Shell scripts, Makefiles, Bazel, Python subprocesses, etc, don't all have convenient features beyond "run this executable with these arguments".)

@jmillikin jmillikin force-pushed the rgbfix-output-filename branch from 3255762 to 3f187fe Compare March 13, 2025 23:30
@jmillikin
Copy link
Contributor Author

@Rangi42 Yep, exactly! Thank you for explaining it better than I was.

@ISSOtm
Copy link
Member

ISSOtm commented Mar 14, 2025

Fair enough, then.

@jmillikin
Copy link
Contributor Author

Are there any additional changes the reviewers would like to see here?

Copy link
Member

@ISSOtm ISSOtm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a tiny little thing regarding the Bash completions, I don't have anything to request about this patch! :)

@jmillikin jmillikin force-pushed the rgbfix-output-filename branch from 3f187fe to 71562f2 Compare April 4, 2025 12:06
Copy link
Member

@ISSOtm ISSOtm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thank you ^^

@ISSOtm ISSOtm requested a review from Rangi42 April 4, 2025 12:30
Comment on lines +1299 to +1298
if (!outputName) {
output = input;
}
processFile(input, output, name, stat.st_size, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid closing the same file handle twice, as output and as input.

Suggested change
if (!outputName) {
output = input;
}
processFile(input, output, name, stat.st_size, true);
processFile(input, outputName ? output : input, name, stat.st_size, true);

Copy link
Contributor Author

@jmillikin jmillikin Apr 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I don't understand this comment.

The close(output) call is guarded by if (openedOutput), which can only true if outputName is non-null. The guard around output = input has the opposite test, so I don't think there's any path by which close() could be called twice on the same FD.

Comment on lines +1237 to +1258
int output = -1;
bool openedOutput = false;
if (outputName) {
if (!strcmp(outputName, "-")) {
output = STDOUT_FILENO;
(void)setmode(STDOUT_FILENO, O_BINARY);
} else {
output = open(outputName, O_WRONLY | O_BINARY | O_CREAT, 0600);
if (output == -1) {
report(
"FATAL: Failed to open \"%s\" for writing: %s\n", outputName, strerror(errno)
);
return true;
}
openedOutput = true;
}
}
Defer closeOutput{[&] {
if (openedOutput) {
close(output);
}
}};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int output = -1;
bool openedOutput = false;
if (outputName) {
if (!strcmp(outputName, "-")) {
output = STDOUT_FILENO;
(void)setmode(STDOUT_FILENO, O_BINARY);
} else {
output = open(outputName, O_WRONLY | O_BINARY | O_CREAT, 0600);
if (output == -1) {
report(
"FATAL: Failed to open \"%s\" for writing: %s\n", outputName, strerror(errno)
);
return true;
}
openedOutput = true;
}
}
Defer closeOutput{[&] {
if (openedOutput) {
close(output);
}
}};
int output = -1;
if (outputName) {
if (!strcmp(outputName, "-")) {
output = STDOUT_FILENO;
(void)setmode(STDOUT_FILENO, O_BINARY);
} else {
output = open(outputName, O_WRONLY | O_BINARY | O_CREAT, 0600);
if (output == -1) {
report(
"FATAL: Failed to open \"%s\" for writing: %s\n", outputName, strerror(errno)
);
return true;
}
}
}
Defer closeOutput{[&] {
if (output != -1) {
close(output);
}
}};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggestion would cause close(STDOUT_FILENO) to be called if the first branch is taken; is that intended?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not intended. But, my goal here and in the other change suggestion was to avoid the openedOutput boolean flag logic, unless it's absolutely necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The openedOutput flag is necessary because output might be in one of three states:

  • No output, which means the input file is modified in-place.
  • Output is valid and was opened by this function, so it should be closed before this function returns.
  • Output is valid but is borrowed from the global state (STDOUT_FILENO), and therefore should not be closed.

If you want to avoid a bool openedOutput then there are some options, but they're more elaborate than a bool so probably not what you're looking for:

  • Call dup(STDOUT_FILENO) so that the validity of output guarantees it's owned by this function.
  • Split processFilename() into two functions, with the lifecycle of output being handled similarly to how processFilename() currently handles input. Maybe call them processInputFilename() and processOutputFilename() ?
  • Add a new class MaybeOwnedFd that keeps track of whether its internal int fd is owned or borrowed, and has a destructor that closes the fd if owned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thank you for explaining. You've carefully thought this out. I may add more tests for the edge cases and refactor this in one of those ways, but it's good as-is for now.

@jmillikin jmillikin force-pushed the rgbfix-output-filename branch from 71562f2 to 487d7e2 Compare April 5, 2025 06:06
@jmillikin
Copy link
Contributor Author

Gentle ping? Sorry to keep poking via email, but I'm not sure if the comment replies are getting through.

@Rangi42
Copy link
Contributor

Rangi42 commented Apr 12, 2025

Gentle ping? Sorry to keep poking via email, but I'm not sure if the comment replies are getting through.

They are, it's just nobody has been working on rgbds lately. This isn't forgotten though; it will be done before 0.9.2 releases.

@avivace
Copy link
Member

avivace commented Apr 12, 2025

@Rangi42 what is missing here to merge?

@jmillikin jmillikin force-pushed the rgbfix-output-filename branch from 487d7e2 to a7843b5 Compare April 14, 2025 12:14
@jmillikin
Copy link
Contributor Author

Updated to fix clang-format errors.

@Rangi42 Rangi42 merged commit bc8d99d into gbdev:master Apr 20, 2025
23 checks passed
@jmillikin jmillikin deleted the rgbfix-output-filename branch April 20, 2025 04:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Typically new features; lesser priority than bugs rgbfix This affects RGBFIX
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants