Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaner and more verbose extract.sh; drop bash dependency #67

Merged
merged 3 commits into from
Jun 2, 2022

Conversation

vrza
Copy link
Contributor

@vrza vrza commented May 31, 2022

Tidying up this script a bit as well.

@taviso
Copy link
Owner

taviso commented Jun 1, 2022

Looks good to me, what do you think about using set -e?

@vrza
Copy link
Contributor Author

vrza commented Jun 1, 2022

I prefer explicit error handling and almost never use set -e in scripts.

In this particular case, while set -e would exit the script early, it would still not perform clean up (removing root/ and orig/) which might be necessary in order to re-run the script.

We could add a cleanup function, either to be invoked by the user manually (extract clean), or we can run it and exit on any error.

While in theory cleanup could be triggered by using set -e with trap ERR, in practice it's fiddly, as some commands might exit a non-zero status during normal script operation.

https://mywiki.wooledge.org/BashFAQ/105

Somewhat related thoughts, for your consideration:

  • I only use .sh extension for shell scripts if they are meant to be used as a library, by invoking . lib.sh from another script
  • Perhaps the extract process should be triggered by a Makefile target, for a more unified build and cleanup user experience.

- use POSIX options with cpio and tar for portability
- drop dependency on working directory
- add 'clean' subprogram
@vrza
Copy link
Contributor Author

vrza commented Jun 1, 2022

Made the script more portable, #62 is now partially fixed, but libarchive cpio aka bsdcpio failing to extract 123UNIX4.IMG is something that still needs looking into.

Skip duplicate / junk data at beginning of 123UNIX4.IMG.
Now it can be extracted with both GNU and BSD (libarchive) cpio.
@vrza
Copy link
Contributor Author

vrza commented Jun 1, 2022

Fully fixed #62.

@vrza
Copy link
Contributor Author

vrza commented Jun 1, 2022

Begginings of123UNIX2.IMG and 123UNIX4.IMG seem to be the same:

$ cmp 123UNIX2.IMG 123UNIX4.IMG
123UNIX2.IMG 123UNIX4.IMG differ: byte 36865, line 489 

This is where in 123UNIX4.IMG, 123_sysV.hlp.z seems to be corrupt (gunzip detects corruption in compressed data).

123.o.z_2 is found deep in 123UNIX4.IMG, 550kB in, in cpio format. GNU cpio is able to skip to this file automatically, but BSD (libarchive) cpio is not.

It's hard to say how and when, but it almost seems like 123UNIX4.IMG was at some point partially overwritten with data from 123UNIX2.IMG.

@taviso
Copy link
Owner

taviso commented Jun 2, 2022

Thanks this looks good to me - I'm tempted to wait to merge until the keymap tool is done #65 (hopefully tomorrow), because right now it just uses xterm for everything. I think I can just pull all the right sequences from termcap and it will be a pretty good default.

@taviso taviso merged commit bd2577e into taviso:main Jun 2, 2022
@taviso
Copy link
Owner

taviso commented Jun 2, 2022

Never mind, I changed my mind, let's just commit it for now.

taviso added a commit that referenced this pull request Jun 2, 2022
@dtwilliamson
Copy link

I recommend a habit of avoiding all-caps variable names in shell scripts to reduce the chance of collisions with shell and environment variable names.

http://mywiki.wooledge.org/BashGuide/InputAndOutput#The_Environment

@vrza
Copy link
Contributor Author

vrza commented Jun 2, 2022

We should definitely be careful not to clobber any exported variables, lest they be exported to our child processes inadvertently modified.

Regarding naming style, I try to stick to an old convention of using ALL_CAPS for constants (vars not modified after assignment) and exported vars. As a reference, Google shell scripting style guide also recommends this:

https://google.github.io/styleguide/shellguide.html#s7.3-constants-and-environment-variable-names

@taviso
Copy link
Owner

taviso commented Jun 3, 2022

I'm hearing about issues with the new extract script - I'm tempted to revert it for now until I can test it more thoroughly.

(I just heard it hangs under WSL2, I don't really know why)

taviso added a commit that referenced this pull request Jun 3, 2022
@taviso
Copy link
Owner

taviso commented Jun 3, 2022

I figured out the problem, it wasn't hanging, disk i/o on wsl2 is just ridiculously slow - so the dd seeking with bs=1 was just taking forever.

I'll re-merge this tomorrow.

@vrza
Copy link
Contributor Author

vrza commented Jun 3, 2022

@taviso Try inverting the parameters (skip=1 bs=550536) to see if that helps improve performance?

The root cause of why 123UNIX4.IMG seems to be corrupt still needs investigating -- dd seek is a workaround for data corruption. Finding a pristine copy of that floppy disk image would be nice.

In the meantime, you could also just upload the trimmed version of 123UNIX4.IMG, so that the script doesn' have to skip corrupt data.

@taviso
Copy link
Owner

taviso commented Jun 3, 2022

Yeah, It's really unfortunate - the original files I recovered were TD0 files, so I thought maybe the tool I used to recover the images were buggy - but I've tried three independent implementations and they all produce the same file.

I guess it's possible this is some form of copy protection, I know that the 123 DOS and Agenda disk images had some unusual properties that made them difficult to copy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants