simh changes .dsk image files by silently adding signature #1059
Comments
This is the revision that added this questionable feature.
|
Please explain "which part" of the original data is "corrupted", and how this negatively affects you and all the things you may do with such a container. |
First and foremost, adding something to a disk image changes its size (obviously). I use an SD card to store the drive images for the same disk controller back to back. I put these images on SD card with
Next thing is that, like I mentioned before, the images created previously were sometimes under-allocated. If the guest OS was merely reserving the sectors of such a disk by marking them used, but never writing to those, the sectors were never added to the file, yet would still be treated as "0"s when read beyond the image boundary. And that was fine. With the current addition, the contents read might well be the "magic" added by simh, and that is not okay. One notable thing is the index file on ODS1, which was allocated but no blocks might have been "cleared" -- merely the clean bitmap is written at the beginning of the file. Still, the structure verification program(s) would go ahead and read those never stored sectors, and expect them to be unused (in accordance with the bitmap). With the new addition, the sector actually does look like a used file header: it has a non-zero file number and a checksum (at the end of the sector) -- but they contain "garbage" -- as they do not look like either valid
Lastly, the code included in
When I first saw simh starting to print a message on attach that it found a filesystem, I thought, "Oh, fine!" But so can do the
Now simh acts upon that rather intrusively, and that is not fine anymore. If you need to store meta-information about a container, you should be doing so in a separate file (or, if you like Windows that much and insist on using the same file -- in a separate NTFS stream), so it does not get in the way of the actual disk data. More so, if the user does not want you to create the container information files stored next to their disk images, they can also control that by (at least) changing permissions for the directory not to accept any new files. Finally, if they want to remove what simh added, they won't need to jump through hoops and tweak the contents of the .dsk files -- they would merely need to delete those extra files (and also all at once, quickly, with something like
The .dsk file modification is a totally unacceptable technique, no matter what the intent is / was. When you insert a thumb drive into your computer, your system does not start modifying your data behind your back. Yes, Mac tends to create those annoying "thumb" folders, but they do not change any of your files, just because. They keep their stuff separately, and easily removable, too!
So I urge you to revise these (backward-incompatible) changes to the handling of the image files, and to never modify the .dsk containers from outside the guest OS (formatting a new drive by writing the STD144 track is the only exception -- and even that used to be done only with explicit consent from the user, either via a command switch or a question asked by simh directly). |
I don't understand exactly what you're describing here. Are you saying that you've got n dd images from some physical disk merely stuffed one behind the other on a SD card that DOES NOT HAVE a file system on it? And when you attach this raw SD card to a simh disk, it is writing beyond the first dd'd image and overwriting the first 512 bytes of the second dd image? I'm wondering how you find such a SD disk useful. Specifically how do you reference the second and subsequent simh dd images stuck one behind the other on that SD disk? If, you actually have a file system on this SD disk, then each of these dd images are separate files and thus if sim_disk adds 512 bytes of data BEYOND the actual size of the simh unit that the container is being attached to, then no other disk images will be corrupted NOR will any of the data WITHIN the container (sized based on the drive it is attached to) be changed. The container size may indeed be expanded to the full container size of the simh device, but any such expansion will contain 0's and thus read operations to the simh device will return the same values as reads to the previously unexpanded drive.
As I said above, if the container is expanded to reflect the logical size of the simh unit it is attached to, the expanded content (readable from any OS running within the simulator) will contain 0's and thus return the same contents it did on prior versions. The extra 512 bytes are written past the logical size of the simulated disk drive and thus will never be visible to anything running within the simulator. If you think you've got an example case where this is not happening I want to see the details.
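The behaviour both sides rely on here (sectors inside the drive's capacity that were never written to the file read back as zeros) can be sketched as follows. This is only an illustration, not simh's code: `read_sector`, `SECTOR_SIZE` and `capacity_sectors` are hypothetical names, and a 512-byte sector is assumed.

```python
import io

SECTOR_SIZE = 512  # assumed sector size, for illustration only


def read_sector(image, lbn, capacity_sectors):
    """Read one sector of a drive with the given capacity from an image file.

    Sectors that lie inside the drive's capacity but beyond the end of the
    file (an under-allocated image) are returned as zeros.
    """
    if not 0 <= lbn < capacity_sectors:
        raise ValueError("sector is outside the simulated drive")
    image.seek(lbn * SECTOR_SIZE)
    data = image.read(SECTOR_SIZE)            # may be short or empty past EOF
    return data.ljust(SECTOR_SIZE, b"\x00")   # pad the missing part with zeros
```

The disagreement in the thread is about what happens when bytes that are not guest data exist in the file: a reader that uses a larger capacity than the one the appended block was placed after would return that block instead of zeros.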
You seem to have some specific knowledge about the functional failures in the current detection logic for the file systems you mention. Please provide example disk images with legitimate file systems that aren't properly detected and/or propose changes to fix what you've seen.
It comes down to the general concept that AUTOSIZING of disks is the default, since that is the most flexible choice for most folks: they generally don't read every detail of exactly how to use each device and, as you noted above, older disks which had legitimate file systems on them but hadn't been written out to the full size of the disk would AUTOSIZE to the wrong disk type. Detecting file systems on disks lets SCP know what the simulated operating system actually thought the disk size was when it was originally created, and thus allows AUTOSIZE to work correctly; otherwise the disk type would be chosen based on the container size, which might not be correct at all.
That may or may not be true for some cases, and certainly could be accommodated if there were any reasonable big endian systems available today. I've got an old Sparc box sitting around which I purchased most of 20 years ago specifically to have as a big-endian test system for simh development. That box hasn't been turned on for more than 10 years due to lack of demand and the horrible performance that it had the last time things were tried. If you can point to a big endian system which can be had cheaply we can look at this. You really haven't described how or when the additional 512 bytes added to the end of disk container files actually causes "corruption" or otherwise actually impacts you. |
The SD card goes into my FPGA that is running PDP11... You see, simh is not the only simulator out there, and your change makes it very hard to make the images interchangeable, like they were before. Yes, SD card does not have a file system of its own, it's barely an array of sectors. Drive images there follow each other one after the other, for a single controller -- all of the same size, so base address of the next one is its number times the size of the emulated drive. The additional sector messes that up (I did not know it was there until I noticed the boot sector [of the next drive] was gone). I know that I can use the transfer size with How do you know that the containers were correctly sized by simh? What if there's a human error (a mere typo)? Simh would "label" the pristine image per the wrong attachment right away, and there's no way back -- the wrong data is already there (in that label) that was forced onto the disk contents! I still don't understand whose decision was that to write something into the .dsk files -- community's? What I know for sure, is that it's the wrong thing to do. You can ask anybody. You should not be writing into user's data. Period. You want to keep your metadata -- keep it separately. If the goal is the mere "autosizing" for the lazy folks, then I guess they won't object if simh did its bookkeeping separately from the images, like I suggested previously -- either side by side with them, or, better, in a "hidden" subfolder, like ".simh/" -- all those 512-byte extras together. You want them to be tied closely with the original -- the subfolder helps as you can then name each of them there arbitrarily, and use an encoded string for the file names that checksums some .dsk file properties like name, size, inode number, etc, even bootsector. So if the original image gets moved, overwritten etc, you won't be using the "wrong" metadata with the new image. I don't see how that can be any difficult, actually. 
But it won't break the compatibility, and will leave the entire file for the guest OS disposal, like it should be. |
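A minimal sketch of the side-by-side bookkeeping proposed in the comment above, assuming a hidden `.simh/` subfolder next to the image and a file name derived from a hash of a few image properties (name, size, inode number, first sector). The function name `sidecar_path`, the `.meta` suffix and the choice of SHA-256 are all assumptions made for illustration; nothing like this exists in simh.

```python
import hashlib
import os


def sidecar_path(image_path, boot_bytes=512):
    """Return a hypothetical metadata file path under .simh/ for a disk image.

    The file name is a digest of the image's base name, size, inode number
    and first sector, so metadata recorded for one image is not picked up
    for a different image that later takes its place.
    """
    st = os.stat(image_path)
    with open(image_path, "rb") as f:
        boot = f.read(boot_bytes)
    digest = hashlib.sha256()
    digest.update(os.path.basename(image_path).encode("utf-8"))
    digest.update(str(st.st_size).encode("ascii"))
    digest.update(str(st.st_ino).encode("ascii"))
    digest.update(boot)
    directory = os.path.dirname(os.path.abspath(image_path))
    return os.path.join(directory, ".simh", digest.hexdigest()[:32] + ".meta")
```

With this scheme the image file itself is never written by the bookkeeping code; a real design would still have to decide what to do when the guest OS legitimately rewrites the first sector, since that changes the derived name.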
Then I don't understand why you created your label using the byte conversions as if it was to be used between the CPUs of different endian-ness. That's inconsistent, at best. |
So, simh is NOT actually doing anything to the SD card, your dd of the disk image happened to take the container's contents plus the additional metadata. As you note, your process can be specific and only dd the part of the container file which actually contains disk data. This would not have even been a problem if each time you moved things to the SD card, you moved ALL of the disk images in order. I say this since what you've described suggests that each successive dd operation for the subsequent disk contents should be specifying the offset into the destination SD where each successive disk image belongs. The first file would have an extra 512 bytes, but writing the second disk image would overwrite those bytes with the data for the second disk image, etc.. Note that simh DID NOT corrupt any disk data, your data move process did. Accommodating this extra data has 2 solutions mentioned here.
I'm not sure what problem you're describing here. If a user created a container and put a file system on it, the file system would be within the bounds of the container he created. If he happened to type something wrong when he did that, it is his error. If he subsequently realizes that he made a mistake, then he should be creating a new container... Newly created containers are sized based on the disk type in question. For MANY YEARS simh 4.x has created containers that are the full size of the disk type it was creating. The AUTOSIZING file system detection logic was explicitly added to accommodate legacy disk containers created before the simh 4.x full size creation paradigm. The commit you reference above added the metadata beyond the data part of the container.
The meta data is absolutely outside of the bounds of the user data in the disk container, and as you suggest is kept separately. The operating systems running in the simulator can not see or touch this data.
If you were tasked with solving this problem you clearly would have solved it differently, but given that the problem as it currently is implemented is easily tolerated there really isn't a big problem here. You could use the Bob's 3.x code which probably would meet your goals of interchange media between simh and the FPGA which certainly doesn't yet have all of the devices in the 4.x PDP11 simulator.
Byte conversions were used specifically to support the big-endian case. simh data (disks and other things) is supposed to be endian independent and allow interchange. I believe this goes back to when big-endian systems were relatively common. The code currently happens not to have been tested recently, but with sufficient motivation it could be tested and fixed if errors actually exist. |
I do not have to transfer disk images to SD in order. I can do so randomly. I used simh to prep an image, then write it at a specific offset. The extra sector damages the integrity of the next image on SD. That is the problem. The container size changed from what it used to be (either lesser or equal to the drive it represented) but has gotten bigger, and that's not good.
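The layout described here (equal-sized drive images stored back to back, so slot N starts at N times the drive size) suggests a copy step that transfers exactly one drive's worth of bytes, whatever the file size happens to be. A minimal sketch, assuming the drive size in bytes is known to the caller; `write_image_to_slot` and its parameters are hypothetical names, not part of any existing tool:

```python
def write_image_to_slot(image_path, device_path, slot, drive_bytes):
    """Copy exactly drive_bytes of an image into slot number `slot`.

    Anything in the image file past drive_bytes is ignored, and an
    under-allocated image is padded with zeros, so neighbouring slots
    are never touched.
    """
    with open(image_path, "rb") as src:
        data = src.read(drive_bytes)           # at most one drive's worth
    data = data.ljust(drive_bytes, b"\x00")    # pad short images with zeros
    with open(device_path, "r+b") as dst:
        dst.seek(slot * drive_bytes)           # base address = slot * drive size
        dst.write(data)
```

The sketch reads the whole image into memory, which is reasonable for PDP-11 sized drives; a chunked copy would be the obvious change for larger images.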
The user brought a .dsk file with a filesystem on it, created previously. simh not knowing what there was, labeled it at a wrong offset because the user made a mistake attaching it to a wrong drive. The .dsk file is now ruined.
It's not true because simh cannot know for sure what it is there, on the disk. See above. Metadata has to be physically separate for the data operations to be safe. Let the guest OS handle the image data, throughout.
So why this "pretty" information can't be pulled from some place else other than the .dsk file itself? I was just suggesting the files (IMO the simplest), but it can be whatever. A registry, a database, but it has got to be separate from the original .dsk image. Not mere logically within, but outside of that file.
You contradict here with your own code! The new "footer" (which I call simh disk label here) is written in big-endian (network) format, yet the fs detection code, in the very same source file, does not even care to do anything endian-agnostic, and is all very much little-endian. Your 20 y.o. sparc machine can demonstrate you that immediately: I can guarantee you, no filesystems will get recognized. If, however, as you mentioned previously, simh should no longer be concerned about the big-endian arch, then writing just the "footer" in big-endian form is rather weird.
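The endianness point can be illustrated with a generic example of explicit byte-order handling. This is not simh's code; the function names and the 32-bit field are made up for illustration. A field stored with a stated byte order decodes identically on any host, while a field decoded with the wrong order yields a different number.

```python
import struct


def encode_be32(value):
    """Store a 32-bit field in big-endian (network) order."""
    return struct.pack(">I", value)


def decode_be32(raw):
    """Read a big-endian 32-bit field, independent of the host CPU."""
    return struct.unpack(">I", raw)[0]


def decode_le32(raw):
    """Read a little-endian 32-bit field, independent of the host CPU."""
    return struct.unpack("<I", raw)[0]


raw = encode_be32(0x12345678)
print(raw.hex())               # 12345678
print(hex(decode_be32(raw)))   # 0x12345678
print(hex(decode_le32(raw)))   # 0x78563412 (wrong byte order for this field)
```

Code that instead reads structures through the host's native layout is correct only on hosts whose byte order matches the stored data, which is the inconsistency being pointed out between the footer and the file system detection.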
Sadly you seem to be unable to think it out of the box, and fail to acknowledge there is a problem. There's very little I can do about it, but I tried. I'm sure this will be brought up again as folks gradually upgrade their simh binaries and realize the simulator is now messing up their (existing) images. |
I wholly agree with Anthony's argument. SIMH should NOT modify disk images by default unless allowed by the user. I haven't seen this issue yet (I am still back on an April commit). My RX01/02 emulator expects .dsk images (on a FAT filesystem) to be EXACTLY 512,512 or 256,256 bytes to be valid images. My 2c. |
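The two exact sizes quoted above follow from the floppy geometry of 77 tracks of 26 sectors, with 128-byte sectors (single density, RX01) or 256-byte sectors (double density, RX02). A strict size check of the kind such an emulator might apply could look like this sketch; `classify_rx_image` is a hypothetical helper, not code from any emulator mentioned in the thread.

```python
# (tracks, sectors per track, bytes per sector)
RX_GEOMETRY = {
    "RX01": (77, 26, 128),   # 77 * 26 * 128 = 256,256 bytes
    "RX02": (77, 26, 256),   # 77 * 26 * 256 = 512,512 bytes
}


def classify_rx_image(size_bytes):
    """Return "RX01" or "RX02" for an exactly sized image, else None."""
    for name, (tracks, sectors, sector_bytes) in RX_GEOMETRY.items():
        if size_bytes == tracks * sectors * sector_bytes:
            return name
    return None
```

Under a check like this, an image that has grown by even one 512-byte block no longer matches either geometry and is rejected.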
Don, thanks! If you attach your images read-only (as with the Ironically, having the metadata separate as suggested, would have still let simh create the metadata even for read-only images (which it can't do now), to warn people on their next attach that they were doing something different:
Yet the main point here was that aside from anything, USER DATA MUST BE RESPECTED. The .dsk containers are pure disk images (not necessarily even created by simh!), portable between different systems, and must remain so (or if not -- then at least given an option, with a big fat warning that it's changing, to politely decline). But this change seems to only cater to some "forgetful" users of simh, totally disregarding the fact of breaking the compatibility and creating a nuisance for others. Who cares, right? Having the metadata separate would still be able to help the weak-minded individuals just the same with reminding them what attachment they used the last time, and would not even call for another So far the solution offered here was to switch to the earlier 3.x version. Hmm, really? Here's what I did, instead:
for every disk image simh mutilated. And I inserted |
For |
I'm well aware of that, and if you read the thread you'd see that I said:
But that is not exactly the point of this discussion here. The problem is that simh began meddling with the contents of your files on its own (and worst part, without any permission), which formerly was solely the job of the guest OS -- but that was always only with whatever you instructed the OS to do. It's your data, and only you modify it the way you want / need. simh can poke around and "guess" the containers' contents all that it wants, and report that information back (as a "filesystem found") but it cannot act upon that the intrusive way, as it is implemented now (no warning, no question, no nothing). The only thing it might be concerned about, can be the size of the container exceeding the capacity of the drive it's being attached to. And even then, a mere warning will do by saying that data beyond the capacity limit won't be accessible. Not even an error, IMO. What seems "prudent" to me is that simh should not try to be the authority of your data, especially when it cannot actually figure the contents with 100% accuracy (and frankly, it should not even be bothered to do such a job as a hardware simulator, what strictly speaking simh is!). If there's a need to "track" the user actions' "correctness" or consistency, simh is welcome to do so in a totally separate file space -- but again, NOT THE CONTAINERS themselves! Lastly, if the urge is that the containers must be keeping that information, then there should be a clear and explicit consent from the user for allowing the simulator to mess with the data:
How happy would you be, Rhialto, if you inserted a thumb drive with some pictures (that you wanted to show to your friends) into your computer, which would figure it out (Ah, pictures!), then would decide on its own that the balance of white was not optimal, and finally would quietly go ahead and fix them by adjusting the palette (not even the actual "scene" data, it's logically separate) for you without you even knowing? And, oh, all that with also changing the sizes of your pictures' files and dates that they were modified (i.e. taken) -- without you even touching them with any modifications of your own -- just showing them to your friends. How cool is that? |
Please note:
|
Please note that YOU made my container incompatible with what it used to be. That simple.
Yes, but only on behalf of the operating system that is running under the simulator! I haven't given simh any permissions to change the container on its own volition, have I?
It's called "file format", "compatibility", "interchangeability", whatever, and it's broken now thanks to this change, with some sort of a stubbornly-manic and centric superiority of simh over other software that can be also used to deal with this data, and complete disregard of the consequences of these actions. |
I am done beating a dead horse here, since like I said, you are unable or unwilling to acknowledge the problem that the "managing" of the container in the way it is implemented currently, is unacceptable by any standard. I don't know who tasked you to code it the way you did, and who you discussed it with prior to doing so, but I am sure that had the discussion been brought to a broader light some 15 months ago, you would have heard a lot of objections, at least in the part of doing the additions to the containers all the way silently -- I am positive about this one!
I haven't checked how this is implemented (and I did not know it existed) but I'm quite sure it may not undo the changes in all cases, because if the "footer" was written past an artificial hole in the file, created as a result of reposition to the "logical end", the hole, and hence, the size of the file after |
I agree that altering the contents of a file silently is a bad idea... but it wasn't my idea. There's a simple solution: use V3.X. |
@markpizz I still believe that what you called "change" (by modifying the issue title) is actually a "corruption", because as a hardware simulator, simh has no idea (and should not be even concerned!) of what the image contains, and writing anything in there corrupts that original data, in general. |
Please help me understand the prior comments on this issue: @markpizz refers to the file in question as a container. Maybe the two different terms could be implying different semantics:
Does that sound accurate? |
@gtackett: |
On Tue 26 Oct 2021 at 16:49:53 -0700, Anthony Lawrence wrote:
These two things (that you want to label as an _image_ and a
_container_) used to be identical things for the `.dsk` files, which
were supposed to be the pure medium data. Changing the notion and
function (an _image_ becoming a _container_, all of a sudden -- and
worst, silently) is not a compatible change. That is what this issue
is actually about.
The solution is of course, like the saying goes, to add another level of
indirection.
Have the container (with metadata inside) be one file. One item of
metadata is the file name of the image, which would be the other file.
This could easily be accused of being overkill, of course.
The container file could now be in a readable text format, which could
be called an advantage.
…-Olaf.
--
___ "Buying carbon credits is a bit like a serial killer paying someone else to
\X/ have kids to make his activity cost neutral." -The BOFH ***@***.***
|
First off, the
It does not have to actually be the same file, the metadata can be kept separately from the pure data perfectly fine. For example, in a dot-file or in a |
On Fri 29 Oct 2021 at 14:50:04 -0700, Anthony Lawrence wrote:
> Have the container (with metadata inside) be one file.
It does not have to actually be the same file, the metadata can be
kept separately from the pure data perfectly fine.
That's what I meant; I thought the next sentence made that doubly clear:
"the image, which would be the other file."
Sorry for not being explicit enough.
…-Olaf.
|
That's not that straightforward: the simh's But the point remains the same: a |
What the simh program does with disk containers while they are being used by a simulator is completely in the domain of simh. If you or someone or some program want to access or otherwise manipulate the 'data' portion of the container external to simh there are supported ways to explicitly get to that. Feel free to either use the supported method or whatever else you may want to invent external to the simulator. |
@cheater said:
That is really not my problem. I'm not in the business of selling simh to folks. However they manage to find simh they do.
In general, I suspect that they don't start by installing simh on the computer. Something else triggers their interest in retro computing, and they search around and come across something specific that interests them. Then they maybe install simh, or maybe start poking around with the many sample use cases, or they hear about Oscar's PiDP-8 or PiDP-11 and just order one before they even mess with simh at all. Then they start from that. Again, most generally don't care at all how simh manages to work. They just care about running the various operating systems or other software that can be found for these systems.
Again, not my problem.
I suspect many dig way back in their mind to when they first came into contact with the system(s) that are being simulated, they remember how to do something on the particular system. |
So, this discussion has been quiet for several days and no one has come along to describe how the current model negatively impacts something else they need to do, just the earlier opinions about how they would have done things differently. @eschaton spoke with such authority when he said:
It isn't as if I decided out of thin air to store both the disk container data content AND the details about what it actually is in the same container. There are many examples of precisely this concept. Rather than just spouting out arbitrary claims (like @eschaton) without specific examples, I'll point at the following: VirtualBox has VDI, VMDK, and VHD files that can be disk containers, with each having its necessary metadata embedded in the respective container. Microsoft's VM tools (Virtual PC, Virtual Server, Hyper-V) initially supported VHD and now also support VHDX. And simh v4.x has supported VHD container files since approximately 2011. VMDK seems to come from VMware. All of these containers have metadata stored in the same files as the simulated disk data.
One key goal of the appended metadata was to very specifically not interfere with the use of those containers in prior simh versions (v3.x) at least, and that in fact is the case. The default behavior for essentially all disks in 3.x was to explicitly mention the drive type in the configuration file and not attempt to autosize to determine drive types. Given that paradigm, any appended metadata has absolutely no effect on the behavior of the drive.
It all comes down to this: a handful of folks have deeply felt beliefs that this is a completely horrible solution and the world might soon end or their desks burst into flames with it implemented this way. :-) Sorry, there is a working solution now, and I'm not looking for ideas about how to do this differently. Since the folks with deeply held beliefs have the option to not have to deal with the metadata, yet they still object to the current implementation paradigm, they clearly think others will be somehow harmed if their own idea isn't adopted. I don't know what else to say. |
|
@AK6DN said:
If a unit has autosize disabled, then no metadata is added to the disk container. The desired behavior is how simh 3.x defaulted, so configuring a drive that way will avoid the "problem".
Disk containers attached read only (-R) do not get meta data added.
I'm considering two things:
@wboerhout said:
That will absolutely work. Is your approach wholly based on principle, or do you have an explicit need to move containers between different simulators (others not being simh) or an explicit need to attach unexpectedly sized disk containers to different drive types?
I didn't say it was impossible. I just pointed out that metadata contained in the same container file as disk data wasn't unprecedented. Note that I gave explicit examples of these cases. Both you and @eschaton haven't actually provided any specific pointers to your references that manage metadata without setting your desks on fire. |
This has finally got me riled enough to figure out how to comment. I have two major modes in which I use SIMH. One is where I am effectively doing development in a simulated environment. In that mode, the metadata isn't an issue; I'm modifying the disk images anyway, and their prior content is not of interest. The other major mode though, is that I archive hundreds of imaged historical media. A few of these are "releases" of the above efforts, but the large majority are meant to form a historical archive. It is absolutely useful to attach these to SIMH, for various forensic purposes. It is not acceptable to modify them. That said, I can do as Wilm mentions, and adjust my workflows to preserve these images. It is just a pain. (Mostly it creates a new workflow to ensure I didn't screw them up by accident.) One other comment is that I don't really buy the argument that only a few outspoken critics are complaining. I believe it is in the nature of tools like SIMH, that they are complex enough that the majority of the "users" are using the product in a canned way, created and maintained by a small cadre of individuals who have taken the time to learn the tool in depth. (I certainly am not an in-depth user of github, for example. I mostly use it by following someone's walk-through.) In that sense, I fear that a significant fraction of your hard-core user base (including me) is inconvenienced by this. Vince |
I do need (well, want, no lives depend on it) to move containers between simh and a commercial emulator. But also, I need to move containers between Q-Bus or Unibus controllers on different simh instances that support Q-bus or Unibus. This is how I noticed that I need to do extra things where before I did not. I have not been a fan of AUTOSIZE from the start, because I (used to) know disk sizes of most RD and RZ disks by heart, and I tend to set explicit disk types anyway. And, correct me if I'm wrong, set RQx NOAUTO does not prevent adding the header when the disk is created by the attach. The metadata is not unimportant. But, growing up in the VAX era made backwards compatibility important to me. New code and new features should not break old scripts / workflows. Wilm |
Of course you are! The .dsk file can be a result of random writes to it. Meaning that there will be gaps (that read as 0s) in between the sections of the file that have actually been written (including the pure-zero sectors, which will be present).
No, it won't! It's the "attach" command that mutilates the .dsk file; so using "zap" as the first command in the .ini file won't prevent a new signature from being added later on in the startup sequence. Also beware that, depending on the simh codebase, "zap" can actually "bite off" some of the original sectors from the image, along with the trailing signature -- as there was a bug there, previously.
And you are saying with another sort of authority that your solution is the best one. Well, if it was, we wouldn't be battling with you here, in an attempt to persuade you that you were wrong with the current approach. All those references to other formats used elsewhere are MOOT because they were designed from the get-go to contain heterogeneous information, pure data and meta-data altogether mixed-in yet logically well-isolated. The .dsk files are pure data only. There's no way to logically isolate the appendage you are authoritatively writing to it from the rest of the data!
Your sarcasm comes from inability to think outside the box. I warned you at the top of this thread that the issue was going to become recurrent, with more and more people getting frustrated with your implementation. There's nothing to laugh about now.
That's what we heard a lot lately, actually, from different aspects of this discussion. Thanks for using the plain language, at last. Now we know where we all stand with all our suggestions. Also, if that also applies to new users, why are you so adamant of catering to them this idea of the resident metadata? |
If this is such a desirable addition to SimH, where are the other users
that are coming out to defend its use?
That's what ultimately gets me. Everyone here is basically suggesting two
ways to fix this:
1. Disable it by default
2. Use a sidecar file
Forking SimH or reverting back to version 3 are not fixes, despite Bob
Supnik himself suggesting the latter.
Mark, you've made a lot of improvements to SimH over the years, and I
really appreciate those efforts, but I am personally very disappointed that
you are not willing to modify the behavior of this feature to suit our
workflows.
|
Mark said:
> no one has come along to describe how the current model negatively impacts something else they need to do
I don’t know how the changes will impact me going forward, but the changes have adversely affected my use of SimH in three ways:
1) a CDROM image file from a VAX3900 had metadata added which prevented it being read on a virtual CD drive on a VAXstation 3100. I had to dredge through historic archives to find a non-corrupted version to restore.
2) Various disk image files which were originally much smaller than the maximum size because only 20% of the drive had been allocated and used by VMS, got enlarged to full device size. For one disk this probably doesn’t matter, but I like to keep multiple backup copies to save temporal state, and this now takes many gigabytes of unnecessary space.
3) I’d like to move several SimH disk files from a VAX3900 to a MicroVAX 3100 simulation, in order to match historic licences. Originally only the 3900 with DU devices was available, so the disks were set up on that. However if I attach the disk file to the DK device, then the new sizing system objects.
I do very much appreciate the work that Mark does on SimH, but I do think this should have been implemented (if at all) in much more of an upwards compatible fashion.
P.S.
I can’t do the more traditional way of moving the disk contents of booting the MicroVAX as a satellite to the VAX and doing a BACKUP/IMAGE to a new volume, as the MicroVAX simulator fails on network boot (#718). I’d prefer effort was put into fixing such hard bugs which have been logged for several years, rather than adding dubious extra facilities.
Regards,
--
Paul Hardy
web: www.pghardy.net <http://www.pghardy.net>
|
@wboerhout said:
And you've determined that the presence of the metadata actually negatively affects operation of this other emulator?
There have been some bugs in the interpretation logic of the metadata, independent of where it is stored, and those have been fixed. There are still potential issues mixing containers between RQ and SCSI, but that has nothing to do with Unibus versus Qbus. Disks attached to RQ devices work on any Qbus or Unibus system that has an RQ controller. The same holds for the other disk containers on the common controllers these systems supported (RP, RL, etc.).
That is true. Please explain what valuable user data exists in a newly created disk container file. Just in case you can think of some, feel free to ZAP and all of your data will be preserved. :-)
Unless you've got evidence that your unnamed commercial emulator misbehaves in the presence of the metadata, no workflow is broken. If you encounter bugs or other problems in the metadata handling in simh, then I'll be glad to fix real problems in the logic. @drovak said:
Well, this reminds me of some words from a book we read to my kids. The words were: "I am the Lorax, I speak for the trees, for the trees have no tongues". In this case, the trees are everyone who hasn't been involved enough to have to dig deeply into the simh world to get under the covers. He then said:
That would completely remove benefit for the above mentioned trees and require that whole community to dig far more deeply than they ever need or expect to go.
That would again burden the trees with needing distinctly deep internal simh knowledge to find the sidecar and carry it around to wherever the container gets moved. Meanwhile, I've struggled with why the community of negative speakers here is so much against adding a single line, one time, to one file on their system. Specifically, change the simh.ini file in your home directory to contain: SET NOAUTOSIZE Wait a minute, I just realized and checked that the whole concept of simh executing .ini files at startup has never been formally documented. Bob's original "SIMH User's guide" didn't specifically mention startup command file execution, and as such the changes to that area never got added. The key addition to the document was:
I'm not going to change the simh default behaviors around autosizing, but anyone in the burning desks club can modify their own default simply enough that it might seem like arguing for the sake of arguing. Who knows? I am absolutely interested in any bugs in the interpretation and usage of the information in the metadata that folks encounter while using simh. @vrs42 said:
If you attach any of these containers to SIMH, the operating system running in the respective simulator can readily change the contents (unless you attach them read only). Read Only attaches don't add any meta data. If you really want to protect the source images (for archival sake), in simh, for essentially the past 10 years you could:
The above will copy the archive.dsk to the temp-working-copy.dsk container as a VHD (thus minimizing space consumed), but if you wanted the temp-working-copy.dsk container to be in SIMH format, merely add:
before the ATTACH command. Sure this is a change to your workflow, but it isn't due to meta data and you might not have known or considered it and it might actually be useful. @pghardy said:
This was a bug that got fixed as soon as you reported it, and the original CDROM contents were restorable with ZAP.
Storage is cheap, but I agree that's not a great excuse to copiously waste it. You could migrate these containers to VHDs (see example above) and they would be almost the original truncated size (potentially smaller in some cases). The smaller potential comes from the fact that unless you pay special attention, VMS's INITIALIZE command tends to locate the ODS2 home blocks around the disk and the index file near the middle of the volume. Any basic INITIALIZE activity beyond the beginning of the disk will leave zero-filled sections of the container just sitting there. When this data gets migrated to a VHD, any 1MB stretches of the disk that only contain 0's don't take up space in the VHD container.
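To get a rough feel for how much a given container might shrink under the rule described above (1MB stretches containing only zeros take no space), a raw image can be scanned block by block. This is only a sketch: the 1 MiB block size comes from the comment above, the function name is made up for illustration, and it does not model the real VHD on-disk layout or its overhead.

```python
BLOCK_SIZE = 1024 * 1024  # 1 MiB granularity, as described in the comment above


def count_zero_blocks(path, block_size=BLOCK_SIZE):
    """Return (total_blocks, all_zero_blocks) for a raw disk image file.

    A trailing partial block is counted as one block.
    """
    total = 0
    zero = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += 1
            # A chunk is all zeros when every byte in it equals 0.
            if chunk.count(0) == len(chunk):
                zero += 1
    return total, zero
```

Calling `count_zero_blocks("some-image.dsk")` on a container gives the number of blocks that a sparse format with this granularity would presumably not need to store.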
I think you meant RZ rather than DK, where RZ is SCSI, or maybe you meant the RD controller on the MicroVAX2000/VAXStation2000. As I said previously, more general purpose SCSI support is coming and SCSI should then interoperate with MSCP drives fairly well. Until that time, the above mentioned copy mechanism will allow easy access to the data on a new automatically created container. Meanwhile, the MicroVAX2000/VAXStation2000 RD device supports various DEC RD53, RD54, etc. Oddly enough, even though these drives had the same names as the ones connected to the RQDX MSCP controllers, each of these same named drives is different in size depending on RQ vs RD controller and as such, the file systems are different sizes. This is a case of the meta data protecting you from yourself without you realizing it. There are ways to get around the problem, but actually understanding what is going on matters.
|
The "if at all" part becomes more apparent now, when all the long explanations of @markpizz were laid out above: The metadata may not even be needed / created and / or used in a plethora of use-cases except for "autosize" but then again, the simulator somehow creates the metadata even for that, out of "thin air", and so it's also not at all critical and can be recreated just like that on-the-fly, and then used solely in-core of the simulator. So why is it there then in our disk images? It looks like @markpizz on his pathway to becoming an arborist, is obviously missing a few very important points:
There's a huge one already: good software never modifies user data behind their back. That's the number one law in the software development. Regardless of the intent. |
For me this issue and its solution are perfectly clear. A raw disk image is in a certain format, let's call it the "raw disk format". When additional meta data is added to such an image, the format of that image changes and it is no longer an image in raw disk format. That means that there should be (at least) two formats: a raw disk format and a second "signature format". If simh wants to add meta data to a raw disk image it could (and should) ask the user if they want to upgrade (i.e. copy) the image to the new "signature format". This way the original raw disk image is preserved and the advantages of the "signature format" are available for those who want them. The signature format could be versioned, allowing changes when the need for more meta data arises. |
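The two-format idea above (leave the raw image alone, copy it into a versioned "signature format" on request) could be sketched roughly as follows. Everything about the footer here is hypothetical: the `MAGIC` marker, the field layout and the function names are invented for illustration and have nothing to do with the signature simh actually writes.

```python
import json
import shutil
import struct

MAGIC = b"EXAMPLESIG"          # hypothetical marker, NOT simh's real signature
FORMAT_VERSION = 1             # hypothetical version number of this footer layout
TAIL = struct.Struct("<HI")    # version (u16), payload length (u32), little-endian


def upgrade_copy(raw_path, new_path, drive_type):
    """Copy raw_path to new_path and append a versioned footer to the copy only."""
    shutil.copyfile(raw_path, new_path)
    payload = json.dumps({"drive_type": drive_type}).encode("ascii")
    footer = payload + TAIL.pack(FORMAT_VERSION, len(payload)) + MAGIC
    with open(new_path, "ab") as f:
        f.write(footer)


def read_footer(path):
    """Return (version, metadata dict), or None if the file has no such footer."""
    tail_len = TAIL.size + len(MAGIC)
    with open(path, "rb") as f:
        f.seek(0, 2)                      # seek to end to learn the file size
        size = f.tell()
        if size < tail_len:
            return None
        f.seek(size - tail_len)
        tail = f.read(tail_len)
        if tail[TAIL.size:] != MAGIC:
            return None                   # plain raw image, nothing to interpret
        version, length = TAIL.unpack(tail[:TAIL.size])
        if length > size - tail_len:
            return None
        f.seek(size - tail_len - length)
        payload = f.read(length)
    return version, json.loads(payload)
```

The raw file is only ever opened for reading by `shutil.copyfile`, so the original stays byte-for-byte intact, and the version field leaves room for later layout changes.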
I’m again struggling with the SimH changes that added disk file
signatures.
For a historic software preservation project, I rescued disk images from
a real MicroVAX3100-80 by dint of clustering it with a SimH instance and
doing image backups. At the time, the only SimH to use was VAX, a 3900
which uses DU/RA disks, so that was what I created. Now that SimH
supports MicroVAX 3100-80, I am trying to get back to nearer the
original configuration. However if I attach the containers to DK/RZ
disks (using current SimH windows binary for 3100-80), I get a set of
errors:
att rz0 Disks/HARDY1DISK0.VAXDSK
%SIM-ERROR: RZ device: Non-existent parameter - RA90
%SIM-ERROR: RZ0: Cannot set to drive type RA90
%SIM-ERROR: RZ0: 'Disks/HARDY1DISK0.VAXDSK' can only be attached Read Only
Why does SimH want to insist that the disk files be readonly? If they
are valid enough to read, (the disk blocks mount OK in VMS) they are
valid enough to write!
I tried using ZAP to remove the metadata, which seemed to work the first
time, but when I went back to the 3900 to prepare VMS for the different
device names, my ZAP hung processing one of the disks. I killed the
process, but then the disk wouldn’t mount in VMS (no boot block). I
restored the disk image from backup, and have gone back for the moment
to just using the 3900.
I say again that the new system is fragile, overkill, and causes more
problems than it solves! Please allow simple attaching of raw disk block
files to different disk types.
--
Paul Hardy
web: www.pghardy.net <http://www.pghardy.net>
|
To keep it SIMPLE, this is how this useless thing impacts our work at our Computer Museum:
So yes, this is a big problem for us, as this unwanted feature creates only problems for our preservation / restoration efforts. I think it's simple to understand. Does simh need this kind of data? Create an external file. Problem solved. Asbesto |
@markpizz there are now many stories here of users telling you how this unfeature messed up their files and made their work hard. Too many stories to count. You admitted that it's only useful in a very narrow situation - during autosizing. There is literally no one here defending your solution, other than you. You are coming off as a bad actor here. If this is a project just for you, make the repository private and keep committing to it so you can use it. If it's for everyone, then listen to everyone, not just yourself. |
Today I was dragged into this discussion by @cheater, who utilized twitter to begin pinging individuals with a scant connection to this project, including historians and documentarians, in some sort of attempt to accomplish something. I assume the something was to shame/brigade @markpizz into reversing a decision they seem rather firm on having made. This went as well as one might expect - after not finding refuge under my rough take on the thread, cheater moved to harassment, assembling further people to come to my account and debate what the meaning of computer history is and how I can best present myself to the world. All of this predicated, of course, on the fact that I do not see the point of dragging in more individuals into what is absolutely a code and feature discussion. The result is I am still getting direct messages on twitter attempting to neg-push/shame me into admitting it was a bad idea not to throw my weight behind criticizing/harassing @markpizz. I am primarily posting this comment should individuals with boundary issues continue to attempt to blindfold-recruit underinformed individuals to join the "fight". My advice is not to. However, while I'm here. I've had enough experiences with both open-source projects and computer history/preservation situations to know that there are occasionally cases where the feature sets of tools do not match the need of institutions. Institutions have two choices: contribute coders/funding to maintain a fork or feature set within the project to handle their needs, or add additional steps to their process to work around the perceived flaws of the tools. The debate, therefore, is neither new nor unusual, nor something requiring a brigading/sea-lioning of anyone classified as standing in the way of the "battle" for preserving software. Attempting to lay a "professional/unprofessional" patina on this pulls a lot of weight away from how any such preservation project should be conducted.
Mistrusting open-source tools, maintaining pristine originals before conducting forensics/analysis and additionally logging the processes handled to ensure fixity are all basic operations in software preservation. Either the SIMH project has contingency/strength to handle debates about project direction/management, or it does not. However that all shakes out: Thanks to all the contributors to the SIMH project over the years. |
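The fixity practice mentioned above (keep pristine originals and be able to show they have not changed) is commonly done by recording a cryptographic hash before handing a working copy to any tool and checking it again afterwards. A minimal sketch using SHA-256 from Python's standard library follows; the function names are illustrative and not part of any particular preservation toolchain.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_unchanged(path, recorded_digest):
    """True if the file still matches the digest recorded earlier."""
    return sha256_of(path) == recorded_digest
```

Recording `sha256_of("archive.dsk")` in a log before attaching anything, and calling `is_unchanged` afterwards, would flag any modification to the file, including appended data.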
Thanks for the heads up. @cheater has now been blocked from this organization and all discussions hence forth. He's never actually contributed anything but criticism. He attributes this lack of positive contribution to contractual commitments with his employer. Meanwhile, the issues raised here are actively being worked on to best address both my goals and the broad needs of the user base and the various problems/bugs that have been reported here. @pghardy mentioned these bugs/problems:
As I mentioned a while back, this issue (attaching containers to dissimilar controllers) is a feature that will be implemented in the future (very soon now), so in the interim, several suggestions were provided to create compatible containers or to avoid autosizing.
As I just said this was a quick compromise to allow access to the data before full support was available.
Using ZAP was a reasonable work around, but you probably did a ZAP -Z.
This bug was fixed shortly after he mentioned it specifically, months ago. Since that initial fix, ISO 9660 (and any .iso file) attaches will avoid any metadata additions and will be done read only on any device type.
If you created disk containers with any version of simh from the github.com/simh/simh repo any time in the past 10-12 years, your disk container would have been the size of the drive you were creating it on. Disk containers started at size 0 with simh 3.x. Stay tuned for best practice to achieve minimal storage impact.
Stay tuned |
I've already had my say on this issue. I agree with textfiles. If you don't like SimH V4 behavior on metadata, you have multiple ways to proceed:
Mark added the metadata capability because SimH V3 could and did create sub-sized disk containers that were only as long as the data written, and that could and did confuse the autosizing logic. It was primarily a problem for disks that didn't follow the DEC standard for bad blocks, such as MSCP disks or DECsystem-10 disks. SimH V3 made no attempt to protect users from themselves. V4 does. One idea I have not heard suggested: decommit the autosize feature and the metadata feature in tandem. I added autosize to deal with the RK06/7 and RP04/6 on the VAX and PDP11 - where it doesn't fail, because the DEC STD 144 bad block table requires a full sized image. But it does fail in lots of other cases, and it isn't very useful. So why not just drop it, and drop the metadata feature too, and everyone lives happily ever after? |
How about this as a solution? Make sure all disks support the -n option, which creates a blank disk. You can also add a -l or -m option to label the disk with metadata indicating the type of disk and the size. On attach, check if the metadata exists and use it. If it does not, do nothing! Use the size and attempt to do the best you can. This means that people bringing disks from other simulators or actual hardware will not have their data changed without their permission. And when new disks are created, the metadata can be added if the user wishes. Also, if the controller does not support autosizing then no metadata will ever be attached. |
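The attach policy proposed above can be written down as a small decision function. This is only a restatement of that comment's rules in code, not simh behavior: the parameter names and the `Action` enum are invented here, and the -l / -m switches are the commenter's proposal rather than switches known to exist.

```python
from enum import Enum


class Action(Enum):
    USE_METADATA = "use the metadata already in the container"
    WRITE_METADATA = "label the newly created container"
    SIZE_ONLY = "infer what you can from the file size, write nothing"


def attach_action(metadata_present, new_container, label_requested,
                  controller_autosizes):
    """Decide what an attach should do under the proposed opt-in policy."""
    if metadata_present:
        # Existing metadata is read and used, never rewritten.
        return Action.USE_METADATA
    if new_container and label_requested and controller_autosizes:
        # Metadata is only ever written to a new disk, and only on request.
        return Action.WRITE_METADATA
    # Imported or pre-existing images are left byte-for-byte untouched.
    return Action.SIZE_ONLY
```

The point of the design is that the only path that writes anything requires both a brand new container and an explicit request from the user.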
Thanks for these wonderful suggestions which you are welcome to use on any simulator you work on. Meanwhile, the design for the subject of this issue is long done and about to see the light of day. |
As I understand it, these footers were intended to solve a problem, and it's a problem that would only really occur:
For everything else (existing raw disk images, physical media as block devices) the auto-detect/auto-size functions work, right?
I'm curious to see what the solution is here, but I'm glad to see one has been worked out. 🙏 |
@markpizz That was uncalled for. I was trying to offer a suggestion that would make all parties happy. Perhaps if it is taking time to do these fixes, development and testing should be carried out on a branch, rather than in the master branch. As a maintainer your job is to fix bugs, manage commits from contributors, and make enhancements as requested. You appear to be moving more into the role of author. |
The meta data functionality was in the master branch and working in simulators coming up on 2 years ago. The first comment about it came some 15 months afterwards. As I've said already, I'm not looking for suggestions for design-from-scratch ideas, so no insult was intended, just clarification of the state of things. I've already provided a very simple means to allow the key complainers about the metadata design to avoid the meta data addition to their disk container files (SET NOAUTOSIZE) and a way to remove meta data from containers which may have inadvertently acquired it (ZAP 'container.file'). SET NOAUTOSIZE can be made a personal default by including this command in ~/simh.ini. These accommodations have seemed not to be sufficient for the couple of primary complainers. Note the complaints haven't actually been "too many stories to count"; just because the same guy repeats his story many times does not actually add to the count. So far, there have been bugs identified and fixed (or being worked on) and 2 distinct stories about cases where someone was using a simh disk container outside of simh and needed to change their external use due to the meta data. The first case is easily fixed by either SET NOAUTOSIZE or by modifying the external procedure slightly.
I find it odd that in addition to the other roles you mention, you're now just realizing that I've taken on the role of author. The codebase in the simh master branch (just in the scp.c, and sim_*.c,.h files) has expanded by a factor of between 3 and 4 since departing from v3. The vast majority of those changes were authored by me in addition to new code and enhancements directly to code in a number of simulators. |
Yes, my bad, I was not clear about that. I was talking about putting the disk image (used in simh) back not on the original disk (that obviously takes care of the "problem" if the added data is at the end) but on a hardware disk emulator (like scsi2sd for example), in which you dump the disk image onto another modern medium (usb, sdcard etc.). From what I understood, SET NOAUTOSIZE is helpful, so TY, I will try that :) |
In the light of #1163 this is going to be my last post to this repo in the shape and form under the new "terms" (unless either they are dropped, or the project is forked out as FOSS again elsewhere). I accidentally learned about this new licensing fuss just yesterday as I was about to submit yet another bug report (and a patch for, ironically,
Now this proves to be a complete lie. You had been told about it by others much earlier (but I was unaware, not on that list): https://groups.io/g/simh/message/303 The "author" here cannot understand that the success of open source comes from users' feedback... And a lot of that feedback (resulting in hundreds of commits) comes in the form of bug reports and bugfixes, even if that looks like a discussion rather than a patch or a piece of code. (Ironically, if one has ever tried to submit the latter, they would have probably noticed that @markpizz, out of his extreme vanity and possessiveness, would reword the patch and almost never apply it in the form suggested but as his own redo.) So I am letting Mark Pizzolato (who seems to remain in complete denial) have "his" simulator for his own pleasure (yet I'm quite sure he's not actually using any of it), and keep developing "features" as he pleases. But I'll be looking elsewhere for a fork of this code, which has a truly open source atmosphere, and which is free of this stubborn, childish and pig-headed attitude, and a clinically insane inability to take any criticism. It's not surprising that many (including Bob Supnik) have decided to EXIT out of here. You wanted to go postal, you have at it! |
Hi, |
Mark, I've got a question: running a Unix V7 image which ran with v3.x, I get this error with the former dsk image: %SIM-ERROR: RL0: The disk container '/home/jeanfrancois/got/simh/jfs/drives/pdp11-45_unix_v7_rl.dsk' is larger than simulated device (5242KW > 2621KW) Can you please specify how to fix the dsk so that newer simh builds will properly read it? Thanks
Well, it seems you're trying to attach a RL02 disk image to a drive set to be a RL01. I suggest:
|
Indeed and i found it too in the pdfs. |
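For reference, the two figures in the error message above line up with the standard RL01 and RL02 pack sizes. The sketch below uses the usual DEC geometry (40 sectors per track, 2 surfaces, 256 bytes or 128 16-bit words per sector, 256 cylinders for the RL01 and 512 for the RL02); treating "KW" as thousands of words is an assumption inferred from the numbers in the message, and the helper names are made up.

```python
WORDS_PER_SECTOR = 128      # 256 bytes per sector, 16-bit words
SECTORS_PER_TRACK = 40
SURFACES = 2
CYLINDERS = {"RL01": 256, "RL02": 512}


def capacity_words(drive):
    """Capacity of an RL01 or RL02 pack in 16-bit words."""
    return CYLINDERS[drive] * SURFACES * SECTORS_PER_TRACK * WORDS_PER_SECTOR


def kilowords(words):
    """Words expressed in thousands, truncated (assumed to match the message)."""
    return words // 1000


def drive_for_size(size_bytes):
    """Return the smallest RL drive type that holds size_bytes, or None."""
    for drive in ("RL01", "RL02"):
        if size_bytes <= capacity_words(drive) * 2:
            return drive
    return None
```

With this geometry an RL01 holds 2,621,440 words and an RL02 holds 5,242,880 words, so a full RL02 image (10,485,760 bytes) is exactly twice what a drive configured as RL01 expects.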
Email me at mark@infocomm.com |
Context
This is a new feature but it is a very BAD one: when a disk image is attached to a device in simh, the simulator appends some kind of a signature to the disk image. WHY is this necessary? If this IS necessary -- can this signature be kept separately in something like a "Media Descriptor File" .MDS -- but NOT in the image itself. Please!
Just attach a "pristine" .dsk image to a simh disk drive and see it getting corrupted with an additional sector appended. There seems to be no way to opt out of this behavior. This is also a new behavior (so must have been added recently). This behavior is unacceptable. It may be okay to do this for container files like VHD, but NOT disk sector-by-sector image files, and certainly not silently.
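A separate descriptor file of the kind asked for above could look roughly like this. It is a sketch of the idea only: the `.meta.json` suffix, the field names and the functions are hypothetical, and simh does not read or write such a file.

```python
import json
import os


def sidecar_path(image_path):
    """Hypothetical naming convention: metadata lives next to the image."""
    return image_path + ".meta.json"


def write_sidecar(image_path, drive_type):
    """Record metadata beside the image without opening the image for writing."""
    meta = {
        "drive_type": drive_type,
        "image_size": os.path.getsize(image_path),
    }
    with open(sidecar_path(image_path), "w", encoding="ascii") as f:
        json.dump(meta, f)


def read_sidecar(image_path):
    """Return the recorded metadata, or None if no sidecar exists."""
    try:
        with open(sidecar_path(image_path), "r", encoding="ascii") as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```

Deleting the sidecar removes the metadata completely, and the image itself is never modified, which is the property the request is after. The trade-off raised elsewhere in the thread is that the sidecar has to travel with the image when it is moved.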
the output of "sim> SHOW VERSION" while running the simulator which is having the issue
how you built the simulator or that you're using prebuilt binaries
the simulator configuration file (or commands) which were used when the problem occurred.
the expected behavior and the actual behavior
The disk image files must not be modified from outside the guest operating system that is run under the simulator. You cannot assume that the only user of the image is the simulator itself, so the "addition", which is understood by simh, is actually a corruption of the original data.
you may also need to provide specific pointers to data files that may be necessary to demonstrate the problem
This garbage does not belong in the .dsk image file (written beyond the actual last sector of the image):