raSCSI needs to periodically check disk write cache(s) #335
Comments
I really like this idea. I think this is important to implement, since there really isn't a good way to safely shut down the RaSCSI when it's mounted internally. Thank you for the suggestion! |
Thanks for opening this issue. I believe I have seen this same issue on a Mac, so I am adding (anecdotal) confirmation of similar issues. In my case, I set up RaSCSI on a Pi Zero W as an internal solution for a Mac SE/30 I am restoring. I was not really considering Pi shutdown, even though I have used Raspberry Pis for years and have even built soft shutdown solutions for them. In this case, I was pretty excited to try this. I set up multiple drive images that I was able to boot successfully. But it was very easy to mess them up. This was a bit of a nightmare while restoring this Mac. I was constantly having to start over! I quickly realized I could avoid disk image corruption by opening the RaSCSI web interface and detaching my disks BEFORE cutting power. In some cases, I could resurrect the disk image by booting from another one, then mounting the corrupt image file, then reinstalling the system, or even just updating the HDD driver. One time, however, that did not work and I had to reformat. In no case did cutting the power actually corrupt the Pi's own image. It only clobbered the attached drive image files, which supports the idea that cached writes were getting lost. Come to think of it, I did once lose a file change made just before shutting down, but I guess I brushed it off, since there were so many repeated setups for me because of the images getting corrupted. I posted some questions about this on the 68kmla forums because I also decided to re-cap my Mac's analog board and PSU. I was not 100% sure my problem wasn't something on the SCSI bus, like low term power. I was so frustrated by the situation that I de-soldered the headers on the RaSCSI so I could reverse them and try a Pi 3B+ in case my Zero was the problem, or try external options, but my Mac was not cooperating. I could not get it to run externally, so I decided I should re-cap the Mac's analog board and PSU before trying again.
I also set up BlueSCSI so I could have a working internal solution, since safe shutdown was so problematic. Yesterday, I finished the recap work, and today, landoGriffin on the 68kmla forums shared this link. So I am adding my story here in case it helps. As far as fixing this goes, if I cannot shut off the Mac without pulling out my phone or another computer, that makes the internal option completely impractical. I would hope the write cache is flushed the moment the drive is otherwise idle. If not, I would want an option to disable write caching completely. I suppose that kills write performance, but at least I would be safe to shut down when my computer says it is safe to cut the power. |
Also, not a criticism of pacjunk's approach above: the concept of flushing the cache by completing the writes is sound, but 5 seconds is an eternity. I would think we would want to be in the 5 millisecond range! (OK, 5 ms is pretty quick, but it should be faster than I can reach over and hit the power switch.) Actually, this value is extremely important, so it would probably make the most sense not to pick the time arbitrarily, but to compare it to write caching in actual drives of the era to see what was typically acceptable. I do remember back in the day being able to disable write caching on drive controllers to avoid even the remote possibility of leaving writes on the table, so to speak, during a power failure. We did this on certain critical systems to decrease the likelihood of corruption (for embedded systems mounted up in a crane, for example). In any case, I suspect a little research will uncover a good starting value for the idle-to-cache-write threshold timing. |
The effect on performance here is important. If you need to check the cache every 5 ms, then you might as well not have it. Shutting down mid-write is always asking for trouble, and you have the issue of OS caching as well as RaSCSI caching. My proposal is for the situation where you shut down the OS cleanly, then hit the power. There are usually a few seconds of idle time in this case. It may be better with a 1 or 2 second flush, but not milliseconds. Given the sometimes random nature of I/O, can you really say it is idle after 5 ms? Glad that someone else can reproduce it, and yes, it is a pain to have to get the phone out to shut down the raSCSI, especially when it is host powered. |
A small note that as of the October release, RaSCSI will automatically detach all devices before shutting itself down, i.e. if you do something like "rasctl -X" or by other means send the SHUT_DOWN command to the server. If you're running RaSCSI as a systemd service, it is configured to do the same when it is ordered to stop, e.g. when the system shuts down. This was a partial fix for this particular issue. Of course, this doesn't help in the scenario where the Pi suddenly loses power... |
Thanks for your rational perspective, @Pacjunk. You are right. Milliseconds are orders of magnitude too fast, as this is the realm of drive head seek times! One or two seconds is reasonable. I suppose just knowing there is a safe time in the first place is a start. @rdmark I appreciate the use of the command to trigger the detach to reach a safe state, especially incorporating that step into the service shutdown. That at least means it would be safe to skip the detach steps and issue a shutdown command, but it still requires a second device. Where it could be useful, though, is if we could set up a soft shutdown trigger on an unused GPIO pin. For an external device, you could plug it in for power, but have a power-off button that would run a safe shutdown script similar to examples we have probably all set up on other Pi-powered projects. Thanks for the dialog. RaSCSI is a cool solution with a lot of promise, especially the DaynaPORT Ethernet. I am looking forward to diving into that next, but for a hands-free internal drive solution, this write cache issue really needs to be resolved. I am happy to test more as needed. |
Has anyone observed whether the hosts properly send the SYNCHRONIZE CACHE command? |
I don't see it in the trace logs. I assume the trace will log unhandled commands? |
@Pacjunk Yes, it would. I have never stumbled upon a platform that uses SYNCHRONIZE CACHE, by the way. All in all, periodically writing the cached data would not resolve the general issue. That the Pi crashes or is powered off can happen any time. With periodically flushing the caches you cannot prevent losing data. In addition, just the fact that RaSCSI flushes its caches does not mean that Linux immediately writes these data to the disk. From that perspective IMHO working on this ticket does not provide a reliable benefit. It might make data loss a bit less likely, but that's not worth the effort. I suggest to close this ticket, because the idea sounds fine, but there are no benefits in practice. |
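The two layers described in this comment (an application-level flush versus the kernel actually writing to the medium) can be illustrated with a minimal Python sketch. This is not RaSCSI code; `write_durably` is a hypothetical helper, and the sketch only assumes a POSIX-like system where `os.fsync` asks the kernel to push a file's cached data to the storage device.

```python
import os


def write_durably(path: str, offset: int, data: bytes) -> None:
    """Write data at offset, then request that it reach the storage device."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()             # layer 1: user-space buffer -> kernel page cache
        os.fsync(f.fileno())  # layer 2: kernel page cache -> storage device
```

Without the `os.fsync` call, the data may remain in the kernel's page cache for some time after the application considers it flushed, which is the gap this comment points out.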
Hmmm. I think to say there are no benefits in practice is a bit too far, as I can reliably ruin disk images on an internal RaSCSI. Making data loss less likely is the benefit. Not being able to make data loss impossible is not a reason to close this issue. I suspect that not addressing this at all will lead to widespread rejection of the solution for internal, and some pretty stark warnings about data loss even on external. I know this—without any adjustment here, I won’t ever use RaSCSI as an internal drive and I’d be obligated to share that perspective whenever I read about someone going after that option. I appreciate the suggestion though about synchronous writes. I mentioned that earlier. There was talk about that ruining performance, but having the option gives a chance to test this in practice. It also reflects the same option present on physical drives. |
@caver01 There is always a tradeoff between performance and reliability. The more often you write, the slower the system gets. Without synchronous writes on the Linux level even a perfect solution on the RaSCSI side might not be worth a lot. Maybe even nothing, considering that Linux caches a lot of filesystem data in memory, as long as there is still free memory. I'm wondering: Is there any other (i.e. non-RaSCSI, but similar to it) solution that does periodic writes? |
I just checked the RaSCSI code. SYNCHRONIZE CACHE currently does nothing. It would not be a big deal to flush the cache on a SYNCHRONIZE CACHE command, just like it is already done on a STOP UNIT (eject) command. But that would not resolve the Linux caching issue and the fact that the usual drivers do not use SYNCHRONIZE CACHE. Which option offered by physical drives are you referring to when you say "It also reflects the same option present on physical drives."? As far as I know physical drives are designed to use the remaining power from their capacitors to finish pending writes (data potentially cached by the drives themselves) when they are powered down. #497 would eliminate any caching issues, by the way, because all SCSI commands would directly be passed to the Linux kernel, and the SG driver directly passes the commands to the drive. No host filesystem involved, thus no software caching involved. But image files would not work anymore when using this feature, because the commands are executed on raw device level. Memory cards would work, for instance. |
In my experience setting up storage solutions in servers (we are talking early 1990s here so it’s somewhat aligned with some of the retro systems where I am using RaSCSI) we often set jumpers or configured drive controllers in software to disable write caching to reduce the likelihood of corruption where customers were hyper-concerned about data integrity. Obviously, there was never a situation where possible corruption was acceptable, but this was definitely an option offered by the devices and we took advantage of it. At the time, we knew we were taking a performance hit doing that, but it was walking that balance. RAID solutions also helped, and a UPS made everyone feel better. |
Yes, I see. Some drives (e.g. QUANTUM SCSI drives) offer switching off caching by software. There are mode pages for that, which you can manipulate with MODE SELECT. |
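For reference, the setting in question lives in the SCSI caching mode page (page code 0x08), where the Write Cache Enable (WCE) flag is bit 2 of byte 2. The following Python sketch only manipulates the bytes of such a page; the function names and the sample page are illustrative assumptions, and actually sending the page to a drive with MODE SELECT is not shown.

```python
CACHING_PAGE_CODE = 0x08  # SCSI caching mode page
WCE_MASK = 0x04           # Write Cache Enable: bit 2 of byte 2


def _check_page(page: bytes) -> None:
    # The page code occupies the low six bits of byte 0.
    if page[0] & 0x3F != CACHING_PAGE_CODE:
        raise ValueError("not a caching mode page")


def write_cache_enabled(page: bytes) -> bool:
    """Report whether the WCE bit is set in a caching mode page."""
    _check_page(page)
    return bool(page[2] & WCE_MASK)


def set_write_cache(page: bytes, enabled: bool) -> bytes:
    """Return a copy of the page with the WCE bit set or cleared."""
    _check_page(page)
    out = bytearray(page)
    if enabled:
        out[2] |= WCE_MASK
    else:
        out[2] &= ~WCE_MASK & 0xFF
    return bytes(out)


# Illustrative page only: code 0x08, length 0x0A, WCE set, other fields zero.
sample_page = bytes([0x08, 0x0A, 0x04]) + bytes(9)
```

A host tool would typically read the current page with MODE SENSE, change only this bit, and write the page back.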
The scenario of having the Pi powered by the host and switching off the host can be addressed up to a certain degree by implementing a custom SCSI command (or a vendor specific mode page for MODE SELECT) that shuts down RaSCSI or the whole Pi. An OS that can run scripts (or binaries) during its shutdown phase could launch a script that sends this custom command. Linux or any Unix can do that, or MiNT or MagiC for the Atari. I guess a Mac can also execute code during its shutdown. |
You can never totally eliminate the chance of corruption, but what we should be doing is reducing that possibility. The most likely scenario is where you shut down the OS, then flick the power switch (without manually shutting down the pi). Flushing the cache regularly would fix this. As far as I am aware (I come from the server world), controllers and disks will always flush data on idle. RaSCSI does not do this, and unflushed data can sit in the cache for hours (or permanently) until the pi is shut down or the devices detached (which has been coded to flush the cache). Personally I think it is very important to flush the cache, but if others disagree, then I humbly request that an option be added to disable the write cache completely. I would rather have the performance hit. I have been playing a bit with bluescsi (which does not do write caching) and I find performance acceptable. Never had a corruption issue either and due to lack of any management features it is never cleanly shut down. Thanks for looking into it. |
Triggering shutdown is an interesting idea, but I am wondering how practical it is to rely on the host OS for that. One common use case for example is in classic Macs. Who writes new software for old operating systems? I am certainly not equipped to do that. There is a “shutdown items” folder to house a script/app, but that folder did not exist until Mac OS 7.5. What about folks with prior versions? Not to mention, is the timing of executing “shutdown items” even appropriate to pull a drive out from under the OS? Surely there are other tasks the OS is executing during shutdown after it launches an app or script. This does make it less of a drop-in replacement for an actual SCSI HDD. And to that point, why don’t real SCSI drives require this? I suppose I am pointing out that minimum should at least be like-for-like functionality. Can I corrupt a real HDD by killing the power? Probably, but you NEVER see this when looking at the dialog that says “It is now safe to turn off the computer”. RaSCSI should behave reliably in this situation, but it doesn’t. |
@caver01 I think we already answered why real drives do not require this: Either you switch the cache off (by jumper or MODE SELECT), or they use the remaining power from their capacitors to write the pending data. |
You did mention the capacitor hardware bit. Maybe a better comparison, then, is the fact that I can run a BlueSCSI device which also uses image files on the SD card and these are not getting corrupted under these same shutdown circumstances. Perhaps they implemented the same cache write triggers on power off? I dunno. It is open-source, so perhaps we can find their solution. |
BlueSCSI does not have its own cache like RaSCSI does, so nothing to flush there. Also no linux OS in there either. I believe the SD card library does do some caching, but the bluescsi code flushes after each group of blocks is written to the card. |
BlueSCSI doesn't use a RAM cache, so it's not going to have this problem. I'm assuming SCSI2SD is the same. They're not running a full OS stack. My two cents on this issue: I think it is definitely an issue with RaSCSI that it doesn't flush the data when it's idle. IMHO, letting the cached data just hang out in RAM in perpetuity is a bug and needs to be fixed. As a proposed experiment, I think we should try updating the disk_track_cache.cpp file to completely disable data caching. There are some online threads that suggest mmap()'ing the file, but I'm not sure that's necessary, or that it's really going to matter. This will allow the operating system's file caching to take over and manage the cache. The operating system should be much better at managing cached data than this custom RaSCSI code. It appears Linux can be tuned so that data will be written to disk within a timeout period. We'll still have the issue that data written within the last few moments before shutdown could be lost. From my limited research tonight, there are many opinions on the web that you shouldn't manually cache files in RAM anyway. RaSCSI doesn't do anything elaborate like trying to look ahead. So, there really is no reason to have its own caching scheme. (Running on bare metal might be a different story, but that support was removed from our code fork.) Let's keep the discussion going on this issue if anyone has a chance to do that experiment. I'll dig into it when I have a chance, but I'm not going to have a ton of time to investigate in the near term. |
There is a solution for Atari users who would like to flush the RaSCSI cache when a drive is idle: #644 flushes the cache on STOP UNIT. There is software for the Atari (AUTOPARK from the HDDRIVER distribution) that sends STOP UNIT to drives that have not been accessed for a configurable time in seconds. (With the next access this tool sends START UNIT for the respective drives.) Provided that STOP UNIT flushes the cache, the use of this software resolves the problem with the RaSCSI cache that this ticket tries to address, at least for the Atari platform. In addition, #645 flushes the cache on SYNCHRONIZE CACHE. Currently SYNCHRONIZE CACHE is doing nothing. |
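The behaviour described for #644 and #645 can be sketched roughly as follows. This is a Python illustration, not the RaSCSI implementation: `TrackCache` and `handle_command` are hypothetical names. The opcodes are the standard ones (START STOP UNIT is 0x1B, with the START flag in bit 0 of byte 4; SYNCHRONIZE CACHE (10) is 0x35).

```python
SYNCHRONIZE_CACHE_10 = 0x35
START_STOP_UNIT = 0x1B
START_BIT = 0x01  # bit 0 of CDB byte 4: 1 = start, 0 = stop


class TrackCache:
    """Hypothetical stand-in for a write-back cache."""

    def __init__(self):
        self.dirty = {}      # lba -> data not yet written out
        self.persisted = {}  # lba -> data considered written

    def write(self, lba, data):
        self.dirty[lba] = data

    def flush(self):
        self.persisted.update(self.dirty)
        self.dirty.clear()


def handle_command(cdb: bytes, cache: TrackCache) -> bool:
    """Flush on SYNCHRONIZE CACHE or STOP UNIT; return True if flushed."""
    opcode = cdb[0]
    if opcode == SYNCHRONIZE_CACHE_10:
        cache.flush()
        return True
    if opcode == START_STOP_UNIT and not (cdb[4] & START_BIT):
        cache.flush()
        return True
    return False
```

With this shape, a host-side tool that parks idle drives (as AUTOPARK is described as doing) indirectly bounds how long dirty data stays in the cache.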
I ran an experiment tonight and completely removed the RAM cache. Instead, my updates use mmap to virtually map the file into memory. This should allow the linux disk cache management to work its magic without RaSCSI trying to out-smart it. It appears from the preliminary benchmark results that the performance isn't impacted significantly. I welcome anyone to look at the new branch I created https://github.com/akuker/RASCSI/tree/bug_335_cache_fix There are probably more file systems tweaks that should be made, but this is a starting point. @uweseimet - I'd welcome your opinion on this change. (Well, everyone's opinions ;) ) |
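The idea of mapping the image file instead of keeping a private RAM cache can be sketched in Python as below. This is only an illustration of the technique, not the code on the branch; `write_sector_mmap` and the 512-byte sector size are assumptions, and a real implementation would keep one mapping open rather than remapping on every write.

```python
import mmap

SECTOR_SIZE = 512  # assumed sector size for this sketch


def write_sector_mmap(path: str, lba: int, data: bytes) -> None:
    """Write one sector through a memory mapping of the image file."""
    if len(data) != SECTOR_SIZE:
        raise ValueError("data must be exactly one sector")
    with open(path, "r+b") as f:
        # Length 0 maps the whole (non-empty) file.
        with mmap.mmap(f.fileno(), 0) as m:
            start = lba * SECTOR_SIZE
            m[start:start + SECTOR_SIZE] = data
            m.flush()  # ask the OS to write modified pages back to the file
```

Note that mapping an entire image needs a contiguous range of virtual address space of the same size, which is limited on a 32-bit system.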
mmm, can't mount 2 x 1GB volumes at the same time. First one is OK, but crashes on 2nd one (this is the cause of the above log). I have created 2 x 200MB disks, and I can mount these at the same time. Did a test copying to these and it stops using memory when the free memory gets below 10% (about 43MB on the pi zero). Maybe this mmap method is a bit unreliable, and a potential memory hog. @akuker How hard is it to replace the RaSCSI caching with just plain file I/O, and not use mmap? |
It would be pretty easy, actually. I can work on that next. For now, I noticed that we never actually gave RaSCSI a high priority in the system. I'm assuming that for most people, RaSCSI is the most important (as far as real-time) process. Using the 'nice' utility to give rascsi a higher priority, the performance is as good as the original RAM cache approach. @Pacjunk - could you try out the latest change in this file? https://github.com/akuker/RASCSI/blob/bug_335_cache_fix/src/raspberrypi/os_integration/rascsi.service (That file needs to be copied to /etc/systemd/system. Then run |
No real difference, maybe slightly slower. I notice that the priority for the rascsi process was marked as "rt" before the change was made. Also, on the crash noted above, I cannot mount a single 2GB image (which works fine on other branches). Smaller ones are fine. |
Testing out the fopen/fseek approach right now :] |
@Pacjunk - An updated version using posix file i/o calls is available on the bug_335_cache_fix_selectable branch. I created a new branch so that I didn't mess up the original mmap version. (not sure if I'll ever need to go back to it, but holding onto it for now) Feel free to give this a try! If you want to switch between versions, update Note: you'll need to |
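For comparison with the mmap variant, direct positional file I/O can be sketched like this in Python. It is not the code on the branch; the function names and the 512-byte sector size are assumptions, and `os.pread`/`os.pwrite` are available on Unix-like systems only.

```python
import os

SECTOR_SIZE = 512  # assumed sector size for this sketch


def open_image(path: str) -> int:
    """Open an image file for read/write and return its file descriptor."""
    return os.open(path, os.O_RDWR)


def read_sector(fd: int, lba: int) -> bytes:
    """Read one sector at the given logical block address."""
    return os.pread(fd, SECTOR_SIZE, lba * SECTOR_SIZE)


def write_sector(fd: int, lba: int, data: bytes) -> None:
    """Write one sector at the given logical block address."""
    if len(data) != SECTOR_SIZE:
        raise ValueError("data must be exactly one sector")
    os.pwrite(fd, data, lba * SECTOR_SIZE)
```

Here the only cache involved is the kernel's page cache, so no application-level flush step exists and no large mapping has to be set up for big images.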
OK, rebuilding now. Thanks |
OK, with your new code, I'm getting 298KB/sec or 292KB with the dirty parameters set down. I also no longer have issues with 2GB+ images crashing the service. BTW I'm only getting ~330KB to a real HDD, so performance is getting up there. I probably should run some tests on the Alpha which has much faster disks/cpu/ram etc. |
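Assuming the "dirty parameters" mentioned here are the Linux `vm.dirty_*` sysctls, the current writeback timing can be inspected with a small Python sketch like the following. `read_dirty_timeouts` is a hypothetical helper; it reads `dirty_expire_centisecs` and `dirty_writeback_centisecs` from `/proc/sys/vm` and converts them to seconds.

```python
import os

VM_DIR = "/proc/sys/vm"
SETTINGS = ("dirty_expire_centisecs", "dirty_writeback_centisecs")


def read_dirty_timeouts(vm_dir: str = VM_DIR) -> dict:
    """Return each readable setting converted from centiseconds to seconds."""
    result = {}
    for name in SETTINGS:
        try:
            with open(os.path.join(vm_dir, name)) as f:
                result[name] = int(f.read().strip()) / 100.0
        except (OSError, ValueError):
            continue  # setting not available on this system
    return result


if __name__ == "__main__":
    for name, seconds in read_dirty_timeouts().items():
        print(f"{name}: {seconds} s")
```

Lower values make the kernel write dirty pages back sooner, at the cost of more frequent writes to the SD card.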
Quite funny: While implementing the SCSI printer device I am stumbling upon code where the current caching approach causes issues: Caching only supports multiples of 512 byte blocks, i.e. if a data chunk is not a multiple of 512 bytes neither reading nor writing will work with the current code. |
Hello, I'm running into problems connected to this issue. My RaSCSI is connected to the "CSS BlackBox", a SCSI interface for the Atari 8-bit series. I can configure an HDD image and configure the BlackBox to use it. First I need to perform a low-level format, then create partitions, and finally initialize those partitions in order to use them with the different DOS versions of the XL/XE. This is all very low level, and no hard disk driver suite like Uwe's HDDriver exists. There is a PARK.COM program, but it doesn't seem to prompt the RaSCSI to flush the cache, so most data written to the RaSCSI HDD image is lost after power off. It's very cumbersome to create a network, get both the Pi and a PC connected, and then shut down the Pi properly before switching off the Atari. Perhaps someone could be so kind as to explain to me how to switch the caching off. Thank you for reading |
@beetle060 Can you please check which SCSI commands PARK.COM sends? In order to do this start rascsi with the "-L trace" option. When attaching the log please only attach the part of the logfile that contains the commands sent by PARK.COM. When you shut down rascsi or the Pi (either with rasctl, the web UI or the RaSCSI control app) all caches are flushed automatically. There is no way to switch off the cache. IMHO the current cache implementation is not useful, and it even prevents progress on some open tickets. |
I am new to github and to the project, i just downloaded the readily configured Pi image. Not sure which version it runs. I only used the WebUI to configure a disk image, and i configured the Pi to have ssh access enabled so i can login to the shell from my PC. Maybe i can configure serial login from my Atari XL to Pi via a RS232-USB converter. I need to figure out how to start RaSCSI with a commandline option and where to find the logs. I can look into it tomorrow morning. PS: i just found out about Rascsi-Control and bought it. that should make life easier already. But still, a SCSI Harddisk will not keep several tracks of data in the cache for longer periods of time - so shouldn't RaSCSI |
@beetle060 If you still have issues with the latest release when using the regular RaSCSI/Pi shutdown offered by RaSCSI Control, I would need the rascsi logfile in order to investigate further. You can set the log level to "trace" within the RaSCSI Control app, for instance. |
Where do i find the rascsi log file? Seems to be so obvious that i cannot find any info, even on the wiki. I spent the last hour looking for it... |
@rdmark Do you know the default logfile location on the Pi? I always use the console log. @beetle060 I assume it is somewhere in /var/log. |
OK, found it. I loaded DOS and entered the directory with PARK.COM. Then I switched the log level to "trace" and loaded PARK.COM, which then parked the drive. PS: the emulated HDD is a 20MB Miniscribe from the existing drives library. My Atari detected an HDD "Miniscribe" with 8 heads, 205 cylinders and 82006 sectors. |
@beetle060 I suggest that you create a new github issue and then add the logfile and the issue description there, because this is a separate issue, which is not related to the current ticket. We should not mix unrelated issues. |
OK, thanks for the advice |
Any update on this issue? I have bumped into this a couple times when forgetting to shutdown my pi cleanly. |
Unfortunately, no. I will not be able to address it soon. Hopefully one of the other devs can pick it up. |
Info
Describe the issue
In my testing I have noticed that raSCSI only flushes the disk cache(s) either when it is full, or at shutdown (thanks to a recent change). This potentially leads to unflushed data sitting in the cache for extended periods when the cache is not full.
The following highlights the issue:
Someone probably needs to check this on a Mac, as I'm surprised this hasn't been noticed before if it occurs there. The unclean shutdown scenario is more common than you would think, especially when the raSCSI is wired to the host's power supply. People would just flick the switch without remembering to shut down the Raspberry Pi.
Now the cache is there for performance reasons and it should also not be flushed at random times as it may slow current I/O.
I'm not familiar with the raSCSI code or Pi programming in general, but here is how I see something being implemented:
If the system is not idle, then this should not add too much overhead. I suppose it could break if an OS is constantly pinging a disk (e.g. a quorum disk). The checking interval may need to be fine-tuned in this case.
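One way to picture an idle-triggered flush of this kind is the Python sketch below. It is an assumption about how such a check could look, not the proposal's actual steps or RaSCSI code; `IdleFlushCache`, `poll`, and the 2-second default are hypothetical.

```python
import time


class IdleFlushCache:
    """Write-back cache that flushes once no I/O has occurred for a while."""

    def __init__(self, flush_fn, idle_seconds=2.0, clock=time.monotonic):
        self._flush_fn = flush_fn  # called with a dict of lba -> data
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._dirty = {}
        self._last_io = clock()

    def write(self, lba, data):
        self._dirty[lba] = data
        self._last_io = self._clock()  # any I/O resets the idle timer

    def poll(self) -> bool:
        """Call periodically; flush if dirty and idle. Return True if flushed."""
        idle_for = self._clock() - self._last_io
        if self._dirty and idle_for >= self._idle_seconds:
            self._flush_fn(dict(self._dirty))
            self._dirty.clear()
            return True
        return False
```

Because the check is cheap when nothing is dirty or the device is busy, the main tuning knobs are the polling interval and the idle threshold. A host that touches the disk more often than the threshold would never let it flush, which matches the quorum-disk concern above.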
Another option might be to allow the use of write-through caching. This would probably have a performance hit on slower SD cards, but at least the data would be consistent.
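A write-through policy can be sketched as follows in Python. This is an illustration under assumptions, not RaSCSI code: `WriteThroughCache` is a hypothetical class, the sector size is assumed to be 512 bytes, and `O_SYNC` is used where the platform provides it so that each write is also pushed past the kernel's page cache.

```python
import os

SECTOR_SIZE = 512  # assumed sector size for this sketch


class WriteThroughCache:
    """Cache that serves reads from RAM but never holds unwritten data."""

    def __init__(self, path: str):
        # O_SYNC (if available) makes each write wait for the device.
        flags = os.O_RDWR | getattr(os, "O_SYNC", 0)
        self._fd = os.open(path, flags)
        self._cache = {}

    def write(self, lba: int, data: bytes) -> None:
        if len(data) != SECTOR_SIZE:
            raise ValueError("data must be exactly one sector")
        os.pwrite(self._fd, data, lba * SECTOR_SIZE)  # file first
        self._cache[lba] = data                       # then the cache

    def read(self, lba: int) -> bytes:
        if lba not in self._cache:
            self._cache[lba] = os.pread(self._fd, SECTOR_SIZE, lba * SECTOR_SIZE)
        return self._cache[lba]

    def close(self) -> None:
        os.close(self._fd)
```

Reads still benefit from the cache, while every write pays the full latency of the SD card, which is the performance trade-off mentioned above.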