Replies: 3 comments
-
I opened issue #5274 from your comment and made a start on improving the docs (I have not addressed any of your points directly yet; perhaps you will be inspired to do that).
-
I think one point of emphasis in the HN thread, regarding the most glaring gap in the documentation, is the perspective of operating and maintaining deployed systems. What are the most common errors to encounter after, say, a power outage? What are the approaches to recovering from them?
-
To add a comment: the first thing I want to know about any storage system is how to back up, how to restore, and what happens when bad things happen, i.e. how you get up and running again. What are the likely faults and how are they resolved? I went to this page https://github.com/seaweedfs/seaweedfs/wiki/Data-Backup and there's a bunch of TODOs at the bottom. So from an outsider's perspective, I'd put a vote in for that. In every other respect SeaweedFS looks fantastic for our use case and I'm going to experiment with it, but in order to go into production I'll have to simulate a variety of failure modes and get my head around the 'right way' to recover from each of them.

There are a couple of example errors to look for on this page: https://github.com/seaweedfs/seaweedfs/wiki/weed-shell - I'd suggest breaking this out into its own separate page under Operations, called 'Errors & Recovery' or similar. If we had this in production, we'd keep a big cheat sheet of 'if this then that...', because if a primary storage system went down we'd be scrambling, and scrambling fast; there is nothing worse than trying to figure things out while systems are down...
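To make the "if this then that" cheat sheet idea concrete, here is a minimal sketch of such a runbook as a lookup table. The `weed shell` command names mentioned in the remediation strings are assumptions drawn from the wiki page linked above, not verified recommendations; check them against your SeaweedFS version before relying on them.

```python
# Minimal "if this then that" runbook sketch (illustration only).
# The weed-shell commands named below are assumptions taken from the
# wiki; verify each one against your deployed SeaweedFS version.

RUNBOOK = {
    "volume replica count below target": "run `volume.fix.replication` in weed shell",
    "suspected on-disk corruption": "run `volume.fsck` in weed shell and inspect the output",
    "disk space not released after deletes": "run `volume.vacuum` in weed shell",
}

def advise(symptom: str) -> str:
    """Return the runbook action for a known symptom, or an escalation note."""
    return RUNBOOK.get(symptom, "escalate: symptom not in runbook")

print(advise("disk space not released after deletes"))
# prints: run `volume.vacuum` in weed shell
```

In practice the value of such a sheet is exactly what the comment says: the mapping is written calmly in advance, so nobody has to reason from first principles during an outage.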
-
First, the feature set you have built is very impressive.
I think SeaweedFS would really benefit from more documentation on what exactly it does.
People who want to deploy production systems need that, and it would also help potential contributors.
Some examples:
It says "optimised for small files", but it is not super clear from the whitepaper and other documentation what that means. They mostly talk about how small the per-file overhead is, but that's not enough. For example, on Ceph I can also store 500M files without problems, but then later discover that some operations that happen only infrequently, such as recovery or scrubs, are O(files) and thus incur O(files) many seeks, which can mean 2 months of seeks for a recovery of 500M files to finish. ("Recovery" here means that a replica fails and the data is copied to another replica.)
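The "2 months" figure above checks out with simple arithmetic. As a sketch, assuming one random seek per file at roughly 10 ms on a spinning disk (my assumption, not a number from either project's docs):

```python
# Back-of-envelope check of the "2 months of seeks" claim:
# an O(files) recovery that pays one random seek per file.

def recovery_days(num_files: int, seek_ms: float = 10.0) -> float:
    """Days of pure seek time if recovery does one serial seek per file."""
    return num_files * seek_ms / 1000 / 86400

print(f"{recovery_days(500_000_000):.0f} days")  # prints: 58 days
```

This is exactly why "optimised for small files" needs a precise definition: packing many small files into one large volume file turns O(files) seeks into O(volumes) sequential reads, and the docs should say whether recovery and scrubs actually work that way.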
More on small files: Assuming small files are packed somehow to solve the seek problem, what happens if I delete some files in the middle of the pack? Do I get fragmentation (space wasted by holes)? If yes, is there a defragmentation routine?
One page https://github.com/seaweedfs/seaweedfs/wiki/Replication#write-and-read says "volumes are append only", which suggests that there will be fragmentation. But here I need to piece together info from different unrelated pages in order to answer a core question about how SeaweedFS works.
https://github.com/seaweedfs/seaweedfs/wiki/FAQ#why-files-are-deleted-by-disk-spaces-are-not-released suggests that "vacuum" is the defragmentation process. It says vacuum triggers automatically when the deleted-space overhead reaches 30%. But what are the performance implications of a vacuum? Can it take a long time and block some data access? That would be the immediate next question for any operator.
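Piecing the two pages together, the model seems to be: volumes are append-only, deletes only mark space as garbage, and vacuum compacts live data once garbage passes a threshold. Here is an illustrative toy model of that behavior (my reading of the wiki, not SeaweedFS source; the 30% default is from the FAQ page above):

```python
# Toy model (not SeaweedFS internals) of an append-only volume where
# deletes create holes and a vacuum/compaction reclaims them.

class Volume:
    def __init__(self):
        self.total = 0    # bytes ever appended to the volume file
        self.garbage = 0  # bytes belonging to deleted files (holes)

    def append(self, size: int):
        self.total += size

    def delete(self, size: int):
        self.garbage += size  # append-only: space is only marked dead

    def garbage_ratio(self) -> float:
        return self.garbage / self.total if self.total else 0.0

    def needs_vacuum(self, threshold: float = 0.3) -> bool:
        # FAQ: vacuum triggers automatically around 30% deleted space
        return self.garbage_ratio() >= threshold

    def vacuum(self):
        # compaction rewrites live data into a fresh file, dropping holes
        self.total -= self.garbage
        self.garbage = 0

v = Volume()
for _ in range(10):
    v.append(100)
for _ in range(3):
    v.delete(100)
print(v.needs_vacuum())  # prints: True  (300/1000 = 30% garbage)
```

What the model cannot answer, and what the docs should: the cost of the rewrite step, whether reads and writes to the volume are blocked while it runs, and how long it takes for a full volume.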
Scrubs and integrity: It is common for redundant-storage systems (md-RAID, ZFS, Ceph) to detect and recover from bitrot via checksums and cross-replica comparisons. This requires automatic, regular inspections of the stored data ("scrubs"). For SeaweedFS I can find no docs about this, only some GitHub issues (https://github.com/seaweedfs/seaweedfs/issues?q=scrub) suggesting that there is some script that runs every 17 minutes. But looking at that script, I can't find which command performs the "repair" action. Note that just having checksums is not enough to prevent bitrot: it helps detect it, but does not guarantee that the target number of replicas is brought back up (it may take years until you read some data again). For that, regular scrubs are needed.
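To illustrate the distinction being drawn here, a scrub is an active sweep that recomputes checksums and flags replicas for repair, rather than waiting for a read to stumble over the corruption. A minimal sketch of the detection half (my illustration; SeaweedFS's actual mechanism is exactly what is undocumented):

```python
# Sketch of checksum-based scrub detection (illustration, not SeaweedFS code).
import hashlib

def scrub(replicas: dict[str, bytes], expected_sha: str) -> list[str]:
    """Return names of replicas whose content no longer matches the checksum."""
    return [name for name, data in replicas.items()
            if hashlib.sha256(data).hexdigest() != expected_sha]

good = b"needle data"
expected = hashlib.sha256(good).hexdigest()
replicas = {"srv1": good, "srv2": b"needle dat\x00"}  # srv2 suffered bitrot
print(scrub(replicas, expected))  # prints: ['srv2']
```

The key point from the comment survives in the sketch: detection alone is not repair. Something must then copy `srv1`'s good data back over `srv2`, and it must do so proactively on a schedule, not only when a client happens to read the file.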
Filers: For a production store with a highly available POSIX FUSE mount, I need to choose a suitable Filer backend. There's a useful page about these at https://github.com/seaweedfs/seaweedfs/wiki/Filer-Stores. But there are many of them, and the information is limited to ~8 words per backend. To know how a backend will perform, I need to know both the backend well and how SeaweedFS will use it. I will also be subject to the operational workflows of that backend; e.g. running and upgrading a large HA Postgres is unfortunately not easy. As another example, Postgres itself does not scale beyond a single machine unless one uses something like Citus, and I have no info on whether SeaweedFS will work with that.
The word "Upgrades" seems to go unmentioned in the Wiki and README. How are forward and backward compatibility handled? Can I just switch SeaweedFS versions forward and backward and expect everything to work automatically? For Ceph, there are usually detailed instructions on how to upgrade a large cluster and its clients.
In general the way this should be approached is: Pretend to know nothing about SeaweedFS, and imagine what a user that wants to use it in production wants to know, and what their followup questions would be.
Some parts of that are partially answered in the presentations, but it is difficult to piece together how the software currently works from presentations of different ages (maybe they are already outdated?), and the presentations are also quite light on information (usually only one slide per topic). I think the GitHub Wiki is a good way to do it, but it, too, is too light on information, and I'm not sure it covers everything that's in the presentations.
I understand the README already says "more tools and documentation", I just want to highlight how important the "what does it do and how does it behave" part of documentation is for software like this.