Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DO NOT MERGE][LevelDB] Do no crash if filesystem can't fsync #10000

Closed
wants to merge 1 commit into from

Conversation

NicolasDorier
Copy link
Contributor

@NicolasDorier NicolasDorier commented Mar 15, 2017

Solve #9996.
There should be some kind of warning if fsync is not supported. It is unclear how to implement that easily though since we don't have access to any logger here.

I encountered the problem as I could not run bitcoin on a network share on cifs. (I could previously, which is strange)

It seems that https://www.kernel.org/doc/Documentation/filesystems/cifs/CHANGES Version 1.57 seems to indicate that cifs should not fail, but just not guarantee that the server properly synched on the drive. This is not what I am experiencing.

Ping @laanwj

@laanwj
Copy link
Member

laanwj commented Mar 15, 2017

To be clear: this happens only happens while syncing a directory handle in SyncDirIfManifest(), syncing normal fds must still succeed.

It is unclear how to implement that easily though since we don't have access to any logger here.

I'd say we should warn once when this happens, that data syncing can not be guaranteed over network filesystems.

Even if it succeeds all that fsync can (generally) guarantee with network filesystems is that the data has been written to the server, not that the server has written it to disk

However if there is no access to a logger from there it becomes difficult...

Edit: Likely we want to run this through https://github.com/bitcoin-core/leveldb (but it's fine to have the PR open here for visibility, it just shouldn't be merged directly).

@luke-jr
Copy link
Member

luke-jr commented Mar 15, 2017

Shouldn't we at least log a warning in this case? Maybe even fail if pruning is enabled? (Presumably it means a crash is likely to corrupt the index?)

@NicolasDorier
Copy link
Contributor Author

It should not crash. There is reasons to use shared folder, one is in my posted issue.
If there is a corruption, then the user need to reindex. This is strictly better than just preventing to run at all.

I agree for a warning. We can attempt fsync into bitcoin code on a dummy folder to see if it is enabled, and if not, provide a warning.

@NicolasDorier NicolasDorier changed the title [LevelDB] Do no crash if filesystem can't fsync [DO NOT MERGE][LevelDB] Do no crash if filesystem can't fsync Mar 15, 2017
@luke-jr
Copy link
Member

luke-jr commented Mar 15, 2017

Pruning nodes don't have data to reindex from. I know if I downloaded 100 GB, and then had to re-do the download due to this, I'd be pretty annoyed... Better to find out sooner so I can use a different filesystem and avoid the re-download.

@laanwj
Copy link
Member

laanwj commented Mar 15, 2017

No, it should never crash/fail on this. If people run a datadirectory on a network FS they take some extra risk. There's nothing that we can do in that case (apart from warn).

@laanwj
Copy link
Member

laanwj commented Mar 15, 2017

I agree for a warning. We can attempt fsync into bitcoin code on a dummy folder to see if it is enabled, and if not, provide a warning.

Yeah - maybe just try fsync() on the data directory at startup? No need even to make a dummy dir.

@NicolasDorier
Copy link
Contributor Author

NicolasDorier commented Mar 15, 2017

for using fsync on the data directory I need to use LEVELDB_PLATFORM_WINDOWS and port.h of level db in dbwrapper.h it feels dirty but I do not see another way. Any suggestion ?

Another way is to test in DB::Open, log the warning, and then make special case in the CBitcoinLevelDBLogger that if the log message start is "Warning, fsync directory disabled...", it should log as error. But it also feels dirty.

@laanwj
Copy link
Member

laanwj commented Mar 15, 2017

for using fsync on the data directory I need to use LEVELDB_PLATFORM_WINDOWS and port.h of level db in dbwrapper.h it feels dirty but I do not see another way. Any suggestion ?

To do that, best is probably to check separately fsync() is available in configure.ac. LevelDB compilation flags are not available while compiling our files.

Another way is to test in DB::Open, log the warning, and then make special case in the CBitcoinLevelDBLogger that if the log message start is "Warning, fsync directory disabled...", it should log as error. But it also feels dirty.

Yes. Unfortunately they don't deal in log levels. Making an exception for a single message is indeed ugly. You could do it linux-kernel printk style and add a prefix (for example =e=<message>) to promote something to error.

@NicolasDorier
Copy link
Contributor Author

NicolasDorier commented Mar 16, 2017

The easiest would be prefixed log.
I am wondering if logging it really matter actually. As it is failure of fsync happens only if it is a folder. fdatasync works as expected.

The fsync on a folder is only done if Sync is called on a MANIFEST file. Which seems to be rather rare. Will check my assertion soon.

@laanwj
Copy link
Member

laanwj commented Mar 16, 2017

The fsync on a folder is only done if Sync is called on a MANIFEST file. Which seems to be rather rare. Will check my assertion soon.

The Linux manpage says

Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed.

So this seems only important after adding new files (or deleting them), which is probably why it's only done on MANIFEST changes.

@NicolasDorier
Copy link
Contributor Author

So I think we do not really to show a log as this is rare occurence. (I want to still check that with dbcache to 0 though)

I also want to note also that on windows, running bitcoin core from a shared SMB folder works fine.

@NicolasDorier
Copy link
Contributor Author

I think in any case we should merge it. Windows version works fine on SMB file system, only the linux version crash.

@laanwj
Copy link
Member

laanwj commented Mar 17, 2017

@NicolasDorier I agree

@marukka
Copy link

marukka commented Mar 17, 2017

@NicolasDorier What exactly are you doing to get the Windows version to work on a SMB share? In bug #10018 I had the exact opposite experience using a SMB share from a Server 2016 File Server Failover Cluster with a Server 2016 Remote Desktop Services server publishing Bitcoin Core as a RemoteApp. Did you map the share as a drive letter as opposed to using a UNC path?

@NicolasDorier
Copy link
Contributor Author

Did you map the share as a drive letter as opposed to using a UNC path?

Yes I did. Indeed // path did not work, I think this is why I did it.

@NicolasDorier
Copy link
Contributor Author

Where to ACK that, here or in bitcoin-core/leveldb-subtree#16 ?

@NicolasDorier
Copy link
Contributor Author

merged on bitcoin-core/leveldb-subtree#16

laanwj added a commit to bitcoin-core/leveldb-subtree that referenced this pull request Nov 6, 2019
This code moved, re-apply the fix elsewhere.

See
- bitcoin/bitcoin#10000
- bitcoin-core/leveldb-old#16

Original change by Nicolas Dorier.
laanwj added a commit to bitcoin-core/leveldb-subtree that referenced this pull request Nov 6, 2019
This code moved, re-apply the fix elsewhere.

See
- bitcoin/bitcoin#10000
- bitcoin-core/leveldb-old#16

Original change by Nicolas Dorier.
laanwj added a commit to bitcoin-core/leveldb-subtree that referenced this pull request Nov 6, 2019
This code moved, re-apply the fix elsewhere.

See
- bitcoin/bitcoin#10000
- bitcoin-core/leveldb-old#16

Original change by Nicolas Dorier.
laanwj pushed a commit to bitcoin-core/leveldb-subtree that referenced this pull request Nov 7, 2019
This code moved, re-apply the fix elsewhere.

See
- bitcoin/bitcoin#10000
- bitcoin-core/leveldb-old#16

Original change by Nicolas Dorier, ported to leveldb 1.22
by Wladimir J. ban der Laan.
laanwj pushed a commit to bitcoin-core/leveldb-subtree that referenced this pull request Nov 7, 2019
This code moved, re-apply the fix elsewhere.

See
- bitcoin/bitcoin#10000
- bitcoin-core/leveldb-old#16

Original change by Nicolas Dorier, ported to leveldb 1.22
by Wladimir J. van der Laan.
richmills3 pushed a commit to nchain-research/leveldb that referenced this pull request Mar 12, 2021
This code moved, re-apply the fix elsewhere.

See
- bitcoin/bitcoin#10000
- bitcoin-core/leveldb-old#16

Original change by Nicolas Dorier, ported to leveldb 1.22
by Wladimir J. van der Laan.
richmills3 pushed a commit to nchain-research/leveldb that referenced this pull request Mar 17, 2021
This code moved, re-apply the fix elsewhere.

See
- bitcoin/bitcoin#10000
- bitcoin-core/leveldb-old#16

Original change by Nicolas Dorier, ported to leveldb 1.22
by Wladimir J. van der Laan.
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants