Windows mount support #5003

Closed
djdv opened this issue May 7, 2018 · 85 comments
Labels
topic/windows Windows specific

Comments

@djdv
Contributor

djdv commented May 7, 2018

Tracking/discussion issue for the topic of implementing ipfs mount on Windows.
This is a broad topic, all comments and criticisms are welcome here as long as they help drive us towards a solution. Nothing is set in stone yet and we'd like to implement this correctly.

Currently we have no first party support for this:

Error: Mount isn't compatible with Windows yet

For third party, I'm aware of these projects:
https://github.com/alexpmorris/dipfs
https://github.com/richardschneider/net-ipfs-mount
Both utilize the IPFS API and Dokany (a Windows FUSE alternative).

dipfs does not appear to be maintained.
In my experience it succeeds in mounting IPFS as a drive letter and lets you traverse IPFS paths via the CLI. However, it hangs in Explorer (likely because Explorer tries to access metadata) and whenever you try to read data through any means.

net-ipfs-mount is currently being maintained.
This is our best contender. Everything appears to work as intended unless you have a non-tiny pinset. When traversing IPFS paths with a large enough pinset, passing the list of pins from the API to net-ipfs-mount can take long enough for Windows to deem /ipfs inaccessible.
For local testing you can run ipfs pin ls and see how long it takes to return.
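
One way to see how a slow pin listing turns into an "inaccessible" mount is to race the listing against a deadline. The following is only a sketch in Go: pinLister, fakeLister, listWithDeadline, errTooSlow and the time limits are hypothetical names and values chosen for illustration, not part of go-ipfs or net-ipfs-mount, and the fake lister stands in for whatever call actually returns the pins.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pinLister is a hypothetical stand-in for any call that returns the pin list.
type pinLister func(ctx context.Context) ([]string, error)

// errTooSlow is returned when the listing does not finish within the limit.
var errTooSlow = errors.New("pin listing exceeded the time limit")

// fakeLister simulates a listing that takes `delay` to return `pins`.
func fakeLister(delay time.Duration, pins ...string) pinLister {
	return func(ctx context.Context) ([]string, error) {
		select {
		case <-time.After(delay):
			return pins, nil
		case <-ctx.Done():
			return nil, ctx.Err()
		}
	}
}

// listWithDeadline runs list and returns the pins and the elapsed time,
// or errTooSlow if the limit expires first.
func listWithDeadline(list pinLister, limit time.Duration) ([]string, time.Duration, error) {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()

	type result struct {
		pins []string
		err  error
	}
	done := make(chan result, 1) // buffered so the goroutine never blocks
	start := time.Now()
	go func() {
		pins, err := list(ctx)
		done <- result{pins, err}
	}()

	select {
	case r := <-done:
		if r.err != nil && ctx.Err() != nil {
			// The lister gave up because the deadline passed.
			return nil, time.Since(start), errTooSlow
		}
		return r.pins, time.Since(start), r.err
	case <-ctx.Done():
		return nil, time.Since(start), errTooSlow
	}
}

func main() {
	pins, elapsed, err := listWithDeadline(fakeLister(50*time.Millisecond, "QmExamplePin"), time.Second)
	fmt.Println(len(pins), elapsed, err)

	_, _, err = listWithDeadline(fakeLister(time.Second, "QmExamplePin"), 10*time.Millisecond)
	fmt.Println(err)
}
```

Swapping fakeLister for a real API call would give a rough idea of whether a given pinset fits inside whatever limit the host imposes.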


I would like to start an initiative, aimed at getting Windows mounting on par with the other platforms.
That is to say, at the very least, exposing read only access to /ipfs and /ipns constructs.
If possible, it would be nice to extend the feature set to expose a writable MFS root as well, similar to this https://github.com/tableflip/ipfs-fuse

Most likely, this will mean implementing first party support for mount in go-ipfs, utilizing core APIs where possible.

Mention of https://github.com/billziss-gh/winfsp has come up as an alternative to Dokany. It's likely that winfsp's native API will be our target, however this is not locked down yet. If you have opinions on Windows filesystem APIs, positive or negative, please post them here.

cc:
@mgoelzer, @alanshaw, @dryajov, @mrlindblom, @Kubuxu, @whyrusleeping

Edit: forgot to cc: @alexpmorris, @richardschneider

@xelra

xelra commented May 14, 2018

My vote is for WinFsp.

It has an active developer and in my opinion is the best FUSE alternative for Windows. I'm using it daily in conjunction with SSHFS.

@billziss-gh

Adding myself here so that I can receive notifications and provide help.

@billziss-gh

billziss-gh commented May 16, 2018

I add here some comments I made over private email regarding the task of creating a Go shim for WinFsp.

Some initial thoughts on the kind of tasks you will have to handle for a Go shim. The WinFsp Tutorial may also be useful to you as it lists the major tasks needed for a successful file system using the native API.

  • Find and load the WinFsp DLL.
    • This is normally done using the inline C function FspLoad, which looks into the registry to find where the WinFsp DLL is installed and loads it. This will have to be rewritten in Go.
  • WinFsp includes API’s for creating Windows services, etc. You do not need to use those if you do not plan to run IPFS as a Windows service, but if you do they are named FspService*.
  • You create a file system by using FspFileSystemCreate and you mount it using FspFileSystemSetMountPoint. You can mount on a drive or directory, but mounting on a directory requires a case-insensitive file system under Windows.
  • The file system starts handling operations when you call FspFileSystemStartDispatcher. This will start a number of threads and will call your file system operations on those threads. The default number of threads is the same as the number of processors (with a minimum of 2).
  • Most of the file system operations should be relatively easy to understand, especially if you have experience with FUSE. Of notable difference is how files are deleted on Windows and the security model (which uses security descriptors and not POSIX permissions).

I am happy to elaborate on any of this.
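
As a rough illustration of the first step (finding the DLL), here is a sketch in Go of the path computation only. It rests on assumptions that should be checked against the WinFsp headers: that the install directory comes from an InstallDir value under the WinFsp registry key, that the DLL lives in a bin subdirectory, and that the files are named winfsp-x64.dll and winfsp-x86.dll. The helper names dllName and dllPath are made up for this example, and the registry read itself is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// dllName maps a GOARCH value to the WinFsp DLL file name.
// Assumption: the DLLs follow the winfsp-x64.dll / winfsp-x86.dll naming.
func dllName(goarch string) (string, bool) {
	switch goarch {
	case "amd64":
		return "winfsp-x64.dll", true
	case "386":
		return "winfsp-x86.dll", true
	default:
		return "", false
	}
}

// dllPath builds <installDir>\bin\<dll>. installDir is expected to be the
// value read from the registry (assumed here, not read by this sketch).
func dllPath(installDir, goarch string) (string, bool) {
	name, ok := dllName(goarch)
	if !ok || installDir == "" {
		return "", false
	}
	// Backslashes are joined by hand so the result is the same on any host OS.
	return strings.TrimRight(installDir, `\`) + `\bin\` + name, true
}

func main() {
	p, ok := dllPath(`C:\Program Files (x86)\WinFsp\`, "amd64")
	fmt.Println(p, ok)
}
```

On Windows the resulting path could then be handed to something like syscall.LoadDLL, after which the remaining steps (create, set mount point, start dispatcher) would go through procedures looked up in that DLL.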

@djdv
Contributor Author

djdv commented May 16, 2018

Much appreciated @billziss-gh. Very glad to have your support with this, and a clear outline of what's necessary. 👍

WinFsp includes API’s for creating Windows services

I have considered packaging go-ipfs in an MSI that optionally installs a service for the ipfs daemon. There seems to be some support for this in golang already but it's nice to know about interactions with services and WinFSP. We may be able to take advantage of it.

@billziss-gh

NP. Feel free to ping anytime and I will continue monitoring this issue.

WinFsp includes API’s for creating Windows services

I have considered packaging go-ipfs in an MSI that optionally installs a service for the ipfs daemon.

While I do not understand the internals of IPFS, in general file systems benefit from being run as Services under Windows. I outline some of the reasons here:

https://github.com/billziss-gh/winfsp/wiki/WinFsp-Service-Architecture

@djdv
Contributor Author

djdv commented May 16, 2018

No going back now 👀
winfsp

@BillDStrong

https://github.com/billziss-gh/cgofuse is a Go library for FUSE maintained by the WinFsp maintainer. It might be best in the long run to use it to create one ipfs-mount that works on all platforms. WinFsp has a FUSE compatibility mode when used with a Cygwin DLL. This would also allow decoupling the mount functionality somewhat.

@djdv
Contributor Author

djdv commented May 17, 2018

I should mention I'll be dumping my work here from time to time
https://github.com/djdv/go-ipfs/tree/feat/win-mount
It will probably be ugly for a while. I'll divvy up the source files and refactor after things have at least a basic implementation. That is to say, create the binds, and rework them to be more go-like where/when possible.

@BillDStrong
I agree that it would be better to have a unified mount implementation, however, I don't want to introduce a C compiler as a build dependency if it can be avoided.
We can write an independent Windows variant now, wait for the *nix version to be refactored, and maybe derive an interface from commonality between them if it's desirable. We'll have to see how redundant it would be and if there'd be any benefit in making porters implement another abstraction layer or just be independent.

For the moment, I'm dumping this in /mount/winfsp, we may move /fuse into there and have a structure like this:

mount/
    fuse/
    interface/
    winfsp/

or

mount/
    fuse/
    winfsp/
    mount_interface_and_friends.go

We'll have to see in time.

@BillDStrong

It looks like cgofuse has a cross-compilation option based on xgo, which appears to include the necessary Windows WinFsp libraries. This would still add a dependency on a C compiler, but xgo's ease of use should be enough to offset that downside. It runs in a Docker container, so compilation shouldn't require too much setup over the current workflow.

This is just my opinion, though.

Prerequisites: docker, xgo
Build:

$ docker pull billziss/xgo-cgofuse
$ go get -u github.com/karalabe/xgo
$ cd YOUR-PROJECT-THAT-USES-CGOFUSE
$ xgo --image=billziss/xgo-cgofuse \
    --targets=darwin/386,darwin/amd64,linux/386,linux/amd64,windows/386,windows/amd64 .

@Kubuxu
Member

Kubuxu commented May 18, 2018

@BillDStrong thanks for showing me xgo. A few blockers: 1. teaching https://github.com/ipfs/distributions how to use xgo 2. making sure that Jenkins can handle cgo on Windows.

@djdv
Contributor Author

djdv commented May 18, 2018

Excluding the build dependencies, I'm concerned with some of the API differences
http://www.secfs.net/winfsp/develop/native-api-vs-fuse/

I'd rather not target FUSE on Windows if it means making compromises. Since we already have FUSE for the other platforms, it feels like we're not restricted to it (on Windows) now.
It may still be best to have separate, but native, implementations, giving us as much compatibility (and later flexibility) with the host as we can get.

Thoughts on this? @BillDStrong @Kubuxu @billziss-gh

@billziss-gh

TLDR

Cgofuse gets you nice cross-platform compatibility (on the 3 OS'es) but at the cost of having some major external dependencies (native toolchain or docker+xgo). These may or may not be appropriate for a large project with an established community and expectations.

In detail

I can see both sides of the argument.

I originally created cgofuse to support rclone. As I was new with Golang at the time, I followed @ ncw 's pointers and used cgo to create a single layer that would run on the 3 OS'es.

I find that cgofuse has been a success as it allows the creation of a file system that runs on the 3 OS'es with relative ease. For example, I use cgofuse in my fledgling project objfs (currently pre-alpha) and I do not have to think very much about cross-platform issues. This benefit outweighs for me any potential negatives.

Now for those negatives:

  • Since cgofuse is a cgo project, the native toolchain is required for every user of cgofuse. @ ncw and I tried quite hard to work with precompiled binaries so that rclone would not need to include a native toolchain in its build process. Unfortunately this did not seem possible (at least as of Go 1.8, which was current at the time of the experimentation).

  • Xgo works, but is not the same as the familiar go build. So I find that in practice I do not use it as much as I thought I would. (I also tend to have the right compilers at the ready on all my machines/VM's, so go build is easier for me.) And of course xgo requires Docker and the xgo-cgofuse docker image.

  • Cgofuse currently works on "the 3 OS'es" (Windows, macOS and Linux). I am hoping to eventually extend to at least FreeBSD, but this does limit the target platforms.

Additional considerations for go-ipfs

  • An additional negative for go-ipfs is that it appears to have a large community, which might balk at the idea of introducing a native toolchain requirement in order to support "Windoze" :-) But I am new to IPFS, so I cannot speak on that.

Native API vs FUSE.

Given that IPFS is likely closer to the POSIX file system model rather than the Windows file system model, the differences between the two API's might not mean much for IPFS. For example, if IPFS has no need for and does not use "security descriptors" (Windows ACL's), the ability to use the native GetSecurity and SetSecurity might not be very important.

@djdv
Contributor Author

djdv commented May 18, 2018

@billziss-gh
Thanks for sharing the experience! It's very helpful to hear.

ncw and I tried quite hard to work with precompiled binaries so that rclone would not need to include a native toolchain in its build process. Unfortunately this did not seem possible (at least as of Go 1.8, which was current at the time of the experimentation).

I'm curious to hear more about this, since that seems to be the current task at hand for me.
Wrapping the DLL and making Go and C play nice with each other at runtime, and trying to make a sane api around it that allows us to implement an FSP filesystem in Go.

I'm wondering if something couldn't be done to extend cgo-fuse to add dynamic-link support via build tags. However this still restricts us to FUSE.


On the topic of FUSE vs Native.
While IPFS itself has POSIX constructs, there's nothing preventing us from reading and storing metadata like ADS, ACL, etc. alongside normal data.
The practical utility of this is unknown though. It would be interesting to see a single file hash that contains file data as well as multiple sets of platform specific metadata. At the same time though, non-portable platform metadata should probably be avoided altogether by now. Hard to say if we should strive to support or ignore these things when dealing with a bridge between multiple systems.

I'm being extra cautious to harp on these details now, only to make sure we don't lock ourselves into something that will cause problems later. It seems like FUSE at least provides the necessities, but I want to make sure this is talked about first.

While not critical at the moment, I should also bring up performance as something to consider as well here. Will the lack of async i/o bite us?


On the topic of more build dependencies.
I'd rather decrease the requirements to build on Windows than increase them, but if we decide cgo-fuse is the way to go, I won't be the one to object. Not to mention the possibility of a dynamic cgo-fuse variant.

@billziss-gh

billziss-gh commented May 18, 2018

ncw and I tried quite hard to work with precompiled binaries so that rclone would not need to include a native toolchain in its build process. Unfortunately this did not seem possible (at least as of Go 1.8, which was current at the time of the experimentation).

I'm curious to hear more about this...

My recollection is that we were trying to create what golang calls "binary only packages", which were implemented in Go 1.7 (I think). While this experiment was successful the native toolchain was still required to build the final executable of a project that uses cgofuse (perhaps only the native linker was required).

Ping @ncw who may remember more on this. Nick, the question is if you recall what the issues with using cgofuse as a binary only package were.

I'm wondering if something couldn't be done to extend cgo-fuse to add dynamic-link support via build tags. However this still restricts us to FUSE.

I have considered before the possibility of modifying cgofuse to use the syscall.LoadLibrary approach on Windows to hook into the WinFsp-FUSE interface and avoid using the C compiler on that platform. I did not follow through because the Go libraries lack good dlopen support for non-Windows platforms, which made the use of cgo mandatory.

It may still be worthwhile to do this on the Windows platform only.

Another approach (and perhaps the one you allude to @djdv) is to build cgofuse as a shared library or a plugin. Unfortunately buildmode=shared and buildmode=plugin are still not supported on Windows (I think).

While IPFS itself has POSIX constructs, there's nothing preventing us from reading and storing metadata like ADS, ACL, etc. alongside normal data.

This is actually related to something I deeply care about: how to streamline the differences between the 2 major approaches to file systems (POSIX and Windows). ACL's are a big part of that.

For example, I would love to be able to set ACL's on Windows and have them easily translate into ACL's on my Mac, especially because the ACL systems are conceptually compatible. The biggest hurdle I usually face is the need for a user name/id mapping service, but it sounds like IPFS may already have that problem solved.

While not critical at the moment, I should also bring up performance as something to consider as well here. Will the lack of async i/o bite us?

In general I have not found the lack of async I/O in the FUSE interface to be a performance problem.

One complaint from some users with heavy-weight scenarios, is that their file system may sometimes stall while doing too many blocking operations, because they run out of threads. The remedy is usually to increase the number of threads; some even go as far as to implement asynchronous I/O. Only a small number of commercial users with highly parallel file systems have had the need to implement asynchronous I/O.

Increasing the number of OS threads is not very costly in WinFsp (in terms of context-switching), because it uses IOCP scheduling under the hood. More details in this document.

@billziss-gh

billziss-gh commented May 22, 2018

In the last 2 days I spent considerable time refactoring cgofuse with the goal of eliminating the need for cgo on Windows. My intent was to allow building with either CGO_ENABLED=1 to build the (default) "cgo" version of cgofuse, or CGO_ENABLED=0 to build a "nocgo" version of cgofuse.

My thinking was that with the help of syscall.DLL, syscall.Proc and syscall.NewCallbackCDecl, I should be able to replace/rewrite all Windows related C code in cgofuse. I was indeed successful in this endeavor (see the windll cgofuse branch), but unfortunately this story does not have a happy ending.

A FUSE file system needs to be able to receive file system requests (fuse_operations). In the "nocgo" version of cgofuse I setup fuse_operations using syscall.NewCallbackCDecl. These operations will be invoked in the context of threads that were NOT created by the Go runtime. For example, the very first operation init is called in a special thread that WinFsp-FUSE creates for initialization purposes. Other file system operations will be invoked on threads created by the WinFsp-FUSE dispatcher.

This all works fine in the "cgo" version. However the initial init call hangs in the "nocgo" version. It turns out that the Go runtime has a hard bug: golang/go#6751. According to that bug report callbacks created with syscall.NewCallback will hang the runtime if they are invoked in a non-Go thread. This bug spells doom for the "nocgo" version of cgofuse.

Even worse it spells doom for any effort that wishes to eliminate the need for cgo (regardless of whether cgofuse is used), because syscall.NewCallbackCDecl is required to receive file system operations when cgo is not used.

(There is also the alternative of rewriting the WinFsp DLL in Go, so that it interfaces directly with the WinFsp FSD, but this is a huge undertaking and I would not recommend it.)

@djdv
Contributor Author

djdv commented May 23, 2018

In the last 2 days I spent considerable time refactoring cgofuse with the goal of eliminating the need for cgo on Windows.
I was indeed successful in this endeavor

This is great to hear! As always, the continued effort is appreciated.

...

It's unfortunate that we're blocked on Go itself, but that seems to be a trend for our Windows tasks recently.
Since we can't go ahead with either proposed solution, I suppose the only option is to try and resolve the issue upstream. However, I haven't gauged the complexity of it yet. Dipping into the Go runtime, how it interacts with the C runtime, and the threads between them seems like it could be fun (a hassle).

Based on our needs and what was said, it seems like cgofuse (sans cgo) would be a good solution after the fix is implemented.
This will halt efforts on the winfsp native bindings unless we encounter problems with the fuse bindings.

@ncw

ncw commented May 23, 2018

@billziss-gh wrote:

My recollection is that we were trying to create what golang calls "binary only packages", which were implemented in Go 1.7 (I think). While this experiment was successful the native toolchain was still required to build the final executable of a project that uses cgofuse (perhaps only the native linker was required).

As far as I can remember we were trying to make a binary blob you could just link using the go linker, with the C parts of it already linked in. I can't remember why it didn't work though and I can't find any notes I made either :-(

This all works fine in the "cgo" version. However the initial init call hangs in the "nocgo" version. It turns out that the Go runtime has a hard bug: golang/go#6751.

:-( I've found the go team very good to work with so if you had the time to track the go bug down you would have a receptive audience!

@djdv
Contributor Author

djdv commented May 23, 2018

suppose the only option is to try and resolve the issue upstream.

Correction, it seems like we have 2 upstream options.
There is handling the thread problem, but it may also make sense to handle the plugin problem mentioned earlier (golang/go#19282).

If I understand correctly, I believe this would sidestep the issue. cgofuse would still use cgo, but we would at least be able to link with it, without adding any build dependencies. We could just check for the existence of the plugin and either link with it or not.

I'd still prefer to take advantage of the cgo-less version of cgofuse if we can though.

@billziss-gh

@ncw wrote:

As far as I can remember we were trying to make a binary blob you could just link using the go linker, with the C parts of it already linked in. I can't remember why it didn't work though and I can't find any notes I made either :-(

My understanding (based on assumptions rather than facts) is that when you do not use cgo, go uses its own linker and avoids using the native linker. But when you do use cgo, the native linker must be used in order to correctly create the executable, the reason being that the standard C library has to be linked in, which the go linker normally avoids doing.

The fact that go uses its own linker is probably why we do not have dlopen and why go makes its own system calls instead of going through the standard library (even on platforms like Darwin, where the system call API is not supported by Apple.)

Some relevant golang issue links are here:

  • golang / go # 17490
  • golang / go # 18296

(I have not linked directly to them to avoid adding extraneous references.)

Additional interesting links:

@whyrusleeping whyrusleeping added the topic/windows Windows specific label May 25, 2018
@billziss-gh

An update on this.

There is a PR (golang/go#25575) that addresses the thread hanging issue with syscall.NewCallback (golang/go#6751). It is currently under review by the Go team and will hopefully be accepted.

Over the weekend I was able to further develop the "nocgo" version of cgofuse, using a locally patched version of go. I was able to bring up the "nocgo" version of the 64-bit memfs sample and it worked without problems. There is additional work to be done for the 32-bit version and some more rigorous testing to be performed. But I am now fairly confident that this approach will work.

(The reason for the 32-bit vs 64-bit difference is that syscall.Syscall* and syscall.NewCallbackCDecl only allow/expect uintptr arguments, which are 4 bytes on 32-bit and 8 bytes on 64-bit. I have some int64 arguments that I need to pass, so on 32-bits I have to split them into 2 uintptr's to make the stack look right...)
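
The splitting described above can be sketched in plain Go. This is an illustration of the idea rather than the cgofuse code: uint32 stands in for a 4-byte uintptr so that the example behaves the same on any machine, and the names splitInt64 and joinInt64 are invented here. The order in which the two halves go on the stack depends on the calling convention and is not covered.

```go
package main

import "fmt"

// splitInt64 breaks a 64-bit value into its low and high 32-bit halves,
// the way an int64 has to be passed as two pointer-sized words on 32-bit.
func splitInt64(v int64) (lo, hi uint32) {
	u := uint64(v)
	return uint32(u), uint32(u >> 32)
}

// joinInt64 is the inverse: it rebuilds the int64 from the two halves,
// as a callback receiving two words would have to do.
func joinInt64(lo, hi uint32) int64 {
	return int64(uint64(hi)<<32 | uint64(lo))
}

func main() {
	offset := int64(1)<<40 + 5
	lo, hi := splitInt64(offset)
	fmt.Printf("lo=%#x hi=%#x\n", lo, hi)
	fmt.Println(joinInt64(lo, hi) == offset)
}
```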

@whyrusleeping
Member

@billziss-gh thanks for the update! and for pushing this so hard :) It's exciting to think that we might actually get decent windows filesystem support (something I had previously thought would never happen).

@billziss-gh

billziss-gh commented Jun 5, 2018

@whyrusleeping thanks :)

BTW, I have finished updating cgofuse so that it now has cgo and !cgo variants. The updated cgofuse passes all tests on macOS, Linux and Windows. On Windows both cgo and !cgo variants are tested.

So we are now waiting on the Go team and hopefully an approval of golang/go#25575.

@whyrusleeping
Member

@billziss-gh any thought towards getting the Lock functionality working? That's my biggest complaint about our current fuse lib: it doesn't support file locking. We can also take this to a different issue if that makes sense.

@billziss-gh

The problem with user-mode locking is that it is not currently supported on OSXFUSE or WinFsp. So you have 2 out of the "3 OS'es" without locking support by default. This is why I have chosen not to implement it in cgofuse.

(To clarify file locking does work at the kernel level, it is just that it is not exposed to user mode. In practice this means that file locking will work for processes on the same machine, but it cannot be made to work for processes on different machines.)

Consider adding your voice at winfsp/winfsp#116. I actually have an unpublished implementation of user-mode locking support for WinFsp, but have not incorporated it into its public repo because of the problems documented in that issue.

@billziss-gh

The PR golang/go#25575 has been merged into go as commit golang/go@bb0fae6.

@djdv
Contributor Author

djdv commented Aug 10, 2018

Status update [5965a25]

https://www.youtube.com/watch?v=9cf2wKA3WMw

@Stebalien
Member

That's really cool. Thanks for the update!

@ZerxXxes

Great work!
I have to try this out, really interested in the performance on reading 100MB+ size files through FUSE as the current implementation is horribly slow for that.

@ZerxXxes

Hmm, I would like to compile this PR for Linux to test the performance, but it seems that's not currently possible? It looks like something changed in the main repo that needs to be merged first, or do I misunderstand? I'm not very familiar with compiling go :(

@djdv
Contributor Author

djdv commented Mar 18, 2019

@ZerxXxes
Hey sorry about that, we recently changed our dependency management system.
I'm in the middle of making a medium sized revision, and after that I'm going to rebase to pull in the changes to make sure everything builds reliably.
The existing variant should be buildable but requires some external tooling. The next commit should work with go's standard tools.

@ZerxXxes

Hey no worries, looking forward to the next commit.
Keep up the good work!

@djdv
Contributor Author

djdv commented Apr 23, 2019

Status update:
So just as a heads up, a lot of considerations are going into this patch at the moment, because components of it will likely influence core components of go-ipfs.

The general idea being that we have a middle abstraction between FUSE and our APIs, essentially wrapping and conforming multiple APIs to a generic filesystem API.

We then use that API to implement the FUSE interface itself, while also exposing it so that developers can use the same Go bindings for other tasks.

Essentially you should be able to do io.File, err := core.Open("/SomeNameSpace/SomeSubPath") at an IPFS API level, similar to the current coreapi and mfs api, but unified under the same interface (regardless of which API it comes from, and without having unique bindings per boundary; there's more).

Separate, more formal issues/PRs will come up around this later.
But a general outline is like this (*all subject to change):


To implement FUSE we have to implement this interface.
And we have the means to do that via these interfaces
Which are currently being implemented/translated for each API boundary, here.
(coreAPI, mfs, UnixFS/io, softDir, etc.)

A good example of the full pipeline is the /ipfs endpoint, which is parsed by the filesystem index into an object, which has a method that takes the output of coreapi.Pin().Ls(...) and translates it into a filesystem compatible directory.

The ties between the mount interfaces and the fuse interface are in the process of being separated and made more generic. So the end result should be similar, but who's responsible for what will change.
Looking something like this:

func (fs *fuseIf) FuseOperationX(path, args) statusCode {
	// your specific code can go before or after these calls
	// unifying FS operations across API boundaries

	// use mount API to resolve paths to nodes, using daemon side logic
	// e.g. node, err := mountApi.Lookup("/ipfs/Qm.../file.ext")
	node, err := mountApi.Lookup(path)
	if err != nil {
		return fuseFailedValue
	}

	// call desired operation
	// e.g. bytes, err := node.Read(offset)
	status := node.OperationX(args)

	// more fuse specific logic
	if status == mountOpFail {
		return fuseFailedValue
	}

	fs.SideEffect(path)
	return status
}

But the idea being that you could have

doMyOwnWorkPart(1)

node, err := mountApi.Lookup(path) // get node from api
status := node.OperationX(args) // call desired operation on node

doMyOwnWorkPart(2)

anywhere you have API access, in the same way you can with IPFS paths and the coreapi.

The broader goal is that this is generally more maintainable and pleasant.
Adding a new API to the supported filesystem interface would simply require implementing and registering it. Then any client could take advantage of it. For our fuse package specifically, this would simply mean adding the subsystem/node implementation package, registering it with the daemon, and then it would show up under /$namespace.
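
To make the "implement and register" idea concrete, here is a minimal sketch of a namespace registry in Go. None of these names come from the branch: Node, Subsystem, Registry, staticFS and errNoNamespace are hypothetical, and a real implementation would carry far more operations than a single Read.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Node is a hypothetical handle returned by a lookup.
type Node interface {
	Read() (string, error)
}

// Subsystem resolves the part of a path below its namespace.
type Subsystem interface {
	Lookup(subPath string) (Node, error)
}

var errNoNamespace = errors.New("no subsystem registered for namespace")

// Registry maps the first path component to a registered subsystem.
type Registry struct {
	subsystems map[string]Subsystem
}

func NewRegistry() *Registry {
	return &Registry{subsystems: make(map[string]Subsystem)}
}

// Register makes s reachable under /<namespace>.
func (r *Registry) Register(namespace string, s Subsystem) {
	r.subsystems[namespace] = s
}

// Lookup splits "/namespace/rest" and delegates rest to the subsystem.
func (r *Registry) Lookup(path string) (Node, error) {
	parts := strings.SplitN(strings.TrimPrefix(path, "/"), "/", 2)
	s, ok := r.subsystems[parts[0]]
	if !ok {
		return nil, fmt.Errorf("%w: %q", errNoNamespace, parts[0])
	}
	rest := ""
	if len(parts) == 2 {
		rest = parts[1]
	}
	return s.Lookup(rest)
}

// staticFS is a toy in-memory subsystem used only for this example.
type staticFS map[string]string

type staticNode string

func (n staticNode) Read() (string, error) { return string(n), nil }

func (s staticFS) Lookup(subPath string) (Node, error) {
	data, ok := s[subPath]
	if !ok {
		return nil, errors.New("not found: " + subPath)
	}
	return staticNode(data), nil
}

func main() {
	reg := NewRegistry()
	reg.Register("ipfs", staticFS{"QmExample/file.ext": "hello"})

	node, err := reg.Lookup("/ipfs/QmExample/file.ext")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	data, _ := node.Read()
	fmt.Println(data)
}
```

With this shape a FUSE layer, a gateway, or any other client only ever talks to the registry, and a new subsystem becomes visible by a single Register call.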

@djdv
Contributor Author

djdv commented May 14, 2019

We're going over all the go-ipfs issues so I'm collecting notes on the ones that relate.

The current mount/FUSE implementation is --

Slow Not fast:
#2166
#4228

Incomplete/Inaccurate:
#4185
#2712

Not flexible/dynamic:
#2187
#5209


Other IPFS components stand to benefit from ipfs/interface-go-ipfs-core#30

The gateway could use a filesystem implementation to interact with our other APIs (req{"/files"} => localDaemonFs.Readdir("/files"), etc.).
#3188

ls behaviour isn't consistent
#4186

There's desire to merge ipfs ls and ipfs files ls
#5497


Neat:
#1702 (comment)


We're already doing that in this branch
#870
#2167
#2168

@polkovnikov

Looks like 2 years of no activity on this Issue, and it is not yet closed.

Can anyone tell what's the current status of Windows Mount in current go-ipfs release?

@djdv
Contributor Author

djdv commented Aug 15, 2021

Whoops, my bad. I failed this effort a long time ago. Closing now.
As far as I know nobody is working on this. Until someone picks that up, I've been putting time (when I can) into my own effort that achieves the same thing.

It was requested that I separate the code from go-ipfs, which makes sense, but did cost a lot of time because I had to re-implement all the portions the IPFS binary was giving me for free. And account for everything now being remote instead of in the same process.
That part is mostly done and tested now, so I can focus on the other half of the code (the mount stuff).
Just recently I migrated the experimental mounting code I already had for go-ipfs, over to this new base and it worked as expected. But all of that predates the existence of Go's fs.FS. So I'm in the process of migrating that over to the new-ish Go standard.
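
For readers who have not used it, fs.FS (package io/fs, added in Go 1.16) lets generic helpers such as fs.WalkDir and fs.ReadFile work with any implementation. The sketch below uses the standard library's in-memory fstest.MapFS as a stand-in for an IPFS-backed filesystem. The paths and the helpers exampleFS, listNames and readString are made up for illustration and are not taken from the repository mentioned in this thread.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// exampleFS returns an in-memory filesystem standing in for a real one.
func exampleFS() fs.FS {
	return fstest.MapFS{
		"ipfs/QmExample/hello.txt": {Data: []byte("hello")},
	}
}

// listNames walks any fs.FS and returns every path below the root,
// in the lexical order that fs.WalkDir visits them.
func listNames(fsys fs.FS) ([]string, error) {
	var names []string
	err := fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if path != "." {
			names = append(names, path)
		}
		return nil
	})
	return names, err
}

// readString reads a whole file from any fs.FS as a string.
func readString(fsys fs.FS, name string) (string, error) {
	data, err := fs.ReadFile(fsys, name)
	return string(data), err
}

func main() {
	fsys := exampleFS()

	names, err := listNames(fsys)
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println(names)

	text, err := readString(fsys, "ipfs/QmExample/hello.txt")
	fmt.Println(text, err)
}
```

Because listNames and readString only depend on the interface, the same code would work unchanged against any other fs.FS implementation.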

Here's the most recent status of that if you're interested.
https://www.youtube.com/watch?v=q-hKANpM9Xg
And this is an older version which was in the process of being separated out which is a little fancier.
https://www.youtube.com/watch?v=NeuWm8fJoGc
* Note that these videos are just for friends of mine so they're nothing exciting / not proper demos.

@djdv djdv closed this as completed Aug 15, 2021
@polkovnikov

polkovnikov commented Aug 16, 2021

Looks very nice!

Maybe at least your fully working version inside single go-ipfs binary (or separate binary) is available somewhere in public? (in github of your user as a fork+branch?)

So that I (or anyone else) can compile it and try on Windows.

Also very interesting if your file system can show files and folder of unpinned files? E.g. if you do cd Z:\SOME_UNPINNED_HASH\ to some non-existing directory with real existing hash then you successfully enter this directory containing files.

@djdv
Contributor Author

djdv commented Aug 16, 2021

@polkovnikov
Thanks :^)
Pardon the walls of text below, it's a lot of context.
But it should have the info you need to fetch, build, run, and track this.


Maybe at least your fully working version inside single go-ipfs binary (or separate binary) is available somewhere in public? (in github of your user as a fork+branch?)

For sure.
I have a temporary development repo up here: https://github.com/djdv/go-filesystem-utils
The main branch is mostly empty right now since everything is still in progress.
PRs will be merged once the various components are properly covered by tests.
I'll be pushing my working copy to this branch.
It will basically have what I'm working on (before it's split up into PRs).

So that I (or anyone else) can compile it and try on Windows.

I don't have any documentation up yet, but I'll try to put some in there later.

For now you can reference this for the dependencies needed on the supported operating systems: https://github.com/billziss-gh/cgofuse#----cross-platform-fuse-library-for-go
(Windows users install WinFSP, macOS users install macFUSE, everyone else probably has fuse on their system already)

To build run go run ./cmd/build/main.go in the source directory and it should take care of all the configuration.
On non-Windows this just calls go build ./cmd/fs directly. On Windows it looks for the Fuse libraries and uses them, or chooses to dynamically link if they're not found.


The CLI isn't finished yet, so it's going to change, but for right now you should be able to do fs mount /path/*
where * is some kind of valid path. And it will figure out a bunch of default options.

On Windows this can be a drive letter /path/I:, a relative path /path/InsidecurrentDir, an absolute path /path/C:\ipfs, or a UNC path /path/\\localhost\ipfs.
(Make sure your IPFS daemon is running too or all the requests will just fail. You can pass a multiaddr to mount via `fs mount --ipfs="/some/maddr"` to use whichever IPFS node you like; it doesn't have to be on the same machine, and you can have multiple different ones mounted at once.)
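
To make the four Windows target forms above concrete, here is a rough sketch that tells them apart by looking only at the text of the target. The names targetKind and classifyTarget are hypothetical; this is not how the fs CLI parses its arguments, only an illustration of the distinctions.

```go
package main

import (
	"fmt"
	"strings"
)

// targetKind is a hypothetical label for the forms a Windows mount
// target can take; it is not part of the fs CLI.
type targetKind string

const (
	driveLetter  targetKind = "drive letter"
	uncPath      targetKind = "UNC path"
	absolutePath targetKind = "absolute path"
	relativePath targetKind = "relative path"
)

func isLetter(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// classifyTarget inspects only the string itself, so it gives the
// same answer on every operating system.
func classifyTarget(target string) targetKind {
	hasDrive := len(target) >= 2 && isLetter(target[0]) && target[1] == ':'
	switch {
	case strings.HasPrefix(target, `\\`):
		return uncPath
	case hasDrive && len(target) == 2:
		return driveLetter
	case hasDrive && (target[2] == '\\' || target[2] == '/'):
		return absolutePath
	default:
		// Includes drive-relative forms such as "C:dir".
		return relativePath
	}
}

func main() {
	for _, t := range []string{`I:`, `InsidecurrentDir`, `C:\ipfs`, `\\localhost\ipfs`} {
		fmt.Printf("%-20s %s\n", t, classifyTarget(t))
	}
}
```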

For other things, try calling fs --help, then fs mount --help, etc. for the subcommands.
Some of the docs might be empty or lying though since I'm currently porting things over and changing things.


can show files and folder of unpinned files?

Yeah.
The current in-progress branch has "IPFS", "IPNS", and "PinFS" ("KeyFS" and "MFS" will come later).
IPFS is an empty root that should accept valid /ipfs/* namespace targets.
PinFS simply lists your pins in the root directory and relays all sub-requests to IPFS.
IPNS is the same as IPFS but for /ipns/*, which also includes things like DNS names, so you can do
fs mount fuse ipns /path/N: followed by dir N:\ipfs.io and it works.
KeyFS was like PinFS but for IPNS keys, it was also writable. Making a file/directory/link in there would create a key of the same name which points to it. This isn't ported yet.
Same with MFS, it just mapped the node's "files root" or an arbitrary CID to the host file system. This isn't ported over yet either.

Currently only Fuse is supported, the 9P interfaces will also be ported over so that you can do fs mount 9p ipfs /ip4/127.0.0.1/tcp/564 like in the previous branches.

@polkovnikov

polkovnikov commented Aug 17, 2021

@djdv Thanks for detailed comments!

I successfully managed to build fs.exe using Go command go run ./cmd/build/main.go.

But when I run it like fs mount fuse ipns /path/N: then it complains about WinFsp not being installed. When I try to install WinFsp taken from http://www.secfs.net/winfsp/rel/, this installer complains that I have WinFsp service already running.

But I don't remember installing any WinFsp. If I look in the usual location for programs, i.e. inside Windows Settings / Apps / Apps & Features, I don't see any WinFsp in the list, although all my other installed programs are listed there.

Also when I try to install through Chocolatey's choco install winfsp it also complains that MSI run failed due to probably program being already present on my system.

And I see this weird behaviour (of WinFsp being installed) on both of my laptops, and I didn't install WinFsp on either of them before.

Seems to me that standard modern Windows installation somehow includes pre-installed WinFsp drivers/service.

But this ghosty pre-installed WinFsp doesn't work with fs.exe, it says that WinFsp is not installed. Also I don't know how to find out if WinFsp is really pre-installed, can you suggest at which location to look for it?

@polkovnikov

@djdv Just a side comment - you seem to know a lot about net-ipfs-mount project.

Seems to me that it doesn't recognize bafy-hashes, I created an issue here.

For example dir n:/ipns/ipfs.io/ works but not dir n:/ipns/en.wikipedia-on-ipfs.org/ (no such dir). First one uses old Qm hashes and second one uses new bafy hashes. Both are successfully listed through ipfs ls ....

Do you know by any chance if there's a work around for bafy hashes in that project?

@djdv
Contributor Author

djdv commented Aug 17, 2021

Uh oh. That's pretty strange.
There shouldn't be anything pre-installed that conflicts, unless some other piece of software you use also uses/depends on it. Maybe some kind of sshfs, nfs-win, or something similar.
But if that were the case it should still work fine with fs.exe.

The default install ends up here C:\Program Files (x86)\WinFsp. If you can search your file system, you might want to try looking for the files winfsp-x86.sys, winfsp-x86.dll to see if some other program is using them.
The system service is named "WinFSP.Launcher" so you can see if that's running as well with services.msc.

Here's a really crowded screenshot of how everything looks on my current machine + a Windows 7 VM I just installed it on.
(I'm running the latest beta on my machine but I use the latest stable release in the other machine)
[Screenshot: WinFsp as installed on the author's machine and on a Windows 7 VM]

If you run choco list -lo does it say anything about WinFSP in there?


you seem to know a lot about net-ipfs-mount project.

Sorry, I do not know much about it.
I tried it once a few years ago but only briefly.
It did not work with my repo. Like dipfs, it just hung for a while until it crashed. But it did work with the default (small) repo.

dir n:/ipns/en.wikipedia-on-ipfs.org/

I just tested with my implementation and it seems to work.
To be specific, my system does not use the prefixes so it was just dir N:/en.wikipedia-on-ipfs.org but it did list the contents.
Anything that works with the IPFS CLI should work in this system since they use the same APIs. Including all the valid multicodecs that IPFS uses, like Qm, zD, bafy, etc.
[Screenshot: directory listing of N:/en.wikipedia-on-ipfs.org]
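
As background on the Qm versus bafy distinction: a 46-character string starting with Qm is a CIDv0 (bare base58btc), while CIDv1 strings carry a multibase prefix, where b means base32 and z means base58btc. The sketch below guesses the form from the prefix alone. describeCID and the placeholder strings are made up for illustration and are not real CIDs; real code should parse with a proper library such as go-cid's cid.Decode rather than inspect prefixes.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholderV0 has the shape of a CIDv0 string (46 characters,
// "Qm" prefix) but is NOT a real CID.
var placeholderV0 = "Qm" + strings.Repeat("x", 44)

// describeCID guesses how a CID string is encoded from its prefix.
// It is a heuristic for illustration and validates nothing.
func describeCID(s string) string {
	switch {
	case len(s) == 46 && strings.HasPrefix(s, "Qm"):
		return "CIDv0 (bare base58btc)"
	case strings.HasPrefix(s, "b"):
		return "CIDv1 (multibase base32)"
	case strings.HasPrefix(s, "z"):
		return "CIDv1 (multibase base58btc)"
	default:
		return "unknown"
	}
}

func main() {
	// All three inputs are placeholders with the right prefix only.
	for _, s := range []string{placeholderV0, "bafyexample", "zDexample"} {
		fmt.Printf("%-48s %s\n", s, describeCID(s))
	}
}
```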

@polkovnikov

@djdv

choco list -lo has nothing about WinFsp, but I have dokany inside Chocolatey as it was needed by net-ipfs-mount project.

I searched my whole file system for *winfsp* and found only the following:

C:\Users\User\Local Settings\Mail.Ru\Disk-O\vcurrent\
 06.08.21│108640│A     │winfsp2017.1.dll
 06.08.21│110688│A     │winfsp2017.2.dll
 06.08.21│126560│A     │winfsp2019.1.dll
 06.08.21│132704│A     │winfsp2020.dll
 C:\Windows\System32\disko\
  25.01.21│157184│A     │winfsp_x64.dll
  25.01.21│163640│A     │winfsp_x64.sys

Both things refer to famous Russian cloud storage service called "Mail.Ru Disk-O". I installed it before.

After uninstalling this application choco install winfsp worked fine!

Also your fs mount fuse ipns /path/N: worked well.

As a stress test I tried to open and browse 82 GB file n:\en.wikipedia-on-ipfs.org\wikipedia_en_all_maxi_2021-02.zim inside Kiwix Chrome extension obtained from here https://www.kiwix.org/en/download/.

I can confirm that your fs.exe on Windows works much faster with this 82 GB Wiki + Kiwix. Before, I tested this file with the Linux Kiwix application, with the file system mounted through ipfs daemon --mount.

Linux version heavy-loaded my 4 cores (8 threads) with 90-100% of CPU usage and 1 MB/sec download speed for 1 hour, after 1 hour and 1 GB downloaded blocks I managed to open this Wikipedia under Linux. Under Linux I used not Chrome extension, but standalone Kiwix application obtained by link mentioned above.

Your fs.exe version in Windows and Chrome Extension of Kiwix worked much faster - just several minutes of 15-20% CPU usage and 150 MB of downloaded blocks.

So looking forward to integrating your utils into official IPFS package! They look very nice and work well for even huge file. I hope this integration happens soon so that everybody can benefit from great things of IPFS.

@polkovnikov

@djdv BTW, do you have following improvement in your file system code?

For example, I have a very large file (like the 82 GB Wikipedia ZIM file from the previous comment). I want to do very small reads of a few bytes from it, for example when parsing some metadata byte by byte. These small reads can also be random-access reads.

If I do millions of such few-bytes random read requests to your file system, how do you handle them? Do you just forward all few-bytes small requests to IPFS API? Or you do some caching technique in your code?

For example if I random-read small portion then some cache in your code instead can request 256 KB block from IPFS, then cache this 256KB block into RAM. If I do another small read from existing-in-cache block then your code returns this portion immediately instead of doing API call to IPFS node. If too many blocks were cached then least used (or oldest) block is removed from cache to reduce RAM usage.
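
The idea above can be sketched as a read-through LRU block cache over any io.ReaderAt. This is only an illustration of the technique being proposed, not code from fs.exe or go-ipfs. The names blockCache and newDemoCache are hypothetical, and the demo uses 8-byte blocks over an in-memory reader instead of 256 KB blocks fetched from a node.

```go
package main

import (
	"bytes"
	"container/list"
	"fmt"
	"io"
)

// block is one cached, block-aligned chunk of the source.
type block struct {
	index int64
	data  []byte
}

// blockCache is a hypothetical read-through LRU cache.
type blockCache struct {
	src       io.ReaderAt
	blockSize int64
	maxBlocks int
	order     *list.List              // front = most recently used
	blocks    map[int64]*list.Element // block index -> element holding *block
	fetches   int                     // reads that had to go to src
}

func newBlockCache(src io.ReaderAt, blockSize int64, maxBlocks int) *blockCache {
	return &blockCache{src: src, blockSize: blockSize, maxBlocks: maxBlocks,
		order: list.New(), blocks: make(map[int64]*list.Element)}
}

// getBlock returns a block from the cache, fetching it on a miss and
// evicting the least recently used block when the cache is over capacity.
func (c *blockCache) getBlock(index int64) ([]byte, error) {
	if el, ok := c.blocks[index]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*block).data, nil
	}
	buf := make([]byte, c.blockSize)
	n, err := c.src.ReadAt(buf, index*c.blockSize)
	if err != nil && err != io.EOF {
		return nil, err
	}
	c.fetches++
	c.blocks[index] = c.order.PushFront(&block{index: index, data: buf[:n]})
	if c.order.Len() > c.maxBlocks {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.blocks, oldest.Value.(*block).index)
	}
	return buf[:n], nil
}

// ReadAt implements io.ReaderAt by copying out of cached blocks.
func (c *blockCache) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, fmt.Errorf("negative offset %d", off)
	}
	read := 0
	for read < len(p) {
		pos := off + int64(read)
		data, err := c.getBlock(pos / c.blockSize)
		if err != nil {
			return read, err
		}
		within := pos % c.blockSize
		if within >= int64(len(data)) {
			return read, io.EOF // past the end of the source
		}
		read += copy(p[read:], data[within:])
	}
	return read, nil
}

// newDemoCache wraps 32 bytes of sample data with 8-byte blocks, 2 cached.
func newDemoCache() *blockCache {
	src := bytes.NewReader(bytes.Repeat([]byte("abcdefgh"), 4))
	return newBlockCache(src, 8, 2)
}

func main() {
	c := newDemoCache()
	buf := make([]byte, 2)
	for _, off := range []int64{9, 12, 14} { // all inside block 1
		if _, err := c.ReadAt(buf, off); err != nil {
			fmt.Println("read failed:", err)
			return
		}
		fmt.Printf("offset %d: %q (fetches so far: %d)\n", off, buf, c.fetches)
	}
}
```

The three small reads in main all fall inside the same block, so only the first one reaches the underlying reader. Whether a layer like this helps in practice depends on what the IPFS APIs already cache internally.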

Do you have anything like this? This is just an idea how to improve speed of small random reads greatly. Because 82 GB Wiki mentioned above was read by Kiwix really really slow on Linux's standard FUSE of ipfs daemon --mount. It occupied 100% of my 8 cores for 1 hour just to open Wikipedia. Opening means reading into memory and parsing by Kiwix all metadata and full text search index. Seems that Linux standard FUSE version doesn't do any caching but forwards all small requests directly to IPFS node.

@djdv
Contributor Author

djdv commented Aug 17, 2021

@polkovnikov
That's excellent 😎
Thanks for testing it out.
This is in single threaded mode, without any caching or streaming tricks. So we have the potential to be a little better in the future too.
This version of the interface is 2 days old lol. But it is mostly derived from the same work I did 2 years ago.
Despite the time gap, I haven't actually spent much time focused on this. Sometimes I'll dedicate a weekend to it, or maybe an hour at night here or there.

They look very nice and work well for even huge file.

Glad to hear that.
This is important to me too. I have large repos that are ~1TB in size, and I distribute datasets that span multiple gigabytes. Like my screencasts, VM images, etc.
So this is something I want/need and will be tested against. It doesn't have to just technically work, it has to be practically usable. Otherwise there's no point imo.

looking forward to integrating your utils into official IPFS package

Unfortunately this is not likely to happen. Members of the project already evaluated this effort and myself a while ago. As it turns out, both are unacceptable.
Harsh to hear, but not unfair. :^/
We'll have to wait for an official implementation to manifest from someone else.


Do you just forward all few-bytes small requests to IPFS API? Or you do some caching technique in your code?

At the moment, yeah. We're essentially just translating formats and calling conventions from Fuse and 9P into ones for the IPFS APIs.
But every layer/module is separated in a way where anything can replace anything if needed.
If it makes sense to introduce our own caching somewhere, we can. And in fact do a number of things irrelevant to the IPFS APIs depending on the context.

To summarize, there's a lot of potential to do whatever, wherever, if it makes sense to.
And we can try figuring out what's slow, why it's slow, and what can be done about it - as development progresses.
But as-is we're ahead of the official implementation in a lot of regards. Even with this half broken branch in the middle of migrations.

The much longer version:

The most basic example is in Fuse, where we actually keep track of open references instead of constructing and deconstructing them each time they're needed.

In the old interface that hasn't been ported over yet, there was a lot of logic around making IPNS fast enough to be usable. Thankfully, this shouldn't be our responsibility anymore as IPNS itself has improved significantly in terms of speed. But back then, each request could take long enough to trigger the OS's IO timeout (we're talking upwards of 30 seconds per iop), yet the mount implementation I had was practically instant most of the time for reads and writes. It was good enough that users remarked on it :^)

[Image: a user's remark on the mount's speed]
(sorry for never responding Mark)

Today, I shouldn't have to use clever caching techniques in that layer because IPNS is doing some of the same things internally, as well as other tricks. Which is better for everyone.
Performance improvements should be made in the IPFS APIs where possible, so that each user doesn't have to implement them in their own systems like these.

Kind of like what you're saying with reading into ram, potentially ahead of time.
Internally some of the IPFS APIs handle things like this already so we can get away with just relaying the requests.
I saw a lot of this in the unixfs modification libraries, which is all internal to them so I didn't have to do anything special in my own code that was skirting around MFS.

I had to avoid using the MFS library directly because it liked to deadlock for reasons that are not the caller's fault.
Some of those have since been found out and patched, but I hear people are considering rewriting that whole library, which is essentially what I did previously to avoid it. Just calling UnixFS directly with my own libraries.
Doing so made it practical for real world use, where using the MFS API at the time (and probably still now) would not be as practical (or stable).
(I hear that library might be getting a rewrite eventually because these concerns are already known)

So there already have been reasons to do things like this, and they have been possible and implemented. And that's only easier to do now with some of the separation between modules, and the work done on IPFS itself.

really slow on Linux's standard FUSE of ipfs daemon --mount

While I did port the code for that over to my interfaces for legacy compatibility, I did not make much effort to inspect it, and have since dropped it entirely since we don't have to be part of go-ipfs.
I found a really bad security concern in it right away, and most of the code is borderline obfuscated.
With that in mind, coupled with the fact that the underlying library (bazil fuse) was Linux and macOS only (now just Linux only?), it felt safe to disregard it when the goal was to make something portable and maintainable.

However, I know the culprit is not Linux's fuse. It's either bazil's library, the ipfs implementation that uses it, or a combination of both.
The fs binary we're talking about here works just as well on Windows, macOS, FreeBSD, Linux, and more in my testing, thanks to cgofuse, which on Linux should be using the same fuse that ipfs mount uses.
However, I don't see the same resource requirements with fs.


Also, we should probably use https://github.com/djdv/go-filesystem-utils/discussions for further discussion.

@polkovnikov

@djdv So if your project is never going to get into the official binary or package, then it would at least be great if you keep your project public and never close it, and possibly maintain it at least from time to time.

Because your project is currently the only way I have found for IPFS mounting to work on Windows. There are a few other projects doing similar things, but they either don't work or are abandoned and outdated. To my knowledge, your project is the only working solution on Windows.

Would be great if IPFS people at least mention your project on official website (ipfs.io) as a Windows-mounting solution. Because right now in web search engines I only see solution for Unix systems. But world of Windows is very huge, personally I use Windows much more often than Linux.

@djdv
Contributor Author

djdv commented Aug 18, 2021

@polkovnikov

keep your project public and never close it

For sure. I think it would be a waste of time to write something and refine it only to prevent people from reading it later.

Your project is the only working solution according to my knowledge on Windows.

Unfortunately it seems so.
(Linux users have told me the same thing in the past too 👀)

never going to get into official binary or package
Would be great if IPFS people at least mention your project on official website

If users find the solution useful, it will spread organically as it enters various states of stable.
We're only days into the revival of this effort, so there's still plenty to iron out even if the core functionality is there. I think it would do more harm than good to bring attention to it early.

Also, the burden to fix fundamental problems with the project, should be on the IPFS team, not their users to compensate for them.
It has FS in the title, it should probably act like an FS. In the official product, maintained by the staff responsible for it.
And they have teams of staff, they don't need to rely on other people to do the work for free for them.

In my opinion, to point to a 3rd party stopgap for a 1st party problem, especially after initially claiming the solution and its author have no merit, would be nothing short of taking advantage of impatience, while also being personally offensive.

It should be said explicitly that while I may disagree with their decisions and remarks, there is no animosity between me and the team. I keep in contact with some of them and they are appreciative of what's being done. But I think there is a mutual understanding of what is and isn't fair.
The feature will be developed and maintained when/while I can, because people (myself included) want it now, not 10 years from now.
But it's up to PL to actually figure out and act on a long term strategy that they can maintain, not their community.

Maybe eventually we can do a handoff where my work is considered done, and they maintain it. But it's going to take time to get it up to their standards.

I hope that message doesn't come off as vitriolic, it's very hard to convey opposing forces and motivations at various levels like that. There's desires and ideals meeting against practicality and fairness that just sound like conflicting contradictions. And my English is terrible too. I hope it makes sense. ┐('~`;)┌

In any case, the focus should stay on the technical aspects and not much else. There's not much point to rehashing, speculation, or other things.
