Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RemoteFs methods should take immutable self #18

Closed
ok-nick opened this issue Jan 9, 2022 · 7 comments · Fixed by #19
Closed

RemoteFs methods should take immutable self #18

ok-nick opened this issue Jan 9, 2022 · 7 comments · Fixed by #19

Comments

@ok-nick
Copy link
Contributor

ok-nick commented Jan 9, 2022

Just ran across an issue where I'm trying to call methods of a struct implementing RemoteFs across several threads, but since the RemoteFs trait requires that self must be passed mutably, I cannot access it without using some form of interior mutability, like a Mutex or RwLock. I'm currently implementing a file system and there could be thousands of operations happening at once and using locks would mean each operation would have to wait on another resulting in a significant performance impact. Most of the protocols don't require mutability on their methods which is why this feature is reasonable, however, the ones that do could implement interior mutability themselves.

@veeso
Copy link
Member

veeso commented Jan 11, 2022

I admit I've never used interior mutability before, but this will be the right time to start using it.

@veeso
Copy link
Member

veeso commented Jan 11, 2022

Merged, but I just realized that connect and disconnect must be kept as mutable (I don't think will be a problem for you anyway)

@veeso
Copy link
Member

veeso commented Jan 11, 2022

Client implementation:

  • FTP
  • SSH
  • AWS-s3

@ok-nick
Copy link
Contributor Author

ok-nick commented Jan 12, 2022

I took a quick look at the implementations in the various clients and noticed that you used a RefCell instead of something like a RwLock or Mutex. RefCell's are meant for single threading programs whereas the locking ones are meant for multithreading.

I'm not too familiar with the in-depths of multithreading, but one thing to watch out for with a RwLock is writer starvation. Meaning, that there could be too many readers to the point where writers are never able to write. In a library like this, I could see that happening often.
Now if you were to use a Mutex since in most of the protocols you are wrapping the current path, you'd have constant locking and you'd pretty much fall into the same trap I described when I created this issue, where each operation is blocked by the other.

I think the solution here is to use something like arc-swap. Again, I'm not too familiar with this kind of stuff, but with some basic googling and research this is what I've found. IIRC an ArcSwap is basically a RwLock<Arc<T>>, however, instead of waiting for all the RwLock readers to drop before writing (which could happen), the ArcSwap will allow writers to write into a new set of data so that any existing readers could keep reading the old data. I'm not sure if this is just overcomplicating things or if it's the correct route to go, but as I said, this is what I've gathered.

@ok-nick
Copy link
Contributor Author

ok-nick commented Jan 12, 2022

Check out issue #22, might be able to solve the problem without needing any interior mutability.

@ok-nick
Copy link
Contributor Author

ok-nick commented Jan 22, 2022

To get a gist, basically, relative paths should just be removed completely. No matter what, even if you use what I said above there could still be issues. A big one is if you change the current directory then you try and call, for example, list_dir. In that time frame from when u called cwd to list_dir, cwd could've been called again and it would pretty much break everything.

My solution is to always use absolute paths, even in FTP where it would have to send a separate request for that. While this works, it only really works to an extent. A protocol like FTP isn't meant to be multithreaded on a single connection and typically you'd open another if you are working on a file elsewhere. So I think the goal here is to not make protocols that are not meant to be thread-safe, thread-safe. However, I do believe that relative paths should still be removed.

Without relative paths, each method on FTP would first need to send a request to change the working directory, if necessary. Then, to make it actually thread-safe, the remotefs-ftp crate could expose a separate struct that wraps the inner one (that we currently have) to automatically manage new connections and optimize everything for maximum performance and of course, thread safety. This way users could use protocols like FTP as if it were naturally thread-safe.

@ok-nick
Copy link
Contributor Author

ok-nick commented Feb 21, 2022

@veeso any thoughts on this? I might go ahead and implement it,

@veeso veeso removed their assignment Nov 14, 2022
@veeso veeso closed this as completed Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants