-
Notifications
You must be signed in to change notification settings - Fork 21.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implements an evented file system monitor #22254
Conversation
This is the implementation of the file update checker written by Puneet Agarwal for GSoC 2015 (except for the tiny version of the listen gem, which was 3.0.2 in the original patch). Puneet's branch became too out of sync with upstream. This is the final work in one single clean commit. Credit goes in the first line using a convention understood by the contrib app.
3.0.3 has a bug in OS X.
This commit also bases everything on Pathname internally.
In particular files are no longer created in the current working directory, but in a temporary folder.
Mac OS X tries by all means to hide that /var is /private/var, and that is what FSEvents reports back.
This sucks, but otherwise I get occasional Fs on Mac OS X.
"checker" is the name being used everywhere.
❤️ 💚 💓 💛 💙 |
+1111100000000 |
@fxn shouldn't |
For the use case of this monitor it probably does not matter. If you edit a file simultaneously to firing a request... well, you probably can't either know if it should reload or not. Once the flag is true, it stays true. Also, dev mode is single-threaded. But if you guys have a good justification for its need, we could change it. |
The problem is that if one thread actually sets |
@fxposter really? Was not aware of that possibility, I believed it opened the door to a race condition, but that eventually the shared value would be updated. That is, atomic vs not atomic. The main use case for these monitors is not multi-threaded, but I'll read that document you linked to. Meanwhile, please anybody feel free to chime in. |
I don't think it's my place to suggest anything about how you want to implement this, but I will clarify that Rubinius does not have a GIL/GVL and that instance variables do not have volatile semantics. If you want multiple threads to have precisely the same view of a memory location at a point in time, you must use a Mutex, memory fence, or other operation that implies synchronizing the view of that memory location across threads. |
Yes, yes, code that is not thread-safe it is not thread-safe and there may be race conditions and the outcome be non-deterministic. That is clear. What is beyond my understanding is that without synchronization not only you're exposed to race conditions, but that a stale value can be cached by the CPU forever (as said above). Do synchronization idioms actually flush caches in addition to prevent parallel execution? Note that this particular code is not expected to be run by multiple threads, because its main use case is app reloading, which uses constant autoloading, which is not thread-safe per se as of this writing. But we could certainly consider making the monitor thread-safe, and in any case I find interesting and would like to understand that observation made by @fxposter (reading the link pending). |
That's not a realistic scenario, but if that variable for some reasons is not evicted from CPU cache by something else - CPU does not need to remove it and "update" - that's not how CPUs work. Anyway - that's not the thing I'd actually rely upon. :) |
@fxposter but the memory is shared. If you do not synchronize there may be race conditions that may affect who writes last (and in a general scenario which value ends up being stored). But sync'ed or not sync'ed, aren't the threads just writing to a memory location like anything else does? What is different here? |
@fxn yes, you are right - memory is shared. Not CPU caches. And threads can write to CPU caches and nothing forces cache line to be written to the main memory except synchronization or memory barriers. |
@fxposter ah! that was the bit I didn't know about, that threads can write to caches without going through memory. I thought the CPU cache was an optimization transparent to programs, whose only interface was memory. Then, a synchronized block forces that? |
Yes, synchronized block or volatile (atomic, in our case) variables. But if you want to make it synchronized - you want to use synchronized both for reads and writes. |
Awesome, thanks @fxposter! |
@fxn Thank you for improving Rails! 👍 |
FWIW, I know Rails 5+ now depends on concurrent-ruby, which provides facilities for atomic and volatile variables. That's the recommended path forward rather than full synchronization, if you don't need locking. |
Revised in 49a5b40. I chose to lock around the @fxposter does it look good to you? |
@fxn Commented in commit. In general - I don't think that we need mutex there if we change implementation a bit, but with current implementation mutex is needed there and it will work. |
@fxposter It seems we are done. Thanks a lot for your feedback. The fact that I didn't know reading ivars needed synchronization even if its writer was synchronized tells me I need to dig into portable Ruby multithreading. I'll need to find a trustworthy reference, do you have any recommendation? |
Well, it's a tough question... I'd recommend reading stuff on Java Memory Model and C++ Memory Model (the more you read - the better). In terms of people - start with Gil Tene, Martin Thompson, Michael Barker, Aleksey Shipilëv (he is Russian, but sometimes blogs an talks in English), then you'll probably find others (this is 99% about Java and JMM). Jeff Preshing has a lot of interesting stuff in his blog (http://preshing.com/archives/, search "memory", "atomic", "barrier", "sync", "volatile"). Watching talks from Java/C++ conferences related to performance and multithreading also helps. From the Ruby point of view - jRuby uses JMM and https://github.com/jruby/jruby/wiki/Concurrency-in-jruby. MRI uses GIL, which basically means that there is a happens-before relation between any read and write to shared variable because they are basically "synchronized" by the GIL (it doesn't mean that they are atomic, but it means that writes from one thread are visible to the others). Otherwise, MRI has no specified behaviors of how things should or shouldn't work. :( Also, you can dig into internals of https://github.com/ruby-concurrency/concurrent-ruby/ and watch @jdantonio's talks on concurrent ruby (there are a couple of them). |
Awesome reply, thanks very much!!! |
@fxn The talk I gave at RubyConf last week specifically addresses the GIL and why it doesn't make volatility/visibility promises. It may answer your question. The video isn't online at Confreaks yet, but it should be available within a few days. Keep an eye on this page for it to appear. @fxposter Thanks for the shout-out! |
@jdantonio wow, so MRI requires synchronization for reading/writing shared vars in different threads? Waiting for your talk too. |
The TL;DR is that MRI makes no guarantees--it has no formal memory model. The GIL implicitly makes most var reads and writes safe, but nothing would prevent future updates from changing that. There are many operations within MRI which silently release the GIL. The best practice is to not depend in the GIL for thread safety. |
True, but we have no choice. :)
But thread-safety and visibility are not the same thing.
This is the most interesting thing. :) But as far as I understand - if you are in ruby-only land (ie: don't use extensions) - you are safe, because there always be a "happens-before" relationship between writing var in one thread, and reading in the other, isn't it? |
Yes, MRI has happens-before semantics, but not a guarantee. That's a side-effect. If you are only writing for MRI you generally don't have to worry. My personal preference is to use the atomic variables in concurrent-ruby. Then you can take advantage of our stronger guarantees and also support the other runtimes. But I'm biased. :-) |
Implements an evented file system monitor to asynchronously detect changes in the application source code, routes, locales, etc.
To opt-in load the listen gem in
Gemfile
:Original work by @puneet24 for GSoC 2015, later iterated by yours truly.