Shared mutable state is not thread safe. #663
Comments
I'm not that well versed with multi-threading, but can it be a problem in this case? The worst case is that the key is set twice (or multiple times) to the same value, isn't it?
I did a live stream about this: https://www.youtube.com/watch?v=Wu9LRNOc5pQ Yes, it can be a problem. I would suggest:
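A minimal sketch of the concern (my names, not kramdown's code): a `Hash` default block that writes back into a shared hash is shared mutable state. On Ruby implementations without a global interpreter lock (JRuby, TruffleRuby), two threads can miss on the same key concurrently and run the block at the same time, potentially corrupting the hash; wrapping the write in a `Mutex` serializes it.

```ruby
# Hypothetical illustration of the two patterns under discussion.
# The default block mutates the hash, so the bare version is shared
# mutable state; the Mutex-protected version serializes the writes.
unsafe = Hash.new {|h, k| h[k] = "convert_#{k}" }

mutex = Mutex.new
safe = Hash.new {|h, k| mutex.synchronize { h[k] = "convert_#{k}" } }

# Hammer the protected hash from several threads at once.
threads = 8.times.map do
  Thread.new { 1000.times {|i| safe[i % 20] } }
end
threads.each(&:join)
safe.size  # => 20, one entry per distinct key
```

On CRuby the unprotected version will rarely misbehave because of the GVL, which is why the problem is easy to miss in testing.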
Thanks for the live stream - you make it very clear 👍

There is a significant difference in performance, probably mostly due to fewer created objects and less garbage collection. Here is a YAML file for benchmark-driver:

```yaml
prelude: |
  hash = {:root=>1, :hr=>1, :p=>315, :text=>631, :typographic_sym=>21, :html_element=>1, :blank=>387, :header=>45, :strong=>5, :smart_quote=>24, :ul=>21, :li=>74, :a=>59, :em=>15, :dl=>4, :dt=>14, :dd=>14, :codespan=>88, :codeblock=>94, :blockquote=>29, :math=>1}
  dispatcher = Hash.new {|h, k| h[k] = "convert_#{k}" }
  mutex = Mutex.new
  mutex_dispatcher = Hash.new {|h, k| mutex.synchronize { h[k] = "convert_#{k}" } }
benchmark:
  direct: |
    hash.each {|name, calls| calls.times { "convert_#{name}" } }
  hash: |
    hash.each {|name, calls| calls.times { dispatcher[name] } }
  hash with mutex: |
    hash.each {|name, calls| calls.times { mutex_dispatcher[name] } }
```

Here is the result on Ruby 2.6.5:
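For anyone without benchmark-driver installed, the same comparison can be sketched as a plain script with the stdlib `Benchmark` module (the element-count hash is trimmed here for brevity, and absolute timings will differ from the benchmark-driver setup above):

```ruby
require 'benchmark'

# Standalone sketch of the benchmark-driver comparison above.
# Trimmed element-count hash: element type => number of calls.
hash = {root: 1, p: 315, text: 631, codespan: 88, blockquote: 29}
dispatcher = Hash.new {|h, k| h[k] = "convert_#{k}" }
mutex = Mutex.new
mutex_dispatcher = Hash.new {|h, k| mutex.synchronize { h[k] = "convert_#{k}" } }

Benchmark.bm(16) do |x|
  x.report('direct')          { hash.each {|name, calls| calls.times { "convert_#{name}" } } }
  x.report('hash')            { hash.each {|name, calls| calls.times { dispatcher[name] } } }
  x.report('hash with mutex') { hash.each {|name, calls| calls.times { mutex_dispatcher[name] } } }
end
```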
I like that benchmark, but can you also try it in the context of parsing a real document? There is one more point about global state: it's never garbage collected. If someone processes a kramdown document, that state will now exist for the lifetime of the process, not the lifetime of the document. At least if you used a per-document local instance, you wouldn't need a mutex, and the state would only exist for the lifetime of the document. At least worth considering. Again, I'd opt for simple designs and only build more elaborate designs (with caches like this) when the cost of the implementation is outweighed by the value of the performance improvement. That's obviously subjective, but just for fun why don't you repeat the benchmark with a real document, e.g. a README from the project or something, and see if it actually makes more than a few points of difference.
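The per-document alternative could look roughly like this (a sketch with hypothetical `Converter`/`dispatch_name` names, not kramdown's actual classes): the cache hangs off the converter instance, so no `Mutex` is needed and the hash is garbage collected together with the document.

```ruby
# Hypothetical per-instance design: each converter owns its dispatcher
# cache, so no thread ever shares it and it dies with the instance.
class Converter
  def initialize
    @dispatcher = Hash.new {|h, k| h[k] = "convert_#{k}" }
  end

  def dispatch_name(type)
    @dispatcher[type]
  end
end

conv = Converter.new
conv.dispatch_name(:p)  # => "convert_p"
```

The trade-off is that every document re-fills its own cache, which is exactly what the "hash is initialized each time" benchmark below measures.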
Thanks for the feedback! I have added an additional test where the hash is initialized each time:
So putting the dispatcher hash into the converter instance creates a few more objects and is a bit slower, but not by much.

Running the benchmark by parsing a real document:

Without dispatcher:

With dispatcher:

So in this case it is also clearly faster with the dispatcher. I have also done another test; the version with the dispatch hash removed is clearly slower for larger documents. To sum up: the version with the dispatcher is faster, and the one with a local dispatcher is the way to go to avoid the thread safety issue.
Just reading this and noticed the solution: `kramdown/lib/kramdown/converter/html.rb` line 52 in ef22876
Looks nice. One thought: it might be interesting to use Symbols there, like `@dispatcher = Hash.new {|h, k| h[k] = :"convert_#{k}" }`, as that would deduplicate the various …
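A quick sketch of why Symbols deduplicate where Strings do not: every dynamically built `:"convert_..."` Symbol with the same content is interned to a single object, while each String interpolation allocates a fresh object.

```ruby
# Build the same name three times as Strings and as Symbols.
string_names = 3.times.map { "convert_#{:p}" }
symbol_names = 3.times.map { :"convert_#{:p}" }

string_names.map(&:object_id).uniq.size  # => 3 (three distinct Strings)
symbol_names.map(&:object_id).uniq.size  # => 1 (one interned Symbol)
```

So with Symbol values the dispatcher hash holds one object per element type instead of one freshly allocated String per cache fill.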
Ah, right, since `kramdown/lib/kramdown/converter/html.rb` line 58 in ef22876 …
Right, …
A small benchmark for confirmation:

```yaml
prelude: |
  def code
  end
  symbol = :code
  string = 'code'
benchmark:
  string: 'send(string)'
  symbol: 'send(symbol)'
```

Result on Ruby 2.6.5:
So, yes, the symbol is twice as fast; I will change the code, thanks @eregon! Side note: much of kramdown was started in the Ruby 1.8 era, so there are probably many other places where the code could be updated to newer standards...
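The YAML benchmark above can also be run as a plain Ruby script with the stdlib `Benchmark` module (a rough standalone sketch, not the original setup): calling `send` with a String forces a String-to-Symbol lookup on every call, while a Symbol skips that step.

```ruby
require 'benchmark'

# Standalone sketch of the send(string) vs send(symbol) comparison.
def code
end

symbol = :code
string = 'code'

Benchmark.bm(8) do |x|
  x.report('string') { 1_000_000.times { send(string) } }
  x.report('symbol') { 1_000_000.times { send(symbol) } }
end
```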
Done.
@gettalong I don't see the change on …
Ah, yes, now it's there!
`kramdown/lib/kramdown/converter/html.rb` line 55 in 4458c23