Performance issue with large-ish records #20
Comments
I think the profiling may be misleading -- it looks like the expensive part is calling
require 'ruby-prof'
require 'mods'
require 'faraday'
purl = 'https://purl.stanford.edu/pj169kw1971.mods'
mods_file = Faraday.get(purl).body
result = RubyProf.profile do
record = Mods::Record.new.from_str(mods_file)
record.inspect; 1
end
"Mods Version: #{Mods::VERSION}"
RubyProf::FlatPrinter.new(result).print(STDOUT)
Perhaps. To be totally honest, I'm not a profiling expert. Maybe I should have just left that data out (because it is possibly misleading), but I figured somebody who is more facile with profiling could take this and run with it. More than anything, I just wanted to point out that this issue exists (I'll leave the data out next time). I don't quite understand how calling
That being said, simply calling
require 'mods'
require 'faraday'
purl = 'https://purl.stanford.edu/pj169kw1971.mods'
mods_file = Faraday.get(purl).body
start_time = Time.now
record = Mods::Record.new.from_str(mods_file)
puts record.sort_title.inspect
end_time = Time.now
puts "We took #{end_time - start_time} seconds to get the sort_title"
So in this case I want to get the sort title of this record. Doing so took more than 84 seconds. I'll leave what to profile up to somebody who may be more knowledgeable in that area (because clearly I am not).
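For anyone who wants to narrow this down, here is a minimal sketch of timing the parse step and the accessor call separately instead of as one span. `time_stage` is a hypothetical helper, not something the mods gem provides, and the two stages below use stand-in string work so the snippet runs without the gem or network access. With the gem installed, the stages would presumably be `Mods::Record.new.from_str(mods_file)` and `record.sort_title`.

```ruby
# Hypothetical helper (not part of the mods gem): runs a block and returns
# [result, elapsed_seconds], measured with the monotonic clock so wall-clock
# adjustments cannot skew the reading.
def time_stage
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [result, elapsed]
end

# Stand-in work (an assumption for illustration only). With the real gem:
#   stage 1: Mods::Record.new.from_str(mods_file)
#   stage 2: record.sort_title
xml = "<mods><titleInfo><title>Example</title></titleInfo></mods>"
parsed, parse_secs = time_stage { xml.dup }
title, title_secs = time_stage { parsed[/<title>(.*?)<\/title>/, 1] }

puts format("parse: %.4fs, sort_title: %.4fs", parse_secs, title_secs)
puts title.inspect
```

If almost all of the time lands in one stage, that is probably the one worth profiling.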
Sorry -- what I mean is,
So when you run my code example in #20 (comment) you're getting the sort_title output in 0.3s? I think that's a typical use case: store the record to a variable, then call a method on that object to extract a piece of data (the
(this may be specific to records w/ many subjects, which is the case in the example I'm using)
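One way to check whether the subject count is what matters would be to generate synthetic records of different sizes and time each one. A rough sketch, assuming a hypothetical `synthetic_mods` helper (not part of the mods gem) that emits a bare-bones MODS document with a given number of subject elements. The element contents are made up for illustration, and the gem call is left as a comment so the snippet runs on its own.

```ruby
# Hypothetical helper (not part of the mods gem): builds a minimal MODS
# document containing `count` <subject> elements, each with one <topic>.
def synthetic_mods(count)
  mods_ns = "http://www.loc.gov/mods/v3" # MODS v3 namespace
  subjects = (1..count).map do |i|
    "  <subject><topic>Topic #{i}</topic></subject>"
  end
  [
    %(<?xml version="1.0" encoding="UTF-8"?>),
    %(<mods xmlns="#{mods_ns}">),
    "  <titleInfo><title>Synthetic record</title></titleInfo>",
    *subjects,
    "</mods>"
  ].join("\n")
end

[10, 100, 280].each do |n|
  xml = synthetic_mods(n)
  puts "#{n} subjects -> #{xml.bytesize} bytes"
  # With the gem installed, each document could be timed with something like:
  #   Mods::Record.new.from_str(xml).sort_title
end
```

If the time grows much faster than the subject count, that would point at the subject handling rather than XML size in general.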
In looking into performance issues reported in METADOR-56 I have started doing a little profiling in the Mods/StanfordMods/ModsDisplay stack.
I think my profiling has indicated a pretty serious performance issue in the Mods gem when instantiating new records from large-ish records.
Example XML: https://purl.stanford.edu/pj169kw1971.mods (43.7 KB)
(This record has about 280 subject elements in it)
Simply instantiating a new Mods::Record via from_str takes more than 1 minute. Admittedly 43.7 KB is not that large of an XML file, so I'm inclined to believe that something else is going awry somewhere here or in one of this gem's dependencies.
Here is my profiling run:
Result