Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lazily create nested enums #11

Closed
tkrajacic opened this issue Feb 9, 2015 · 12 comments
Closed

lazily create nested enums #11

tkrajacic opened this issue Feb 9, 2015 · 12 comments
Milestone

Comments

@tkrajacic
Copy link

Reading https://devforums.apple.com/message/1101244#1101244 it makes me think that this would be the right way to create the nested enum structure lazily.

Any thoughts?

@drmohundro
Copy link
Owner

Delayed sounds like a very interesting way to do this. My initial thought had been to keep a queue of element names and then enumerate the queue when the final element was requested, but this might be a more elegant solution. I'll try to get a spike going with some of these ideas.

@tkrajacic
Copy link
Author

Have you had any time to look into this?
I just happen to have a 35MB XML file, with around 500k lines, and it is not pretty ;)
Takes about 16 seconds to parse (in Release mode on my MBP) and takes a whopping 700MB of RAM :D

Profiling shows that most of the time is (kind of obviously) spent in retain and release calls. So it would be hugely beneficial to circumvent Swifts memory management here.
Something like allocating one block of memory and using pointers to address it>
(I guess that should be another issue probably)

Of course, most people will never see these problems when working with small XML files and I could use the raw NSXMLParser. But hey, SWXMLHash is really pretty ;) (and 'Swifty')

@drmohundro
Copy link
Owner

Now that is a perfect use case for lazy parsing! 😃

I haven't had a chance to yet, but it is definitely on my to do list. Would it be possible to share the XML file (or perhaps a sanitized version of it) so that I can easily compare/contrast the differences in performance, etc.? If it is too big of a hassle then no worries.

@tkrajacic
Copy link
Author

Sadly I am not allowed to give out the data :(

@drmohundro
Copy link
Owner

See this gist for my current thoughts on how I think I might approach lazy loading.

Anything I'm missing with that general approach? I'm conceptually thinking of each subscript operation as a stream or path of operations and, given that, I should be able to parse only the XML elements in that stream.

@drmohundro
Copy link
Owner

I've begun work on this and have created #17 to track this work.

@drmohundro
Copy link
Owner

I think this is ready for a first iteration now - apologies on taking so long getting something going here! @tkrajacic do you think you could pull down PR #17 and see how it handles your 35MB file? Before I bring this in to master, I'd like to make sure I'm actually improving the performance :)

A few notes... I initially tried to detect when it was safe to stop parsing, but I don't think that will work with all XML fragments, so the NSXMLParser will still look through the entire document, but will only load results that match what was requested. The actual parsing doesn't get triggered until a call to all or element is made (or a call that calls them like children).

@tkrajacic
Copy link
Author

Ok, I am speechless :)
Without changing anything in my code (I am still calling just the parse function) parsing now takes only 3.2 seconds! (previously 16-20 seconds)
Memory consumption is also way down to around 200-300MB (it increases though when scrolling the table, where I present text of the nodes)

Those are just quick comparisons.

And now for the kicker:
When using your new lazy parsing, I get results almost immediately (hard to measure exactly because I fetch a few nodes and present them in a table.
And the best part, it only takes about 80MB of memory, and when scrolling the table actually never goes higher than maybe 90-100MB and immediately dropping back to 80MB when stopping scrolling.
This is brilliant!
You @drmohundro are the man!

I still can't stop grinning :D

@drmohundro
Copy link
Owner

That's awesome! Glad it works!

Quick question - are you targeting Xcode 6.3 or an earlier version? I only ask because if you're targeting an earlier version, I will just merge this into master. There are very few changes in my PR that assume Swift 1.2, so it would be easy to drop them.

@tkrajacic
Copy link
Author

I am only using Xcode 6.3 anymore.

@drmohundro
Copy link
Owner

Okay - I'll just merge into the xcode-6.3 branch then.

@drmohundro
Copy link
Owner

It's now live in the xcode-6.3 branch - I'll go ahead and close this issue. Thanks so much for your help with both the suggestion and with testing this!

@drmohundro drmohundro added this to the 1.0 milestone Mar 25, 2015
klaaspieter pushed a commit to klaaspieter/SWXMLHash that referenced this issue Nov 18, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants