Fix for memory issues #18

Closed
wants to merge 2 commits into from

4 participants

Tomasz Lipinski Matteo Collina Ivar Vasara John Nunemaker
Tomasz Lipinski

Recently I had to process a fairly large XML file (about 45MB, with 190,000 elements directly under the root and many other nested elements), but happymapper kept crashing because it used up all available memory. I've changed happymapper's codebase so that, when the :limit option is set and a block is given, it no longer returns the whole collection of objects but yields it part by part. With :limit set to 1000, for instance, memory usage rises by only 20% and stays at that level for the whole run.
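
A minimal sketch of the control flow described above, assuming plain Ruby arrays instead of libxml nodes. The names `map_in_batches` and `build` are hypothetical stand-ins, not part of happymapper; the slicing, yielding, and accumulation logic mirrors the patch in this pull request.

```ruby
# Hypothetical stand-in for mapping one XML node to an object.
def build(node)
  { :id => node }
end

# Hypothetical stand-in for the patched parse method.
def map_in_batches(nodes, options = {})
  limit = options[:limit] || nodes.size
  return [] if limit == 0 # each_slice(0) would raise ArgumentError

  collection = []
  nodes.each_slice(limit) do |slice|
    part = slice.map { |n| build(n) }
    if options[:limit] && block_given?
      yield part         # hand the batch to the caller; nothing is retained
    else
      collection += part # no block or no :limit: accumulate as before
    end
  end
  collection
end

batches = []
result = map_in_batches((1..5).to_a, :limit => 2) { |part| batches << part.size }
# batches is [2, 2, 1]; result is [] because every batch went to the block

everything = map_in_batches((1..5).to_a)
# everything.size is 5
```

Note that the node list is still found in full before slicing; what the batching avoids is holding every mapped object in memory at once.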

Matteo Collina

This is awesome. I'll move from ROXML to your branch asap :).
Could this be merged upstream?

Tomasz Lipinski

Matteo, @burtlo pulled these changes into his fork and documented them (the option was renamed to :in_groups_of): burtlo/happymapper@5538d3f.

Ivar Vasara

@jnunemaker, any chance you could take a look at this? It would be great to get this functionality into the main branch.


Showing 2 unique commits by 1 author.

Jun 21, 2011
pplcanfly yield partial result by using :limit option 8f99c16
Sep 10, 2011
Tomasz Lipinski Edited README.rdoc via GitHub 87b90b5

Showing 3 changed files with 45 additions and 67 deletions.

  1. +2 -54 README.rdoc
  2. +25 -13 lib/happymapper.rb
  3. +18 -0 spec/happymapper_spec.rb
56 README.rdoc
@@ -1,55 +1,3 @@
-= happymapper
+This fork adds an option that allows user to limit results returned by parser.

-== DESCRIPTION:
-
-XML to object mapping library. I have included examples to help get you going. The specs should also point you in the right direction.
-
-== FEATURES:
-
-* Easy to define xml attributes and elements for an object
-* Fast because it uses libxml-ruby under the hood
-* Automatic conversion of xml to defined objects
-
-== EXAMPLES:
-
-Here is a simple example that maps Twitter statuses and users.
-
-  class User
-    include HappyMapper
-
-    element :id, Integer
-    element :name, String
-    element :screen_name, String
-    element :location, String
-    element :description, String
-    element :profile_image_url, String
-    element :url, String
-    element :protected, Boolean
-    element :followers_count, Integer
-  end
-
-  class Status
-    include HappyMapper
-
-    element :id, Integer
-    element :text, String
-    element :created_at, Time
-    element :source, String
-    element :truncated, Boolean
-    element :in_reply_to_status_id, Integer
-    element :in_reply_to_user_id, Integer
-    element :favorited, Boolean
-    has_one :user, User
-  end
-
-See examples directory in the gem for more examples.
-
-http://github.com/jnunemaker/happymapper/tree/master/examples/
-
-== INSTALL:
-
-* gem install happymapper
-
-== DOCS:
-
-http://rdoc.info/projects/jnunemaker/happymapper
+https://github.com/burtlo/happymapper/commit/5538d3f56b2d231bf5d8830af330b040328c6a1a
38 lib/happymapper.rb
@@ -96,24 +96,36 @@ def parse(xml, options = {})
       xpath += tag_name

       nodes = node.find(xpath, Array(namespace))
-      collection = nodes.collect do |n|
-        obj = new

-        attributes.each do |attr|
-          obj.send("#{attr.method_name}=",
-                    attr.from_xml_node(n, namespace))
-        end
+      limit = options[:limit] || nodes.size
+      return [] if limit == 0

-        elements.each do |elem|
-          obj.send("#{elem.method_name}=",
-                    elem.from_xml_node(n, namespace))
-        end
+      collection = []
+      nodes.each_slice(limit) do |slice|
+        part = slice.collect do |n|
+          obj = new
+
+          attributes.each do |attr|
+            obj.send("#{attr.method_name}=",
+                      attr.from_xml_node(n, namespace))
+          end
+
+          elements.each do |elem|
+            obj.send("#{elem.method_name}=",
+                      elem.from_xml_node(n, namespace))
+          end

-        obj.send("#{@content}=", n.content) if @content
+          obj.send("#{@content}=", n.content) if @content

-        obj.class.after_parse_callbacks.each { |callback| callback.call(obj) }
+          obj.class.after_parse_callbacks.each { |callback| callback.call(obj) }

-        obj
+          obj
+        end
+        if options[:limit] && block_given?
+          yield part
+        else
+          collection += part
+        end
       end

       # per http://libxml.rubyforge.org/rdoc/classes/LibXML/XML/Document.html#M000354
18 spec/happymapper_spec.rb
@@ -395,4 +395,22 @@ def mapping
     end
   end

+  describe "with limit option" do
+    it "should return results with limited size: 6" do
+      sizes = []
+      posts = Post.parse(fixture_file('posts.xml'), :limit => 6) do |a|
+        sizes << a.size
+      end
+      sizes.should == [6, 6, 6, 2]
+    end
+
+    it "should return results with limited size: 10" do
+      sizes = []
+      posts = Post.parse(fixture_file('posts.xml'), :limit => 10) do |a|
+        sizes << a.size
+      end
+      sizes.should == [10, 10]
+    end
+  end
+
 end
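
The expected batch sizes in the two specs above follow directly from how `each_slice` splits a list. A small sketch, assuming the posts.xml fixture contains 20 posts (the count both expectations add up to; the fixture itself is not shown here):

```ruby
# Assumption: the fixture holds 20 posts, since 6 + 6 + 6 + 2 == 10 + 10 == 20.
post_count = 20

# each_slice yields full groups, then one shorter group for any remainder.
sizes_for_6  = (1..post_count).each_slice(6).map(&:size)
sizes_for_10 = (1..post_count).each_slice(10).map(&:size)

# sizes_for_6  is [6, 6, 6, 2]
# sizes_for_10 is [10, 10]
```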
