-
Notifications
You must be signed in to change notification settings - Fork 21.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use symbols as keys for _reflections
#51726
Conversation
Using a simple benchmark such as: ```ruby Post.transaction do 100.times do post = Post.create!(author_id: 42, title: "A" * 50, body: "a" * 400, tags: "blah blah blah", published_at: Time.now, comments_count: 20) 20.times do post.comments.create!(author_id: 42, body: "a" * 120, tags: "blah blah blah", published_at: Time.now) end end end posts = Post.includes(:comments).to_a ``` This allocate `43,077` objects, and out of these `2,200` are caused by association names being stored as strings internally: ``` Allocated String Report ----------------------------------- 80.00 kB 2000 "post" 2000 activerecord/lib/active_record/reflection.rb:123 8.00 kB 200 "comments" 200 activerecord/lib/active_record/reflection.rb:123 ``` This is because many API accept either symbol or string, and blindly call `to_s` on it. We could avoid this with sprinkling the code with `Symbol == association ? association.name : association.to_s`, but it's ugly. This issue may be entirely solved in a future Ruby version, but it will take years: https://bugs.ruby-lang.org/issues/20350 By using symbols, we both save allocations, and marginally speed up lookups and comparisons, reducing this particular benchmark allocations by 5%. It's a bit unclear why these were ever made strings. Historically symbols were immortal and using them for user supplied data could lead to DOS vulnerability, so this may have been why, but it's not longer a concern since Ruby 2.2.
Using the samebenchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ```
end | ||
|
||
# Returns a Hash of name of the reflection as the key and an AssociationReflection as the value. | ||
# | ||
# Account.reflections # => {"balance" => AggregateReflection} | ||
# Account.reflections # => {balance: => AggregateReflection} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't reflections
still returning string keys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It's public API so I couldn't change it. I messed up the documentation, sorry :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, I see Rafael already pushed a fix. Thanks!
Using the samebenchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
Using the samebenchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
Using the samebenchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
Using the samebenchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
Using the samebenchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
Using the same benchmark as in rails#51726 `replace_keys` always allocate a multiple arrays to support composite primary keys, even though most primary keys likely aren't composite. And even when they are, we can avoid needless allocations by not using `Array#zip`. This reduce allocations by 18% (8k) on that benchmark. Before: ``` Total allocated: 4.88 MB (44495 objects) Total retained: 4.16 MB (32043 objects) allocated memory by file ----------------------------------- ... 320.00 kB activerecord/lib/active_record/associations/belongs_to_association.rb allocated objects by file ----------------------------------- ... 8000 activerecord/lib/active_record/associations/belongs_to_association.rb ``` After: ``` Total allocated: 4.56 MB (36495 objects) Total retained: 4.16 MB (32041 objects) ``` NB: `belongs_to_association` doesn't show up in top files anymore
Using the same benchmark as in rails#51726 `replace_keys` always allocate a multiple arrays to support composite primary keys, even though most primary keys likely aren't composite. And even when they are, we can avoid needless allocations by not using `Array#zip`. This reduce allocations by 18% (8k) on that benchmark. Before: ``` Total allocated: 4.88 MB (44495 objects) Total retained: 4.16 MB (32043 objects) allocated memory by file ----------------------------------- ... 320.00 kB activerecord/lib/active_record/associations/belongs_to_association.rb allocated objects by file ----------------------------------- ... 8000 activerecord/lib/active_record/associations/belongs_to_association.rb ``` After: ``` Total allocated: 4.56 MB (36495 objects) Total retained: 4.16 MB (32041 objects) ``` NB: `belongs_to_association` doesn't show up in top files anymore
Using the samebenchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
Using the samebenchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
Using the same benchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single Hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
related: rails/rails#51726 `_reflect_on_association` is also a private API and will may be broken, but it works for both String and Symbol, absorbing type differences, so use this one.
Using the same benchmark as in rails#51726 `replace_keys` always allocate a multiple arrays to support composite primary keys, even though most primary keys likely aren't composite. And even when they are, we can avoid needless allocations by not using `Array#zip`. This reduce allocations by 18% (8k) on that benchmark. Before: ``` Total allocated: 4.88 MB (44495 objects) Total retained: 4.16 MB (32043 objects) allocated memory by file ----------------------------------- ... 320.00 kB activerecord/lib/active_record/associations/belongs_to_association.rb allocated objects by file ----------------------------------- ... 8000 activerecord/lib/active_record/associations/belongs_to_association.rb ``` After: ``` Total allocated: 4.56 MB (36495 objects) Total retained: 4.16 MB (32041 objects) ``` NB: `belongs_to_association` doesn't show up in top files anymore
Using the same benchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single Hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
Using the same benchmark as in rails#51726 `replace_keys` always allocate a multiple arrays to support composite primary keys, even though most primary keys likely aren't composite. And even when they are, we can avoid needless allocations by not using `Array#zip`. This reduce allocations by 18% (8k) on that benchmark. Before: ``` Total allocated: 4.88 MB (44495 objects) Total retained: 4.16 MB (32043 objects) allocated memory by file ----------------------------------- ... 320.00 kB activerecord/lib/active_record/associations/belongs_to_association.rb allocated objects by file ----------------------------------- ... 8000 activerecord/lib/active_record/associations/belongs_to_association.rb ``` After: ``` Total allocated: 4.56 MB (36495 objects) Total retained: 4.16 MB (32041 objects) ``` NB: `belongs_to_association` doesn't show up in top files anymore
Using the same benchmark as rails#51726 A significant part of the memory footprint comes from `Result#each`: ``` Total allocated: 4.61 MB (43077 objects) Total retained: 3.76 MB (27621 objects) allocated memory by file ----------------------------------- 391.52 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 374.40 kB activerecord/lib/active_record/result.rb ``` Rows are initially stored as arrays, but `Result#each` convert them to hashes. Depending on how many elements they contain, hashes use between 2 and 5 times as much memory than arrays. | Length | Hash | Array | Diff | | --------- | --------- | --------- | --------- | | 1 | 160B | 40B | -120B | | 2 | 160B | 40B | -120B | | 3 | 160B | 40B | -120B | | 4 | 160B | 80B | -80B | | 5 | 160B | 80B | -80B | | 6 | 160B | 80B | -80B | | 7 | 160B | 80B | -80B | | 8 | 160B | 80B | -80B | | 9 | 464B | 160B | -304B | | 10 | 464B | 160B | -304B | | 11 | 464B | 160B | -304B | | 12 | 464B | 160B | -304B | | 13 | 464B | 160B | -304B | | 14 | 464B | 160B | -304B | | 15 | 464B | 160B | -304B | | 16 | 464B | 160B | -304B | | 17 | 912B | 160B | -752B | | 18 | 912B | 160B | -752B | | 19 | 912B | 320B | -592B | | 20 | 912B | 320B | -592B | | 21 | 912B | 320B | -592B | | 22 | 912B | 320B | -592B | | 23 | 912B | 320B | -592B | | 24 | 912B | 320B | -592B | | 25 | 912B | 320B | -592B | | 26 | 912B | 320B | -592B | | 27 | 912B | 320B | -592B | | 28 | 912B | 320B | -592B | | 29 | 912B | 320B | -592B | | 30 | 912B | 320B | -592B | | 31 | 912B | 320B | -592B | | 32 | 912B | 320B | -592B | | 33 | 1744B | 320B | -1424B | | 34 | 1744B | 320B | -1424B | | 35 | 1744B | 320B | -1424B | | 36 | 1744B | 320B | -1424B | | 37 | 1744B | 320B | -1424B | | 38 | 1744B | 320B | -1424B | | 39 | 1744B | 640B | -1104B | | 40 | 1744B | 640B | -1104B | | 41 | 1744B | 640B | -1104B | | 42 | 1744B | 640B | -1104B | | 43 | 1744B | 640B | -1104B | | 44 | 1744B | 640B | -1104B | | 45 | 1744B | 640B | -1104B | | 46 | 1744B | 640B | -1104B | | 47 | 1744B | 640B | -1104B | | 48 | 1744B | 640B | -1104B | | 49 | 1744B | 640B | -1104B | | 50 | 1744B | 640B | -1104B | Rather than to convert rows into hashes, we can loopkup the column index into a single Hash common to all rows. To not complexify the code too much, rather than to pass the row array and the column index, we wrap both into an `IndexedRow` object, which uses an extra `40B` object, but that's still less memory even in the worst case. After: ``` Total allocated: 4.32 MB (43079 objects) Total retained: 3.65 MB (29725 objects) allocated memory by file ----------------------------------- 101.66 kB activerecord/lib/active_record/result.rb retained memory by file ----------------------------------- 84.70 kB activerecord/lib/active_record/result.rb ``` As for access speed, it's of course a bit slower, but not by much, it's between `1.5` and `2` times slower, but remains in the 10's of M iterations per second, so I think this overhead is negligible compared to all the work needed to access a model attribute. Also the `LazyAttributeSet` class only access this once, after the attribute is casted, the resulting value is still stored in a regular `Hash`.
Using a simple benchmark such as:
Full benchmark
This allocates
43,077
objects, and out of these2,200
are caused by association names being stored as strings internally:This is because many API accept either symbol or string, and blindly call
to_s
on it. We could avoid this with sprinkling the code withSymbol == association ? association.name : association.to_s
, but it's ugly.This issue may be entirely solved in a future Ruby version, but it will take years: https://bugs.ruby-lang.org/issues/20350
By using symbols, we both save allocations, and marginally speed up lookups and comparisons, reducing this particular benchmark allocations by 5%.
It's a bit unclear why these were ever made strings. Historically symbols were immortal and using them for user supplied data could lead to DOS vulnerability, so this may have been why, but it's not longer a concern since Ruby 2.2.