DDP adding too much magic to hash containers? #75
Comments
This is mostly JSON::XS's fault. Since version 2.25 it has given up on sorting tied hashes, but it never implemented that check completely. Since you rarely encounter non-tied SvRMAGICAL hashes, the bug went unnoticed until now. Still, it's a strange decision on Data::Printer's part to use Hash::Util::FieldHash, which eats an enormous amount of memory for the task it's used for and also permanently slows down any access to the structure you've passed to it.
Oh... that's not good to hear, on both points. Regardless, I'm probably going to have to drop DDP after reading your second paragraph. I'm currently trying to use it to format data in log output, but I think I'd end up with a significant performance hit, as the code I'm working with already bottlenecks fairly heavily on logging.
I'll provide a suitable patch to the Cpanel folks in a couple of days. As for Lehmann, you can try, but I'm quite skeptical. As for the memory/performance hit: it only matters in really high-pressure applications. If you examine just some structures from time to time, it's OK (that's what we do at $work, and it's why I'm so paranoid about performance). If you throw away what you've just DDP'ed (like request vars), it's also OK. The only case where it really matters is when you constantly dump long-lived objects. For me, Data::Printer is nevertheless a tool of great convenience.
Sadly, my exact use is examining pieces of a very large tree of database structures that sometimes get dumped to disk as JSON (hence this issue). And it's being done in logs, not only for debugging -- I think this is the worst possible scenario for DDP :( Thank you very much for submitting a patch for JSON though, it's very appreciated :)
@Timbus thanks for reporting, and apologies for not saying anything back when you did. Perl itself does not guarantee that the internal representation of a variable remains the same, and other people have bumped into a similar issue (read the comments below the article for a complete explanation). The gist of it is just what @dur-randir++ said: this is mostly a JSON::XS issue (last I checked, mlehmann argued that it is actually a bug in Perl itself and refuses to patch his code over a bug in Perl).

Finally, regarding any noticeable slowdowns or bumps in memory usage: @dur-randir I am very open to a PR that replaces fieldhash() usage completely if you (or anyone, for that matter) can do it, together with some sort of benchmark showing us the gains -- and of course with some tests so we can make sure everything is still running smoothly :D

Thank you both again!
@garu, it's great that you're concerned about the quality of your modules. I'll try to summarize what's wrong with fieldhash().
I can write a custom module that would be much closer to a direct 'exists $p->{_seen}->{_object_id($item)}' equivalent, with two available trade-offs:
References would still be upgraded to the PVMG type, but without any flags set or other observable effects. The one problem is that this would be an XS module, while DDP currently has only pure-perl or core deps. I can make it optional and provide a common layer over both it and fieldhash() usage, if you approve of this path.
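As a rough illustration of the "common layer" idea above: prefer a fast XS object-id helper when one is installed, and fall back to plain refaddr otherwise. DDP::XS::ObjectId is a made-up name here; the dispatch pattern, not the module, is the point.

```perl
use strict;
use warnings;

# Hypothetical optional-XS dispatch: DDP::XS::ObjectId does not exist,
# so this falls back to the pure-perl-visible Scalar::Util::refaddr.
my $object_id = eval {
    require DDP::XS::ObjectId;            # made-up module name
    DDP::XS::ObjectId->can('object_id');
} || do {
    require Scalar::Util;
    Scalar::Util->can('refaddr');
};

# Either way, callers just use the one coderef:
my %seen;
my $item = {};
$seen{ $object_id->($item) } = 1;
print exists $seen{ $object_id->($item) } ? "seen\n" : "new\n";   # prints "seen"
```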
Furthermore, the following code leaks 1 SV per iteration:
Uh, I know this is pretty late and isn't related to my issue anymore, but I personally think @dur-randir's addition would be fantastic and I'd certainly be happy to use it, although I sadly can't speak on behalf of @garu approving it or not. I think I'll reopen the issue. At the very least the title is still relevant.
Usually when I walk a possibly recursive data structure I use a %seen hash of refaddr(). Is there a problem that FieldHash solves that refaddr can't?
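For context, the %seen/refaddr pattern mentioned above usually looks something like this (a minimal sketch; `walk` is an illustrative helper, not DDP's actual code):

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Copy a structure while guarding against cycles: %$seen is keyed on
# each reference's address, so revisiting a node short-circuits.
sub walk {
    my ($data, $seen) = @_;
    $seen ||= {};
    return '<cycle>' if ref $data && $seen->{ refaddr $data }++;

    my $type = reftype($data) // '';
    return { map { $_ => walk($data->{$_}, $seen) } keys %$data }
        if $type eq 'HASH';
    return [ map { walk($_, $seen) } @$data ]
        if $type eq 'ARRAY';
    return $data;
}

my $node = { name => 'root' };
$node->{self} = $node;                 # introduce a cycle
my $copy = walk($node);
print "$copy->{self}\n";               # prints "<cycle>"
```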
Sorry for the long wait! @nrdvana the problem with using refaddr is that Perl will reuse references to destroyed objects:

    use strict;
    use warnings;
    use Scalar::Util qw(refaddr);

    {
        package Foo;
        sub new {
            my $class = shift;
            return bless {}, $class;
        }
    }

    for (1..10) {
        my $obj = Foo->new;
        print "Object's reference is " . refaddr($obj) . "\n";
    }
So if we want to make sure we are looking at the right object at all times, we need something like fieldhashes. That said, the main usage of Data::Printer expects to forget about variables after they are printed, and variables are not supposed to go away while we are printing them (we hold the extra references), so we should be OK with refaddr. I dropped fieldhashes altogether -- let's see how it goes!
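The difference is easy to observe: when each object dies before the next is created, its address is free for reuse, whereas holding the references keeps every address distinct (illustrative snippet, not DDP code):

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Objects destroyed each iteration: Perl may (and usually does)
# hand out the same address again and again.
my %recycled;
for (1 .. 10) {
    my $obj = bless {}, 'Foo';
    $recycled{ refaddr $obj }++;
}

# Objects all kept alive: their addresses are necessarily unique.
my @alive = map { bless {}, 'Foo' } 1 .. 10;
my %held;
$held{ refaddr $_ }++ for @alive;

printf "recycled loop saw %d distinct address(es)\n", scalar keys %recycled;
printf "held loop saw %d distinct addresses\n",       scalar keys %held;   # always 10
```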
t/011-class.t fails on all perls except 5.18, for exactly this reason -- ICanHazStringMethodTwo->new returns the same object address as ICanHazStringMethodOne->new. I see two options for the API:
So, I have a strange issue using DDP in combination with JSON::XS. I'm not entirely sure who is at fault here, but I thought I'd have a better chance bugging you rather than MLehmann.
Here's the issue:
Note the hash is unordered, even though 'canonical' should order the keys alphabetically. This problem does not occur if I don't pass the hash to DDP, which means something in DDP is changing it (and Devel::Peek confirms that DDP adds a ton of magic to the hash).
Looking at JSON::XS source shows that it sorts differently based on SV magic, but I'm not enough of a wizard to know what it's actually doing. All I know is that someone is doing something wrong, somewhere.
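The original snippet isn't preserved above, but for reference this is the behavior `canonical` normally guarantees on a plain, magic-free hash (using core JSON::PP here, which implements the same flag as JSON::XS, so the sketch needs no non-core deps):

```perl
use strict;
use warnings;
use JSON::PP;

my %h = ( banana => 2, apple => 1, cherry => 3 );

# canonical(1) sorts the keys, so the encoding is deterministic
# regardless of Perl's internal hash order.
my $json = JSON::PP->new->canonical(1)->encode(\%h);
print "$json\n";   # {"apple":1,"banana":2,"cherry":3}
```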