PDLapi warnings in parallel calculations with PDL::Opt::NonLinear #6

Closed
YuryPakhomov opened this issue Apr 8, 2024 · 4 comments

@YuryPakhomov

Hello all!

I use the PDL::Opt::NonLinear module in my code. It works fine, but slowly, because the minimization function is complex.
The minimization function can be parallelized with Parallel::ForkManager, and it works fine when called on its own.
But when I pass the function to PDL::Opt::NonLinear, many warnings appear:

	(in cleanup) INVALID PDL MAGICNO, got hex=0xf02100 (37593680)
 at /usr/local/share/perl5/Parallel/ForkManager/Child.pm line 26.
	eval {...} called at /usr/local/share/perl5/Parallel/ForkManager/Child.pm line 26
Warning: special data without datasv is not freed currently!! at test.pl line 0.
	eval {...} called at test.pl line 0
Warning: special data without datasv is not freed currently!! at test.pl line 0.
	eval {...} called at test.pl line 0
Warning: special data without datasv is not freed currently!! at test.pl line 0.
	eval {...} called at test.pl line 0
Warning: special data without datasv is not freed currently!! at test.pl line 0.
	eval {...} called at test.pl line 0
Warning: special data without datasv is not freed currently!! at test.pl line 0.
	eval {...} called at test.pl line 0

The simple code to reproduce this output is:

#! /usr/bin/perl
use PDL;
use PDL::NiceSlice;
use PDL::Opt::NonLinear;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(4);

my $x=pdl(1);
my $fx=pdl(0);
my $gx=zeroes(nelem($x));

# This single call works fine
#fg_func($fx,$gx,$x); exit;

# Optimization algorithm
my $bounds = zeroes(nelem($x),2);
$bounds(0,0) .= pdl(0.0);
$bounds(0,1) .= pdl(10.0);

my $tbounds = zeroes(nelem($x));
$tbounds .= 2;

my $gtol = pdl(0.9);
my $pgtol = pdl(0.1);
my $factr = pdl(1e7);
my $m = pdl(10);
my $print = pdl(-1);
my $maxit = pdl(long,15);
my $info = pdl(long,1);
my $iv = zeroes(long,44);
my $v = zeroes(29);

lbfgsb($fx, $gx, $x, $m, $bounds, $tbounds, $maxit, $factr, $pgtol, $gtol,
       $print, $info, $iv, $v, \&fg_func);

sub fg_func{
 my ($f, $g, $x) = @_;
 $f .= 0;
 $g .= pdl(0);

# Parallel block. This is a demo without any calculation: it forks a child process and terminates it immediately.
# If this block is commented out, the program works fine.
 DATA_LOOP:
 for my $i (0 .. nelem($x)-1){
  my $ppid = $pm->start and next DATA_LOOP;
  $pm->finish($i); # Terminates the child process
 }
 $pm->wait_all_children;

 return 0;
}

The warning points to line 26 of "Child.pm", which is the CORE::exit($x || 0) call in the finish subroutine:

sub finish {
  my ($s, $x, $r)=@_;

  $s->store($r);

    CORE::exit($x || 0);
}

The messages "INVALID PDL MAGICNO" and "Warning: special data without datasv is not freed currently!!" are both generated by Basic/Core/pdlapi.c.

@mohawk2

mohawk2 commented Apr 13, 2024

The good news is that I can reproduce that locally! More to follow.

@mohawk2

mohawk2 commented Apr 13, 2024

I suspect this is some horrible interaction with garbage collection and processes exiting. If we change your script from using $pm->finish to POSIX::_exit, the error doesn't occur. I don't understand why yet, since the PDL objects should be destroyed as normal in sub-processes.

There's a clue: turning on PDL::Core::set_debugging(1) shows that the ndarray getting the cleanup message has the state PDL_DONTTOUCHDATA, which is what triggers the message. That state is set in PDL::Opt::NonLinear's C wrapper for Perl functions (like your fg_func), which is why you don't see errors when you call your function standalone: it's specifically a P:O:NL thing.

@mohawk2

mohawk2 commented Apr 13, 2024

I think that's somewhat the answer; P:O:NL isn't designed to be used in the way you're using it. It sets up some very thin ndarrays (with a data pointer, a dims array, and a bit of state to say "don't touch"), then calls the user-supplied Perl function with them. On return, it cleans those up, but your call to exit in the middle (in the forked child) skips that last part, so they're still visible to Perl, which tries to call DESTROY on them.

A workaround is to use POSIX::_exit as identified above. Also, PDL is working as designed, so I'm transferring this issue to P:O:NL.
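
The difference between the two exit paths can be seen without PDL at all. The following is only a minimal sketch in core Perl, not the P:O:NL code: `Fragile` is a hypothetical stand-in for an object that should not be destroyed in a child, and `child_ran_destructor` is an illustrative helper. It forks a child while such an object is alive and reports whether the child ran DESTROY before exiting.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();
use IO::Handle;

STDOUT->autoflush(1);    # avoid re-flushing buffered output from a forked child

# Hypothetical stand-in for an object that must not be destroyed in a child.
package Fragile;
our $FD;                 # pipe descriptor, set only in the child
sub new     { return bless {}, shift }
sub DESTROY { POSIX::write($FD, "x", 1) if defined $FD }

package main;

# Fork one child that leaves via 'exit' or '_exit'; return 1 if the child
# ran Fragile::DESTROY before it went away, 0 otherwise.
sub child_ran_destructor {
    my ($how) = @_;
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $obj = Fragile->new;          # alive in the parent when we fork
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                 # child
        close $reader;
        $Fragile::FD = fileno($writer);
        POSIX::_exit(0) if $how eq '_exit';   # no DESTROY, no END blocks
        exit 0;                               # normal Perl shutdown
    }
    close $writer;                   # parent keeps only the read end
    my $got = sysread($reader, my $buf, 1);   # 0 bytes means EOF: nothing written
    waitpid($pid, 0);
    return $got ? 1 : 0;
}

printf "exit:  destructor ran in child? %s\n", child_ran_destructor('exit')  ? 'yes' : 'no';
printf "_exit: destructor ran in child? %s\n", child_ran_destructor('_exit') ? 'yes' : 'no';
```

In the script from this issue, the equivalent change would be to call POSIX::_exit(0) in the child instead of $pm->finish($i). One trade-off to keep in mind: since finish is what stores a child's return data, leaving through POSIX::_exit presumably means that data never reaches the parent.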

@mohawk2 mohawk2 transferred this issue from PDLPorters/pdl Apr 13, 2024
@mohawk2

mohawk2 commented Apr 25, 2024

There is nothing else that can be done on this issue.
