Runtime of program taking forever .... #3140

Closed
pprocacci opened this issue Aug 24, 2019 · 10 comments

@pprocacci

pprocacci commented Aug 24, 2019

The Problem

42 seconds of runtime to dynamically load a module which only calls new().

Expected Behavior

I expect the loading and execution of new() to be only seconds at most.

Actual Behavior

/usr/bin/time -h ./doit.pl6
40.79s real 36.02s user 4.78s sys

Steps to Reproduce

project_dir/
         | ------doit.pl6
         | ------lib/
                    | ----- Test.pm6
         | ------auto-lib/
                    | ------ Test/
                                | ----- Service/
                                          | ------ Something/
                                                      | ------- Else.pm6
                                          | ------ Something.pm6
                                | ----- Service.pm6

The above is the incomplete directory structure I am working with. For a better idea of the number of files/directories in each path:

find auto-lib/Test/Service -depth 1 -type f | wc -l
     185
find auto-lib/Test/Service -depth 1 -type d | wc -l
     185
find auto-lib/Test/Service/Something -depth 1 -type d | wc -l
       0
find auto-lib/Test/Service/Something -depth 1 -type f | wc -l
    1904
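For anyone without the tarball, a comparable tree can be approximated with stub files. This is only a sketch: the `make_tree` helper, the `StubN` module names, and the empty class bodies are hypothetical and are not part of the original project.

```shell
#!/bin/sh
# make_tree ROOT COUNT
# Create ROOT/auto-lib/Test/Service/Something/ holding COUNT stub modules.
# The stub names and contents are placeholders, not the real project files.
make_tree() {
    root=$1
    count=$2
    dir="$root/auto-lib/Test/Service/Something"
    mkdir -p "$dir"
    i=1
    while [ "$i" -le "$count" ]; do
        # One tiny class per file, mirroring the one-shape-per-file layout.
        printf 'class Test::Service::Something::Stub%s { }\n' "$i" > "$dir/Stub$i.pm6"
        i=$((i + 1))
    done
}
```

Calling `make_tree /some/scratch/dir 1904` would give a directory of roughly the size reported above. Whether it reproduces the same timings is untested.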

There are a lot of modules that I'm trying to avoid 'use'ing and instead only load the needed modules at runtime. I'm leaning on the FALLBACK method in auto-lib/Test/Service/Something.pm6 (below) to avoid having to do so. It's my understanding that the FALLBACK() method in my example (below) shouldn't matter because all I'm doing is using the method new().

In short:
doit.pl6 loads lib/Test.pm6 at compile time.
lib/Test.pm6 loads auto-lib/Test/Service/Something.pm6 at runtime.
auto-lib/Test/Service/Something.pm6 loads lib/Test.pm6 at compile time.
auto-lib/Test/Service/Something.pm6 should resolve an unknown method name at runtime by loading the matching module in FALLBACK().

I only instantiate Test and Something via new() and call nothing else.
It's obvious to me that perl6 is doing something I don't expect it to do, but there's no indication as to what that may be. A full tarball of my project can be provided if necessary.

doit.pl6

#!/usr/bin/env perl6

use lib <lib auto-lib>;
use Test;

my $client = Test.client('Something');

lib/Test.pm6

unit class Test;

use Test::Service;

my Test $instance;

method new(*%named)
{
  return $instance //= self.bless(|%named) ;
}

method instance()
{
  $instance //= Test.new;
  $instance;
}

method client(Test::Service $service! --> Test)
{
  my Str $class = "Test::Service::{$service}";
  require ::($class);
  ::($class).new;
}

auto-lib/Test/Service.pm6 (note: <RAM ...> is an incomplete list, shown only as an example)

subset Test::Service of Str where any <RAM ...>;

auto-lib/Test/Service/Something.pm6

use Test;

class Test::Service::Something is Test {

  method FALLBACK($name, *@rest, *%rest) {
    my Str $class = "Test::Service::Something::{$name}";
    require ::($class);
    ::($class).new(|@rest,|%rest);
  }

}

auto-lib/Test/Service/Something/Else.pm6

use Test::Service::Something::NeedThis;
use Test::Service::Something::NeedThat;

class Test::Service::Something::Else {
  has Test::Service::Something::NeedThis $.NeedThis;
  has Test::Service::Something::NeedThat $.NeedThat;
}

Environment

  • Operating system: FreeBSD 12
  • Compiler version (perl6 -v):

This is Rakudo Star version 2019.03.1 built on MoarVM version 2019.03
implementing Perl 6.d.

@pprocacci
Author

Additional information:

Moving the directory that the FALLBACK() method would use to a different name (one that won't resolve any dynamically loaded modules) results in an error. Why?

mv auto-lib/Test/Service/Something auto-lib/Test/Service/Something.orig
./doit.pl6
You cannot create an instance of this type (Test::Service::Something)
  in method client at /mnt/tank/Projects/test/lib/Test.pm6 (Test) line 24
  in block <unit> at ./doit.pl6 line 6

new() is already resolved as per normal. There's no reason for FALLBACK() to even be involved .... that is until I call a method that isn't handled by the class.

Does FALLBACK() do something extra, such as preloading the modules it might resolve?
Am I misunderstanding the documentation?

https://docs.perl6.org/language/typesystem#index-entry-FALLBACK_%28method%29
A method with the special name FALLBACK will be called when other means to resolve the name produce no result.

@lizmat
Contributor

lizmat commented Sep 27, 2019

Have you considered installing the module?

I'm pretty sure the problem is that you're using use lib, which means that Perl 6 needs to create a SHA1 of all files in the directories that are accessible from the directory that is indicated with "use lib". Could you verify that theory by moving the file that you want to use to an "empty" directory tree and then use lib that directory and check the performance then?
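To make the theory concrete: if an identifier for the whole `use lib` tree has to be derived from every file under it, the work grows with the number of files, whether or not they are ever loaded. The sketch below is not Rakudo's implementation; `tree_digest` is a hypothetical helper, and POSIX `cksum` stands in for SHA1 purely for portability.

```shell
#!/bin/sh
# tree_digest DIR
# Print one checksum that depends on every regular file under DIR.
# Illustration only: cksum is a stand-in for SHA1, and the real
# repository logic in Rakudo may differ.
tree_digest() {
    find "$1" -type f | LC_ALL=C sort | while IFS= read -r f; do
        # One line per file: checksum, size, path.
        cksum "$f"
    done | cksum | cut -d ' ' -f 1
}
```

Adding, removing, or editing any file under the directory changes the result, which is why such a scheme has to touch all 1904 files before a single module can be looked up in a precompilation cache.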

@pprocacci
Author

I'm pretty sure the problem is that you're using use lib, which means that Perl 6 needs to create a SHA1 of all files in the directories that are accessible from the directory that is indicated with "use lib".

Is this actually true?

test.pl6:

use lib <mylib>;
say 'test';

; truss -s 1024 -o data ./test.pl6 2>/dev/null && fgrep mylib data
read(3,"#!/usr/bin/env perl6\n\nuse lib <mylib>;\n\nsay 'test';\n",1048576) = 52 (0x34)

I see no indication that perl6 is doing this at all. In other words, the mere existence of use lib <mylib> doesn't cause perl6 to open the mylib directory and perform a directory traversal.

In fact, the mylib directory hasn't even existed during my testing.

The only time a directory traversal is performed is when I include a module either statically or dynamically via:

use Module;
or
require ::($class);

$class here is irrelevant, but it's doing what it should be doing:

fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/mmylib/META6.json",0x7fffffffc270,0x0) ERR#2 'No such file or directory'
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/mmylib",0x7fffffffc270,0x0) ERR#2 'No such file or directory'
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/mmylib/META6.json",0x7fffffffc270,0x0) ERR#2 'No such file or directory'
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/mmylib/Test/This/Out.pm6",0x7fffffffc270,0x0) ERR#2 'No such file or directory'
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/mmylib/Test/This/Out.pm",0x7fffffffc270,0x0) ERR#2 'No such file or directory'

You'll notice that it searches for META6.json first followed by the exact module name, but never does a traversal of any kind.

Another thing to note before continuing on with my response, it appears multiple stat(2) calls are happening for each file it's looking for within the directory tree:

fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/.",{ mode=drwxr-xr-x ,inode=11616,size=3,blksize=131072 },0x0) = 0 (0x0)
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/.",{ mode=drwxr-xr-x ,inode=11616,size=3,blksize=131072 },0x0) = 0 (0x0)
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/.",{ mode=drwxr-xr-x ,inode=11616,size=3,blksize=131072 },0x0) = 0 (0x0)
mmap(0x0,8192,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34484649984 (0x807720000)
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/..",{ mode=drwxr-xr-x ,inode=11614,size=4,blksize=131072 },0x0) = 0 (0x0)
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/..",{ mode=drwxr-xr-x ,inode=11614,size=4,blksize=131072 },0x0) = 0 (0x0)
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/..",{ mode=drwxr-xr-x ,inode=11614,size=4,blksize=131072 },0x0) = 0 (0x0)
mmap(0x0,8192,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34484658176 (0x807722000)
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/Else.pm6",{ mode=-rw-r--r-- ,inode=11617,size=228,blksize=4096 },0x0) = 0 (0x0)
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/Else.pm6",{ mode=-rw-r--r-- ,inode=11617,size=228,blksize=4096 },0x0) = 0 (0x0)

This probably belongs in another ticket, but I've mentioned it here simply to keep track of it myself.


Back to my response ..... WHY is the directory traversal happening when I load a dynamic module (compile time untested)?

Doesn't Service::Blah::Test.new imply <lib>/Service/Blah/Test.pm6 ?
Perl6 should know where to find this file without having to do the traversal ... no?  Am I mistaken?

I guess it boils down to me not understanding why perl6 needs to do the traversal in the first place.
I know you said it's because perl6 needs to SHA1 all files ... but why?
In my opinion, perl6 should only be interested in the files that are required to run a given program, which are defined by the programmer.  Perl6 should know exactly where to find these files without the traversal.  Proactively walking the tree and computing a SHA1 for every file seems extremely wasteful.

Why can't perl6 simply check some internal cache for a file with a matching SHA1 and use it, and otherwise translate Service::Blah::Test directly to <lib>/Service/Blah/Test.pm6?
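The direct translation being asked about can be sketched as follows. `resolve_module` is a hypothetical helper, not anything perl6 provides; the `.pm6` then `.pm` probing order is taken from the truss output above, and everything else is an assumption.

```shell
#!/bin/sh
# resolve_module NAME DIR...
# Map a module name such as Service::Blah::Test to the first existing
# DIR/Service/Blah/Test.pm6 (or .pm) and print it, without listing any
# directory. Returns non-zero if no candidate file exists.
resolve_module() {
    name=$1
    shift
    # Turn the :: separators into path separators.
    rel=$(printf '%s' "$name" | sed 's|::|/|g')
    for dir in "$@"; do
        for ext in pm6 pm; do
            if [ -f "$dir/$rel.$ext" ]; then
                printf '%s\n' "$dir/$rel.$ext"
                return 0
            fi
        done
    done
    return 1
}
```

With this approach the cost is a couple of stat calls per search directory, independent of how many sibling files exist. It says nothing about how a precompiled copy would be validated, which is the part the SHA1 is reportedly used for.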

Thanks for responding.

@niner
Collaborator

niner commented Sep 27, 2019 via email

@pprocacci
Author

I think I'm starting to understand: this is due to the compiled nature of perl6.

perl5 doesn't have this problem because it will always re-read, re-parse and re-eval all and only the source files it needs to deliver a response. If a source file changes, the output will change accordingly, and it will do so looking only at the minimum number of source files it has to.

Is there any way to run a perl6 program that bypasses this "compiled nature"? In my use case, it actually does more harm than good.

@pprocacci
Author

You can disregard my last comment. I'll just stick with perl5.

Thanks for the input guys!

@lizmat
Contributor

lizmat commented Sep 27, 2019

@pprocacci Sorry to hear this. I would appreciate it if you could tell us (as a comment on this issue) what drove you to that decision. I'm not saying that we could fix it in any particular amount of time, but it would at least let us know what could be a decision point for using / not using Perl 6.

Thank you in advance!

@pprocacci
Author

pprocacci commented Sep 30, 2019

In short, it's not production ready. I LIKE perl6. I don't like continuously running into things which I expect to just work. All 5 of the below problems I've encountered while trying to write a single library. Each time I run into a problem, I put it down for several months only to come back later with what I perceive as a workaround, and bam .... another problem.

Sockets

#2162

EVAL

#2885

RACE

Though I haven't created a ticket for this, using race (for @something.race) in parts of my program has resulted in a core dump. If I run into it again I'll report it.

Excessive syscalls

Again not reported (minor), but detailed within this ticket:
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/.",{ mode=drwxr-xr-x ,inode=11616,size=3,blksize=131072 },0x0) = 0 (0x0)
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/.",{ mode=drwxr-xr-x ,inode=11616,size=3,blksize=131072 },0x0) = 0 (0x0)
fstatat(AT_FDCWD,"/mnt/tank/Projects/tmp/auto-lib/Test/Service/Something/.",{ mode=drwxr-xr-x ,inode=11616,size=3,blksize=131072 },0x0) = 0 (0x0)

RE interpolation in character classes (doc problem)

Raku/doc#2999

And then there's this ticket, which I now understand to be a major difference between perl5 and 6. In perl5, the number of files in a directory imposes no runtime penalty if designed correctly. In perl6 I feel at this time the only workaround is to essentially cat *.pl6 > huge_file.pm6 to avoid large directory traversals.

Don't misunderstand me, I'll continue using perl6 for my less complex needs, but this current project I can't seem to make any headway on due to the continuous issues that crop up. I'm always running into something. ;)

@niner
Collaborator

niner commented Sep 30, 2019 via email

@pprocacci
Author

pprocacci commented Sep 30, 2019

I am attempting to write a perl6 sdk for Amazon Web Services much like https://metacpan.org/pod/Paws
My perl6 code (the codebase you are curious about) takes the botocore sdk service definitions https://github.com/boto/botocore and converts the shape definitions to perl6 objects.

Attached is the code I've created thus far. It's very much a work in progress; it's very fluid and has been changing quite a bit. Also note that I had no intention of sharing this until it was what I would consider to be nearing completion ... which it currently is not. Do note, however, that I've moved away from the 1000's of files I initially described and have instead packed that data into single service files.

perl6-aws-sdk.tar.gz

tar -zxf perl6-aws-sdk.tar.gz
cd perl6-paws.best.so.far
git clone https://github.com/boto/botocore.git
./builder-bin/generate

Following the above, the ./builder-bin/generate perl6 script used to create 1000's of files but no longer does.
Since our discussion above, it creates just a single file per service definition. (I'd much rather keep separate files though).

Running the above currently creates auto-lib/Paws/Service/<service>.pm6 for each defined AWS service, packing all shapes into that one service file, whereas before it was creating:

auto-lib/Paws/Service/<service>.pm6
auto-lib/Paws/Service/<service>/<shape1>.pm6
auto-lib/Paws/Service/<service>/<shape2>.pm6
...
auto-lib/Paws/Service/<service>/<shape300>.pm6
...

..... resulting in 1000's of files separating each shape into their own files under their own service directory.

builder-bin/generate has two comments that I've added detailing the few problems I've run into as well.

As for the service files themselves, once ./builder-bin/generate has been run, you can look at any of the resulting service files (Ex: auto-lib/Paws/Service/Pinpoint.pm6) to understand how this one file, when split out by shape definition, could result in 1000's of files. ;)
