Skip to content

Commit

Permalink
Performance boost: reduce instantiations of re.Scanner
Browse files Browse the repository at this point in the history
- Lexer is the same for every Spec parser in spack, so don't build it
  every time.

- This improves time to import package.py files a lot, as a Lexer
  doesn't have to be constructed for every spc in the packages.

- To concretize dealii:
  - Before: ~20 sec
  - After:  ~6 sec
  • Loading branch information
tgamblin committed May 29, 2016
1 parent 6dcdb50 commit e8b4d5f
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion lib/spack/spack/spec.py
Original file line number Diff line number Diff line change
Expand Up @@ -2015,10 +2015,13 @@ def __init__(self):
(r'\s+', lambda scanner, val: None)])


# Lexer is always the same for every parser.
_lexer = SpecLexer()

class SpecParser(spack.parse.Parser):

def __init__(self):
super(SpecParser, self).__init__(SpecLexer())
super(SpecParser, self).__init__(_lexer)
self.previous = None

def do_parse(self):
Expand Down

6 comments on commit e8b4d5f

@tgamblin
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davydden: see if this helps some more with concretization time. I cherypicked this from a branch with some other optimizations that aren't ready yet, so the times above may be a little fast, but this should result in a speedup nonetheless.

@davydden
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tgamblin : from my unscientific measurements (on iPhone), the time of spack spec deali@dev went down from ~14sec to ~8sec on the current master. So it does help 👍

@tgamblin
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davydden: if you go into the openssl package and change url_for_version to _url_for_version, how much does that affect your concretization time? I noticed that including openssl adds ~3sec to concretization for me. I'd like to figure out a better way to handle the version check in there, maybe by a separate command.

@tgamblin
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alalazo: any thoughts on openssl?

@alalazo
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tgamblin I didn't profile the package yet, but I assume that most of the time is spent in :

def check_for_outdated_release(self, version, warnings_given_to_user):
   ...   

in particular in the urllib.urlopen call. We may try to move to urllib2 and fine tune some of the defaults (like timeout or similar), but this will help only in those cases where we don't have a connection available.

On the other hand this unique behavior in url_for_version was explicitly requested in an issue some time ago and I believe that it has some points : now anybody using a spack built openssl gets a clear warning as long as the openssl version gets deprecated, so I am not sure I am in favor of removing it.

Just out of curiosity, would setting something like :

packages:
  openssl:  
    buildable: False
    version: ['@external']
    paths:
      openssl@external: /usr/lib/x86_64-linux-gnu

help with concretization time ? If so we may just provide a sensible default for openssl in the etc directory.

@tgamblin
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alalazo: Yep 😄. That's exactly where the three seconds go. And I remember the issue. People are concerned about installing outdated (and insecure) openssl versions.

I think the issue here is that url_for_version isn't really the right place for this. It gets called in a few different places, including in Package.__init__, which is called during concretization. That means commands like spack spec will make URL connections, which is clunky, and url_for_version has to record that it was called to avoid making lots of connections. So, url_for_version doesn't really have enough information to figure out whether/when it needs to do a check. It would be nice, for example, if the check could figure out that we're on a machine without a network connection (like LLNL's closed network).

I was thinking that it would be better to add some better control points for checks like this. What do you think of adding a few overridable methods to Package to give package authors some more control? e.g., I was thinking of adding these:

  1. Package.pre_install_dag(self, spec): executes after concretization of the spec to be installed, so the DAG is complete, but nothing has been fetched/staged yet, and no deps have been installed.
  2. Package.pre_install(self, spec): executes right before this package is installed. Its dependencies will have already been installed, and it will have been fetched and staged.
  3. Package.post_install(self, spec): executes right after this package is installed. Package is installed and build stage is still present.
  4. Package.post_install_dag(self, spec): executes after all packages in a DAG have been installed.

By default they would do nothing, but packages that want them could implement them (the way they currently implement patch()). The openssl check could easily go in pre_install_dag, which would be guaranteed to be invoked once, right before a package that depends on openssl is installed. The _dag routines should probably be called in postorder for each DAG.

Please sign in to comment.