Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use cYAML if it is available #2010

Merged
merged 1 commit into from
Oct 13, 2016
Merged

Use cYAML if it is available #2010

merged 1 commit into from
Oct 13, 2016

Conversation

tgamblin
Copy link
Member

I threw this together working on #1562, and thought I'd just get it merged.

Spack ships with its own PyYAML module in lib/spack/external/yaml, but it doesn't use the C libyaml because we don't want to require compiling anything -- Spack must run out of the box.

cYAML is pretty fast, though, and with things like #1015 and #1535 the main bottleneck to operations like spack find is YAML parsing.

This patch makes Spack use an already-installed PyYAML and the CParser and CLoader from libyaml if they're available. It makes a spack find on @adamjstewart's 400+ installed packages from #1562 run 3x faster (1s vs 3s) on my laptop.

@alalazo @davydden @adamjstewart @citibeth

@adamjstewart
Copy link
Member

Oh cool! I'll have to try that out once this is merged.

@tgamblin tgamblin merged commit d861a52 into develop Oct 13, 2016
@citibeth
Copy link
Member

@tgamblin

I like making Spack run faster. But I'd like to make sure that there is a way to bootstrap a Spack-installed cYaml if the system doesn't provide one. We can't count on system cYaml being available, or the user being able to install it if it is; and once a few of these "improvements" pile up, it will become increasingly necessary to use them to experience Spack as it is intended.

@tgamblin
Copy link
Member Author

We just need to add libyaml to spack and then it's easy to bootstrap.

@alalazo
Copy link
Member

alalazo commented Oct 14, 2016

@tgamblin Do you think this is worth a new row in the test matrix (one in which we pip install yaml before running the tests) ?

@citibeth
Copy link
Member

I think we should test by spack install yaml before running the tests.

On Fri, Oct 14, 2016 at 5:39 AM, Massimiliano Culpo <
notifications@github.com> wrote:

@tgamblin https://github.com/tgamblin Do you think this is worth a new
row in the test matrix (one in which we pip install yaml before running
the tests) ?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#2010 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AB1cdwcGXhcYgDsKZ2y5Zd4yvlMBRXkGks5qz03SgaJpZM4KVUX_
.

@adamjstewart
Copy link
Member

pip install yaml would be much quicker than spack install yaml. If the goal is to make the tests go faster (and be more likely to succeed within the global timeout), then I think pip would work better. No reason to test Spack's yaml package for every commit to Spack if no one is editing it. That would fit better in the package-specific tests I outlined in #1576.

Let me submit a PR and see if it affects the timing at all.

@citibeth
Copy link
Member

Sounds reasonable.

On Fri, Oct 14, 2016 at 11:25 AM, Adam J. Stewart notifications@github.com
wrote:

pip install yaml would be much quicker than spack install yaml. If the
goal is to make the tests go faster (and be more likely to succeed within
the global timeout
https://docs.travis-ci.com/user/common-build-problems/#My-builds-are-timing-out),
then I think pip would work better. No reason to test Spack's yaml package
for every commit to Spack if no one is editing it. That would fit better in
the package-specific tests I outlined in #1576
#1576.

Let me submit a PR and see if it affects the timing at all.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#2010 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AB1cd-VsHnAWVySOvGdCjWkk0ItfSEtVks5qz57tgaJpZM4KVUX_
.

@tgamblin
Copy link
Member Author

@alalazo I think that's a good idea. I was going to say that YAML is really stable -- 3.11 came out in 2014, but new versions of pyyaml and libyaml actually just came out. So I think it would be useful to test.

@pramodk
Copy link
Contributor

pramodk commented Oct 15, 2016

@tgamblin : After this merge, on bluegene-q system I see following:

$ ./bin/spack --version
Traceback (most recent call last):
  File "./bin/spack", line 98, in <module>
    import spack
  File "/home/kumbhar/tmp_tk/spack/lib/spack/spack/__init__.py", line 74, in <module>
    repo = spack.repository.RepoPath()
  File "/home/kumbhar/tmp_tk/spack/lib/spack/spack/repository.py", line 131, in __init__
    repo_dirs = spack.config.get_config('repos')
  File "/home/kumbhar/tmp_tk/spack/lib/spack/spack/config.py", line 403, in get_config
    data = scope.get_section(section)
  File "/home/kumbhar/tmp_tk/spack/lib/spack/spack/config.py", line 248, in get_section
    data   = _read_config_file(path, schema)
  File "/home/kumbhar/tmp_tk/spack/lib/spack/spack/config.py", line 323, in _read_config_file
    data = syaml.load(f)
  File "/home/kumbhar/tmp_tk/spack/lib/spack/spack/util/spack_yaml.py", line 213, in load
    return yaml.load(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/yaml/__init__.py", line 71, in load
    return loader.get_single_data()
  File "/usr/lib64/python2.6/site-packages/yaml/constructor.py", line 37, in get_single_data
    node = self.get_single_node()
  File "_yaml.pyx", line 702, in _yaml.CParser.get_single_node (ext/_yaml.c:7647)
  File "_yaml.pyx", line 905, in _yaml.CParser._parse_next_event (ext/_yaml.c:10396)
yaml.reader.ReaderError: unacceptable character #x0000: control characters are not allowed
  in "/home/kumbhar/tmp_tk/spack/etc/spack/defaults/repos.yaml", position 546

If I go to the commit previous to this merge, it works fine:

$ git checkout b27f4e3aebf6c1ecbefffc22a9a27a4fe9ce0ab3
$ ./bin/spack --version
0.9.1

(note that this happens on one of the three bg-q system where I am trying)

@tgamblin
Copy link
Member Author

Can you attach /home/kumbhar/tmp_tk/spack/etc/spack/repos.yaml?

@pramodk
Copy link
Contributor

pramodk commented Oct 15, 2016

$ ll spack/etc/spack/repos.yaml
ls: cannot access spack/etc/spack/repos.yaml: No such file or directory

do you mean spack/etc/spack/defaults/repos.yaml? (its unmodified, i cloned new repo for testing).
repos.yaml.txt

pramodk added a commit to pramodk/spack that referenced this pull request Oct 22, 2016
@tgamblin tgamblin deleted the features/fast-yaml branch December 31, 2016 04:59
muffgaga pushed a commit to electronicvisions/yashchiki that referenced this pull request May 6, 2021
Spack can make use of the faster c-implementation for yaml parsing:
spack/spack#2010

Hence, we should install it prior to installing with spack as parsing
yaml files is a huge bottleneck.

Change-Id: Ief8b0c5fd6cfd91ea84120d639ab43f6629495a5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants