The profiling options should work uniformly for applications and benchmark suites #1771

Closed
ldcasillas-progreso opened this Issue Feb 9, 2016 · 10 comments


ldcasillas-progreso commented Feb 9, 2016

I can build a profiled version of my application easily with this command:

stack install \
    --executable-profiling  \
    --library-profiling \
    --ghc-options='-fprof-auto'

When I run this application with +RTS -p -RTS on the command line, it writes a .prof file as expected, and this profile covers cost centers both from my Cabal package's library and executable components.

However, the equivalent doesn't work smoothly when I try to profile the same package's benchmark suite:

stack bench \
    --executable-profiling  \
    --library-profiling \
    --ghc-options='-fprof-auto' \
    --benchmark-arguments='+RTS -p -RTS'

This does produce a bench.prof file, but the file only reports on cost centers in the benchmark component's module and its library dependencies (mostly Criterion and its dependencies), and does not report on my Cabal package's library component's cost centers at all. This is a bummer, needless to say, because it'd be natural to use the benchmark suite to profile my Cabal package's library component!


ldcasillas-progreso commented Feb 9, 2016

This is with a Stack that I built from Git:

% stack --version
Version 1.0.3, Git revision 260ea31e787593fe27d002985f45739d8a04a498 x86_64

ldcasillas-progreso commented Feb 9, 2016

I seem to have missed this ticket when I searched earlier: #1759

So I tried stack bench --profile, but this still does not profile my package's library component.


Collaborator

mgsloan commented Feb 10, 2016

I'm surprised that this doesn't successfully profile your package's library. If you can figure out what --ghc-options are necessary to make this happen, that'd be very interesting. Also, it'd be interesting if stack clean makes the next stack bench --profile provide the results you're looking for. If it does, then that means somewhere along the line there's an issue with dirtiness checking.

mgsloan added this to the Support milestone Feb 11, 2016


ldcasillas-progreso commented Feb 11, 2016

stack clean doesn't help. And I don't know whether it's the profiling or stack that's causing this, but I don't feel confident that my --ghc-options arguments are effectual when I do stack bench --profile. (And by "don't feel confident" I mean I'm having a hard time telling apart lack of understanding vs. documentation gaps vs. bugs.)

For example:

  • My .cabal file specifies -threaded -rtsopts -with-rtsopts=-N in its GHC-Options field.
  • My benchmark suite has benchmarks that use parallelism.

When I run stack bench it runs on multiple cores, but when I try the following it runs on just one core:

stack bench \
    --executable-profiling  \
    --library-profiling \
    --ghc-options='-fprof-auto -threaded' \
    --benchmark-arguments='+RTS -p -N -RTS'

If it helps any, the project is here:

Fair warning, on my 4-core machine the benchmark suite pegs all CPUs and runs for over 10 minutes...


Collaborator

mgsloan commented Feb 11, 2016

This should work:

stack bench --profile --ghc-options -threaded --benchmark-arguments='+RTS -N -RTS'

But yeah, I'm not seeing it use multiple cores. I haven't taken a look at the profiling results.


ldcasillas-progreso commented Feb 11, 2016

I just tried your suggestion with version 1.0.3, Git revision 260ea31 x86_64, and no luck.

Then I did a stack upgrade --git, which brought me to version 1.0.3, Git revision 4fc1b8d x86_64, and tried again, and still no dice. Clearly there's an environmental factor we're not accounting for...


Collaborator

mgsloan commented Feb 12, 2016

It's possible that there isn't an environmental thing, since I do reproduce it not using multiple cores, and the profiling results might not be what we're looking for (haven't checked).

I took a look at the code involved and noticed there might be an issue with the combination of --profile and --benchmark-arguments: there wasn't a space between the arguments that end up getting passed to runhaskell Setup.hs bench. I've pushed a commit fixing this. Does it resolve the issue?

I'm still not seeing it use more than one core. Is an invocation like bench +RTS -N -RTS +RTS -p -RTS supposed to be equivalent to bench +RTS -N -p -RTS? I hope so!
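
One way to separate a flag-parsing problem from a linking problem might be a tiny diagnostic program along these lines. It is only a sketch and not part of Stack or of the project in this thread: the file name RtsCheck.hs and the describeRts helper are made-up names. It relies on rtsSupportsBoundThreads and getNumCapabilities from base to report whether the binary was linked against the threaded RTS and how many capabilities the RTS ended up with.

```haskell
-- RtsCheck.hs: hypothetical diagnostic, not part of Stack or the project.
module Main (main) where

import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)
import GHC.Conc (getNumProcessors)
import System.Environment (getArgs)

-- | Summarise the runtime configuration in one line.
describeRts :: Bool -> Int -> Int -> String
describeRts threaded caps procs =
  (if threaded then "threaded RTS" else "non-threaded RTS")
    ++ ", capabilities = " ++ show caps
    ++ ", processors = " ++ show procs

main :: IO ()
main = do
  caps  <- getNumCapabilities   -- what -N (if honoured) resolved to
  procs <- getNumProcessors     -- what the machine offers
  args  <- getArgs              -- arguments left after the RTS took its options
  putStrLn (describeRts rtsSupportsBoundThreads caps procs)
  putStrLn ("program arguments: " ++ show args)
```

If the same source reports a threaded RTS in a normal build but a non-threaded one in a profiled build, that would suggest -threaded never reached the link step, rather than the +RTS groups being merged incorrectly.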

mgsloan added a commit that referenced this issue Feb 12, 2016


Collaborator

mgsloan commented Feb 18, 2016

Closing due to lack of response. Re-open if it's still an issue.

mgsloan closed this Feb 18, 2016


ldcasillas-progreso commented Feb 18, 2016

Sorry for the delay, I've been really busy and my day job isn't Haskell-related. I just ran the following, on a clean source tree:

stack bench --profile --ghc-options='-threaded -fprof-auto' --benchmark-arguments='theo1'

Observations:

  1. My sequential and parallel benchmark cases run in the same amount of time, and the process' CPU usage never goes above 100% (one core).
  2. The bench.prof file reports that the command run was bench +RTS -N -p -RTS theo1
  3. I see cost centers for my library module now. Not the ones I'd like to see, but I'm beginning to suspect that what I'm seeing is GHC inlining definitions differently into my main program and my benchmark.
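
If inlining really is what moves the cost centers around, naming them by hand might make the two profiles easier to compare. The following is a minimal sketch under that assumption: sumSquares is a hypothetical stand-in for a library function, not code from this project. The SCC pragma names the cost center explicitly, independent of -fprof-auto, and NOINLINE is an extra precaution that asks GHC to keep the function as a separate call.

```haskell
module Main (main) where

import Data.List (foldl')

-- | Hypothetical library function standing in for real library code.
-- The SCC pragma attaches an explicitly named cost center to the body.
sumSquares :: [Int] -> Int
sumSquares xs = {-# SCC "sumSquares" #-} (foldl' step 0 xs)
  where
    step acc x = acc + x * x
-- Ask GHC not to inline this function into its callers.
{-# NOINLINE sumSquares #-}

main :: IO ()
main = print (sumSquares [1 .. 1000000])
```

Built with profiling enabled and run with +RTS -p -RTS, the resulting .prof file should list a cost center called sumSquares whether the caller is the executable or the benchmark. NOINLINE can cost some performance, so it is probably best kept to the functions under investigation.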

So the only clear issue I'm observing at the moment is that the benchmark doesn't run in threaded mode when I use --profile.

Version 1.0.3, Git revision 3f7b50fe598ec83eda5d393fe5b08a99ce5ac1e9 x86_64

ldcasillas-progreso commented Feb 18, 2016

I've opened a new issue just for the SMP bit: #1808
