Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved performance by switching to vector-hashtables #5966

Closed
nad opened this issue Jun 20, 2022 · 6 comments · Fixed by #5969
Closed

Improved performance by switching to vector-hashtables #5966

nad opened this issue Jun 20, 2022 · 6 comments · Fixed by #5969
Assignees
Labels
performance Slow type checking, interaction, compilation or execution of Agda programs type: bug Issues and pull requests about actual bugs type: enhancement Issues and pull requests about possible improvements
Milestone

Comments

@nad
Copy link
Contributor

nad commented Jun 20, 2022

A performance debugging story:

I noticed that Agda had become slower. I had some benchmark logs from earlier this year lying around, so I knew roughly how fast Agda could be for that benchmark, and a recent version of Agda was significantly slower. However, when I compiled some variants of Agda from around the time of the benchmark logs the program was still slow. Fortunately I had an old, already compiled binary which was fast (compared to the slow ones). That binary did not correspond to any commit on the main branch of Agda, but I could find the commit in my reflog.

When I checked out and compiled that commit Agda was still slow. I had used the same version of GHC in both cases (according to +RTS --info). File sizes suggest that I did not use the optimise-heavily flag for the fast binary, but I compiled Agda both with and without this flag, and got a significant slowdown in both cases. The variant compiled without optimise-heavily was still a couple of megabytes larger than the fast binary, so presumably different libraries were used. The newly compiled binary mostly used libraries that were already released on Hackage when the fast binary was compiled, with the notable exception of unordered-containers 0.2.18.0.

The changelog for unordered-containers 0.2.18.0 mentions that one operation is sometimes slower, so I tried to use 0.2.16.0 instead (a version that was already installed, and presumably used for the fast binary). With that change the file size was almost the same as for the fast binary, the difference was only a couple of kilobytes. However, the new binary was still slow, perhaps about 70% slower for my benchmark.

My next step was to check which libraries I had several installed copies of (except for unordered-containers). There were two, hashable and hashtables. After I switched from hashable 1.3.5.0 to 1.4.0.2 and from hashtables 1.2.4.2 to 1.3 the file sizes differed by 64 bytes. The changelogs for these packages did not indicate any interesting performance changes, but after switching the performance was similar to that of the fast binary.

What lessons can be drawn from this? If I had used Stack, then it would presumably have been easier to reproduce the old build. When cabal-install is used it might make sense to store all library versions and installation flags together with benchmark logs (perhaps it suffices to use cabal v2-freeze).

@nad nad added type: bug Issues and pull requests about actual bugs performance Slow type checking, interaction, compilation or execution of Agda programs labels Jun 20, 2022
@nad nad added this to the 2.6.3 milestone Jun 20, 2022
@nad nad self-assigned this Jun 20, 2022
@nad
Copy link
Contributor Author

nad commented Jun 21, 2022

Some timings for type-checking and loading the standard library, with several variants of a recent version of Agda (in no case compiled with the flag optimise-heavily on, and unordered-containers 0.2.19.1 was used every time):

  • With hashable 1.3.5.0 and hashtables 1.2.4.2:
    real    24m26,807s
    user    24m6,744s
    sys     0m23,546s
    
    real    5m33,799s
    user    5m32,012s
    sys     0m2,453s
    
  • With hashable 1.4.0.2 and hashtables 1.3:
    real    11m23,394s
    user    11m11,504s
    sys     0m13,874s
    
    real    1m8,838s
    user    1m7,875s
    sys     0m1,159s
    
  • With hashable 1.3.5.0 and vector-hashtables 0.1.1.1:
    real    9m34,885s
    user    9m22,526s
    sys     0m14,142s
    
    real    0m49,188s
    user    0m47,832s
    sys     0m1,521s
    
  • With hashable 1.4.0.2 and vector-hashtables 0.1.1.1:
    real    9m35,074s
    user    9m22,589s
    sys     0m14,229s
    
    real    0m48,334s
    user    0m46,967s
    sys     0m1,514s
    

What do you think, should we switch from hashtables to vector-hashtables? It is unclear to me how well-maintained the latter package will be in the future, and no Hackage package seems to depend on it. However, the changes involved are small (these hashtables are only used by the serialiser/deserialiser), so it would be easy to switch back in the future.

@nad nad changed the title Major slowdown with downgraded packages Performance issues/opportunities involving hash tables Jun 21, 2022
@nad nad added the type: enhancement Issues and pull requests about possible improvements label Jun 21, 2022
nad added a commit that referenced this issue Jun 21, 2022
@andreasabel andreasabel linked a pull request Jun 21, 2022 that will close this issue
@andreasabel
Copy link
Member

What are the lessons learned from your analysis?

  • Require hashtables >= 1.3 (I suppose.)
  • Anything on hashable?
  • Anything on unordered-containers?

@nad
Copy link
Contributor Author

nad commented Jun 21, 2022

  • Require hashtables >= 1.3 (I suppose.)

If hashtables is used. I have already committed such a change, but that requirement is only imposed starting from GHC 9.0.

  • Anything on hashable?

No.

  • Anything on unordered-containers?

No.

@nad
Copy link
Contributor Author

nad commented Jun 21, 2022

I managed to compile Agda, with vector-hashtables, using all versions of GHC that we support. The test suite does not complain. Do you think we should switch?

@andreasabel
Copy link
Member

The changes in your PR amount to swapping new for initialize 0 and some import changes. These can be wrapped in a small compatibility module, like Agda.Utils.Hashtables. Then, we do not have to worry about whether we should switch so much, because it will be easy to switch back. (E.g. if you keep the code for hashtables in a comment.)

nad added a commit that referenced this issue Jun 22, 2022
This change was suggested by Andreas Abel.
nad added a commit that referenced this issue Jun 22, 2022
Andreas Abel contributed to this commit.
@nad
Copy link
Contributor Author

nad commented Jun 22, 2022

I added Agda.Utils.HashTable. I did not comment out any code, instead I first added the module, and then I switched to vector-hashtables in a separate commit.

nad added a commit that referenced this issue Jun 22, 2022
This change was suggested by Andreas Abel.
nad added a commit that referenced this issue Jun 22, 2022
Andreas Abel contributed to this commit.
@nad nad closed this as completed in #5969 Jun 22, 2022
@andreasabel andreasabel changed the title Performance issues/opportunities involving hash tables Performance enhancement: switching to vector-hashtables library Oct 24, 2022
@nad nad changed the title Performance enhancement: switching to vector-hashtables library Improved performance by switching to vector-hashtables Oct 25, 2022
andreasabel pushed a commit that referenced this issue Jan 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Slow type checking, interaction, compilation or execution of Agda programs type: bug Issues and pull requests about actual bugs type: enhancement Issues and pull requests about possible improvements
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants