
Organize performance benchmarks #95

Conversation

@countvajhula (Collaborator) commented on Mar 8, 2023

Summary of Changes

The SDK has been a bit of a "ball of mud" in that scripts were just added as they became necessary. This PR introduces some order by organizing the modules into:

  1. local benchmarks that exercise individual forms
  2. nonlocal benchmarks that exercise many components at once
  3. module loading

In addition, we're often interested in checking any of these for regressions against some baseline, and also in comparing them against another language like Racket. Both are now implemented using the same regression logic: "regression" is just the case where we compare the language's performance against itself (for local, nonlocal, or module benchmarks). Similarly, "competitive" benchmarks are nonlocal benchmarks (only nonlocal, since "competitive" isn't well-defined for local and module benchmarks) run independently for each language and then compared for "regression."
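Conceptually, the shared logic boils down to something like the following hypothetical sketch (compare-runs and the benchmark struct are illustrative names only, not the actual contents of regression.rkt):

(struct benchmark (name time) #:transparent)

;; Hypothetical sketch, not the actual regression.rkt code: both a
;; regression check and a competitive comparison pair up two sets of
;; results and compute a time ratio per benchmark. For regression,
;; `reference` and `current` are two runs of Qi itself; for a
;; competitive comparison, they are runs of different languages on
;; the same nonlocal benchmarks.
(define (compare-runs reference current)
  (for/list ([ref (in-list reference)]
             [cur (in-list current)])
    (list (benchmark-name ref)
          (exact->inexact (/ (benchmark-time cur)
                             (benchmark-time ref))))))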

The new layout of the SDK is:

qi-sdk
├── info.rkt
└── profile
    ├── loading
    │   ├── loadlib.rkt
    │   └── report.rkt
    ├── local
    │   ├── base.rkt
    │   ├── benchmarks.rkt
    │   └── report.rkt
    ├── nonlocal
    │   ├── intrinsic.rkt
    │   ├── qi
    │   │   └── main.rkt
    │   ├── racket
    │   │   └── main.rkt
    │   ├── report-competitive.rkt
    │   ├── report-intrinsic.rkt
    │   └── spec.rkt
    ├── regression.rkt
    ├── report.rkt
    └── util.rkt

Each category (folder) contains a report.rkt file, which is the CLI entry point for that benchmarking category. This does result in some duplication, but I think that could be minimized in the future via countvajhula/cli#3.

Some standard features are available in all of the benchmarking CLIs, including the ability to:

  • configure the output format (e.g. CSV or JSON -- ./report.rkt -f csv)
  • check regression (./report.rkt -f json > before.json to save the reference data, and then ./report.rkt -r before.json to compare against it -- see the example after this list)
  • configure which benchmarks are run (-s <benchmark-name> ...)
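For example, a typical regression-checking workflow using the commands above (before.json is just an example filename):

# capture baseline numbers before making changes
./report.rkt -f json > before.json
# ... make changes ...
# compare current performance against the saved baseline
./report.rkt -r before.json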

The format of the output is just the one used by the github-action-benchmark tool that we use here, so we should now be able to add all the benchmarks there (i.e. nonlocal too).
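If I'm remembering the tool's custom benchmark format correctly, that output is a JSON array of entries along these lines (the names and numbers here are made up for illustration):

[
  { "name": "conditionals", "unit": "ms", "value": 23.4 },
  { "name": "composition", "unit": "ms", "value": 11.2 }
]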

This is in support of #78. As we consider compiler optimizations (e.g. in #76), we can define and add nonlocal benchmarks in the appropriate paths for each candidate optimization, to validate that the optimization does what we're expecting.

Public Domain Dedication

  • In contributing, I relinquish any copyright claims on my contribution and freely release it into the public domain in the simple hope that it will provide value.

(Why: The freely released, copyright-free work in this repository represents an investment in a better way of doing things called attribution-based economics. Attribution-based economics is based on the simple idea that we gain more by giving more, not by holding on to things that, truly, we could only create because we, in our turn, received from others. As it turns out, an economic system based on attribution -- where those who give more are more empowered -- is significantly more efficient than capitalism while also being stable and fair (unlike capitalism, on both counts), giving it transformative power to elevate the human condition and address the problems that face us today along with a host of others that have been intractable since the beginning. You can help make this a reality by releasing your work in the same way -- freely into the public domain in the simple hope of providing value. Learn more about attribution-based economics at drym.org, tell your friends, do your part.)

@countvajhula (Author) commented:

This is ready for review if anyone has time to give it a look. If not, no worries as there will be time for final review on the integration branch 🙂

;; uninstalled.

(define-runtime-path lexical-module-path ".")
(current-load-relative-directory lexical-module-path)
Collaborator (commenting on the excerpt above):
If this file isn't meant to be run as a script, it might be worth using parameterize around the eval calls rather than setting this globally for the current thread (personal pet peeve when modules have unnecessary side effects :) )
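Something along these lines, say -- where eval-expr and ns below are placeholders for whatever the module actually evaluates, not names from this PR:

(require racket/runtime-path)

;; Scope the parameter to the dynamic extent of the eval, rather
;; than mutating it at module level for the whole thread.
;; (eval-expr and ns are placeholders.)
(define-runtime-path lexical-module-path ".")
(parameterize ([current-load-relative-directory lexical-module-path])
  (eval eval-expr ns))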

@countvajhula (Author) replied:

Yeah good call, I made the change.

  [require-data (if (member? (report-type) (list "all" "loading"))
                    (list (profile-load "qi"))
                    null)]
- [output (append local-data require-data)])
+ [output (~ local-data nonlocal-data require-data)])
Collaborator:

I'm not familiar with this ~; is it a synonym for append or output formatting somewhere?

@countvajhula (Author) replied:

It's the operator described in this blog post, and provided by the relation library 🙂
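Roughly speaking, it's a generic concatenation operator, so for lists it behaves like append. A quick sketch, assuming the relation package is installed:

(require relation)

;; ~ concatenates sequences generically
(~ (list 1 2) (list 3 4))  ; => '(1 2 3 4)
(~ "hello" " " "there")    ; => "hello there"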

@countvajhula merged commit 0d62895 into drym-org:lets-write-a-qi-compiler on Jul 21, 2023 (1 of 6 checks passed).
@countvajhula deleted the nonlocal-benchmarks-starter branch on July 21, 2023.