Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some minor optimizations #483

Merged
merged 2 commits into from
Mar 30, 2024
Merged

Some minor optimizations #483

merged 2 commits into from
Mar 30, 2024

Conversation

pacak
Copy link
Contributor

@pacak pacak commented Mar 12, 2024

  • Add a benchmark (that works on stable toolchain) to establish the baseline
  • Disable timing by default, enable when needed
  • Disable execution counting by default

after:

newton  fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ run  4.9 µs        │ 171.4 µs      │ 5.25 µs       │ 6.218 µs      │ 4723907 │ 4723907

before:

    newton  fastest       │ slowest       │ median        │ mean          │ samples │ iters
    ╰─ run  6.317 µs      │ 171.8 µs      │ 6.729 µs      │ 7.75 µs       │ 3807698 │ 3807698

Fixes #482

@pacak
Copy link
Contributor Author

pacak commented Mar 12, 2024

Commit that adds benchmark is not necessary, I can drop it or change to use criterion.

@pacak
Copy link
Contributor Author

pacak commented Mar 12, 2024

I'll add some docs in a second, but I'm not sure what the failures are about.

Should be working now.

@pacak pacak force-pushed the main branch 2 times, most recently from a642ad4 to 9821c93 Compare March 12, 2024 20:50
Copy link
Member

@stefan-k stefan-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank a lot for your work! I have a few comments. I understand why you made set_counting part of the State trait, but I'd rather not expose this functionality that way. Is it possible to make it part of the interface of the various states, but not the State trait?
This essentially highlights some of the issues I have with the current design of state handling (#478). I will have to think about it, but that may take a bit.

Btw, have you tried whether making line 224 in executor.rs optional helps? Unfortunately I can't make a review comment there because the PR doesn't touch this code. I mean this line:

state.func_counts(&self.problem);

I understand that it does not remove all counts handling, but I'm just interested how that affects performance.

Finally I would prefer to not have the benchmark as part of this PR, but you can of course keep it in the PR until it is ready to be merged.

crates/argmin/src/core/executor.rs Outdated Show resolved Hide resolved
crates/argmin/src/core/state/mod.rs Outdated Show resolved Hide resolved
*count = v
if self.counting_enabled {
for (k, &v) in problem.counts.iter() {
let count = self.counts.entry(k.to_string()).or_insert(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, have you been able to find out whether hashing is the problem or the allocation of .to_string()? In case it was the latter, I was wondering if Cow would help?
If it's hashing, I was wondering if replacing HashMap<String, u64> with HashMap<SomeEnum, u64> would help, where

enum SomeEnum {
    CostCount,
    GradientCount,
    OperatorCount,
    JacobianCount,
    HessianCount,
    AnnealCount,
    Other(String),
}

(or something along those lines). That would cover all the function calls that in argmin, but also allows one to count function calls defined in external code via Other. It also avoids .to_string().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I remember it's 35% allocation and 65% hashing. HashMap is DoS resistant by default. Changing the hash function or switching to BTreeMap combined with enum approach you propose might help a bit I guess, but this changes public API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind changing the public API in that case. Why do you propose a BTreeMap instead of a HashMap? Hashing an enum should be quick (I expect it to use it's integer representation as a hash with a bit of additional overhead for hashing Other(String)). Anyway I think this goes beyond this PR and can certainly be left as a future improvement.

@pacak
Copy link
Contributor Author

pacak commented Mar 13, 2024

Is it possible to make it part of the interface of the various states, but not the State trait?

Should be possible for as long as I don't add "enabled if observers are present". Since you suggest to remove that logic - I can move it to part of the each interface. Will try that today.

Btw, have you tried whether making line 224 in executor.rs optional helps?
I understand that it does not remove all counts handling, but I'm just interested how that affects performance.

This removes about half of the overhead related to counting I think. Better than nothing

Finally I would prefer to not have the benchmark as part of this PR, but you can of course keep it in the PR until it is ready to be merged.

Absolutely, I just wanted to show what exactly I'm measuring. Will drop before merging.

@pacak
Copy link
Contributor Author

pacak commented Mar 13, 2024

  1. made a change to comments
  2. renamed set_counting to counting and moved it to corresponding states instead of a trait
  3. removed logic that enables counting when observer is present

Copy link
Member

@stefan-k stefan-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the very late reply, unfortunately I wasn't able to get to it earlier.

crates/argmin/src/core/executor.rs Outdated Show resolved Hide resolved
*count = v
if self.counting_enabled {
for (k, &v) in problem.counts.iter() {
let count = self.counts.entry(k.to_string()).or_insert(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind changing the public API in that case. Why do you propose a BTreeMap instead of a HashMap? Hashing an enum should be quick (I expect it to use it's integer representation as a hash with a bit of additional overhead for hashing Other(String)). Anyway I think this goes beyond this PR and can certainly be left as a future improvement.

@pacak
Copy link
Contributor Author

pacak commented Mar 26, 2024

Hashing an enum should be quick (I expect it to use it's integer representation as a hash with a bit of additional overhead for hashing Other(String)).

By default HashMap uses SipHash 1-3 with random seed which is slow. What's more underlying SwissTable hash doesn't perform well when values are not random. Either way - I don't mind which implementation is used as long as it can be disabled. To minimize the overhead of running with it enabled if changing the API is not a problem I'd look into enum_map crate. SomeEnum you proposed above can then be parametrized with some type T for custom measurements by external users. enum_map maps values into an array so should be pretty fast. But yea, this is outside of the scope of this pull request.

@pacak
Copy link
Contributor Author

pacak commented Mar 26, 2024

New changes:

  1. I removed the benchmark
  2. Adding an observer will not enable timing

@pacak
Copy link
Contributor Author

pacak commented Mar 29, 2024

Fixed the failing test. FWIW I'm getting test failures around the places I never touched...

This one is related to floating point accuracy - probably EPSILON is too large..

$ cargo test -p argmin --all-features

---- solver::gaussnewton::gaussnewton_linesearch::tests::test_next_iter_regression stdout ----
thread 'solver::gaussnewton::gaussnewton_linesearch::tests::test_next_iter_regression' panicked at crates/argmin/src/solver/gaussnewton/gaussnewton_linesearch.rs:445:9:
assert_relative_eq!(state.param.as_ref().unwrap()[1], 2.25f64, epsilon = f64::EPSILON)

    left  = 2.2499999999999964
    right = 2.25


note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    solver::gaussnewton::gaussnewton_linesearch::tests::test_next_iter_regression

Not sure what is this about at all

$ cargo test --all-features
   Compiling argmin-math v0.4.0 (/home/pacak/ej/argmin/crates/argmin-math)
error[E0277]: the trait bound `num_complex::Complex<f32>: ndarray_linalg::Lapack` is not satisfied
  --> crates/argmin-math/src/ndarray_m/inv.rs:18:13
   |
18 |             Array2<$t>: Inverse,
   |             ^^^^^^^^^^^^^^^^^^^ the trait `ndarray_linalg::Lapack` is not implemented for `num_complex::Complex<f32>`
...
38 | make_inv!(Complex<f32>);
   | ----------------------- in this macro invocation
   |
   = help: the following other types implement trait `ndarray_linalg::Lapack`:
             f32
             f64
             nalgebra::Complex<f32>
             nalgebra::Complex<f64>
   = note: required for `ArrayBase<OwnedRepr<num_complex::Complex<f32>>, ndarray::Dim<[usize; 2]>>` to implement `Inverse`
   = help: see issue #48214
   = note: this error originates in the macro `make_inv` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0277]: the trait bound `num_complex::Complex<f64>: ndarray_linalg::Lapack` is not satisfied
  --> crates/argmin-math/src/ndarray_m/inv.rs:18:13
   |
18 |             Array2<$t>: Inverse,
   |             ^^^^^^^^^^^^^^^^^^^ the trait `ndarray_linalg::Lapack` is not implemented for `num_complex::Complex<f64>`
...
39 | make_inv!(Complex<f64>);
   | ----------------------- in this macro invocation
   |
   = help: the following other types implement trait `ndarray_linalg::Lapack`:
             f32
             f64
             nalgebra::Complex<f32>
             nalgebra::Complex<f64>
   = note: required for `ArrayBase<OwnedRepr<num_complex::Complex<f64>>, ndarray::Dim<[usize; 2]>>` to implement `Inverse`
   = help: see issue #48214
   = note: this error originates in the macro `make_inv` (in Nightly builds, run with -Z macro-backtrace for more info)

For more information about this error, try `rustc --explain E0277`.
error: could not compile `argmin-math` (lib) due to 2 previous errors
warning: build failed, waiting for other jobs to finish...
error: could not compile `argmin-math` (lib test) due to 2 previous errors

@stefan-k
Copy link
Member

Fixed the failing test. FWIW I'm getting test failures around the places I never touched...

I suspect you need to rebase onto main. There was a Rust version update and that typically causes new clippy lints to fail, but that was fixed in #488 .

This one is related to floating point accuracy - probably EPSILON is too large..

$ cargo test -p argmin --all-features
...

We've had this before -- it typically only fails locally, not in the CI. We couldn't figure out why ... if it does fail in the CI too I'm happy to increase the epsilon, but judging from the current state of the CI run, it seems to pass.

Not sure what is this about at all

That's really strange... but it seems to pass in the CI.

I hope to be able to do another review today! :)

@codecov-commenter
Copy link

codecov-commenter commented Mar 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.09%. Comparing base (d5e1f3c) to head (79a274a).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #483      +/-   ##
==========================================
- Coverage   92.11%   92.09%   -0.03%     
==========================================
  Files         178      178              
  Lines       24398    24455      +57     
==========================================
+ Hits        22475    22521      +46     
- Misses       1923     1934      +11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pacak
Copy link
Contributor Author

pacak commented Mar 29, 2024

I suspect you need to rebase onto main.

Done. In fact I have scripts to automagically rebase onto origin/master, then someone smart decided to rename the default branch.

Copy link
Member

@stefan-k stefan-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot!

@stefan-k stefan-k merged commit a9e3eb7 into argmin-rs:main Mar 30, 2024
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Various non productive overhead
3 participants