
mat: format complex dense matrices #1707

Status: Open · wants to merge 10 commits into master
Conversation

@wamuir commented Aug 12, 2021

Please take a look.

This PR adds a CFormat function in gonum/mat to format complex dense matrices; the function is intended to complement the current Format function. Issue #641 and PR #1414 are relevant, although that PR appears to have stalled. Further, PR #1414 takes the approach of wholesale duplicating format.go, which may not be preferred.
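
For context, a rough usage sketch is below; the exact CFormat signature and the options it accepts are assumptions modelled on the existing mat.Formatted helper, not necessarily the final API of this PR.

```go
package main

import (
	"fmt"

	"gonum.org/v1/gonum/mat"
)

func main() {
	c := mat.NewCDense(2, 2, []complex128{1 + 2i, 3, 0, -4i})

	// Assumption: CFormat mirrors Formatted, returning a fmt.Formatter
	// that wraps the complex matrix and honours the usual options.
	fc := mat.CFormat(c, mat.Prefix("    "), mat.Squeeze())
	fmt.Printf("c = %v\n", fc)
}
```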

Summary:

Following #1414 (comment), an objective was to avoid code duplication while implementing a formatter for complex matrices. To summarize the proposed approach: value formatters are added for formatting floats and complex numbers, and these are then used during matrix formatting. Since the value formatters satisfy the fmt.Formatter interface, fmt.State can be passed through directly, which simplifies several areas (e.g., re-construction of a format string is no longer necessary). As it would be inefficient to spin up a new formatter for each element in a matrix, a value formatter holds a buffer for reuse.
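
A minimal sketch of that idea follows; the names and details here are illustrative only, not necessarily the types in this PR.

```go
package mat

import (
	"fmt"
	"strconv"
)

// complexValue is an illustrative value formatter. It satisfies
// fmt.Formatter, so the matrix formatter can pass the fmt.State through
// directly instead of reconstructing a format string.
type complexValue struct {
	v   complex128
	buf []byte // reused across matrix elements to limit allocations
}

func (c *complexValue) Format(fs fmt.State, verb rune) {
	prec, ok := fs.Precision()
	if !ok {
		prec = -1
	}
	fchar := byte('g')
	switch verb {
	case 'e', 'E', 'f', 'g', 'G':
		fchar = byte(verb)
	}
	c.buf = c.buf[:0]
	c.buf = strconv.AppendFloat(c.buf, real(c.v), fchar, prec, 64)
	mark := len(c.buf)
	c.buf = strconv.AppendFloat(c.buf, imag(c.v), fchar, prec, 64)
	if c.buf[mark] != '-' {
		// Insert an explicit '+' before a non-negative imaginary part.
		c.buf = append(c.buf, 0)
		copy(c.buf[mark+1:], c.buf[mark:])
		c.buf[mark] = '+'
	}
	c.buf = append(c.buf, 'i')
	fs.Write(c.buf)
}
```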

To tackle widthing, a state type is proposed (satisfying fmt.State) that can be fitted to a matrix and that stores information on column widths, using the existing widther type. As before, this requires iterating through the matrix elements. After a call to At, the state responds to subsequent Width calls with the relevant column width. Recommendations toward improving the implementation are appreciated.
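
Roughly, the shape is something like the following (illustrative names only; the real type also needs the fitting pass and the remaining fmt.State methods):

```go
package mat

import "fmt"

// state is an illustrative fmt.State wrapper. It is first fitted to a
// matrix: one pass over the elements records the widest formatted value
// of each column in widths. During output, col tracks the column of the
// element most recently fetched with At, so Width reports that column's
// width and each element is padded to it.
type state struct {
	fmt.State       // delegate Precision, Flag and Write to the caller's state
	widths    []int // one recorded width per matrix column
	col       int   // column of the most recent At call
}

// Width overrides the embedded state's Width with the current column's width.
func (s *state) Width() (int, bool) {
	return s.widths[s.col], true
}
```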

Notwithstanding the need for additional internal types, this feels simpler to me on the whole, and I believe the approach should make it easier to extend to new syntaxes or data types, if desired. Thanks for reviewing; I will address feedback and comments.

Bench:

I've included a few benchmarks below, comparing the current and proposed implementations when formatting 10x10, 100x100 and 1000x1000 dense matrices (a sketch of the benchmark shape follows the table).

| implementation | benchmark | iterations | time/op | bytes/op | allocs/op |
|----------------|-----------|------------|---------|----------|-----------|
| current | BenchmarkFormat10 | 36134 | 33063 ns/op | 2693 B/op | 16 allocs/op |
| current | BenchmarkFormat100 | 358 | 3747403 ns/op | 1672025 B/op | 137 allocs/op |
| current | BenchmarkFormat1000 | 3 | 334034817 ns/op | 163557800 B/op | 1066 allocs/op |
| proposed | BenchmarkFormat10 | 22086 | 54554 ns/op | 2886 B/op | 19 allocs/op |
| proposed | BenchmarkFormat100 | 170 | 7072273 ns/op | 1670752 B/op | 49 allocs/op |
| proposed | BenchmarkFormat1000 | 2 | 697675398 ns/op | 163542896 B/op | 80 allocs/op |
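
For reference, a sketch of the shape of these benchmarks (assumed, not necessarily the exact benchmark code used here):

```go
package mat_test

import (
	"fmt"
	"io"
	"math/rand"
	"testing"

	"gonum.org/v1/gonum/mat"
)

// benchmarkFormat formats an n×n dense matrix once per iteration,
// discarding the output so only formatting cost is measured.
func benchmarkFormat(b *testing.B, n int) {
	data := make([]float64, n*n)
	for i := range data {
		data[i] = rand.NormFloat64()
	}
	m := mat.NewDense(n, n, data)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fmt.Fprintf(io.Discard, "%v", mat.Formatted(m))
	}
}

func BenchmarkFormat10(b *testing.B)   { benchmarkFormat(b, 10) }
func BenchmarkFormat100(b *testing.B)  { benchmarkFormat(b, 100) }
func BenchmarkFormat1000(b *testing.B) { benchmarkFormat(b, 1000) }
```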

@codecov-commenter commented Aug 13, 2021

Codecov Report

Merging #1707 (5e086de) into master (85ca896) will increase coverage by 0.03%.
The diff coverage is 92.59%.

❗ Current head 5e086de differs from pull request most recent head 6ce02c8. Consider uploading reports for the commit 6ce02c8 to get more accurate results

@@            Coverage Diff             @@
##           master    #1707      +/-   ##
==========================================
+ Coverage   73.26%   73.29%   +0.03%     
==========================================
  Files         510      511       +1     
  Lines       60132    60152      +20     
==========================================
+ Hits        44056    44090      +34     
+ Misses      13584    13581       -3     
+ Partials     2492     2481      -11     
| Impacted Files | Coverage Δ |
|----------------|------------|
| mat/format_value.go | 84.41% <84.41%> (ø) |
| mat/format.go | 98.67% <98.21%> (+8.88%) ⬆️ |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 85ca896...6ce02c8. Read the comment docs.

@sbinet (Member) commented Aug 13, 2021

Just a drive-by comment: could you run the benchmarks before/after the proposed optimization a couple of times (say 10-20) and show the result of benchstat?
With the limited statistics you presented in the commit message, I get this result, which has rather poor statistical power:

$> benchstat ./ref.txt ./new.txt 
name        old time/op    new time/op    delta
Format10      33.1µs ± 0%    54.6µs ± 0%   ~     (p=1.000 n=1+1)
Format100     3.75ms ± 0%    7.07ms ± 0%   ~     (p=1.000 n=1+1)
Format1000     334ms ± 0%     698ms ± 0%   ~     (p=1.000 n=1+1)

name        old alloc/op   new alloc/op   delta
Format10      2.69kB ± 0%    2.89kB ± 0%   ~     (p=1.000 n=1+1)
Format100     1.67MB ± 0%    1.67MB ± 0%   ~     (p=1.000 n=1+1)
Format1000     164MB ± 0%     164MB ± 0%   ~     (p=1.000 n=1+1)

name        old allocs/op  new allocs/op  delta
Format10        16.0 ± 0%      19.0 ± 0%   ~     (p=1.000 n=1+1)
Format100        137 ± 0%        49 ± 0%   ~     (p=1.000 n=1+1)
Format1000     1.07k ± 0%     0.08k ± 0%   ~     (p=1.000 n=1+1)

i.e.:

$> go test -run=NONE -bench=Format -benchmem -count=20 |& tee ref.txt
## apply optimizations
$> go test -run=NONE -bench=Format -benchmem -count=20 |& tee new.txt
$> benchstat ./ref.txt ./new.txt

thanks.

PS: benchstat is available here: [golang.org/x/perf/benchstat](https://golang.org/x/perf/benchstat)
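
For example, it can be installed with (assuming a Go 1.16+ toolchain):

$> go install golang.org/x/perf/cmd/benchstat@latest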

@kortschak (Member) commented:

@wamuir Thank you for doing this. I've only taken a brief look, but it looks like nice work so far. I hope to be able to have a deeper look over the weekend.

[Resolved review threads (now outdated) on mat/format.go, mat/format_value.go, and mat/format_test.go.]
@wamuir (Author) commented Aug 13, 2021

> could you run the benchmarks before/after the proposed optimization a couple of times (say 10-20) and show the result of benchstat?

Thanks @sbinet, I wasn't aware of benchstat. Cool. Output is below. I added -benchtime=1x, because otherwise each sample appears to be a pooled mean over many iterations; if that's right, the reported dispersion of time/op (±) would be understated. I guess that's a nit on go test for not having a flag to output variance (or a nit on me for not being able to find it).

name        old time/op    new time/op    delta
Format10    44.2µs ±17%    73.9µs ±38%    +67.07%   (p=0.000 n=20+18)
Format100   3.28ms ± 3%    7.55ms ± 5%    +130.21%  (p=0.000 n=20+18)
Format1000   329ms ± 1%     764ms ± 2%    +132.17%  (p=0.000 n=20+20)

The increase in clock time is statistically significant. @kortschak provided review comments, which I'll work through over the weekend before rerunning the bench. One of these comments is to use micro-benchmarking, and I expect this will offer greater insight into the differences. Addressing his comments may also close the current gap (e.g., his comment regarding fmt.Fprint and reflect).
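
For reference, the invocation assumed for the samples above (only the -benchtime=1x flag differs from the commands quoted earlier):

$> go test -run=NONE -bench=Format -benchmem -benchtime=1x -count=20 |& tee new.txt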

@wamuir (Author) commented Aug 14, 2021

The most recent commit addresses the review comments, except for the switch to sub-benchmarks; I'll need to push a second commit to update the benchmarks. I attempted some further simplification of the format functions, which I think was productive, and I added comments along the way. Updated benchstat output is below.

name        old time/op    new time/op    delta
Format10    44.9µs ±10%    62.9µs ±31%    +40.08%  (p=0.000 n=18+20)
Format100   3.31ms ± 1%    4.27ms ± 1%    +29.20%  (p=0.000 n=18+20)
Format1000   326ms ± 1%     426ms ± 1%    +30.41%  (p=0.000 n=20+20)

@wamuir (Author) commented Aug 14, 2021

name                  old time/op    new time/op    delta
Format/10/General     52.6µs ±17%    68.8µs ±18%    +30.74%  (p=0.000 n=19+19)
Format/10/DotByte     49.5µs ±21%    67.9µs ±17%    +37.19%  (p=0.000 n=19+20)
Format/10/Excerpt     50.3µs ±11%    66.1µs ±21%    +31.40%  (p=0.000 n=19+20)
Format/10/Prefix      49.4µs ±10%    66.4µs ±17%    +34.54%  (p=0.000 n=17+20)
Format/10/Squeeze     53.0µs ±22%    69.7µs ±14%    +31.49%  (p=0.000 n=20+20)
Format/10/MATLAB      51.5µs ±12%    66.4µs ±12%    +28.79%  (p=0.000 n=20+19)
Format/10/MATLAB#     43.0µs ±18%    42.9µs ±23%       ~     (p=0.967 n=20+20)
Format/10/Python      50.4µs ±13%    64.0µs ± 9%    +26.99%  (p=0.000 n=20+19)
Format/10/Python#     42.4µs ±37%    43.1µs ±30%       ~     (p=0.757 n=20+20)
Format/100/General    3.13ms ± 7%    4.43ms ± 9%    +41.42%  (p=0.000 n=17+20)
Format/100/DotByte    3.94ms ± 1%    4.35ms ± 8%    +10.32%  (p=0.000 n=19+20)
Format/100/Excerpt    4.00ms ± 1%    4.54ms ± 6%    +13.45%  (p=0.000 n=20+20)
Format/100/Prefix     4.00ms ± 1%    5.61ms ± 1%    +40.23%  (p=0.000 n=20+16)
Format/100/Squeeze    4.42ms ± 1%    5.35ms ±18%    +21.01%  (p=0.000 n=14+20)
Format/100/MATLAB     4.68ms ±15%    6.46ms ±13%    +37.92%  (p=0.000 n=20+20)
Format/100/MATLAB#    2.91ms ± 3%    2.90ms ± 2%       ~     (p=0.247 n=20+20)
Format/100/Python     6.33ms ±16%    6.38ms ±35%       ~     (p=0.890 n=20+20)
Format/100/Python#    3.56ms ±13%    1.69ms ± 1%    -52.45%  (p=0.000 n=20+19)
Format/1000/General    306ms ± 0%     405ms ± 0%    +32.08%  (p=0.000 n=19+20)
Format/1000/DotByte    303ms ± 0%     392ms ± 0%    +29.54%  (p=0.000 n=20+20)
Format/1000/Excerpt    308ms ± 1%     405ms ± 0%    +31.76%  (p=0.000 n=20+20)
Format/1000/Prefix     307ms ± 0%     404ms ± 0%    +31.67%  (p=0.000 n=20+20)
Format/1000/Squeeze    307ms ± 0%     405ms ± 0%    +31.83%  (p=0.000 n=20+20)
Format/1000/MATLAB     307ms ± 0%     405ms ± 0%    +31.99%  (p=0.000 n=19+20)
Format/1000/MATLAB#    164ms ± 1%     164ms ± 1%       ~     (p=0.561 n=19+19)
Format/1000/Python     308ms ± 1%     406ms ± 1%    +31.86%  (p=0.000 n=20+18)
Format/1000/Python#    164ms ± 1%     164ms ± 1%       ~     (p=0.604 n=20+19)
[Geo mean]            3.74ms         4.43ms         +18.29%

@vladimir-ch (Member) commented:

This PR has gone a bit stale. @wamuir are you still interested in submitting this change?

@wamuir (Author) commented Aug 1, 2023

@vladimir-ch Yes. I think this needs another review. I can make revisions.
