-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove regexp use #40
Conversation
@@ -128,6 +129,8 @@ func TestRAMInBytes(t *testing.T) { | |||
assertSuccessEquals(t, 32, RAMInBytes, "32.3") | |||
tmp := 32.3 * MiB | |||
assertSuccessEquals(t, int64(tmp), RAMInBytes, "32.3 mb") | |||
tmp = 0.3 * MiB | |||
assertSuccessEquals(t, int64(tmp), RAMInBytes, "0.3MB") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be 0.3
<space>
MB
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, never mind; expected
is the first argument 😞 we should swap those; always confuses me if assertions use <expected> <actual>
@@ -99,11 +99,11 @@ func TestFromHumanSize(t *testing.T) { | |||
assertSuccessEquals(t, 32.5*KB, FromHumanSize, "32.5 kB") | |||
assertSuccessEquals(t, 32, FromHumanSize, "32.5 B") | |||
assertSuccessEquals(t, 300, FromHumanSize, "0.3 K") | |||
assertSuccessEquals(t, 300, FromHumanSize, ".3kB") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slightly on the fence if we want to accept .<number>
(and <number>.
) or not 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Came up with some additional test-cases (TBD if the valid/invalids are in the right one);
- tests with leading/trailing whitespace (I think generally we should be "ok" with those, and trim before use)
- tests with trailing
.
(should be an error) - tests with multiple (but valid) units, e.g.
bb
(looks like those aren't detected) - TBD: multiple spaces between number and unit (probably should be an error)
- TBD: number with
.
, but no number following (should we accept those?)
func TestFromHumanSize(t *testing.T) {
assertSuccessEquals(t, 0, FromHumanSize, " 0")
assertSuccessEquals(t, 0, FromHumanSize, " 0b")
assertSuccessEquals(t, 0, FromHumanSize, " 0B")
assertSuccessEquals(t, 0, FromHumanSize, " 0 B")
assertSuccessEquals(t, 0, FromHumanSize, "0 ")
assertSuccessEquals(t, 0, FromHumanSize, "0b ")
assertSuccessEquals(t, 0, FromHumanSize, "0B ")
assertSuccessEquals(t, 0, FromHumanSize, "0 B ")
assertSuccessEquals(t, 0, FromHumanSize, "0")
assertSuccessEquals(t, 0, FromHumanSize, "0b")
assertSuccessEquals(t, 0, FromHumanSize, "0B")
assertSuccessEquals(t, 0, FromHumanSize, "0 B")
assertSuccessEquals(t, 32, FromHumanSize, "32")
assertSuccessEquals(t, 32, FromHumanSize, "32b")
assertSuccessEquals(t, 32, FromHumanSize, "32B")
assertSuccessEquals(t, 32, FromHumanSize, "32")
assertSuccessEquals(t, 32*KB, FromHumanSize, "32k")
assertSuccessEquals(t, 32*KB, FromHumanSize, "32K")
assertSuccessEquals(t, 32*KB, FromHumanSize, "32kb")
assertSuccessEquals(t, 32*KB, FromHumanSize, "32Kb")
assertSuccessEquals(t, 32*MB, FromHumanSize, "32Mb")
assertSuccessEquals(t, 32*GB, FromHumanSize, "32Gb")
assertSuccessEquals(t, 32*TB, FromHumanSize, "32Tb")
assertSuccessEquals(t, 32*PB, FromHumanSize, "32Pb")
assertSuccessEquals(t, 32.5*KB, FromHumanSize, "32.5kB")
assertSuccessEquals(t, 32.5*KB, FromHumanSize, "32.5 kB")
assertSuccessEquals(t, 32, FromHumanSize, "32.5 B")
assertSuccessEquals(t, 300, FromHumanSize, "0.3 K")
assertSuccessEquals(t, 300, FromHumanSize, ".3kB")
assertError(t, FromHumanSize, "")
assertError(t, FromHumanSize, ".")
assertError(t, FromHumanSize, ". ")
assertError(t, FromHumanSize, " ")
assertError(t, FromHumanSize, " ")
assertError(t, FromHumanSize, " .")
assertError(t, FromHumanSize, " . ")
assertError(t, FromHumanSize, "0.")
assertError(t, FromHumanSize, "0. ")
assertError(t, FromHumanSize, "0.b")
assertError(t, FromHumanSize, "0.B")
assertError(t, FromHumanSize, "-0")
assertError(t, FromHumanSize, "-0b")
assertError(t, FromHumanSize, "-0B")
assertError(t, FromHumanSize, "-0 b")
assertError(t, FromHumanSize, "-0 B")
assertError(t, FromHumanSize, "-32")
assertError(t, FromHumanSize, "-32b")
assertError(t, FromHumanSize, "-32B")
assertError(t, FromHumanSize, "-32 b")
assertError(t, FromHumanSize, "-32 B")
assertError(t, FromHumanSize, "32.")
assertError(t, FromHumanSize, "32.b")
assertError(t, FromHumanSize, "32.B")
assertError(t, FromHumanSize, "32. b")
assertError(t, FromHumanSize, "32. B")
assertError(t, FromHumanSize, "32b.")
assertError(t, FromHumanSize, "32B.")
assertError(t, FromHumanSize, "32 b.")
assertError(t, FromHumanSize, "32 B.")
assertError(t, FromHumanSize, "32 bb")
assertError(t, FromHumanSize, "32 BB")
assertError(t, FromHumanSize, "32 b b")
assertError(t, FromHumanSize, "32 B B")
assertError(t, FromHumanSize, "32 b")
assertError(t, FromHumanSize, "32 B")
assertError(t, FromHumanSize, "hello")
assertError(t, FromHumanSize, "-32")
assertError(t, FromHumanSize, " 32 ")
assertError(t, FromHumanSize, "32m b")
assertError(t, FromHumanSize, "32bm")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, as to what should pass and what should fail, I think we should follow the rules of strconv.ParseFloat
(plus our own way of handling suffixes). This is why I decided we should pass .3
as it's a valid floating point number in those programming languages/environments I am familiar with.
Adding your tests now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why I decided we should pass .3 as it's a valid floating point number in those programming languages/environments I am familiar wit
Ah, gotcha, yes, that makes sense.
With the leading/trailing .
I was a bit concerned about those coming from "idk" (accidental things), so was veering on the "safe side".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertSuccessEquals(t, 0, FromHumanSize, " 0")
assertSuccessEquals(t, 0, FromHumanSize, " 0b")
assertSuccessEquals(t, 0, FromHumanSize, " 0B")
assertSuccessEquals(t, 0, FromHumanSize, " 0 B")
Neither the old nor the new code tolerates extra leading or trailing space. We can certainly relax that, although my original intent was to replicate the functionality as-is (with the exception of accepting strings like ".33 GB" or "345. B", as described in the last commit).
go.mod
Outdated
@@ -0,0 +1,3 @@ | |||
module github.com/docker/go-units |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking if this repository should also live in the moby
org; wondering what's best; add a go.mod
already, or wait with adding until after we moved; wdyt?
In either case, may be better to do it separate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, let's move this to moby first.
The go.mod is added for selfish reasons only -- so I can actually run go test
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
reminds me, I also want to look at https://github.com/tonistiigi/units, and see where the gaps (and overlaps) are between the two, as both end up in our dependency tree; perhaps we can unify them
(I know there's some possibly "bad" choices in docker/go-units
, so maybe there's reasons for one over the other, but 🤷♂️)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #41 about move to under moby org. Once done we can move from there.
Looking into https://github.com/tonistiigi/units
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, Tonis' repo is about printing/output, and does not cover conversion/input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup; it's "the other side of go-units", so mostly complementary; but in the same area, so it could make sense to have these together (not urgent, but an option).
I checked with Tonis, and he wasn't against, so it's an option we can consider (not "MUST", but can)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it could make sense to have these together
Perhaps it's better to keep it separate, as they solve complementary, but different tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! overall looks good; left some comments;
also, can you drop the first commit for now (adding the go.mod
)?
Such sizes are quite valid but were never tested. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Shows on my laptop (6 iterations processed by benchstat): name time/op ParseSize-4 10.6µs ± 3% name alloc/op ParseSize-4 3.26kB ± 0% name allocs/op ParseSize-4 72.0 ± 0% Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This way, error messages will show correct line information. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Co-authored-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using a regular expression brings in the whole regexp and regexp/syntax packages, which increase the resulting binary size by about 130K (Go 1.18.1, Linux/amd64). Besides, regular expressions are generally slow and incur some initialization overhead (to compile all global regexp.MustComplile variables). This, unlike the size difference, is not the main motivation for this commit, but it feels like it should have also been mentioned. A quick benchmark comparison shows a huge improvement (again, this is not why this is done, nevertheless it pleases the eye): name old time/op new time/op delta ParseSize-4 10.6µs ± 3% 2.6µs ±29% -75.10% (p=0.002 n=6+6) name old alloc/op new alloc/op delta ParseSize-4 3.26kB ± 0% 0.20kB ± 0% -93.75% (p=0.000 n=7+6) name old allocs/op new allocs/op delta ParseSize-4 72.0 ± 0% 26.0 ± 0% -63.89% (p=0.000 n=7+6) Compatibility note: As a result, we are now a but more relaxed to the input, allowing e.g. ".4 Gb", or "-0", or "234. B", following the rules of strconv.ParseFloat. It seems that those were previously rejected as a result of a regex being used, not deliberately. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
We only use a single method, RAMInBytes, which can be easily implemented locally. Do that (based on docker/go-units#40, so the implementation is fully backward-compatible, except for the addition of treating -1), and remove docker/go-units dependency. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We only use a single method, RAMInBytes, which can be easily implemented locally. Do that (based on docker/go-units#40, so the implementation is fully backward-compatible, except for the addition of treating -1), and remove docker/go-units dependency. This implementation relies on strings.Cut() which is only available since Go 1.18. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We only use a single method, RAMInBytes, which can be easily implemented locally. Do that (based on docker/go-units#40, so the implementation is fully backward-compatible, except for the addition of treating -1), and remove docker/go-units dependency. This implementation relies on strings.Cut() which is only available since Go 1.18. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We only use a single method, RAMInBytes, which can be easily implemented locally. Do that (based on docker/go-units#40, so the implementation is fully backward-compatible, except for the addition of treating -1), and remove docker/go-units dependency. This implementation relies on strings.Cut() which is only available since Go 1.18. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We only use a single method, RAMInBytes, which can be easily implemented locally. Do that (based on docker/go-units#40, so the implementation is fully backward-compatible, except for the addition of treating -1), and remove docker/go-units dependency. This implementation relies on strings.Cut() which is only available since Go 1.18. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Mostly to include docker/go-units#40 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
... to improve performance by not using regular expressions [1]. However, the potential reduction in binary size is lost because Toolbx already uses the 'regexp' package to check if a string might be the ID of an image or a valid container name. [1] go-units commit 737572633c434ce2 docker/go-units@737572633c434ce2 docker/go-units#40
... to improve performance by not using regular expressions [1]. However, the potential reduction in binary size is lost because Toolbx already uses the 'regexp' package to check if a string might be the ID of an image or a valid container name. [1] go-units commit 737572633c434ce2 docker/go-units@737572633c434ce2 docker/go-units#40 containers#1420
Copy/paste from the most important commit:
size: stop using regexp
For the rest, please see commit messages.