builtin: Make all string.index methods return ?int (second try) #13758
All string index methods that end with `_opt` return `?int`, with `none` as the value when the substring is not found, and the index methods that do not end in `_opt` return `int`, with -1 as the returned value when the substring is not found.
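To make the two conventions concrete, here is a minimal sketch of how a `?int` variant can wrap an `int` variant. The names `find_int` and `find_opt` are hypothetical helpers written for this illustration; they are not the builtin methods touched by this PR, and the real implementations may differ.

```v
// Hypothetical helpers for illustration only; not the vlib builtin methods.

// find_int returns the byte index of `sub` in `s`, or -1 when it is absent.
fn find_int(s string, sub string) int {
	if sub.len == 0 || sub.len > s.len {
		return -1
	}
	for i in 0 .. s.len - sub.len + 1 {
		if s[i..i + sub.len] == sub {
			return i
		}
	}
	return -1
}

// find_opt wraps find_int and maps the -1 sentinel to `none`.
fn find_opt(s string, sub string) ?int {
	idx := find_int(s, sub)
	if idx == -1 {
		return none
	}
	return idx
}

fn main() {
	// Sentinel style: the caller has to remember to compare against -1.
	pos := find_int('abcdef', 'cd')
	if pos != -1 {
		println('int variant found it at ${pos}')
	}
	// Option style: the `or` block handles the missing case explicitly.
	fallback := find_opt('abcdef', 'zz') or { -1 }
	println('option variant fell back to ${fallback}')
}
```

With this layout the option variant is a thin wrapper, so the search logic lives in one place and only the return convention is duplicated.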
@JalonSolov @yuyi98 @spytheman I think this one will make everyone happy.
Personally, I am really not a fan of duplicated methods. So having second variants (those ending with Also this PR doesn't endorse safe-by-default behavior - i.e. the syntactically shorter variant shall be the one returning All in all, what I'd be most interested in are the (rudimentary) benchmarks @spytheman asked for in the previous PR discussion.
The benchmarks. Int version: Each code was built using To run this I generated an First I made them For the benchmark I commented out the lines where I defined the variables The
You can check the code I used in https://gist.github.com/txgruppi/31663efdb380ced7f8d31f6f5f4c3d98
Thanks for the benchmarks.
I've made some simplifications to the Here is the result on my machine:
Except for the last case ( When using -prod, both clang and gcc managed to optimize well the difference. I have to note however, that when I compared the more common mixed usage (
Another important consideration is readability - the Compare:

    for i := 0; i < max; i++ {
        mut start := -1
        mut end := -1
        for {
            start = raw.index_after(atag, end)
            if start == -1 {
                break
            }
            start = raw.index_after(href, start)
            if start == -1 {
                break
            }
            start += 6
            end = start
            start = raw.index_after(gt, end)
            if start == -1 {
                break
            }
            start += 1
            end = raw.index_after(lt, start)
            if end == -1 {
                break
            }
        }
        checksum += u64(end)
    }

vs

    for i := 0; i < max; i++ {
        mut start := -1
        mut end := -1
        for {
            start = raw.index_after_opt(atag, end) or { break }
            start = raw.index_after_opt(href, start) or { break }
            start += 6
            end = start
            start = raw.index_after_opt(gt, end) or { break }
            start += 1
            end = raw.index_after_opt(lt, start) or { break }
        }
        checksum += u64(end)
    }

Given all of the above, I think the faster In this way, users could call the more inconvenient, but faster versions where that matters, and the more convenient _opt variants, outside of hot loops, while code duplication will be minimized, and breaking changes will be avoided. @medvednikov what do you think?
Actually, I am wrong about this - currently on master, .index() does return an In that case, I do support using
Any change we make will be a breaking change. In the current state we have some methods returning Let me know what the final decision is once you're all in agreement, so I can update the PR. Cheers.
# Conflicts: # vlib/toml/checker/checker.v
I think it is ready for you.
I tried running
@txgruppi please fix VLS

    server/diagnostics.v:12:24: error: index_after() returns an option, so it should have either an `or {}` block, or `?` at the end
       10 | }
       11 |
       12 | line_colon_idx := msg.index_after(':', 2) // deal with `d:/v/...:2:4: error: ...`
          | ~~~~~~~~~~~~~~~~~~~
       13 | if line_colon_idx < 0 {
       14 | return none
    server/diagnostics.v:21:23: error: index_after() returns an option, so it should have either an `or {}` block, or `?` at the end
       19 | }
       20 |
       21 | col_colon_idx := msg.index_after(':', line_colon_idx + 1)
          | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       22 | colon_sep_idx := msg.index_after(':', col_colon_idx + 1)
       23 | msg_type_colon_idx := msg.index_after(':', colon_sep_idx + 1)
    server/diagnostics.v:22:23: error: index_after() returns an option, so it should have either an `or {}` block, or `?` at the end
       20 |
       21 | col_colon_idx := msg.index_after(':', line_colon_idx + 1)
       22 | colon_sep_idx := msg.index_after(':', col_colon_idx + 1)
          | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       23 | msg_type_colon_idx := msg.index_after(':', colon_sep_idx + 1)
       24 | if msg_type_colon_idx == -1 || col_colon_idx == -1 || colon_sep_idx == -1 {
    server/diagnostics.v:23:28: error: index_after() returns an option, so it should have either an `or {}` block, or `?` at the end
       21 | col_colon_idx := msg.index_after(':', line_colon_idx + 1)
       22 | colon_sep_idx := msg.index_after(':', col_colon_idx + 1)
       23 | msg_type_colon_idx := msg.index_after(':', colon_sep_idx + 1)
          | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       24 | if msg_type_colon_idx == -1 || col_colon_idx == -1 || colon_sep_idx == -1 {
       25 | return error('idx is -1')
Any news here?
    assert x.index_any('ef') == 4
    assert x.index_any('fe') == 4
    assert x.index_any_int('ef') == 4
    assert x.index_any_int('fe') == 4
Shouldn't we test the `none` versions also? Or if not in addition to `_int` versions, then at least the `none` versions, as they are generally wrappers around the `_int` versions, which will thus be tested as well.
This holds for other `_test` files as well.
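A rough sketch of what such a test could look like follows. The functions `index_any_int_sketch` and `index_any_sketch` and the sample string are assumptions made for this illustration; they stand in for the builtin methods under review and are not copied from the PR.

```v
// Hypothetical stand-ins for the methods under review; not the vlib code.

// index_any_int_sketch returns the index of the first byte of `s` that
// also occurs in `chars`, or -1 when there is no such byte.
fn index_any_int_sketch(s string, chars string) int {
	for i, c in s {
		for ch in chars {
			if c == ch {
				return i
			}
		}
	}
	return -1
}

// index_any_sketch wraps the int variant and maps -1 to `none`.
fn index_any_sketch(s string, chars string) ?int {
	idx := index_any_int_sketch(s, chars)
	if idx == -1 {
		return none
	}
	return idx
}

fn test_index_any_sketch() {
	x := 'abcdefghij' // assumed sample input
	// Exercising the option wrapper also exercises the int variant.
	found := index_any_sketch(x, 'fe') or { -1 }
	assert found == 4
	// -2 can only come from the `or` block, so this checks the `none` path.
	missing := index_any_sketch(x, 'xyz') or { -2 }
	assert missing == -2
}
```

Saved in a file whose name ends in `_test.v`, this should be picked up by `v test`. Because the wrapper delegates to the int variant, asserting on both the found and the `none` outcome covers both code paths.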
@JalonSolov Except for this suggestion, I think it is ready for merging.
One file conflict is still to be resolved, and the tests need to run again.
What are the failing checks? I think we can re-run CI as the PR looks OK to me.
@txgruppi the PR is fine, only a small fix was missing in
This PR is ready to be merged then, as the only failing test is due to vlang/ved#140 not yet being merged.
I want to update you all. I was adding more tests to cover the new methods, but I got some urgent stuff to do at work and it is taking all the time I had to work on this MR. As soon as things go back to normal at work, I will add the tests.
(it was to resolve conflicts, not useless, don't worry Delyan :))
* master: (24 commits)
  tools: build c2v in non verbose mode by default
  tools: use os.system for the c2v runs to monitor the progress more easily
  tools: fix `v vet file.v` for `return if x { y // comment } else { z }`
  ast: fix array of reference sumtype appending (vlang#14797)
  pref: disable gc for translated code
  builder: add -c when building object files
  sokol: mark pub structs
  checker: improve pub struct check (fix vlang#14446) (vlang#14777)
  tools: do show the output of c2v, when it fails
  tools: fix the first run of `v translate hw.c`
  ci: use VTEST_JUST_ESSENTIAL=1 for the -cstrict test-self task in ubuntu-clang too
  cmd: enable `v translate`, download and install c2v
  ci: use VTEST_JUST_ESSENTIAL=1 for the ubuntu -cstrict gcc task (prevent 2 hour runs)
  native: initial support for `defer` (vlang#14779)
  parser, cgen: temporary prefix ++ for translated code
  tools: handle fn attributes/comments more robustly, when `v missdoc` is run (vlang#14774)
  cgen: add a minor optimisation for array.push_many (vlang#14770)
  pref: is_o
  orm: mysql fixes (vlang#14772)
  builder: handle linker errors when building .o files
  ...
…-funcs-return-opt-int
…t' into string-index-funcs-return-opt-int
# Conflicts: # vlib/v/parser/parser.v
This one has been out a while - any further thoughts?
6 conflicting files need to be fixed, now.
The string type has some index methods (`index`, `index_after`, etc.); some return `int` and others return `?int`. This MR makes all of them return `?int`.
The tests (`v test-all`) have been updated to work with the changes, but some tests were skipped on my machine.
All string index methods that end with `_opt` return `?int`, with `none` as the value when the substring is not found, and the index methods that do not end in `_opt` return `int`, with -1 as the returned value when the substring is not found.
This PR is the second iteration on the idea. Original PR here: #13693
It was easier to dump the last changes and redo them from the latest weekly tag.
VLS PR: vlang/vls#329