Generic `count` for stdlib use outside `len` #217
Comments
I like |
for js unicode there's https://nim-lang.org/docs/unicode.html#runeLen%2Cstring |
Why do we need |
The point here is to make sure generic code doesn't use it. If we have a concept like Also, this doesn't close nim-lang/Nim#14162, I miscommunicated, |
Well so More importantly, the premise is wrong: In practice length information in a generic context can be valuable and speed up computations even if len is O(N):

    var result = newSeq[type(iter)](iter2.len) # ok, need to iterate over the C string once
    var i = 0
    for x in iter2: # ok, need to iterate over it once again, but now it's in the cache
      result[i] = x
      inc i
    result
|
The point here is we separate everything that has template count(x: Indexable): int = x.len when Indexable is implemented, or for now just I'll add a |
As I said, the issue is unconvincing for today's computers. Preallocated sizes are often more important to have than avoiding O(N) traversals and if |
The issue has less to do with the hardware and more to do with the programmers being able to communicate with one another as to the nature of the operation. You admit that O(n) won't always be avoided, so you can't also argue that |
For ensembles it can be In any case, trees, graphs and other linked data-structure might not store counts of a subgraph/subtree and a |
Weird RFC, closing since there's no real point here |
Originally discussed in nim-lang/Nim#14162. Writing so many RFCs is taxing but this needs its own place to be discussed.
Problem:

The length of some data structures (lists, cstrings) cannot be computed in constant time. This can make algorithms that use the length without needing it needlessly slow, such as code that calls `newSeq(x.len)` for efficiency instead of starting from an empty seq and adding to it.

`cstring` is a victim of improper use by templates/generic procs that check for a `len` overload on a type. `cstring`'s `len` is O(n) with n pointer dereferences, which is needlessly complex for anything named `len`.

Proposal:
`len` should be conventionally defined for types that have constant/fast access to their length, and there should be a generic unary `count` (maybe it could go in sequtils? The name can be debated) that iterates over the type. `len` for cstring should be deprecated outside of JS.

This works on top of `sequtils.count`, since that one takes 2 arguments; the same goes for `countIt`. It's also in line with how other languages implement and apply meaning to `count`. An alternative implementation could be:

After defining this original count, we can extend it to better support types that have `len` defined, like so:

This implementation would break the use of `count` for iterators, but the point of `count` is to write better algorithms for data structures; you can just use `toSeq` at that point. `countIt` works otherwise.

Flaws & backwards compatibility:

The name `count` should be backwards compatible; other options include `getLen`, `countLen`.

Deprecating cstring.len is the crux of this issue.

I have an idea as to how you can deprecate it for JS too. JS cstring actually counts unicode characters as 1 character (nim-lang/Nim#10911), so we might use a new overload instead of `len` here, but I don't know what the name would be.

If not, and JS cstring len should be kept, then that's also doable, but it would be hard to document. Declaration branching between different backends is never fun, and when it's for a deprecated symbol that's even worse.

JS cstring is a topic for another day; not deprecating `len` for cstrings is fine, and `c_strlen` should stay in system anyway.

The main backwards compatibility issue is porting standard library procs to `count`. This would only happen when we start using concepts, though, and that's going to be backwards incompatible in and of itself.
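The two steps described in the proposal (a unary `count` that iterates, then an extension that prefers `len` where a type defines one) might be sketched roughly as below. This is not the code from the original issue: `Node`, `countByIteration` and this `count` are hypothetical names chosen for illustration, and `when compiles(x.len)` is just one assumed way to detect a `len` overload.

```nim
# Hypothetical sketch: `Node`, `countByIteration` and this `count` are
# illustrative names, not existing stdlib symbols.

type
  Node = ref object          # minimal singly linked list with no stored length
    value: int
    next: Node

iterator items(n: Node): int =
  ## Walks the list from `n` to the end.
  var cur = n
  while cur != nil:
    yield cur.value
    cur = cur.next

proc countByIteration[T](x: T): int =
  ## Step 1: count by walking every element; always O(n).
  for _ in x:
    inc result

proc count[T](x: T): int =
  ## Step 2: use `len` when the type provides one, otherwise iterate.
  when compiles(x.len):
    result = x.len
  else:
    result = countByIteration(x)

let list = Node(value: 1, next: Node(value: 2, next: Node(value: 3)))
echo count(@[10, 20, 30])   # seq defines `len`, so no iteration is needed
echo count(list)            # `Node` has no `len`, so this falls back to iterating
```

Note that under this sketch `count` would still pick `len` for `cstring` as long as that overload exists, which is why the deprecation question above matters for the design.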