I noticed that at least a few of your functions that take AbstractString can't actually deal with an arbitrary AbstractString, and/or do not handle non-ASCII strings, for example:
julia> parse(Identifier, Test.GenericString("a"))
ERROR: ArgumentError: regex matching is only available for the String type; use String(s) to convert
julia> DataToolkitBase.stringdist("ÆaÆb", "ÆaÆb")
ERROR: StringIndexError: invalid index [2], valid nearby indices [1]=>'Æ', [3]=>'a'
In general, I've found that:
You rarely need to index into strings. If you do, use nextind, prevind, etc.
Any time you need a pointer to a string (e.g. when matching a Regex against it), you need it to be a Union{String, SubString{String}}.
If you test your functions with a Test.GenericString containing non-ASCII characters, you catch most of these bugs.
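To make the three points above concrete, here is a minimal sketch. The helper names (chars_by_index, as_regex_input, starts_with_letter) are made up for illustration and are not part of DataToolkitBase; only nextind, firstindex, lastindex, match, String and Test.GenericString come from Julia itself.

```julia
using Test

# Hypothetical helper: walk a string by index using nextind rather than i + 1,
# so multi-byte characters such as 'Æ' never produce a StringIndexError.
function chars_by_index(s::AbstractString)
    out = Char[]
    i = firstindex(s)
    while i <= lastindex(s)
        push!(out, s[i])
        i = nextind(s, i)
    end
    return out
end

# Hypothetical helper: regex matching needs String or SubString{String},
# so pass those through untouched and convert any other AbstractString.
as_regex_input(s::Union{String, SubString{String}}) = s
as_regex_input(s::AbstractString) = String(s)

# Hypothetical helper: does the string begin with a Unicode letter?
starts_with_letter(s::AbstractString) =
    match(r"^\p{L}", as_regex_input(s)) !== nothing

# Exercise both helpers with a non-String, non-ASCII input.
@testset "AbstractString handling" begin
    g = Test.GenericString("ÆaÆb")
    @test chars_by_index(g) == collect("ÆaÆb")
    @test starts_with_letter(g)
    @test !starts_with_letter("1a")
end
```

In many cases plain iteration (for c in s) or eachindex(s) is simpler still than an explicit nextind loop; the loop is shown only because it mirrors code that was previously written with integer indexing.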
Hi Jakob, thanks for taking a look and making this issue. This project is beginning to move onto the testing/documenting/tweaking phase, so your comments are particularly appreciated. I'll see what I can do to make those functions better behaved.
If you might have any other comments, please do share them. I'd really like this project to be as well-constructed as I can make it, and so far I'm the only one who's really looked into the code, so I'm very keen on feedback.
I really just noticed it when I copy-pasted your stringdist function to test it in the context of a comment of yours. I don't use your package, I just wanted to let you know.
Yeah, I'm pretty sure at this stage I'm basically the only user of this package 😛. I really want to avoid locking in bad design decisions though, which is why I'm desperately keen to get more eyes on the code.
Just mentioning this in case anything might come of it, but I understand this is a drive-by issue 🙂