InStr() Behavior: Blank "Needle" #13

Uberi opened this Issue Apr 25, 2012 · 1 comment



Uberi commented Apr 25, 2012

When InStr() is called with a blank Needle parameter, the search always succeeds, and the match is found at the first character. Besides the fact that this behavior is undocumented, this seems counterintuitive to me.

I don't want to get into a philosophical debate about whether the nothingness is there if it doesn't exist, so I'll just give a few cases where it has led to bugs :P

Case 1:

Position := 1
While, InStr("0123456789",SubStr(Text,Position,1))
    Position ++

The bug here is subtle: if the digits run all the way to the end of Text, the loop never terminates. Once Position passes the end of the string, SubStr() returns a blank string, and InStr() always matches that at position 1.
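
One way to sidestep this in existing v1 scripts is to check the bounds before calling InStr(). The following is only a sketch in AutoHotkey v1 syntax, and the sample value of Text is made up for illustration:

```autohotkey
; AutoHotkey v1 sketch; the value of Text is a hypothetical example.
Text := "2012"
Position := 1
; Test the bounds first so SubStr() never hands a blank Needle to InStr().
; The && operator short-circuits, so InStr() is skipped once Position is past the end.
While (Position <= StrLen(Text) && InStr("0123456789", SubStr(Text, Position, 1)))
    Position++
; Position now points just past the leading run of digits.
MsgBox % "First non-digit position: " Position
```

With the length check in place, the loop stops at the end of the string even when Text consists entirely of digits.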

Case 2:

If InStr("Red|Orange|Yellow|Green|Blue|Indigo|Violet",Color)
    MsgBox, The color is valid.

Here, we see another bug; if Color is blank, InStr() will match in the list anyway.
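
A possible guard, again only a sketch in v1 syntax with a made-up value for Color, is to reject the blank string explicitly before searching the list:

```autohotkey
; AutoHotkey v1 sketch; the blank Color is a hypothetical input.
Color := ""
; Rule out the blank string first, since InStr() would otherwise report a match.
If (Color != "" && InStr("Red|Orange|Yellow|Green|Blue|Indigo|Violet", Color))
    MsgBox, The color is valid.
Else
    MsgBox, The color is not valid.
```

This only addresses the blank case. InStr() still performs a substring search, so a partial name such as "ed" would also be accepted.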


Of course, this would be a backwards-compatibility breaking change. As a result I can't recommend putting it into the v1 series of AHK, but at least documenting it would be nice.

Perhaps something to consider for v2?


Lexikos commented May 24, 2012

It is documented, in the same way that every other result is documented:

InStr: Returns the position of an occurrence of the string Needle in the string Haystack.

The following demonstrates InStr returning the position of an occurrence of the empty string in Haystack:

Haystack := "any string"
Needle := ""
p := InStr(Haystack, Needle)
MsgBox % SubStr(Haystack, p, StrLen(Needle)) = Needle ? "Needle matches" : "fail"

The current behaviour is consistent with RegEx, which can match an empty string at any position in any string (excluding certain zero-length assertions).
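
The comparison with RegEx can be tried side by side. This is a small sketch in v1 syntax using an arbitrary sample Haystack; both calls should report a match at the start of the string:

```autohotkey
; AutoHotkey v1 sketch; Haystack is an arbitrary sample value.
Haystack := "any string"
InStrPos := InStr(Haystack, "")       ; empty Needle
RegExPos := RegExMatch(Haystack, "")  ; empty pattern
MsgBox % "InStr: " InStrPos "`nRegExMatch: " RegExPos
```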

Note that AutoHotkey uses strstr/wcsstr or a similar function, depending on StringCaseSense.

strstr/wcsstr: If strSearch points to a string of zero length, the function returns str.

As for philosophy (or semantics): the empty string exists; it merely does not contain any characters.

documenting it

The current documentation for InStr is a "wall of text". The longer it gets, the more difficult it is to find any specific piece of information. At some point the documentation for InStr and numerous other functions should be split into separate pages and restructured, as I've done for NumGet and some others in the v2 docs. Then I think it would be acceptable for the empty string behaviour to be documented explicitly, since as you pointed out, it can be a pitfall.

changing it

I see your point, but it doesn't seem like enough to justify changing the behaviour. I'm not convinced that it is counter-intuitive, or that any alternative is intuitive. The current behaviour has been around a long time, in AutoHotkey and other languages. If you think this topic warrants further discussion, I suggest starting a thread on the forum.

Lexikos closed this May 24, 2012
