The split_word_tokens function lacks documentation, leaving users in the dark about its intended behavior. It handles strings containing only words correctly, but when the input mixes words and symbols it produces inaccurate tokens.

Adding a documentation comment that states the expected behavior would make the intent explicit: if the current handling of mixed input is not intended, the library could be replaced or fixed; if it is intended, the tests should be corrected to match.
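
For illustration only, here is a minimal sketch of what such a documentation comment might look like. The docstring wording, the example inputs, and the placeholder body are assumptions about the intended contract, not the library's actual implementation:

```python
def split_word_tokens(text):
    """Split text into word tokens.

    Assumed contract (to be confirmed): only whitespace-separated
    alphabetic words are tokenized reliably. Inputs that mix words and
    symbols (e.g. "hello, world!") may yield tokens that still contain
    punctuation.

    >>> split_word_tokens("hello world")
    ['hello', 'world']
    """
    # Placeholder body for this sketch; the real implementation lives
    # in the library under review.
    return text.split()
```

Even if the body stays untouched, a docstring along these lines would let reviewers decide whether the mixed-input results are a bug in the function or in the tests that exercise it.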