Revert change: zero-width spaces are not spaces
If needed, it is better to handle them in `processTerm`, as a
normalization step.
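
A minimal sketch of what such a normalization could look like, assuming the goal is simply to strip zero-width characters from each term before indexing. The names `ZERO_WIDTH` and `stripZeroWidth` are hypothetical and not part of MiniSearch; only the `processTerm` option itself comes from the library.

```typescript
// Hypothetical helper (not part of MiniSearch): matches the zero-width
// characters U+200B to U+200D and U+FEFF.
const ZERO_WIDTH = /[\u200B-\u200D\uFEFF]/g

// Remove zero-width characters from a single term.
const stripZeroWidth = (term: string): string => term.replace(ZERO_WIDTH, '')

// A processTerm-style function: strip zero-width characters, lowercase,
// and return null for terms that end up empty so they can be discarded.
const processTerm = (term: string): string | null => {
  const normalized = stripZeroWidth(term).toLowerCase()
  return normalized.length > 0 ? normalized : null
}
```

Assuming this shape, it could be passed as `new MiniSearch({ fields: ['text'], processTerm })`, so that a token such as `'\u200Bagreement'` would be indexed as `'agreement'`.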
lucaong committed Mar 1, 2024
1 parent 02615f2 commit 94f595e
Showing 3 changed files with 4 additions and 6 deletions.
1 change: 0 additions & 1 deletion CHANGELOG.md
@@ -8,7 +8,6 @@
invalid operators at compile time.
- More informative error when specifying an invalid value for `combineWith`
in JavaScript (in TypeScript this would be a compile time error)
- - Consider also Unicode zero-width spaces as separators in the tokenizer
- Use the Unicode flag to simplify the tokenizer regular expression

## v6.3.0 - 2023-11-22
3 changes: 1 addition & 2 deletions src/MiniSearch.test.js
@@ -1682,7 +1682,7 @@ e forse del mio dir poco ti cale`
},
{
id: 2,
- text: 'The estimates range from roughly 1 in 100 to 1 in 100,000. The higher figures come from the working engineers, and the very low figures from management. What are the causes and consequences of this lack of \u200Bagreement? Since 1 part in 100,000 would imply that one could put a Shuttle up each day for 300 years expecting to lose only one, we could properly ask "What is the cause of management\'s fantastic faith in the machinery?"'
+ text: 'The estimates range from roughly 1 in 100 to 1 in 100,000. The higher figures come from the working engineers, and the very low figures from management. What are the causes and consequences of this lack of agreement? Since 1 part in 100,000 would imply that one could put a Shuttle up each day for 300 years expecting to lose only one, we could properly ask "What is the cause of management\'s fantastic faith in the machinery?"'
}
]
const ms = new MiniSearch({ fields: ['text'] })
@@ -1693,7 +1693,6 @@ e forse del mio dir poco ti cale`

expect(ms.search('300').length).toBeGreaterThan(0)
expect(ms.search('machinery').length).toBeGreaterThan(0)
- expect(ms.search('agreement').length).toBeGreaterThan(0)
})

it('supports non-latin alphabets', () => {
6 changes: 3 additions & 3 deletions src/MiniSearch.ts
@@ -2032,6 +2032,6 @@ const objectToNumericMap = <T>(object: { [key: string]: T }): Map<number, T> =>
return map
}

- // This regular expression matches any Unicode space (including zero-width
- // spaces \u200B-\u200D and \uFEFF), newline, or punctuation character
- const SPACE_OR_PUNCTUATION = /[\n\r\p{Z}\p{P}\u200B-\u200D\uFEFF]/u
+ // This regular expression matches any Unicode space, newline, or punctuation
+ // character
+ const SPACE_OR_PUNCTUATION = /[\n\r\p{Z}\p{P}]/u
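
The effect of the revert can be illustrated with a small sketch. It assumes the tokenizer splits text on the separator expression and drops empty strings; the `tokenize` helper and the constant names below are hypothetical stand-ins, not the library's actual code. Zero-width characters belong to the Unicode "Format" category (Cf), so neither `\p{Z}` nor `\p{P}` matches them.

```typescript
// Equivalent to the two literals in the diff, built with the RegExp
// constructor so the sketch does not depend on the TypeScript target.
const WITH_ZERO_WIDTH = new RegExp('[\\n\\r\\p{Z}\\p{P}\\u200B-\\u200D\\uFEFF]', 'u')
const WITHOUT_ZERO_WIDTH = new RegExp('[\\n\\r\\p{Z}\\p{P}]', 'u')

// Hypothetical stand-in for a tokenizer: split on separators, drop empties.
const tokenize = (text: string, separator: RegExp): string[] =>
  text.split(separator).filter((term) => term.length > 0)

const sample = 'lack of \u200Bagreement?'

// Before the revert: the zero-width space acts as a separator.
const before = tokenize(sample, WITH_ZERO_WIDTH)
// ['lack', 'of', 'agreement']

// After the revert: the zero-width space stays attached to the term.
const after = tokenize(sample, WITHOUT_ZERO_WIDTH)
// ['lack', 'of', '\u200Bagreement']
```

Under these assumptions, a search for `agreement` would no longer match the third term after the revert unless the zero-width space is removed during term processing.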