Commit

docs: Add new features and fix import usage
FoxxMD committed Oct 26, 2023
1 parent 7337c2d commit 69b0872
Showing 13 changed files with 140 additions and 18 deletions.
39 changes: 34 additions & 5 deletions README.md
@@ -51,12 +51,14 @@ Pass a list of `ComparisonStrategy` objects using `{strategies: []}` to define w

The average of the scores from all passed strategies is returned as `highScore` (and `highScoreWeighted`) from `stringSameness()`

When no strategies are explicitly passed a default set of strategies is used, found in `@foxxmd/string-sameness/strategies`:
When no strategies are explicitly passed, a default set of strategies is used, available via `import {defaultStrategies} from '@foxxmd/string-sameness';`:

* [Dice's Coefficient](https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient) in [`diceSimilarities.ts`](/src/matchingStrategies/diceSimilarity.ts)
* [Cosine Similarity](https://en.wikipedia.org/wiki/Cosine_similarity) in [`cosineSimilarities.ts`](/src/matchingStrategies/cosineSimilarity.ts)
* [Levenshtein Distance](https://en.wikipedia.org/wiki/Levenshtein_distance) in [`levenSimilarities.ts`](/src/matchingStrategies/levenSimilarity.ts)

Strategies can be accessed individually using `import {strategies} from '@foxxmd/string-sameness';`
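
As a rough illustration of how several strategies might be averaged into one score, here is a minimal, self-contained sketch. It does not use the library's code: `SketchStrategy`, `levenSketch`, `diceSketch`, and `averageScore` are hypothetical names, and the real `ComparisonStrategy` type and scoring details may differ.

```ts
// Hypothetical, simplified strategy shape for illustration only;
// the library's actual `ComparisonStrategy` type may differ.
interface SketchStrategy {
    name: string;
    // returns a similarity score between 0 and 100
    score: (a: string, b: string) => number;
}

// Classic dynamic-programming Levenshtein edit distance (two-row variant)
const levenshtein = (a: string, b: string): number => {
    let prev: number[] = Array.from({length: b.length + 1}, (_, i) => i);
    for (let i = 1; i <= a.length; i++) {
        const curr: number[] = [i];
        for (let j = 1; j <= b.length; j++) {
            const cost = a[i - 1] === b[j - 1] ? 0 : 1;
            curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
        }
        prev = curr;
    }
    return prev[b.length];
};

// Edit distance converted to a 0-100 similarity
const levenSketch: SketchStrategy = {
    name: 'levenSketch',
    score: (a, b) => {
        const maxLen = Math.max(a.length, b.length);
        return maxLen === 0 ? 100 : (1 - levenshtein(a, b) / maxLen) * 100;
    }
};

// All adjacent character pairs in a string
const bigrams = (s: string): string[] => {
    const out: string[] = [];
    for (let i = 0; i < s.length - 1; i++) {
        out.push(s.slice(i, i + 2));
    }
    return out;
};

// Dice's coefficient over character bigrams, scaled to 0-100
const diceSketch: SketchStrategy = {
    name: 'diceSketch',
    score: (a, b) => {
        if (a === b) return 100;
        const ba = bigrams(a);
        const remaining = bigrams(b);
        const total = ba.length + remaining.length;
        if (ba.length === 0 || remaining.length === 0) return 0;
        let overlap = 0;
        for (const gram of ba) {
            const idx = remaining.indexOf(gram);
            if (idx !== -1) {
                overlap++;
                remaining.splice(idx, 1); // each bigram may only match once
            }
        }
        return ((2 * overlap) / total) * 100;
    }
};

// Plain (unweighted) mean of all strategy scores
const averageScore = (a: string, b: string, strats: SketchStrategy[]): number =>
    strats.length === 0
        ? 0
        : strats.reduce((sum, s) => sum + s.score(a, b), 0) / strats.length;
```

Calling `averageScore('night', 'nacht', [levenSketch, diceSketch])` would combine both scores into a single number, in the spirit of `highScore`; any weighting behind `highScoreWeighted` is not modelled here.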

### Bring Your Own Strategy

Use your own strategy by creating an object that conforms to `ComparisonStrategy`:
@@ -107,12 +109,14 @@ const result = stringSameness('This is one sentence', 'This is another sentence'

Pass a list of functions using `{transforms: []}` to transform the strings before comparison. When not explicitly provided a default set of functions is applied to normalize the strings (to remove trivial differences):

* normalize unicode, e.g. convert Ö => O
* convert to lowercase
* trim (remove whitespace at beginning/end)
* remove non-alphanumeric characters (punctuation and newlines)
* replace any instances of 2 or more consecutive whitespace with 1 whitespace

This default set of functions is exported as `defaultStrCompareTransformFuncs`.
* The default set of transformer functions is exported as `strDefaultTransforms`: `import {strDefaultTransforms} from '@foxxmd/string-sameness';`
* All built-in transformers are available via `import {transforms} from '@foxxmd/string-sameness';`
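
To make the default normalization steps listed above concrete, the following is a minimal sketch of an equivalent pipeline. It is not the library's implementation: `SketchTransform`, `sketchDefaults`, and `applyTransforms` are hypothetical names, the function shape `(str: string) => string` is an assumption, and the exact regular expressions the library uses may differ.

```ts
// Assumed shape of a transform function: takes a string, returns a string
type SketchTransform = (str: string) => string;

// Decompose accented characters (NFD) and drop the combining marks, e.g. Ö => O
const stripDiacritics: SketchTransform = (str) =>
    str.normalize('NFD').replace(/[\u0300-\u036f]/g, '');

const toLower: SketchTransform = (str) => str.toLowerCase();

const trim: SketchTransform = (str) => str.trim();

// Remove everything that is not a letter, digit, or space (punctuation, newlines)
const removeNonAlphanumeric: SketchTransform = (str) =>
    str.replace(/[^a-zA-Z0-9 ]/g, '');

// Replace runs of 2 or more whitespace characters with a single space
const collapseWhitespace: SketchTransform = (str) =>
    str.replace(/\s{2,}/g, ' ');

// Applied in the same order as the list in the README
const sketchDefaults: SketchTransform[] = [
    stripDiacritics,
    toLower,
    trim,
    removeNonAlphanumeric,
    collapseWhitespace
];

// Run each transform in order, feeding the output of one into the next
const applyTransforms = (str: string, funcs: SketchTransform[]): string =>
    funcs.reduce((acc, fn) => fn(acc), str);
```

Because the transforms run in sequence, order matters: trimming before punctuation removal, as listed, can leave a trailing space when a string ends in ` !`, so a custom list may want to trim last.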

Example of supplying your own transform functions:

@@ -128,17 +132,42 @@ const myFuncs = [
const result = stringSameness('This is one sentence', 'This is another sentence', {transforms: myFuncs});
```

## Token Re-ordering

If token (word) order in the strings is not important, you can have string-sameness attempt to re-order all words before comparing sameness. This brings comparison scores much closer to "absolute sameness in all characters within string". For example:

* `this is correct order`
* `order correct this is`

Scores 60 **without** reordering

Scores 100 **with** reordering

Behavior caveats:

* The **second** string argument is reordered to match the **first** string argument
* If the second string is longer than the first, then any non-matched words are appended to the end of the re-ordered string in the order they were found

To use:

```js
import {stringSameness} from '@foxxmd/string-sameness';

const res = stringSameness(strA, strB, {reorder: true});
```
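
The caveats above can be illustrated with a small, self-contained sketch of one possible re-ordering approach. This is not the library's `reorderStr`: `reorderSketch` is a hypothetical function that assumes tokens are separated by single spaces and matched by exact equality, whereas the library may match tokens more loosely.

```ts
// Reorder the tokens of `second` to follow the token order of `first`.
// Tokens in `second` with no match in `first` are appended at the end,
// in the order they originally appeared.
const reorderSketch = (first: string, second: string): string => {
    const firstTokens = first.split(' ').filter((t) => t !== '');
    const remaining = second.split(' ').filter((t) => t !== '');
    const ordered: string[] = [];

    for (const token of firstTokens) {
        const idx = remaining.indexOf(token);
        if (idx !== -1) {
            ordered.push(token);
            remaining.splice(idx, 1); // consume the match so duplicates pair up one-to-one
        }
    }

    return [...ordered, ...remaining].join(' ');
};

// Second string is rewritten into the first string's word order
const reordered = reorderSketch('this is correct order', 'order correct this is');

// Unmatched words ("c" and "d") keep their relative order at the end
const withExtras = reorderSketch('a b', 'c b a d');
```

Only the second argument is rewritten, so swapping the arguments can produce a different re-ordered string.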

## Factory

For convenience, a factory function is also provided:

```ts
import {createStringSameness} from "@foxxmd/string-sameness";
import {levenStrategy} from "@foxxmd/string-sameness/strategies";
import {createStringSameness, strategies} from "@foxxmd/string-sameness";
import {myTransforms, myStrats} from './util';

const {levenStrategy} = strategies;

// sets the default options object used as the third argument to `stringSameness`
const myCompare = createStringSameness({transforms: myTransforms, strategies: myStrats});
const myCompare = createStringSameness({transforms: myTransforms, strategies: [levenStrategy, ...myStrats]});

// uses myTransforms and myStrats
const plainResult = myCompare('This is one sentence', 'This is another sentence');
12 changes: 12 additions & 0 deletions dist/commonjs/atomic.d.ts
@@ -1,6 +1,18 @@
export interface StringComparisonOptions {
/**
* An array of transformations to apply to each string before comparing similarity
* */
transforms?: StringTransformFunc[];
/**
* An array of strategies used to score similarity. All strategies' scores are combined for an average high score.
* */
strategies?: ComparisonStrategy<ComparisonStrategyResultValue>[];
/**
* Reorder the second string so its tokens match the order of the first string as closely as possible
*
* Useful when only the differences in content are important, but not the order of the content
* */
reorder?: boolean;
}
export interface StringSamenessResult {
strategies: {
1 change: 1 addition & 0 deletions dist/commonjs/index.d.ts
@@ -2,6 +2,7 @@ import { ComparisonStrategyResult, StringComparisonOptions, StringSamenessResult
import { strDefaultTransforms, transforms } from "./normalization/index.js";
declare const defaultStrategies: import("./atomic.js").ComparisonStrategy<import("./atomic.js").ComparisonStrategyResultObject>[];
declare const stringSameness: (valA: string, valB: string, options?: StringComparisonOptions) => StringSamenessResult;
export declare const reorderStr: (cleanA: string, cleanB: string, options?: StringComparisonOptions) => string;
declare const createStringSameness: (defaults: StringComparisonOptions) => (valA: string, valB: string, options?: StringComparisonOptions) => StringSamenessResult;
declare const strategies: {
diceStrategy: import("./atomic.js").ComparisonStrategy<import("./atomic.js").ComparisonStrategyResultObject>;
40 changes: 37 additions & 3 deletions dist/commonjs/index.js

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion dist/commonjs/index.js.map

4 changes: 2 additions & 2 deletions dist/commonjs/normalization/index.js

2 changes: 1 addition & 1 deletion dist/commonjs/normalization/index.js.map

12 changes: 12 additions & 0 deletions dist/esm/atomic.d.ts
@@ -1,6 +1,18 @@
export interface StringComparisonOptions {
/**
* An array of transformations to apply to each string before comparing similarity
* */
transforms?: StringTransformFunc[];
/**
* An array of strategies used to score similarity. All strategies' scores are combined for an average high score.
* */
strategies?: ComparisonStrategy<ComparisonStrategyResultValue>[];
/**
* Reorder the second string so its tokens match the order of the first string as closely as possible
*
* Useful when only the differences in content are important, but not the order of the content
* */
reorder?: boolean;
}
export interface StringSamenessResult {
strategies: {