Add support for list of bigrams #23

Merged: wooorm merged 9 commits into words:main from khxu:allow-bigram-as-input on Nov 10, 2021

Conversation

Contributor

@khxu khxu commented Nov 6, 2021

This PR implements #22: it skips "bigram-ifying" an input that is already a list of bigrams, which it detects by checking whether the input is an array.

Used nested ternaries for the logic -- would understand if you'd prefer not having those, though.

Comment thread index.js Outdated
Comment on lines 11 to 12
var value_ = String(value).toLowerCase()
var alt = String(alternative).toLowerCase()
Member

I’m quite sure this is making your later code never run though?
It does two things: cast to string, and lowercase.
The casting could be done only when the value isn’t a string.
The lowercasing could be done for each bigram maybe?
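
For readers following along, the concern above can be reproduced in a few lines. This is a minimal sketch, not code from the PR: `normalizeInput` is a hypothetical helper name, and the sketch only assumes that the input is either an array of bigrams or something that can be cast to a string.

```javascript
// Casting an array to a string joins its items with commas,
// so the result is no longer an array.
const cast = String(['AB', 'BC']).toLowerCase()
console.log(cast) // logs: ab,bc
console.log(Array.isArray(cast)) // logs: false

// Hypothetical helper (not from the PR): lowercase each bigram of an
// array input, and cast anything else to a lowercased string.
function normalizeInput(value) {
  return Array.isArray(value)
    ? value.map((bigram) => String(bigram).toLowerCase())
    : String(value).toLowerCase()
}

console.log(normalizeInput(['AB', 'BC'])) // array is kept, items lowercased
console.log(normalizeInput(123)) // non-string is cast to a string
```

Keeping the array check ahead of the cast is what lets both input kinds be handled case-insensitively.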

Contributor Author

Thanks for the quick review! Ah yeah, there is a bug, but for a different reason -- the later code does run, because left and right are assigned from the original input params (value and alternative) rather than from value_ and alt. However, bigram inputs would not be compared case-insensitively. I made another commit that fixes the case sensitivity, but it is probably still not optimal from a readability perspective. I will try to come up with a refactoring that is more pleasant to read.

Contributor Author

Okay I think I've got a more readable solution committed.

Comment thread readme.md Outdated
Comment on lines +33 to +44

// bigrams may also be passed as input arguments for improved efficiency
// when analyzing the same strings repeatedly, for example, when
// comparing the text of each file in a directory with the text of
// each file in another directory.

import {bigram} from 'n-gram'

const bigramifiedString1 = bigram('abc') // ['ab', 'bc']
const bigramifiedString2 = bigram('xyz') // ['xy', 'yz']

diceCoefficient(bigramifiedString1, bigramifiedString2) // => 0
Member

I don’t think this needs to be in the Use section, but the API section should probably mention that arrays of strings are allowed, with a note that they should be bigrams?

Contributor Author

Moved the explanation to another code section, if that works.
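
As an illustration of the use case discussed in this thread (computing bigrams once and reusing them across many comparisons), here is a rough sketch. It is not the library's implementation: `toBigrams` is a hypothetical stand-in for `bigram` from `n-gram`, `diceSketch` is a hypothetical name, and returning 0 when both inputs have no bigrams is an assumption made only for this sketch.

```javascript
// Hypothetical stand-in for `bigram` from `n-gram`.
function toBigrams(value) {
  const result = []
  for (let index = 0; index < value.length - 1; index++) {
    result.push(value.slice(index, index + 2))
  }
  return result
}

// Accept either a list of bigrams or a value to cast, lowercasing both.
// `map` returns a copy, so the caller's array is never mutated.
function toList(value) {
  return Array.isArray(value)
    ? value.map((d) => String(d).toLowerCase())
    : toBigrams(String(value).toLowerCase())
}

// Sketch of a Sørensen–Dice coefficient over bigrams.
function diceSketch(value, alternative) {
  const left = toList(value)
  const right = toList(alternative)
  const total = left.length + right.length
  if (total === 0) return 0 // assumption for this sketch only
  let intersections = 0
  for (const pair of left) {
    const position = right.indexOf(pair)
    if (position !== -1) {
      intersections++
      right.splice(position, 1) // each right-hand bigram matches at most once
    }
  }
  return (2 * intersections) / total
}

// Bigrams are computed once per text, then reused for every comparison.
const first = ['night', 'abc'].map((d) => toBigrams(d))
const second = ['nacht', 'xyz'].map((d) => toBigrams(d))
for (const a of first) {
  for (const b of second) {
    console.log(a.join(' '), '|', b.join(' '), '->', diceSketch(a, b))
  }
}
```

With n texts on one side and m on the other, each text is split into bigrams once instead of once per comparison, which is the efficiency gain the readme text describes.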

Member

@wooorm wooorm left a comment

Some prose suggestions. Rest looks good!

Comment thread readme.md Outdated
khxu and others added 3 commits November 9, 2021 08:01
Co-authored-by: Titus <tituswormer@gmail.com>
Co-authored-by: Titus <tituswormer@gmail.com>
Co-authored-by: Titus <tituswormer@gmail.com>
@wooorm wooorm changed the title from "Allow bigram as input" to "Add support for list of bigrams" Nov 10, 2021
@wooorm wooorm merged commit da9b223 into words:main Nov 10, 2021
Member

wooorm commented Nov 10, 2021

released, thanks!
