word_count() is not accurate when counting sentences with quotes #2

Closed
DaveChild opened this issue Dec 2, 2010 · 2 comments

Comments

@DaveChild
Owner

Issue transferred from Google Code:

Here's the test case:

public function testWordCountWithQuotes() {
    $textStats = new TextStatistics();
    // Single-quoted so the embedded double quotes are part of the string
    $text = '"There should be seven words," said Joe';

    $expected = 7;
    $actual = $textStats->word_count($text); // value is 8

    $this->assertEqual($actual, $expected);
}

Here's a possible fix:

In clean_text(), replace:

$strText = preg_replace('/[,:;()-]/', ' ', $strText); // Replace commas, hyphens etc (count them as spaces)

with:

$strText = preg_replace('/[",:;()-]/', ' ', $strText); // Replace double quotes, commas, hyphens etc (count them as spaces)
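
To see why the extra character matters, here is a minimal standalone sketch. It does not use the library's own code: sketch_clean_text() and sketch_word_count() are hypothetical helpers that assume words are counted as space-separated tokens after punctuation is replaced with spaces and whitespace is collapsed. Under that assumption, the comma after "words" becomes a space and leaves the closing quote stranded as a token of its own, which would explain the extra word.

```php
<?php
// Hypothetical helper, NOT the library's clean_text(): replaces every
// character matched by $pattern with a space, collapses whitespace, trims.
function sketch_clean_text(string $text, string $pattern): string
{
    $text = preg_replace($pattern, ' ', $text);
    $text = preg_replace('/\s+/', ' ', $text);
    return trim($text);
}

// Hypothetical helper: counts space-separated tokens in the cleaned text.
function sketch_word_count(string $text, string $pattern): int
{
    $clean = sketch_clean_text($text, $pattern);
    return $clean === '' ? 0 : count(explode(' ', $clean));
}

$text = '"There should be seven words," said Joe';

// Original character class: the closing quote survives as its own token.
$before = sketch_word_count($text, '/[,:;()-]/');

// Proposed character class: double quotes are treated as separators too.
$after = sketch_word_count($text, '/[",:;()-]/');

echo "before: $before, after: $after\n";
```

With the original pattern the sketch counts 8 tokens, and with the double quote added to the character class it counts 7, matching the numbers in the test case.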

@DaveChild
Owner Author

Formatting a bit stuffed ... original is here:

http://code.google.com/p/php-text-statistics/issues/detail?id=6

@DaveChild
Owner Author

This was added to the code.
