use regex for lexical illusions #1174
Codecov Report
@@ Coverage Diff @@
## main #1174 +/- ##
=======================================
Coverage 90.14% 90.15%
=======================================
Files 83 83
Lines 1208 1209 +1
=======================================
+ Hits 1089 1090 +1
Misses 119 119
@@ -21,11 +21,6 @@ def check(text):
    """Check the text."""
    err = "lexical_illusions.misc"
    msg = u"There's a lexical illusion here: a word is repeated."
    regex = r"\b(\w+)\b\s\1"
There are a few instances where repeated words are perfectly acceptable. For example, one day I was commenting on student writing and I noticed that the student used the word "that" instead of "which". It's a hard distinction for many, and I don't blame them. So here I am now, telling you that that "that" that that student wrote ought to have been a "which".
> There are a few instances where repeated words are perfectly acceptable.
Grammatically valid, perhaps. Stylistically speaking, it's an absolute
atrocity. I had to attempt to read that example you threw in around 3
times before it parsed correctly in my mind.
Valid, yes - but also beneficial (and always possible) to avoid. An
exception has been made for punctuation in between word boundaries,
because splitting the lexical structure into phrases makes it readable.
I can therefore tell you that the "that" that the student wrote should
have been a "which", but there are other ways of writing about the
"which" which the student should've written without entering linguistic
territory in which a "which" which your student should've written such
that the "that" that your student wrote isn't so bad.
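The punctuation behaviour described above can be sketched with the pattern shown in the diff hunk. This is only an illustration: the helper name `has_repeat` is hypothetical, and it is an assumption that the quoted pattern is the one that was finally merged.

```python
import re

# Pattern copied from the diff hunk in this PR. Whether it is the final
# merged form is an assumption, not something this page confirms.
REGEX = r"\b(\w+)\b\s\1"


def has_repeat(text):
    """Return True if a word is followed by one whitespace character and itself."""
    return re.search(REGEX, text) is not None


# A plain repetition is flagged.
print(has_repeat("I saw the the cat"))  # True

# Punctuation between the two words breaks the match, because the pattern
# requires whitespace directly after the first word.
print(has_repeat("that, that"))  # False
print(has_repeat('the "that" that the student wrote'))  # False
```

Note that, as quoted, the pattern has no trailing `\b`, so `has_repeat("the theme")` would also return True. A stricter variant would end with `\1\b`.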
"is an English sentence used to demonstrate lexical ambiguity and the
necessity of punctuation"
I do not think from a stylistic perspective we should be referencing
esoteric sentences that serve to demonstrate poor linguistic usage when
justifying limitations.
```
The sentence is easier to understand with added punctuation and
emphasis:
James, while John had had "had", had had "had had"; "had had" had
had a better effect on the teacher.
```
The tool is designed to help people write better. Avoiding structures
like the ones you have referenced in favour of their better formatted
alternatives is doing exactly that, I feel.
Either way, the decision lies with you. I've said my piece.
How about we make an exception for
Works for me. Apologies if that came off aggressive, by the way, I was just sharing my thoughts and it ended up coming out somewhat passionately
All good, not going to complain about passion!
It seems like to do so we will need to be able to create exceptions using the
ad33608 to d4cca14
@suchow The exceptions for
d4cca14 to b82f8bb
Wonderful, thank you, an excellent way to resolve this issue.
This relates to...
The usage of regular expressions rather than simple matches for lexical_illusions.misc.
Rationale
Prior to this, the lexical illusions check would only catch
3 different lexical illusions, and always in pairs of two rather than
flagging the entire strand. This is inefficient, and does not catch
many lexical illusions at all.
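The difference between flagging word pairs and flagging a whole strand can be sketched as follows. The pattern `STRAND` and both helper functions are hypothetical and written for illustration; they are not taken from the project's code.

```python
import re

# Hypothetical strand pattern: a word followed by one or more
# whitespace-separated repetitions of itself.
STRAND = re.compile(r"\b(\w+)(?:\s+\1\b)+")


def strands(text):
    """Return each run of repeated words as a single match."""
    return [m.group(0) for m in STRAND.finditer(text)]


def pairs(text):
    """Return every adjacent pair of identical words (pairwise reporting)."""
    words = text.split()
    return [(a, b) for a, b in zip(words, words[1:]) if a == b]


text = "this is is is wrong"
print(strands(text))  # ['is is is']
print(pairs(text))    # [('is', 'is'), ('is', 'is')]
```

Reporting the strand yields one error for the run, where pairwise reporting yields one error per adjacent pair.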
Changes
- lexical_illusions.misc now uses regex
- lexical_illusions.misc now reports entire strands instead of many word pairs
- lexical_illusions.misc reports all lexical illusions except for "that that" and "had had"
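One way the "that that" and "had had" exceptions could work is to match every repeated strand and then skip those whose repeated word is on an allow-list. This is a minimal sketch under that assumption: the names `EXCEPTIONS`, `STRAND`, and `find_illusions` are hypothetical, and the mechanism actually used in this PR is not shown on this page.

```python
import re

# Hypothetical allow-list of words that may legitimately repeat.
EXCEPTIONS = {"that", "had"}

# Hypothetical strand pattern: a word followed by one or more
# whitespace-separated repetitions of itself.
STRAND = re.compile(r"\b(\w+)(?:\s+\1\b)+")


def find_illusions(text):
    """Return repeated-word strands, skipping words on the allow-list."""
    return [
        m.group(0)
        for m in STRAND.finditer(text)
        if m.group(1).lower() not in EXCEPTIONS
    ]


print(find_illusions("I think that that is fine"))  # []
print(find_illusions("he had had enough"))          # []
print(find_illusions("Paris in the the spring"))    # ['the the']
```

A side effect of filtering on the repeated word is that longer runs such as "that that that" are skipped as well. A check that should still flag those would need to inspect the length of the strand.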
Features
N/A.
Bug Fixes
N/A.
Breaking Changes and Deprecations
N/A.