sentence-case #26
Can we add a test to https://github.com/zotero/utilities/blob/master/test/tests/utilitiesTest.js? Assuming you have the strings separately, we can just put a JSON file with the pairs in https://github.com/zotero/utilities/tree/master/test/data and load with Once you've run
According to the README,
I've dumped my strings at https://gist.github.com/retorquere/8fb5a14a0b0f0a60db3df5313a258d5c . I haven't inspected every one, I must admit; this is testing accrued over the years, based on user reports. I'd be happy to add tests.
You probably didn't do a recursive clone. (Almost all Zotero repos require recursive clones.) You can use
Great, thanks. But when we're talking about loadSampleData,
Tests have been added.
let masked = text.replace(/<[^>]+>/g, (match, i) => {
	preserve.push({ start: i, end: i + match.length, description: 'markup' });
	return '\uFFFD'.repeat(match.length);
});
The above two sections don't seem to be doing anything, at least with the sample input. Do we need them? If so, we should add some test data for them.
The (sub-)sentence-start ones? One of them did fail a test when I removed them, but I've added two new samples. But in the mail I received you seemed to be pointing towards the (sub-)sentence start handlers, here on the site it looks like you're referring to the markup handler. I'll look into the markup handler tonight -- certainly that should hit on something.
The two blocks above my comment here ("protect nocase" and "mask html tags with characters…"). Nothing failed when I removed those (while still setting masked properly).
Well that's just weird. I'll look into it tonight - I have a 6 hour drive in front of me right now.
It captures in-word markup. I will grant that I do not have a non-synthetic case at hand, but I've added a synthetic case.
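For readers following along, here is a minimal sketch of what masking in-word markup does, built around the replace call quoted above. The wrapper function `maskMarkup` and the sample title are hypothetical and only for illustration; they are not taken from the PR or its test data.

```javascript
// Hypothetical wrapper around the masking step quoted in the review.
// Each tag is replaced by U+FFFD repeated to the tag's length, so every
// character offset in the masked string still lines up with the original.
function maskMarkup(text) {
	const preserve = [];
	const masked = text.replace(/<[^>]+>/g, (match, offset) => {
		// Remember where the tag was so it can be restored afterwards.
		preserve.push({ start: offset, end: offset + match.length, description: 'markup' });
		return '\uFFFD'.repeat(match.length);
	});
	return { masked, preserve };
}

// Synthetic sample (an assumption, not from the PR): markup inside a word.
const sample = 'Effects of H<sub>2</sub>O on growth';
const { masked, preserve } = maskMarkup(sample);
console.log(masked.length === sample.length);
console.log(preserve);
```

Because the masked string keeps the original length, any case changes computed on it can be applied back to the original text by position, and the recorded ranges mark the spans that must be left untouched.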
What about the nocase?
I missed that test case; added it now.
I don't think either of these is really appropriate as an example. CSL styles don't sentence-case, so despite the name, the sole point of
I'd have no issue with Zotero making that call. For BBT it's a live case, but for Zotero it doesn't need to be.
I'm just suggesting we go with something like this:
Which better reflects how
(Which actually makes me wonder if citeproc-js actually looks for
Ah I see, I hadn't scrutinized the sample, this is how I got it from a user. I can update this tonight.
It wouldn't be hard to add that; the BBT html-to-latex supports it but I haven't documented that. Do you know what markup citeproc supports beyond b/i/sup/sub BTW?
I've replaced the nocase tests with the proposed sample
It looks great!
Oh! Sorry, I hadn't thought about this at all. Perhaps we can apply chemical elements only when the title is in English? Or exclude chemical element symbols that are function words in other languages.
Quote-protection has been removed.
Great! Thank you!
Super. This will show up in the next beta, I take it? If so, when could we expect the next beta?
Beta 32, out now
close #35, #27, #18 related: zotero/utilities#26
The problem is that when the text is in all caps, the result is still in all caps, e.g. "NITROUS-OXIDE EMISSIONS FROM VEHICLES". Such a title obviously has no specific word that needs to stay capitalized, so I think we can convert every word to lowercase when the whole text is identical to its uppercased form (that is, when it contains no lowercase letters).
That wouldn't be hard to add.
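As a rough sketch of the all-caps check suggested above (the function names `isAllCaps` and `lowercaseIfAllCaps` are hypothetical, and this is not the code that ended up in the PR):

```javascript
// Hypothetical helper: true when the text has letters and none are lowercase.
function isAllCaps(text) {
	// The second condition excludes strings without any cased letters,
	// such as "1234", which are equal to both their upper and lower forms.
	return text === text.toUpperCase() && text !== text.toLowerCase();
}

// Lowercase the whole title only when it is entirely in capitals, so that
// mixed-case titles keep acronyms and proper nouns as typed.
function lowercaseIfAllCaps(text) {
	return isAllCaps(text) ? text.toLowerCase() : text;
}

console.log(lowercaseIfAllCaps('NITROUS-OXIDE EMISSIONS FROM VEHICLES'));
console.log(lowercaseIfAllCaps('Emissions from NASA vehicles'));
```

Under this assumption the lowercased text would then go through the normal sentence-casing step, which takes care of the initial capital. The trade-off is that acronyms in an all-caps title can no longer be told apart from ordinary words.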
Oh, yes, that's a fairly significant regression.
Simpler sentence-caser. I've tried my best to make sure it is in line with the coding style for Zotero, but utilities does not have an eslint config.