Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chmatchdup() now handles non-ascii strings consistently #3850

Merged
merged 10 commits into from Sep 14, 2019
Merged

chmatchdup() now handles non-ascii strings consistently #3850

merged 10 commits into from Sep 14, 2019

Conversation

@shrektan
Copy link
Member

shrektan commented Sep 9, 2019

Closes #3844

chmatchdup() now handles non-ASCII strings correctly.

More specifically, chmatchdup () calls the C code chmatchMain(), which would fall back to Rf_match() when non-ASCII strings were detected. However, the implementation was correct only before the param bool chmatchdup got introduced. Since Rf_match() can't handle the param chmatchdup, it would return undesired results for non-ASCII strings.

This PR will address this issue.

@codecov

This comment has been minimized.

Copy link

codecov bot commented Sep 9, 2019

Codecov Report

Merging #3850 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3850      +/-   ##
==========================================
+ Coverage   99.42%   99.42%   +<.01%     
==========================================
  Files          71       71              
  Lines       13441    13443       +2     
==========================================
+ Hits        13364    13366       +2     
  Misses         77       77
Impacted Files Coverage Δ
src/chmatch.c 100% <100%> (ø) ⬆️
src/utils.c 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5f284cf...ac563a3. Read the comment docs.

src/chmatch.c Show resolved Hide resolved
@shrektan

This comment has been minimized.

Copy link
Member Author

shrektan commented Sep 10, 2019

Note match_logical() is no longer used by any function. However, I'm not sure deleting the function is desired or not. So I just leave it there.


From Matt: Thanks. Yep let's just remove it then (will do). Great.

src/chmatch.c Outdated Show resolved Hide resolved
src/chmatch.c Show resolved Hide resolved
src/chmatch.c Outdated Show resolved Hide resolved
@shrektan

This comment has been minimized.

Copy link
Member Author

shrektan commented Sep 10, 2019

One question, as of SEXP *xd = STRING_PTR(coerceUtf8IfNeeded(x));, do I need to PROTECT() or not? I mean xd is pointing to coerceUtf8IfNeeded(x), which will be collected if GC being trigger or R is smart enough to know it's in use and won't collect it?


From Matt: yes the value returned from coerceUtf8IfNeeded() needs to be PROTECT-ed, will do. I did run this PR through rchk (see commands and links in CRAN_Release.cmd, and #3867 for recent work) which I expected to catch the missing PROTECT. But it didn't. That could be because rchk is even smarter to know that in this case there is actually no R API call before the last use of xd, so there is nowhere in this case that a GC could be triggered. It's very possible that it is that smart because the messages I've seen it report elsewhere suggest that. And the nature of this section of code (truelength-clobber) is that we do avoid R API usage here. rchk is a static analysis tool so it doesn't depend on coverage or tests to work, it works by analyzing the source code. Or it could be that the STRING_PTR wrapper hides the usage of coerceUtf8IfNeeded()'s return value from rchk's tracking. Not sure. But rchk didn't have any trouble with chmatch.c at all so it's not like the nesting was too deep and its coverage was incomplete; i.e. it didn't report any errors or warning on chmatch.c in this PR like it does with some other files. We should add the PROTECT anyway for completeness, will do.


From Shrektan: Thanks. For future readers, you may find https://github.com/kalibera/cran-checks/blob/master/rchk/PROTECT.md useful.

src/utils.c Outdated Show resolved Hide resolved
src/utils.c Outdated Show resolved Hide resolved
src/utils.c Outdated Show resolved Hide resolved
@shrektan shrektan mentioned this pull request Sep 13, 2019
25 of 25 tasks complete
@mattdowle mattdowle changed the title Closes #3844: chmatchdup() now handles non-ascii strings consistently chmatchdup() now handles non-ascii strings consistently Sep 14, 2019
mattdowle added 4 commits Sep 14, 2019
Copy link
Member

mattdowle left a comment

Excellent! Much simpler and better!

@mattdowle mattdowle merged commit b10dc94 into master Sep 14, 2019
4 checks passed
4 checks passed
codecov/patch 100% of diff hit (target 99.42%)
Details
codecov/project 99.42% (+<.01%) compared to 5f284cf
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@mattdowle mattdowle deleted the fix3844 branch Sep 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

4 participants
You can’t perform that action at this time.