Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 2124 #2129

Merged
merged 12 commits into from
Aug 17, 2021
Merged

Issue 2124 #2129

merged 12 commits into from
Aug 17, 2021

Conversation

koheiw
Copy link
Collaborator

@koheiw koheiw commented Aug 14, 2021

For #2124, add separator to phrase() and make as.phrase() for objects for which separator has no effect.

@codecov
Copy link

codecov bot commented Aug 14, 2021

Codecov Report

Merging #2129 (6eb70b4) into master (a5eb744) will decrease coverage by 0.03%.
The diff coverage is 87.50%.

❗ Current head 6eb70b4 differs from pull request most recent head 2a0bfba. Consider uploading reports for the commit 2a0bfba to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2129      +/-   ##
==========================================
- Coverage   96.17%   96.14%   -0.04%     
==========================================
  Files          87       87              
  Lines        4938     4949      +11     
==========================================
+ Hits         4749     4758       +9     
- Misses        189      191       +2     
Impacted Files Coverage Δ
R/phrases.R 88.00% <87.50%> (-4.86%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a5eb744...2a0bfba. Read the comment docs.

@koheiw koheiw requested a review from kbenoit August 14, 2021 02:12
Copy link
Collaborator

@kbenoit kbenoit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good addition and move to make the coercion methods more consistent as well. However I see two things.

  1. Better to deprecate or just make noRd the current phrase() methods that will be moved to as.phrase(), just so not to break existing code in a minor update. (I'm happy to do this if you want.)

  2. I'm happy with the use the separator here. But I did take opportunity to review our various uses of contatenator versus separator to make sure our usage makes sense and is consistent. I think it is.

Function concatenator separator (none)
kwic()
dictionary()
as.dictionary()
tokens_split()
tokens_ngrams()
tokens_skipgrams()
corpus_group()
spacyr::entity_consolidate()
spacyr::entity_extract()
quanteda.textstats::textstat_collocations()

I think this is consistent because all of our functions that use concatenator are designed to take parts and put them together, using the value of concatenator to be the joining character. The functions that use separator on the other hand either take things that are already concatenated and split them using the value of separator to know what value to use for this split.

So by that logic, for as.phrase() or phrase() the use of separator is correct.

@koheiw
Copy link
Collaborator Author

koheiw commented Aug 16, 2021

I doubt that more than one or two people are using phrase() on these objects, but you are free to deprecate the methods.

@koheiw koheiw merged commit a425cb3 into master Aug 17, 2021
@koheiw koheiw deleted the issue-2124 branch August 17, 2021 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants