Rename unnest_tokens to eliminate semantic overlap/ambiguity with tidyr::unnest #36

Closed
petereckley opened this issue Jan 16, 2017 · 2 comments

Comments

@petereckley

Currently the name overlaps with tidyr::unnest(), which unwraps list columns. By contrast, tidytext::unnest_tokens() acts on a character vector column. A name such as tokenise() would avoid this collision of names and semantics, though at the cost of losing the sense that multiple rows are produced for each input text field. tokenise_to_rows or tokens_to_rows addresses this, but doesn't feel very elegant. I'm afraid I don't have the solution. But there must be a name that captures both the tokenising and that sense of producing multiple rows, without overlapping with tidyr::unnest()?

@petereckley petereckley changed the title Rename unnest_tokens Rename unnest_tokens to eliminate semantic overlap/ambiguity with tidyr::unnest Jan 16, 2017
@dgrtwo
Collaborator

dgrtwo commented Jan 17, 2017

The name was chosen precisely to reflect the similarity to tidyr's unnest: it unnests after tokenizing a column into a list column. (The original version actually used tidyr's unnest; the current solution is somewhat faster.)
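
To illustrate the similarity described above, here is a minimal sketch comparing the one-step call with an explicit "tokenize into a list column, then unnest" pipeline. The data frame `docs` and its contents are made up for illustration, and using `tokenizers::tokenize_words()` for the intermediate step is an assumption about how to reproduce the default word tokenization by hand, not a description of the package internals.

```r
library(dplyr)
library(tidyr)
library(tidytext)
library(tokenizers)

# Hypothetical example data (not from this thread)
docs <- tibble(
  id   = 1:2,
  text = c("The quick brown fox", "jumps over the lazy dog")
)

# One step: tokenize the character column and return one row per token
direct <- docs %>%
  unnest_tokens(word, text)

# Two steps: build a list column of tokens, then unwrap it with tidyr::unnest()
two_step <- docs %>%
  mutate(word = tokenize_words(text)) %>%  # list column, one character vector per row
  select(-text) %>%
  unnest(word)

# Compare the token columns produced by the two approaches
identical(direct$word, two_step$word)
```

In both cases each input row expands into several output rows, which is the sense of "unnest" the function name is meant to convey.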

In any case, since it's the most commonly used function in the package, and used by us and others in many published examples, we would have needed an extremely compelling reason to change it.

@dgrtwo dgrtwo closed this as completed Jan 17, 2017
@github-actions

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 26, 2022