decode on phrases branch overwrites s_attributes with last s_attribute decoded #120

Closed
ChristophLeonhardt opened this issue Jan 24, 2020 · 1 comment

Comments

@ChristophLeonhardt
Contributor

ChristophLeonhardt commented Jan 24, 2020

Problem

This is a bit tricky to describe, but the decode method no longer works for slice objects: every structural attribute is overwritten with the value of the last s-attribute that was decoded in a previous step. The culprit seems to be line 248, in the definition of the unfold function, which reads:

value <- .SD[[s_attr]]

after which the same "value" is reused for every s-attribute in:

for (s_attr in s_attributes) dt[, (s_attr) := value]

Workaround / Solution?

If I replace both lines with the line that was there before the commit, everything works as expected:

for (s_attr in s_attributes) dt[[s_attr]] <- rep(.SD[[s_attr]], times = nrow(dt))

I am not sure why the change was made though.
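
To make the difference between the two patterns concrete, here is a minimal sketch outside of polmineR. The list `region` is a hypothetical stand-in for `.SD`, and the attribute names and values are invented for illustration; only the two assignment patterns are taken from the lines quoted above. It assumes that `s_attr` still holds a name from an earlier step when `value` is computed.

```r
library(data.table)

# Hypothetical one-region record standing in for .SD (names and values invented).
region <- list(cpos_left = 0L, cpos_right = 2L, party = "Party X", speaker = "Speaker A")
s_attributes <- c("party", "speaker")

# Assumption: s_attr is left over from a previous decoding step.
s_attr <- "speaker"

# Pattern from the report: value is looked up once and then reused in the loop.
dt_bug <- data.table(cpos = region$cpos_left:region$cpos_right)
value <- region[[s_attr]]
for (s_attr in s_attributes) dt_bug[, (s_attr) := value]

# Pre-commit pattern: the value is looked up per s-attribute inside the loop.
dt_ok <- data.table(cpos = region$cpos_left:region$cpos_right)
for (s_attr in s_attributes) dt_ok[[s_attr]] <- rep(region[[s_attr]], times = nrow(dt_ok))

print(dt_bug)  # both attribute columns hold the speaker value
print(dt_ok)   # each attribute column holds its own value
```

Moving the lookup of `value` inside the loop would presumably have the same effect as the pre-commit line while keeping the `:=` assignment.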

Reproducible example

With polmineR 0.8.0.9003 (before the fix), the following example illustrates the problem. The resulting data.table is filled with the speaker name:

gp_subset <- polmineR::subset("GERMAPARLMINI", date == "2009-10-28")
gp_subset_decode <- decode(gp_subset)

Edit: Also on the topic of modifying the decode method, in some instances it might be disadvantageous to remove the struc column from the token stream. Does that change serve a particular purpose?

@PolMine
Collaborator

PolMine commented Feb 26, 2020

My apologies for the unwanted side effects of my refactoring exercise for the decode() method, and thank you for your detailed feedback. The issue is now fixed in the development version, and I have reverted to keeping the struc column.

@PolMine PolMine closed this as completed Feb 27, 2020