Wordfish results change from run to run #1216

Closed
chriswratil opened this issue Feb 7, 2018 · 2 comments

chriswratil commented Feb 7, 2018

Hi,

the textmodel_wordfish() function does not give exactly the same results from one run to the next when the DFM is sparse. Is this because an approximation to the SVD is used for sparse DFMs? In any case, it would be great to have an option for deterministic results when the DFM is sparse. This would be important for replication purposes.

To see the problem, just run the following a few times and compare the last digits of the theta estimates.

library(quanteda)
summary(textmodel_wordfish(dfm(data_corpus_inaugural)))

Thanks so much for your support and best wishes,

Chris

Collaborator

koheiw commented Feb 7, 2018

Thank you for the post. It is related to the random initial values used in the sparse implementation:

std::mt19937 mt(time(0)); // issue #1063

We will try to fix it.
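
For readers following along, here is a minimal sketch of why a generator seeded from the clock, as in the line quoted above, gives different starting values on each run, and how a fixed seed makes them repeatable. The helper names `initial_values` and `clock_seed` are hypothetical and are not taken from the quanteda source; this only illustrates the seeding behaviour of `std::mt19937`, not the actual Wordfish code or its eventual fix.

```cpp
#include <cstddef>
#include <ctime>
#include <random>
#include <vector>

// Hypothetical helper (not from quanteda): draw n starting values in [0, 1]
// from a Mersenne Twister constructed with the given seed.
std::vector<double> initial_values(std::size_t n, unsigned int seed) {
    std::mt19937 mt(seed);
    std::vector<double> out(n);
    for (auto &v : out) {
        // Scale the raw 32-bit draw by the generator's maximum value.
        v = static_cast<double>(mt()) / static_cast<double>(std::mt19937::max());
    }
    return out;
}

// Hypothetical helper: a clock-based seed, analogous to time(0) in the
// quoted line. It changes once per second, so runs started at different
// times begin from different values.
unsigned int clock_seed() {
    return static_cast<unsigned int>(std::time(nullptr));
}
```

Calling `initial_values(5, 42)` twice returns identical vectors, whereas `initial_values(5, clock_seed())` will generally differ between runs started in different seconds. Because the estimation starts from those values, small differences in the last digits of the estimates are the expected symptom.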

kbenoit added a commit that referenced this issue Feb 11, 2018
Collaborator

kbenoit commented Feb 12, 2018

I've changed the default to sparse = FALSE. This doesn't solve the fundamental issue, but it will stop the results from differing across repeated calls to the function.

kbenoit closed this as completed Feb 12, 2018