Is SMC function able to update posterior for new users? #306

Closed
osorensen opened this issue Sep 1, 2023 · 4 comments · Fixed by #347
Labels
enhancement New feature or request question Further information is requested

Comments

@osorensen
Collaborator

At the moment, smc_mallows_new_users does not allow the type of use case that sequential Monte Carlo is designed for. The typical use case is as follows:

  1. You have a dataset (which may be empty), and corresponding posterior distributions for the parameters. If the dataset is empty, the posterior is identical to the prior.
  2. At time t=1, some new data arrives and you want to update the posterior distribution, but without having to rerun full MCMC on all the data from previous timesteps.
  3. At time t=2, even more new data arrives and you want another update of the posterior distribution, starting from the posterior at t=1, but again without having to rerun full MCMC on all the data from previous timesteps.
    etc.
  4. At the final time t=T, new data arrives and you want the posterior for the full data you have collected at all timepoints, but still without having to rerun full MCMC on all the data from previous timesteps.
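
To make the steps above concrete, here is a minimal sketch in Python of the reweighting idea on a toy Bernoulli model. It is not the Mallows model and not the BayesMallows API: the names `log_lik`, `reweight`, and `normalize` are hypothetical, and a real SMC algorithm would also resample and apply MCMC move steps. The point is only that updating with each new batch, without revisiting old batches, gives the same weights as one pass over all the data.

```python
import math
import random

def log_lik(theta, batch):
    """Bernoulli log-likelihood of a batch of 0/1 outcomes (toy model, an assumption)."""
    k, n = sum(batch), len(batch)
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)

def reweight(log_weights, particles, batch):
    """Add the log-likelihood of the new batch only; earlier batches are not revisited."""
    return [lw + log_lik(p, batch) for lw, p in zip(log_weights, particles)]

def normalize(log_weights):
    """Convert log-weights to normalized weights, subtracting the max for stability."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return [x / total for x in w]

rng = random.Random(1)
# Particles drawn from a (nearly) uniform prior on theta, kept away from 0 and 1.
particles = [rng.uniform(0.01, 0.99) for _ in range(2000)]

# Data arriving at t = 1, 2, 3.
batches = [[1, 0, 1, 1], [1, 1, 0], [0, 1, 1, 1, 1]]

# Sequential updates: start from the prior (equal weights) and fold in one batch at a time.
log_w = [0.0] * len(particles)
for batch in batches:
    log_w = reweight(log_w, particles, batch)
sequential_weights = normalize(log_w)

# Reference: a single update using all the data at once.
all_data = [y for batch in batches for y in batch]
batch_weights = normalize(reweight([0.0] * len(particles), particles, all_data))

max_diff = max(abs(a - b) for a, b in zip(sequential_weights, batch_weights))
posterior_mean = sum(p * w for p, w in zip(particles, sequential_weights))
```

Here `max_diff` is zero up to floating-point error, and `posterior_mean` approximates the mean of the exact Beta posterior for this toy model.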

To the best of my understanding, smc_mallows_new_users and smc_mallows_new_item_rank currently require that the full data which you would have at time T is provided in the first round, and then it splits the data internally. It would be really great if we could instead do the following:

  1. Let the user run a model with the data they have. Then let them save the model.
  2. A month later, when a new batch of data arrives, give the previously saved model as an argument to our function, together with the new data, and then update the model parameters.
  3. Then keep doing this over and over.
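
As a rough illustration of this save, reload, and update workflow, here is a Python sketch on a toy Bernoulli model. Everything in it is an assumption made for illustration: `update`, `save_state`, `load_state`, and the JSON state layout are hypothetical and do not correspond to any existing function in the package. A real implementation would also need an MCMC move step after resampling, which is omitted here.

```python
import json
import math
import os
import random
import tempfile

def log_lik(theta, batch):
    """Bernoulli log-likelihood of a batch of 0/1 outcomes (toy model, an assumption)."""
    k, n = sum(batch), len(batch)
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)

def update(state, batch, rng):
    """Reweight equally weighted particles with the new batch, then resample."""
    log_w = [log_lik(p, batch) for p in state["particles"]]
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    # Multinomial resampling returns an equally weighted particle set.
    resampled = rng.choices(state["particles"], weights=w, k=len(w))
    return {"particles": resampled, "n_obs": state["n_obs"] + len(batch)}

def save_state(state, path):
    """Persist the particle approximation so a later session can continue from it."""
    with open(path, "w") as f:
        json.dump(state, f)

def load_state(path):
    with open(path) as f:
        return json.load(f)

rng = random.Random(42)
path = os.path.join(tempfile.mkdtemp(), "posterior.json")

# Empty dataset: the "posterior" is just draws from the prior.
state = {"particles": [rng.uniform(0.01, 0.99) for _ in range(5000)], "n_obs": 0}

# First month: update with the data available, then save.
state = update(state, [1, 0, 1, 1], rng)
save_state(state, path)

# A month later: reload the saved state and update with the new batch only.
state = load_state(path)
state = update(state, [1, 1, 0, 1], rng)
save_state(state, path)

posterior_mean = sum(state["particles"]) / len(state["particles"])
```

The design choice this highlights is that the saved object only needs to carry the current particle approximation, which then plays the role of the prior for the next batch.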

If we do this properly @wleoncio, I'm pretty sure the SMC-Mallows extension to the package will be sufficient for submitting a paper to, e.g., JOSS or R Journal.

@osorensen osorensen added enhancement New feature or request question Further information is requested labels Sep 1, 2023
@wleoncio
Member

wleoncio commented Sep 1, 2023

Sounds like a big change to implement, but I'd be happy to help. One thing I don't quite understand is what would happen if someone simply used smc_mallows_new_users() with the data they have as the months go by. The algorithm would just assume T = 1 the first month, T = 2 on the second run a month later, and so on, which I suppose could be a problem? Does T need to stay fixed for the model to give correct results?

Sorry for the noob questions 😆

@osorensen
Collaborator Author

That's a very good question! What you suggest would work very well I think, except that the whole model estimation would have to be redone.

In realistic situations, these functions can take incredibly long to run. Maybe what we need is to expose an "update function" that takes the output of a previous smc_mallows_new_users run together with the new data, and then updates the posterior. Alternatively, this could just be another argument to smc_mallows_new_users, which takes the previous output as the prior distribution.

There is a book project on rank modeling going on, and with regard to that, my goal is that all or most of the methods described in the book will also be implemented in the package, so I expect to spend some time on this.

@wleoncio
Member

wleoncio commented Sep 4, 2023

Exciting! I'll keep working on the open issues so we can get a cleaner house before accommodating the new features. :)

@osorensen
Collaborator Author

osorensen commented Oct 20, 2023

@osorensen osorensen linked a pull request Jan 4, 2024 that will close this issue
2 participants