Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Model loading is incompatible with newer RcppAnnoy #31

Closed
jlmelville opened this issue Sep 22, 2019 · 14 comments
Closed

Model loading is incompatible with newer RcppAnnoy #31

jlmelville opened this issue Sep 22, 2019 · 14 comments
Labels
bug Something isn't working

Comments

@jlmelville
Copy link
Owner

jlmelville commented Sep 22, 2019

Dirk Eddelbuettel, RcppAnnoy author, has reached out to inform me of that a new version of RcppAnnoy is coming, backed with an updated version of Annoy.

Unfortunately, uwot's ability to load previously saved models is broken by these changes. It used to be possible to specify an arbitrary dimensionality when creating an index, and then loading a serialized Annoy model would overwrite that dimensionality to whatever the serialized index was supposed to have, e.g.:

ann <- methods::new(RcppAnnoy::AnnoyEuclidean, 1)
ann$load(index_path)

Annoy author Erik Bernhardsson further pointed out that this should never have worked, which adds a little extra urgency to making a fix.

Fortunately, the information needed is readily to hand at load time, so this isn't a difficult (or backwards-compatibility breaking) fix.

@jlmelville
Copy link
Owner Author

0.1.4 is now on CRAN with this fix. Thanks to Dirk and Erik for their assistance on this one.

@OuNao
Copy link

OuNao commented Feb 13, 2020

Hi!

Maybe this issue is not completly solved.

Using R 3.6.0 and uwot 0.1.5 I can save a model but get a error on load the same model:
Error: index size 76853092 is not a multiple of vector size 16

If I try the uwot version 0.1.3 I can load the model with no problems (the model created with 0.1.5 uwot version)

What I´m doing wrong?

Thanks,

EDIT:
Module saved with: R 3.6.0, uwot 0.1.5, RcppAnnoy 0.0.14
Load try 1: R 3.6.0, uwot 0.1.5, RcppAnnoy 0.0.14 -> ERROR
Load try 2: R 3.5.3, uwot 0.1.3, RcppAnnoy 0.0.11 -> SUCESS!

@jlmelville
Copy link
Owner Author

Can you provide the commands you ran?

@jlmelville jlmelville reopened this Feb 14, 2020
@OuNao
Copy link

OuNao commented Feb 15, 2020

Hi,

Create the model:
model<-uwot::umap(X = ff, scale = TRUE, n_threads=4, init = "spectral", metric="euclidean", ret_model = TRUE)
Save the model:
uwot::save_uwot(model, file = "mymodel")
Load the model:
model2<-uwot::load_uwot(file="mymodel")

Thanks,

@jlmelville
Copy link
Owner Author

Does it make a difference if you do:

# Create the model:
model <- uwot::umap(X = ff, scale = TRUE, n_threads = 4, init = "spectral", 
                    metric = "euclidean", ret_model = TRUE)
model_file <- tempfile("mymodel")
# Save the model:
uwot::save_uwot(model, file = model_file)
# Load the model:
model2 <- uwot::load_uwot(file = model_file)

I have discovered there are lots of problems with save_uwot and load_uwot, although they are not connected with RcppAnnoy directly, so this might be a different bug.

@OuNao
Copy link

OuNao commented Feb 15, 2020

Hi,
The actual code already use tempfile. I simplified the above code.
Nope, there's no difference with or without intermediary tempfile.
Thanks,

@OuNao
Copy link

OuNao commented Feb 15, 2020

So,

Test load uwot in R 3.6.0 + RcppAnnoy 0.0.14 + uwot 0.1.13 -> error!
Test load uwot in R 3.6.0 + RcppAnnoy 0.0.11 + uwot 0.1.13 -> error!

I don't know why R 3.5.3 + RcppAnnoy 0.0.11 + uwot 0.1.13 works...
The problem is R 3.6.0?

@jlmelville
Copy link
Owner Author

Can you try with the latest R 3.6.x release, which is currently R 3.6.2?

@OuNao
Copy link

OuNao commented Feb 15, 2020

I will try R-3.6.2. Before this, some debuging:
I can isolate the problem to ann$load(nn_fname):
Error: index size 76853092 is not a multiple of vector size 16
ann is defined before this in ann <- create_ann(metric, ndim = length(model$metric[[i]])), but model$metric is NULL so length(model$metric[[i]]) is 0 (zero). That's ok?

@jlmelville
Copy link
Owner Author

No that's probably not ok. This is a bug, sorry. Unfortunately this bug isn't something I have been able to reproduce so I missed it (weirdly the lack of a correct dimensionality still isn't biting me).

model$metric[[i]] should be set to the number of features used in the NN index. Because you have scale = TRUE, this should be fixable by doing:

model$metric[[1]] <- length(model$scale_info$`scaled:nzvcols`)

Hopefully that will work for your current case for now.

@OuNao
Copy link

OuNao commented Feb 15, 2020

Hi,

Yes this solve the problem! But ann must be created with ann <- create_ann(metric, ndim = model$metric[[i]]) and not ann <- create_ann(metric, ndim = length(model$metric[[i]])).

Any ETA for this changes appear on CRAN?

I can use model$metric[[1]] <- length(model$scale_info$`scaled:nzvcols`) before save the model but load_uwot still throws error without the above change.

Thanks.

@jlmelville
Copy link
Owner Author

Yes this solve the problem! But ann must be created with ann <- create_ann(metric, ndim = model$metric[[i]]) and not ann <- create_ann(metric, ndim = length(model$metric[[i]])).

Yes, you are again correct.

Any ETA for this changes appear on CRAN?

I don't know. There are several more problems with save_uwot and load_uwot that need fixing, plus some other outstanding issues. Updates to packages on CRAN are not supposed to happen more than once every 1-2 months, so this requires submitting fixes in batches.

@jlmelville jlmelville added the bug Something isn't working label Feb 15, 2020
@OuNao
Copy link

OuNao commented Feb 16, 2020

No problem.

I can still use R 3.5.3.
I will wait for the changes appear on Github and move from cran. I could Aldo use the manual solition from issue #19.

So. Many alternatives.

Thanks a lot for this wonderfull package.

Regards,

@jlmelville
Copy link
Owner Author

Version 0.1.8 has made it to CRAN and this should be fixed now. Please re-open if problems persist.

yuhanH pushed a commit to yuhanH/uwot that referenced this issue Jul 20, 2020
yuhanH pushed a commit to yuhanH/uwot that referenced this issue Jul 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants