Is there a preferred method of saving and loading h2o word2vec models in python? #144

geoffkip · 2020-09-16T23:29:05Z

I have trained a word2vec model with the h2o Python package. Is there a simple way to save that model and load it back later?

I have tried the h2o.save_model() and h2o.load_model() functions without success. That approach fails with an error like:


water.exceptions.H2OIllegalArgumentException
[1] "water.exceptions.H2OIllegalArgumentException: Illegal argument: dir of function: importModel:
I am using the same version of h2o to train the model and to load it back in, so the issue described in the question "Can't import binay h2o model with h2o.loadModel() function: 412 Precondition Failed" does not apply.

Does anyone have any insight into how to save and load an h2o word2vec model?

I realize that saving the word vector embeddings, so they can be reused later as a pre-trained model, matters more than saving the model itself.

Is something like this best practice?

import h2o
from h2o.estimators import H2OWord2vecEstimator

# Start (or connect to) the h2o cluster before touching any frames.
# h2o_ip, h2o_port and h2o_min_memory are defined elsewhere.
print('Initializing h2o.')
h2o.init(ip=h2o_ip, port=h2o_port, min_mem_size=h2o_min_memory)

# df is an H2OFrame with a 'text' column, loaded elsewhere
df['text'] = df['text'].ascharacter()

# Break text into a sequence of words (tokenize is a helper defined elsewhere)
words = tokenize(df['text'])

# Build word2vec model
w2v_model = H2OWord2vecEstimator(sent_sample_rate=0.0, epochs=10)
w2v_model.train(training_frame=words)

# Create word vector embedding h2o frame
w2v_frame = w2v_model.to_frame()

# Export word embeddings to file for later use
h2o.export_file(w2v_frame, '/mnt/results/words_embeddings.csv', force=True)

# Import word embeddings later for the pre-trained model
w2v_frame = h2o.import_file('/mnt/results/words_embeddings.csv')

# Define pre-trained word2vec model
w2v_model2 = H2OWord2vecEstimator(pre_trained=w2v_frame, vec_size=100)

# Train on words
w2v_model2.train(training_frame=words)
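
One thing that may be worth checking before handing the exported file back to pre_trained is that its layout matches the vec_size passed to the estimator. The sketch below uses only the Python standard library and assumes the exported CSV has a header row, the word in the first column, and one numeric vector component in each remaining column. That layout is an assumption about what to_frame() and h2o.export_file() produce, and inspect_embeddings_csv is a hypothetical helper name, not part of h2o.

```python
import csv


def inspect_embeddings_csv(path):
    """Return (vocab_size, vec_size) for an exported embeddings CSV.

    Assumed layout (not confirmed by the h2o docs here): a header row,
    the word in the first column, one numeric component per remaining column.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            raise ValueError("embeddings file is empty")
        vec_size = len(header) - 1  # every column except the word column
        vocab_size = 0
        for line_no, row in enumerate(reader, start=2):
            if len(row) != len(header):
                raise ValueError(
                    f"line {line_no}: expected {len(header)} fields, got {len(row)}"
                )
            for value in row[1:]:
                float(value)  # raises ValueError on a non-numeric component
            vocab_size += 1
    return vocab_size, vec_size
```

The returned vec_size could then be passed to H2OWord2vecEstimator in place of the hard-coded 100, so the estimator and the file cannot disagree about the embedding dimension.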


geoffkip closed this as completed Sep 24, 2020
