This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

when object 'char_embed' doesn't exist #1521

Closed
ehsanasgari opened this issue Jul 24, 2018 · 5 comments

@ehsanasgari

Is your feature request related to a problem? Please describe.
Hi there, thank you very much for the great work.
I am trying to use the code for character-level prediction, where I don't have char_embedding in my ELMo config. In this case I get the following error:

...
    self._load_char_embedding()
  File "/mounts/Users/student/me/.conda/envs/elmobilstm/lib/python3.6/site-packages/allennlp/modules/elmo.py", line 379, in _load_char_embedding
    char_embed_weights = fin['char_embed'][...]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/mounts/Users/student/me/.conda/envs/elmobilstm/lib/python3.6/site-packages/h5py/_hl/group.py", line 177, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: "Unable to open object (object 'char_embed' doesn't exist)"
@DeNeutoy
Contributor

All elmo weight files provided should have the char_embed key. Please describe in more detail what you have done/are trying to do, and provide a full stack trace.

@ehsanasgari
Author

Thank you for your response. For my project I have removed the whitespace between words and would like to obtain contextualized character embeddings for the sentence. I work only on characters and have no words, so I removed the 'char_cnn' key from the options when training ELMo. The training was successful, and I would now like to get the contextualized embedding for each character from the hdf5 file I have. I think I am getting this error because I don't have 'char_cnn'. Is there a straightforward solution to this?

This is the content of my 'options':

{"dropout": 0.1, "n_epochs": 3, "sample_softmax": false, "bidirectional": true, "n_tokens_vocab": 28, "all_clip_norm_val": 10.0, "unroll_steps": 20, "n_train_tokens": 199714119, "batch_size": 128, "lstm": {"n_layers": 2, "use_skip_connections": true, "projection_dim": 256, "cell_clip": 3, "dim": 1024, "proj_clip": 3}}
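As a quick sanity check on an options file like the one above, the following minimal sketch parses the JSON and reports whether a 'char_cnn' section is present. The function name `describe_elmo_options` is hypothetical and is not part of allennlp or bilm-tf; the key names are the ones that appear in the options shown above.

```python
import json

# The options string from this issue, copied verbatim.
OPTIONS_JSON = (
    '{"dropout": 0.1, "n_epochs": 3, "sample_softmax": false, '
    '"bidirectional": true, "n_tokens_vocab": 28, "all_clip_norm_val": 10.0, '
    '"unroll_steps": 20, "n_train_tokens": 199714119, "batch_size": 128, '
    '"lstm": {"n_layers": 2, "use_skip_connections": true, '
    '"projection_dim": 256, "cell_clip": 3, "dim": 1024, "proj_clip": 3}}'
)


def describe_elmo_options(options_json: str) -> dict:
    """Summarize the parts of an options file relevant to this issue.

    Hypothetical helper for illustration only.
    """
    options = json.loads(options_json)
    return {
        # True only if the options contain a top-level 'char_cnn' section.
        "has_char_cnn": "char_cnn" in options,
        "n_tokens_vocab": options.get("n_tokens_vocab"),
        "lstm_projection_dim": options.get("lstm", {}).get("projection_dim"),
    }


print(describe_elmo_options(OPTIONS_JSON))
```

For the options above, `has_char_cnn` comes out false, which is consistent with the weight file lacking the 'char_embed' object that the loader in the traceback tries to read.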

Also, may I ask what the straightforward way is to get the embedding of a sub-sentence using the trained model in hdf5 format?

Thank you

@DeNeutoy
Contributor

If you are trying to load a custom elmo model into allennlp, you will need to modify this file:
https://github.com/allenai/allennlp/blob/master/allennlp/modules/elmo.py#L372

Above is the section where the weights are loaded. Perhaps you could modify that to match your custom model?
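The traceback earlier in this thread shows the loader reading `fin['char_embed'][...]` from an h5py file and failing with a bare KeyError. One possible starting point for a modification is a guarded lookup that reports which top-level keys the file does contain. This is only a sketch, not a patch to allennlp: `get_required` is a hypothetical helper, and the stand-in dictionary and its key are made up for the demonstration. It assumes only that the object supports `in`, `keys()`, and `[]`, which h5py groups do.

```python
def get_required(weights, name):
    """Return weights[name], or raise a KeyError that lists what is available.

    `weights` can be any mapping-like object, for example an opened
    h5py.File or a plain dict. Hypothetical helper for illustration only.
    """
    if name not in weights:
        available = ", ".join(sorted(weights.keys()))
        raise KeyError(
            f"'{name}' not found; available top-level keys: {available}"
        )
    return weights[name]


# Stand-in for a weight file without 'char_embed'. The key name below is
# hypothetical and does not reflect the real layout of an ELMo weight file.
fake_weights = {"hypothetical_group": [0.0, 1.0]}

try:
    get_required(fake_weights, "char_embed")
except KeyError as err:
    print(err)
```

With a real file, the same function could be called on the object returned by `h5py.File(path, "r")`, where `path` points at the custom weight file. Knowing which keys exist makes it easier to decide what the loading section would need to skip or replace for a model trained without 'char_cnn'.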

It's possible you might find it easier to modify the original tensorflow code to dump embeddings using your model: https://github.com/allenai/bilm-tf/blob/master/bilm/training.py#L1060

Hopefully that's helpful?

@ehsanasgari
Author

Thank you so much for the explanation. I tried both, but with no success so far. My only modification to the model was removing the char_cnn params, and I hoped that the "programmatically ELMo" code would still run at the character level. Please let me know if you decide to add this functionality.

@DeNeutoy
Contributor

Hi, sorry, we won't be adding this functionality.
