Could you provide an example of the conllu format? #7

Closed
hecongqing opened this issue Oct 12, 2018 · 4 comments

Comments

@hecongqing

No description provided.

@Oneplus
Member

Oneplus commented Oct 13, 2018

Please refer to the README for an example:

Then, prepare your input file in the conllu format, for example:
1 Sue Sue _ _ _ _ _ _ _
2 likes like _ _ _ _ _ _ _
3 coffee coffee _ _ _ _ _ _ _
4 and and _ _ _ _ _ _ _
5 Bill Bill _ _ _ _ _ _ _
6 tea tea _ _ _ _ _ _ _
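For anyone starting from tokenized text, here is a minimal sketch of producing a file in the same shape as the README example: ten tab-separated columns per token and a blank line after each sentence, as in CoNLL-U. The function name to_conllu_like is hypothetical, and reusing the word form as a placeholder lemma is an assumption; which columns the tool actually reads is not confirmed here.

```python
def to_conllu_like(sentences):
    """Render tokenized sentences as 10-column, tab-separated lines.

    `sentences` is a list of token lists. Only ID and FORM carry real data;
    LEMMA reuses the form as a placeholder and the rest are "_".
    """
    blocks = []
    for tokens in sentences:
        rows = []
        for i, form in enumerate(tokens, start=1):
            # ID, FORM, LEMMA (placeholder), then seven "_" columns = 10 columns.
            rows.append("\t".join([str(i), form, form] + ["_"] * 7))
        blocks.append("\n".join(rows))
    # Each sentence is terminated by a blank line.
    return "".join(block + "\n\n" for block in blocks)


if __name__ == "__main__":
    text = to_conllu_like([["Sue", "likes", "coffee"], ["Bill", "likes", "tea"]])
    print(text, end="")
```

The result can be written to a file and passed to the tool with --input.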

@Oneplus
Member

Oneplus commented Oct 13, 2018

I will close this issue. If there is a further question, please reopen it.

@Oneplus closed this as completed Oct 13, 2018

@frankier

You might find this helpful: https://github.com/EmilStenstrom/conllu

@jbrry

jbrry commented Dec 5, 2018

Hi Yijia @Oneplus,

Could you give an example of running elmoformanylangs to compute ELMo representations for a conllu file and then loading the vectors in a Python script?

I have tried something similar to #9, but I cannot "get" the vector for a specific sentence. Using the allennlp module, I am able to get a sentence's embedding by its sentence index, like so:

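import os
import h5py
# args is the argparse namespace from the surrounding script; "0" is the sentence index key.
vecs_file = os.path.join(args.elmo_output_dir, 'en_lines-sentences.hdf5')
h5py_file = h5py.File(vecs_file, 'r')
embedding = h5py_file.get("0")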

However, when I use elmoformanylangs to compute the ELMo representations and then try to get the vector, it always returns None. I have even tried providing the full tab-separated sentence as the key, but I can never find the sentence in my program. For example:

embedding = h5py_file.get("He and Ron went down to breakfast to find Mr and Mrs Weasley and Ginny already sitting at the kitchen table .") and embedding = h5py_file.get("0") both return None.

The command I am running is:

python -m elmoformanylangs test --input_format conll --input ~/en_conllu/en_lines-ud-dev.conllu --model ~/ELMo/en.model/ --output_prefix sample --output_format hdf5 --output_layer -1

This command produces rather unusual-looking output. For example,

the output of tail sample.ly-1.hdf5 looks like:
https://imgur.com/a/JWKeV4v

So my question is: does something look wrong with my command or its output, and how do I load a particular sentence embedding when using a conllu file?

Thanks!
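A note for readers with the same problem: h5py's get returns None whenever the requested key is not in the file, so the None results above suggest the datasets are stored under different key names than the ones tried. HDF5 is also a binary format, so tail is not expected to show readable text. One way to narrow this down is to list the keys the file actually contains rather than guessing them. The sketch below assumes only that the output is a regular HDF5 file. The helper names stored_keys and inspect_hdf5 are hypothetical, and the key scheme elmoformanylangs uses is not confirmed here.

```python
def stored_keys(store, limit=5):
    """Return up to `limit` key names from a dict-like store.

    An open h5py.File behaves like a mapping, so it can be passed in directly.
    """
    return list(store.keys())[:limit]


def inspect_hdf5(path, limit=5):
    """Print the first few top-level keys of an HDF5 file and their shapes."""
    import h5py  # third-party; imported here so stored_keys works without it

    with h5py.File(path, "r") as f:
        for key in stored_keys(f, limit):
            # Groups have no shape, so fall back to None for them.
            print(repr(key), getattr(f[key], "shape", None))


# Example call (file name taken from the command above):
# inspect_hdf5("sample.ly-1.hdf5")
```

Printing each key with repr makes tabs and other special characters visible, which helps when the key is derived from the sentence text.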
