Using pre-trained model example not working #37

Closed
YoannMR opened this issue Jul 16, 2017 · 7 comments

YoannMR commented Jul 16, 2017

Hi,

Thanks a lot for the model and the code! They are very useful.

I'm trying to reuse the CoNLL-2003 pre-trained model as in the example, using the example files in the same folder path (../data/example_unannotated_texts/deploy),

with: dataset_text_folder = ../data/example_unannotated_texts

while the text files to be annotated are in ../data/example_unannotated_texts/deploy.

The output text file is empty, and the loading step reports that it found no tokens.

I tried setting dataset_text_folder = ../data/example_unannotated_texts/deploy instead, but then I get an assertion error saying that the tag is not 'O' (from the remove-BIO function).
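
For reference, here is a sketch of the folder layout this configuration implies, pieced together from the paths above (the .txt filenames are made up for illustration):

```
# parameters.ini (relevant line)
dataset_text_folder = ../data/example_unannotated_texts

# Expected layout on disk: the unannotated .txt files sit in a deploy/ subfolder.
../data/example_unannotated_texts/
└── deploy/
    ├── document_1.txt
    └── document_2.txt
```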

I also get an error message from spaCy the first time I run the code on the data to annotate, asking me to download its 'en' model first (which I have already done several times). If I run the code a second time, it runs, but the spaCy file created during the first run is empty, which I believe is the problem.

Thanks for your help!
Yoann

YoannMR (Author) commented Jul 16, 2017

My bad!

I checked the spaCy 'en' install a second time and there was an error message. Re-running the install with admin rights fixed it, and it now works great.
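
For anyone hitting the same thing: after re-running `python -m spacy download en` from an elevated prompt, a quick way to confirm the install succeeded is to load the model directly. A minimal sketch, nothing NeuroNER-specific:

```python
import spacy

# Raises an exception if the 'en' model data is missing or only partially installed.
nlp = spacy.load('en')
print(nlp(u'London is a big city.'))  # quick smoke test
```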

Thanks again for the tool!

The next step would be to figure out how to run it in a more 'production-like' way (i.e., feed it a pipeline of text files to annotate).

Let me know if that's something you've already worked on.

YoannMR closed this as completed Jul 16, 2017
Franck-Dernoncourt (Owner) commented

> The next step would be to figure out how to run it in a more 'production-like' way (i.e., feed it a pipeline of text files to annotate).

Separating the initialization and the prediction parts was done in the series of commits on Jul 11, 2017. Did you clone the repo before then?
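
For other readers, here is a rough sketch of what that separation enables: initialize once, then predict over many texts. The class and method names below follow the packaged API of later NeuroNER releases (neuromodel.NeuroNER with a predict method) and may not match the Jul 2017 refactor exactly, so treat them as assumptions:

```python
# Sketch only: names are from later NeuroNER releases and may differ here.
from neuroner import neuromodel

# Expensive initialization happens once: load the pre-trained CoNLL-2003 model.
nn = neuromodel.NeuroNER(train_model=False, use_pretrained_model=True,
                         pretrained_model_folder='../trained_models/conll_2003_en')

# Cheap per-text prediction, e.g. over a pipeline of documents.
for text in ['John Smith works at Acme Corp. in Boston.']:
    print(nn.predict(text))
```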

YoannMR (Author) commented Jul 16, 2017

Thanks a lot for your quick answer!

I updated it yesterday (I still need to work through the changes).

Is it documented how to use the separation between the initialization and prediction parts?

Franck-Dernoncourt (Owner) commented

> Is it documented how to use the separation between the initialization and prediction parts?

Good point, sorry about that, we should! It's on the todo list :-)

YoannMR (Author) commented Jul 17, 2017

Thanks again for your answer!

By the way, I'm noticing some training differences between the two versions.

On the older version, training has more noise (larger variations), hence requiring a lower SGD learning rate (e.g., 0.001 vs. 0.005) compared to the newer version.

Do you happen to know where it may be coming from?

I see that gradient clipping is now a parameter. Was it used in the previous version? If so, what was the value? I suspect it could be causing part of the difference, but a quick scan through the code did not reveal it.
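
For context, this is the generic TensorFlow 1.x pattern for value-based gradient clipping, which is what a parameter like this usually controls; a sketch of the technique, not a claim about how NeuroNER implements it:

```python
import tensorflow as tf

# Generic TF 1.x pattern: clip each gradient element-wise into [-5.0, 5.0]
# before applying it, which damps the effect of occasional large gradients.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.005)
grads_and_vars = optimizer.compute_gradients(loss)  # 'loss' defined elsewhere
clipped = [(tf.clip_by_value(g, -5.0, 5.0), v)
           for g, v in grads_and_vars if g is not None]
train_op = optimizer.apply_gradients(clipped)
```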

Franck-Dernoncourt (Owner) commented

> I see that gradient clipping is now a parameter. Was it used in the previous version?

It was (e.g., on May 11: gradient_clipping_value = 5.0).

I'll let you know if I can think of any difference that the new commits may have introduced.

YoannMR (Author) commented Jul 17, 2017

Thanks.

It was not among the parameters pre-defined in the parameters.ini file I was using with the previous version, so I thought it might be different.
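
For anyone updating an old parameters.ini by hand, the value quoted above would look something like this; note the [training] section name is my guess, so check the parameters.ini shipped with the current repo for the exact placement:

```ini
[training]
gradient_clipping_value = 5.0
```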
