Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Training Fix #11

Merged
merged 16 commits into from Dec 20, 2018
Merged

Training Fix #11

merged 16 commits into from Dec 20, 2018

Conversation

kylase
Copy link

@kylase kylase commented Nov 12, 2018

@cmkumar87
Copy link

cmkumar87 commented Nov 12, 2018 via email

@kylase
Copy link
Author

kylase commented Nov 12, 2018

@cmkumar87, not in that aspect. The upstream code has tags as a key for each piece of data, however somehow it has been missing.

@cmkumar87
Copy link

cmkumar87 commented Nov 12, 2018

@kylase I saw that the change adds tag to id to the load dataset module. This change seems sensitive to me; perhaps @animeshprasad can weigh in. Was this a bug or just an enhancement to make training easier in some way?

  • Did yo try retraining with the new code on the cora dataset atleast? Do you get the same published results?

@kylase
Copy link
Author

kylase commented Nov 13, 2018

It is a critical bug. If you compare that specific file and line to the original code, you will find that the labels (tags) are not provided to the training.

I don’t know how it managed to run previously, but the git log shows it is non-existent before I take over it.

@cmkumar87
Copy link

cmkumar87 commented Nov 13, 2018

@kylase that is weird. The file from the first modifications of the Named Entity Tagger contains the tagging scheme. See this commit: 590de7c

But I dont' see it in the WING_NUS/Neural-ParsCit. So @animeshprasad may have changed it for some reason. Perhaps the functionality was moved to a different function? Can you trace the commits to check for the modifications to this file?

This is some of the history I see,
Original NER code: https://github.com/glample/tagger/blob/master/loader.py
tags are there.
In this commit where a completely new version of the file is being uploaded it is not there:
e659829

@kylase
Copy link
Author

kylase commented Nov 13, 2018

@cmkumar87, refer to this blame and look for prepare_dataset. This is the commit that removes it.

Yes, I looked at the original code and then realised that it's missing and hence I put it back. The training has been failing because it becomes an unsupervised dataset.

@cmkumar87
Copy link

cmkumar87 commented Nov 13, 2018 via email

@kylase
Copy link
Author

kylase commented Nov 14, 2018

I have no idea that happened between the code that was run for the paper and the commit. I have been looking at the training code and it seems to have been commented out components other than the training with the training dataset.

Now I am working on restoring the testing portion then follow by the cross-validation.

@cmkumar87
Copy link

@kylase Normally the training code and testing code is factored and are executed conditionally based on a command line args passed.
Are you saying that this is not how n-parscit is written? Did the developer remove / comment out the training code to allow the parser to operate under the test mode?

@knmnyn
Copy link
Member

knmnyn commented Dec 15, 2018

Has this issue been resolved? Seems from the comments that it is still unresolved.

@cmkumar87
Copy link

cmkumar87 commented Dec 15, 2018

Update on this:

  • This codebase was forked from a version that is different from the one that was used to run the experiments for Prasad, Kaur, Kan, 2018, IJDL.
  • The correct version in question was identified to be on a local server. @kylase is trying to run this; he has been facing out-of-memory errors on the wing internal gpu server possibly due to 'some' theano misconfiguration. Will let him say more.

Co-Authored-By: nsorros <nsorros@gmail.com>
@kylase kylase added the bug Something isn't working label Dec 20, 2018
@kylase kylase self-assigned this Dec 20, 2018
@kylase kylase merged commit 188721d into master Dec 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants