
DVector and dependencies #136

Closed
cowchipkid opened this issue Jul 21, 2016 · 29 comments

@cowchipkid
Contributor

The pom.xml for ner specifies a dependency on LBJava 1.2.14, but that version contains DVector, which in fact is now moved to core-utilities. Should that be changed to version 1.2.24? Seems the learners in 1.2.14 will be using the version in that jar rather than the one in core-utilities.
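The fix being proposed is a one-line version bump in the ner pom. A minimal sketch of what that change might look like (the groupId/artifactId coordinates below are illustrative assumptions, not quoted from the repo):

```xml
<!-- Sketch of the proposed bump: move to an LBJava release that no longer
     bundles DVector, so the copy in core-utilities is the one that gets used.
     The coordinates here are assumptions for illustration. -->
<dependency>
  <groupId>edu.illinois.cs.cogcomp</groupId>
  <artifactId>LBJava</artifactId>
  <version>1.2.24</version> <!-- was 1.2.14 -->
</dependency>
```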

@mssammon
Contributor

I assume this will also be true for illinois-pos and illinois-chunker. It makes sense to change the lbjava dependency. Will this mean retraining each component and deploying new models?

@cowchipkid
Contributor Author

I would strongly recommend it; there may be different parameters included over the course of ten revisions, not to mention the serialization issues that might crop up.

@hhuang97 hhuang97 self-assigned this Sep 9, 2016
@b29308188 b29308188 self-assigned this Sep 9, 2016
@qiangning qiangning self-assigned this Sep 9, 2016
@qiangning
Member

Hi @mssammon , I notice that the LBJava versions in pom.xml files are already 1.2.24 (did someone change this from 1.2.14 to 1.2.24?), so we only need to test it without further changes to the pom files, right? Also, since this is my first time doing this, how would I know if the chunker is working properly?

@cogcomp-dev

@qiangning you need to run the benchmark script under chunk/scripts/ and check that the performance is in the ballpark of that reported in the relevant publication. Please record the results in a new page linked from here: https://wiki.illinois.edu/wiki/display/ccg/CCG+Software+Information (and also the results reported in the original publication).

@cogcomp-dev

@b29308188 , please do the same. @hhuang97 , you just need to compare against the existing NER Benchmark table at the link mentioned above.

@cogcomp-dev cogcomp-dev added this to the CCG Borg Bonanza milestone Sep 12, 2016
@qiangning
Member

Hi @mssammon @danyaljj , do you know why, in L56 of ChunkTester.java, testFileURL comes back null even though the test file exists?

@danyaljj
Member

Maybe it's not in the classpath? Where do you put the file?

Side note: http://stackoverflow.com/questions/23821235/how-to-link-to-specific-line-number-on-github

@qiangning
Member

Thanks, Daniel. The test file is here. Does this look correct to you?

@danyaljj
Member

Wait, you pointed to this line first, right?

I don't know how it used to work; as far as my knowledge goes, getClassLoader().getResource(.) can only read from the classpath (not from anywhere on disk). We should double-check this with @nitishgupta though, since he seems to be the author/user of this configurator.

Side note: your link to the line looks great! 😍
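The classpath-only behavior of getResource() described above can be sketched as follows. This is an illustrative helper, not code from the repo (ResourceExample and resolve are made-up names): it tries the classpath first and falls back to the filesystem, which is one way a tester could locate a test file given an absolute disk path.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public class ResourceExample {

    /**
     * Resolve a file: try the classpath first (what getResource() does),
     * then fall back to an ordinary filesystem lookup.
     * Returns null if the file is found in neither place.
     */
    public static URL resolve(String path) {
        // getResource() searches only the classpath; an absolute disk path
        // such as /shared/corpora/... will come back null here.
        URL url = ResourceExample.class.getClassLoader().getResource(path);
        if (url != null) {
            return url;
        }
        // Filesystem fallback: succeeds only if the file actually exists.
        File f = new File(path);
        if (f.exists()) {
            try {
                return f.toURI().toURL();
            } catch (MalformedURLException e) {
                return null;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Not on the classpath and not on disk, so both lookups fail.
        System.out.println(resolve("/shared/corpora/does-not-exist.txt")); // prints "null"
    }
}
```

This matches the symptom in the thread: a correct absolute path still yields null from getResource() alone, because the classloader never consults the filesystem.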

@qiangning
Member

I first mentioned that this line returned null. The error came from a test file which is specified here.

Yes, I agree with you about getResource; that's also why I am now using my own script for testing instead of ChunkerTester.

Also, it seems that the test scripts in chunker/scripts/ are quite outdated. @nitishgupta, do you think it would be better if I updated them, or do you already have your own plan for doing so?

@qiangning
Member

Hi Daniel @danyaljj , are we going to provide the test file along with our package/jar? I'm not sure that's allowed; if not, I guess the BenchmarkTest.sh script makes less sense for general users, since what a general user wants in the first place is to hit the button and see the results.

@danyaljj
Member

As far as I know, upon packaging, all the files get bundled into a single jar file to be shipped. @mssammon can confirm this.

@mssammon
Contributor

No corpus gets packaged. Benchmark tests generally use licensed data; documentation should indicate the variable/argument that needs to be changed and which corpus is required.

@hhuang97

@cogcomp-dev I've added a row for NER v3.0.72 with benchmark results. The results are in the ballpark of the previously reported ones.

@qiangning
Member

@cogcomp-dev The chunker has also passed; the results are on the test wiki page.

@mssammon
Contributor

@mssammon check and close

@b29308188
Contributor

b29308188 commented Sep 22, 2016

@qiangning I also encountered the null pointer problem at this line.
How did you solve it? It seems that the variable testFileName already stores the right absolute path for the test file ("/shared/corpora/corporaWeb/written/eng/chunking/conll2000distributions/test.txt").

@qiangning
Member

Hi Liang-Wei @b29308188 , I didn't fix that problem in ChunkTester. Instead, I created my own tester to do the testing. Fixing ChunkTester deserves a new issue; I will probably discuss it with Mark at the next software meeting.

Are you also testing chunker?

@mssammon
Contributor

@b29308188 did you retrain and evaluate POS with the updated LBJava dependency yet?

@b29308188
Contributor

@mssammon I retrained and evaluated it with LBJava 1.2.24.
The results are there.

@mssammon
Contributor

@b29308188 that link takes me to the Chunker results; you were assigned the POS tagger. Is there a separate table with results for POS?

@b29308188
Contributor

@mssammon I thought your "please do the same" meant to also test the chunker; there was no clear description indicating otherwise. The description in the wiki says "retrain and check performance of POS***". I thought that meant "check the performance on the POS dataset with the chunker"... I will test the POS tagger today or tomorrow.

@mssammon
Contributor

@b29308188 sorry for the confusion. Thanks for the follow-up. Please add comments to the page with the results to clarify what you did; this information may be useful in its own right.

@b29308188
Contributor

b29308188 commented Sep 28, 2016

@mssammon
The evaluation result of the POS tagger is here.
Sorry for the delay.

@qiangning
Member

Hi Mark @mssammon , just curious. Why is the chunker performance table from Liang-Wei different from mine? Which part of the training process gives rise to this randomness?

@mssammon
Contributor

@qiangning @b29308188 what exact command/script did you use to train/evaluate? Some variation is expected if LBJava internally shuffles the training data, but this seems like a significant difference. (Qiang, thanks for pointing this out....)

@qiangning
Member

Mark, I used ChunkerTrain and ChunkTester in my fork. I only modified a bit of the script to fix the path issue (i.e., null pointer to files).

Thanks for clarifying the randomness of LBJ. I thought Liang-Wei's results were in the ballpark of mine (93.862 vs 93.451). Are you saying that this difference is too much?

@b29308188
Contributor

@mssammon I modified the tester from @qiangning by pointing the training and testing data to /shared/corpora/corporaWeb/written/eng/chunking/conll2000distributions/ and storing the models into my folder.

@mssammon
Contributor

Opened issue #222 to deal with the problems identified here. Closing this issue, as the original task is complete.


7 participants