API? #34

Closed
paris0120 opened this issue Feb 21, 2017 · 5 comments

Comments

@paris0120

Is there any way to use it as a library in my application? Could you provide some basic example code in the Wiki? I just want to use the algorithms.

Thank you.

@jerrygaoLondon
Collaborator

jerrygaoLondon commented Feb 21, 2017

Sorry for the lack of documentation. JATE2 can be used as a library without much effort.

As mentioned in the Quick Start, you can either 1) download the jar from the Maven repository, or 2) add the following configuration to your Maven project, along with Dragontools.

<dependency>
    <groupId>uk.ac.shef.dcs</groupId>
    <artifactId>jate</artifactId>
    <version>2.0-beta.1</version>
</dependency>

Once you have set up the JATE2 libraries, you can use all the available ATE algorithms in your application or project. Our App* classes show how to use the ATE algorithms and integrate them with Apache Solr. All the available ATE implementations are subclasses of uk.ac.shef.dcs.jate.algorithm.Algorithm in the package uk.ac.shef.dcs.jate.algorithm.*. The current interface should be fairly straightforward to use: you provide a list of candidate terms and the corresponding features, and the method returns ranked terms modelled by uk.ac.shef.dcs.jate.model.JATETerm, with scores and other features/metadata. Since JATE2 relies on Solr for pre-processing and feature extraction, you have to either implement your own method, or use Solr or our embedded Solr implementation (i.e., App*), to parse your corpus and extract candidates and features from it.
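
As a rough illustration of that contract (candidates and features in, ranked terms out), here is a self-contained sketch. It deliberately does not use any JATE classes: `RankingSketch`, `ScoredTerm`, `rankByFrequency` and the frequency map are hypothetical stand-ins for an Algorithm subclass, JATETerm and the Solr-derived features, and raw frequency stands in for a real ATE score. Please check the sources in uk.ac.shef.dcs.jate.algorithm for the actual method signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only; none of these names exist in JATE.
final class RankingSketch {

    // Plays the role of a ranked term: the term string plus its score.
    static final class ScoredTerm {
        final String term;
        final double score;

        ScoredTerm(String term, double score) {
            this.term = term;
            this.score = score;
        }
    }

    // Scores each candidate from one feature (here: raw corpus frequency)
    // and returns the candidates sorted by descending score.
    static List<ScoredTerm> rankByFrequency(List<String> candidates,
                                            Map<String, Integer> frequency) {
        List<ScoredTerm> ranked = new ArrayList<>();
        for (String candidate : candidates) {
            // Candidates missing from the feature map get a score of 0.
            ranked.add(new ScoredTerm(candidate, frequency.getOrDefault(candidate, 0)));
        }
        ranked.sort(Comparator.comparingDouble((ScoredTerm t) -> t.score).reversed());
        return ranked;
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("term extraction", "solr", "corpus");
        Map<String, Integer> frequency = new HashMap<>();
        frequency.put("term extraction", 12);
        frequency.put("solr", 30);

        for (ScoredTerm t : rankByFrequency(candidates, frequency)) {
            System.out.println(t.term + "\t" + t.score);
        }
    }
}
```

In JATE2 itself the features come from the Solr index rather than a hand-built map, which is why the App* classes are the easiest starting point.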

We will add more documentation in the near future.

Thanks for your interest.

@paris0120
Author

I tried

AppCValue.main(("uk.ac.shef.dcs.jate.app.AppCValue -corpusDir " + corpusDir + " -o cvalue-terms.json " + solrDir + "/testdata/solr-testbed ACLRDTEC").split(" "));

but I got:
uk.ac.shef.dcs.jate.JATEException: Cannot find expected field: jate_ngraminfo
at uk.ac.shef.dcs.jate.util.SolrUtil.getTermVector(SolrUtil.java:36)
at uk.ac.shef.dcs.jate.feature.FrequencyTermBasedFBMaster.build(FrequencyTermBasedFBMaster.java:39)
at com.scholarfriend.maven.Epollo.Tools.AppCValue.extract(AppCValue.java:93)
at com.scholarfriend.maven.Epollo.Tools.AppCValue.extract(AppCValue.java:85)
at uk.ac.shef.dcs.jate.app.App.extract(App.java:285)

I have PDF, TXT, and HTML files under the folder.

@paris0120
Author

Logger: com.softcorporation.util.Logger
Mon Feb 27 01:33:46 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:46 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:47 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:47 EST 2017 loading done
Mon Feb 27 01:33:47 EST 2017 loading done
Mon Feb 27 01:33:47 EST 2017 loading done
Mon Feb 27 01:33:47 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:48 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:48 EST 2017 loading exception data for lemmatiser...
Mon Feb 27 01:33:48 EST 2017 loading done
Mon Feb 27 01:33:48 EST 2017 loading done
2017-02-27 01:33:48 ERROR SolrCore:525 - [jateCore] Solr index directory 'A:\eclipse\lib\jate-master\testdata\solr-testbed\jateCore\data\index/' is locked. Throwing exception.
2017-02-27 01:33:48 ERROR CoreContainer:740 - Error creating core [jateCore]: Index locked for write for core 'jateCore'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
org.apache.solr.common.SolrException: Index locked for write for core 'jateCore'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:820)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:659)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:727)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:447)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:438)
at java.util.concurrent.FutureTask.run(Unknown Source)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.store.LockObtainFailedException: Index locked for write for core 'jateCore'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:528)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:761)
... 9 more
Mon Feb 27 01:33:48 EST 2017 loading done
2017-02-27 01:33:48 ERROR SolrCore:525 - [GENIA] Solr index directory 'A:\eclipse\lib\jate-master\testdata\solr-testbed\GENIA\data\index/' is locked. Throwing exception.
2017-02-27 01:33:48 ERROR CoreContainer:740 - Error creating core [GENIA]: Index locked for write for core 'GENIA'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
org.apache.solr.common.SolrException: Index locked for write for core 'GENIA'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:820)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:659)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:727)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:447)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:438)
at java.util.concurrent.FutureTask.run(Unknown Source)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.lucene.store.LockObtainFailedException: Index locked for write for core 'GENIA'. Solr now longer supports forceful unlocking via 'unlockOnStartup'. Please verify locks manually!
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:528)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:761)
... 9 more
2017-02-27 01:33:48 INFO AppCValue:72 - Start CValue term ranking and filtering for whole index ...
uk.ac.shef.dcs.jate.JATEException: Cannot find expected field: jate_ngraminfo
at uk.ac.shef.dcs.jate.util.SolrUtil.getTermVector(SolrUtil.java:36)
at uk.ac.shef.dcs.jate.feature.FrequencyTermBasedFBMaster.build(FrequencyTermBasedFBMaster.java:39)
at uk.ac.shef.dcs.jate.app.AppCValue.extract(AppCValue.java:86)
at uk.ac.shef.dcs.jate.app.AppCValue.extract(AppCValue.java:77)
at uk.ac.shef.dcs.jate.app.App.extract(App.java:285)
at uk.ac.shef.dcs.jate.app.AppCValue.main(AppCValue.java:48)

I removed all the files in the data folder but still got these messages.

@jerrygaoLondon
Collaborator

To run AppCValue programmatically, pass the run-time parameters to the main method as a string array, in the same order as on the command line.

The problem with your call is that you should not pass the class name as a parameter when you invoke AppCValue.main directly.

So try the following:

AppCValue.main(("-corpusDir " + corpusDir + " -o cvalue-terms.json " + solrDir + "/testdata/solr-testbed ACLRDTEC").split(" "));

To make it clearer, you can try the following code:

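import uk.ac.shef.dcs.jate.app.AppCValue;

// Arguments in the same order as on the command line, without the class name.
String[] cvalueArgs = new String[6];
cvalueArgs[0] = "-corpusDir";
cvalueArgs[1] = "<YOUR_CORPUS_DIR>";      // replace with your corpus directory
cvalueArgs[2] = "-o";
cvalueArgs[3] = "<YOUR_JSON_FILE_PATH>";  // replace with the output JSON file path
cvalueArgs[4] = "<YOUR_SOLR_HOME_PATH>";  // replace with your Solr home path
cvalueArgs[5] = "<YOUR_SOLR_CORE_NAME>";  // replace with your Solr core name

AppCValue.main(cvalueArgs);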
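
One caveat with the one-line form: split(" ") cuts any path that contains a space into several arguments, so building the array explicitly is safer. Below is a minimal sketch of a small helper for that. `CValueArgs` and `build` are hypothetical names, not part of JATE, and the argument order is simply the one listed above.

```java
import java.util.Arrays;

// Hypothetical helper (not part of JATE): builds the argument array for
// AppCValue.main in the order described in this thread.
final class CValueArgs {

    private CValueArgs() { }

    static String[] build(String corpusDir, String outputJson,
                          String solrHome, String coreName) {
        // Each value stays a single array element, so paths with spaces survive.
        // There is no class name at the front, only options and positional values.
        return new String[] {
            "-corpusDir", corpusDir,
            "-o", outputJson,
            solrHome,
            coreName
        };
    }

    public static void main(String[] args) {
        // Example values only; substitute your own paths and core name.
        String[] cvalueArgs = build("C:/my corpus", "cvalue-terms.json",
                                    "C:/jate/testdata/solr-testbed", "ACLRDTEC");
        System.out.println(Arrays.toString(cvalueArgs));
        // With JATE on the classpath, the array would then be passed on:
        // AppCValue.main(cvalueArgs);
    }
}
```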

Hope it helps.

@paris0120
Author

Thank you, it works.
