This repository has been archived by the owner on Dec 12, 2018. It is now read-only.

Switch between NER models #12

Closed
minusplusminus opened this issue Nov 16, 2013 · 8 comments
@minusplusminus

Hi,

First, I want to thank you for this wonderful wrapper. It saves me a lot of time; I've rewritten my C++ code to use it.

I see that all the models are loaded into the wrapper. How can I select one of them? Another question is about the output: can I get XML output from the wrapper? And can I switch between parser types?

With kind regards,

i_kill_bombs

PS.

Your wrapper works fine in C++.

@dmnapolitano
Owner

Hi @ikillbombs ! I'm glad you find this wrapper/server so helpful. 😄

You want the option to load in any number of NER models, right?

Also, what kind of XML output would you find helpful? Do you want XML for the NER's output or for all of them? Are you looking for something like the XML output produced by http://nlp.stanford.edu:8080/corenlp ?

You can switch between parser types pretty easily if you're using the start_server.sh script. I believe the second argument to it is the full path to the parser model you'd like to use.

Also, would you be interested in sharing your C++ client? 😄

Thanks,
Diane

@minusplusminus
Author

If I look at the code, I see that it's only for the lex parser: https://github.com/EducationalTestingService/stanford-thrift/blob/master/src/StanfordCoreNLPHandler.java#L36

Yes, the Stanford parser has several output options. According to the lexparser.sh file in the Stanford Parser package, they are: penn, oneline, rootSymbolOnly, words, wordsAndTags, dependencies, typedDependencies, typedDependenciesCollapsed, latexTree, xmlTree, collocations, semanticGraph, conllStyleDependencies, conll2007. Right now the wrapper only returns a tree.
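To make that list usable from a client, here is a rough sketch of a client-side check, not anything the wrapper itself provides. The format names are copied from the list above; `knownOutputFormats`, `isKnownOutputFormat`, and `unknownOutputFormats` are hypothetical helper names made up for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Output format names as listed in lexparser.sh (copied from the list above).
const std::set<std::string>& knownOutputFormats() {
    static const std::set<std::string> formats = {
        "penn", "oneline", "rootSymbolOnly", "words", "wordsAndTags",
        "dependencies", "typedDependencies", "typedDependenciesCollapsed",
        "latexTree", "xmlTree", "collocations", "semanticGraph",
        "conllStyleDependencies", "conll2007"};
    return formats;
}

// Hypothetical helper: true if `name` is one of the listed formats.
// The comparison is case-sensitive, so "xmltree" is rejected.
bool isKnownOutputFormat(const std::string& name) {
    return knownOutputFormats().count(name) > 0;
}

// Hypothetical helper: returns the requested names that are not in the list,
// in the order they were given.
std::vector<std::string> unknownOutputFormats(
        const std::vector<std::string>& requested) {
    std::vector<std::string> unknown;
    for (const std::string& name : requested) {
        if (!isKnownOutputFormat(name)) {
            unknown.push_back(name);
        }
    }
    return unknown;
}
```

A client could call `unknownOutputFormats` on whatever format names the user typed and report the leftovers before sending a request to the server.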

I've installed Thrift in an Xcode project on my Mac.
Here is a small tutorial on using Thrift's C++ option.
Install Thrift using Homebrew:

    brew install thrift

In the 'Stanford Thrift' directory, run:

    thrift --gen cpp corenlp.thrift

This generates the gen-cpp directory. Copy this directory into your project.
After that, copy /usr/local/Cellar/thrift/0.9.0/include and /usr/local/Cellar/thrift/0.9.0/lib into your project together with the Boost files, and add the header and library search paths in Build Settings.

In your cpp file:

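    #include <stdio.h>
    #include <string>
    #include <vector>

    #include <boost/shared_ptr.hpp>
    #include <thrift/protocol/TBinaryProtocol.h>
    #include <thrift/transport/TSocket.h>
    #include <thrift/transport/TTransportUtils.h>

    // Generated by `thrift --gen cpp corenlp.thrift`; adjust the path to
    // wherever you copied gen-cpp in your project.
    #include "gen-cpp/StanfordCoreNLP.h"

    using namespace apache::thrift;
    using namespace apache::thrift::protocol;
    using namespace apache::thrift::transport;
    using boost::shared_ptr;  // Thrift 0.9.0 uses boost::shared_ptr

    // thriftConnect is my own class (declaration not shown here);
    // `text` is assumed to be a std::string member holding the input text.
    void thriftConnect::setup() {
        shared_ptr<TTransport> socket(new TSocket("localhost", 9999));
        shared_ptr<TTransport> transport(new TBufferedTransport(socket));
        shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
        StanfordCoreNLPClient client(protocol);

        try {
            transport->open();

            // ping() sends the request and waits for the reply. Calling
            // send_ping() on its own leaves the reply unread, unless ping
            // is declared oneway in corenlp.thrift.
            client.ping();
            printf("ping()\n");

            std::vector<NamedEntity> entities;
            client.get_entities_from_text(entities, text);
            transport->close();
        } catch (TException &tx) {
            printf("ERROR: %s\n", tx.what());
        }
    }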

@dmnapolitano
Owner

Hi. For the parser options, if you take a look at the README: https://github.com/EducationalTestingService/stanford-thrift/blob/master/README_parser.md there's an outputFormat option where you can specify any of those formats and get the desired result. It accepts all of the same values as the command-line Stanford Parser.

Thanks for sharing your C++ client! I'll add it. 😄

@ghost ghost assigned dmnapolitano Nov 19, 2013
@minusplusminus
Author

Ah nice, but I still cannot switch between the loaded classifiers:

    Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [3,5 sec].
    Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [3,3 sec].
    Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [3,3 sec].

You're welcome!

@dmnapolitano
Owner

That's because I haven't done anything to accommodate that yet. 😕 Do you want the ability to load any number of NER models; as in, having the choice to load any one of them vs. any two of them vs. all three (and potentially more as they are developed)?

@minusplusminus
Author

It's a good idea to load every one of them, because they all behave quite differently. For me, the 4-class model works best for names of persons and organisations, and the 7-class model for more specific tagging. So I use all of them, and I think other developers will do the same.
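To illustrate the kind of split I mean, here is a minimal sketch that combines the output of two models by label. It assumes the entities from each model are already available on the client side; `TaggedEntity`, `isNameLabel`, and `combineByLabel` are hypothetical names for illustration only, and the wrapper's real NamedEntity type (generated from corenlp.thrift) may look different.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an entity returned by one NER model.
struct TaggedEntity {
    std::string text;   // surface string, e.g. "Stanford University"
    std::string label;  // e.g. "PERSON", "ORGANIZATION", "DATE"
};

// True for the labels that should come from the 4-class model.
bool isNameLabel(const std::string& label) {
    return label == "PERSON" || label == "ORGANIZATION";
}

// Keep PERSON/ORGANIZATION entities from the 4-class output and every
// other label from the 7-class output.
std::vector<TaggedEntity> combineByLabel(
        const std::vector<TaggedEntity>& fourClass,
        const std::vector<TaggedEntity>& sevenClass) {
    std::vector<TaggedEntity> combined;
    for (const TaggedEntity& entity : fourClass) {
        if (isNameLabel(entity.label)) {
            combined.push_back(entity);
        }
    }
    for (const TaggedEntity& entity : sevenClass) {
        if (!isNameLabel(entity.label)) {
            combined.push_back(entity);
        }
    }
    return combined;
}
```

This only works if the client can ask each loaded model for its entities separately, which is why being able to choose between the loaded classifiers matters.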

dmnapolitano pushed a commit that referenced this issue Nov 26, 2013
@dmnapolitano dmnapolitano mentioned this issue Nov 26, 2013
@dmnapolitano
Owner

Hi! Any number of NER models (along with your choice of parser and tagger models) can now be loaded from a configuration file when the server starts. Please take a look at this when you get a chance; in the meantime I'll close this, and if you find anything wrong with it, definitely open an issue. 😄

Now on to other issues, LOL.

Thanks!

@minusplusminus
Author

Freaking sweet! Thanks a lot!
