This repository has been archived by the owner on Dec 12, 2018. It is now read-only.

org.apache.thrift.TApplicationException: Internal error processing get_entities_from_text #22

Closed
Vanaja1505 opened this issue Jun 17, 2014 · 9 comments

@Vanaja1505

Hi,

I got the exception below when executing the client script. However, the server is running fine.

org.apache.thrift.TApplicationException: Internal error processing get_entities_from_text

Any help appreciated!

Regards,

Vanaja Jayaraman

@dan-blanchard
Contributor

Closing as duplicate of #14.

@dmnapolitano
Owner

I'm reopening this as this seems to be an issue with named entity recognition and not with the parser, which was the case with #14. Or am I wrong? @Vanaja1505 is there anything more to that error message? Can you share with me your client code? Thanks.

@Vanaja1505
Author

Thanks for your response!

Below is the error I'm getting:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
org.apache.thrift.TApplicationException: Internal error processing parse_text
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at CoreNLP.StanfordCoreNLP$Client.recv_parse_text(Unknown Source)
    at CoreNLP.StanfordCoreNLP$Client.parse_text(Unknown Source)
    at StanfordCoreNLPClient.perform(StanfordCoreNLPClient.java:59)
    at StanfordCoreNLPClient.main(StanfordCoreNLPClient.java:42)

And my client code is:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.List;

    import org.apache.thrift.TException;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.protocol.TProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    import CoreNLP.ParseTree;
    import CoreNLP.StanfordCoreNLP;

    public class StanfordCoreNLPClient {

        public static void main(String[] args) {
            String server = "";
            Integer port = 0;
            String inputFilename = "";

            if (args.length == 3) {
                server = args[0];
                port = Integer.parseInt(args[1]);
                inputFilename = args[2];
            } else {
                System.err.println("Usage: StanfordCoreNLPClient <server> <port> <inputfile>");
                System.exit(2);
            }

            try {
                TTransport transport = new TSocket(server, port);
                transport.open();

                TProtocol protocol = new TBinaryProtocol(transport);
                StanfordCoreNLP.Client client = new StanfordCoreNLP.Client(protocol);

                perform(client, inputFilename);

                transport.close();
            } catch (TException x) {
                x.printStackTrace();
            }
        }

        private static void perform(StanfordCoreNLP.Client client, String inputFilename) throws TException {
            try {
                FileReader infile = new FileReader(inputFilename);
                BufferedReader in = new BufferedReader(infile);
                while (in.ready()) {
                    String sentence = in.readLine();
                    List<ParseTree> trees = client.parse_text(sentence, null);
                    for (ParseTree tree : trees) {
                        System.out.println(tree.tree);
                    }
                    /* StanfordTokenizerThrift a = new StanfordTokenizerThrift();
                    List<List<String>> st = a.tokenizeText(sentence);
                    for (int i = 0; i < st.size() - 1; i++) {
                        List<String> st1 = st.get(i);
                        for (int j = 0; j < st1.size(); j++) {
                            System.out.println("--" + st1.get(j));
                        }
                    } */
                }
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

Note:
I get the exception when I use the parser, NER, or corenlp. The tokenizer (commented out in the client code above) runs fine.

@dmnapolitano
Owner

Got it. Thanks for sharing your client code. 😄

  • For the parser, you need to specify an output format. Try replacing your second argument of `null` with a `String[]` like:

        String[] desiredOutputFormat = {"-outputFormat", "oneline"};
        List<ParseTree> trees = client.parse_text(sentence, desiredOutputFormat);

  • For the NER, I just got the same error here, doing exactly what you're doing. I'm looking into it now. 😕
  • There is no CoreNLP tool per se; that's just the package all of these tools are part of. Unless I'm misunderstanding, and you mean you get this same exception no matter what you do. 😕
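To make the suggestion above concrete, here is a minimal, self-contained sketch of building the output-format arguments. The class name `ParserArgsExample` and the helper `buildParserArgs` are placeholders of mine, not part of the project; the commented-out `parse_text` call assumes a running server and the client setup from the code earlier in the thread.

```java
import java.util.Arrays;

public class ParserArgsExample {

    // Builds the String[] to pass as parse_text's second argument,
    // as suggested above (hypothetical helper name).
    static String[] buildParserArgs() {
        return new String[] {"-outputFormat", "oneline"};
    }

    public static void main(String[] args) {
        String[] desiredOutputFormat = buildParserArgs();
        System.out.println(Arrays.toString(desiredOutputFormat));
        // With a live server and an initialized client, the call would be:
        // List<ParseTree> trees = client.parse_text(sentence, desiredOutputFormat);
    }
}
```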

Thanks!

@dmnapolitano
Owner

Oh, hello. get_entities_from_text() isn't actually implemented. 😕 My mistake. I'll see if I can put something together there, as I can definitely see why someone would want to find named entities in arbitrary, raw text LOL.

@dmnapolitano dmnapolitano added NER and removed duplicate labels Jul 6, 2014
@dmnapolitano dmnapolitano self-assigned this Jul 6, 2014
@minusplusminus

Nice, maybe my problem is then fixed too.

@dmnapolitano
Owner

I think so! Definitely re-open if you find this to not be the case. 😄

@Vanaja1505
Author

Thanks for your valuable response!

Now it works without any error.

But now the server's memory usage keeps increasing with continued use.
For example, the server uses 1704M at startup and grows steadily to 2476M after processing nearly 340k characters of text.
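For reference, one generic way to log heap figures like these from inside a JVM process (a sketch of mine using the standard `Runtime` API, not part of the server or client code in this thread):

```java
public class HeapProbe {

    // Current JVM heap in use, in megabytes.
    static long usedMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Could be called between requests to track growth over time.
        System.out.println("Heap in use: " + usedMb() + "M");
    }
}
```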

Do you have any solution to this? If so, could you please share it?

@dmnapolitano
Owner

Hmmm, interesting...I'm going to create a new issue for that.
