StanfordCoreNlp object hangs when loading NER classifiers #54

Closed
zanek opened this issue Oct 18, 2016 · 4 comments

zanek commented Oct 18, 2016

This is quite odd. If I compile my project as 32-bit, it loads the classifiers just fine and everything works.

If I compile the project as 64-bit, it starts loading all the NER classifiers and then hangs on the english.4class.conll.distsim.crf.ser.gz model. Loading the models before it is really slow on 64-bit, and the conll model never finishes loading (I've left it running for 30+ minutes).

Does anyone have any idea why this would happen on 64-bit only?

sergey-tihon (Owner) commented Oct 18, 2016

I just ran a comparison on the test projects and do not see this issue.
#40 (comment)

Runtime   Time
x86       00:04:18.5871957
x64       00:03:34.5648107
AnyCpu    00:03:53.0988898

vermouthmjl commented Apr 19, 2018

I also have this problem. When I do not load the NER annotator, everything works fine. When I load only NER, memory use tops out at about 500 MB, then nothing happens and the program hangs.

I'm using the NuGet package Stanford.NLP.CoreNLP 3.9.1.0 in C#, targeting net45, on x64.

Incidentally, the NuGet package Stanford.NLP.NER works fine, but unfortunately it does not offer the same flexibility as the CoreNLP pipeline.

Here is the subtree provided by the profiler from the moment we instantiate the StanfordCoreNLP object:

78.82%   StanfordCoreNLP..ctor  •  39,444 ms  •  edu.stanford.nlp.pipeline.StanfordCoreNLP..ctor
  78.82%   StanfordCoreNLP..ctor  •  39,444 ms  •  edu.stanford.nlp.pipeline.StanfordCoreNLP..ctor
    78.82%   StanfordCoreNLP..ctor  •  39,444 ms  •  edu.stanford.nlp.pipeline.StanfordCoreNLP..ctor
      78.78%   construct  •  39,426 ms  •  edu.stanford.nlp.pipeline.StanfordCoreNLP.construct
        78.47%   get  •  39,269 ms  •  edu.stanford.nlp.pipeline.AnnotatorPool.get
          78.47%   get  •  39,269 ms  •  edu.stanford.nlp.util.Lazy.get
            78.47%   compute  •  39,269 ms  •  edu.stanford.nlp.util.Lazy+3.compute
              78.47%   get  •  39,269 ms  •  edu.stanford.nlp.pipeline.StanfordCoreNLP+__<>Anon38.get
                78.47%   lambda$null$69  •  39,269 ms  •  edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$69
                  78.47%   apply  •  39,269 ms  •  edu.stanford.nlp.pipeline.StanfordCoreNLP+__<>Anon8.apply
                    78.47%   lambda$getNamedAnnotators$46  •  39,269 ms  •  edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$46
                      78.47%   tokensRegexNER  •  39,269 ms  •  edu.stanford.nlp.pipeline.AnnotatorImplementations.tokensRegexNER
                        78.45%   TokensRegexNERAnnotator..ctor  •  39,263 ms  •  edu.stanford.nlp.pipeline.TokensRegexNERAnnotator..ctor
                          62.04%   createPatternMatcher  •  31,050 ms  •  edu.stanford.nlp.pipeline.TokensRegexNERAnnotator.createPatternMatcher
                            61.71%   getNewEnv  •  30,883 ms  •  edu.stanford.nlp.ling.tokensregex.TokenSequencePattern.getNewEnv
                              61.69%   TokenSequenceParser..ctor  •  30,871 ms  •  edu.stanford.nlp.ling.tokensregex.parser.TokenSequenceParser..ctor
                                61.69%   TokenSequenceParser+LookaheadSuccess..ctor  •  30,871 ms  •  edu.stanford.nlp.ling.tokensregex.parser.TokenSequenceParser+LookaheadSuccess..ctor
                                  61.69%   TokenSequenceParser+LookaheadSuccess..ctor  •  30,871 ms  •  edu.stanford.nlp.ling.tokensregex.parser.TokenSequenceParser+LookaheadSuccess..ctor
                                    61.69%   Error..ctor  •  30,871 ms  •  java.lang.Error..ctor
                                      61.69%   Throwable..ctor  •  30,871 ms  •  java.lang.Throwable..ctor
                                        61.69%   fillInStackTrace  •  30,871 ms  •  java.lang.Throwable.fillInStackTrace
                                           61.69%   CaptureStackTrace  •  30,871 ms  •  System.Diagnostics.StackTrace.CaptureStackTrace
                              ►0.02%   initDefaultBindings  •  12 ms  •  edu.stanford.nlp.ling.tokensregex.Env.initDefaultBindings
                            ►0.11%   valueOf  •  53 ms  •  edu.stanford.nlp.ling.tokensregex.CoreMapNodePattern.valueOf
                            ►0.09%   add  •  48 ms  •  java.util.ArrayList.add
                            ►0.07%   [Native code]  •  36 ms
                            ►0.06%   compile  •  30 ms  •  edu.stanford.nlp.ling.tokensregex.TokenSequencePattern.compile
                          ►16.35%   readEntries  •  8,183 ms  •  edu.stanford.nlp.pipeline.TokensRegexNERAnnotator.readEntries
                          ►0.04%   valueOf  •  18 ms  •  edu.stanford.nlp.pipeline.TokensRegexNERAnnotator+PosMatchType.valueOf
                          ►0.01%   lookupAnnotationKeyWithClassname  •  6 ms  •  edu.stanford.nlp.ling.tokensregex.EnvLookup.lookupAnnotationKeyWithClassname
        ►0.16%   info  •  78 ms  •  edu.stanford.nlp.util.logging.Redwood+RedwoodChannels.info
        ►0.14%   [Native code]  •  72 ms
      ►0.02%   constructAnnotatorPool  •  12 ms  •  edu.stanford.nlp.pipeline.StanfordCoreNLP.constructAnnotatorPool

vermouthmjl commented

To provide some additional information, it also hangs when I instantiate an NERCombinerAnnotator, with the following code.

    // modelsDir and jarRoot are defined elsewhere in the project.
    // Despite the variable name, this path points at the 3-class model.
    var NER_7CLASS = modelsDir + @"\ner\english.all.3class.distsim.crf.ser.gz";

    // Switch the working directory to jarRoot; curDir keeps the original.
    var curDir = Environment.CurrentDirectory;
    Directory.SetCurrentDirectory(jarRoot);

    Properties props = new Properties();
    props.setProperty("ner.applyNumericClassifiers", "0");
    props.setProperty("ner.useSUTime", "0");
    props.setProperty("ner.model", NER_7CLASS);
    props.setProperty("ner.applyFineGrained", "0");
    NERClassifierCombiner ner = NERClassifierCombiner.createNERClassifierCombiner("ner", props);
    var unthreadedAnnotator = new NERCombinerAnnotator(ner, false, 1, 5);

One other question: why does the program go into setUpFineGrainedNER even though I put props.setProperty("ner.applyFineGrained", "0"); in my code?
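
One part of this can be checked independently of CoreNLP: how the string "0" behaves as a Java boolean. The sketch below assumes the flag is read with `Boolean.parseBoolean` semantics, which this thread does not confirm; `readFlag` and `BooleanPropertyDemo` are hypothetical names, not CoreNLP APIs. Under that assumption "0" already reads as false, so the property value by itself would not explain the call to setUpFineGrainedNER.

```java
import java.util.Properties;

class BooleanPropertyDemo {
    // Hypothetical helper: reads a flag with Boolean.parseBoolean semantics,
    // falling back to a default when the key is absent.
    static boolean readFlag(Properties props, String key, boolean defaultValue) {
        String value = props.getProperty(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        Properties props = new Properties();

        // Only the string "true" (ignoring case) parses as true.
        props.setProperty("ner.applyFineGrained", "0");
        System.out.println(readFlag(props, "ner.applyFineGrained", true)); // false

        props.setProperty("ner.applyFineGrained", "1");
        System.out.println(readFlag(props, "ner.applyFineGrained", true)); // false

        props.setProperty("ner.applyFineGrained", "TRUE");
        System.out.println(readFlag(props, "ner.applyFineGrained", true)); // true

        // A key that was never set falls back to the default.
        System.out.println(readFlag(props, "ner.useSUTime", true)); // true
    }
}
```

Writing "false" instead of "0" makes the intent explicit and avoids relying on how a numeric string happens to parse.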

The following subtree is captured in the profiler:

74.42%   NERCombinerAnnotator..ctor  •  25,353 ms  •  edu.stanford.nlp.pipeline.NERCombinerAnnotator..ctor
  74.42%   NERCombinerAnnotator..ctor  •  25,353 ms  •  edu.stanford.nlp.pipeline.NERCombinerAnnotator..ctor
    74.42%   NERCombinerAnnotator..ctor  •  25,353 ms  •  edu.stanford.nlp.pipeline.NERCombinerAnnotator..ctor
      74.42%   setUpFineGrainedNER  •  25,353 ms  •  edu.stanford.nlp.pipeline.NERCombinerAnnotator.setUpFineGrainedNER
        74.40%   TokensRegexNERAnnotator..ctor  •  25,347 ms  •  edu.stanford.nlp.pipeline.TokensRegexNERAnnotator..ctor
          51.62%   createPatternMatcher  •  17,585 ms  •  edu.stanford.nlp.pipeline.TokensRegexNERAnnotator.createPatternMatcher
            51.44%   getNewEnv  •  17,525 ms  •  edu.stanford.nlp.ling.tokensregex.TokenSequencePattern.getNewEnv
              51.41%   TokenSequenceParser..ctor  •  17,513 ms  •  edu.stanford.nlp.ling.tokensregex.parser.TokenSequenceParser..ctor
                51.41%   TokenSequenceParser+LookaheadSuccess..ctor  •  17,513 ms  •  edu.stanford.nlp.ling.tokensregex.parser.TokenSequenceParser+LookaheadSuccess..ctor
                  51.41%   TokenSequenceParser+LookaheadSuccess..ctor  •  17,513 ms  •  edu.stanford.nlp.ling.tokensregex.parser.TokenSequenceParser+LookaheadSuccess..ctor
                    51.41%   Error..ctor  •  17,513 ms  •  java.lang.Error..ctor
                      51.41%   Throwable..ctor  •  17,513 ms  •  java.lang.Throwable..ctor
                        51.41%   fillInStackTrace  •  17,513 ms  •  java.lang.Throwable.fillInStackTrace
                          51.41%   CaptureStackTrace  •  17,513 ms  •  System.Diagnostics.StackTrace.CaptureStackTrace
                            51.30%   InitializeSourceInfo  •  17,477 ms  •  System.Diagnostics.StackFrameHelper.InitializeSourceInfo
                              50.30%   GetSourceLineInfo  •  17,137 ms  •  System.Diagnostics.StackTraceSymbols.GetSourceLineInfo
                                49.43%   TryGetReader  •  16,839 ms  •  System.Diagnostics.StackTraceSymbols.TryGetReader
                                  49.37%   TryOpenReaderFromAssemblyFile  •  16,821 ms  •  System.Diagnostics.StackTraceSymbols.TryOpenReaderFromAssemblyFile
                                    41.79%   TryOpenFile  •  14,238 ms  •  System.Diagnostics.StackTraceSymbols.TryOpenFile
                                      35.65%   FileStream..ctor  •  12,144 ms  •  System.IO.FileStream..ctor
                                        35.51%   Init  •  12,096 ms  •  System.IO.FileStream.Init
                                          35.00%   SafeCreateFile  •  11,922 ms  •  Microsoft.Win32.Win32Native.SafeCreateFile
                                             0.02%   SafeHandleAddRef  •  6 ms  •  System.StubHelpers.StubHelpers.SafeHandleAddRef
                                          ►0.30%   LegacyNormalizePath  •  102 ms  •  System.IO.Path.LegacyNormalizePath
                                          ►0.05%   SetErrorMode  •  18 ms  •  Microsoft.Win32.Win32Native.SetErrorMode
                                          ►0.05%   EmulateFileIOPermissionChecks  •  18 ms  •  System.Security.Permissions.FileIOPermission.EmulateFileIOPermissionChecks
                                           0.02%   get_BlockLongPaths  •  6 ms  •  System.AppContextSwitches.get_BlockLongPaths
                                           0.02%   StartsWith  •  6 ms  •  System.String.StartsWith
                                           0.02%   SafeHandleRelease  •  6 ms  •  System.StubHelpers.StubHelpers.SafeHandleRelease
                                        ►0.07%   GetFileName  •  24 ms  •  System.IO.Path.GetFileName
                                        ►0.05%   InternalSubString  •  18 ms  •  System.String.InternalSubString
                                      ►6.06%   InternalExistsHelper  •  2,065 ms  •  System.IO.File.InternalExistsHelper
                                    ►4.24%   TryOpenAssociatedPortablePdb  •  1,444 ms  •  System.Reflection.PortableExecutable.PEReader.TryOpenAssociatedPortablePdb
                                    ►2.93%   Dispose  •  997 ms  •  System.Reflection.PortableExecutable.PEReader.Dispose
                                    ►0.42%   PEReader..ctor  •  142 ms  •  System.Reflection.PortableExecutable.PEReader..ctor
                                ►0.84%   Assert  •  286 ms  •  System.Security.CodeAccessPermission.Assert
                                 0.02%   [Garbage collection]  •  6 ms
                              ►0.16%   CreateDelegate  •  54 ms  •  System.Reflection.RuntimeMethodInfo.CreateDelegate
                              ►0.04%   Assert  •  12 ms  •  System.Security.CodeAccessPermission.Assert
                              ►0.04%   CreateInstance  •  12 ms  •  System.Activator.CreateInstance
                               0.02%   [Garbage collection]  •  6 ms
                               0.02%   ReflectionPermission..ctor  •  6 ms  •  System.Security.Permissions.ReflectionPermission..ctor
                            ►0.07%   GetMethodBase  •  24 ms  •  System.Diagnostics.StackFrameHelper.GetMethodBase
                             0.02%   [Garbage collection]  •  6 ms
              ►0.04%   Env..ctor  •  12 ms  •  edu.stanford.nlp.ling.tokensregex.Env..ctor
            ►0.05%   valueOf  •  18 ms  •  edu.stanford.nlp.ling.tokensregex.CoreMapNodePattern.valueOf
            ►0.04%   add  •  12 ms  •  java.util.ArrayList.add
            ►0.04%   [Native code]  •  12 ms
            ►0.03%   compile  •  11 ms  •  edu.stanford.nlp.ling.tokensregex.TokenSequencePattern.compile
          ►22.75%   readEntries  •  7,750 ms  •  edu.stanford.nlp.pipeline.TokensRegexNERAnnotator.readEntries
          ►0.02%   lookupAnnotationKeyWithClassname  •  6 ms  •  edu.stanford.nlp.ling.tokensregex.EnvLookup.lookupAnnotationKeyWithClassname
          ►0.02%   split  •  6 ms  •  java.util.regex.Pattern.split
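
Both subtrees bottom out in the same place: constructing TokenSequenceParser+LookaheadSuccess (a java.lang.Error subclass, per the frames above) runs Throwable.fillInStackTrace, which here lands in System.Diagnostics.StackTrace.CaptureStackTrace and, in the second profile, in symbol-file lookups. The sketch below is plain Java, not CoreNLP or IKVM code, and only illustrates the mechanism: a throwable captures a stack trace at construction unless the subclass opts out. `TracedSignal`, `UntracedSignal` and `StackTraceCostDemo` are hypothetical names.

```java
// A control-flow signal that captures a stack trace when constructed (the default).
class TracedSignal extends Error {
    TracedSignal() {
        super("traced");
    }
}

// The same signal with capture switched off through the protected
// Error(message, cause, enableSuppression, writableStackTrace) constructor.
class UntracedSignal extends Error {
    UntracedSignal() {
        super("untraced", null, false, false);
    }
}

class StackTraceCostDemo {
    // Constructs `count` traced signals and returns the elapsed nanoseconds.
    static long timeTraced(int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            new TracedSignal();
        }
        return System.nanoTime() - start;
    }

    // Constructs `count` untraced signals and returns the elapsed nanoseconds.
    static long timeUntraced(int count) {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            new UntracedSignal();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int count = 100_000;
        System.out.println("traced frames:   " + new TracedSignal().getStackTrace().length);
        System.out.println("untraced frames: " + new UntracedSignal().getStackTrace().length);
        System.out.println("traced ns:       " + timeTraced(count));
        System.out.println("untraced ns:     " + timeUntraced(count));
    }
}
```

On a stock JVM the traced signal reports a non-zero frame count and the untraced one reports 0; the timings vary by machine and runtime. Whether the packaged CoreNLP assembly can be changed along these lines is not something this thread establishes.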

sergey-tihon (Owner) commented

Closing as an old issue.
