Lucene not pulled by the dependency puller #219

Closed
GindaChen opened this issue Nov 26, 2020 · 1 comment


GindaChen commented Nov 26, 2020

When I execute

$ ./pull-dependencies freebase

and then

./run @mode=freebase @domain=webquestions @train=1 @sparqlserver=localhost:3001 @cacheserver=local

I encountered a parser error, followed by a message saying that the folder lib/lucene/4.4/inexact does not exist. That folder should have been downloaded along with the dependencies:

java.lang.RuntimeException: org.apache.lucene.store.NoSuchDirectoryException: 
directory '/mnt/data/sempre/lib/lucene/4.4/inexact' does not exist
    Example lib/data/webquestions/dataset_11/webquestions.examples.train.json:3777 (3777): [what, kind, government, does, the, us, have, ?] => (list (description "Presidential system") (description "Federal republic") (description "Representative democracy") (description "Two-party system") (description "Constitutional republic") (description Republic))
    Dataset stats {
      numTokenTypes = 3604
      numTokensPerExample = 4/ << 7.715 ~ 1.592 >> /15 (3778)
      numExamples.train = 3022
      numExamples.dev = 756
    }
  } [50s, cum. 58s]
  Learner.learn() {
    Iteration 0/3 {
      Processing iter=0.train: 3022 examples {
        Examples {
          iter=0.train: example 0/3022: lib/data/webquestions/dataset_11/webquestions.examples.train.json:1553 {
            Example: where was emperor hadrian born? {
              Tokens: [where, was, emperor, hadrian, born, ?]
              Lemmatized tokens: [where, be, emperor, hadrian, bear, ?]
              POS tags: [WRB, VBD-AUX, NNP, NNP, VBN, .]
              NER tags: [O, O, O, PERSON, O, O]
              NER values: [null, null, null, null, null, null]
              targetValue: (list (description Rome))
              Dependency children: [[], [], [], [compound->2], [advmod->0, auxpass->1, nsubjpass->3, punct->5], []]
            }
            Parser.parse: parse {
              Constructing Searcher {
                Opening index dir: lib/lucene/4.4/inexact/
                ERROR: Composition failed: rule = $Entity -> $NamedEntity (LexiconFn entity inexact), children = [(derivation (formula (string hadrian)) (type fb:type.text))]
java.lang.RuntimeException: org.apache.lucene.store.NoSuchDirectoryException: directory '/mnt/data/sempre/lib/lucene/4.4/inexact' does not exist
	at edu.stanford.nlp.sempre.freebase.LexiconFn.call(LexiconFn.java:237)
	at edu.stanford.nlp.sempre.BeamParserState.applyRule(BeamParser.java:142)
	at edu.stanford.nlp.sempre.BeamParserState.applyCatUnaryRules(BeamParser.java:193)
	at edu.stanford.nlp.sempre.BeamParserState.build(BeamParser.java:126)
	at edu.stanford.nlp.sempre.BeamParserState.infer(BeamParser.java:98)
	at edu.stanford.nlp.sempre.Parser.parse(Parser.java:170)
	at edu.stanford.nlp.sempre.Learner.parseExample(Learner.java:288)
	at edu.stanford.nlp.sempre.Learner.processExamples(Learner.java:199)
	at edu.stanford.nlp.sempre.Learner.learn(Learner.java:125)
	at edu.stanford.nlp.sempre.Learner.learn(Learner.java:90)
	at edu.stanford.nlp.sempre.Main.run(Main.java:27)
	at fig.exec.Execution.runWithObjArray(Execution.java:337)
	at fig.exec.Execution.run(Execution.java:325)
	at edu.stanford.nlp.sempre.Main.main(Main.java:50)
Caused by: org.apache.lucene.store.NoSuchDirectoryException: directory '/mnt/data/sempre/lib/lucene/4.4/inexact' does not exist
	at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:218)
	at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:242)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:712)
	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
	at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:66)
	at edu.stanford.nlp.sempre.freebase.index.FbEntitySearcher.<init>(FbEntitySearcher.java:45)
	at edu.stanford.nlp.sempre.freebase.EntityLexicon.lookupEntries(EntityLexicon.java:72)
	at edu.stanford.nlp.sempre.freebase.Lexicon.lookupEntities(Lexicon.java:65)
	at edu.stanford.nlp.sempre.freebase.LexiconFn.call(LexiconFn.java:204)
	... 13 more
                ERROR: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.lucene.store.NoSuchDirectoryException: directory '/mnt/data/sempre/lib/lucene/4.4/inexact' does not exist:
edu.stanford.nlp.sempre.BeamParserState.applyRule(BeamParser.java:165)
edu.stanford.nlp.sempre.BeamParserState.applyCatUnaryRules(BeamParser.java:193)
edu.stanford.nlp.sempre.BeamParserState.build(BeamParser.java:126)
edu.stanford.nlp.sempre.BeamParserState.infer(BeamParser.java:98)
edu.stanford.nlp.sempre.Parser.parse(Parser.java:170)
edu.stanford.nlp.sempre.Learner.parseExample(Learner.java:288)
edu.stanford.nlp.sempre.Learner.processExamples(Learner.java:199)
edu.stanford.nlp.sempre.Learner.learn(Learner.java:125)
edu.stanford.nlp.sempre.Learner.learn(Learner.java:90)
edu.stanford.nlp.sempre.Main.run(Main.java:27)
fig.exec.Execution.runWithObjArray(Execution.java:337)
fig.exec.Execution.run(Execution.java:325)
edu.stanford.nlp.sempre.Main.main(Main.java:50)
                ERROR: Caused by java.lang.RuntimeException: org.apache.lucene.store.NoSuchDirectoryException: directory '/mnt/data/sempre/lib/lucene/4.4/inexact' does not exist:
edu.stanford.nlp.sempre.freebase.LexiconFn.call(LexiconFn.java:237)
edu.stanford.nlp.sempre.BeamParserState.applyRule(BeamParser.java:142)
edu.stanford.nlp.sempre.BeamParserState.applyCatUnaryRules(BeamParser.java:193)
edu.stanford.nlp.sempre.BeamParserState.build(BeamParser.java:126)
edu.stanford.nlp.sempre.BeamParserState.infer(BeamParser.java:98)
edu.stanford.nlp.sempre.Parser.parse(Parser.java:170)
edu.stanford.nlp.sempre.Learner.parseExample(Learner.java:288)
edu.stanford.nlp.sempre.Learner.processExamples(Learner.java:199)
edu.stanford.nlp.sempre.Learner.learn(Learner.java:125)
edu.stanford.nlp.sempre.Learner.learn(Learner.java:90)
edu.stanford.nlp.sempre.Main.run(Main.java:27)
fig.exec.Execution.runWithObjArray(Execution.java:337)
fig.exec.Execution.run(Execution.java:325)
edu.stanford.nlp.sempre.Main.main(Main.java:50)
                Execution directory: state/execs/0.exec
3 errors, 0 warnings
              }
Command failed: fig/bin/qcreate java -ea -Dmodules=core,freebase -Xms8G -Xmx10G -cp libsempre/*:lib/* edu.stanford.nlp.sempre.Main -execDir _OUTPATH_ -overwriteExecDir -addToView 0 -SparqlExecutor.endpointUrl http://localhost:3001/sparql -FeatureExtractor.featureDomains basicStats alignmentScores entityFeatures context skipPos joinPos wordSim lexAlign tokenMatch rule opCount constant denotation whType span derivRank lemmaAndBinaries -Builder.executor freebase.SparqlExecutor -Builder.valueEvaluator freebase.FreebaseValueEvaluator -LanguageAnalyzer.languageAnalyzer corenlp.CoreNLPAnalyzer -LexiconFn.lexiconClassName edu.stanford.nlp.sempre.fbalignment.lexicons.Lexicon -BinaryLexicon.binaryLexiconFilesPath lib/fb_data/7/binaryInfoStringAndAlignment.txt -BinaryLexicon.keyToSortBy Intersection_size_typed -UnaryLexicon.unaryLexiconFilePath lib/fb_data/7/unaryInfoStringAndAlignment.txt -EntityLexicon.entityPopularityPath lib/fb_data/7/entityPopularity.txt -TypeInference.typeLookup freebase.FreebaseTypeLookup -FreebaseSearch.cachePath /u/nlp/data/semparse/scr/cache/fbsearch/1.cache -Dataset.inPaths train,lib/data/webquestions/dataset_11/webquestions.examples.train.json -Dataset.trainFrac 0.8 -Dataset.devFrac 0.2 -Grammar.inPaths freebase/data/emnlp2013.grammar -Parser.beamSize 200 -Lexicon.cachePath LexiconFn.cache -SparqlExecutor.cachePath SparqlExecutor.cache -FreebaseSearch.cachePath FreebaseSearch.cache -EntityLexicon.inexactMatchIndex lib/lucene/4.4/inexact/ -LexiconFn.maxEntityEntries 10 -Grammar.tags webquestions bridge join inject inexact -Learner.maxTrainIters 3 -BridgeFn.useBinaryPredicateFeatures true -BridgeFn.filterBadDomain true -Dataset.splitRandom 1
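
The exception is raised when Lucene tries to open an index directory that is not on disk, so the failure can be caught before training starts. Below is a minimal sketch of such a pre-flight check, assuming a POSIX shell and the relative path shown in the log (`lib/lucene/4.4/inexact`). The function name `check_index_dir` is hypothetical and not part of SEMPRE.

```shell
# Hypothetical helper: succeeds only if the given directory exists and is non-empty.
check_index_dir() {
  dir="$1"
  # -d: the path is a directory; ls -A prints nothing for an empty directory.
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "ok: $dir"
    return 0
  fi
  echo "missing or empty: $dir" >&2
  return 1
}

# Example use (path taken from the error message, relative to the SEMPRE root):
#   check_index_dir lib/lucene/4.4/inexact && ./run @mode=freebase ...
```

This only confirms that something is in the directory; it does not validate that the contents are a usable Lucene index.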

Indeed, I found that the following errors had slipped by silently in the pull-dependencies log:


/u/nlp/data/semparse/resources/lucene-core-4.4.0.jar
--2020-11-26 00:50:47--  http://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-core-4.4.0.jar
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-core-4.4.0.jar [following]
--2020-11-26 00:50:47--  https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-core-4.4.0.jar
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 301 MOVED PERMANENTLY
Location: https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-core-4.4.0.jar/ [following]
--2020-11-26 00:50:48--  https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-core-4.4.0.jar/
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

    The file is already fully retrieved; nothing to do.

/u/nlp/data/semparse/resources/lucene-analyzers-common-4.4.0.jar
--2020-11-26 00:50:48--  http://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-analyzers-common-4.4.0.jar
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-analyzers-common-4.4.0.jar [following]
--2020-11-26 00:50:48--  https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-analyzers-common-4.4.0.jar
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 301 MOVED PERMANENTLY
Location: https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-analyzers-common-4.4.0.jar/ [following]
--2020-11-26 00:50:48--  https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-analyzers-common-4.4.0.jar/
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

    The file is already fully retrieved; nothing to do.

/u/nlp/data/semparse/resources/lucene-queryparser-4.4.0.jar
--2020-11-26 00:50:49--  http://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-queryparser-4.4.0.jar
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-queryparser-4.4.0.jar [following]
--2020-11-26 00:50:49--  https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-queryparser-4.4.0.jar
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 301 MOVED PERMANENTLY
Location: https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-queryparser-4.4.0.jar/ [following]
--2020-11-26 00:50:49--  https://nlp.stanford.edu/software/sempre/dependencies-2.0/u/nlp/data/semparse/resources/lucene-queryparser-4.4.0.jar/
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable

    The file is already fully retrieved; nothing to do.

  1. Should we fetch the Lucene JARs directly from the Apache archive?
  2. It seems there should be a folder lib/lucene/4.4/inexact when the inexact mode is specified. Is it created at runtime (by SEMPRE)?
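
One way to tell whether the three JARs actually arrived is to inspect the local files rather than the wget output. wget prints "The file is already fully retrieved; nothing to do." when it resumes a download and the server answers 416, which by itself does not show that the local copy is a valid archive. Below is a minimal sketch, assuming a POSIX-like shell with `head -c` available. The function name `jar_ok` is hypothetical, and the local location of the JARs is an assumption to adjust for your checkout.

```shell
# Hypothetical helper: succeeds if the file exists, is non-empty,
# and starts with the ZIP magic bytes "PK" (a JAR is a ZIP archive).
jar_ok() {
  [ -s "$1" ] && [ "$(head -c 2 "$1")" = "PK" ]
}

# Example use (the lib/ location is an assumption, not confirmed by the log):
#   for f in lib/lucene-core-4.4.0.jar lib/lucene-analyzers-common-4.4.0.jar \
#            lib/lucene-queryparser-4.4.0.jar; do
#     jar_ok "$f" || echo "suspect download: $f"
#   done
```

A file that fails this check is either absent, empty, or something other than an archive (for example an HTML error page saved under the JAR's name). Passing the check does not guarantee the archive is complete.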

GindaChen commented Nov 26, 2020

Just saw this post, and it seems the link to free917.tar.bz2 is working (if downloaded manually).

GindaChen changed the title from "Dependency lucene no longer reachable" to "Lucene not pulled by the dependency puller" on Nov 26, 2020
GindaChen reopened this on Nov 27, 2020
This issue was closed.