SolrException: Undefined field text_edge #115

GreenArchon · 2016-12-31T21:09:36Z

Hello,

I just installed Nextant (v1.0.3) & ran a first successful index on a Solr 6.3.0 instance. However, I don't seem to get any results on search (except for the default slow Nextcloud search on filenames), and when looking at the Solr logs for a search of "foobar" in Nextcloud, I get the following:

2016-12-31 20:40:18.683 ERROR (qtp606548741-58460) [   x:nextant] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: undefined field text_edge
        at org.apache.solr.schema.IndexSchema.getDynamicFieldType(IndexSchema.java:1308)
        at org.apache.solr.schema.IndexSchema$SolrQueryAnalyzer.getWrappedAnalyzer(IndexSchema.java:452)
        at org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy.getReusableComponents(DelegatingAnalyzerWrapper.java:84)
        at org.apache.lucene.analysis.Analyzer.tokenStream(Analyzer.java:191)
        at org.apache.lucene.util.QueryBuilder.createFieldQuery(QueryBuilder.java:206)
        at org.apache.solr.parser.SolrQueryParserBase.newFieldQuery(SolrQueryParserBase.java:371)
        at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:741)
        at org.apache.solr.parser.SolrQueryParserBase.getFieldQuery(SolrQueryParserBase.java:384)
        at org.apache.solr.parser.SolrQueryParserBase.handleQuotedTerm(SolrQueryParserBase.java:543)
        at org.apache.solr.parser.QueryParser.Term(QueryParser.java:413)
        at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:180)
        at org.apache.solr.parser.QueryParser.Query(QueryParser.java:101)
        at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:184)
        at org.apache.solr.parser.QueryParser.Query(QueryParser.java:101)
        at org.apache.solr.parser.QueryParser.Clause(QueryParser.java:184)
        at org.apache.solr.parser.QueryParser.Query(QueryParser.java:101)
        at org.apache.solr.parser.QueryParser.TopLevelQuery(QueryParser.java:90)
        at org.apache.solr.parser.SolrQueryParserBase.parse(SolrQueryParserBase.java:152)
        at org.apache.solr.search.LuceneQParser.parse(LuceneQParser.java:50)
        at org.apache.solr.search.QParser.getQuery(QParser.java:140)
        at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:161)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:269)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:153)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:2213)
        at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:518)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
        at java.lang.Thread.run(Thread.java:745)

2016-12-31 20:40:18.683 INFO  (qtp606548741-58460) [   x:nextant] o.a.s.c.S.Request [nextant]  webapp=/solr path=/select params={json.nl=flat&hl=true&fl=id,nextant_deleted,nextant_path,nextant_source,nextant_owner,nextant_mtime,nextant_attr_content_type,score&start=0&hl.fragsize=70&fq=nextant_owner:"user1"+OR+nextant_share:"user1"+OR++nextant_sharegroup:[a bunch of groups...]&rows=25&hl.snippets=4&q=((text_edge:"foobar"^150)+OR+(text:"foobar"^1)+OR+(text_edge:"foobar"^5))%0a+OR+(nextant_path:"foobar"^1000+%0a)&omitHeader=true&hl.maxAnalyzedChars=100000&hl.fl=text_edge&wt=json} status=400 QTime=0

Querying Solr manually, I can see that the files have a nextant_attr_text_edge (and can successfully query for it).

However, simply modifying the query Nextant sends to append nextant_attr_ then leads to an undefined field text.

Thanks, and happy new year!

ArtificialOwl · 2016-12-31T21:16:56Z

Can you try running ./occ nextant:check --fix ?

If the result is not all green, run it several times.

GreenArchon · 2016-12-31T23:15:49Z

That helped, with:

[...]
 * Checking field-type 'text_general' : fail
   -> Fixing field-type 'text_general' ok
 * Checking field-type 'text_general_edge' : fail
   -> Fixing field-type 'text_general_edge' ok
 * Checking field-type 'text_general_word' : fail
   -> Fixing field-type 'text_general_word' ok
 * Checking field '_version_' : fail
   -> Fixing field '_version_' ok
 * Checking field 'id' : ok
 * Checking field 'text' : fail
   -> Fixing field 'text' ok
 * Checking field 'text_edge' : fail
   -> Fixing field 'text_edge' ok
 * Checking field 'text_word' : fail
   -> Fixing field 'text_word' ok
 * Checking field 'nextant_path' : fail
   -> Fixing field 'nextant_path' ok
 * Checking field 'nextant_owner' : fail
   -> Fixing field 'nextant_owner' ok
 * Checking field 'nextant_mtime' : fail
   -> Fixing field 'nextant_mtime' ok
 * Checking field 'nextant_share' : fail
   -> Fixing field 'nextant_share' ok
 * Checking field 'nextant_sharegroup' : fail
   -> Fixing field 'nextant_sharegroup' ok
 * Checking field 'nextant_deleted' : fail
   -> Fixing field 'nextant_deleted' ok
 * Checking field 'nextant_source' : fail
   -> Fixing field 'nextant_source' ok
 * Checking field 'nextant_tags' : fail
   -> Fixing field 'nextant_tags' ok
 * Checking field 'nextant_extracted' : fail
   -> Fixing field 'nextant_extracted' ok
 * Checking field 'nextant_ocr' : fail
   -> Fixing field 'nextant_ocr' ok
 * Checking field 'nextant_unmounted' : fail
   -> Fixing field 'nextant_unmounted' ok
 * Checking dynamic-field 'ignored_*' : ok
 * Checking dynamic-field 'nextant_attr_*' : fail
   -> Fixing dynamic-field 'nextant_attr_*' ok
 * Checking copy-field 'text_edge/text' : fail
   -> Fixing copy-field 'text_edge/text' ok
 * Checking copy-field 'text_edge/text_word' : fail
   -> Fixing copy-field 'text_edge/text_word' ok
[...]

All is green now.

However, it seems I'm not out of the woods yet, since all queries still return 0 hits, and I get something like this in the logs:

2016-12-31 22:50:02.792 INFO  (qtp606548741-58453) [   x:nextant] o.a.s.c.S.Request [nextant]  webapp=/solr path=/select params={json.nl=flat&hl=true&fl=id,nextant_deleted,nextant_path,nextant_source,nextant_owner,nextant_mtime,nextant_attr_content_type,score&start=0&hl.fragsize=70&fq=nextant_owner:"user1"+OR+nextant_share:"user1"+OR++nextant_sharegroup:"group1"+OR++nextant_sharegroup:"__all"&rows=25&hl.snippets=4&q=((text_edge:"foobar"^150)+OR+(text:"foobar"^1)+OR+(text_edge:"foobar"^5))%0a+OR+(nextant_path:"foobar"^1000+%0a)&omitHeader=true&hl.maxAnalyzedChars=100000&hl.fl=text_edge&wt=json} hits=0 status=0 QTime=0

Playing with it a bit, I noticed that if I manually modify the query, changing +OR+ to %2BOR%2B (i.e. URL percent-encoding the plus signs), I get results...
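
For reference, a minimal Python sketch (not Nextant's or Solr's code) of how the two forms decode under form-style URL decoding: a bare + becomes a space, while %2B becomes a literal plus sign. The fragment is shortened from the q= parameter in the log line above. The two forms therefore reach the query parser as different strings, which may be why the hit count differed.

```python
from urllib.parse import unquote_plus

# Shortened fragment of the q= parameter as printed in the Solr request log.
logged = '(text_edge:"foobar"^150)+OR+(text:"foobar"^1)'

# The same fragment with the plus signs percent-encoded by hand.
escaped = logged.replace("+", "%2B")

# Form-style decoding: '+' turns into a space, '%2B' turns into '+'.
decoded_logged = unquote_plus(logged)
decoded_escaped = unquote_plus(escaped)

print(decoded_logged)   # the plus signs have become spaces
print(decoded_escaped)  # the plus signs survive as literal '+'
```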

ArtificialOwl · 2017-01-01T01:32:00Z

Did you reindex after running nextant:check ?

GreenArchon · 2017-01-01T17:59:44Z

Doing it now, and I seem to be starting to get results. I'll confirm in a few days, once it's done (the reindex seems to take ~10x longer than the first index for the same files; is that to be expected?).

ArtificialOwl · 2017-01-01T20:08:31Z

Well, your Solr schema was not OK, so it is probably normal that your files were not fully extracted during the first index.
Now, how many files do you have on your cloud, and what kind of hardware is running it?

GreenArchon · 2017-01-01T20:24:14Z

It's currently indexing ~155k files and has done only ~20k in 10 hours, versus 13 hours for the whole process the first time (both the files and Solr are local, and the bottleneck is mostly PHP maxing out a core, so there is not much to do here to improve it).
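
As a rough back-of-the-envelope check of those figures, here is a small Python sketch. It assumes the reindex keeps a constant throughput, which is only an approximation, and it uses the rounded numbers quoted above (~155k files, ~20k done in 10 hours, 13 hours for the first run).

```python
# Rounded figures from the comment above; all values are approximate.
total_files = 155_000
done_files = 20_000
elapsed_hours = 10
first_run_hours = 13

# Throughput of the current reindex and of the first (incomplete) index.
reindex_rate = done_files / elapsed_hours
first_run_rate = total_files / first_run_hours

# Projected duration of the full reindex at the current rate.
estimated_total_hours = total_files / reindex_rate
slowdown = first_run_rate / reindex_rate

print(f"reindex rate: {reindex_rate:.0f} files/hour")
print(f"estimated total: {estimated_total_hours:.1f} hours "
      f"(~{estimated_total_hours / 24:.1f} days)")
print(f"slowdown vs first run: ~{slowdown:.1f}x")
```

At that rate the full reindex would take a little over three days, consistent with the "few days" mentioned in this thread.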

Anyway, I don't mind it taking a few days for the first index, I'll just leave it running in its screen session.

GreenArchon · 2017-01-06T15:13:13Z

After a few retries I finally got a working index, so all is good now, thanks.

Looking back, what I did was try to index, run into issues, then drop the core and recreate it; of course Nextant didn't know about that and thought the schema was still OK... Sorry for the trouble.
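
For anyone who ends up in the same situation, a minimal Python sketch of checking whether a field exists in the core's schema after recreating it. It uses Solr's Schema API endpoint /schema/fields/<name>. The base URL (default port 8983 on localhost) is an assumption, the core name nextant is taken from the logs above, and the helper names are hypothetical. It also assumes Solr answers 404 for an unknown field. This is not how nextant:check works internally, just an independent way to look.

```python
import json
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen


def field_url(base_url, core, field):
    """Build the Schema API URL for one field (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/{quote(core)}/schema/fields/{quote(field)}?wt=json"


def field_is_defined(base_url, core, field):
    """Return True if the core's schema defines the field.

    Assumption: Solr responds with HTTP 404 when the field is unknown.
    """
    try:
        with urlopen(field_url(base_url, core, field)) as resp:
            return "field" in json.load(resp)
    except HTTPError as err:
        if err.code == 404:
            return False
        raise


if __name__ == "__main__":
    # Assumed local Solr instance; adjust the URL and core name as needed.
    for name in ("text", "text_edge", "text_word"):
        print(name, field_is_defined("http://localhost:8983/solr", "nextant", name))
```

If any of these come back False after the core has been dropped and recreated, rerunning ./occ nextant:check --fix and then reindexing, as described earlier in this thread, is the fix that worked here.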

GreenArchon closed this as completed Jan 6, 2017
