Nextant not scraping documents #74

Closed
huntersli opened this issue Nov 17, 2016 · 8 comments

@huntersli

Firstly, my apologies if this has already been covered; I could not find a similar issue.

On a fresh Solr install with nextant 0.6.6, I am unable to complete a full test search.

Running nextant:check --info results in the following:

array (
'instant' => true,
'configured' => '1',
'ping' => true,
'solr_url' => 'HIDDEN',
'solr_core' => 'HIDDEN',
'solr_timeout' => '30',
'nextant_version' => '0.6.6 (beta)',
'index_files' => '1',
'index_files_needed' => '0',
'index_files_max_size' => '40',
'index_files_tree' => '1',
'index_files_sharelink' => '1',
'index_files_external' => '0',
'index_files_encrypted' => '0',
'index_files_filters_text' => '1',
'index_files_filters_pdf' => '1',
'index_files_filters_office' => '1',
'index_files_filters_image' => '0',
'index_files_filters_audio' => '0',
'index_files_filters_extensions' => '',
'display_result' => '2',
'replace_core_search' => 0,
'current_docs' => 93199,
'current_segments' => 22,
'bookmarks_app_enabled' => false,
'index_bookmarks' => '1',
'index_bookmarks_needed' => '0',
'index_live' => '1',
'index_live_queuekey' => '540???',
'index_delay' => '2',
'index_locked' => '1479409886',
'index_files_last' => '1479339676',
'index_files_last_format' => 'Wed, 16 Nov 2016 23:41:16 +0000',
'index_bookmarks_last' => '1479339676',
'index_bookmarks_last_format' => 'Wed, 16 Nov 2016 23:41:16 +0000',
'source' => 'check',
)
Pinging 10.72.0.2:8983/solr/nextant2 : ok
Checking Solr schema fields

  • Checking dynamic-field 'nextant_attr_*' : ok
  • Checking field 'nextant_path' : ok
  • Checking field 'text' : ok
  • Checking field 'nextant_owner' : ok
  • Checking field 'nextant_mtime' : ok
  • Checking field 'nextant_share' : ok
  • Checking field 'nextant_sharegroup' : ok
  • Checking field 'nextant_deleted' : ok
  • Checking field 'nextant_source' : ok
  • Checking field 'nextant_tags' : ok
  • Checking field 'nextant_extracted' : ok
  • Checking field 'nextant_ocr' : ok
  • Checking field 'nextant_unmounted' : ok
  • Checking field-type 'text_general' : fail
@ArtificialOwl
Member

First, you should fix the 'text_general' field-type:

./occ nextant:check --fix

Then have a look at this thread:

https://help.nextcloud.com/t/help-with-nextant/5124

@huntersli
Author

@daita thank you for your fast response. I must admit I don't really know what I am doing with Solr/nextant. I have started looking at the thread you posted. However, regarding your first recommendation, the check --fix command does not appear to fix the text_general field-type. The command runs as below, but when the check is re-run it still fails, every time.

root@webserver:~# sudo -u www-data php /var/www/owncloud/occ nextant:check --fix
Pinging 10.72.0.2:8983/solr/nextant2 : ok
Checking Solr schema fields

  • Checking dynamic-field 'nextant_attr_*' : ok
  • Checking field 'nextant_path' : ok
  • Checking field 'text' : ok
  • Checking field 'nextant_owner' : ok
  • Checking field 'nextant_mtime' : ok
  • Checking field 'nextant_share' : ok
  • Checking field 'nextant_sharegroup' : ok
  • Checking field 'nextant_deleted' : ok
  • Checking field 'nextant_source' : ok
  • Checking field 'nextant_tags' : ok
  • Checking field 'nextant_extracted' : ok
  • Checking field 'nextant_ocr' : ok
  • Checking field 'nextant_unmounted' : ok
  • Checking field-type 'text_general' : fail
    -> Fixing field-type 'text_general'

Your solr contains 93199 documents :

  • 93199 files
  • 0 bookmarks
  • 24 segments

Just a little more information for you: I did have to remove and re-add the nextant app. This resolved some of the problems I was having.
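
To compare runs before and after --fix without reading the output by eye, the check lines can be filtered for anything that is not "ok". This is only a minimal sketch: the line pattern is inferred from the output pasted in this thread, not from nextant's source, and `failing_checks` is a hypothetical helper name.

```python
import re

# Matches lines such as "• Checking field-type 'text_general' : fail".
# Assumption: the format is inferred from the pasted output, not from nextant itself.
CHECK_LINE = re.compile(
    r"Checking\s+(?P<kind>[\w-]+)\s+'(?P<name>[^']+)'\s*:\s*(?P<status>\w+)"
)


def failing_checks(output: str) -> list[tuple[str, str]]:
    """Return (kind, name) for every check line whose status is not 'ok'."""
    failures = []
    for line in output.splitlines():
        match = CHECK_LINE.search(line)
        if match and match.group("status") != "ok":
            failures.append((match.group("kind"), match.group("name")))
    return failures


if __name__ == "__main__":
    sample = """
      • Checking field 'nextant_unmounted' : ok
      • Checking field-type 'text_general' : fail
    """
    print(failing_checks(sample))
```

Feeding it the output captured before and after the fix would show whether the list of failing items actually changed.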

@ArtificialOwl
Member

ArtificialOwl commented Nov 17, 2016

Hmm, thanks for the report. I guess there is a conflict with the default field-type from the default config of your Solr core. I'll see what I can do to fix this in the next release.

Note that I do not think this creates any major issue; nextant should still work.
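
One way to see what the core currently defines for 'text_general' is to read it back through Solr's Schema API. The sketch below assumes the Schema API is reachable on the core and that the host and core name are the ones shown in the ping line above; `fieldtype_url` and `fetch_fieldtype` are hypothetical helper names, and this is not how nextant itself performs the check.

```python
import json
from urllib.request import urlopen


def fieldtype_url(solr_base: str, core: str, name: str) -> str:
    """Build the Schema API URL for a single field-type definition."""
    return f"{solr_base.rstrip('/')}/{core}/schema/fieldtypes/{name}"


def fetch_fieldtype(solr_base: str, core: str, name: str, timeout: float = 30.0) -> dict:
    """Fetch the field-type definition as a dict (performs a network request)."""
    url = fieldtype_url(solr_base, core, name) + "?wt=json"
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: the definition sits under the "fieldType" key of the response.
    return payload.get("fieldType", payload)


if __name__ == "__main__":
    # Only prints the URL; call fetch_fieldtype() on a host that can reach Solr.
    print(fieldtype_url("http://10.72.0.2:8983/solr", "nextant2", "text_general"))
```

Comparing the returned analyzers before and after running --fix would show whether the definition changes at all.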

@ArtificialOwl ArtificialOwl added this to the 0.10.x milestone Nov 17, 2016
@ArtificialOwl ArtificialOwl self-assigned this Nov 17, 2016
@ArtificialOwl
Member

In the meantime, can you check your Solr logs when you run nextant:check --fix? Does it produce any error message?

@huntersli
Author

No errors are produced, @daita. I have run the check with verbose logging. I did have a large number of the errors below:

LukeRequestHandler Error getting file length for [segments_97]

java.nio.file.NoSuchFileException: /var/solr/data/nextant2/data/index/segments_97
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:144)
at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99)
at java.nio.file.Files.readAttributes(Files.java:1737)
at java.nio.file.Files.size(Files.java:2332)
at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:210)
at org.apache.lucene.store.NRTCachingDirectory.fileLength(NRTCachingDirectory.java:124)
at org.apache.solr.handler.admin.LukeRequestHandler.getFileLength(LukeRequestHandler.java:604)
at org.apache.solr.handler.admin.LukeRequestHandler.getIndexInfo(LukeRequestHandler.java:592)
at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:137)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2102)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)

However, I don't think they are related. Unfortunately, I am currently moving the VM to a new host, so I will not be able to troubleshoot this further until that is complete.

Thank you for all your help
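
To put a number on "a large number" of these errors, the log text can be tallied by the file each NoSuchFileException names. This is a rough sketch that only counts occurrences and says nothing about the cause; the regular expression is inferred from the trace pasted above, and `missing_files` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches "java.nio.file.NoSuchFileException: /path/to/file" as in the trace above.
MISSING_FILE = re.compile(r"java\.nio\.file\.NoSuchFileException:\s+(?P<path>\S+)")


def missing_files(log_text: str) -> Counter:
    """Count how often each missing file path appears in the log text."""
    return Counter(m.group("path") for m in MISSING_FILE.finditer(log_text))


if __name__ == "__main__":
    sample = (
        "java.nio.file.NoSuchFileException: /var/solr/data/nextant2/data/index/segments_97\n"
        "at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)\n"
    )
    for path, count in missing_files(sample).most_common():
        print(count, path)
```

The log text would come from wherever your Solr install writes its log file; that location is not given in this thread.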

@ArtificialOwl
Member

Can you try this version: https://github.com/nextcloud/nextant/releases/tag/0.10.0-rc ?

@ArtificialOwl
Member

0.10.0 is out and should fix this issue
