Setup instructions for local Wikipedia dump needs better guidelines #60

Closed
talentoscope opened this issue Sep 13, 2016 · 5 comments

Comments

@talentoscope

I am following the README.md in data/enwiki. I created the data/example folder and put the dump, its extraction, and the single XML file in it.
Solr was extracted into the data/example directory, as data/example/solr-versionnumber.
The files in data/enwiki have been copied to data/example as instructed, and /data/example/collection1/conf/data-config.xml has been edited to reflect the dump date I appended to the filename.
In /data/example, I tried to run start.jar using "java -Dsolr.solr.home=enwiki -jar start.jar", but I get "WARNING: Nothing to start, exiting ..."

I'm guessing this is due to a folder placement error, but there is no breakdown or full explanation of how it should be structured, or at least it's difficult to follow.

Any guidance on how you did this would be greatly appreciated.

@talentoscope
Author

Realised the mistake: all of this should be in data/enwiki itself.
Still, starting Solr with java gives "Error: Unable to access jarfile start.jar".

This start.jar does not appear to be in the solr/example directory from the download. It is in the server folder, however, but I'm not sure whether just copying a jar file will solve this, so I am trying the 4.6.0 version from the README.
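
A quick way to see where a given Solr download keeps start.jar is to search the unpacked directory. This is only a sketch: `find_start_jar` is a hypothetical helper name, and the path in the example comment is an assumption about where Solr was unpacked.

```shell
# Hypothetical helper: list every start.jar below an unpacked Solr directory.
find_start_jar() {
    solr_dir="$1"   # directory the Solr archive was extracted to
    find "$solr_dir" -type f -name start.jar
}

# Example (the path is an assumption, adjust to your download):
#   find_start_jar "$HOME/solr-4.6.0"
```

If the only hit is under server/ rather than example/, the download has a different layout from the one the README's start command expects.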

@talentoscope
Author

Used version 4.6.0, but I am told it is unable to create collection1. This already exists in data/enwiki, which is symlinked into example/. Is Solr supposed to create it itself, or is there an undocumented issue?

3393 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.CachingDirectoryFactory – looking to close /home/roy/yodaqa/data/enwiki/collection1/data [CachedDir<<refCount=0;path=/home/roy/yodaqa/data/enwiki/collection1/data;done=false>>]
3393 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.CachingDirectoryFactory – Closing directory: /home/roy/yodaqa/data/enwiki/collection1/data
3393 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.CachingDirectoryFactory – looking to close /home/roy/yodaqa/data/enwiki/collection1/data/index [CachedDir<<refCount=0;path=/home/roy/yodaqa/data/enwiki/collection1/data/index;done=false>>]
3393 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.CachingDirectoryFactory – Closing directory: /home/roy/yodaqa/data/enwiki/collection1/data/index
3394 [coreLoadExecutor-3-thread-1] ERROR org.apache.solr.core.CoreContainer – Unable to create core: collection1
org.apache.solr.common.SolrException: RequestHandler init failure
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:834)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:625)
at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:557)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:592)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:271)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:263)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: RequestHandler init failure
at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:167)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:768)
... 11 more
Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:470)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:401)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:526)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:599)
at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:153)
... 12 more

@pasky
Member

pasky commented Sep 13, 2016

To be clear, the example/ subdirectory should be part of solr-4.6.0 and should contain a start.jar. You are expected to symlink yodaqa's data/enwiki/ to enwiki in the example/ subdirectory.
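
The layout described above could be set up roughly like this. It is a sketch, not the project's official procedure: `link_enwiki` is a hypothetical helper, and both directory arguments are assumptions about where yodaqa and Solr live on your machine.

```shell
# Hypothetical helper: link yodaqa's data/enwiki into Solr's example/ directory.
link_enwiki() {
    yodaqa_dir="$1"   # assumed checkout location, e.g. "$HOME/yodaqa"
    solr_dir="$2"     # assumed Solr location, e.g. "$HOME/solr-4.6.0"

    # start.jar has to sit in example/, or "java -jar start.jar" cannot find it.
    if [ ! -f "$solr_dir/example/start.jar" ]; then
        echo "no start.jar in $solr_dir/example" >&2
        return 1
    fi
    if [ ! -d "$yodaqa_dir/data/enwiki" ]; then
        echo "no data/enwiki in $yodaqa_dir" >&2
        return 1
    fi

    # Create the link only if it is not there yet.
    [ -L "$solr_dir/example/enwiki" ] ||
        ln -s "$yodaqa_dir/data/enwiki" "$solr_dir/example/enwiki"
}

# Afterwards, start Solr from example/ with the command quoted in this thread:
#   cd "$solr_dir/example" && java -Dsolr.solr.home=enwiki -jar start.jar
```

The two checks are there because a missing start.jar and a misplaced enwiki directory are exactly the two mistakes reported earlier in this issue.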

If you symlink things around, what might break is

  <lib dir="../../../contrib/dataimporthandler/lib" regex=".*\.jar" />
  <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />

in data/enwiki/collection1/conf/solrconfig.xml - I'd try putting in absolute paths to the solr-4.6.0 directory instead of ../../../
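
One way to try the absolute-path idea without hand-editing is a small sed rewrite. This is a sketch under assumptions: `absolutize_lib_dirs` is a hypothetical helper, the Solr path must be fully spelled out (a leading `~` is expanded by the shell, not by whatever reads the XML file), and the path must not contain `|` or `&` because of how sed treats them.

```shell
# Hypothetical helper: print solrconfig.xml with the "../../../" prefix of
# <lib dir=...> entries replaced by an absolute Solr directory.
absolutize_lib_dirs() {
    config="$1"     # path to solrconfig.xml
    solr_dir="$2"   # absolute Solr directory, no trailing slash
    sed "s|dir=\"\.\./\.\./\.\./|dir=\"$solr_dir/|" "$config"
}

# Example (paths are assumptions): review the output, then replace the file.
#   absolutize_lib_dirs solrconfig.xml /home/me/solr-4.6.0 > solrconfig.xml.new
```

The helper writes to standard output instead of editing in place, so the original config stays untouched until you have checked the result.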

@talentoscope
Author

Tried explicitly stating ~/yodaqa/... etc., but that just concatenated the ../../../contrib/... etc. to the command, which in theory should've worked, going 3 levels up to the contrib folder, but for some reason it just doesn't like it.

Out of ideas now, so I'm trying to start Solr on its own, add the enwiki XML to it using post.jar/post.sh, and then point yodaqa to that instance. It should work, since it's essentially the same thing, and I have copied the collection1 folder contents to the new one. Fingers crossed!

@talentoscope
Author

This worked. Closing report.
