Reorganize Solr setup docs
1. Put the Solr setup docs inline in the Source install page

2. Remove the single core Solr setup docs, support only multi core

3. Add new docs for creating a second Solr core

I've given the instructions assuming CKAN has a single schema.xml file
(and not schema-1.4.xml, schema-2.0.xml, etc.) which I don't think is
correct at the moment, but I think we're changing to it.

If we're changing to a single schema.xml file, then the package and
source upgrade instructions that tell you to update your schema.xml
symlink can be deleted.

Also renamed the default Solr core and schema from ckan-schema-2.0 to
ckan_default. It's not one Solr core per CKAN schema, it's one Solr core
per CKAN instance, which makes more sense to me.

These new docs are untested.
Sean Hammond committed Nov 15, 2013
1 parent 35e5729 commit 66340f8
Showing 5 changed files with 188 additions and 283 deletions.
2 changes: 1 addition & 1 deletion doc/configuration.rst
@@ -392,7 +392,7 @@ solr_url

Example::

- solr_url = http://solr.okfn.org:8983/solr/ckan-schema-2.0
+ solr_url = http://solr.okfn.org:8983/solr/ckan_default

Default value: ``http://solr.okfn.org:8983/solr``
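
As a rough illustration of how the ``solr_url`` setting might be read back and used, here is a minimal sketch. It assumes the option lives in an ``[app:main]`` section of the CKAN config file, and the function names ``read_solr_url`` and ``core_admin_url`` are hypothetical helpers, not part of CKAN. The ``/admin/`` suffix follows the core admin URL used in the install instructions.

```python
import configparser


def read_solr_url(ini_text, section="app:main"):
    """Return the solr_url option from ini-style text, or None if it is unset.

    The section name "app:main" is an assumption about the config layout.
    """
    # interpolation=None so that literal '%' characters elsewhere in the
    # file (for example in logging format strings) are not interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get(section, "solr_url", fallback=None)


def core_admin_url(solr_url):
    """Build the admin page URL for the Solr core that solr_url points to."""
    return solr_url.rstrip("/") + "/admin/"


example = """
[app:main]
solr_url = http://127.0.0.1:8983/solr/ckan_default
"""

url = read_solr_url(example)
print(url)
print(core_admin_url(url))
```

Opening the printed admin URL in a browser is one way to confirm that the setting points at a real core rather than at the bare Solr root.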

189 changes: 179 additions & 10 deletions doc/install-from-source.rst
@@ -18,7 +18,7 @@ work on CKAN.
If you're using a Debian-based operating system (such as Ubuntu) install the
required packages with this command::

- sudo apt-get install python-dev postgresql libpq-dev python-pip python-virtualenv git-core solr-jetty openjdk-6-jdk
+ sudo apt-get install python-dev postgresql libpq-dev python-pip python-virtualenv git-core

If you're not using a Debian-based operating system, find the best way to
install the following packages on your operating system (see
@@ -195,28 +195,147 @@ site_id

ckan.site_id = default

.. _setup solr:

5. Setup Solr
~~~~~~~~~~~~~

Follow the instructions in :ref:`solr-single` or :ref:`solr-multi-core` to
setup Solr, then change the ``solr_url`` option in your CKAN config file to
point to your Solr server, for example::
CKAN uses Solr_ as its search platform, and uses a customized Solr schema file
that takes into account CKAN's specific search needs. Now that we have CKAN
installed, we need to install and configure Solr.

solr_url=http://127.0.0.1:8983/solr
.. _Solr: http://lucene.apache.org/solr/

.. toctree::
:hidden:
.. note::

These instructions explain how to deploy Solr using the Jetty web
server, but CKAN doesn't require Jetty - you can deploy Solr to another web
server, such as Tomcat, if that's convenient on your operating system.

#. Install Solr::

sudo apt-get install solr-jetty openjdk-6-jdk

Edit the Jetty configuration file (``/etc/default/jetty``) and change the
following variables::

NO_START=0 # (line 4)
JETTY_HOST=127.0.0.1 # (line 15)
JETTY_PORT=8983 # (line 18)

Start the Jetty server::

sudo service jetty start

You should now see a welcome page from Solr if you open
http://localhost:8983/solr/ in your web browser (replace localhost with
your server address if needed).

.. note::

If you get the message ``Could not start Jetty servlet engine because no
Java Development Kit (JDK) was found.`` then you will have to edit the
``JAVA_HOME`` setting in ``/etc/default/jetty`` to point to your machine's
JDK install location. For example::

JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64/

or::

JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386/

#. Create the file ``/usr/share/solr/solr.xml``, with the following contents::

<solr persistent="true" sharedLib="lib">
<cores adminPath="/admin/cores">
<core name="ckan_default" instanceDir="ckan_default">
<property name="dataDir" value="/var/lib/solr/data/ckan_default" />
</core>
</cores>
</solr>

This file lists the different Solr cores. In this example we have just a
single core called ``ckan_default``.

.. note::

Solr can also be set up to have multiple configurations and search indexes
on the same machine. Each configuration is called a Solr *core*. Having
multiple cores is useful when you want different applications or different
versions of CKAN to share the same Solr instance. Each core will
have a different URL, for example::

http://localhost:8983/solr/ckan_default
http://localhost:8983/solr/some-other-site

If you've set up a second CKAN instance on the same machine and want to
create a second Solr core for it,
see :doc:`/howtos/create-a-second-solr-core`.

#. Create the data directory for your Solr core by running this command in a
terminal::

sudo -u jetty mkdir /var/lib/solr/data/ckan_default

This is the directory where Solr will store the search index files for
our core.

#. Create the directory ``/etc/solr/ckan_default``, and move the
``/etc/solr/conf`` directory into it::

sudo mkdir /etc/solr/ckan_default
sudo mv /etc/solr/conf /etc/solr/ckan_default/

This directory holds the configuration files for your Solr core.

#. Replace the ``/etc/solr/ckan_default/conf/schema.xml`` file with a symlink to
CKAN's ``schema.xml`` file::

sudo mv /etc/solr/ckan_default/conf/schema.xml /etc/solr/ckan_default/conf/schema.xml.bak
sudo ln -s /usr/lib/ckan/default/src/ckan/ckan/config/solr/schema-2.0.xml /etc/solr/ckan_default/conf/schema.xml

#. Edit ``/etc/solr/ckan_default/conf/solrconfig.xml`` and change the
``<dataDir>`` tag to this::

<dataDir>${dataDir}</dataDir>

This configures our ``ckan_default`` core to use the data directory you
specified for it in ``solr.xml``.

#. Create the directory ``/usr/share/solr/ckan_default`` and put a symlink
to the ``conf`` directory in it::

sudo mkdir /usr/share/solr/ckan_default
sudo ln -s /etc/solr/ckan_default/conf /usr/share/solr/ckan_default/conf

.. todo:: What is this directory for?

#. Restart Jetty::

sudo service jetty restart

You should now see your newly created ``ckan_default`` core if you open
http://localhost:8983/solr/ckan_default/admin/ in your web browser.
You can click on the *schema* link on this page to check that the core is
using the right schema (you should see ``<schema name="ckan" version="2.0">``
near the top of the ``schema.xml`` file). The http://localhost:8983/solr/
page will list all of your configured Solr cores.

#. Finally, change the ``solr_url`` setting in your |development.ini| or
|production.ini| file to point to your new Solr core, for example::

solr_url = http://127.0.0.1:8983/solr/ckan_default

If you have trouble when setting up Solr, see :ref:`solr troubleshooting`
below.
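
The ``solr.xml`` file from the steps above follows a regular pattern, with one ``<core>`` element per CKAN instance. A minimal sketch of generating it for one or more cores is shown below. The function name ``build_solr_xml`` and the ``data_root`` parameter are hypothetical, and the output is not indented the way the hand-written example is.

```python
import xml.etree.ElementTree as ET


def build_solr_xml(core_names, data_root="/var/lib/solr/data"):
    """Return solr.xml content with one <core> element per name.

    Each core gets an instanceDir equal to its name and a dataDir
    property under data_root, mirroring the single-core example.
    """
    solr = ET.Element("solr", {"persistent": "true", "sharedLib": "lib"})
    cores = ET.SubElement(solr, "cores", {"adminPath": "/admin/cores"})
    for name in core_names:
        core = ET.SubElement(cores, "core", {"name": name, "instanceDir": name})
        ET.SubElement(
            core,
            "property",
            {"name": "dataDir", "value": data_root + "/" + name},
        )
    # encoding="unicode" makes tostring() return a str instead of bytes.
    return ET.tostring(solr, encoding="unicode")


# One core per CKAN instance: the default one plus a second site.
print(build_solr_xml(["ckan_default", "some-other-site"]))
```

Each core named here still needs its own data directory and ``conf`` directory, created as in the steps above.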

solr-setup

.. _postgres-init:

6. Create database tables
~~~~~~~~~~~~~~~~~~~~~~~~~

Now that you have a configuration file that has the correct settings for your
database, you can create the database tables:
Create the |postgres| database tables that CKAN needs:

.. parsed-literal::
@@ -277,3 +396,53 @@ Now that you've installed CKAN, you should:
as Apache or Nginx. See :doc:`deployment`.

* Begin using and customizing your site, see :doc:`/getting-started`.

------------------------------
Source install troubleshooting
------------------------------

.. _solr troubleshooting:

Solr setup troubleshooting
==========================

Solr requests and errors are logged in the web server log.

* For Jetty servers, they are located in::

/var/log/jetty/<date>.stderrout.log

* For Tomcat servers, they are located in::

/var/log/tomcat6/catalina.<date>.log
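
To pull the errors out of a Jetty log quickly, something like the following sketch could be used. It assumes the log file naming shown above and that error lines start with ``SEVERE:``, as in the sample error further down. The helper names ``latest_jetty_log`` and ``severe_lines`` are hypothetical.

```python
import glob
import os


def latest_jetty_log(log_dir="/var/log/jetty"):
    """Return the most recently modified *.stderrout.log file, or None."""
    paths = glob.glob(os.path.join(log_dir, "*.stderrout.log"))
    return max(paths, key=os.path.getmtime) if paths else None


def severe_lines(log_text):
    """Return the lines of log_text that start with 'SEVERE:'."""
    return [line for line in log_text.splitlines() if line.startswith("SEVERE:")]


if __name__ == "__main__":
    path = latest_jetty_log()
    if path is None:
        print("No Jetty log found")
    else:
        # errors="replace" avoids failing on unexpected bytes in the log.
        with open(path, errors="replace") as log_file:
            for line in severe_lines(log_file.read()):
                print(line)
```

Reading the log may require running the script as a user with access to ``/var/log/jetty``.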

Some problems that you may encounter during the install:

* When setting up a multi-core Solr instance, no cores are shown when visiting the
Solr index page, and the admin interface returns a 404 error.

Check whether the web server error log contains an error similar to this one::

WARNING: [iatiregistry.org] Solr index directory '/usr/share/solr/iatiregistry.org/data/index' doesn't exist. Creating new index...
07-Dec-2011 18:06:33 org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: Cannot create directory: /usr/share/solr/iatiregistry.org/data/index
[...]

The ``dataDir`` is not properly configured. With our setup the data directory should
be under ``/var/lib/solr/data``. Make sure that you defined the correct ``dataDir``
in the ``solr.xml`` file and that in the ``solrconfig.xml`` file you have the
following configuration option::

<dataDir>${dataDir}</dataDir>

* When running Solr it says ``Unable to find a javac compiler; com.sun.tools.javac.Main is not on the classpath. Perhaps JAVA_HOME does not point to the JDK.``

See the note above about ``JAVA_HOME``. Alternatively, you may not have installed the JDK. Check whether ``javac`` is installed::

which javac

If it isn't, install the JDK::

sudo apt-get install openjdk-6-jdk

and restart Solr.
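
For the ``dataDir`` problem described above, a quick consistency check is to read ``solr.xml`` and confirm that each core's data directory actually exists. The following is a minimal sketch under that assumption. The function names are hypothetical, and the check covers existence only, not whether the ``jetty`` user can write to the directory.

```python
import os
import xml.etree.ElementTree as ET


def core_data_dirs(solr_xml_text):
    """Map each core name in solr.xml to its dataDir property value."""
    root = ET.fromstring(solr_xml_text)
    dirs = {}
    for core in root.iter("core"):
        for prop in core.findall("property"):
            if prop.get("name") == "dataDir":
                dirs[core.get("name")] = prop.get("value")
    return dirs


def missing_data_dirs(solr_xml_text):
    """Return the cores whose dataDir does not exist on this machine."""
    return {
        name: path
        for name, path in core_data_dirs(solr_xml_text).items()
        if not os.path.isdir(path)
    }


if __name__ == "__main__":
    # Path taken from the install steps; adjust if solr.xml lives elsewhere.
    with open("/usr/share/solr/solr.xml") as solr_xml:
        for name, path in missing_data_dirs(solr_xml.read()).items():
            print("Core %s: data directory %s does not exist" % (name, path))
```

A core reported here would need its directory created as in the install steps, for example with ``sudo -u jetty mkdir``.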
