Adaptations to support Solr 5.0.0 along with 4.10.2
    - deploying Cores with new managed-schema (not schema.xml) and the Schema REST API
    - edited README and installing_search_engines.rst for both solr 4.10.2 and 5.0.0
    - changed build_solr_schema command, plus updated tests
Elias committed Apr 1, 2015
1 parent 97475f2 commit d5ca733
Showing 6 changed files with 191 additions and 85 deletions.
59 changes: 44 additions & 15 deletions docs/installing_search_engines.rst
@@ -11,30 +11,58 @@ Official Download Location: http://www.apache.org/dyn/closer.cgi/lucene/solr/

Solr is Java but comes in a pre-packaged form that requires very little other
than the JRE and Jetty. It's very performant and has an advanced featureset.
Haystack suggests using Solr 3.5+, though it's possible to get it working on
Solr 1.4 with a little effort. Installation is relatively simple::
Haystack suggests using Solr 5.0+, though it's possible to get it working on
Solr 1.4 with a little effort. Installation is relatively simple:

Install the latest version 5.0.0
--------------------------------

::

curl -LO https://archive.apache.org/dist/lucene/solr/5.0.0/solr-5.0.0.tgz
tar xvzf solr-5.0.0.tgz
cd solr-5.0.0
bin/solr start


Then create at least one "cloud" core:
::
sudo -u solr bin/solr create -c mycorename

Your core is now accessible by default at http://localhost:8983/solr/#/mycorename

Install version 4.10.2
-----------------------
::

curl -LO https://archive.apache.org/dist/lucene/solr/4.10.2/solr-4.10.2.tgz
tar xvzf solr-4.10.2.tgz
cd solr-4.10.2
cd example
java -jar start.jar

You'll need to revise your schema. You can generate this from your application
(once Haystack is installed and setup) by running
``./manage.py build_solr_schema``. Take the output from that command and place
it in ``solr-4.10.2/example/solr/collection1/conf/schema.xml``. Then restart Solr.
Your server is now accessible at http://localhost:8983/solr/

.. note::
``build_solr_schema`` uses a template to generate ``schema.xml``. Haystack
provides a default template using some sensible defaults. If you would like
to provide your own template, you will need to place it in
``search_configuration/solr.xml``, inside a directory specified by your app's
``TEMPLATE_DIRS`` setting. Examples::
Update your schema from the indexes
------------------------------------

You can update the schema from your application (once Haystack is installed and set up) by running:

- for Solr 5.0.0 and later: ``./manage.py build_solr_schema``
- for the previous version (4.10.2): ``./manage.py build_solr_schema --stdout`` or ``./manage.py build_solr_schema --filename schema.xml``

/myproj/myapp/templates/search_configuration/solr.xml
# ...or...
/myproj/templates/search_configuration/solr.xml
With Solr 5.0.0, the command uses the Schema REST API described at https://wiki.apache.org/solr/SchemaRESTAPI.
Solr 5.0.0 configures many default dynamicFields and fieldTypes when the core is created.
If you like, you can customise these types through the core's Schema REST API or the HTTP admin interface.
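
As a rough illustration of what a Schema REST API request body looks like, the sketch below builds an ``add-field`` payload from one or more field definitions. The helper name ``build_add_field_payload`` and the sample ``title`` field are hypothetical and not part of Haystack; only the ``add-field`` key and the attribute names (``name``, ``type``, ``indexed``, ``stored``, ``multiValued``) are taken from this commit.

```python
import json


def build_add_field_payload(fields):
    """Return the JSON body for an 'add-field' request (hypothetical helper)."""
    # Accept a single field dict or a list of them, as add_field() does in this commit.
    if not isinstance(fields, list):
        fields = [fields]
    return json.dumps({"add-field": fields}, sort_keys=True)


# Sample field definition, assumed for illustration only.
example_field = {
    "name": "title",
    "type": "text_en",
    "indexed": "true",
    "stored": "true",
    "multiValued": "false",
}

payload = build_add_field_payload(example_field)
print(payload)
```

The body would then be sent with a ``content-type: application/json`` header to the core's ``schema`` endpoint.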

With the previous version (4.10.2), pass ``--filename schema.xml`` to export the schema to a file that you can then copy to your Solr installation.
``build_solr_schema`` uses a template to generate ``schema.xml``.
Haystack provides a default template with sensible defaults. If you would like to provide your own template, place it in ``search_configuration/solr.xml``, inside a directory listed in your app's ``TEMPLATE_DIRS`` setting.

Examples:
/myproj/myapp/templates/search_configuration/solr.xml
# ...or...
/myproj/templates/search_configuration/solr.xml

You'll also need a Solr binding, ``pysolr``. The official ``pysolr`` package,
distributed via PyPI, is the best version to use (2.1.0+). Place ``pysolr.py``
@@ -220,3 +248,4 @@ http://github.com/notanumber/xapian-haystack/tree/master. Installation
instructions can be found on that page as well. The backend, written
by David Sauve (notanumber), fully implements the `SearchQuerySet` API and is
an excellent alternative to Solr.

45 changes: 42 additions & 3 deletions haystack/backends/solr_backend.py
@@ -3,6 +3,13 @@
from __future__ import absolute_import, division, print_function, unicode_literals

import warnings
import json

# Try to import urljoin from the Python 3 reorganized stdlib first:
try:
from urllib.parse import urljoin
except ImportError:
from urlparse import urljoin

from django.conf import settings
from django.core.exceptions import ImproperlyConfigured
@@ -48,6 +55,14 @@ def __init__(self, connection_alias, **connection_options):
self.conn = Solr(connection_options['URL'], timeout=self.timeout, **connection_options.get('KWARGS', {}))
self.log = logging.getLogger('haystack')

def get_schema_admin(self):
'''
SolrSchemaAdmin singleton
'''
if not hasattr(self, '_schema_admin'):
self._schema_admin = SolrSchemaAdmin(self.conn.url, self.conn.session)
return self._schema_admin

def update(self, index, iterable, commit=True):
docs = []

@@ -427,11 +442,11 @@ def build_schema(self, fields):

for field_name, field_class in fields.items():
field_data = {
'field_name': field_class.index_fieldname,
'name': field_class.index_fieldname,
'type': 'text_en',
'indexed': 'true',
'stored': 'true',
'multi_valued': 'false',
'multiValued': 'false',
}

if field_class.document is True:
@@ -456,7 +471,7 @@ def build_schema(self, fields):
field_data['type'] = 'location'

if field_class.is_multivalued:
field_data['multi_valued'] = 'true'
field_data['multiValued'] = 'true'

if field_class.stored is False:
field_data['stored'] = 'false'
@@ -714,3 +729,27 @@ def run_mlt(self, **kwargs):
class SolrEngine(BaseEngine):
backend = SolrSearchBackend
query = SolrSearchQuery


class SolrSchemaAdmin(object):
    """
    Handles Schema API operations: see https://wiki.apache.org/solr/SchemaRESTAPI
    """
    def __init__(self, url, session):
        super(SolrSchemaAdmin, self).__init__()
        self.url = url
        self.session = session

    def _post(self, url, data=None, headers=None):
        """
        Post a JSON-encoded body to a Schema endpoint
        """
        # Copy the headers so neither the caller's dict nor a shared default is mutated.
        headers = dict(headers or {})
        headers['content-type'] = 'application/json'
        return self.session.post(url, data=json.dumps(data or {}), headers=headers)

    def add_field(self, fields):
        if not isinstance(fields, list):
            fields = [fields]
        # urljoin() drops the last path segment of a base URL that has no trailing
        # slash, so add one to keep the core name in the path.
        return self._post(urljoin(self.url.rstrip('/') + '/', 'schema'), {'add-field': fields})
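
One detail worth checking in ``add_field`` is how ``urljoin`` treats the core URL. The sketch below shows that a base URL without a trailing slash loses its last path segment, which here is the core name. The ``schema_endpoint`` helper and the ``mycorename`` URL are assumptions for illustration; whether the configured URL carries a trailing slash depends on the project's settings.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2


def schema_endpoint(core_url):
    """Hypothetical helper: build the schema URL with or without a trailing slash."""
    return urljoin(core_url.rstrip('/') + '/', 'schema')


# Without a trailing slash, the core name is replaced by 'schema'.
without_slash = urljoin('http://localhost:8983/solr/mycorename', 'schema')
# With a trailing slash, 'schema' is appended below the core.
with_slash = urljoin('http://localhost:8983/solr/mycorename/', 'schema')

print(without_slash)  # http://localhost:8983/solr/schema
print(with_slash)     # http://localhost:8983/solr/mycorename/schema
print(schema_endpoint('http://localhost:8983/solr/mycorename'))
```

Normalising the base URL before joining keeps the request pointed at the core's own schema in both cases.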
60 changes: 49 additions & 11 deletions haystack/management/commands/build_solr_schema.py
@@ -9,6 +9,7 @@
from django.core.management.base import BaseCommand
from django.template import Context, loader


from haystack import constants
from haystack.backends.solr_backend import SolrSearchBackend

@@ -17,21 +18,45 @@ class Command(BaseCommand):
help = "Generates a Solr schema that reflects the indexes."
base_options = (
make_option("-f", "--filename", action="store", type="string", dest="filename",
help='If provided, directs output to a file instead of stdout.'),
help='For Solr versions before 5.0.0: if provided, writes the XML schema to this file.'),
make_option("-s", "--stdout", action="store_true", dest="stdout",
help='For Solr versions before 5.0.0: print schema.xml to stdout.', default=False),
make_option("-u", "--using", action="store", type="string", dest="using", default=constants.DEFAULT_ALIAS,
help='If provided, chooses a connection to work with.'),
)
option_list = BaseCommand.option_list + base_options

def handle(self, **options):
"""Generates a Solr schema that reflects the indexes."""
from haystack import connections, connection_router

using = options.get('using')
schema_xml = self.build_template(using=using)
backend = connections[using].get_backend()

if options.get('filename'):
self.write_file(options.get('filename'), schema_xml)
else:
self.print_stdout(schema_xml)
if not isinstance(backend, SolrSearchBackend):
raise ImproperlyConfigured("'%s' isn't configured as a SolrEngine." % backend.connection_alias)

if options.get('filename') or options.get('stdout'):
schema_xml = self.build_template(using=using)
if options.get('filename'):
self.write_file(options.get('filename'), schema_xml)
else:
self.print_schema(schema_xml)
return

content_field_name, fields = backend.build_schema(connections[using].get_unified_index().all_searchfields())

django_fields = [
dict(name=constants.ID, type="string", indexed="true", stored="true", multiValued="false", required="true"),
dict(name=constants.DJANGO_CT, type="string", indexed="true", stored="true", multiValued="false"),
dict(name=constants.DJANGO_ID, type="string", indexed="true", stored="true", multiValued="false"),
dict(name="_version_", type="long", indexed="true", stored="true"),
]

admin = backend.get_schema_admin()
for field in fields + django_fields:
resp = admin.add_field(field)
self.log(field, resp, backend)

def build_context(self, using):
from haystack import connections, connection_router
@@ -55,7 +80,12 @@ def build_template(self, using):
c = self.build_context(using=using)
return t.render(c)

def print_stdout(self, schema_xml):
def write_file(self, filename, schema_xml):
schema_file = open(filename, 'w')
schema_file.write(schema_xml)
schema_file.close()

def print_schema(self, schema_xml):
sys.stderr.write("\n")
sys.stderr.write("\n")
sys.stderr.write("\n")
@@ -64,7 +94,15 @@ def print_stdout(self, schema_xml):
sys.stderr.write("\n")
print(schema_xml)

def write_file(self, filename, schema_xml):
schema_file = open(filename, 'w')
schema_file.write(schema_xml)
schema_file.close()
    def log(self, field, response, backend):
        try:
            message = response.json()
        except ValueError:
            raise Exception('Unable to decode the Solr API response. Are you sure you started Solr and created the configured core (%s)?' % backend.conn.url)

        if 'errors' in message:
            sys.stdout.write("%s.\n" % [" ".join(err.get('errorMessages')) for err in message['errors']])
        elif 'responseHeader' in message and 'status' in message['responseHeader']:
            sys.stdout.write("Successfully created the field %s\n" % field['name'])
        else:
            sys.stdout.write("%s.\n" % message)
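
The branching in ``log`` can be exercised without a running Solr server. The sketch below restates it as a pure function over already-decoded response dictionaries. The function name ``summarize_schema_response`` and the sample dictionaries are hypothetical; the response shapes (``errors`` with ``errorMessages``, ``responseHeader`` with ``status``) are assumed from the keys this commit reads, not verified against a live Schema API.

```python
def summarize_schema_response(message, field_name):
    """Hypothetical helper mirroring the three branches of Command.log()."""
    if 'errors' in message:
        # Each error entry is assumed to carry a list of message strings.
        return "; ".join(
            " ".join(err.get('errorMessages', [])) for err in message['errors']
        )
    if 'status' in message.get('responseHeader', {}):
        return "Successfully created the field %s" % field_name
    # Unknown shape: fall back to showing the raw message.
    return str(message)


# Sample payloads, invented for illustration only.
error_response = {'errors': [{'errorMessages': ['Field already exists']}]}
ok_response = {'responseHeader': {'status': 0, 'QTime': 5}}

print(summarize_schema_response(error_response, 'title'))
print(summarize_schema_response(ok_response, 'title'))
```

Returning a string instead of writing to ``sys.stdout`` is only a choice made here to keep the sketch easy to test.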
10 changes: 5 additions & 5 deletions haystack/templates/search_configuration/solr.xml
@@ -30,10 +30,10 @@
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" omitNorms="true" sortMissingLast="true" positionIncrementGap="0"/>
<fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sint" class="solr.TrieIntField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="slong" class="solr.TrieLongField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sfloat" class="solr.TrieFloatField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="sdouble" class="solr.TrieDoubleField" sortMissingLast="true" omitNorms="true"/>

<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
@@ -151,7 +151,7 @@
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>

{% for field in fields %}
<field name="{{ field.field_name }}" type="{{ field.type }}" indexed="{{ field.indexed }}" stored="{{ field.stored }}" multiValued="{{ field.multi_valued }}" />
<field name="{{ field.name }}" type="{{ field.type }}" indexed="{{ field.indexed }}" stored="{{ field.stored }}" multiValued="{{ field.multiValued }}" />
{% endfor %}
</fields>
