It would be nice to have indexable fields of type string in Solr. Right now, if you want a string field, it either cannot be indexed or has to be faceted.
Can you describe the use case here? I'm not sure how just strings are useful. The only use I'm aware of would be matching just exact words, which you can accomplish by making the field faceted & searching in the shadowed facet field (<fieldname>_exact).
I just ran into this problem - I want to order alphabetically by a specific field, but Solr can't order by fields that have been tokenized. I'll have to do the fieldname_exact hack, which is frustrating because my automated deployments currently use the output of build_solr_schema, so now I'll have to change them to modify the schema.xml on its way through (or save the output of build_solr_schema in source control).
Hmm... looks like I can override the solr.xml template myself by dropping my own search_configuration/solr.xml file into my template directory - I'll try that for the moment.
I ran into this issue as well. To sort a CharField, it must be indexed, but an indexed CharField produces a "text" type, which can produce a java.lang.ArrayIndexOutOfBoundsException when sorted due to being tokenized by the WhitespaceTokenizerFactory.
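To illustrate why a tokenized field is awkward to sort on, here is a rough sketch in plain Python. It is not Solr or Haystack code, and the helper names and sample titles are made up for illustration. The point is only that whitespace tokenization yields several terms per value, so a document has no single sort key, whereas an untokenized string yields exactly one.

```python
def whitespace_tokens(value):
    """Roughly what a whitespace tokenizer does: one term per word."""
    return value.split()


def keyword_token(value):
    """Roughly what an untokenized (string) field keeps: the whole value."""
    return [value]


# Hypothetical sample data.
titles = ["The Hobbit", "A Tale of Two Cities", "Dune"]

# Tokenized: most documents end up with several terms, so there is no
# single obvious value to sort a document by.
tokenized = {t: whitespace_tokens(t) for t in titles}

# Untokenized: exactly one term per document, which sorts cleanly.
untokenized = {t: keyword_token(t) for t in titles}
ordered = sorted(titles, key=lambda t: keyword_token(t)[0])
```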
I was able to make a simple change to my copy of Haystack (1.2.4) so I could specify the field type.
haystack/fields.py - add an index_fieldtype attribute to SearchField
@@ -21,7 +21,7 @@
     def __init__(self, model_attr=None, use_template=False, template_name=None,
                  document=False, indexed=True, stored=True, faceted=False,
                  default=NOT_PROVIDED, null=False, index_fieldname=None,
-                 facet_class=None, boost=1.0, weight=None):
+                 index_fieldtype=None, facet_class=None, boost=1.0, weight=None):
         # Track what the index thinks this field is called.
         self.instance_name = None
         self.model_attr = model_attr
@@ -34,6 +34,7 @@
         self._default = default
         self.null = null
         self.index_fieldname = index_fieldname
+        self.index_fieldtype = index_fieldtype
         self.boost = weight or boost
         self.is_multivalued = False
haystack/backends/solr_backend.py - pass this attribute along to the schema
@@ -360,6 +360,10 @@
                 if field_data['type'] == 'text':
                     field_data['type'] = 'string'
+            # Let the class have the final say on its type.
+            if field_class.index_fieldtype is not None:
+                field_data['type'] = field_class.index_fieldtype
         return (content_field_name, schema_fields)
And then, in my solr.xml template, I added a text_sort field type. I specify index_fieldtype="text_sort" to create a CharField with this type.
+ <fieldType name="text_sort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
+   <analyzer>
+     <tokenizer class="solr.KeywordTokenizerFactory"/>
+     <filter class="solr.LowerCaseFilterFactory"/>
+     <filter class="solr.TrimFilterFactory"/>
+   </analyzer>
+ </fieldType>
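For anyone unsure what that analyzer chain does to a value, here is an approximate plain-Python sketch of it. This is only an illustration of the three steps (keyword tokenizer, lowercase, trim), not how Solr implements them, and text_sort_key and the sample titles are hypothetical.

```python
def text_sort_key(value):
    """Approximate the text_sort analyzer chain from the fieldType above."""
    token = value          # KeywordTokenizerFactory: whole value as one token
    token = token.lower()  # LowerCaseFilterFactory: case-insensitive ordering
    token = token.strip()  # TrimFilterFactory: drop leading/trailing whitespace
    return token


# Hypothetical sample data with mixed case and stray whitespace.
titles = ["  zebra", "Apple", "mango "]
ordered = sorted(titles, key=text_sort_key)
```

Because the whole value stays a single token, each document has exactly one sort key, which is what sorting needs.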
The use case for me was creating a field that could be used for sorting.
I encountered the same problem, namely that I was unable to determine how to make a sortable field for Solr via Haystack.
My solution was to change the schema at build time, converting the field from type="text" to type="string":
./manage.py build_solr_schema | sed 's/<field name=\"result_title_sort\" type=\"text\"/<field name=\"result_title_sort\" type=\"string\"/' > schema.xml
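If you'd rather not depend on sed, the same rewrite can be sketched in Python. This is a minimal sketch under assumptions: retype_field is a made-up helper, and the sample schema lines below are invented for the example rather than copied from real build_solr_schema output.

```python
import re


def retype_field(schema_xml, field_name, new_type="string"):
    """Change type="text" to new_type on the named <field> element only."""
    pattern = r'(<field name="%s" type=)"text"' % re.escape(field_name)
    return re.sub(pattern, r'\1"%s"' % new_type, schema_xml)


# Invented sample input, standing in for generated schema output.
sample = (
    '<field name="result_title_sort" type="text" indexed="true"/>\n'
    '<field name="body" type="text" indexed="true"/>'
)
converted = retype_field(sample, "result_title_sort")
```

Only the named field is touched, so other text fields keep their tokenized type.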
The solution that is described on Stack Overflow works fine, and was very easy to implement once I identified the problem, but it was very confusing before I was able to determine what was going on. So I agree that this is a problem. At the very least, it should be clearer that text fields generated by vanilla Haystack will not be properly sortable by Solr.
Alphabetical sorting seems like something fundamental to a search index. It seems that this issue is still open and our only option is to manually edit the schema file after Haystack generates it. Can someone correct me if there's a better way to handle fields of type "string"?