Configure an index #3

Closed
ElPicador opened this issue Nov 30, 2015 · 0 comments

You can retrieve all settings using the getSettings function. The result will contain the following attributes:

Indexing parameters

  • attributesToIndex: (array of strings) The list of fields you want to index.
    If set to null, all textual and numerical attributes of your objects are indexed. Be sure to update it to get optimal results.
    This parameter has two important uses:
    • Limit the attributes to index.
      For example, if you store a binary image in base64, you want to store it and be able to retrieve it, but you don't want to search in the base64 string.
    • Control part of the ranking.
      (see the ranking parameter for full explanation) Matches in attributes at the beginning of the list will be considered more important than matches in attributes further down the list. In one attribute, matching text at the beginning of the attribute will be considered more important than text after. You can disable this behavior if you add your attribute inside unordered(AttributeName). For example, attributesToIndex: ["title", "unordered(text)"].
      You can give two attributes the same priority by passing them in the same string, separated by a comma. In this example, title and alternative_title share one priority, which is higher than the priority of text: attributesToIndex: ["title,alternative_title", "text"].
  • numericAttributesToIndex: (array of strings) All numerical attributes are automatically indexed as numerical filters. If you don't need filtering on some of your numerical attributes, you can specify this list to speed up the indexing.
    If you only need to filter on a numeric value with the operator '=', you can speed up the indexing by specifying the attribute with equalOnly(AttributeName). The other operators will be disabled.
  • attributesForFaceting: (array of strings) The list of fields you want to use for faceting. All strings in the attribute selected for faceting are extracted and added as a facet. If set to null, no attribute is used for faceting.
  • attributeForDistinct: The attribute name used for the Distinct feature. This feature is similar to the SQL "distinct" keyword. When enabled in queries with the distinct=1 parameter, all hits containing a duplicate value for this attribute are removed from results. For example, if the chosen attribute is show_name and several hits have the same value for show_name, then only the best one is kept and others are removed.
  • ranking: (array of strings) Controls the way results are sorted.
    We have nine available criteria:
    • typo: Sort according to number of typos.
    • geo: Sort according to decreasing distance when performing a geo location based search.
    • words: Sort according to the number of query words matched, in decreasing order. This parameter is useful when you use the optionalWords query parameter to have results with the most matched words first.
    • proximity: Sort according to the proximity of the query words in hits.
    • attribute: Sort according to the order of attributes defined by attributesToIndex.
    • exact:
      • If the user query contains one word: sort objects having an attribute that is exactly the query word before others. For example, if you search for the TV show "V", you want to find it with the "V" query and avoid getting all popular TV shows starting with the letter V before it.
      • If the user query contains multiple words: sort according to the number of words that matched exactly (not as a prefix).
    • custom: Sort according to a user defined formula set in the customRanking attribute.
    • asc(attributeName): Sort according to a numeric attribute using ascending order. attributeName can be the name of any numeric attribute in your records (integer, double or boolean).
    • desc(attributeName): Sort according to a numeric attribute using descending order. attributeName can be the name of any numeric attribute in your records (integer, double or boolean).
    The standard order is ["typo", "geo", "words", "proximity", "attribute", "exact", "custom"].
  • customRanking: (array of strings) Lets you specify part of the ranking.
    The syntax of this condition is an array of strings containing attributes prefixed by the asc (ascending order) or desc (descending order) operator. For example, "customRanking" => ["desc(population)", "asc(name)"].
  • queryType: Select how the query words are interpreted. It can be one of the following values:
    • prefixAll: All query words are interpreted as prefixes.
    • prefixLast: Only the last word is interpreted as a prefix (default behavior).
    • prefixNone: No query word is interpreted as a prefix. This option is not recommended.
  • separatorsToIndex: Specify the separators (punctuation characters) to index. By default, separators are not indexed. Use +# to be able to search Google+ or C#.
  • slaves: The list of indices on which you want to replicate all write operations. In order to get response times in milliseconds, we pre-compute part of the ranking during indexing. If you want to use different ranking configurations depending on the use case, you need to create one index per ranking configuration. This option enables you to perform write operations only on this index and automatically update slave indices with the same operations.
  • unretrievableAttributes: The list of attributes that cannot be retrieved at query time. This feature allows you to have attributes that are used for indexing and/or ranking but cannot be retrieved. Defaults to null.
  • allowCompressionOfIntegerArray: Allows compression of big integer arrays. We recommend enabling this feature and then storing the list of user IDs or rights as an integer array. When enabled, the integer array is reordered to reach a better compression ratio. Defaults to false.
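
The customRanking syntax described above ("desc(population)", "asc(name)") can be pictured as a chain of comparators applied in order. The following is a minimal client-side sketch of that idea, not the engine's implementation: the class name CustomRankingSketch, the helper city, and the sample records are all hypothetical, and comparing non-numeric values by their string form is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative model only: shows how asc(...)/desc(...) entries compose.
final class CustomRankingSketch {
    private static final Pattern CRITERION = Pattern.compile("(asc|desc)\\((\\w+)\\)");

    /** Builds a comparator from entries such as "desc(population)" or "asc(name)". */
    static Comparator<Map<String, Object>> comparatorFor(List<String> customRanking) {
        Comparator<Map<String, Object>> result = (a, b) -> 0;
        for (String criterion : customRanking) {
            Matcher m = CRITERION.matcher(criterion);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unsupported criterion: " + criterion);
            }
            String attribute = m.group(2);
            Comparator<Map<String, Object>> byAttribute =
                (a, b) -> compareValues(a.get(attribute), b.get(attribute));
            if (m.group(1).equals("desc")) {
                byAttribute = byAttribute.reversed();
            }
            // Later criteria only break ties left by earlier ones.
            result = result.thenComparing(byAttribute);
        }
        return result;
    }

    // Assumption of this sketch: numbers compare numerically, anything else by string form.
    private static int compareValues(Object left, Object right) {
        if (left instanceof Number && right instanceof Number) {
            return Double.compare(((Number) left).doubleValue(), ((Number) right).doubleValue());
        }
        return String.valueOf(left).compareTo(String.valueOf(right));
    }

    /** Hypothetical sample record with a name and a population. */
    static Map<String, Object> city(String name, int population) {
        Map<String, Object> record = new HashMap<>();
        record.put("name", name);
        record.put("population", population);
        return record;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> cities = new ArrayList<>();
        cities.add(city("Gamma", 10));
        cities.add(city("Alpha", 10));
        cities.add(city("Beta", 30));
        cities.sort(comparatorFor(Arrays.asList("desc(population)", "asc(name)")));
        for (Map<String, Object> c : cities) {
            System.out.println(c.get("name"));
        }
    }
}
```

With the sample data, the record with the largest population comes first, and the two records tied on population are ordered by name.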

Query expansion

  • synonyms: (array of arrays of strings) Each inner array lists words that are considered equal. For example, you may want to retrieve the black ipad record when your users are searching for dark ipad, even if the word dark is not part of the record. To do this, you need to configure black as a synonym of dark. For example, "synonyms": [ [ "black", "dark" ], [ "small", "little", "mini" ], ... ]. The synonym feature also supports multi-word expressions, such as "synonyms": [ ["NY", "New York"] ].

  • placeholders: (hash of arrays of words). This is an advanced use case to define a token substitutable by a list of words without having the original token searchable. It is defined by a hash associating placeholders to lists of substitutable words. For example, "placeholders": { "<streetnumber>": ["1", "2", "3", ..., "9999"]} would allow the placeholder to match any street number. We use the < > tag syntax to define placeholders in an attribute. For example:

    • Push a record with the placeholder: { "name" : "Apple Store", "address" : "<streetnumber> Opera street, Paris" }.
    • Configure the placeholder in your index settings: "placeholders": { "<streetnumber>" : ["1", "2", "3", "4", "5", ... ], ... }.
  • disableTypoToleranceOnWords: (string array) Specify a list of words on which automatic typo tolerance will be disabled.

  • disableTypoToleranceOnAttributes: (string array) List of attributes on which you want to disable typo tolerance (must be a subset of the attributesToIndex index setting). By default the list is empty.

  • altCorrections: (object array) Specify alternative corrections that you want to consider. Each alternative correction is described by an object containing three attributes:

    • word: The word to correct.
    • correction: The corrected word.
    • nbTypos: The number of typos (1 or 2) that will be considered for the ranking algorithm (1 typo is better than 2 typos).

    For example "altCorrections": [ { "word" : "foot", "correction": "feet", "nbTypos": 1 }, { "word": "feet", "correction": "foot", "nbTypos": 1 } ].
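
To make the synonyms setting above more concrete, here is a small sketch of query-word expansion: a word is replaced by the set of all words that share a synonym group with it. This is only a toy model of the behavior described in this issue. The class name SynonymExpansionSketch, the method expand, and the case-insensitive comparison are assumptions, not part of the Algolia API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative model only: expands one query term using synonym groups.
final class SynonymExpansionSketch {
    /** Returns the term plus every entry of each synonym group that contains it. */
    static Set<String> expand(String term, List<List<String>> synonyms) {
        Set<String> expanded = new LinkedHashSet<>();
        expanded.add(term);
        for (List<String> group : synonyms) {
            boolean inGroup = false;
            for (String candidate : group) {
                // Assumption of this sketch: matching ignores case.
                if (candidate.equalsIgnoreCase(term)) {
                    inGroup = true;
                }
            }
            if (inGroup) {
                expanded.addAll(group);
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        List<List<String>> synonyms = Arrays.asList(
            Arrays.asList("black", "dark"),
            Arrays.asList("small", "little", "mini"),
            Arrays.asList("NY", "New York"));
        System.out.println(expand("dark", synonyms));
        System.out.println(expand("NY", synonyms));
        System.out.println(expand("ipad", synonyms));
    }
}
```

A query for dark ipad would then be matched as if it also contained black, while ipad, which belongs to no group, expands only to itself.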

Default query parameters (can be overwritten by queries)

  • minWordSizefor1Typo: (integer) The minimum number of characters needed to accept one typo (default = 4).
  • minWordSizefor2Typos: (integer) The minimum number of characters needed to accept two typos (default = 8).
  • hitsPerPage: (integer) The number of hits per page (default = 10).
  • attributesToRetrieve: (array of strings) Default list of attributes to retrieve in objects. If set to null, all attributes are retrieved.
  • attributesToHighlight: (array of strings) Default list of attributes to highlight. If set to null, all indexed attributes are highlighted.
  • attributesToSnippet: (array of strings) Default list of attributes to snippet alongside the number of words to return (syntax is 'attributeName:nbWords').
    By default, or if set to null, no snippet is computed.
  • highlightPreTag: (string) Specify the string that is inserted before the highlighted parts in the query result (defaults to "<em>").
  • highlightPostTag: (string) Specify the string that is inserted after the highlighted parts in the query result (defaults to "</em>").
  • optionalWords: (array of strings) Specify a list of words that should be considered optional when found in the query.
  • allowTyposOnNumericTokens: (boolean) If set to false, disables typo-tolerance on numeric tokens (numbers) in the query. For example, the query "304" will match "30450" but not "40450", which would have matched with typo-tolerance enabled. This can be very useful for serial number and zip code searches. Defaults to false.
  • ignorePlurals: (boolean) If set to true, simple plural forms won’t be considered as typos (for example car/cars will be considered as equal). Defaults to false.
  • advancedSyntax: Enable the advanced query syntax. Defaults to 0 (false).
    • Phrase query: a phrase query defines a particular sequence of terms. A phrase query is built by Algolia's query parser for words surrounded by double quotes ("). For example, "search engine" will retrieve only records having search next to engine. Typo-tolerance is disabled on phrase queries.
    • Prohibit operator: The prohibit operator excludes records that contain the term after the - symbol. For example search -engine will retrieve records containing search but not engine.
  • replaceSynonymsInHighlight: (boolean) If set to false, words matched via synonyms expansion will not be replaced by the matched synonym in the highlighted result. Defaults to true.
  • maxValuesPerFacet: (integer) Limit the number of facet values returned for each facet. For example: maxValuesPerFacet=10 will retrieve max 10 values per facet.
  • distinct: (integer) Enable the distinct feature (disabled by default) if the attributeForDistinct index setting is set. This feature is similar to the SQL "distinct" keyword: when enabled in a query with the distinct=1 parameter, all hits containing a duplicate value for the attributeForDistinct attribute are removed from results. For example, if the chosen attribute is show_name and several hits have the same value for show_name, then only the best one is kept and others are removed.
  • typoTolerance: (string) This setting has four different options:
    • true: activate typo-tolerance (default value).
    • false: disable typo-tolerance.
    • min: keep only the results with the lowest number of typos. For example, if one result matches without typos, then all results with typos will be hidden.
    • strict: if there is a match without typos, then all results with 2 typos or more will be removed. This option is useful if you want to avoid false positives as much as possible.
  • removeStopWords: (boolean) Remove stop words from query before executing it. Defaults to false. Contains stop words for 41 languages (Arabic, Armenian, Basque, Bengali, Brazilian, Bulgarian, Catalan, Chinese, Czech, Danish, Dutch, English, Finnish, French, Galician, German, Greek, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Korean, Kurdish, Latvian, Lithuanian, Marathi, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Spanish, Swedish, Thai, Turkish, Ukrainian, Urdu).
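
The four typoTolerance options listed above can be read as filters over a result set in which each hit carries the number of typos needed to match it. The sketch below models that reading and nothing more: the Hit class, the filter method, and the idea of post-filtering (rather than never generating typo candidates) are assumptions made for illustration, not how the engine works internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: applies the typoTolerance modes as a post-filter.
final class TypoToleranceSketch {
    /** Hypothetical hit: an object ID plus the number of typos needed to match it. */
    static final class Hit {
        final String objectID;
        final int typos;

        Hit(String objectID, int typos) {
            this.objectID = objectID;
            this.typos = typos;
        }
    }

    /** Keeps the hits allowed by the given mode: "true", "false", "min" or "strict". */
    static List<Hit> filter(List<Hit> hits, String typoTolerance) {
        int fewest = Integer.MAX_VALUE;
        for (Hit hit : hits) {
            fewest = Math.min(fewest, hit.typos);
        }
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : hits) {
            boolean keep;
            switch (typoTolerance) {
                case "true":   // every match is kept
                    keep = true;
                    break;
                case "false":  // only matches without typos
                    keep = hit.typos == 0;
                    break;
                case "min":    // only matches with the lowest typo count present
                    keep = hit.typos == fewest;
                    break;
                case "strict": // if an exact match exists, drop hits with 2+ typos
                    keep = fewest > 0 || hit.typos < 2;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown mode: " + typoTolerance);
            }
            if (keep) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(new Hit("a", 0), new Hit("b", 1), new Hit("c", 2));
        for (String mode : Arrays.asList("true", "false", "min", "strict")) {
            System.out.println(mode + " keeps " + filter(hits, mode).size() + " hit(s)");
        }
    }
}
```

Note the difference between min and strict in this model: with hits at 0, 1 and 2 typos, min keeps only the exact match, while strict also keeps the hit with one typo.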

You can easily retrieve settings or update them:

TODO: System.out.println(index.getSettings());
TODO: index.setSettings(new JSONObject().append("customRanking", "desc(followers)"));
@ElPicador ElPicador added this to the 1.3 milestone Dec 22, 2015