This repository has been archived by the owner on Nov 9, 2022. It is now read-only.

tokenizer-overrides does not seem to work in ml-config #222

Closed
geverit4 opened this issue Apr 26, 2014 · 2 comments
@geverit4

I added the following field definition to ml-config.xml:

  <fields>
    <field>
      <field-name>mnemonic</field-name>
      <field-path>
        <path>epi:indicator/epi:code</path>
        <weight>1.0</weight>
      </field-path>
      <include-root>false</include-root>
      <fast-case-sensitive-searches>false</fast-case-sensitive-searches>
      <fast-diacritic-sensitive-searches>false</fast-diacritic-sensitive-searches>
      <fast-phrase-searches>false</fast-phrase-searches>
      <one-character-searches>false</one-character-searches>
      <stemmed-searches>basic</stemmed-searches>
      <three-character-searches>false</three-character-searches>
      <three-character-word-positions>false</three-character-word-positions>
      <trailing-wildcard-searches>false</trailing-wildcard-searches>
      <trailing-wildcard-word-positions>false</trailing-wildcard-word-positions>
      <two-character-searches>false</two-character-searches>
      <value-positions>false</value-positions>
      <value-searches>false</value-searches>
      <word-searches>true</word-searches>
      <word-lexicons/>
      <included-elements/>
      <excluded-elements/>
      <tokenizer-overrides>
        <tokenizer-override>
          <character>_</character>
          <tokenizer-class>word</tokenizer-class>
        </tokenizer-override>
      </tokenizer-overrides>
    </field>
  </fields>

However, when I run a bootstrap, the <tokenizer-override> elements are ignored.

I copied these settings from the output of admin:database-get-fields() after defining the field manually in the admin interface.

@grtjn
Contributor

grtjn commented Apr 26, 2014

Correct. That setting is relatively new, and support for it hasn't been implemented in bootstrap yet, so it is effectively ignored.

@dmcassel dmcassel added the bug label Apr 28, 2014
@dmcassel dmcassel added this to the v1.6 milestone Apr 28, 2014
@grtjn
Contributor

grtjn commented May 8, 2014

Support added. Make sure not to apply this to the main field (the one with an empty field-name); that isn't allowed. Bootstrap will fail with the following message:

ADMIN-BADCUSTOMFIELD: (err:FOER0000) Bad field for custom tokenization: the word field may not have tokenizer overrides
See MarkLogic Server error log for more details.
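
To illustrate the restriction, here is a minimal sketch of how the two kinds of field might sit side by side in ml-config.xml. It is abbreviated and not a complete configuration: the mnemonic field is taken from the original report with most settings omitted, and the contents of the main field entry (include-root set to true, empty included/excluded element lists) are an assumption for illustration only.

```xml
<fields>
  <!-- Main (word) field: empty field-name.
       Do NOT add <tokenizer-overrides> here; bootstrap fails
       with ADMIN-BADCUSTOMFIELD if you do. -->
  <field>
    <field-name/>
    <!-- assumed settings, shown only to make the entry recognizable -->
    <include-root>true</include-root>
    <included-elements/>
    <excluded-elements/>
  </field>

  <!-- Named field: tokenizer overrides are allowed here. -->
  <field>
    <field-name>mnemonic</field-name>
    <field-path>
      <path>epi:indicator/epi:code</path>
      <weight>1.0</weight>
    </field-path>
    <!-- remaining settings from the original report omitted for brevity -->
    <tokenizer-overrides>
      <tokenizer-override>
        <character>_</character>
        <tokenizer-class>word</tokenizer-class>
      </tokenizer-override>
    </tokenizer-overrides>
  </field>
</fields>
```

If the override is needed for general word searches, the sketch suggests defining a separate named field for it rather than touching the main field.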
