Correct the handling of alternate graphics in Title proper fields (display and indexing) #2591
Labels
* bug (Breaks something but is not blocking)
* f: professional ui (Professional interface)
* f: public ui (Public interface, as opposed to the professional interface)
* p-High (High priority, to be solved in the next 2-3 months)
Comments
JoelleDosimont added the f: professional ui, f: public ui, triage, bug, and p-Low labels on Dec 9, 2021
JoelleDosimont added the p-High label and removed the p-Low label on Jun 2, 2022
JoelleDosimont changed the title on Jun 2, 2022
From: Correct the display of title when the value "language" is used (for Alternate Graphics)
To: Correct the handling of alternate graphics in Title proper fields (display and indexing)
jma added a commit to jma/rero-ils that referenced this issue on Aug 17, 2022
Some complex fields have formatted content in a special `_text` field. This field is no longer indexed, but all the related fields are indexed.
The legacy `series` document field configuration is removed.
Remove the `&` character from `tokenize_on_chars`. The tokenize characters come from the following Unicode categories, as defined in the Elasticsearch source code: Ps, Pe, Po, Pc, Pd, Pi, Pf. `&` is excluded from the Po category. Unfortunately, Elasticsearch does not support exceptions. The `unicategories` Python package has been used to generate the list from the categories. Note that some Unicode characters such as `"\ud836\ude8b"` have been removed as they cause Elasticsearch errors.
Data Migration Instructions: an Elasticsearch server-side document reindexing is needed.
* Closes: rero#3050.
* Closes: rero#2591.
* Closes: rero#2730.
* Closes: rero#2972.
* Closes: rero#2050.
* Closes: rero#3027.
Co-Authored-by: Johnny Mariéthoz <Johnny.Mariethoz@rero.ch>
jma added a commit to jma/rero-ils that referenced this issue on Aug 17, 2022
Some complex fields have formatted content in a special `_text` field. This field is no longer indexed, but all the related fields are indexed.
The legacy `series` document field configuration is removed.
Remove the `&` character from `tokenize_on_chars`. The tokenize characters come from the following Unicode categories, as defined in the Elasticsearch source code: Ps, Pe, Po, Pc, Pd, Pi, Pf. `&` is excluded from the Po category. Unfortunately, Elasticsearch does not support exceptions. The `unicategories` Python package has been used to generate the list from the categories. Note that some Unicode characters such as `"\ud836\ude8b"` have been removed as they cause Elasticsearch errors.
Data Migration Instructions: a complete document re-indexing is needed.
* Closes: rero#3050.
* Closes: rero#2591.
* Closes: rero#2730.
* Closes: rero#2972.
* Closes: rero#2050.
* Closes: rero#3027.
Co-Authored-by: Johnny Mariéthoz <Johnny.Mariethoz@rero.ch>
jma added a commit that referenced this issue on Aug 22, 2022
Some complex fields have formatted content in a special `_text` field. This field is no longer indexed, but all the related fields are indexed.
The legacy `series` document field configuration is removed.
Remove the `&` character from `tokenize_on_chars`. The tokenize characters come from the following Unicode categories, as defined in the Elasticsearch source code: Ps, Pe, Po, Pc, Pd, Pi, Pf. `&` is excluded from the Po category. Unfortunately, Elasticsearch does not support exceptions. The `unicategories` Python package has been used to generate the list from the categories. Note that some Unicode characters such as `"\ud836\ude8b"` have been removed as they cause Elasticsearch errors.
Data Migration Instructions: a complete document re-indexing is needed.
* Closes: #3050.
* Closes: #2591.
* Closes: #2730.
* Closes: #2972.
* Closes: #2050.
* Closes: #3027.
Co-Authored-by: Johnny Mariéthoz <Johnny.Mariethoz@rero.ch>
jma added a commit to jma/rero-ils that referenced this issue on Aug 22, 2022
Some complex fields have formatted content in a special `_text` field. This field is no longer indexed, but all the related fields are indexed.
The legacy `series` document field configuration is removed.
Remove the `&` character from `tokenize_on_chars`. The tokenize characters come from the following Unicode categories, as defined in the Elasticsearch source code: Ps, Pe, Po, Pc, Pd, Pi, Pf. `&` is excluded from the Po category. Unfortunately, Elasticsearch does not support exceptions. The `unicategories` Python package has been used to generate the list from the categories. Note that some Unicode characters such as `"\ud836\ude8b"` have been removed as they cause Elasticsearch errors.
Data Migration Instructions: a complete document re-indexing is needed.
* Closes: rero#3050.
* Closes: rero#2591.
* Closes: rero#2730.
* Closes: rero#2972.
* Closes: rero#2050.
* Closes: rero#3027.
Co-Authored-by: Johnny Mariéthoz <Johnny.Mariethoz@rero.ch>
jma added a commit to jma/rero-ils that referenced this issue on Aug 24, 2022
Some complex fields have formatted content in a special `_text` field. This field is no longer indexed, but all the related fields are indexed.
The legacy `series` document field configuration is removed.
Remove the `&` character from `tokenize_on_chars`. The tokenize characters come from the following Unicode categories, as defined in the Elasticsearch source code: Ps, Pe, Po, Pc, Pd, Pi, Pf. `&` is excluded from the Po category. Unfortunately, Elasticsearch does not support exceptions. The `unicategories` Python package has been used to generate the list from the categories. Note that some Unicode characters such as `"\ud836\ude8b"` have been removed as they cause Elasticsearch errors.
Data Migration Instructions: a complete document re-indexing is needed.
* Closes: rero#3050.
* Closes: rero#2591.
* Closes: rero#2730.
* Closes: rero#2972.
* Closes: rero#3027.
Co-Authored-by: Johnny Mariéthoz <Johnny.Mariethoz@rero.ch>
jma added a commit to jma/rero-ils that referenced this issue on Aug 24, 2022
Some complex fields have formatted content in a special `_text` field. This field is no longer indexed, but all the related fields are indexed.
The legacy `series` document field configuration is removed.
Remove the `&` character from `tokenize_on_chars`. The tokenize characters come from the following Unicode categories, as defined in the Elasticsearch source code: Ps, Pe, Po, Pc, Pd, Pi, Pf. `&` is excluded from the Po category. Unfortunately, Elasticsearch does not support exceptions. The `unicategories` Python package has been used to generate the list from the categories. Note that some Unicode characters such as `"\ud836\ude8b"` have been removed as they cause Elasticsearch errors.
Data Migration Instructions: a complete document re-indexing is needed.
* Fixes several mapping configurations coming from the facets configuration.
* Closes: rero#3050.
* Closes: rero#2591.
* Closes: rero#2730.
* Closes: rero#2972.
* Closes: rero#3027.
Co-Authored-by: Johnny Mariéthoz <Johnny.Mariethoz@rero.ch>
jma added a commit that referenced this issue on Aug 25, 2022
Some complex fields have formatted content in a special `_text` field. This field is no longer indexed, but all the related fields are indexed.
The legacy `series` document field configuration is removed.
Remove the `&` character from `tokenize_on_chars`. The tokenize characters come from the following Unicode categories, as defined in the Elasticsearch source code: Ps, Pe, Po, Pc, Pd, Pi, Pf. `&` is excluded from the Po category. Unfortunately, Elasticsearch does not support exceptions. The `unicategories` Python package has been used to generate the list from the categories. Note that some Unicode characters such as `"\ud836\ude8b"` have been removed as they cause Elasticsearch errors.
Data Migration Instructions: a complete document re-indexing is needed.
* Fixes several mapping configurations coming from the facets configuration.
* Closes: #3050.
* Closes: #2591.
* Closes: #2730.
* Closes: #2972.
* Closes: #3027.
Co-Authored-by: Johnny Mariéthoz <Johnny.Mariethoz@rero.ch>
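For readers who want to see what generating such a character list might look like, here is a minimal sketch. It is not the code from the commit: the commit used the `unicategories` package, whereas this sketch swaps in Python's standard `unicodedata` module. The function name `tokenize_chars` and the `bmp_only` switch are hypothetical. Restricting the list to the Basic Multilingual Plane is only an assumption about why characters such as `"\ud836\ude8b"` (a surrogate pair for a character outside that plane) were dropped.

```python
import sys
import unicodedata

# Punctuation categories listed in the commit message.
PUNCTUATION_CATEGORIES = {"Ps", "Pe", "Po", "Pc", "Pd", "Pi", "Pf"}


def tokenize_chars(excluded=("&",), bmp_only=True):
    """Return every character in the punctuation categories, minus exclusions.

    `bmp_only` (an assumption, not taken from the commit) skips code points
    above U+FFFF, which need a surrogate pair when written as JSON escapes.
    """
    limit = 0x10000 if bmp_only else sys.maxunicode + 1
    return [
        chr(cp)
        for cp in range(limit)
        if unicodedata.category(chr(cp)) in PUNCTUATION_CATEGORIES
        and chr(cp) not in excluded
    ]


chars = tokenize_chars()
```

The resulting list could then be serialized into the `tokenize_on_chars` setting; because `&` is left out explicitly, a title such as "R&D" would no longer be split on that character.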
Describe the bug
When a document has an alternate graphic (e.g. a title in Cyrillic characters), the "value" field is repeated to add the transliteration. The "language" sub-element is used to determine the language and the graphic.
There are problems in two cases:
To Reproduce
Case 1
Case 2
Expected behavior
For a Title, if the value is repeated for transliteration (same language, different graphic), the Title should display:
All values are indexed, regardless of the presence of the Language sub-element.
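The ordering rule stated in this issue (original graphic first, transliteration after, no value dropped when "language" is missing) can be sketched as follows. This is only an illustration under stated assumptions, not the RERO ILS implementation: the entry shape (`{"value": ..., "language": ...}`), the `rus-latn` tag, and the helper names are hypothetical, and the script is guessed from the text itself with a simple heuristic rather than read from the "language" sub-element.

```python
import unicodedata


def is_latin(text):
    """Heuristic: True if the first alphabetic character is a Latin letter."""
    for ch in text:
        if ch.isalpha():
            return unicodedata.name(ch, "").startswith("LATIN")
    return True  # no letters at all: treat the value as Latin by default


def order_title_values(entries):
    """Put non-Latin (original graphic) values first and keep every entry.

    sorted() is stable, so entries keep their relative order within each group,
    and entries without a "language" key are never dropped.
    """
    return sorted(entries, key=lambda entry: is_latin(entry["value"]))


# Hypothetical sample data: a transliteration stored before the original graphic.
entries = [
    {"value": "Vojna i mir", "language": "rus-latn"},
    {"value": "Война и мир"},
]
ordered = order_title_values(entries)
```

With this sample, `ordered` lists the Cyrillic value before the transliterated one, and the entry without a "language" key is still present.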
Context
v1.10.0
Screenshots
Expected display:
Additional context
On 07/12/2021, the Title was displayed differently: 1 entry for the original graphic, 2 entries for the transliteration.
On 09/12/2021: 2 entries for the original graphic and 1 entry for the transliteration, in the wrong order (the original graphic should come first).