As a content editor, I want to search for tags with or without diacritics and get the same results so that I can more easily find the correct tag. #943

Closed
5 tasks done
mrustow opened this issue Jul 12, 2022 · 7 comments
@mrustow

mrustow commented Jul 12, 2022

testing notes

  • Try the search below and confirm it works as expected
  • Try adding a new tag with diacritics; the tag should be saved with the diacritics removed
  • Confirm existing tags have had diacritics removed
  • Confirm tag variants with and without diacritics have been merged correctly

deploy notes

  • solr reindex

On the admin interface > taggit, I typed hujra (no dot under the H) into the search bar and received this result:
image

However, I knew that #ḥujra is associated with 4 documents. So then I searched for ḥujra and received this result:
image

Lots of our tags don't use proper diacritics, so it would be better for us to be diacritic-agnostic.

We already seem to be diacritic-agnostic in the document view in the admin interface and when searching the public site.
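
For illustration, here is a minimal sketch of what diacritic-agnostic matching could look like, using only Python's standard `unicodedata` module. The helper names `strip_diacritics` and `tag_matches` are hypothetical and are not taken from the project's code; the actual admin search may work quite differently.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with combining marks (accents, underdots) removed."""
    # NFD splits a character such as "ḥ" into "h" plus a combining dot below
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tag_matches(query: str, tag_name: str) -> bool:
    """Check whether a search term occurs in a tag name, ignoring diacritics and case."""
    # hypothetical helper: both sides are normalized before comparing
    return strip_diacritics(query).casefold() in strip_diacritics(tag_name).casefold()


print(tag_matches("hujra", "ḥujra"))  # search without the underdot
print(tag_matches("ḥujra", "hujra"))  # search with the underdot
```

Note that this approach only removes combining marks. Standalone modifier letters such as ʿ (U+02BF) are not combining characters and would need separate handling.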

Thanks! 🥇

@rlskoeser
Contributor

@richmanrachel is it ok for tags to not allow diacritics? wondering if we can tackle this in tandem with #499 — collapse tags that are the same except for diacritics, and then add logic so that in future diacritics are ignored when creating tags.
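
As a rough sketch of the "collapse tags that are the same except for diacritics" idea, the snippet below groups tag names by a normalized key so that variant spellings can be found and merged. The function names and the sample tag list are made up for illustration; this is not the project's migration code.

```python
import unicodedata
from collections import defaultdict


def normalize_tag(name: str) -> str:
    """Build a comparison key: no combining marks, case-folded, trimmed."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


def group_variants(tag_names):
    """Map each normalized key to the spellings that collapse onto it."""
    groups = defaultdict(list)
    for name in tag_names:
        groups[normalize_tag(name)].append(name)
    # only keys with more than one spelling are candidates for merging
    return {key: names for key, names in groups.items() if len(names) > 1}


# sample data for illustration only, not from the real database
tags = ["ḥujra", "hujra", "Hujra", "dowry"]
print(group_variants(tags))
```

Each resulting group could then be merged into a single tag, with the documents from every variant reassigned to the surviving one.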

@richmanrachel

@rlskoeser - yes! Especially since Marina already requested this, I'm happy with this solution

blms self-assigned this Aug 8, 2022
blms added a commit that referenced this issue Aug 8, 2022
blms added a commit that referenced this issue Aug 10, 2022
…nsitivity

Make tags case- and diacritic-insensitive (#499, #943)
rlskoeser added the "🗜️ awaiting testing" label Aug 11, 2022
@richmanrachel

This didn't quite work either. The slug correctly dropped the diacritics, but I guess the search doesn't include slugs? Searching for mualim returns no results, while the version with diacritics returns the correct result:
image

richmanrachel added the "⚠️ tested needs attention" label and removed the "🗜️ awaiting testing" label Aug 11, 2022
@blms
Contributor

blms commented Aug 15, 2022

Ah—this is a problem with the "add tag" functionality in the Tags section of the admin, rather than adding a tag to a document. Looks like our diacritic-stripping code is not run when you create a tag separately!
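
One common way to avoid this kind of gap is to run the normalization inside the model's own save step, so that every creation path goes through it. The sketch below uses a plain Python class as a stand-in for a tag model; the class, its `save` method, and `strip_diacritics` are hypothetical and do not reflect how the fix was actually implemented. In Django, the analogous step would be overriding `save()` on the model.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with combining marks removed."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


class Tag:
    """Hypothetical stand-in for a tag model, not the project's class."""

    def __init__(self, name: str):
        self.name = name

    def save(self) -> "Tag":
        # normalizing here covers tags created on their own
        # as well as tags created while tagging a document
        self.name = strip_diacritics(self.name)
        # a real model would write to the database at this point
        return self


tag = Tag("ḥujra").save()
print(tag.name)
```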

blms added a commit that referenced this issue Aug 15, 2022
blms removed the "⚠️ tested needs attention" label Aug 15, 2022
rlskoeser added the "🗜️ awaiting testing" label Aug 16, 2022
@richmanrachel

I didn't read through all the tags to ensure that all of the diacritics were properly dealt with, but I did look over about 400. The only potential error I saw was this one:
image

But since I only found errors in 1 of about 400, I think that's enough to close this, as it gets us to a much easier place for tag cleanup. Do you agree, @rlskoeser?

@rlskoeser
Contributor

@richmanrachel I agree with you. The near-duplicates I saw were variations like plural/singular, and I think those would be better handled manually once the tag merge function is implemented. Thanks for reviewing so carefully.

richmanrachel removed the "🗜️ awaiting testing" label Aug 16, 2022
@richmanrachel

@rlskoeser - perfect, closing now!

rlskoeser changed the title from "As a content editor, I want to be able to search for tags with or without diacritics and get the same results." to "As a content editor, I want to search for tags with or without diacritics and get the same results so that I can more easily find the correct tag." Aug 22, 2022