Sometimes there is no space between two words in the search results while there is actually a space in the text #40543

Closed
DSist opened this issue May 5, 2023 · 12 comments
DSist commented May 5, 2023

Steps to reproduce the issue

Set the Result Description parameter to Show. The problem does not happen often, but it can be reproduced with a specific search.

Expected result

[Screenshot: search_bug1_2]

Actual result

[Screenshot: search_bug1_1]

System information (as much as possible)

Joomla 4.3.1 native search
PHP 8.0.3

Additional comments

The issue has been there from the very beginning of Joomla 4.

The articles are in Russian, but I don't think it is a language-specific issue. The site is under development and has no DNS record. If needed, I can provide all the data in a private message.

Contributor

chmst commented May 10, 2023

Could you please check whether the space is something like `&nbsp;` or `&shy;` in the original text?
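One way to run this check is to scan the raw article text for characters that look like a space or are invisible. The following is a minimal sketch in Python, assuming the text can be pasted into a script; the `SUSPECTS` set and the `find_suspect_chars` helper are hypothetical names chosen for illustration, and the character list is not exhaustive.

```python
import unicodedata

# Characters that render like a normal space or are invisible.
# Illustrative selection only (assumption), not an exhaustive list.
SUSPECTS = {
    "\u00a0",  # no-break space (&nbsp;)
    "\u00ad",  # soft hyphen (&shy;)
    "\u200b",  # zero width space
    "\u2009",  # thin space
    "\u202f",  # narrow no-break space
}


def find_suspect_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for every suspect character in text."""
    return [
        (i, unicodedata.name(ch, f"U+{ord(ch):04X}"))
        for i, ch in enumerate(text)
        if ch in SUSPECTS
    ]


if __name__ == "__main__":
    # Hypothetical sample with a no-break space between the two words.
    sample = "нашего\u00a0Бога"
    for index, name in find_suspect_chars(sample):
        print(index, name)
```

An empty result means the text contains only ordinary characters from this list's point of view, which points away from the text itself and toward how it is processed.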

Author

DSist commented May 10, 2023

> Could you please check if the space is something like `&nbsp;` or `&shy;` in the original text?

Nope, it is a regular space. Moreover, if I delete this space in the original text, Joomla removes the previous one as well, and so on.

I have extracted the text fragment to reproduce the issue. Create a new article and paste this text:

Планетарное Братство Учителей Мудрости, направляющее эволюцию на нашей планете. Распределительный центр энергии Любви Бога. Духовный источник всех человеческих рас, цивилизаций, религий, культур. Великое Белое Братство. Иерархия Учителей. Иерархия Света.

Духовная Иерархия — это реальность, как и сама душа. А для нас, людей, это сверхреальность, так как Иерархия представляет собой более высокое царство природы — царство душ, которое является причинным по отношению к нашему человеческому царству. Это значит, что все эволюционные изменения, которые происходят в человечестве, являются следствиями изменений, что сначала происходят в Иерархии, то есть в умах Учителей Мудрости, и лишь затем они осаждаются на планы человеческого бытия, где и обретают те или иные формы: религиозные, политические, общественные, экономические и другие.

Планетарная Иерархия направляет эволюцию на нашей планете, имея одной из задач непрерывное расширение человеческого сознания, так как именно человеческое царство является связующим звеном между тремя дочеловеческими и тремя сверхчеловеческими царствами в теле Планетарного Логоса, нашего Бога. Иерархия воздействует на человечество двумя основными способами:

Search for the term: нашего

In the search results there is no space between the words inside the red circle.

[Screenshot: search_bug1_3]

@brianteeman
Contributor

I copied and pasted the text from your post and then searched for нашего.
As you can see from the screenshot, there is a space.

I am using the Chrome browser on Windows 11.

[Screenshot]

Contributor

chmst commented May 11, 2023

I can confirm the issue in both Firefox and Chrome, but I have no clue what is happening here.

chmst added the bug label May 11, 2023
@brianteeman
Contributor

I wonder why it was different for me.

Contributor

chmst commented May 11, 2023

After replacing the spaces in the copied text with "my" spaces, the search result is OK. There must be something in the text.

Author

DSist commented May 11, 2023

I have replaced all the spaces in the text with "new" spaces via JCE Editor, and the result is the same.

Author

DSist commented May 11, 2023

I have found how to reproduce both results. I use JCE Editor as the default editor.

If you copy the source text I cited above and paste it via the Code section of the editor, the text is inserted as a single paragraph with one `<p>` tag. The search result is OK.

If you divide it into several paragraphs, either via the Editor section with `<br />` inserted (using the Enter key) or with `<p>` tags via the Code section, the result is wrong.

@brianteeman
Contributor

Can you replicate it with TinyMCE?

Author

DSist commented May 12, 2023

Yes, it is replicated with TinyMCE: with a single `<p>` tag the result is OK; with several `<p>` tags the space is missing.

Member

Hackwar commented Aug 25, 2023

The problem is the tokenize() code in the Helper class of the Smart Search component. It removes all tags and does not seem to insert the right spaces here.
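To illustrate the kind of failure described here, the following is a minimal sketch in Python, not the actual Joomla PHP code. It assumes, as a simplification, that tags are removed with a regular expression; `strip_tags_naive` and `strip_tags_spaced` are hypothetical helper names. Deleting tags outright glues the end of one paragraph to the start of the next, while replacing each tag with a space and then collapsing whitespace keeps the words apart.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")
WS_RE = re.compile(r"\s+")


def strip_tags_naive(html: str) -> str:
    """Delete tags outright; adjacent block elements end up glued together."""
    return TAG_RE.sub("", html)


def strip_tags_spaced(html: str) -> str:
    """Replace each tag with a space, then collapse runs of whitespace."""
    return WS_RE.sub(" ", TAG_RE.sub(" ", html)).strip()


if __name__ == "__main__":
    # Same words, once as a single paragraph and once split into two.
    single = "<p>нашего Бога. Иерархия воздействует</p>"
    multi = "<p>нашего Бога.</p><p>Иерархия воздействует</p>"

    print(strip_tags_naive(single))   # space preserved: it is part of the text
    print(strip_tags_naive(multi))    # space lost at the paragraph boundary
    print(strip_tags_spaced(multi))   # space restored by the substitution
```

This matches the observation that a single `<p>` gives a correct result while several `<p>` tags do not. The trade-off of the spaced variant is that inline tags inside a word (for example, bold applied to part of a word) would also split that word, so a real fix may want to treat block-level and inline tags differently.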

Member

Hackwar commented Aug 28, 2023

Got a PR to fix this with #41502. Closing the issue.
