Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[4.0] Removing artificial sharding from com_finder #20185

Merged
merged 7 commits into from Jun 11, 2018

Conversation

Hackwar
Copy link
Member

@Hackwar Hackwar commented Apr 17, 2018

Finder has an artificial sharding, which splits up the links between terms and content items onto 16 seperate tables. This PR removes that sharding. It makes the code extremely complex and bloats it unnecessarily and it also slows down the whole finder system. Indexing the original sample data of Joomla 4.0 with the current finder indexer takes about 2:40 minutes on my system. With these changes, we are already down to 1:30 minute.

This PR also fixes a few other bugs that prevented com_finder from working in Joomla 4.0 right now and I standardised all the language fields to be 7 characters long. While I plan to provide more PRs with further improvements to make com_finder more usable, this PR is in itself complete and can be merged like this.

How to test

Install 4.0-dev and go to the backend of Smart Search. Enable the Smart Search Plugin and start the indexing process. Notice that it takes a long time (preferably measure the time with a stopwatch). Go to the frontend and open the Smart Search menu item. See that it throws errors and doesn't work right now.
Apply this PR and clear the index in the backend. Start the re-indexing and notice that it finishes quicker. Go to the frontend and see that that menu item no longer throws errors and that all features work like expected.

When typing in a phrase and then pressing enter, sending the request to the server, a message "error" is displayed before the page is reloaded both before and after applying the patch. I'm still looking for the source of that issue.

I want to thank cloudaccess.net for supporting these changes. Without their generous help, this wouldn't be possible.

@brianteeman
Copy link
Contributor

As there is administrator/components/com_admin/sql/updates/mysql/4.0.0-2018-04-14.sql
I assume that there should also be an equivalent postgres file?

@Hackwar
Copy link
Member Author

Hackwar commented Apr 17, 2018

I would need help with the postgres stuff. I don't have postgres on my system and would need someone to provide such a file and test that part.

@Hackwar
Copy link
Member Author

Hackwar commented Apr 17, 2018

@alikon Could you help me with adding the right migration script for postgres?

@alikon
Copy link
Contributor

alikon commented Apr 18, 2018

yes i'll do it

@Hackwar
Copy link
Member Author

Hackwar commented Apr 18, 2018

Thank you!

@infograf768
Copy link
Member

Will test. Please let me know, apart from the indexing speed (and obvious code corrections), what I can expect, as a multilingual user, from this PR?

@Hackwar
Copy link
Member Author

Hackwar commented Apr 18, 2018

At this moment, there is no change in terms of multilingual users. In order to keep these changes testable, I'm restricting my changes to one thing at a time. So this only removes the sharding and gets it all working again in 4.0. The language changes are coming in the next few steps.

@alikon
Copy link
Contributor

alikon commented Apr 21, 2018

made a pr Hackwar#22
unfortunately without this one #19707 merged
unable to test/work

the update sql for postgresql
@Hackwar
Copy link
Member Author

Hackwar commented Apr 21, 2018

Merged

@brianteeman
Copy link
Contributor

I was unable to replicate the error that you mentioned before applying this PR

After applying the PR the indexer did not work at all

@brianteeman
Copy link
Contributor

I have tested this item 🔴 unsuccessfully on 7716cd6


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/20185.

@Hackwar Hackwar requested a review from mbabker as a code owner May 11, 2018 20:57
@Hackwar
Copy link
Member Author

Hackwar commented May 11, 2018

The current 4.0-dev branch fails in displaying the frontend when opening the smart search component and having debug mode enabled with the following error message:
1055 'j4search2.t.parent_id' isn't in GROUP BY
This has been fixed in this PR.

I have updated the branch with the latest changes from 4.0-dev and checked it again. It does work. Could it be that you did not apply the database changes, that this introduces? I would be happy if you could test this again.

@Hackwar
Copy link
Member Author

Hackwar commented May 20, 2018

@brianteeman Did you see my last message? Could you check your test again, especially with the database changes?

@brianteeman
Copy link
Contributor

Sorry with JAB and JDFR it slipped my radar - thanks for the reminder I will put it on the todo for tomorrow

@Hackwar
Copy link
Member Author

Hackwar commented May 21, 2018

Just to list all current PRs for Finder/Smart Search:

@brianteeman
Copy link
Contributor

@Hackwar no change from my previous report

@Hackwar
Copy link
Member Author

Hackwar commented May 23, 2018

Can you see what the ajax call from the indexer returns as error?

@laoneo
Copy link
Member

laoneo commented Jun 4, 2018

Yes please one pr per task.

@laoneo
Copy link
Member

laoneo commented Jun 4, 2018

Thanks Chris for the explanation.

…4finder_sharding

# Conflicts:
#	components/com_finder/Model/SearchModel.php
@Hackwar
Copy link
Member Author

Hackwar commented Jun 5, 2018

I've reverted the changes of the router and cleaned up the commits.

@@ -9,6 +9,8 @@

defined('_JEXEC') or die;

JLoader::register('FinderHelperLanguage', JPATH_ADMINISTRATOR . '/components/com_finder/helpers/language.php');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, otherwise you get a fatal error when using the advanced filters. And please don't ask me to put all these single line fixes to make com_finder run in 4.0 into separate PRs....

@laoneo
Copy link
Member

laoneo commented Jun 5, 2018

Would be good if we can get another round of testing and then I think it is good to merge.

@Hackwar
Copy link
Member Author

Hackwar commented Jun 5, 2018

We have 2 successfull tests. Why do we need another round of tests now?

@carlitorweb
Copy link
Contributor

I have tested this item ✅ successfully on c80b975

Tested again. Only the route change was removed from this PR, so I think will be same. Anyway, I tested again, and work as before.


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/20185.

DROP TABLE #__finder_links_termsd;
DROP TABLE #__finder_links_termse;
DROP TABLE #__finder_links_termsf;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed CHANGE COLUMN to CHANGE, dropped AFTER ... and be on same line to make it work.

ALTER TABLE `#__finder_terms` CHANGE `language` `language` CHAR(7) NOT NULL DEFAULT '';
ALTER TABLE `#__finder_terms_common` CHANGE `language` `language` CHAR(7) NOT NULL DEFAULT '';
ALTER TABLE `#__finder_tokens` CHANGE `language` `language` CHAR(7) NOT NULL DEFAULT '';
ALTER TABLE `#__finder_tokens_aggregate` DROP COLUMN `map_suffix`;
ALTER TABLE `#__finder_tokens_aggregate` CHANGE `language` `language` CHAR(7) NOT NULL DEFAULT '';
ALTER TABLE `#__finder_links` CHANGE `language` `language` CHAR(7) NOT NULL DEFAULT '';

COLLATE='utf8mb4_general_ci'
ENGINE=InnoDB
;
DROP TABLE #__finder_links_terms0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tables are not dropped and I don't know why.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this operation manually, and work ok

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but not via Extensions > Manage > Database > Fix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iirc you can't do a drop that way

@laoneo
Copy link
Member

laoneo commented Jun 7, 2018

@Hackwar can you please add the suggested changes to make the DB fixing work and then we are good to go here.

@brianteeman
Copy link
Contributor

there is no reason to make database>fix work
That is a misuse of the functionality

@laoneo
Copy link
Member

laoneo commented Jun 7, 2018

Ok, misinterpreted that comment. What about the suggestion from @Quy?

@Hackwar
Copy link
Member Author

Hackwar commented Jun 7, 2018

I'm hesitant to change the SQL again, mainly because I don't see how what is supposed to work how and more importantly because there are quite a few additional changes on the way for the finder tables. At one point we will have to make one PR that combines all these changes into one SQL file. Can we tackle that issue then? (There are 12 open PRs for finder from me, 4 of which have DB changes)

@laoneo
Copy link
Member

laoneo commented Jun 7, 2018

Does the upgrade script work without the recommendations from @Quy?

@Quy
Copy link
Contributor

Quy commented Jun 7, 2018

With fresh install, it works.
With upgrade/patch tester, it stalls at Starting Indexer.

@mbabker
Copy link
Contributor

mbabker commented Jun 7, 2018

This has database schema changes. Patch tester can't deal with database schema changes (you can apply the patch and run Database > Fix but you can't reverse the schema changes if you do that). The component, as useful as it can be, is not a definitive way of identifying issues with patches.

@Hackwar Hackwar mentioned this pull request Jun 7, 2018
@Hackwar
Copy link
Member Author

Hackwar commented Jun 7, 2018

Just to clarify @chrisdavenport s comment here: With the sharding removed, he had a performance improvement of 44%, not a performance reduction.

@wilsonge wilsonge merged commit 5613476 into joomla:4.0-dev Jun 11, 2018
@joomla-cms-bot joomla-cms-bot removed the RTC This Pull Request is Ready To Commit label Jun 11, 2018
@wilsonge wilsonge added this to the Joomla 4.0 milestone Jun 11, 2018
@Hackwar
Copy link
Member Author

Hackwar commented Jun 12, 2018

Thank you!

@Hackwar Hackwar deleted the j4finder_sharding branch June 12, 2018 06:58
@wilsonge
Copy link
Contributor

Thankyou for being so patient with this! Sorry it took so long to deal with!

Hackwar added a commit to Hackwar/joomla-cms that referenced this pull request Jun 18, 2018
laoneo pushed a commit that referenced this pull request Jul 6, 2018
* Decoupling highlighter plugin from com_finder

* Codestyle

* Codestyle

* Version comment

* Fixing bug due to #20185

* Using url for display and route for route-URL

* Cleaning up finder plugins and adding cleanURL attribute in model

* Making OutputFilter::stringJSSafe() utf8 aware
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet