This repository has been archived by the owner on Nov 25, 2023. It is now read-only.

Implement MySQL + Sphinx data driving model #3

Closed
d47081 opened this issue Apr 5, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

Collaborator

d47081 commented Apr 5, 2023

I just ran a search request against 2.5M rows on SQLite / FTS5, and it seems we are starting to hit performance issues.

According to the following conversation, we need to rewrite the current DB driver model. I suppose MySQL is a good, accessible alternative.

I have experience with the Sphinx engine: it stores compiled data in RAM and is able to process at least 8M rows in milliseconds on the same server resources, compared to the current result.

If someone has better ideas, you are welcome to share them here.

d47081 added the enhancement label on Apr 5, 2023
Collaborator Author

d47081 commented Apr 7, 2023

Implemented, after creating a separate SQLite branch.

The initial database structure looks like this (InnoDB engine with foreign key and transaction support):

[Screenshot: 2023-04-07_14-39]

Update:

[Image: DB]

The implementation has some new features:

  • robots.txt support (Please add support for the robots.txt file #2): rules are related to each domain, and a default one, provided in the config file, can also be applied
  • page rank (coming soon)
  • limit on the number of pages to index per domain
  • meta-only crawling for hostings with a limited disk quota
  • domain status (enable/disable)
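
As a rough illustration of how the per-domain settings above might combine into a single crawl decision, here is a minimal Python sketch. It is not the project's actual code: the `Host` class, its field names, the `DEFAULT_ROBOTS_TXT` value, and the `can_crawl` helper are all hypothetical, and the rule "fall back to the default robots.txt when the domain has none stored" is an assumption about how the default is applied. Only `urllib.robotparser` from the Python standard library is used.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical default rules, standing in for the value from the config file.
DEFAULT_ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"


@dataclass
class Host:
    """Hypothetical per-domain settings row; field names are assumptions."""
    name: str
    enabled: bool = True               # domain status (enable/disable)
    page_limit: int = 1000             # pages limit per domain to index
    pages_indexed: int = 0             # pages already indexed for this domain
    robots_txt: Optional[str] = None   # None: assume the default rules apply


def can_crawl(host: Host, url: str, user_agent: str = "*") -> bool:
    """Return True if the crawler may fetch `url` for this host."""
    if not host.enabled:
        return False
    if host.pages_indexed >= host.page_limit:
        return False
    # Assumption: the default rules are used only when the domain has none.
    rules = host.robots_txt if host.robots_txt is not None else DEFAULT_ROBOTS_TXT
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# Usage: a domain without its own rules falls back to the default ones.
host = Host(name="example.ygg")
allowed = can_crawl(host, "http://example.ygg/index.html")
```

Checking the cheap conditions (status, page limit) before parsing robots.txt keeps disabled or full domains from costing any parsing work.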

In total, this update includes website settings customization, plus speed and data optimization.

More details are coming soon in README.md or the release version.

The database can be deployed from the MySQL Workbench project located here:
https://github.com/YGGverse/YGGo/tree/main/database
