Question: Is SleekDB suitable for a growing project? #193

Closed
webghostx opened this issue May 1, 2021 · 3 comments
Labels
discussion, question

Comments

@webghostx

Hello

First of all, congratulations on this impressive project. I like the usability so much 😄👍

My question is very specific and does not fit here very well, but I would still like to try to get an assessment. Scalability worries me because you write that SleekDB is designed to be a database for low to medium operational loads.

I'm working on a project that provides data for a website via an API and later also for a mobile app. A bot runs in the background and collects content from the web. Although I hope that my project will grow quickly, I decided to start on cheap shared hosting. I also ruled out NoSQL and opted for the common MySQL. Now I doubt that decision and am considering SleekDB.

The situation:
I expect around 100 - 1000 new entries per day that will be automatically written to the database. On the other hand, queries are made to the API (I hope for a few million a day later):

  • news: last x posts (simple query)
  • news filtered: x posts matching multiple conditions
  • search: full text search with conditions in all posts

It will be a mix of news site and search engine for a specific area.

Database table fields for posts that are relevant to conditions:

- id: int ai
- publisher_id: int (relation)
- publisher_name: varchar (index)
- author: varchar (index)
- date_pub: time
- date_pub_update: time
- author_name: (index)
- title: varchar (index)
- title_sub: varchar (index)
- pub_cats: varchar (index)
- pub_tags: varchar (index)
- post_type: tinyint 
- content: Big text (Fulltext index)

I use only a few relations, but I want to make queries with many conditions. The full-text search function is also important, and that worries me in MySQL. I also hope to have more freedom to develop a search algorithm with NoSQL.

The only thing that generally worries me about a NoSQL solution is collecting data for statistics. But I could also imagine operating a separate SQL database for this.

The final question is, is SleekDB suitable for my project?
Do I have to expect to switch to another database such as MongoDB later, if the traffic gets too big?

@Timu57
Member

Timu57 commented May 3, 2021

Hi @webghostx
Thank you for your kind words.

> I use only a few relations, but I want to make queries with many conditions. The full-text search function is also important, and that worries me in MySQL. I also hope to have more freedom to develop a search algorithm with NoSQL.

We already have a kind of full-text search implemented: https://sleekdb.github.io/#/searching

> The only thing that generally worries me about a NoSQL solution is collecting data for statistics. But I could also imagine operating a separate SQL database for this.

Using a separate table and calculating the statistics once per time period is always a good idea if there are many entries.
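
As a rough illustration of that periodic calculation, here is a minimal Python sketch. It is not SleekDB or MySQL code: the function name `rollup_posts_per_day` and the sample data are made up for the example, and only the field names `publisher_id` and `date_pub` come from the question. The idea is to run something like this once per period (for example from a cron job) and write the result to the separate statistics table instead of counting on every request.

```python
from collections import Counter
from datetime import date


def rollup_posts_per_day(posts):
    """Count posts per (publication date, publisher_id) pair."""
    return Counter((p["date_pub"], p["publisher_id"]) for p in posts)


# Made-up sample data; in practice these rows would come from the posts store.
posts = [
    {"publisher_id": 1, "date_pub": date(2021, 5, 1)},
    {"publisher_id": 1, "date_pub": date(2021, 5, 1)},
    {"publisher_id": 2, "date_pub": date(2021, 5, 2)},
]

daily_stats = rollup_posts_per_day(posts)
for (day, publisher_id), count in sorted(daily_stats.items()):
    print(day.isoformat(), publisher_id, count)
```

Each printed row corresponds to one row in the statistics table, so reads of the statistics never have to scan the posts themselves.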

> The final question is, is SleekDB suitable for my project?
> Do I have to expect to switch to another database such as MongoDB later, if the traffic gets too big?

If you have millions of entries per day, a file-based database like SleekDB will not be enough. But if you have millions of entries someday, changing the database will not be a problem.
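
For a sense of scale, the numbers from the question can be turned into average rates. This is only back-of-the-envelope arithmetic in Python: "a few million" reads per day is assumed to mean 3 million, and real traffic will have peaks well above the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Average rate per second for a given daily volume."""
    return events_per_day / SECONDS_PER_DAY


READS_PER_DAY = 3_000_000  # assumption: "a few million" taken as 3 million

writes_low = per_second(100)
writes_high = per_second(1_000)
reads = per_second(READS_PER_DAY)

print(f"writes: {writes_low:.4f} to {writes_high:.4f} per second")
print(f"reads:  {reads:.1f} per second on average")
print(f"reads per write: at least {READS_PER_DAY // 1_000:,}")
```

Under these assumptions there is roughly one write every 1.5 to 15 minutes against about 35 reads per second, so the workload is heavily read-dominated.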

MySQL vs MongoDB vs SleekDB - My opinion and knowledge

SleekDB

SleekDB is best used for read-heavy tasks because of its caching layer. If you have many writes, the cache will be refreshed on almost every request (unless a specific cache time to live is set).
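
The trade-off described above can be sketched with a toy cache in Python. This is not SleekDB's implementation or API: `CachedStore`, its `ttl` parameter, and the `scans` counter are hypothetical names for illustration. The sketch assumes that without a time to live every write clears the cache, and that with a time to live cached results are served until they expire, even if they are slightly stale.

```python
import time
from typing import Callable, Optional


class CachedStore:
    """Toy in-memory store with a query-result cache (hypothetical, not SleekDB)."""

    def __init__(self, ttl: Optional[float] = None,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl      # None: cached results live until the next write
        self.clock = clock
        self.docs = []
        self.cache = {}     # query key -> (stored_at, result)
        self.scans = 0      # how often the data was actually scanned

    def insert(self, doc: dict) -> None:
        self.docs.append(doc)
        if self.ttl is None:
            self.cache.clear()  # no TTL: every write invalidates the cache

    def find(self, key: str, predicate: Callable[[dict], bool]) -> list:
        entry = self.cache.get(key)
        if entry is not None:
            stored_at, result = entry
            if self.ttl is None or self.clock() - stored_at < self.ttl:
                return result   # cache hit
        self.scans += 1         # cache miss: scan all documents
        result = [d for d in self.docs if predicate(d)]
        self.cache[key] = (self.clock(), result)
        return result


def is_news(doc: dict) -> bool:
    return doc["post_type"] == 1


store = CachedStore(ttl=None)
store.insert({"id": 1, "post_type": 1})
store.find("news", is_news)   # first read scans the data
store.find("news", is_news)   # second read is served from the cache
store.insert({"id": 2, "post_type": 1})
store.find("news", is_news)   # the write cleared the cache, so this scans again
print(store.scans)            # prints 2
```

With a time to live (for example `CachedStore(ttl=60)`), writes no longer trigger a rescan; readers may see results up to 60 seconds old, which fits a feed that does not have to be fully up to date.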

MongoDB

MongoDB is a NoSQL document database. That means it is also best suited for read-heavy tasks. It is more scalable than a "traditional" relational database like MySQL.

MySQL

A well-known relational database with many features. It is resource-heavy, and reads are not that fast if it is not configured properly. Most shared hosting solutions come with it. But be aware that shared hosting solutions usually have some kind of request limit.

My conclusion and advice

Bigger companies and projects most of the time don't use just one type of database.
They use a mix of multiple different databases.

I would recommend sticking with MySQL or using SleekDB at the start and expanding in the future if the project takes off.

You can also use MongoDB if you want to, but be warned: MongoDB has only basic built-in text search (text indexes). More capable full-text search is usually achieved with software like Elasticsearch, which needs a lot of resources (RAM) and can be quite costly at the beginning.

SleekDB needs almost no configuration and is easy to use. It needs no external database and lives inside your project, which saves you time and hassle.

MySQL is well known, can handle many entries, and has features like full-text search and joins.

If you are willing to spend the time, have the knowledge to configure a MongoDB cluster and Elasticsearch, and can also pay the relatively high price for that, I would say that would be the best combo.

If you don't have the knowledge or the time, but want a solution that is enough for the foreseeable future, just stick with MySQL. You could also look into PostgreSQL if you want to.

If you want an easy solution that covers all your needs regarding search, joins etc. and is enough for the start, use SleekDB. (100 - 1000 entries per day)

@Timu57 added the discussion and question labels May 3, 2021
@webghostx
Author

Hi @Timu57

Thank you very much for your detailed assessment. That helps me a lot.

I will start with SleekDB and use MySQL for statistics. I don't think I would be happy with SQL alone.

Elasticsearch is a great thing, but I want to start at low cost and see the project grow.

> If you have millions of entries per day, a file-based database like SleekDB will not be enough. But if you have millions of entries someday, changing the database will not be a problem.

I don't think the writes will increase that much later, so that shouldn't be a problem. I hope the read access will be a lot :)

> SleekDB is best used for read-heavy tasks because of its caching layer. If you have many writes, the cache will be refreshed on almost every request (unless a specific cache time to live is set).

Then I will have to see how best to configure the cache. On the other hand, I can also control the timing of the write processes to some extent. The data doesn't always have to be completely up to date. I should be able to do that.

Thank you very much

@Timu57
Member

Timu57 commented May 4, 2021

You are welcome ☺️

@Timu57 closed this as completed May 4, 2021