Switch MongoDB to RDMS for storage #317
Comments
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
Needed for proper backups.
Backup is secured, but we still need to move the MongoDB content to the MariaDB instance.
Refreshing this ticket more than 3 years later:
So much for the issue we face.

MongoDB was chosen by a previous ZF developer and we never really questioned it until its RAM usage became problematic. It's important to note that we have no MongoDB expertise. Our instance is not tuned: we use the official image (4.2.9) with its defaults.

It's also important to note that we don't really need a schemaless DB. Most of what we store is code-driven and the main (only?) flexible part is the flags in the config. We could store that in a JSON field.

Using Mongo also means different code, different tools, different backup scenarios, etc. And writing aggregation queries is difficult and not self-explanatory. So it doesn't make sense to use it if we don't benefit from its main feature and this is the only project using it in a large collection of projects (actually the cardshop API also uses Mongo but will switch as well).

For maintenance's sake, we want to use a single type of database across our projects (this doesn't exclude a KV store like Redis where it fits), and we've decided to use PostgreSQL because of its wide support and its reputation for safety and performance.

Key tasks:
The most important task here is choosing the Python stack, because we will use it in other projects and we want this sorted once and for a good while.
Checklist of items not to forget:
Post-migration:
Some topics which have been discussed live with @rgaudin. I reproduce them here for history :)

requested_by and canceled_by on tasks / requested_tasks

In Mongo, tasks have "requested_by" and "canceled_by" fields, each holding a user name, but not all of these users exist.

tasks without worker or schedule

In Mongo, tasks have "worker" and "schedule_name" properties. Both are names, and in some rare cases the corresponding schedule and/or worker does not exist, probably because it has been renamed. 529 tasks out of 14432 are affected.

schedule durations for missing worker

In Mongo, there is a "durations" property on schedule where we store one duration per worker. Each duration is associated with its worker by name. Some schedule durations are linked to a worker name which does not exist anymore. The decision has been taken to get rid of these durations.

schedule durations with missing tasks

In the same "durations" property on schedule, each duration is also associated, by ID, with the task which was used to compute it. Some of these tasks do not exist anymore. The decision has been taken to keep these durations, since this is normal behavior. We have to support durations per worker whose corresponding task is no longer in the DB (i.e. when we purge tasks, we must not delete the duration, just remove the link to the task).
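As an illustration of that last rule, here is a minimal sketch of how "keep the duration, drop the link" could be expressed with a nullable foreign key and `ON DELETE SET NULL`. It uses Python's built-in `sqlite3` only so that it runs anywhere; the table and column names (`task`, `schedule_duration`, `task_id`, ...) and the sample values are made up for the example and are not the actual schema. PostgreSQL supports the same `ON DELETE SET NULL` clause.

```python
import sqlite3


def purge_demo():
    """Delete a task and return the durations that survive the purge."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE task (id INTEGER PRIMARY KEY);
        CREATE TABLE schedule_duration (
            id INTEGER PRIMARY KEY,
            schedule_name TEXT NOT NULL,
            worker_name TEXT NOT NULL,
            value INTEGER NOT NULL,
            -- nullable link: purging the task clears it, the row stays
            task_id INTEGER REFERENCES task(id) ON DELETE SET NULL
        );
        """
    )
    conn.execute("INSERT INTO task (id) VALUES (1)")
    conn.execute(
        "INSERT INTO schedule_duration"
        " (schedule_name, worker_name, value, task_id)"
        " VALUES ('example_schedule', 'worker-a', 3600, 1)"
    )
    # Purge the task; the duration row must remain, without its link.
    conn.execute("DELETE FROM task WHERE id = 1")
    rows = conn.execute(
        "SELECT worker_name, value, task_id FROM schedule_duration"
    ).fetchall()
    conn.close()
    return rows


remaining = purge_demo()
print(remaining)
```

With this kind of constraint the purge code does not need to know about durations at all; the database clears the link itself.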
For all those reasons, I believe we should switch from Mongo to MariaDB. Once done, memory usage should be lower and more controllable.
Just make sure that task creation is atomic to ensure only a single worker is assigned to a task.
pymysql and peewee should be sufficient.
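One possible way to get the "single worker per task" guarantee is a conditional `UPDATE` that only succeeds while the task is still unassigned, then checking how many rows were changed. The sketch below is an assumption about how this could look, not the project's code: it uses the standard-library `sqlite3` module as a stand-in for the real database, and the `task` table, its `worker_name` column and the `claim_task` helper are hypothetical names.

```python
import sqlite3


def claim_task(conn, task_id, worker_name):
    """Assign the task to worker_name only if no worker holds it yet.

    Returns True when this call won the task, False otherwise.
    """
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE task SET worker_name = ?"
            " WHERE id = ? AND worker_name IS NULL",
            (worker_name, task_id),
        )
    # The WHERE clause makes the check and the write a single statement,
    # so at most one caller can see a row count of 1 for a given task.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, worker_name TEXT)")
conn.execute("INSERT INTO task (id) VALUES (1)")
conn.commit()

first = claim_task(conn, 1, "worker-a")
second = claim_task(conn, 1, "worker-b")
print(first, second)
```

The same statement works unchanged in MariaDB or PostgreSQL; an ORM such as peewee would issue an equivalent `UPDATE ... WHERE` and expose the affected row count.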