Add more garbage collector rules #7545

Closed
10 tasks done
jdalsem opened this issue Nov 27, 2014 · 10 comments

Comments

@jdalsem
Member

jdalsem commented Nov 27, 2014

A recent issue, #7518, is the reason I started this ticket. Maybe Elgg core should provide more garbage collector scripts by default. A lot of data in the database is never used again, and Elgg should try to keep the data clean. There are multiple reasons why garbage piles up, and not all of them are coding problems: a dropped connection, a server going down, or a configuration error can also leave orphaned data behind. These situations cannot be prevented, so we should take care of the cleanup. The garbage collector plugin currently does that, but it is limited: there is far more garbage in the database than it cleans up.

In the past there was a plugin made by ColdTrick that did a lot more garbage collection. Maybe that plugin could be revisited and the extra garbage collections could be discussed here so they could be added to core.

The following tables are already cleaned up by the current garbage collector plugin:

  • groups_entity (checks for groups not having an entity in entities table)
  • metastring (checks for orphaned metastrings)
  • objects_entity (checks for objects not having an entity in entities table)
  • sites_entity (checks for sites not having an entity in entities table)
  • users_entity (checks for users not having an entity in entities table)

The garbage collector extended plugin (https://github.com/ColdTrick/garbagecollector_extended) provided the following extra checks that we could add to the garbage collector core plugin

Cleanup the following Elgg tables:

  • access_collections
    • removes ACLs whose owner_guid entity no longer exists
    • removes ACL members that no longer exist (not yet covered in a plugin)
  • annotations (removes annotations on entities that no longer exist)
  • entities
    • removes entities whose owner_guid no longer exists as an entity [too dangerous]
  • metadata (removes metadata on entities that no longer exist)
  • private_settings (removes private settings on entities that no longer exist) [table no longer exists]
  • entity_relationships (removes relationships where either of the two entities no longer exists)
  • river (removes river items where the object OR subject no longer exists) [not doing this; could be handled by the view]
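
To make the proposed rules concrete, here is a minimal sketch of the annotations, metadata and entity_relationships checks. It is not Elgg code: it uses Python's sqlite3 with an in-memory database as a stand-in for MySQL, and the schema is a simplified assumption (no table prefix, only the columns the checks need). The helper names `build_demo_db` and `collect_garbage` are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a few orphaned rows (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE entities (guid INTEGER PRIMARY KEY);
        CREATE TABLE annotations (id INTEGER PRIMARY KEY, entity_guid INTEGER);
        CREATE TABLE metadata (id INTEGER PRIMARY KEY, entity_guid INTEGER);
        CREATE TABLE entity_relationships (
            id INTEGER PRIMARY KEY, guid_one INTEGER, guid_two INTEGER);
        INSERT INTO entities (guid) VALUES (1), (2);
        INSERT INTO annotations (entity_guid) VALUES (1), (99);
        INSERT INTO metadata (entity_guid) VALUES (2), (98), (97);
        INSERT INTO entity_relationships (guid_one, guid_two)
            VALUES (1, 2), (1, 96), (95, 2);
    """)
    return db


def collect_garbage(db):
    """Delete rows that point at entities which no longer exist.

    Returns a dict mapping table name to the number of rows removed.
    """
    removed = {}
    # annotations and metadata each reference a single entity
    for table in ("annotations", "metadata"):
        cur = db.execute(
            f"DELETE FROM {table} "
            "WHERE entity_guid NOT IN (SELECT guid FROM entities)")
        removed[table] = cur.rowcount
    # a relationship is garbage when either side is missing
    cur = db.execute(
        "DELETE FROM entity_relationships "
        "WHERE guid_one NOT IN (SELECT guid FROM entities) "
        "OR guid_two NOT IN (SELECT guid FROM entities)")
    removed["entity_relationships"] = cur.rowcount
    db.commit()
    return removed
```

Calling `collect_garbage(build_demo_db())` removes only the rows whose referenced guid is absent from entities. On a large production table a real implementation would probably want to delete in batches rather than in one statement.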

Are these garbage collectors we want in core?

@beck24
Member

beck24 commented Nov 27, 2014

I say yes; I've been thinking about that myself.
A few times I've run into data corruption in the opposite direction, whereby an entities table record exists for a user but there is no corresponding row in users_entity. We need to check both ways.
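
A two-way check could look roughly like the sketch below. It only reports mismatches and deletes nothing. As before this is Python with sqlite3 standing in for MySQL, the schema is a simplified assumption (a `type` column on entities, a `guid` column on users_entity), and `find_mismatches` is a hypothetical helper name.

```python
import sqlite3


def build_demo_db():
    """In-memory database with one mismatch in each direction (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE entities (guid INTEGER PRIMARY KEY, type TEXT);
        CREATE TABLE users_entity (guid INTEGER PRIMARY KEY, username TEXT);
        INSERT INTO entities (guid, type)
            VALUES (1, 'user'), (2, 'user'), (3, 'object');
        INSERT INTO users_entity (guid, username)
            VALUES (1, 'alice'), (7, 'ghost');
    """)
    return db


def find_mismatches(db):
    """Return (user entities without a users_entity row,
               users_entity rows without an entities row)."""
    missing_secondary = [row[0] for row in db.execute(
        "SELECT e.guid FROM entities e "
        "LEFT JOIN users_entity u ON u.guid = e.guid "
        "WHERE e.type = 'user' AND u.guid IS NULL "
        "ORDER BY e.guid")]
    missing_primary = [row[0] for row in db.execute(
        "SELECT u.guid FROM users_entity u "
        "LEFT JOIN entities e ON e.guid = u.guid "
        "WHERE e.guid IS NULL "
        "ORDER BY u.guid")]
    return missing_secondary, missing_primary
```

Reporting instead of deleting leaves the decision about the first list to the site admin.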

@jdalsem
Member Author

jdalsem commented Nov 27, 2014

an entities table record exists for a user but no corresponding row in users_entity

I think that "corruption" could be intended for a certain use case: a custom user class with a custom table for additional entity data. So I do not think that check should be part of core.

@ewinslow
Contributor

One way to do this would be to switch to InnoDB and add foreign keys. Then
it's mostly automatic.
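
As an illustration of the foreign key idea, the sketch below declares metadata.entity_guid as a foreign key with ON DELETE CASCADE, so deleting an entity removes its metadata without any garbage collector. It uses SQLite through Python's sqlite3 module rather than InnoDB (the two differ in details), and the two-table schema and the `demo_cascade` name are assumptions made for the example.

```python
import sqlite3


def demo_cascade():
    """Delete one entity and return how many metadata rows are left."""
    db = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on for the connection
    db.execute("PRAGMA foreign_keys = ON")
    db.executescript("""
        CREATE TABLE entities (guid INTEGER PRIMARY KEY);
        CREATE TABLE metadata (
            id INTEGER PRIMARY KEY,
            entity_guid INTEGER NOT NULL
                REFERENCES entities (guid) ON DELETE CASCADE,
            name TEXT);
        INSERT INTO entities (guid) VALUES (1), (2);
        INSERT INTO metadata (entity_guid, name)
            VALUES (1, 'a'), (1, 'b'), (2, 'c');
    """)
    # Removing entity 1 cascades to its two metadata rows
    db.execute("DELETE FROM entities WHERE guid = 1")
    db.commit()
    return db.execute("SELECT COUNT(*) FROM metadata").fetchone()[0]
```

Only the metadata row belonging to entity 2 survives. The constraint also rejects inserting metadata for a guid that does not exist, which prevents new orphans from being created in the first place.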


@jdalsem
Member Author

jdalsem commented Nov 27, 2014

That may be true, but it will not cover all the use cases, it's harder to implement, and it forces people to use InnoDB.

@jdalsem
Member Author

jdalsem commented Feb 11, 2015

Another piece of "garbage" is old plugins #5063

@hypeJunction
Contributor

What happened to garbagecollector doing cleanup of orphaned rows?

@jdalsem
Member Author

jdalsem commented Apr 11, 2018

There are no secondary tables anymore, so those rules became obsolete.

@hypeJunction
Contributor

There are still metadata, annotations, etc.

@jdalsem
Member Author

jdalsem commented Apr 11, 2018

Yeah, but those were never cleaned up, AFAIK.

@Facyla
Contributor

Facyla commented Nov 29, 2018

This could help address #10770 if orphaned entities (whose owner does not exist) were deleted.
Also, anything with an owner but a non-existent container could be reassigned to the owner. Or it could be deleted by default to keep BC, as long as that triggers a delete hook that allows changing the entity's container instead.
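
The container reassignment could be sketched as below. Again this is Python with sqlite3 and a simplified, assumed schema, not Elgg code: `repair_containers` is a hypothetical name, and treating a container_guid of 0 as "no container" is an assumption of the sketch. Entities whose owner is also missing are left untouched here; they would fall under the orphan deletion rule instead.

```python
import sqlite3


def build_demo_db():
    """In-memory entities table with one broken container (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE entities (
            guid INTEGER PRIMARY KEY,
            owner_guid INTEGER NOT NULL,
            container_guid INTEGER NOT NULL);
        INSERT INTO entities (guid, owner_guid, container_guid) VALUES
            (1, 0, 0),    -- a user: no owner, no container
            (2, 1, 1),    -- healthy: owner and container both exist
            (3, 1, 50),   -- container 50 is gone, owner 1 exists
            (4, 60, 50);  -- owner and container are both gone
    """)
    return db


def repair_containers(db):
    """Point entities with a missing container at their (existing) owner.

    Returns the number of entities that were reassigned.
    """
    cur = db.execute(
        "UPDATE entities SET container_guid = owner_guid "
        "WHERE container_guid != 0 "  # assumption: 0 means 'no container'
        "AND owner_guid IN (SELECT guid FROM entities) "
        "AND container_guid NOT IN (SELECT guid FROM entities)")
    db.commit()
    return cur.rowcount
```

Note that MySQL does not allow a subquery to select directly from the table being updated, so a real implementation would need a join or a derived table instead of this single statement.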

@jdalsem jdalsem added this to the Elgg 5 milestone Aug 2, 2022
jeabakker added a commit to jeabakker/Elgg that referenced this issue Oct 28, 2022
jeabakker added a commit to jeabakker/Elgg that referenced this issue Oct 28, 2022
@jdalsem jdalsem closed this as completed Oct 28, 2022

6 participants