Skip to content
This repository has been archived by the owner on Sep 14, 2019. It is now read-only.

Purge database records periodically #135

Open
jvehent opened this issue Oct 9, 2015 · 6 comments
Open

Purge database records periodically #135

jvehent opened this issue Oct 9, 2015 · 6 comments

Comments

@jvehent
Copy link
Contributor

jvehent commented Oct 9, 2015

We currently store DB records forever, and after two years of running MIG on thousands of production systems, we still use a minimal amount of space (~70GB). Still, most of these records are not useful anymore and could be purge to speed up search queries.

I'm proposing the following strategy:

  • Purge agent rows from the agents table after 30 days if the agent has not received any command
  • Purge agent and command rows after 180 days
  • Never purge records from the actions, signatures and investigator tables

This should be implemented in the scheduler using a new periodic task.

@jvehent
Copy link
Contributor Author

jvehent commented Oct 9, 2015

Some numbers:

command records

mig=> select count(*) from commands;
20417300

agent records

mig=> select count(*) from agents;
21086710

Agent records that don't have any commands:

mig=> SELECT COUNT(*) from (SELECT id FROM agents WHERE agents.heartbeattime < NOW() - INTERVAL '30 days' EXCEPT SELECT agentid FROM commands) as q;
15033316

About 71% of agents never get targeted and can be deleted. I'm calling this a win :)

@ameihm0912 , @gdestuynder : any thoughts?

@gdestuynder
Copy link
Contributor

Best would probably be to make this configurable to accommodate different people's retention needs (including "0" for forever) - but yes i noticed the db was getting big before.
I also wonder how much space it would take if exported + compressed (for those who need long term history)

@jvehent
Copy link
Contributor Author

jvehent commented Oct 10, 2015

Running this manually right now.

DELETE FROM agents
WHERE id IN (SELECT id FROM agents
             WHERE agents.heartbeattime < NOW() - INTERVAL '30 days'
             EXCEPT SELECT agentid FROM commands);

@jvehent
Copy link
Contributor Author

jvehent commented Oct 12, 2015

Query to delete commands after 1 year:

SELECT COUNT(*) FROM commands WHERE finishtime < NOW() - INTERVAL '365 days';
 1076359

Query to delete agentdestroy commands after 90 days:

SELECT COUNT(*) FROM commands
WHERE actionid IN (SELECT id FROM actions
            WHERE operations->0->>'module'='agentdestroy'
            AND expireafter < NOW() - INTERVAL '90 days');
 74699

@jvehent
Copy link
Contributor Author

jvehent commented Oct 12, 2015

Running now:

DELETE FROM commands
WHERE finishtime < NOW() - INTERVAL '365 days';

DELETE FROM commands
WHERE actionid IN (SELECT id FROM actions
                   WHERE operations->0->>'module'='agentdestroy'
                   AND expireafter < NOW() - INTERVAL '90 days');

DELETE FROM agents
WHERE id IN (SELECT id FROM agents
             WHERE agents.heartbeattime < NOW() - INTERVAL '30 days'
             EXCEPT SELECT agentid FROM commands);

VACUUM ANALYZE;

@jvehent
Copy link
Contributor Author

jvehent commented Oct 15, 2015

The queries above deleted millions of records, but no space was reclaimed. This may be an AWS RDS quirk, or a Posgres one, I'm not sure yet.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants