Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Script to garbage collect in a really lame way.
PHP
branch: master

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.
README.md
script.php

README.md

session trasher

Clone this repo: ./script -p=MYSQLROOTPASSWORD -d='DATETIME-START-DATE'

Note: This script is opinionated, which means it has the least configuration necessary (for myself). Fork and improve if necessary.

What's this?

Status Quo

Our setup: sessions are saved in MySQL and kept for up to 8 days.

Problem?

If lots of data is involved GC sucks with MySQL (no matter which table engine involved). We average around 3-5 GB of session data depending on the time of the year and the traffic of course.

Imagine an SQL statement such as:

DELETE FROM sessions WHERE rec_datemod <= DATE_SUB(NOW(), INTERVAL 8 DAY)

On MyISAM this will of course easily lock the table because of a full table scan, that is despite having an index on rec_datemod.

InnoDB will not lock up as fast, but depending on the amount of data currently in the table, it will lock up eventually as well. DELETE FROM table is not a recommended operation in InnoDB-land, people like to dump and reload instead. With a close to zero-downtime objective, this is pretty hard and adds unnecessary operational complexity.

We usually set innodb_buffer_pool_size to about 80% of the available memory and even that doesn't help much. There's something about MySQL where when a query matches a large amount of the data in your table it will result in a full table scan regardless (translated: you're out of luck).

PHP!

PHP likes to do session garbage collection automatically in a request, this leads to problems as well when a lot of data is involved.

Related configuration options:

session.gc_maxlifetime
session.gc_probability
session.gc_divisor

Solution

Turn off PHP's session garbage collection (session.gc_probability = 0) and try to run session garbage collection asynchronous.

  1. Run a larger database server (e.g. an m1.large on AWS EC2 doesn't cut it) with a lot of RAM.
  2. Run a session garbage collector via cronjob as often as possible.
  3. Watch the queries!

You mad, bro?

SQL databases suck. After a while it can be unreasonably difficult to keep your database server running because you always require a lot of RAM. RAM is still expensive (especially in the cloud).

Enter Session Trasher

Session Trasher is meant to run as a cronjob. It'll create queries which delete by second so to speak.

This is not incredibly fast, but it works without interrupting the service. Just to make sure: run it during the night. ;-)

Fin

Final solution: migrate somewhere else.

Something went wrong with that request. Please try again.