Mr. Clean

Mr. Clean is an extendible PHP cleaner that allows you to easily clean up strings, arrays, objects, and anything in between.

Installation

Using composer:

{
    "require": {
        "joetannenbaum/mr-clean": "~0.0"
    }
}

Basic Usage

Fire it up like so:

require_once 'vendor/autoload.php';

$cleaner = new MrClean\MrClean();

Scrubbers

Scrubbers are the classes and functions that actually do the work. You can assign as many as you want to clean your data.

$scrubbers = [
    'trim',
    'stripslashes',
    'strip_tags',
    'remove_weird_characters',
];

$scrubbed = $cleaner->scrubbers($scrubbers)->scrub('I\'m not that dirty.');

Scrubbers should always be passed as an array, and will be run in the order that you specify.

Any single-argument string manipulation function can be used. To reference a class, convert its StudlyCase name to snake_case. In the example above, remove_weird_characters refers to a (fictional) class named RemoveWeirdCharacters.
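The naming convention above can be illustrated with a few lines of plain PHP. This is only a sketch of the snake_case to StudlyCase mapping; snake_to_studly is a hypothetical helper written for this example and is not part of Mr. Clean's documented API.

```php
<?php

// Hypothetical helper (an assumption for illustration, not library code):
// maps a snake_case scrubber name to the StudlyCase class name it refers to.
function snake_to_studly(string $name): string
{
    // ucwords() with '_' as delimiter capitalizes the first letter and
    // every letter that follows an underscore; the underscores are then removed.
    return str_replace('_', '', ucwords($name, '_'));
}

echo snake_to_studly('remove_weird_characters'), PHP_EOL; // RemoveWeirdCharacters
echo snake_to_studly('strip_phone_number'), PHP_EOL;      // StripPhoneNumber
```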

Pre/Post

To save some typing, you can set scrubbers to run every time before and after each cleaning:

$cleaner->pre(['trim']);
$cleaner->post(['htmlentities']);

// trim will run before each of these, htmlentities after each
$cleaner->scrubbers(['strip_tags'])->scrub('This should be cleaned.');
$cleaner->scrubbers(['remove_weird_characters'])->scrub('So should this.');
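To make the run order concrete, here is a minimal sketch in plain PHP that does not use Mr. Clean at all. It assumes the effective order is pre scrubbers, then the scrubbers for the call, then post scrubbers; run_in_order is a hypothetical helper introduced only for this example.

```php
<?php

// Hypothetical helper (not library code): runs single-argument
// functions over a string in the order they are listed.
function run_in_order(array $functions, string $value): string
{
    foreach ($functions as $function) {
        $value = $function($value);
    }

    return $value;
}

$pre       = ['trim'];
$scrubbers = ['strip_tags'];
$post      = ['htmlentities'];

// Assumed order: pre, then the per-call scrubbers, then post.
$result = run_in_order(array_merge($pre, $scrubbers, $post), '  <b>Fish & chips</b> ');

echo $result, PHP_EOL; // Fish &amp; chips
```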

What Can Be Cleaned

Better question: what can't? A string, an array of arrays, a single object, an array of objects: whatever you try, Mr. Clean will probably be able to clean it. All of the following will work:

$scrubbed = $cleaner->scrubbers(['trim'])->scrub('Holy string, Batman.');

$scrubbed = $cleaner->scrubbers(['trim'])->scrub(['Holy', 'array', 'Batman']);

$scrubbed = $cleaner->scrubbers(['trim'])->scrub([
        ['Holy', 'array', 'of', 'arrays', 'Batman'],
        ['Holy', 'array', 'of', 'arrays', 'Batman'],
    ]);

$scrubbed = $cleaner->scrubbers(['trim'])->scrub((object) [
        'first_word'  => 'Holy',
        'second_word' => 'object',
        'third_word'  => 'Batman',
    ]);

$scrubbed = $cleaner->scrubbers(['trim'])->scrub([
        (object) [
            'first_word'  => 'Holy',
            'second_word' => 'array',
            'third_word'  => 'of',
            'fourth_word' => 'objects',
            'fifth_word'  => 'Batman',
        ],
        (object) [
            'first_word'  => 'Holy',
            'second_word' => 'array',
            'third_word'  => 'of',
            'fourth_word' => 'objects',
            'fifth_word'  => 'Batman',
        ],
    ]);

$scrubbed = $cleaner->scrubbers(['trim'])->scrub([
        (object) [
            'first_word'  => 'Holy',
            'second_word' => 'mixed',
            'third_word'  => ['bag', 'Batman'],
        ],
    ]);
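One way to picture how a single scrubber can reach every string in the structures above is a recursive walk. The sketch below is plain PHP written for illustration; scrub_recursive is a hypothetical function, and whether Mr. Clean copies objects or modifies them in place is not stated here, so the clone is an assumption.

```php
<?php

// Hypothetical helper (not library code): applies $function to every
// string found in nested arrays and objects, leaving other types alone.
function scrub_recursive($data, callable $function)
{
    if (is_string($data)) {
        return $function($data);
    }

    if (is_array($data)) {
        foreach ($data as $key => $item) {
            $data[$key] = scrub_recursive($item, $function);
        }

        return $data;
    }

    if (is_object($data)) {
        // Assumption: work on a copy so the caller's object is untouched.
        $copy = clone $data;

        foreach (get_object_vars($copy) as $key => $item) {
            $copy->$key = scrub_recursive($item, $function);
        }

        return $copy;
    }

    // Integers, booleans, null, etc. pass through unchanged.
    return $data;
}

$scrubbed = scrub_recursive([
    (object) [
        'first_word' => ' Holy ',
        'third_word' => [' bag', 'Batman '],
    ],
], 'trim');

echo $scrubbed[0]->first_word, PHP_EOL;    // Holy
echo $scrubbed[0]->third_word[1], PHP_EOL; // Batman
```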

Cleaning Specific Keys

Sometimes you don't want to use the same scrubbers on every key in an object or associative array. No problem. Just let Mr. Clean know which ones to apply where and he'll take care of it:

$scrubbers = [
    'first_name' => ['trim'],
    'last_name'  => ['stripslashes', 'htmlentities'],
];

$data = [
    [
        'first_name' => 'Joe ',
        'last_name'  => 'O\'Donnell',
    ],
    [
        'first_name' => ' Harold',
        'last_name'  => 'Frank & Beans',
    ],
];

$scrubbed = $cleaner->scrubbers($scrubbers)->scrub($data);

/*
[
    [
        'first_name' => 'Joe',
        'last_name'  => "O'Donnell",
    ],
    [
        'first_name' => 'Harold',
        'last_name'  => 'Frank &amp; Beans',
    ],
]
*/
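The per-key behaviour can be approximated in plain PHP as follows. This is a sketch under the assumption that each key's scrubbers run in order on that key only; scrub_keys is a hypothetical helper, not the library's implementation.

```php
<?php

// Hypothetical helper (not library code): for each row, runs the listed
// single-argument functions on the matching key and leaves other keys alone.
function scrub_keys(array $rows, array $scrubbersByKey): array
{
    foreach ($rows as $index => $row) {
        foreach ($scrubbersByKey as $key => $functions) {
            if (!array_key_exists($key, $row)) {
                continue;
            }

            foreach ($functions as $function) {
                $row[$key] = $function($row[$key]);
            }
        }

        $rows[$index] = $row;
    }

    return $rows;
}

$rows = scrub_keys(
    [['first_name' => ' Harold', 'last_name' => 'Frank & Beans']],
    ['first_name' => ['trim'], 'last_name' => ['stripslashes', 'htmlentities']]
);

echo $rows[0]['first_name'], PHP_EOL; // Harold
echo $rows[0]['last_name'], PHP_EOL;  // Frank &amp; Beans
```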

You can also still specify scrubbers that should run for everything:

$scrubbers = [
    'strip_tags',
    'first_name' => ['trim'],
    'last_name'  => ['stripslashes', 'htmlentities'],
    'htmlspecialchars',
];

Available Scrubbers

Mr. Clean comes with a bevy of pre-built scrubbers you can use:

Boolean

Converts falsey text and anything considered empty to false, otherwise returns true. Falsey text includes (not case sensitive):

  • no
  • n
  • false

$movies_seen = [
    'The Dark Knight'   => 'y',
    'The Green Lantern' => 'n',
    'The Avengers'      => 'yes',
];

$scrubbed = $cleaner->scrubbers(['boolean'])->scrub( $movies_seen );

/*
[
    'The Dark Knight'   => true,
    'The Green Lantern' => false,
    'The Avengers'      => true,
];
*/
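The rule described above can be sketched in plain PHP. The function to_boolean is a hypothetical stand-in written for this example; it assumes "considered empty" means PHP's empty(), which may differ from what the bundled scrubber actually does.

```php
<?php

// Hypothetical stand-in (not library code) for the rule described above.
function to_boolean($value): bool
{
    // Falsy text is matched without regard to case.
    if (is_string($value) && in_array(strtolower($value), ['no', 'n', 'false'], true)) {
        return false;
    }

    // Assumption: "considered empty" follows PHP's empty() ('', '0', 0, null, [], false).
    return !empty($value);
}

var_export(to_boolean('y'));     // true
echo PHP_EOL;
var_export(to_boolean('No'));    // false
echo PHP_EOL;
var_export(to_boolean(''));      // false
echo PHP_EOL;
```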

HTML

Strips tags that are not on the whitelist, removes empty content tags, and removes repeated opening or closing tags. The whitelist includes:

  • a
  • p
  • div
  • strong
  • em
  • b
  • i
  • br
  • ul
  • ol
  • li
  • h1
  • h2
  • h3
  • h4
  • h5
  • h6

$dirty = '<p><p>Some bad HTML here.</p><hr /><em></em><div>Soon to be cleaner.</div>';

$scrubbed = $cleaner->scrubbers(['html'])->scrub( $dirty );

// <p>Some bad HTML here.</p><div>Soon to be cleaner.</div>

Strip CSS Attributes

Strips the style, class, and id attributes from all HTML elements.

$dirty = '<p style="font-weight:bold;" id="bold-el" class="boldest">This was once bold.</p>';

$scrubbed = $cleaner->scrubbers(['strip_css_attributes'])->scrub($dirty);

// <p>This was once bold.</p>

Nullify

If a trimmed string doesn't have any length, null it out:

$dirty = [
    'cool',
    'also cool',
    ' ',
    '    ',
];

$scrubbed = $cleaner->scrubbers(['nullify'])->scrub($dirty);

/*
[
    'cool',
    'also cool',
    null,
    null,
];
*/
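In plain PHP, the rule amounts to a trim followed by an empty-string check. The function below is a hypothetical sketch for illustration, not the scrubber's actual source.

```php
<?php

// Hypothetical stand-in (not library code): returns null when a string
// has no length after trimming; otherwise returns the value unchanged.
function nullify_blank($value)
{
    if (is_string($value) && trim($value) === '') {
        return null;
    }

    return $value;
}

var_export(nullify_blank('    ')); // NULL
echo PHP_EOL;
var_export(nullify_blank('cool')); // 'cool'
echo PHP_EOL;
```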

Null If Repeated

If a string is just a repeated character ('1111111' or 'aaaaaaaaa') and has a length greater than two, null it out:

$dirty = [
    '11111111',
    '22',
    'bbbbbbbb',
    '333334',
];

$scrubbed = $cleaner->scrubbers(['null_if_repeated'])->scrub($dirty);

/*
[
    null,
    '22',
    null,
    '333334',
];
*/
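A rough plain-PHP equivalent of this rule is shown below. null_if_repeated_sketch is a hypothetical function for illustration; it compares bytes, so treating multibyte characters correctly would need the mb_* functions instead.

```php
<?php

// Hypothetical stand-in (not library code): returns null when a string is
// longer than two characters and consists of one character repeated.
function null_if_repeated_sketch($value)
{
    if (
        is_string($value)
        && strlen($value) > 2
        && str_repeat($value[0], strlen($value)) === $value
    ) {
        return null;
    }

    return $value;
}

var_export(null_if_repeated_sketch('11111111')); // NULL
echo PHP_EOL;
var_export(null_if_repeated_sketch('22'));       // '22'
echo PHP_EOL;
var_export(null_if_repeated_sketch('333334'));   // '333334'
echo PHP_EOL;
```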

Strip Phone Number

Strips a phone number down to just the good bits: digits and the letter 'x' (for extensions):

$dirty = [
    '555-555-5555',
    '(123) 456-7890',
    '198 765 4321 ext. 888',
];

$scrubbed = $cleaner->scrubbers(['strip_phone_number'])->scrub($dirty);

/*
[
    '5555555555',
    '1234567890',
    '1987654321x888',
];
*/
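The same result can be reproduced with a single regular expression in plain PHP. The function below is a hypothetical sketch; lowercasing first is an assumption made so that an uppercase 'X' is kept as 'x'.

```php
<?php

// Hypothetical stand-in (not library code): keeps only digits and the
// letter 'x', which is what remains of "ext." once other characters go.
function strip_phone(string $value): string
{
    // Assumption: lowercase first so 'X' and 'x' are treated alike.
    return preg_replace('/[^0-9x]/', '', strtolower($value));
}

echo strip_phone('(123) 456-7890'), PHP_EOL;        // 1234567890
echo strip_phone('198 765 4321 ext. 888'), PHP_EOL; // 1987654321x888
```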

Extending

You can register custom scrubbers with Mr. Clean.

Writing a Scrubber

First, write your class. All you have to do is extend MrClean\Scrubber\BaseScrubber, which adheres to MrClean\Scrubber\ScrubberInterface. A single property, value, is available to you; this is the string you will manipulate:

namespace Your\Namespace;

use MrClean\Scrubber\BaseScrubber;

class YourCustomScrubber extends BaseScrubber {

    public function scrub()
    {
        return str_replace('!', '.', $this->value);
    }

}

And that's it. Now just register your scrubber with Mr. Clean.

Registering a Scrubber

The register method takes either a string containing the fully qualified class name or an array of such class names.

$cleaner->register('Your\Namespace\YourCustomScrubber');

Now, go ahead and use it:

$dirty = [
    'I need to calm down!',
    'Me too!',
];

$scrubbed = $cleaner->scrubbers(['your_custom_scrubber'])->scrub($dirty);

/*
[
    'I need to calm down.',
    'Me too.',
]
*/