Slugifier interface proposal #100

Closed
wants to merge 2 commits into
from
+78 −0
@@ -0,0 +1,78 @@
+Slugifier Interface
+===================
+
+This document describes a common interface for slugifier libraries.
+
+The goal is to provide a simple common interface for classes which generate
+URL-safe strings, or *slugs*. Such classes are commonly known as *slugifiers*.
+
+There are many slugifier libraries in existence, each of which handles the
+process in a slightly different way.
+
+Frameworks and CMSs that have custom needs MAY extend the interface for their own
+purpose, but SHOULD remain compatible with this document. This ensures that
+third-party libraries can use the same implementation.
+
+The term "URL-safe" implies a string which does not contain any characters forbidden in
+[RFC 1738][]. The word "slugify" refers to the process of producing a URL-safe string
+from a given source string.
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC 2119][].
+
+[RFC 2119]: http://tools.ietf.org/html/rfc2119
+[RFC 1738]: http://tools.ietf.org/html/rfc1738
+
+1. Specification
+-----------------
+
+### 1.1 Basics
+
+- The `SlugifierInterface` exposes one method, `slugify`, which accepts a
+ *source string* and returns a slugified string - the *return string*.
+
+### 1.2 Source string
+
+ - There MUST NOT be any restrictions on which characters are accepted in the source string.
+ - Characters in the source string SHOULD NOT be assigned any special meaning.
+
+### 1.3 Return string
+
+The return string:
+
+ - MUST be URL-safe.
@dbu

dbu Mar 22, 2013

Member

does rfc1738 define what that means exactly? i wonder if we should say: must not contain any unsafe or reserved characters. we do not speak about "/" here, but that definitely must be handled by a slugifier, as must the other reserved characters (imagine i use a slug in a get parameter: a & would break things.)

@dantleech

dantleech Mar 22, 2013

Well, I define the term in the first part of the document:

The term "URL-safe" implies a string which does not contain any characters forbidden in
[RFC 1738][].

The relevant part of the RFC is section 2.2, "URL Character Encoding Issues". Maybe I should mention that explicitly.

The "/" character is classed as unsafe in the RFC.
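For reference, the strict reading discussed in this thread (no unsafe and no reserved characters) can be sketched as a small check. This is only an illustration: `isUrlSafe` is a hypothetical helper, not part of the proposal. It assumes "URL-safe" means the string uses only the characters that RFC 1738 section 2.2 allows unencoded anywhere in a URL, namely ASCII letters, digits, and `$-_.+!*'(),`. Note that section 2.2 lists ";", "/", "?", ":", "@", "=" and "&" as reserved characters; under this reading they are rejected along with the unsafe ones.

```php
<?php

/**
 * Hypothetical helper, not part of the proposed interface.
 *
 * Assumption: a string is "URL-safe" if it contains only ASCII letters,
 * digits and the special characters $-_.+!*'(), so both the unsafe and
 * the reserved characters of RFC 1738 section 2.2 are rejected.
 *
 * @param string $string
 * @return bool
 */
function isUrlSafe($string)
{
    // \A and \z anchor the match to the whole string; an empty string passes.
    return preg_match('/\A[A-Za-z0-9$\-_.+!*\'(),]*\z/', $string) === 1;
}
```

With this reading, `isUrlSafe('hello-world')` is true, while `isUrlSafe('a/b')` and `isUrlSafe('a&b')` are false, which covers the GET parameter case raised above.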

+ - MUST be a string.
+ - MAY be an empty string.
@dbu

dbu Mar 22, 2013

Member

does this make sense? should we not rather say the opposite, that it must be a non-empty string even when the source was empty? a slugifier could generate a random hash if there is no source. if the user does not want this, he could check the source, rather than having to check the return string even if his source was non-empty.

@staabm

staabm Mar 22, 2013

maybe the slugifier should throw an exception on empty param?

@dantleech

dantleech Mar 22, 2013

Well, my thinking is that the slugifier should be as simple as possible: it should just be a filter that does not worry about implementation details. If the implementing class decides to return a non-empty string when given an empty string, that is its decision.

I have created a wiki page on symfony-cmf which lists all the slugifiers I have found and have listed some of the features of each.

  • 3/4 slugifiers specify a separator in the signature, e.g. function slugify($string, $separator = '-')
  • 3/4 slugifiers return an empty string if given an empty string.
  • None of the slugifiers throw exceptions.

Wiki page: https://github.com/symfony-cmf/symfony-cmf/wiki/Slugifier

@VictorBjelkholm

VictorBjelkholm Jun 22, 2013

It definitely makes sense to return an empty string if you pass it an empty string. I think it's more confusing if you can pass an empty string and get back something other than an empty string.

@DASPRiD

DASPRiD Sep 30, 2013

@dantleech There's BaconStringUtils missing on that wiki page: https://github.com/Bacon/BaconStringUtils

+
+2. Package
+----------
+
+The interface described and a test suite to verify your implementation are provided
+in the [psr/slugifier](https://packagist.org/__________) package.
+
+3. `Psr\Slugifier\SlugifierInterface`
+-------------------------------
+
+```php
+<?php
+
+namespace Psr\Slugifier;
+
+/**
+ * Describes a slugifier instance
+ *
+ * See _______
+ * for the full interface specification.
+ */
+interface SlugifierInterface
+{
+    /**
+     * Return a URL-safe version of a string.
+     *
+     * @param string $string
+     * @return string
+     */
+    public function slugify($string);
@dbu

dbu Mar 23, 2013

Member

so from your survey, i think it would make sense to add an optional $separator parameter. as it's on the interface, could implementations specify their own default, or would $separator = '-' force this as the default for all implementations?

@dantleech

dantleech Mar 25, 2013

I guess that was the implication of the survey, but I guess this is where it also gets complicated. After all, a separator is really an option and should probably be handled by a prior method call on the class. In addition to the separator, we can also imagine options such as lowercasing the slug.

But then all that is implementation specific, so I would still vote for just keeping one argument.

@dbu

dbu Mar 25, 2013

Member

to promote DI it actually makes sense to say it's implementation specific how to configure your slugifier. static methods are not an option anyway. in a symfony case, you would configure the separator and lowercase or whatever options the concrete implementation offers in the service configuration. and if different places need different defaults, you just inject differently configured instances of the slugifier into them.

@staabm

staabm Mar 25, 2013

I would second that: a separator could be set during construction of the slugifier implementation and doesn't need to be defined in the interface, because it is just an implementation detail.

+}
+```
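The conclusion of the thread above (one-argument `slugify`, options set on the instance) could look roughly like the sketch below. Everything except `SlugifierInterface::slugify` is an assumption made for illustration: the class name `SeparatorSlugifier`, its constructor options, and the very naive ASCII-only transformation are hypothetical and do not come from any of the surveyed libraries. The interface is restated without its namespace so the snippet runs on its own.

```php
<?php

// Restated from the proposal (namespace omitted) so this file is self-contained.
interface SlugifierInterface
{
    public function slugify($string);
}

/**
 * Hypothetical implementation: separator and lowercasing are configured
 * once in the constructor instead of being passed to slugify().
 */
final class SeparatorSlugifier implements SlugifierInterface
{
    private $separator;
    private $lowercase;

    // The caller is responsible for choosing a separator that is itself URL-safe.
    public function __construct($separator = '-', $lowercase = true)
    {
        $this->separator = $separator;
        $this->lowercase = $lowercase;
    }

    public function slugify($string)
    {
        // Split on every run of bytes that are not ASCII letters or digits.
        // Naive by design: non-ASCII characters are dropped, not transliterated.
        $parts = preg_split('/[^A-Za-z0-9]+/', (string) $string, -1, PREG_SPLIT_NO_EMPTY);
        $slug = implode($this->separator, $parts);

        return $this->lowercase ? strtolower($slug) : $slug;
    }
}

$default = new SeparatorSlugifier();
$underscored = new SeparatorSlugifier('_', false);

echo $default->slugify('Hello, World!'), "\n";    // hello-world
echo $underscored->slugify('Hello  World'), "\n"; // Hello_World
```

Code that depends only on `SlugifierInterface` can be handed either instance, which is the dependency injection argument made above. An empty source string yields an empty return string here, matching the "MAY be an empty string" rule.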