Slugifier interface proposal #100

Open
wants to merge 2 commits into from

10 participants

@dantleech

This is a proposal for providing a common interface for the many slugifying/urlizing implementations out there.

This PR is a result of the discussion on the symfony-cmf mailing list: https://groups.google.com/forum/?fromgroups=#!topic/symfony-cmf-devs/3l1XVcIhf_c

/cc @dbu, @lsmith77

@lsmith77

typo in the method name

@lsmith77

I guess one expectation related to this is to ensure that the slug is unique. I guess what would happen then is that someone would create a slugifier that is able to, e.g., do a DB lookup to check whether the slug exists already and, if it does, append an integer, etc.

then again, as soon as this is done, one would also need some identifier (e.g. the parent path), so maybe such things are beyond the scope of this interface.
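
The uniqueness idea above could be sketched as a decorator around any slugifier. This is only an illustration: `SimpleSlugifier`, `UniqueSlugifier` and the `$exists` callback are hypothetical names that do not appear in the proposal, and the callback stands in for the DB lookup (it is also where context such as the parent path would have to live, which is why this arguably stays outside the interface).

```php
<?php

// Local copy of the proposed interface so the sketch runs on its own.
interface SlugifierInterface
{
    public function slugify($string);
}

// Hypothetical base slugifier: lowercases, then collapses every run of
// characters outside a-z and 0-9 into a single "-".
final class SimpleSlugifier implements SlugifierInterface
{
    public function slugify($string)
    {
        $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower((string) $string));

        return trim($slug, '-');
    }
}

// Hypothetical decorator: appends "-2", "-3", ... until the callback
// reports that the candidate slug is not taken yet.
final class UniqueSlugifier implements SlugifierInterface
{
    private $inner;
    private $exists;

    public function __construct(SlugifierInterface $inner, callable $exists)
    {
        $this->inner = $inner;
        $this->exists = $exists;
    }

    public function slugify($string)
    {
        $base = $this->inner->slugify($string);
        $slug = $base;

        for ($i = 2; call_user_func($this->exists, $slug); $i++) {
            $slug = $base . '-' . $i;
        }

        return $slug;
    }
}

// An in-memory list stands in for the database lookup.
$taken = ['hello-world', 'hello-world-2'];
$slugifier = new UniqueSlugifier(new SimpleSlugifier(), function ($slug) use ($taken) {
    return in_array($slug, $taken, true);
});

echo $slugifier->slugify('Hello, World!'), "\n"; // hello-world-3
```

Because the decorator itself implements `SlugifierInterface`, consumers would not need to know whether uniqueness is enforced.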

@lsmith77

it might make sense to reference an RFC to define what "URL-safe" means.

ah doh .. you linked the RFC further down .. so nevermind :)

@lsmith77

subnamespace needs to be updated

@lsmith77

subnamespace needs to be updated

@lsmith77

text needs to be updated to not talk about log

@staabm

like the idea in general :+1:

@dbu dbu commented on the diff
proposed/slugifier-interface.md
((31 lines not shown))
+
+- The `SlugifierInterface` exposes one method, `slugify`, which accepts a
+ *source string* and returns a slugified string - the *return string*.
+
+### 1.2 Source string
+
+ - There MUST NOT be any restrictions on which characters are accepted in the source string.
+ - Characters in the source string should not assume any special meaning.
+
+### 1.3 Return string
+
+The return string:
+
+ - MUST be URL-safe.
+ - MUST be a string.
+ - MAY be an empty string.
@dbu
dbu added a note

does this make sense? should we not rather say the opposite, that it must be a non-empty string even when the source was empty? a slugifier could generate a random hash if there is no source. if the user does not want this, he could check the source, rather than having to check the return string even if his source was non-empty.

@staabm
staabm added a note

maybe the slugifier should throw an exception on empty param?

Well, my thinking is that the slugifier should be as simple as possible - it should just be a filter that does not worry about implementation details. I think if the implementing class decides to return a non-empty string when given an empty string, that is its decision.

I have created a wiki page on symfony-cmf which lists all the slugifiers I have found and have listed some of the features of each.

  • 3/4 slugifiers specify a separator in the signature, e.g. function slugify($string, $separator = '-')
  • 3/4 slugifiers return an empty string if given an empty string.
  • None of the slugifiers throw exceptions.

Wiki page: https://github.com/symfony-cmf/symfony-cmf/wiki/Slugifier

It definitely makes sense to return an empty string if you pass it an empty string. I think it's more confusing if you can pass an empty string and get back something other than an empty string.

@DASPRiD
DASPRiD added a note

@dantleech There's BaconStringUtils missing on that wiki page: https://github.com/Bacon/BaconStringUtils
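
To make the empty-string question concrete, here is a minimal sketch of a naive, ASCII-only slugifier. `AsciiSlugifier` is a hypothetical class written for this illustration and is not one of the surveyed libraries. It shows that an empty return string can also result from a non-empty source, which is the case a caller would have to check for if the interface allows empty return strings.

```php
<?php

// Local copy of the proposed interface so the sketch runs on its own.
interface SlugifierInterface
{
    public function slugify($string);
}

// Hypothetical naive implementation: keeps only a-z and 0-9, works on bytes.
final class AsciiSlugifier implements SlugifierInterface
{
    public function slugify($string)
    {
        $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower((string) $string));

        return trim($slug, '-');
    }
}

$slugifier = new AsciiSlugifier();

var_dump($slugifier->slugify(''));          // string(0) ""
var_dump($slugifier->slugify('?!'));        // string(0) "" although the source was non-empty
var_dump($slugifier->slugify('Some Title')); // string(10) "some-title"
```

An implementation that transliterates or falls back to a generated value would behave differently here, which is the kind of decision the thread proposes to leave to the implementing class.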

@dbu dbu commented on the diff
proposed/slugifier-interface.md
((29 lines not shown))
+
+### 1.1 Basics
+
+- The `SlugifierInterface` exposes one method, `slugify`, which accepts a
+ *source string* and returns a slugified string - the *return string*.
+
+### 1.2 Source string
+
+ - There MUST NOT be any restrictions on which characters are accepted in the source string.
+ - Characters in the source string should not assume any special meaning.
+
+### 1.3 Return string
+
+The return string:
+
+ - MUST be URL-safe.
@dbu
dbu added a note

does the rfc1738 define what that means exactly? i wonder if we should say: may not contain any unsafe nor reserved characters. we do not speak about "/" here, but that definitely must be handled by a slugifier, as must be the other reserved characters (imagine i use a slug in a get parameter, a & would break things.)

Well, I define the term in the first part of the document:

The term "URL-safe" implies a string which does not contain any characters forbidden in
[RFC 1738][].

The actual part of the RFC is 2.2 URL Character encoding issues. Maybe I should mention that explicitly.

The "/" character is classed as unsafe in the RFC.
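
One possible reading of "neither unsafe nor reserved" is sketched below. `isUrlSafe` is a hypothetical helper, not part of the proposal. It accepts only what RFC 1738 section 2.2 allows unencoded everywhere: alphanumerics plus the special characters `$-_.+!*'(),`. Note that the same section lists "/" and "&" among the reserved characters (`;/?:@=&`) rather than the unsafe ones; under this reading they are rejected either way.

```php
<?php

// Hypothetical helper: true when $slug consists only of alphanumerics and
// the RFC 1738 special characters $-_.+!*'(), (an empty string also passes).
function isUrlSafe($slug)
{
    return preg_match('/^[A-Za-z0-9$\-_.+!*\'(),]*$/D', $slug) === 1;
}

var_dump(isUrlSafe('hello-world')); // bool(true)
var_dump(isUrlSafe('a/b'));         // bool(false), "/" is reserved
var_dump(isUrlSafe('a&b'));         // bool(false), "&" is reserved
var_dump(isUrlSafe('a b'));         // bool(false), space is unsafe
```

A check like this could serve as the basis for the test suite mentioned in the proposal, assuming the stricter definition is the one adopted.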

@dbu dbu commented on the diff
proposed/slugifier-interface.md
((61 lines not shown))
+
+/**
+ * Describes a slugifier instance
+ *
+ * See _______
+ * for the full interface specification.
+ */
+interface SlugifierInterface
+{
+ /**
+ * Return a URL safe version of a string.
+ *
+ * @param string $string
+ * @return string
+ */
+ public function slugify($string);
@dbu
dbu added a note

so from your survey, i think it would make sense to add an optional $separator parameter. as it's on the interface, implementations can specify their own default, or would $separator = '-' force this as the default for all implementations?

I guess that was the implication of the survey, but I guess this is where it also gets complicated - after all, a separator is really an option and should probably be handled by a prior method call to the class. In addition to the separator we can also imagine options such as "lowercasing" the slug, etc.

But then all that is implementation specific, so I would still vote for just keeping one argument.

@dbu
dbu added a note

to promote DI it actually makes sense to say it's implementation specific how to configure your slugifier. static methods are not an option anyway. in a symfony case, you would configure the separator and lowercase or whatever options the concrete implementation offers in the service configuration. and if different places need different defaults, you just inject differently configured instances of the slugifier to them.

@staabm
staabm added a note

I would second that a separator could be set during construction of the slugifier implementation and doesn't need to be defined in the interface, because it is just an implementation detail.
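
The constructor-based configuration discussed above might look like the following sketch. `ConfigurableSlugifier` and its two options are hypothetical, chosen only to mirror the "separator" and "lowercase" examples from the thread; the sketch also assumes the separator is a single plain character, since it is passed straight to `preg_replace` and `trim`.

```php
<?php

// Local copy of the proposed interface so the sketch runs on its own.
interface SlugifierInterface
{
    public function slugify($string);
}

// Hypothetical implementation: options are fixed at construction time,
// so the interface method keeps its single argument.
final class ConfigurableSlugifier implements SlugifierInterface
{
    private $separator;
    private $lowercase;

    public function __construct($separator = '-', $lowercase = true)
    {
        $this->separator = $separator; // assumed to be one plain character
        $this->lowercase = $lowercase;
    }

    public function slugify($string)
    {
        $string = (string) $string;
        if ($this->lowercase) {
            $string = strtolower($string);
        }
        $slug = preg_replace('/[^A-Za-z0-9]+/', $this->separator, $string);

        return trim($slug, $this->separator);
    }
}

// Two differently configured instances, as a DI container would inject them.
$forUrls = new ConfigurableSlugifier();
$forFiles = new ConfigurableSlugifier('_', false);

echo $forUrls->slugify('Hello World'), "\n";  // hello-world
echo $forFiles->slugify('Hello World'), "\n"; // Hello_World
```

Code that receives either instance only sees `SlugifierInterface`, so the differing defaults never leak into the method signature.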

@AmyStephen

Hey @dantleech - would you open a discussion on the group, too, and link back to here? https://groups.google.com/forum/?fromgroups#!forum/php-fig (Glad to see this proposed as an interface, very important, thanks!).

@l3pp4rd

Hi,

Why should it be a standard? The RFC clearly defines a standard regarding the characters allowed in a URL and URLs in general. In what ways will it help users? Maybe, instead, write a document "How my URLs should look and why". Other than that, people think of slugify as an implementation detail and a final result; this would complicate the logic by adding an extra method without a satisfying result.

@pmjones
Owner

The real question here is, "How many member projects are already doing this, which ones are they, and is the existing practice common enough to warrant a proposal?" If there are few or none, I would argue there's no compelling need for a PSR regarding it.

@dantleech dantleech referenced this pull request in Atlantic18/DoctrineExtensions
Open

PSR for Slugifier/Urlizer Interface #649

@dantleech

@l3pp4rd It's less about standardizing the I/O (or rather just O) and more about not reinventing the wheel and not forcing a single URLizer upon the user, so not presuming that the implementation we have chosen is going to be suitable for the end user's needs. After all, internationalization, for example, is a tricky beast, and one implementation I have seen dropped words like "the" and "this" to shorten the URLs.

Oh and also, I think slugify is misleading as it refers to the "-" character I believe, which is an implementation detail. UrlizerInterface is more descriptive, but doesn't sound as good :)

@l3pp4rd

tell me what is the definition of reinvention of a wheel in the software world? isn't it called innovation? is there a wheel defined in software? Nevertheless, even wheels must be different for different needs like speed, weight.. what this pull request is about is: creating an interface to "wheels". How often will you need more than one slugifier library in your project? Why should slugifier libraries break BC for such an idea? next you will think, oh wait, slugifier is a bad interface name, let's break BC again by renaming it to...

@dantleech
@dbu
dbu commented

to stay with the analogy of @l3pp4rd i am really glad that there is a sort of standard for wheels that tells how they are attached to my car, so i can buy different wheels for winter and summer conditions, and so on. that is exactly the purpose of this interface. it should not define exactly how slugify has to be implemented, that is exactly the choice you should be given. what it does define is that there is this interface and you know when you put in an arbitrary string you get out a slug. so a general purpose library can generate a slug and know it's a slug without every library having to reimplement their own slug. or force some choice on one or the other slug library that the user might not like, and when he is using several libraries ending up with 3 slug implementations that make different looking slugs.
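
The "standard wheel mount" argument can be illustrated with a consumer that depends only on the interface. Everything besides `SlugifierInterface` in this sketch is hypothetical (`PermalinkBuilder`, `DashSlugifier`, `HashSlugifier`, the `/articles/` prefix); the point is only that the library class works unchanged with whichever implementation the user injects.

```php
<?php

// Local copy of the proposed interface so the sketch runs on its own.
interface SlugifierInterface
{
    public function slugify($string);
}

// Hypothetical general-purpose library class: it only knows that it
// hands over an arbitrary string and gets a slug back.
final class PermalinkBuilder
{
    private $slugifier;

    public function __construct(SlugifierInterface $slugifier)
    {
        $this->slugifier = $slugifier;
    }

    public function build($title)
    {
        return '/articles/' . $this->slugifier->slugify($title);
    }
}

// Two interchangeable, hypothetical implementations.
final class DashSlugifier implements SlugifierInterface
{
    public function slugify($string)
    {
        $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower((string) $string));

        return trim($slug, '-');
    }
}

final class HashSlugifier implements SlugifierInterface
{
    public function slugify($string)
    {
        return md5((string) $string); // 32 hexadecimal characters
    }
}

$readable = new PermalinkBuilder(new DashSlugifier());
$opaque = new PermalinkBuilder(new HashSlugifier());

echo $readable->build('Winter Wheels'), "\n"; // /articles/winter-wheels
echo $opaque->build('Winter Wheels'), "\n";   // /articles/ followed by an MD5 hex digest
```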

@AmyStephen

@dantleech Can you put this on packagist? I'd love to use it. Maybe we need a more casual way to share interfaces with one another that FIG isn't ready to consider? (Or, maybe that's a good way to prove an interface that FIG might want to consider.)

@lsmith77
@dantleech

@AmyStephen This is what we currently have in symfony-cmf/CoreBundle https://github.com/symfony-cmf/CoreBundle/tree/master/Slugifier which is available in packagist: https://packagist.org/packages/symfony-cmf/core-bundle

There is also a WIP bundle using the interface and callback slugifier in CoreBundle: https://github.com/symfony-cmf/RoutingAutoBundle

@flack

Don't know if this is relevant, but I use this package:

https://github.com/bergie/midgardmvc_helper_urlize

It's available on packagist, and can do transliterations of non-Latin alphabets. Maybe some of the logic could be used for PSR implementations

@dantleech

So that makes 9 slugifiers on the wiki page. Incidentally I am proposing a standard CMF interface library - https://github.com/symfony-cmf/symfony-cmf/wiki/Future-ideas#standard-library-dtl which would include stuff like this.

@AmyStephen

@dantleech - I opened a discussion on the FIG list about the possibility of creating a high-level namespace (ex. "Api") for use by developers who want to work on common Interfaces. Although a few supported the notion, there were others who did not. For that reason, I published my Interfaces in a CommonApi Repo to see if others want to work on common Interfaces/Apis. If you are interested in being involved, I created a discussion area on Google Plus where we can discuss. Personally, I believe there is a lot of value in the broader PHP community helping to organize and standardize this work before FIG considers it.

@dantleech

Yeah, that's pretty much what I am proposing for the CMF in that it would potentially act as a staging area for FIG proposals, although it could also include interfaces which apply only to the CMF libraries. Hmm.

/cc @dbu @lsmith77 @WouterJ

@dbu

things like the slugifier would really fit better in the general repositories proposed by @AmyStephen

for the cmf specific things, i still think that CmfCoreBundle is good enough, until such time as something should become so general that it is not symfony cmf specific anymore, and then a cmf interface repository would not make sense if it's a more general interface.

@DASPRiD

Is there any progress on this?

@dantleech

@DASPRiD not really. If there is interest I could try and move it forward.

@DASPRiD

@dantleech As a slugifier author, I am certainly interested in getting this in.

@dantleech

@DASPRiD if you have an existing library maybe you could add it to this wiki page

@DASPRiD
Commits on Mar 21, 2013
  1. @dantleech

    Added slugifier proposal

    dantleech authored
Commits on Mar 22, 2013
  1. @dantleech

    Updated

    dantleech authored
Showing with 78 additions and 0 deletions.
  1. +78 −0 proposed/slugifier-interface.md
78 proposed/slugifier-interface.md
@@ -0,0 +1,78 @@
+Slugifier Interface
+===================
+
+This document describes a common interface for slugifier libraries.
+
+The goal is to provide a simple common interface for classes which generate
+URL-safe strings, or *slugs*. Such classes are commonly known as *slugifiers*.
+
+There are many slugifier libraries in existence, each of which handles the
+process in a slightly different way.
+
+Frameworks and CMSs that have custom needs MAY extend the interface for their own
+purpose, but SHOULD remain compatible with this document. This ensures that
+third-party libraries can use the same implementation.
+
+The term "URL-safe" implies a string which does not contain any characters forbidden in
+[RFC 1738][]. The word "slugify" refers to the process of producing a URL-safe string
+from a given source string.
+
+The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
+"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
+interpreted as described in [RFC 2119][].
+
+[RFC 2119]: http://tools.ietf.org/html/rfc2119
+[RFC 1738]: http://tools.ietf.org/html/rfc1738
+
+1. Specification
+-----------------
+
+### 1.1 Basics
+
+- The `SlugifierInterface` exposes one method, `slugify`, which accepts a
+ *source string* and returns a slugified string - the *return string*.
+
+### 1.2 Source string
+
+ - There MUST NOT be any restrictions on which characters are accepted in the source string.
+ - Characters in the source string should not assume any special meaning.
+
+### 1.3 Return string
+
+The return string:
+
+ - MUST be URL-safe.
+ - MUST be a string.
+ - MAY be an empty string.
+
+2. Package
+----------
+
+The interface described and a test suite to verify your implementation are provided
+in the [psr/slugifier](https://packagist.org/__________) package.
+
+3. `Psr\Slugifier\SlugifierInterface`
+-------------------------------
+
+```php
+<?php
+
+namespace Psr\Slugifier;
+
+/**
+ * Describes a slugifier instance
+ *
+ * See _______
+ * for the full interface specification.
+ */
+interface SlugifierInterface
+{
+ /**
+ * Return a URL safe version of a string.
+ *
+ * @param string $string
+ * @return string
+ */
+ public function slugify($string);
+}
+```