Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question/Bug: how do I preserve dashes and not replace them with separator? #107

Closed
adamcunnington-mlg opened this issue Nov 1, 2021 · 10 comments

Comments

@adamcunnington-mlg
Copy link

adamcunnington-mlg commented Nov 1, 2021

I am trying to slugify a string, replacing non-matching characters with an underscore, but I don't want to consider a dash a non-matching character. This is easily achieved with regex but it looks like there is some special slugify behaviour when it comes to dashes which is preventing this working. I can't see anything in the docs or code, about how to work around this.

What did I do?

slugify.slugify("test-foo bar", separator='_', regex_pattern=r'[^-a-zA-Z0-9_]+')

What happened?
test_foo_bar

What did I expect?
test-foo_bar

@adamcunnington-mlg
Copy link
Author

Pretty sure it is a big, caused by https://github.com/un33k/python-slugify/blob/master/slugify/slugify.py#L179

Looks like there is a test case missing.

@adamcunnington-mlg
Copy link
Author

@un33k

@un33k
Copy link
Owner

un33k commented Nov 2, 2021

Please note that '-' is a reserved character in slugs. This package allows an atomic replacement of the default separator of '-' with something else towards the end of the process.

A simple preprocess of the text is recommended for special cases.

All PRs are considered provided they remain backward compatible.

@adamcunnington-mlg
Copy link
Author

adamcunnington-mlg commented Nov 2, 2021

Ok, I think this is definitely PR-worthy. I agree that '-' is a reserved character - but this is precisely the reason why you shouldn't remove them. I can pre-process the string; sure; but this seems a bit broken.

Will find time to raise PR but I'm not sure how this can remain backwards compatible as this would be changing (fixing) behaviour where input string contains a - character. Do you imagine this being behind a feature toggle that defaults to current behaviour perhaps?

@adamcunnington-mlg
Copy link
Author

Also, there's a logical problem with the suggested workaround. Pre-processing the string is highly problematic - what character am I going to replace - with, before I pass it to a call to slugify that is going to replace characters that don't match my regex? I have a very limited set of options and whatever I choose could legitimately appear elsewhere in the string and be erroneously substituted by the post-processing.

The only thing I could do is replace - with some string that i think is going to be reasonably unique. This is brittle and bad. This behaviour should belong to slugify.

@un33k
Copy link
Owner

un33k commented Nov 3, 2021

I think we have examples on how to handle this situation.

def test_multi_character_separator(self):

@un33k
Copy link
Owner

un33k commented Dec 4, 2021

Closing this now, feel free to reopen if the above example is not addressing your needs and/or you have a non-breaking PR.

@un33k un33k closed this as completed Dec 4, 2021
@PradeepTammali
Copy link

I am facing the same issue here. I have a text like this 121232-some text here-some text here again
Trying to slugify the text with separator ' '. Expecting the output 121232-some text here-some text here again but got 121232 some text here some text here again

Is there any work around for this.

@adamcunnington-mlg
Copy link
Author

@PradeepTammali workaround is something like this:

s = str.replace('-', 'thisSucks')
slugify(s, separator='_', regex_pattern='[^a-zA-Z0-9_]+').replace('thisSucks', '-')

@un33k
Copy link
Owner

un33k commented Dec 16, 2021

@PradeepTammali Slugify needs one reserved character, and since the name of the package is slugify its job is to convert some random text to a dash-delimited clean text for use in URLs. With that said, the delimiter can also be changed to whatever user wants. In short, the ask here is not doable. I encourage channelling one's energy to raising a meaningful PR instead.

Repository owner locked as resolved and limited conversation to collaborators Dec 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants