HTML entity decoding cannot be disabled #79
Comments
Every library has its own evolution path. If you have found a shortcoming or a bug, or are simply looking for a feature that will benefit many, then you can raise a PR. Please note that this lib relies on other uni-decode libs to do its job; hence, for simplicity, we don't accept PRs that only cover very narrow scenarios. This lib converts everything to ASCII!
So, I take it there is no such option. What does |
@MartinFalatic you may have uncovered a bug; if so, feel free to raise a PR with a unit test.
Here is an example of an entity test from the test file:

    def test_html_entities(self):
        txt = 'foo &amp; bar'  # foo & bar
        r = slugify(txt)
        self.assertEqual(r, 'foo-bar')
That's an example of the To fix it, I need to know what your intention was for this option. The intended default appears to be |
It was meant to clean things like
I haven't fully characterized that operation. It certainly attempts a conversion, but I'll be curious to see what it does for more interesting encodings (such as letters or numbers). That would be a test enhancement. What we know right now is that it can't be turned off, so I'd expect that any change should be transparent to anyone not using Next step is to find out how the two possible helper libraries you can choose from ( |
Edit: looks like those helper libs may not be involved. But neither is https://github.com/un33k/python-slugify/blob/master/slugify/slugify.py#L117 ... will have to dig in to see where the conversion is taking place. |
Actually, that part does work... there's still something unexpected going on with this (pretty sure it's The ordering of these is also interesting. If nothing else comes out of this, more documentation and tests (as examples and testing all the options) will be useful. |
So, entity decoding is fully disabled only if you specify all three options ( As an FYI to anyone reading this who is migrating from awesome-slugify, the following construction provides parity with the default behavior of that library for ascii data: awesome-slugify: python-slugify: |
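For readers following along, here is a minimal standard-library sketch of why all three switches matter. It assumes the three options are the `entities`, `decimal`, and `hexadecimal` keyword arguments of `slugify`. The function `decode_references` and the three patterns are hypothetical stand-ins written for illustration, not python-slugify's own code.

```python
import re
from html.entities import name2codepoint

# Hypothetical patterns, one per kind of HTML character reference.
NAMED_PATTERN = re.compile(r'&(%s);' % '|'.join(name2codepoint))  # e.g. &amp;
DECIMAL_PATTERN = re.compile(r'&#(\d+);')                          # e.g. &#38;
HEX_PATTERN = re.compile(r'&#x([0-9a-fA-F]+);')                    # e.g. &#x26;


def decode_references(text, entities=True, decimal=True, hexadecimal=True):
    """Decode each kind of reference only when its flag is enabled.

    Illustrative only: the flag names mirror the options discussed in
    this thread, but the body is an assumption about how such decoding
    could be structured.
    """
    if entities:
        text = NAMED_PATTERN.sub(lambda m: chr(name2codepoint[m.group(1)]), text)
    if decimal:
        text = DECIMAL_PATTERN.sub(lambda m: chr(int(m.group(1))), text)
    if hexadecimal:
        text = HEX_PATTERN.sub(lambda m: chr(int(m.group(1), 16)), text)
    return text


sample = 'foo &amp; bar &#38; baz &#x26; qux'

all_on = decode_references(sample)
only_named_off = decode_references(sample, entities=False)
all_off = decode_references(sample, entities=False, decimal=False,
                            hexadecimal=False)
```

In this sketch, turning off `entities` alone leaves the numeric forms of the same character still being decoded, which is consistent with the observation above that decoding stops completely only when all three flags are off.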
@MartinFalatic I am glad you found the right combo for your situation & thank you for leaving a note for anyone else that is attempting to use it this way! |
I'm trying a very simple test as we look at moving from the outdated `awesome-slugify` to `python-slugify`. For backward compatibility reasons we are trying to match its present behavior.

For the data involved this can be accomplished after jumping through a few small hoops, but not all of them can be accomplished within `python-slugify` using the options available.

One hoop is trivial (you can't seem to force the conversion of `,` to `-` short of pre-replacing the comma with something else).

However, the other hoop is more odd: I can't seem to disable HTML entity decoding.

If `entities` isn't what mediates this conversion, what does, and is it configurable? Currently (like commas) replacing `#&` with something like `_` seems to be the only way to disable that behavior when necessary.