HIX value differs when calculated via tinymce vs pre save signal #2224
Comments
Could you name a page where this is happening? I'm having trouble reproducing this |
I noticed this with e.g. "Ausländerbehörde" in the test data... I noticed that the content in the database (which is sent in the pre-save signal) contains unicode (
whereas the content from TinyMCE (that is sent from the form widget) uses the HTML-notation (
So especially when the calculation difference causes MTs to be allowed via one way and forbidden via the other, this could be a very big problem. |
Looks to me as if the proper solution is TextLab fixing their whatever it is, and the next best thing we could do is to insert a normalizing step before sending anything off. |
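A minimal sketch of what such a normalizing step could look like, assuming the differences are limited to HTML character references versus Unicode characters and to `\r\n` versus `\n` line breaks. The helper name `normalize_for_hix` and the sample strings are hypothetical and not taken from the project; markup-significant references such as `&lt;` and `&amp;` are deliberately left untouched so the HTML structure does not change.

```python
import html
import re
import unicodedata

# Matches named, decimal and hexadecimal HTML character references
_CHARREF = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")
# Characters that must stay escaped, otherwise the markup itself would change
_KEEP_ESCAPED = {"<", ">", "&", '"', "'"}


def _decode_reference(match: re.Match) -> str:
    """Decode one character reference unless it is markup-significant."""
    decoded = html.unescape(match.group(0))
    return match.group(0) if decoded in _KEEP_ESCAPED else decoded


def normalize_for_hix(content: str) -> str:
    """Hypothetical helper: map equivalent HTML strings to one canonical form."""
    # Turn references like "&auml;" into the Unicode character "ä"
    text = _CHARREF.sub(_decode_reference, content)
    # Use one Unicode normalization form so composed/decomposed umlauts match
    text = unicodedata.normalize("NFC", text)
    # Unify "\r\n" and lone "\r" line breaks to "\n"
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Assumed sample inputs, only for illustration
from_database = "<p>Ausländerbehörde</p>\r\n<p>Termine</p>"
from_tinymce = "<p>Ausl&auml;nderbeh&ouml;rde</p>\n<p>Termine</p>"
assert normalize_for_hix(from_database) == normalize_for_hix(from_tinymce)
```

Applying the same function both in the pre-save signal and in the form widget path would mean both send the same text, whatever the scoring service does with it.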
I received an email on May 11th regarding this, I think: I assume that the value goes down because of this issue here, but the copying leading to an outdated value is another problem, right? Should I open an issue for that? And how far along are we with this issue? |
Yes, you can open an issue for this, but I think the prio is quite low if the issue with the differing values is fixed.
Nobody is assigned to this issue, which means that nobody started working on it. |
Yes, prio low works for me once this issue is fixed. |
Ok - is there anybody who could take the issue up? Because I think it is quite frustrating for our Kommunen. |
@PeterNerlich We don't have insight into Textlab's algorithm, but what we do know is that little differences in the input text can cause differences in the HIX value, independently of whether the different text is really harder/easier to understand. |
Apparently, this is not fully resolved yet: |
@osmers Hi! Could you please clarify if the problem is reproduced now on all pages, or only on some pages? I couldn't reproduce the issue :( |
@seluianova it does not seem to be a problem with every page - at least the fact that the HIX value changes. That it is outdated after pressing ctrl A is the issue with every page (I think it should not show outdated when I haven't really done anything). |
@ulliholtgrave @svenseeberg Can one of you get us the content from the database so that we can try it locally? |
I've found the steps to reproduce the issue.
The behavior will be the same if step 3 is "Press Ctrl+A and reload the HIX value". The problem is that the first time a page is saved, its content is saved like this:
If I just publish it a second time with no changes, the content becomes:
It's not reproduced with 1 paragraph, because div-tag is not added then. No div - no problem.
So we don't just send different content to TextLab, but also change it in our database. |
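The observation above (saving unchanged content a second time alters it) can be phrased as a stability check: cleaning already-cleaned content should change nothing. The following is only a sketch under that assumption. `is_stable`, `always_wrap` and `wrap_once` are hypothetical names, and the two wrappers are toy stand-ins based on string checks, not the project's actual form cleaning code.

```python
from typing import Callable


def is_stable(clean: Callable[[str], str], content: str) -> bool:
    """Return True if cleaning already-cleaned content changes nothing."""
    once = clean(content)
    return clean(once) == once


def always_wrap(content: str) -> str:
    # Toy stand-in for unstable behaviour: adds a new <div> on every call
    return f"<div>{content}</div>"


def wrap_once(content: str) -> str:
    # Toy stand-in for stable behaviour; the string check is deliberately naive
    if content.startswith("<div>") and content.endswith("</div>"):
        return content
    return f"<div>{content}</div>"


two_paragraphs = "<p>First</p><p>Second</p>"
assert not is_stable(always_wrap, two_paragraphs)
assert is_stable(wrap_once, two_paragraphs)
```

A check of this shape, run against the real cleaning step with multi-paragraph content, would catch a regression where the stored content changes on a second save without edits.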
David said:
This seems plausible, in that case the place inside the library should be here: Then the next step seems trivial to me. We should make sure we always create a surrounding |
Looking at the source code of that function, we should be able to switch to something like this:

     try:
    -    content = fromstring(self.cleaned_data["content"])
    +    doc = document_fromstring(self.cleaned_data["content"])
    +    content = doc.body
    +    children = content.getchildren()
    +    # Ensure that this is stable and we don't add another `div` on every form save
    +    if len(children) == 1 and children[0].tag == 'div':
    +        content = children[0]
    +    else:
    +        content.tag = 'div'
     except LxmlError:
         # The content is not guaranteed to be valid html, for example it may be empty
         return self.cleaned_data["content"]

This basically emulates what |
Describe the Bug
The HIX value differs depending on how it's calculated.
Steps to Reproduce
Expected Behavior
The value should be identical
Actual Behavior
The value differs
Additional Information
I assume this is due to the different way line breaks are handled in tinymce vs the database.
TinyMCE just uses `\n` as the line break, whereas the content in the database uses `\r\n`.