fix: don't update url after character-limit #458

Merged (2 commits) on Jul 8, 2023

SethFalco commented Jul 8, 2023

Request URI Too Large

The issue was that the URL bar was being updated with the entire paste, even though it was truncated in the textarea.

Rather than fixing that behavior directly, I thought we could remove our truncation logic from #handleInput and rely on the browser instead.

The textarea element has a maxLength attribute, which we can set to the character limit. The user-agent then handles truncation for us in a manner the user should already be used to.

Note: This mostly resolves the problem. I don't see the error happening with normal usage, but a user who wanted to break things still could. Special characters are percent-encoded, so each one takes up multiple characters in the URL; a user could manually submit a query of 1,400+ line breaks, for example, and reproduce the issue even after this fix.
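As a rough illustration of why line breaks are the worst case, here is a minimal Python sketch. The frontend itself is JavaScript; `urllib.parse.quote` only stands in for the browser's percent-encoding, and `encoded_length` is a hypothetical helper, not something in this PR.

```python
from urllib.parse import quote


def encoded_length(text: str) -> int:
    """Length of `text` once percent-encoded for use in a URL (hypothetical helper)."""
    return len(quote(text, safe=""))


plain = "a" * 1400
breaks = "\n" * 1400

print(encoded_length(plain))   # letters are left as-is
print(encoded_length(breaks))  # each "\n" becomes the three characters "%0A"
```

So a query made only of line breaks takes three times as much room in the URL as the same number of plain letters, while still counting as 1,400 characters against maxLength.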

Despite that, I think that this is a good enough fix to close out the issue.

Line Endings

While testing this behavior, I encountered a bug which is resolved by this PR as well, though the solution is a little hacky.

The spec for Form Data enforces CRLF (\r\n) line endings. When Form Data is used to communicate with the API, the character count of what the user submitted no longer matches what the server receives.

For example, I used the web UI to submit a 2,000-character query. However, the server actually received a 2,029-character query, and responded with a 400 because the query was "too long".

In reality, all \n characters were converted to \r\n, silently adding bytes to the request.

I've added an intermediate step if Form Data is used for communication, which normalizes the line endings to UNIX style.
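A minimal sketch of that normalization step, assuming it amounts to replacing CRLF with LF before the length check. The function name and the sample text are made up for illustration and are not the code in this PR.

```python
def normalize_line_endings(q: str) -> str:
    """Convert CRLF line endings to Unix-style LF (hypothetical helper)."""
    return q.replace("\r\n", "\n")


# What the user typed: 2,000 characters, 71 of which are line breaks.
typed = ("x" * 27 + "\n") * 71 + "x" * 12

# What Form Data puts on the wire: every "\n" becomes "\r\n".
as_sent = typed.replace("\n", "\r\n")

print(len(typed))                            # length seen by the textarea
print(len(as_sent))                          # length seen by the server before the fix
print(len(normalize_line_endings(as_sent)))  # length after normalizing
```

After normalization the server sees the same length the textarea enforced, so a query that fits in the frontend is no longer rejected as too long.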

Alternative Approaches

  • We could consider only using the normalized q to compare the length, but not actually modify q.
  • I considered adding a buffer to char_limit, i.e. char_limit + q.count("\n"), but then users could spam and process larger requests, which is undesirable, even if it's just noise.
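The two alternatives could be sketched roughly as follows. Both function names are hypothetical, and `char_limit` is passed in rather than read from any real configuration.

```python
def within_limit(q: str, char_limit: int) -> bool:
    """Alternative 1: compare the LF-normalized length, but leave q untouched."""
    return len(q.replace("\r\n", "\n")) <= char_limit


def within_buffered_limit(q: str, char_limit: int) -> bool:
    """Alternative 2: pad the limit by the number of line breaks in q."""
    return len(q) <= char_limit + q.count("\n")


noise = "\n" * 4000  # a request made of nothing but line breaks

print(within_limit(noise, 2000))           # rejected: 4,000 characters is over the limit
print(within_buffered_limit(noise, 2000))  # accepted: the buffer grows with the noise
```

The second check shows the drawback mentioned above: because the buffer scales with the number of line breaks, a request consisting only of line breaks always passes, however large it is.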

Screenshots

Screencast.from.2023-07-08.03-56-30.webm

The original problem. The text is truncated to 2,000 characters (of which 71 are line breaks) in the frontend, but the backend receives 2,071 characters because of the CRLF line endings.

Screencast.from.2023-07-08.03-57-07.webm

After applying the fix.

Related

pierotofy commented Jul 8, 2023

Seems like a reasonable workaround; this happens due to FormData which encodes new lines as CRLF in the browser (as you found out). Another way could have been to use the JSON API.

pierotofy commented

Thanks @SethFalco!

@pierotofy pierotofy merged commit dcc821b into LibreTranslate:main Jul 8, 2023
4 checks passed
@SethFalco SethFalco deleted the fix-437 branch May 14, 2024 19:15

Successfully merging this pull request may close these issues.

Error when translating too many words