Add Text Encoding Brute Force operation #439

Merged on Dec 18, 2018 (3 commits)

@Cynser (Contributor) commented Dec 12, 2018

Closes #423.

Made a simple operation that just runs through all the charsets and encodes the input with each. Added a test for it as well.

@n1474335 (Member) commented

This looks great, thanks.

Mojibake is often a result of text being encoded into the wrong encoding, so the solution is to decode rather than encode. It would be useful if this operation listed all possible decodings as well as encodings, in case one of them shows something useful.
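
As a rough illustration of the point about mojibake, here is a minimal sketch of one common way it arises and is reversed. It assumes Node.js and its built-in `Buffer` rather than the charset tables this operation uses, and the variable names are made up for the example.

```javascript
// Sketch only: relies on Node.js Buffer, not on CyberChef's own charset code.
// "café" is written with an escape so the file's own encoding does not matter.
const original = "caf\u00e9";

// Store the text as UTF-8 bytes, then wrongly read those bytes as Latin-1.
const garbled = Buffer.from(original, "utf8").toString("latin1");

// Repair: turn the garbled characters back into bytes using the wrong charset,
// then decode those bytes with the charset that was actually used.
const repaired = Buffer.from(garbled, "latin1").toString("utf8");

console.log(garbled);
console.log(repaired);
```

When the original charset is unknown, trying every candidate for the final decode step is what makes a brute-force listing useful.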

@Cynser (Contributor, Author) commented Dec 17, 2018

Thanks - I've added a decode option.

Needed to add the try-catch, as some inputs were causing an 'Unrecognized code' error when it wasn't possible to decode with that encoding, causing the whole operation to fail. Not sure if there's a better way of doing that.
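
The per-encoding try-catch described above can be sketched roughly as follows. This is not the code from this PR: `bruteForceDecode` is a hypothetical helper, the failure message is a placeholder, and the built-in `TextDecoder` stands in for the charset library the operation actually uses.

```javascript
// Hypothetical helper: try every charset and record a failure for one
// charset without aborting the whole run.
function bruteForceDecode(bytes, charsets) {
    const results = {};
    for (const charset of charsets) {
        try {
            // fatal: true makes TextDecoder throw on bytes it cannot decode.
            results[charset] = new TextDecoder(charset, { fatal: true }).decode(bytes);
        } catch (err) {
            // Placeholder message; the real operation may report this differently.
            results[charset] = "Could not decode.";
        }
    }
    return results;
}

// 0xFF is never valid in UTF-8, so that entry fails while UTF-16LE still succeeds.
const decoded = bruteForceDecode(
    new Uint8Array([0x68, 0x00, 0xff, 0x00]),
    ["utf-8", "utf-16le"]
);
console.log(decoded);
```

Catching inside the loop keeps the failure local to a single charset, so the remaining entries are still filled in.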

@n1474335 merged commit dacb3ef into gchq:master on Dec 18, 2018

@n1474335 (Member) commented

Fantastic, thanks. I've updated this a little to use Array.forEach instead of the for loop as I think it looks a little neater. I've also modified the output a little. The operation now returns a JSON list of possible encodings. When it is displayed in the web app, there is a presentation layer on top of this which displays it as an HTML table. This makes things line up a little nicer.
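
A minimal sketch of that split between data and presentation, assuming Node.js. The names `encodeAll`, `present` and `escapeHtml` are hypothetical and are not taken from the CyberChef source, and the bytes are shown as hex purely to keep the example readable.

```javascript
// Data layer: build a plain object mapping each charset to the encoded input.
function encodeAll(input, charsets) {
    const output = {};
    charsets.forEach(charset => {
        // Node.js Buffer does the encoding here; hex keeps the bytes printable.
        output[charset] = Buffer.from(input, charset).toString("hex");
    });
    return output;
}

// Escape the characters that would otherwise be read as HTML markup.
function escapeHtml(text) {
    return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Presentation layer: render the same object as an HTML table.
function present(encodings) {
    const rows = Object.keys(encodings).map(name =>
        `<tr><td>${escapeHtml(name)}</td><td>${escapeHtml(encodings[name])}</td></tr>`
    );
    return `<table><tr><th>Encoding</th><th>Value</th></tr>${rows.join("")}</table>`;
}

const result = encodeAll("hi", ["utf8", "utf16le"]);
console.log(JSON.stringify(result));
console.log(present(result));
```

Keeping the return value as plain data means other consumers can use it directly, while only the web app needs the table rendering.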
