Skip to content
This repository has been archived by the owner on Jan 29, 2024. It is now read-only.

Sanitization issue with arabic characters #1101

Open
badr-ghatasheh opened this issue Jun 24, 2015 · 48 comments
Open

Sanitization issue with arabic characters #1101

badr-ghatasheh opened this issue Jun 24, 2015 · 48 comments

Comments

@badr-ghatasheh
Copy link

Hi,

There is an issue with applying 'sanitize' strategy while using Arabic characters

for example sanitizing the word 'بدر' will print out the following on the page:

بدر

Can this be fixed?
Thank you

@n00dl3
Copy link

n00dl3 commented Jun 25, 2015

same problem with any utf8 character, "éèë..."

@tspaeth
Copy link
Member

tspaeth commented Jun 26, 2015

This is somehow a thing we should discuss. But I'm just myself wondering, why "simple UTF*" conversion fails in the $sanitize-module this way. Needs a bit of investigation I guess as I'm too far away from using that. jqLite - functionality itself is okay.
But as it impacts also the security a bit on sanitizing elements, my first suggestion would be to introduce a "trusthtml" sanitize strategy as an additional option for those really wanting to open that door :-).

Somehow a parallel issue exists also on angular itself to watch: angular/angular.js#4790

@knalli what do you think?

And here is a little playground: http://plnkr.co/edit/w4Erw7yXHQp9d2ndr4HZ?p=preview

@knalli
Copy link
Member

knalli commented Jun 28, 2015

I'm a little bit confused.

@badr-ghatasheh: You are writing you are faced with this issue when

applying 'sanitize' strategy

but you are actually not meaning sanitize but escape, do you? Because htmlEscapeValue will not be used when sanitize is activated (it is not required).

Remarking your "fix": That's not a fix, that would make the whole thing ad absurdum: http://plnkr.co/edit/cDCBXlWVgYtny9ciJGV9?p=preview It only works using standard AJS 1.3+ with a $sce running in the background (i.e. ng-bind-html).

@tspaeth I've listed the different kinds of: http://plnkr.co/edit/cDCBXlWVgYtny9ciJGV9?p=preview I'm quite not sure what this does mean for us right now.

@badr-ghatasheh
Copy link
Author

Hi @knalli

I Updated the issue, I seem to have mixed the function names.

However, the issue still applies to 'sanitize' strategy and utf8 characters.

@samal-rasmussen
Copy link

@knalli
Copy link
Member

knalli commented Jul 7, 2015

I've posted an answer: Use sanitizeParameters, escape or escapeParameters.

Sorry for the delay in general, we are short of time. I have no final solution despite the obvious fact that sanitize (not sanitizeParameters) + utf codes breaks. As angular-translate is delegating this to AJS' $sce, I'm quite not sure what the possibilities are.

@bettysteger
Copy link

👍

@tspaeth
Copy link
Member

tspaeth commented Jul 16, 2015

Linking in #1131 as a possible duplicate...

@DaAwesomeP
Copy link

👍 Happens in for other characters too (such as accents): #1081

@DaAwesomeP
Copy link

Would it be possible to make a workaround with a custom filter?

@knalli
Copy link
Member

knalli commented Sep 13, 2015

Just to repeat: As this issue is caused by $sanitize internally, the only solution can be so far using not using sanitize or a alternative implemented sanitization. Any suggestions are welcome!

This means: You can still use escape and escapeParameters which aren't unsecure! Both sanitize and sanitizeParameters would be more comfortable in some situations.

@knalli
Copy link
Member

knalli commented Sep 15, 2015

I've reworked my previous example and have collected all different kinds of flows. Please notify me if something is still missing!

http://plnkr.co/edit/8BHONOjhsxBxxO9oVYOK?p=preview

So far, I'm seeing that usage of filter is always being somehow affected with issues. As I'm already mentioned, the strategies escape (formely escaped) and the newer ones escapeParameters won't harm the texts, but the sanitize and sanitizeParams will do (when using ng-bind or the filter, AFAIK technically the same).

As we cannot change the result of $sanitize(), I would recommend currently to use escape.

@000panther
Copy link
Contributor

I would recommend posting this issue somewhere in the security manual, where there is currently no strategy recommended - highlighting the current bug and also the solution (using escape for now). It took me three days to notice that this is a known bug! While this time I tried different versions, I thought my sanitation was not gripping correctly.

TL:DR: IMHO this bug + intermediate solution should be in the security docs!
http://angular-translate.github.io/docs/#/guide/19_security

I will make a pull request.

BTW: In my case Ö Ü Ä ü ö ä & are also double escaped.

@knalli
Copy link
Member

knalli commented Sep 5, 2016

Just released 2.12.0 containing the new strategies sce and sceParameters.

@knalli
Copy link
Member

knalli commented Oct 12, 2016

Current demo running with 2.12.1: http://plnkr.co/edit/rJ0R6GzHGbMVxhSa237C?p=preview

@danielbolivia
Copy link

Hi everyone.

Well I have the same problem here, but I think the problem could be when we use $translateProvider.useSanitizeValueStrategy('sanitize');...someone says when we do this its ocurre a double sanitize, could be because we Injected the script for sanitize and we do this $translateProvider.useSanitizeValueStrategy('sanitize'); is a double sanitize?
I think we dont need sanitize becasue We already injected script sanitize..

Maybe that's the problem? or I am wrong?
Please contribute.
Thanks.

@augnustin
Copy link

For the records, I got it working thanks to the reference above:

Change .useSanitizeValueStrategy parameter to 'sce'

      $translateProvider.useSanitizeValueStrategy('sce');

@LiamK
Copy link

LiamK commented Apr 17, 2017

I just tried using the the 'sce' parameter and it works as far as the translations were concerned. However, it also triggers a digest loop error, which is not there with either 'escape' or 'sanitize' params.

Any idea what might be the problem? I removed the watch() statements in the controller but that had no effect.

UPDATE: I tried sceParameters and that worked.

@keybeth
Copy link

keybeth commented Jul 17, 2017

I tested with all strategies but I cannot translate a placeholder. Also I put HTML entity, hexa code, and the character and nothing doesn't work . The literals are good, but in the input placeholder never put the correct character.

<input type="email" name="email" class="form-control" required autofocus ng-model="ctrl.login.email" translate-attr-placeholder="attribute.email" translate>

locale*es.json

"attribute.email":"Correo electrónico",

@knalli
Copy link
Member

knalli commented Jul 17, 2017

fyi: current demo running AJS@1.6.4 & angular-translate@2.15.2

@keybeth I'm applied the character in question here. It's okay?

@keybeth
Copy link

keybeth commented Jul 17, 2017

@knalli, I don't have problem when translate the character like literal

The next line works, with "attribute.email":"Correo electr&oacute;nico", in my locale.json
<label for="inputEmail" class="sr-only">attribute.email</label>

but, with the same configuration doesn't work here:

<input type="email" name="email" class="form-control" required autofocus ng-model="ctrl.login.email" translate-attr-placeholder="attribute.email" translate>

Also, I tried

<input type="email" name="email" class="form-control" required autofocus ng-model="ctrl.login.email" placeholder="{{'attribute.email' | translate}}">

With sce, null,escape,sanitize strategy with or without params, but nothing works in the placeholder param.

@knalli
Copy link
Member

knalli commented Jul 17, 2017

What is the difference compared with my demo link (the second one, targeting exactly your special character). If the only difference is the character is in the $app.js vs $locale.json, then the issue is a standard encoding one when delivering the files. Otherwise I don't get it.

@keybeth
Copy link

keybeth commented Jul 17, 2017

@knalli I did the test and when a took the translatiosn form there:

.translations('es', { 'attribute.email': 'Correo electrónico' })
was good, but when I took the translation from the static file:

$translateProvider.useStaticFilesLoader({ prefix : 'i18n/locale-', suffix : '.json' })

no was good. I copied and pasted the same string line like I see in my browser to do the first configuration without any other configuration. It seems a trouble from static files

@knalli
Copy link
Member

knalli commented Jul 17, 2017

As I already mentioned: That's a common issue serving (static) files. Get your server output encoding right, that's not an issue for an JS application (in which angular-translate is running).

rgm added a commit to opengb/seed that referenced this issue Nov 2, 2017
Adds ngSanitize so we can use HTML in translation strings, changes
angular-translate's sanitization strategy to `sanitize` instead of
`escape`.

My motivation here is to make the translation strings be non-ridiculous.
Without being able to inject un-escaped HTML into the translation, we
end up chopping up the translation keys to the point where they're not
comprehensible within lokalise, and doing weird gymnastics in the
angular template like `<span translate>key</span>: <strong
translate>value</strong>` because both are contained within the same
parent.

The cost here is that there's an [open bug on angular-translate][1] for
the `sanitize` strategy where unicode gets double-encoded. I tried `sce`
as a strategy, but it seemed open to XSS (ie. include
`<script>alert('xss')</script>` in a translation and you'll see the
alert. The `sanitize` strategy successfully killed this vector). The
cost here is that as much as possible we need to use the `<tag
translate>` form instead of the `{$ 'string' | translate $}` form.

For the most part, we're not translating arbitrary user input, but no
sense in leaving that door open for us to forget about down the road.

[1]: angular-translate/angular-translate#1101
rgm added a commit to opengb/seed that referenced this issue Nov 3, 2017
Adds ngSanitize so we can use HTML in translation strings, changes
angular-translate's sanitization strategy to `sanitize` instead of
`escape`.

My motivation here is to make the translation strings be non-ridiculous.
Without being able to inject un-escaped HTML into the translation, we
end up chopping up the translation keys to the point where they're not
comprehensible within lokalise, and doing weird gymnastics in the
angular template like `<span translate>key</span>: <strong
translate>value</strong>` because both are contained within the same
parent.

The cost here is that there's an [open bug on angular-translate][1] for
the `sanitize` strategy where unicode gets double-encoded. I tried `sce`
as a strategy, but it seemed open to XSS (ie. include
`<script>alert('xss')</script>` in a translation and you'll see the
alert. The `sanitize` strategy successfully killed this vector). The
cost here is that as much as possible we need to use the `<tag
translate>` form instead of the `{$ 'string' | translate $}` form.

For the most part, we're not translating arbitrary user input, but no
sense in leaving that door open for us to forget about down the road.

[1]: angular-translate/angular-translate#1101
@knalli
Copy link
Member

knalli commented Nov 4, 2017

All strategies are listet at https://angular-translate.github.io/docs/#/guide/19_security

@rmakara
Copy link

rmakara commented Feb 19, 2018

We had the same problem with our application just before go-live. :) We found that Github issue to look for the solution, but unfortunately changing sanitize strategies didn't work. We use Magento as backend application and Angular for frontend application. In our case, the problem wasn't in ngSanitize, nor Angular translations as we thought at the beginning. The problem was in the process of merging many JavaScript files into one, which is done by Magento to optimize the page.

Files before the merge were UTF-8 but after merge we had the only one file that wasn't properly encoded and we had "rubbish" in sanitized text strings. To solve the problem we turned that Magento's merging process off.

@barbecube
Copy link

barbecube commented Feb 23, 2018

Hi, I'm facing problem with encoding when using sanitize strategy and translate-attr- directive or translate filter. Directive translate works ok. Using version 2.17.0. Is there fix coming? Thanks

@astronati
Copy link

The bug is still reproducible

@fynnfeldpausch
Copy link

fynnfeldpausch commented May 29, 2018

Is there any progress on this issue?

Edit:
For anyone interested, I am currently using the following workaround. Feedback welcome.

.config(function($provide) {
    $provide.decorator('translateFilter', function($delegate) {
        return function() {
            var result = $delegate.apply(this, arguments);
            return angular.element('<div/>').html(result).text();
        };
    });
});

@knalli
Copy link
Member

knalli commented May 29, 2018

Is there any progress on this issue?

Short version: no.

Long version: Well, the problem is: I have shown a collection of different situations/strategies and even a maybe-fix, but I'm still unsure about the outcome. While this is an old issue and I'm getting new questions/issues about this topic from time to time, there is actually no really feedback effectively. I'm linking everytime at this issue here.

We are still interested in solving this, and any help is welcome. Maybe a refresh or restart in thinking is fine, I don't know. Because it's not a quick-fix and requires a lot of work in finding and adapting any change, it seems to be not an attractive issue :/

@auchEnia
Copy link

We would be hapy to switch to sceParameters strategy, but unfortunatelly it is not as secure as sanitizeParameters. The user can still provide for example <img src="x" onerror="alert('test')"> as an input text value and if that value is then used as a translation parameter, the onerror function is actually executed. The only strategy which prevents this behaviour is the sanitizeParamaters. There is also no way to provide some sort of black/white list in $sce to filter out this kind of stuff.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests