Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forbidden characters #619

Closed
fgalan opened this issue Nov 4, 2014 · 13 comments
Closed

Forbidden characters #619

fgalan opened this issue Nov 4, 2014 · 13 comments
Assignees
Labels

Comments

@fgalan
Copy link
Member

fgalan commented Nov 4, 2014

Let's consider the following set of forbidden characters:

< 
> 
“
‘
=
;
(
)

Two behaviors have to be implemented in Orion CB, one related to requests and other related to responses.

Regarding requests, the CB must avoid these characters being used in some "protected" fields in requests. In particular:

  • entity id
  • entity type
  • attribute name
  • attribute type
  • attribute value
  • metadata name
  • metadata type
  • metadata value
  • providingApplication
  • convValues (currently this is used only internally, but when we implement operations to browse subscription, it will also need protection).
  • reference (currently this is used only internally, but when we implement operations to browse subscription, it will also need protection)
  • scopeValue (currently this is used only internally, but when we implement operations to browse subscription, it will also need protection)
  • attributeExpression (not currently used, but probably also needs protection)
  • anything else?

(Taking into account the large number of fields, maybe it makes sense to adopt a "global protection" approach, protecting every field).

If the user attempts to use any forbidden character, the broker should return a "400 Bad Request" NGSI error, explaining the details of the problem (but without citing the offending characters?), e.g. "Forbidden characters in field X, please check Orion documentation". As any other bad input problem, this should be logged with the proper LM_W.

Special note regarding XML: the checking has to be done after "XML entities" translation, i.e. it is legal that a client sends for instance <contextValue>H&amp;M shop</contextValue> as it translates as "H&M". Thus, the checking routine should check "H&M" not "H&M" because, in the second case, it will raise a bad input and disable any possibily of XML users to use some characters forbidden in XML but that aren't part of the set defined above (e.g. "&"). I guess that the XML parser deals with translation, thus probably intercepting the fields as offered by the XML parsing library will work.

Regarding responses, CB should process the resulting payload string after rendering and just before passing it to the HTTP library to send the response in order to escape any forbidden characters. Note this is necessary because of two reasons: 1) some user with DB privileges could have introduced information at DB layer (thus not protected by the above behavior regarding requests) using forbidden characters, 2) in the case of some payload response pieces that are "bypassed" from the request in special cases (e.g. "Service not recognized", which literally uses the URL provided by the user in the request).

We will use the URL encoding HTML encoding, thus the translation for each one of the forbidden characters is the following:

<    &lt;
>    &gt;
“    &quot;
‘    &#39;
=    &#61
;    &#59;
(    &#40;
)    &#41;

(Reference: http://www.asciitable.com/)

@fgalan fgalan added the backlog label Nov 4, 2014
@fgalan fgalan added this to the Release 0.17.0 milestone Nov 4, 2014
@fgalan
Copy link
Member Author

fgalan commented Nov 4, 2014

Feedback de Antonio Plasencia:

Si la entrada de datos que forma el ataque aparece en la respuesta codificada como URL, los
ataques conseguidos en nuestro análisis no funcionarán (al ir los datos a parar a un XML o a un
HTML)

Sin embargo, quizás sí podrían funcionar ataques donde los datos vayan a parar a un “href”, a un
Javascript o incluso a CSS. Esto ya depende del cliente de la API.

Como medida más eficaz para todos los casos, es más recomendable usar la codificación en
entidades HTML. ¿Sería posible en vuestro caso?

En el siguiente enlace se discuten las estrategias a seguir según los casos:

https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet

También se recomienda usar ESAPI de Owasp como librería para realizar las validaciones o
escapados que hagan falta, tanto en la entrada como en la salida de datos:

https://www.owasp.org/index.php/ESAPI#tab=Downloads

Está disponible para multitud de lenguajes.

@fgalan
Copy link
Member Author

fgalan commented Nov 4, 2014

El problema que veo al utilizar HTML encoding para codificar los caracteres prohibidos en las respuestas es que introduciría a su vez un caracter prohibido, ya que la condificación sigue el formato "&xxx;", es decir usa el ";". ¿O quizás estoy interpretando mal lo que propones?

Por otra parte, revisando la referencia de Owasp que mandas, no parece que tengan librería para C++ (lenguage de implementación de Orion CB).

@aplasencia
Copy link

El ";" por sí sólo no supondría un riesgo si el resto de caracteres mencionados se codifican en entidades HTML. En este sentido, también incluiría los paréntesis para evitar riesgos cuando los datos acaben en scripts.

Por otra parte, si ESAPI no está disponible para C++ no sé si se podría llegar a utilizar (quizás a través de un intérprete de Python, pero en estos temas me pierdo ya completamente). Si no es posible habría que codificarlo "in house".

@fgalan
Copy link
Member Author

fgalan commented Nov 4, 2014

Aclarado que ";" por si solo no es un problema, entonces podemos utilizar codificación HTML. No creo que haya problema a codificaciones "in house" ( @kzangeli es un experto en la materia ;)

@fgalan
Copy link
Member Author

fgalan commented Nov 4, 2014

Añadidos paréntesis en el body del issue al conjunto de caracteres prohibidos.

@mrutid
Copy link
Member

mrutid commented Nov 18, 2014

Have you considered when the Injection is placed within the URL itself?

See example at:

https://jirapdi.tid.es/browse/DM-243

In general the less information in error responses the better.

@kzangeli
Copy link
Member

Good point.
It definitely gives an error, "service not found".
I think I changed that, yes. I'll check just to be sure

@fgalan
Copy link
Member Author

fgalan commented Nov 21, 2014

Feature partially implemented in PR #631

@fgalan
Copy link
Member Author

fgalan commented Nov 21, 2014

Feature partially implemented in PR #641

@fgalan
Copy link
Member Author

fgalan commented Nov 21, 2014

Although most of the work has been done in PRs #631 and #641, user input "bypass" in some cases could be potentially exploited for script injection. E.g the "Entity id 'E1'" message in the following response:

{
    "contextElement": {
        "id": "E1", 
        "isPattern": "false", 
        "type": ""
    }, 
    "statusCode": {
        "code": "404", 
        "details": "Entity id: 'E1'", 
        "reasonPhrase": "No context element found"
    }
}

Some similar happens with errors related with pagination parameters.

The following "algorithm" could be used for detecting these cases:

The problem is in error messages and in particular in the "details" field of the statusCode or errorCode objects (code and reasonPhrase are "fixed"). This is the unique case in which the client input may "bypass" to the response (all the others responsed fields are rendered from DB, where protection measures has been put in place in previous PRs).

Let's assume the constructor for statusCode and errorCode is the fill(int code, std:string details) method (*), it would be a matter of locating all the lines in the codes in which that fill() uses either: 1) a variable (i.e. no constans strings), 2) a constant string with %s, e.g. using the following "grep magic" (probably it has some bug, but I think you get the idea):

find -name *.cpp | grep fill | grep '%s'
find -name *.cpp | grep fill | grep -v '"'

In the case of variables, we have to check if the varible is field using "%s".

(*) In the case of having several constructors (I don't remember now), the same procedure could be applied to all them.

@kzangeli
Copy link
Member

In either case, don't forget we already html-escape all output.

@fgalan
Copy link
Member Author

fgalan commented Nov 22, 2014

In either case, don't forget we already html-escape all output.

I was testing with a wrong branch. I have checkout to feature/619_escaped_output and test again:

fermin@debvm$ curl 'localhost:1026/v1/contextEntities/script()=<>>;'
<contextElementResponse>
  <contextElement>
    <entityId type="" isPattern="false">
      <id>script&#40;&#41;&#61;&lt;&gt;&gt;&#59;</id>
    </entityId>
  </contextElement>
  <statusCode>
    <code>404</code>
    <reasonPhrase>No context element found</reasonPhrase>
    <details>Entity id: /script&#40;&#41;&#61;&lt;&gt;&gt;&#59;/</details>
  </statusCode>
</contextElementResponse>

fermin@debvm$ curl 'localhost:1026/v1/contextEntities?limit=;'
<orionError>
  <code>400</code>
  <reasonPhrase>Bad Request</reasonPhrase>
  <details>Bad pagination limit: /&#59;/ [must be a decimal number]</details>
</orionError>

It seems that even in the case of input "bypass" we are protected by httml escape in any case. Thus it seems we don't need any additional modification and the code is ok now.

@fgalan
Copy link
Member Author

fgalan commented Nov 24, 2014

It seems that even in the case of input "bypass" we are protected by httml escape in any case. Thus it seems we don't need any additional modification and the code is ok now.

It gets confirmed that no extra modifications are needed to protect CB from script injection. However, remaining "clean up" works has been moved to tech debt issue #661

@fgalan fgalan closed this as completed Nov 24, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants