Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for raw_input_stream and json_input_stream #3604

Merged
merged 7 commits into from Feb 26, 2015

Conversation

Ignasimg
Copy link
Contributor

There might be cases were a client sends a request with data not in the key1=value1&key2=value2&... fashion for those cases I've added the raw_input_stream.
The input_stream function has been modified to get the data from raw_input_stream.
json_input_stream has been added for cases where the body of the request is in a json format.

Some notes:
$_input_stream has been preserved cause some people might be using it on a extension of the class, even though since 3.0 it's a pre-release version maybe it could be deleted.
On the other hand and even though it's not implemented maybe it would be better (for performance) to just parse both the input / json once, and leave it on a variable, to avoid parsing on the next call.

Added support for json input stream. (Not tested)
@narfbg
Copy link
Contributor

narfbg commented Feb 19, 2015

Hmm ... I'm thinking pretty much what you've described in your notes, only you deffinately don't need to preserve the property name, and given that the raw data doesn't need special parsing, I'd rather have something like this:

protected $_input_stream_raw;
protected $_input_stream_parsed;

public function input_stream($index = NULL, $xss_clean = NULL)
{
    if ( ! is_array($this->_input_stream_parsed))
    {
        isset($this->_input_stream_raw) OR $this->_input_stream_raw = file_get_contents('php://input');
        parse_str($this->_input_stream_parsed, $this->_input_stream_raw);
        is_array($this->_input_stream_parsed) OR $this->_input_stream_parsed = array();
    }

    return $this->_fetch_from_array($this->_input_stream_parsed, $index, $xss_clean);
}

public function __get($name)
{
    if ($name === 'input_stream')
    {
        isset($this->_input_stream_raw) OR $this->_input_stream_raw = file_get_contents('php://input');
        return $this->_input_stream_raw;
    }
}

Dedicating a function to the JSON format isn't necessary as it is one of the easiest things to parse. With the above code, you could simply do $json_input = json_decode($this->input->input_stream) and have control over json_decode()'s other options. Using a getter also allows making other stuff like the $ip_address property protected because it shouldn't be modified by userspace code, but still have it readable (don't make that change now).

But, judging by your commit messages, I assume you're mistaking the docblocks for our documentation ... They're inline documentation, yes, but only for those who dive into the framework's code. Any new features should also be explained in the appropriate page under user_guide_src/source/ and with a changelog.rst entry as well.

@Ignasimg
Copy link
Contributor Author

Oh ok, I'll try to update the other files you mention later.

Some things I don't like about the solution you purpose:

1- The function input_stream and the pseudo-property input_stream give 2 completely diferent things I don't really mind about the name, but I think it can lead to confussion

2- I think the function input_stream would be better like that:

public function input_stream($index = NULL, $xss_clean = NULL)
{
    if ( ! is_array($this->_input_stream_parsed))
    {
        parse_str($this->input_stream, $this->_input_stream_parsed);
        is_array($this->_input_stream_parsed) OR $this->_input_stream_parsed = array();
    }

    return $this->_fetch_from_array($this->_input_stream_parsed, $index, $xss_clean);
}

Losing a little bit of performance cause yours was like inlined but I think it's much more clean.


About the json function for parsing the body of the request, on one side I agree with you that it's trivial to just decode on the controller and you can also have a much nicer way to access the extra attributes provided by the json decoder, on the other side, even though parse_str don't have so many options and so it's more clean, the task of the parse_str and jsondecode functions is quite similar, parse a string into an array (even though json might have more options to set it as an object and other crazy stuff) anyway, the main reason I put the json_input_string function is to use the _fetch_from_array function, for the cleaning of the variables.

But anyway this stuff I have just written it's useless cause I found a much better solution, but before getting my hands dirty into it I'd like to have your opinion on that. As I said the parse_str and jsondecode functions have a similar functionality, that is convert a string to an array. And even though I can't think of any other common format (maybe some kind of xmlparser that parses both attributes and child into an array? that'd be weird but for the sense of example) it's deffinely a format independent task.

So what I think it should be something similar to:

public function input_stream($index = NULL, $xss_clean = NULL, $parser = 'parse_str')
{
    if ( ! is_array($this->_input_stream_parsed))
    {
        call_user_func_array($parser, array($this->input_stream, &$this->_input_stream_parsed));
        is_array($this->_input_stream_parsed) OR $this->_input_stream_parsed = array();
    }

    return $this->_fetch_from_array($this->_input_stream_parsed, $index, $xss_clean);
}

Majority of people can still use parse_str by default and others who need more special things can build their partial application that uses those very same parameters, and use it.

@narfbg
Copy link
Contributor

narfbg commented Feb 19, 2015

Agreed on point 1, maybe still call it raw_input_stream but a property?

Point 2 can go either way (although you've switched parameter places in your parse_str() call ...), but performance was my goal. :)


Ideally, how the stream input is parsed would depend on the Content-Type header, but if we go that route, it's too much work for input_stream() alone while going further is a massive effort that I'd rather leave for CI4.

When that's not the case, you can't automatically respond with HTTP 415, which has left you with a switchable parameter ... Yes, that's neat and you should absolutely use it as a custom extension if it fits your case, but if we start adding such options, it means we're inclined to add support for all kinds of parsers in the future and that's just not feasible.
Even the $xss_clean parameter was added just for consistency with the rest of CI_Input's methods, which are almost all just mappings to already parsed multipart/form-data, so I really don't want to go beyond that for CI3.

@Ignasimg
Copy link
Contributor Author

I switched the parse_str parameters cause I think you switched them before hahahah

And I don't understand "but if we start adding such options, it means we're inclined to add support for all kinds of parsers in the future and that's just not feasible."

In my opinion you are just adding support for functions that have the same signature as parse_str. (And in fact, jsondecode, already doesn't)

So if someone needs the json, in the controller they'd just do:

// I didn't try the code so there might be some bug
$this->input->input_string('funnyindex', true, function($source, &$result) {
 $result = json_decode($source, true);
});

And where it says json_decode it could just be any parser with the apropiate parameters for each place, so in my opinion the support for the given parser is added by the enduser not by CI.

@narfbg
Copy link
Contributor

narfbg commented Feb 19, 2015

Well, yes - that does look neat with the callback. But excluding an abstracted API, it brings zero value when you can do that with the same amount of code without using the framework's functions, and especially when the user is responsible of injecting the parser - it's still their effort, so we offer nothing in practice.

Plus, you seem to be be advocating this because of the XSS clean feature, and that one should only be used on output, not input. When you're not dealing form-data, its even less useful - even more often harmful than not.

@Ignasimg
Copy link
Contributor Author

Well, for the most part I'd agree with you that XSS sanitizing is only needed on output (even it's not 100% truth, more on that later), but then why all the input functions have the option to sanitize? I guess is just a way to help.

But about the XSS sanitizing on input there is one very particular case: mongoDB (by what I've seen in other posts I think you don't like this database so much but the truth is that it's gaining popularity everyday, so... this is life). Queries on mongoDB can execute arbitrary javascript, and hence it's a must to filter possible XSS on input. Not only that but also when you work with mongoDB you must work with json strings.

I'm not saying it couldn't be made another way, in fact I already said on the other post we were discussing some days ago, that if some third party software has some security issue with some kind of string it should be the driver who handles it not another class; And I still think like that and maintain it. But the truth is that when you work with mongoDB json and xss sanitizing should be really coupled together.

Now that being said my question is, if you help people who don't use mongodb to XSS-sanitize their inputs (which as you said they don't really have any need, since they could do it on output), why won't you help people who use mongoDB since those are probably gonna receive their input as json, and will want it to be XSS-sanitized?

And I also don't agree with you about the "adding zero value", maybe it doesn't add as much value as just saying which parser you want and it magically does so, but it deffinely helps, just as the XSS filtering on the input class.

@narfbg
Copy link
Contributor

narfbg commented Feb 20, 2015

Well, for the most part I'd agree with you that XSS sanitizing is only needed on output (even it's not 100% truth, more on that later), but then why all the input functions have the option to sanitize? I guess is just a way to help.

The option exists solely for backwards compatibility. It is something that CodeIgniter has done the wrong way from the start and it is now very difficult to remove that feature without breaking every single application that uses the option, which also defaulted to TRUE in previous versions. Changing it to the value of 'global_xss_filtering' is a step in discouraging its usage.
In fact, the more messages I exchange with you on this topic, the more I think I should just remove that option from input_stream(), because it's a new feature that doesn't need to maintain BC and consistency with other methods alone is probably not worth it.

But about the XSS sanitizing on input there is one very particular case: mongoDB (by what I've seen in other posts I think you don't like this database so much but the truth is that it's gaining popularity everyday, so... this is life). Queries on mongoDB can execute arbitrary javascript, and hence it's a must to filter possible XSS on input. Not only that but also when you work with mongoDB you must work with json strings.

It is true that I don't like it, but that's irrelevant ... I'm not sure how you got that impression, I've only said that it doesn't fit into our DB infrastructure because it's not an SQL database. But anyway ...

XSS is a browser attack and has nothing to do with MongoDB, even if it does execute arbitrary JS code. Plus, even if a context-aware XSS filter just happens to be useful in input escaping for MongoDB, ours is unfortunately not context aware and it deals a lot more with HTML than with raw JavaScript, making it a lot more aggressive and unsuitable for that purpose.
And JSON itself is not an XSS attack vector itself, IIRC.

I'm not saying it couldn't be made another way, in fact I already said on the other post we were discussing some days ago, that if some third party software has some security issue with some kind of string it should be the driver who handles it not another class; And I still think like that and maintain it. But the truth is that when you work with mongoDB json and xss sanitizing should be really coupled together.

Well, you said it - it's the MongoDB driver/library's job to do it. ;)

Now that being said my question is, if you help people who don't use mongodb to XSS-sanitize their inputs (which as you said they don't really have any need, since they could do it on output), why won't you help people who use mongoDB since those are probably gonna receive their input as json, and will want it to be XSS-sanitized?

I believe I already answered that above. Plus, CodeIgniter doesn't support MongoDB - not our problem.

And I also don't agree with you about the "adding zero value", maybe it doesn't add as much value as just saying which parser you want and it magically does so, but it deffinely helps, just as the XSS filtering on the input class.

// Your suggestion:
$this->input->input_stream(NULL, FALSE, function($input, &$output) {
    $output = json_decode($input, TRUE);
});

// Raw PHP (assuming we get the stream as a property):
$json = json_decode($this->input->raw_input_stream);

I really don't see any value added by that.

@Ignasimg
Copy link
Contributor Author

Ignasimg commented Feb 20, 2015 via email

@narfbg
Copy link
Contributor

narfbg commented Feb 20, 2015

You don't need to submit a new pull request ... back-and-forth commits to adjust a proposed patch is one of the few thins that we don't nitpick on. :)

Just push to the same branch, I think you can even do a hard reset, commit & force push if you want to keep it clean.

@Ignasimg
Copy link
Contributor Author

Ok I think this is it.

@narfbg
Copy link
Contributor

narfbg commented Feb 21, 2015

Now the nitpicking begins ... :)

/**
* Input stream data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert these space deletions.

@narfbg
Copy link
Contributor

narfbg commented Feb 21, 2015

Most of these are styling issues, so you should probably read our styleguide as well.

The property also needs to be described in the class reference section, see how that's done via .. attribute:: in the Output class docs.

And again, probably change $_input_stream to $_parsed_input_stream or similar ...

is_array($this->_input_stream) OR $this->_input_stream = array();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still inserting whitespace :)

@@ -303,8 +305,8 @@ public function server($index, $xss_clean = NULL)
*
* Useful when you need to access PUT, DELETE or PATCH request data.
*
* @param string $index Index for item to be fetched
* @param bool $xss_clean Whether to apply XSS filtering
* @param string $index Index for item to be fetched
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still altering these lines ... they shouldn't appear as changed in the diff at all. Original was a single tab for separator.

@narfbg
Copy link
Contributor

narfbg commented Feb 26, 2015

After the whitespace changes, it's good to go. :)

@Ignasimg
Copy link
Contributor Author

Ok, sorry about all that, I might look like slow in the head, but I'm using such a funny tool to deal with git... gess I shall use the apropiate tools from now on... =_='

narfbg added a commit that referenced this pull request Feb 26, 2015
Add support for raw_input_stream
@narfbg narfbg merged commit 23824b8 into bcit-ci:develop Feb 26, 2015
@narfbg
Copy link
Contributor

narfbg commented Feb 26, 2015

Literally nobody does it right the first time, don't worry about that. :)

@avenirer
Copy link
Contributor

The important thing is that you helped. Literally everyone has something bad to say, but very few find the time to do something good.

@ivantcholakov
Copy link
Contributor

After merging (23824b8) I got the following:

A PHP Error was encountered
Severity: Notice
Message: Indirect modification of overloaded property CI_Input::$security has no effect
Filename: core/Input.php
Line Number: 133

Is the added magic method __get($name) {...} mandatory for this task?

@ivantcholakov
Copy link
Contributor

A forgot to mention the real error:

A PHP Error was encountered
Severity: Error
Message: Cannot assign by reference to overloaded object
Filename: core/Input.php
Line Number: 133

The same might happen with other property names too, after fixing $this->security.

Edit:

PHP 5.6.5-1+deb.sury.org~utopic+1 (cli) (built: Jan 26 2015 11:51:35) 

@Ignasimg
Copy link
Contributor Author

Yep, the quick and dirty solution is to change __get to &__get and change the assigns by reference (lines 133 and 138) to normal assignments...

narfbg can you revert and let's just do it in a function to avoid magic method mines?

narfbg added a commit that referenced this pull request Feb 27, 2015
narfbg added a commit that referenced this pull request Feb 27, 2015
ivantcholakov added a commit to ivantcholakov/CodeIgniter that referenced this pull request Feb 27, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants