Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memoizing Request#POST #749

Closed
wants to merge 1 commit into from
Closed

Memoizing Request#POST #749

wants to merge 1 commit into from

Conversation

Wardrop
Copy link
Contributor

@Wardrop Wardrop commented Oct 24, 2014

I've added logic to my application so when JSON data is POST'd to it, it's automatically parsed and merged with Request#POST. The processing is meant to transparent to the rest of the application. This is how I deal with JSON data posted to the server by JavaScript code on the client.

request.POST.merge!(JSON.parse('...'))

The problem is, if no form data has been POST'd, Request#POST returns a new empty hash every time it's called, so my merge! doesn't stick. Looking at the source code for Request#POST, it's easy to see the problem. I'm wondering if we can modify Request#POST so it memoizes the empty hash it returns if no form data exists, instead of returning a new hash every time.

Otherwise, I'm open to suggestions on how I could go about this another way that achieves the same result, but Request#POST returning a new hash every time seems wrong either way.

@Wardrop
Copy link
Contributor Author

Wardrop commented Oct 23, 2014

Wondering what people think of this refactor. Very similar to the original, except it merges form data with @env["rack.request.form_hash"] instead of replacing it. This implementation allows middleware and other application code to add to @env["rack.request.form_hash"].

Compare to original: https://github.com/rack/rack/blob/master/lib/rack/request.rb#L201

def POST
  raise "Missing rack.input" if @env["rack.input"].nil?
  @env["rack.request.form_hash"] ||= {}

  if !(@env["rack.request.form_input"].equal? @env["rack.input"]) && (form_data? || parseable_data?)
    parsed_result = parse_multipart(env)
    if parsed_result
      @env["rack.request.form_hash"].merge! parsed_result
    else
      form_vars = @env["rack.input"].read

      # Fix for Safari Ajax postings that always append \0
      # form_vars.sub!(/\0\z/, '') # performance replacement:
      form_vars.slice!(-1) if form_vars[-1] == ?\0

      @env["rack.request.form_vars"] = form_vars
      @env["rack.request.form_hash"].merge! parse_query(form_vars)

      @env["rack.input"].rewind
    end
    @env["rack.request.form_input"] = @env["rack.input"]
  end

  @env["rack.request.form_hash"]
end

@raggi
Copy link
Member

raggi commented Oct 25, 2014

Subclass (or wrap) rack request to add your own semantics.

@Wardrop
Copy link
Contributor Author

Wardrop commented Oct 25, 2014

Of course, or in my case monkey patching it, but I think my proposition may be a better implementation of Request#POST that's more predictable. I'm sure most people would assume Request#POST always returns the same object, but in the case of an empty hash it's different every time it's called. I don't think that behavior is ideal, and I think my patch elegantly addresses this while being unlikely to disturb any existing code.

Is it not worth considering a pull? If not, I'd be interested to hear your rationale.

@raggi
Copy link
Member

raggi commented Oct 25, 2014

So we've actually had a bug before when one of the code paths didn't return the same object after a partial parse. IIRC it was rails related.

If we do accept this, it's a significant semantic change that we should test it's impact on the frameworks.

I do agree with you wrt consistency.

@Wardrop
Copy link
Contributor Author

Wardrop commented Oct 26, 2014

Agreed. How does Rack handle such changes? I'm happy if this had to wait until v1.7?

@raggi
Copy link
Member

raggi commented Oct 26, 2014

That's up to @tenderlove now.

@tenderlove
Copy link
Member

Mutating data internal to the request object makes me cringe big time. I think an argument could be made for duping a populated form hash so that this method always returns a new object. You should wrap the request and proxy methods rather than expecting that mutating internal data structures works.

Aaron Patterson
http://tenderlovemaking.com/
I'm on an iPhone so I apologize for top posting.

On Oct 26, 2014, at 2:01 PM, James Tucker notifications@github.com wrote:

That's up to @tenderlove now.


Reply to this email directly or view it on GitHub.

@Wardrop
Copy link
Contributor Author

Wardrop commented Oct 26, 2014

I can understand that. I'm curious though, if I was developing middleware to parse POST'd JSON into a hash to be made accessible via Request#POST, how would one achieve that? The implementation of Request#POST seems non-extensible in this regard.

My implementation of Request#POST is such that any middleware or application can safely modify@env["rack.request.form_hash"]. Request#POST uses the condition @env["rack.request.form_input"].equal? @env["rack.input"] to know if it's parsed the input and merged the resulting hash, so it only does it once, and should be completely tolerant of other code making changes to that hash, even before Request#POST has had a chance to touch it. I see it as a rather graceful, low-impact method of allowing other code to add additional parsing for POST'd data.

@tenderlove
Copy link
Member

I can understand that. I'm curious though, if I was developing middleware to parse POST'd JSON into a hash to be made accessible via Request#POST, how would one achieve that? The implementation of Request#POST seems non-extensible in this regard.

You hold on to an instance of the Rack request, then implement your own version of POST, delegate to the underlying request, then merge the parsed JSON with the return value of the wrapped request object. Can you be more specific about how it isn't extensible using a proxy object?

My implementation of Request#POST is such that any middleware or application can safely modify@env["rack.request.form_hash"]

Now we're talking about mutating shared state again. :'(

@Wardrop
Copy link
Contributor Author

Wardrop commented Oct 27, 2014

I'm not sure I'm completely following. Here's my 5 minute JSON parser which assumes my implementation of Request#POST:

require 'json'

class JSONFormParser
  def initialize(app)
    @app = app       
  end                

  def call(env)
    if env['CONTENT_TYPE'] && env['CONTENT_TYPE'].match(%r{^application/json})
      parsed_body = JSON.parse(env["rack.input"].read) rescue nil
      env["rack.request.form_hash"] ||= {}
      env["rack.request.form_hash"].merge!(parsed_body) if Hash === parsed_body
      env["rack.input"].rewind
    end
    @app.call(env)
  end  
end

The important thing is that this is effectively transparent to the end user, meaning their code doesn't have to change. They keep using Request#POST as they always have, except now POST'd JSON data is parsed for them like an ordinary form would be.

How would I achieve this using proxying? You can't, not outside of your application. Even then, you're making an assumption that the same Request object is used for the life of the request. Only data in the rack env object can be relied upon, hence why the current implementation of Request#POST is necessary. This is what I mean by Request#POST being non-extendable. It means other code (e.g middleware) can't add extra form parsing capabilities to Rack.

If you don't want to mutate shared state, you'd have to add extra infrastructure to Rack to allowing other code to subscribe their own form parsers, etc. They may technically be better, but a re-implemented Request#POST like I've written achieves the same result more simply. It's a 90% solution, anything more I imagine would have to wait for a Rack 2.0.

I hope I haven't missed something?

@tenderlove
Copy link
Member

I see. Since we're relying on mutating shared data (the env hash), and the request object only mutates the hash sometimes, then we have a problem. (Sorry, I've got a cold, so I think I'm the one who wasn't understanding)

I don't want to make this change in 1.6, but I'm OK merging it in for 1.7.

@ioquatix
Copy link
Member

We could do this, but I think we should also freeze the cached data.

@jeremyevans
Copy link
Contributor

We already cache the params, so caching POST seems reasonable. For consistency, we should probably also cache GET as well if we do this. Note that params caching occurs as a separate Request instance variable (not in env), so presumably POST and GET caching should operate the same, which is a different implementation than this PR.

@ioquatix ioquatix added this to the v3.0.0 milestone Feb 5, 2020
@ioquatix
Copy link
Member

@jeremyevans do you still agree with your assessment? I agree with it too. However, I think keeping all state within the Request instance without modifying env would be the most appropriate implementation. WDYT?

@jeremyevans
Copy link
Contributor

Yep, I still think it's reasonable to cache in a Request instance variable.

jeremyevans added a commit to jeremyevans/rack that referenced this pull request Jan 25, 2022
Fixes rack#749)

In all other cases, the result was cached, so not caching in this
case is inconsistent, and can result in unexpected behavior if
POST is called multiple times on the same request.
@jeremyevans
Copy link
Contributor

I did a more detailed review of this and I think the only issue is we already always cache GET results in env, and already cache POST results in env, except for the corner case where we don't try parsing the input data due to it not having a recognized content type. We should cache in that case as well, using the same approach as other GET/POST caching. I've submitted #1795 to implement that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants