Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEATURE REQUEST: Use Rails.cache to cache api_formatted objects #23

Closed
coneybeare opened this issue May 24, 2011 · 30 comments
Closed

FEATURE REQUEST: Use Rails.cache to cache api_formatted objects #23

coneybeare opened this issue May 24, 2011 · 30 comments

Comments

@coneybeare
Copy link
Contributor

The idea is that I would be able to fine tune a config block in the model to use memcache, expiration time, etc…

When the renderer renders the model instance as json, xml or whatever, the rendered element gets cached so that if it appears again on some other api call, it is there.

For simple models this is not necessary and could even slow things down, but on more complex models, where the api is calling methods that may induce joins or database calls, this can be a life saver.

I think that some automatic key should be generated such as api_template_#{template}model#{class.name}id#{instance.id} and use that to query.

What do you think?

@fabrik42
Copy link
Owner

Hey Matt,

a couple thoughts on this:

I think usually caching should be a task for controllers, not models. So we should think about the way to enable/use caching together with api templates (in the template declaration or the renderer).

But generally I think this is a good idea. We could start with equipping the Hash returned by model.as_api_response with a cache_key method (used by Rails to determine the cache key). Somehow like this:

response_hash = api_attributes.to_response_hash(self)
response_hash.instance_eval do
  def cache_key
    "#{class.to_s}_#{api_template.to_s}_{id}"
  end
end

@bennytheshap
Copy link
Contributor

I don't know if github allows this, but I'd totally vote for it.

@coneybeare
Copy link
Contributor Author

I like this, as long as we are inserting the cache key to each record, not just the entire returned structure. This is because someone might have list1, list2, and list3 with common elements, and those common element should only be fetched once. I am on Vacation now, but can work with you on this when I get back in a few weeks

@fabrik42
Copy link
Owner

fabrik42 commented Jun 4, 2011

I fear this will get tricky as acts_as_api is not meant to be responsible for gathering the data, just rendering it.

But it's a really cool idea, please let me know when you have time to work on it so I can join you?

@coneybeare
Copy link
Contributor Author

You brought up some good points that are making me second guess trying to put this in. For myself, I would probably just hack it to get it to work for my situation, not necessarily keeping the M and C (of MVC) separate. But putting this kind of hack into a published gem doesn't seem like a good idea :(

@coneybeare
Copy link
Contributor Author

Thought a little about this. While caching complete methods and returns should definitely be controller based, I don't think there is necessarily anything wrong with a model caching itself. This is only similar how rails loads an ActiveRecord into memory instead of working directly out of the DB the entire time. Thus, putting logic into a model to cache itself, and only itself, for ease of reading later isn't so bad.

I propose we add an option to api_accessible called :cache which when set to true (default is false) would make the as_api_response check the Rails.cache first and return the value if hit, else, generate the response and cache using the specified key format. If we want to get fancy, we can give an option for a cache_key_prefix to eliminate collisions.

Thoughts?

@fabrik42
Copy link
Owner

fabrik42 commented Jun 6, 2011

Sounds good. Maybe something like this:

You can define default caching behaviour as you proposed in the model

api_accessible :v1_default, :cache => 10.minutes do |t| 
  t.add :id
  t.add :name
end

But I think you should also be able to control the cache in the controller

format.json { render_for_api :v1_default, :json => @users, :root => :users, :api_cache => 2.minutes }

E.g. to clear the cache on #update

format.json { render_for_api :v1_default, :json => @users, :root => :users, :api_cache => false }

@benedikt
Copy link
Contributor

benedikt commented Jun 6, 2011

I don't think caching should be part of acts_as_api at all. You could always cache the bare objects or use action/page caching...

@bennytheshap
Copy link
Contributor

Where this gets tricky is with associations. Any ideas about how to handle that?

On Jun 6, 2011, at 9:06 AM, fabrik42
reply@reply.github.com
wrote:

Sounds good. Maybe something like this:

You can define default caching behaviour as you proposed in the model

api_accessible :v1_default, :cache => 10.minutes do |t|
 t.add :id
 t.add :name
end

But I think you should also be able to control the cache in the controller

format.json { render_for_api :v1_default, :json => @users, :root => :users, :api_cache => 2.minutes }

E.g. to clear the cache on #update

format.json { render_for_api :v1_default, :json => @users, :root => :users, :api_cache => false }

Reply to this email directly or view it on GitHub:
#23 (comment)

@coneybeare
Copy link
Contributor Author

@benedikt When looking at this from an api perspective, let me give you a use case for when this is particularly useful for me.

I have a search api for my app ambiance which is searching sounds, rendering an api response that calls in ratings (millions of rows join table). A search for "Rai" and "Rain" will turn up many of the same results, and are usually called right after each other as a predictive search. These results that turn up greatly benefit from caching the response and is essentially action caching, but at a level that makes sense for an API call.

After optimizing the DB calls to the max and profiling the code to remove any issues, the single bottle neck is the join call to get the rating field for the api. I am left with 2 options:

  1. Introduce data redundancy in the tables by caching the rating field in the sound
  2. Caching a response as indicated above so the next search call for this model will not have to do the same join on the ratings, only to produce the same result.

I like number 2 better. What else would you suggest?

@coneybeare
Copy link
Contributor Author

@bennytheshap I don't see how it would matter as the caching is only caching the generated json or xml for an object. Nothing about how the xml or json is generated would change at all. As with any caching scheme, you will probably need to implement logic to detroy/update cached values when certain events trigger. Can you explain your concern a little more?

@bennytheshap
Copy link
Contributor

My concern is that if I have a class A that is associated with class B (and
elements of B are included in the API response for A), then if A's cache
time is 10 minutes and B's is 5 minutes, should the 5 minutes win? Could A
cache itself in parts so that it can avoid re serializing most of itself but
then re-json B and stick it in?

On Mon, Jun 6, 2011 at 9:28 AM, coneybeare <
reply@reply.github.com>wrote:

@bennytheshap I don't see how it would matter as the caching is only
caching the generated json or xml for an object. Nothing about how the xml
or json is generated would change at all. As with any caching scheme, you
will probably need to implement logic to detroy/update cached values when
certain events trigger. Can you explain your concern a little more?

Reply to this email directly or view it on GitHub:
#23 (comment)

@benedikt
Copy link
Contributor

benedikt commented Jun 6, 2011

@coneybeare
Try to store the result of the rating field in an instance variable so it gets serialized during the caching process. Then after you get the collection of sound objects from database iterate them and Rails.cache.fetch while making sure the rating gets calculated once in the block of Rails.cache.fetch.

@fabrik42
Copy link
Owner

fabrik42 commented Jun 6, 2011

While I really understand the need of caching, I'm still not 100 percent convinced if acts_as_api is the right place to do this.
As I mentioned before, acts_as_api is not meant to be responsible for collecting data, just rendering it.

I like the idea in general, but I fear that we implement something that won't fit the needs of any other user of the lib.
As you can see in this thread, there are a lot of different caching strategies :) So maybe we should leave it up to the developer as caching a response from acts_as_api should be pretty simple.

@bennytheshap: I don't think this would be a good idea, because a) it would get pretty complicated and b) acts_as_api is orm agnostic. It does not know about associations, it just handles ruby methods. And caching every sub template on its own could lead to lots overhead (esp. if you don't have any other use for it).

@coneybeare
Copy link
Contributor Author

@fabrik42 I agree that not everyone will benefit from this. Perhaps, instead of caching within the gem, we provide model hooks before and after the rendering. This way, users can do whatever they want without affecting other users of the gem.

For example, we can provide a before hook that asks if rendering should proceed. In that callback, I would check the cache for a key for the model instance. If found, I return it and the rendering is not necessary. If nil, the rendering continues and an after hook is called, letting the user cache or do whatever after the rendering has completed.

Thoughts on this approach?

On Jun 6, 2011, at 11:03 AM, fabrik42reply@reply.github.com wrote:

While I really understand the need of caching, I'm still not 100 percent convinced if acts_as_api is the right place to do this.
As I mentioned before, acts_as_api is not meant to be responsible for collecting data, just rendering it.

I like the idea in general, but I fear that we implement something that won't fit the needs of any other user of the lib.

@bennytheshap: I don't think this would be a good idea, because a) it would get pretty complicated and b) acts_as_api is orm agnostic. It does not know about associations, it just handles ruby methods. And caching every sub template on its own could lead to lots overhead (esp. if you don't have any other use for it).

Reply to this email directly or view it on GitHub:
#23 (comment)

@bennytheshap
Copy link
Contributor

I like it.

On Mon, Jun 6, 2011 at 10:28 AM, coneybeare <
reply@reply.github.com>wrote:

@fabrik42 I agree that not everyone will benefit from this. Perhaps,
instead of caching within the gem, we provide model hooks before and after
the rendering. This way, users can do whatever they want without affecting
other users of the gem.

For example, we can provide a before hook that asks if rendering should
proceed. In that callback, I would check the cache for a key for the model
instance. If found, I return it and the rendering is not necessary. If nil,
the rendering continues and an after hook is called, letting the user cache
or do whatever after the rendering has completed.

Thoughts on this approach?

On Jun 6, 2011, at 11:03 AM, fabrik42reply@reply.github.com wrote:

While I really understand the need of caching, I'm still not 100 percent
convinced if acts_as_api is the right place to do this.
As I mentioned before, acts_as_api is not meant to be responsible for
collecting data, just rendering it.

I like the idea in general, but I fear that we implement something that
won't fit the needs of any other user of the lib.

@bennytheshap: I don't think this would be a good idea, because a) it
would get pretty complicated and b) acts_as_api is orm agnostic. It does not
know about associations, it just handles ruby methods. And caching every sub
template on its own could lead to lots overhead (esp. if you don't have any
other use for it).

Reply to this email directly or view it on GitHub:
#23 (comment)

Reply to this email directly or view it on GitHub:
#23 (comment)

@coneybeare
Copy link
Contributor Author

@benedikt What you suggest still doesn't solve the problem of rendering the same model multiple times. This is the issue we are trying to solve here, and where to put the logic or hooks to make this happen. I only gave a single example of how performance might be increased in a certain scenario. At least for my app, ruby-prof tells me that the majority of the time spent in an API call is in the rendering of the model, and thus I seek ways to minimize this across thousands of calls.

@benedikt
Copy link
Contributor

benedikt commented Jun 6, 2011

@coneybeare
You're right. But based on what you described above I assume it's not the rendering itself that is slow, but the method calls to your model performed by the rendering process?

@coneybeare
Copy link
Contributor Author

@benedikt Yes that is true, it is the method calls. I guess it is preference… I would rather cache the entire rendered structure than cache the individual slow parts of it separately. To me, it seems cleaner, executes less code, uses less code to write and opens up fewer places to introduce error

@coneybeare
Copy link
Contributor Author

I've got something working locally, but I have never implemented a callback method from a gem before so I ask the community if there is a better way. Basically, there are two default methods that can be overwritten in each model to do stuff with

      # Creates the api response of the model and returns it as a Hash.
      # Will raise an exception if the passed api template is not defined for the model
      def as_api_response(api_template)

        if response_hash = prerendered_api_response(api_template.to_s)
          return response_hash 
        end

        api_attributes = self.class.api_accessible_attributes(api_template)
        raise ActsAsApi::TemplateNotFoundError.new("acts_as_api template :#{api_template.to_s} was not found for model #{self.class}") if api_attributes.nil?

        api_response_rendered api_template.to_s, api_attributes.to_response_hash(self)        
      end

      protected

      # Callback to overwrite if rendering should proceed or not.
      # The input is the api_template name
      # Return nil if as_api_response should render, otherwise return the prerendered content
      def prerendered_api_response api_template
        nil
      end

      # Callback to overwrite if you want to do something with the response_hash before returning.
      # The input is the api_template name and the rendered_hash
      # Manipulate or do something with it, then return it when done
      def api_response_rendered api_template, api_response
        api_response
      end

I use it like this (not part of the gem):

  def prerendered_api_response api_template
    if ExampleServer::Application.config.action_controller.perform_caching && api_template
      Rails.cache.read "api_response_#{self.class.to_s}_#{id}_#{api_template.to_s}"
    end
  end

  def api_response_rendered api_template, api_response
    if ExampleServer::Application.config.action_controller.perform_caching && api_response
      Rails.cache.write "api_response_#{self.class.to_s}_#{id}_#{api_template.to_s}", api_response, :expires_in => 1.day
    end
    api_response
  end

@fabrik42
Copy link
Owner

fabrik42 commented Jun 9, 2011

We also could try to implement it using the active model callbacks. (http://api.rubyonrails.org/classes/ActiveModel/Callbacks.html#method-i-define_model_callbacks)

But the problem here is that they don't support custom arguments to be passed to the callback functions.

@coneybeare
Copy link
Contributor Author

@fabrik42 Yeah, i think we need the args because there can be many different api_templates. We also need a way to prevent execution of the method and I am not sure that can be accomplished with the active model callbacks. This is your decision here… I wan't to know which way you think is best before i give you a pull request

@fabrik42
Copy link
Owner

fabrik42 commented Jun 9, 2011

Ok, a couple thoughts:

  • I'd call it before_api_response and after_api_response, even though it is not a real callback but more kind of filter I think it's still easier to understand + "render" in a model does not sound so good imho ;)
  • I would not predefine the methods. Instead I would use respond_to? to check if they exist.
  • I'd move the possible ActsAsApi::TemplateNotFoundError exception before the "before" method, to prevent that something is returned even though the api template does not exist anymore.
    In this example acts_as_api would deliver results when the cache exists while the api template was already removed.
  • If all of this works, would you mind to write a wiki entry about your caching?

@coneybeare
Copy link
Contributor Author

I can write it up. The before_api_response is a little misleading though because if it returns anything other than nil, that content will be served instead of the real render. Perhaps we add three methods:

before_api_response: just a hook before, no stopping of execution
existing_api_response (or similarly named): Method that returns nil to proceed, anything else to serve that as well
after_api_response: just a hook before, no stopping of execution

@fabrik42
Copy link
Owner

fabrik42 commented Jun 9, 2011

You're right about the potential mislead of the method names combined with this behaviour.

I think it should be
before_api_response -> Only hook
around_api_response -> Like the around callbacks or ActiveModel. You can do stuff before and after the execution of as_api_response and you have to yield it yourself (so you can also return something else in fact).
after_api_response -> Only hook

@coneybeare
Copy link
Contributor Author

I have been hacking away at it but I think the around documentation is pretty poorly written. I asked a SO question and will see what people say: http://stackoverflow.com/questions/6299001/how-can-i-use-custom-callbacks-on-an-active-model

@fabrik42
Copy link
Owner

I think it should work something like this:

def as_api_response(api_template)
  api_attributes = self.class.api_accessible_attributes(api_template)
  raise ActsAsApi::TemplateNotFoundError.new("acts_as_api template :#{api_template.to_s} was not found for model #{self.class}") if api_attributes.nil?

  before_api_response(api_template) if respond_to? :before_api_response

  response_hash = if respond_to? :around_api_response
    around_api_response api_template do
      api_attributes.to_response_hash(self)
    end
  else
    api_attributes.to_response_hash(self)
  end

  after_api_response(api_template) if respond_to? :after_api_response

  response_hash
end

Then you could implement a around_api_response method like this:

def around_api_response(api_template)
  # do something
  # e.g. return something else instead

  # call the real as_api_response
  resp = yield

  # do something else
  return resp  
end

This is pretty similar to the usage of rails around callbacks

@coneybeare
Copy link
Contributor Author

I figured I could do it this way, but was trying to use the official ActiveSupport way so as to support flexibility of different method names, lambdas, procs and blocks. I will dig in and figure it out.

On Jun 9, 2011, at 1:35 PM, fabrik42reply@reply.github.com wrote:

You're right about the potential mislead of the method names combined with this behaviour.

I think it should be
before_as_api_response -> Only hook
around_as_api_response -> Like the around callbacks or ActiveModel. You can do stuff before and after the execution of as_api_response and you have to yield it yourself (so you can also return something else in fact).
after_as_api_response -> Only hook

Reply to this email directly or view it on GitHub:
#23 (comment)

@fabrik42
Copy link
Owner

I already did this, but the problem is that the official callbacks don't support parameters.

For me it seems like run_callbacks(kind, *args, &block) once was meant to accepts args
https://github.com/rails/rails/blob/master/activesupport/lib/active_support/callbacks.rb#L80

But the __define_runner method creates the callback methods in a way that they only accept one parameter (for per-key conditions)
https://github.com/rails/rails/blob/master/activesupport/lib/active_support/callbacks.rb#L372

So if we want to use the official Callback methods we can't pass parameters. Maybe we need to save them in an instance var? I don't like it, but it would solve the problems...

@coneybeare
Copy link
Contributor Author

Submitted a pull request

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants