New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy-load items #794

Merged
merged 4 commits into from Jan 15, 2016

Conversation

Projects
None yet
2 participants
@RubenVerborgh
Member

RubenVerborgh commented Jan 3, 2016

This pull request allows documents to lazily load attributes and content, as discussed in nanoc/rfcs#3.

To do

  • update new_item documentation
  • implement support in filesystem datasource
@RubenVerborgh

This comment has been minimized.

Show comment
Hide comment
@RubenVerborgh

RubenVerborgh Jan 3, 2016

Member

Tested this with my BibTeX datasource, and I'm seeing a compilation speed gain already.

Two questions for @ddfreyne:

  • Am I correct to implement the laziness for content in Content, or should this be in Document? My reasoning for changing Content: new_item directly uses Content.create.
  • Does the laziness handling with the holder seem alright? This was needed for freezing.
Member

RubenVerborgh commented Jan 3, 2016

Tested this with my BibTeX datasource, and I'm seeing a compilation speed gain already.

Two questions for @ddfreyne:

  • Am I correct to implement the laziness for content in Content, or should this be in Document? My reasoning for changing Content: new_item directly uses Content.create.
  • Does the laziness handling with the holder seem alright? This was needed for freezing.
@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 3, 2016

Member

The frozenness and symbolisation handling feels a little awkward, and I think it could do with some more abstraction. Here’s an idea:

class LazyValue
  def initialize(thing)
    @thing = thing
  end

  def value
    @value ||= transform(raw_value)
  end

  def raw_value
    @raw_value ||=
      if @thing.respond_to?(:call)
        @thing.call
      else
        @thing
      end
  end

  def transform(x)
    x
  end

  def freeze
    value.freeze
  end
end

class LazyAttributesValue < LazyValue
  def transform(x)
    x.__nanoc_symbolize_keys_recursively
  end

  def freeze
    value.__nanoc_freeze_recursively
  end
end

class LazyContentValue < LazyValue
end

That code could be used like this (pseudocode):

class TextualContent < Content
  def string
    @string_holder.value
  end
end

class Document
  def initialize(content, attributes_like, identifier)
    #
    @attributes_holder = LazyAttributesValue.new(attributes_like)
    #
  end

  def attributes
    @attributes_holder.value
  end
end

Untested, but hopefully this gets the idea across! This approach is purely functional and easier to use/understand.

Not sure about the “holder” name though.

Member

ddfreyne commented Jan 3, 2016

The frozenness and symbolisation handling feels a little awkward, and I think it could do with some more abstraction. Here’s an idea:

class LazyValue
  def initialize(thing)
    @thing = thing
  end

  def value
    @value ||= transform(raw_value)
  end

  def raw_value
    @raw_value ||=
      if @thing.respond_to?(:call)
        @thing.call
      else
        @thing
      end
  end

  def transform(x)
    x
  end

  def freeze
    value.freeze
  end
end

class LazyAttributesValue < LazyValue
  def transform(x)
    x.__nanoc_symbolize_keys_recursively
  end

  def freeze
    value.__nanoc_freeze_recursively
  end
end

class LazyContentValue < LazyValue
end

That code could be used like this (pseudocode):

class TextualContent < Content
  def string
    @string_holder.value
  end
end

class Document
  def initialize(content, attributes_like, identifier)
    #
    @attributes_holder = LazyAttributesValue.new(attributes_like)
    #
  end

  def attributes
    @attributes_holder.value
  end
end

Untested, but hopefully this gets the idea across! This approach is purely functional and easier to use/understand.

Not sure about the “holder” name though.

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 3, 2016

Member

Am I correct to implement the laziness for content in Content, or should this be in Document?

Seems correct!

Does the laziness handling with the holder seem alright? This was needed for freezing.

It’s confusing, and it’s not DRY, so it’s a little tricky to tell. I think abstracting the laziness handling away (e.g. using the suggestion above) could make this easier.

Member

ddfreyne commented Jan 3, 2016

Am I correct to implement the laziness for content in Content, or should this be in Document?

Seems correct!

Does the laziness handling with the holder seem alright? This was needed for freezing.

It’s confusing, and it’s not DRY, so it’s a little tricky to tell. I think abstracting the laziness handling away (e.g. using the suggestion above) could make this easier.

@RubenVerborgh

This comment has been minimized.

Show comment
Hide comment
@RubenVerborgh

RubenVerborgh Jan 3, 2016

Member

Your suggestion of LazyValue also crossed my mind, especially if this becomes a common thing (I was still missing the transform idea though). Where should I put this auxiliary class?

Member

RubenVerborgh commented Jan 3, 2016

Your suggestion of LazyValue also crossed my mind, especially if this becomes a common thing (I was still missing the transform idea though). Where should I put this auxiliary class?

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 3, 2016

Member

lib/nanoc/base/entities/lazy_value.rb would work! Put it inside Nanoc::Int though, to make it clear this is for internal use only.

Member

ddfreyne commented Jan 3, 2016

lib/nanoc/base/entities/lazy_value.rb would work! Put it inside Nanoc::Int though, to make it clear this is for internal use only.

@RubenVerborgh

This comment has been minimized.

Show comment
Hide comment
@RubenVerborgh

RubenVerborgh Jan 3, 2016

Member

Alright, will do tomorrow. BTW, any reason to prefer respond_to?(:call) over is_a?(Proc)?

Member

RubenVerborgh commented Jan 3, 2016

Alright, will do tomorrow. BTW, any reason to prefer respond_to?(:call) over is_a?(Proc)?

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 3, 2016

Member

I prefer prefer respond_to?(:call) over is_a?(Proc) because it makes duck typing easier—not just procs can yield values. This is also in line with how Rack apps are built (the contract there is to respond to #call.)

Member

ddfreyne commented Jan 3, 2016

I prefer prefer respond_to?(:call) over is_a?(Proc) because it makes duck typing easier—not just procs can yield values. This is also in line with how Rack apps are built (the contract there is to respond to #call.)

@RubenVerborgh

This comment has been minimized.

Show comment
Hide comment
@RubenVerborgh

RubenVerborgh Jan 5, 2016

Member

Implemented Nanoc::Int::LazyValue. I did not entirely follow your suggestion, as your freeze method requires the value to be computed, which kind of voids the purpose of being lazy. I implemented it with a hash that only becomes frozen when the value is actually requested.

Member

RubenVerborgh commented Jan 5, 2016

Implemented Nanoc::Int::LazyValue. I did not entirely follow your suggestion, as your freeze method requires the value to be computed, which kind of voids the purpose of being lazy. I implemented it with a hash that only becomes frozen when the value is actually requested.

@RubenVerborgh

This comment has been minimized.

Show comment
Hide comment
@RubenVerborgh

RubenVerborgh Jan 6, 2016

Member

Careful: old temp files seem to be affected by this change. PStore tries resurrecting the item with @string as a string value, which is not the case anymore. Simply deleting the tmp folder works. In general, however, it seems that Nanoc should be careful reading temp files.

Member

RubenVerborgh commented Jan 6, 2016

Careful: old temp files seem to be affected by this change. PStore tries resurrecting the item with @string as a string value, which is not the case anymore. Simply deleting the tmp folder works. In general, however, it seems that Nanoc should be careful reading temp files.

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 6, 2016

Member

Hmm, loading cached data might be a problem. I’ll look into whether or not this has a significant impact.

The tmp store has a version number; bumping it would be good (it’ll invalidate previous data).

Member

ddfreyne commented Jan 6, 2016

Hmm, loading cached data might be a problem. I’ll look into whether or not this has a significant impact.

The tmp store has a version number; bumping it would be good (it’ll invalidate previous data).

@RubenVerborgh

This comment has been minimized.

Show comment
Hide comment
@RubenVerborgh

RubenVerborgh Jan 6, 2016

Member

Alright, definitely needs to be bumped.

Member

RubenVerborgh commented Jan 6, 2016

Alright, definitely needs to be bumped.

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 10, 2016

Member

Bump the version of the compiled content cache by changing the number argument in the initializer. In this example, change it from 1 to 2:

  class CompiledContentCache < ::Nanoc::Int::Store
    def initialize
      super('tmp/compiled_content', 1)

It’s a good idea to not only bump the version, but also implement #marshal_dump and #marshal_load on TextualContent (and perhaps BinaryContent, for consistency) so that it stores/loads the string value:

      def marshal_dump
        [@filename, string]
      end

      def marshal_load(array)
        @filename = array[0]
        @string = Nanoc::Int::LazyValue.new(array[1])
      end
Member

ddfreyne commented Jan 10, 2016

Bump the version of the compiled content cache by changing the number argument in the initializer. In this example, change it from 1 to 2:

  class CompiledContentCache < ::Nanoc::Int::Store
    def initialize
      super('tmp/compiled_content', 1)

It’s a good idea to not only bump the version, but also implement #marshal_dump and #marshal_load on TextualContent (and perhaps BinaryContent, for consistency) so that it stores/loads the string value:

      def marshal_dump
        [@filename, string]
      end

      def marshal_load(array)
        @filename = array[0]
        @string = Nanoc::Int::LazyValue.new(array[1])
      end

@ddfreyne ddfreyne modified the milestone: 4.2 Jan 10, 2016

@RubenVerborgh

This comment has been minimized.

Show comment
Hide comment
@RubenVerborgh

RubenVerborgh Jan 13, 2016

Member

I implemented marshal_dump and marshal_load for TextualContent. Did not do this for BinaryContent, as it complexifies without actually changing behavior (but I can still add this if you like).

Support for lazy loading is now complete.

Shall I also implement this for the Filesystem source? In this PR or another? And do you have any advice/preference for tackling this, given that the load_objects code is already quite complex?

Member

RubenVerborgh commented Jan 13, 2016

I implemented marshal_dump and marshal_load for TextualContent. Did not do this for BinaryContent, as it complexifies without actually changing behavior (but I can still add this if you like).

Support for lazy loading is now complete.

Shall I also implement this for the Filesystem source? In this PR or another? And do you have any advice/preference for tackling this, given that the load_objects code is already quite complex?

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 15, 2016

Member

I believe updating the Filesystem data source is worth it, although I’d do it in a separate PR. (The code for the filesystem data source is quite complex, and perhaps it warrants a refactoring first. I think it’s easier if I do that, but you’re welcome to give it a shot too.)

Member

ddfreyne commented Jan 15, 2016

I believe updating the Filesystem data source is worth it, although I’d do it in a separate PR. (The code for the filesystem data source is quite complex, and perhaps it warrants a refactoring first. I think it’s easier if I do that, but you’re welcome to give it a shot too.)

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 15, 2016

Member

Looks good! Thanks :)

Member

ddfreyne commented Jan 15, 2016

Looks good! Thanks :)

ddfreyne added a commit that referenced this pull request Jan 15, 2016

@ddfreyne ddfreyne merged commit a15793f into master Jan 15, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@ddfreyne ddfreyne deleted the feature-lazy-items branch Jan 15, 2016

@RubenVerborgh

This comment has been minimized.

Show comment
Hide comment
@RubenVerborgh

RubenVerborgh Jan 15, 2016

Member

Thanks—agree on the refactoring, it seems best I leave this up to you.

Member

RubenVerborgh commented Jan 15, 2016

Thanks—agree on the refactoring, it seems best I leave this up to you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment