Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/zlib support #44

Merged
merged 14 commits into from
Nov 10, 2016
Merged

Conversation

jeff-kilbride
Copy link
Contributor

This PR adds support for gzip values. I have tried to keep it as simple as possible.

I have added a gzip key to the various options configuration objects. Setting this key to true enables a default gzip implementation using lowest compression / best speed settings. Setting this key to an object simply passes that object to the gzip/gunzip methods as configuration. I am not currently doing any validation of these settings, if an object is passed. This should only be used for advanced configuration of the underlying gzip methods. More validation could be added, but I would like some input on this first given the extra code required. Obviously, passing false or omitting the gzip key disables this feature.

The changes I have made are working for me locally with my own tests and with a previous fork of the project from about a week ago. However, I am having issues running npm test with the latest fork from today. I will gladly add some Jasmine tests, when this is resolved. I wanted to go ahead and create a PR, so you could see the changes I have made.

}
options.parse = true;

var gzip = (options.gzip || options.gzip === false) ? options.gzip : redisOptions.gzip;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would more simple comparison work the same? like:
var gzip = options.gzip ? options.gzip : redisOptions.gzip;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, no, if you want users to be able to turn gzip on/off at the set/get level. If redisOptions.gzip is truthy (object) and the user wants to turn it off for a particular set/get, right now they can pass false:

myCache.set('myCacheKey', 'myCacheValue', { gzip: false }, function...
myCache.get('myCacheKey', { gzip: false }, function...

The more simple comparison would not catch that. The options.gzip would evaluate to false and the redisOptions.gzip would be used. That's why I have an explicit test for false. It's the same type of test that you are doing in the set method for the options.ttl value, in order to catch zero:

var ttl = (options.ttl || options.ttl === 0) ? options.ttl : redisOptions.ttl;

That's actually where I got the idea. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok :-)

@jeff-kilbride
Copy link
Contributor Author

I'm a little confused by the codacy/pr checks. It's complaining about things like strings must be in double-quotes. Are you moving in that direction? Should I change my code to pass the codacy checks?

@PuKoren
Copy link
Contributor

PuKoren commented Sep 7, 2016

@jeff-kilbride don't worry about codacy, it even have different policies that ours (and it is yelling because of issues you didn't even added). We will remove it in near future in favor of sonar

@jeff-kilbride
Copy link
Contributor Author

I have merged the latest changes and added a few fixes. I still need to add gzip specific tests and something for the Readme. However, if you set gzip: true in the options for the current tests in this branch, all tests are now passing.

I will try to get this done in the next few days, but I have some other pressing things I need to work on.

@mrister
Copy link
Contributor

mrister commented Sep 8, 2016

@jeff-kilbride Thank you very much for your contributions, anytime is good as long as you can find it to finish this feature.

@jeff-kilbride
Copy link
Contributor Author

Sorry for the delay! I have added gzip specific tests and updated the Readme file.

I wrapped all the existing tests in a new describe statement and put the gzip tests in their own describe to keep them separate. So, the redis-store-spec.js file shows a lot of changes, but they are nearly all whitespace. I copied all the existing get/set tests into redis-store-zlib-spec.js and added a few new ones. Two of my new tests grab the string length of the compressed values and print out the compression percentage to the console. I thought this output was interesting, but it can easily be removed if you prefer not to have it.

Also, while editing the Readme file, I noticed a possible error. In your examples, you are passing a ttl option:

var ttl = 5;

redisCache.set('foo', 'bar', ttl, function(err) {
...

The ttl being passed is a Number, but the set method is expecting an Object. I believe this should be:

var ttl = { ttl: 5 };

redisCache.set('foo', 'bar', ttl, function(err) {
...

or

var ttl = 5;

redisCache.set('foo', 'bar', { ttl: ttl }, function(err) {
...

If this is correct, I can make these changes to the Readme and push another commit. Let me know.

@PuKoren
Copy link
Contributor

PuKoren commented Sep 22, 2016

Thanks for your contribution @jeff-kilbride !

About the error in the readme, you can do it on a separate PR if you want to, this one will be focused on your gzip feature

Thanks for the refactoring of tests and for the feature! We will review it asap to plan a publication

conn.setex(key, ttl, val, handleResponse(conn, cb));
}
else {
conn.set(key, val, handleResponse(conn, cb));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l.249 l.252 l.258 and l.261 are duplicated, maybe we could refactor that in a function or something? What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be refactored. I did that on purpose to clearly separate the new code from the existing. I wanted you to be able to see my changes easily.

To refactor, the conn.set and conn.setex would need to be moved into a new function. What do you think we should name that function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could name it set as of new versions of redis, it seems that setex will be deprecated as set is used with same parameters: http://redis.io/commands/set

So i would say, set, and add a layer for previous versions compatibility inside it

What do you think guys? @dial-once/developers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree. The setex should be replaced, but I'm not sure if the node-redis module supports the new format for the set command, yet. Has anyone tested that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not on my side. But we can keep the same code inside our set function anyway, I was just referencing this to chose the correct naming for the function

Copy link
Contributor Author

@jeff-kilbride jeff-kilbride Sep 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Makes sense. I'm still a little confused, though. Sorry!

Right now, there is self.set(). In order to refactor the duplicate code in lines 248-262, there will need to be a new function that basically does this:

function newSet(conn, key, value, ttl, cb) {
  if (typeof ttl === 'function') {
    cb = ttl;
    ttl = '';
  }

  if (ttl) {
    conn.setex(key, ttl, value, handleResponse(conn, cb));
  }
  else {
    conn.set(key, value, handleResponse(conn, cb));
  }
}

Then lines 248-253 and 257-262 would each be replaced by:

  newSet(conn, key, gzVal, ttl, cb);
// or
  newSet(conn, key, val, ttl, cb);

Is this what you are thinking of? If so, what would you like to call newSet? It's basically a helper function for self.set.

EDIT: Sorry, I forgot the conn variable in my first draft.

Copy link
Contributor

@PuKoren PuKoren Sep 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of something like this directly inside the set:

      function persist(err, value) {
        if (gzErr) {
          return cb && cb(gzErr);
        }
        if (ttl) {
            conn.setex(key, ttl, value, handleResponse(conn, cb));
          }
          else {
            conn.set(key, value, handleResponse(conn, cb));
          }
      }

      if (gzip) {
        zlib.gzip(val, gzip, persist);
      }
      else {
        persist(null, val);
      }

Not sure if it is best practice, but I find it more readable than having multiple set commands inside the main object

But any solution will do it for me, I have absolutely no clue about the naming of the function

Wooh, this project needs a good cleanup 🌞

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... ok. Makes sense. Pushed a new change.

store: redisStore,
host: config.redis.host,
port: config.redis.port,
password: config.redis.auth_pass,
auth_pass: config.redis.auth_pass,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auth_pass is being easily deprecated we should promote usage of password. https://github.com/NodeRedis/node_redis#options-object-properties

Copy link
Contributor

@PuKoren PuKoren Sep 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is right, we also have some tests about it. A little refactor would be good, I'll add a ticket do to it in a separate PR (#52)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

auth_pass: config.redis.auth_pass,
db: config.redis.db,
ttl: config.redis.ttl,
gzip: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now thinking about it, I wanted to ask what you guys @PuKoren @jeff-kilbride would think about making this a little bit more abstract. To be able to easily plug in other compressors/decompressors and gzip being the first choice for now.

So instead of having gzip:true having a compression: 'gzip'(or default to it if compression: true). Based on the value of compression property user can pick which one they need (maybe in future there would be others) or do their own. I know that this means a little big refactor, but it should open up options. (maybe compression is a bad name, but we can adapt and chose another)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good idea, will avoid having to deal with a lot of options later on and open the road to the compression pipelining 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an interesting idea. It would would need to be modularized and the correct module would need to be lazy-loaded based on the options provided. It could wind up being a good-sized refactor, though.

You may want to find out how much interest there is in the community for other compression algorithms, first. I can think of a few -- snappy, LZ4, LZO, etc... -- but I don't know how many people understand the differences between them or would want to use them. The nice thing about gzip is that it's built into the Node Zlib module and has no further dependencies. It's not the fastest algorithm, but it's a decent tradeoff, especially when set to it's lowest compression / fastest speed configuration -- which is what I typically use.

Anyway, I am open to the idea. I always prefer my tools being more flexible, rather than less. :)

@jeff-kilbride
Copy link
Contributor Author

This latest commit refactors the duplicate code in the set method, as discussed above.

@PuKoren
Copy link
Contributor

PuKoren commented Sep 26, 2016

Anyone have some time to refactor compression options as discussed here?

If not, then it LGTM for next feature release

@mrister
Copy link
Contributor

mrister commented Sep 26, 2016

@PuKoren As I suggested the refactor, maybe I should do it :-) I will. Just hope there is no big hurry on this.

@mrister
Copy link
Contributor

mrister commented Sep 26, 2016

compression pipelining.. interesting :)

@jeff-kilbride
Copy link
Contributor Author

I will continue to help whenever I can. Maybe it would be a good idea to start a new issue for the refactor.

@mrister I would enjoy hearing your ideas on how this should be implemented.

@mrister
Copy link
Contributor

mrister commented Sep 26, 2016

@jeff-kilbride Yes perhaps another issue. Well, my initial Idea was to have a list of codecs (compressors and decomposers) in which you can optionally add or remove a codec as a dependency. The gzip one will serve as a reference implementation of that. (So a separate module that will get loaded when needed by this libs user)
After that community can create other that we can ship with this module or just have their own. The list would allow for pipelining if so required.

@jeff-kilbride
Copy link
Contributor Author

@mrister For pipelining, you mean being able to call multiple compression algorithms for the same value?

@mrister
Copy link
Contributor

mrister commented Sep 26, 2016

@jeff-kilbride Yes basically the output of one would be input for the other algorithm. I am not sure if this is a real feature request or @PuKoren just joking :-) Either way it can be done.

@PuKoren
Copy link
Contributor

PuKoren commented Sep 27, 2016

@mrister was not joking but maybe it is a bit early, because:

  • it may be totally useless (CPU cost/speed/compression ratio ridiculous)
  • it can be added later quite easily without breaking change (convert compression property to an array)

I think that if you have some time to do the generic compression stuff, you should not worry about pipelining as it may take too long for what it's worth

@mrister
Copy link
Contributor

mrister commented Sep 27, 2016

OK then, still I think We need to do this before the release (Separate PR and issue) to not introduce breaking changes.

@jeff-kilbride
Copy link
Contributor Author

Hi All,

I have merged the latest changes to develop into this branch and updated some of my tests to get them working again.

Is there any update to the progress on adding additional compression implementations? It should be fairly easy to refactor this branch to avoid breaking changes in the future. The args.gzip variable could be renamed to args.compression (or something similar, if you have another idea). It could still accept either a boolean or object. A value of true could default to the gzip implementation. An object could have a type property with the name of the compression implementation -- gzip or any future additions -- as well as implementation specific parameters to be passed to the underlying compression layer (the same way gzip parameters are currently being passed).

These changes would be cosmetic, for now, because gzip would remain the only compression algorithm currently supported. However, it would then be possible to add additional compression implementations without breaking the interface in future releases. It would require a bigger rewrite to handle pluggable compression modules, but that could be handled by anyone desiring to use something other than gzip.

If this sounds like the correct approach, I can make these cosmetic changes and commit them in this PR.

@PuKoren
Copy link
Contributor

PuKoren commented Nov 8, 2016

I am for the cosmetic change

What about you guys @dial-once/developers ?

@jkernech
Copy link
Contributor

jkernech commented Nov 8, 2016

I'm for it too 👍

@mrister
Copy link
Contributor

mrister commented Nov 8, 2016

yup go for it @jeff-kilbride 👍

@jeff-kilbride
Copy link
Contributor Author

I have changed the opts.gzip parameter to opts.compress with the following default values:

{
  type: 'gzip',
  params: {
    level: zlib.Z_BEST_SPEED
  }
}

The idea is that, in the future as pluggable modules are added, the type property will be the name of the module / implementation and the (optional) params property will contain any module specific values to be passed to the underlying compression algorithm. Obviously, right now, the type property is ignored, since only gzip is supported. However, the code can now be refactored to support pluggable modules without having to change the interface.

Let me know what you guys think.

@PuKoren
Copy link
Contributor

PuKoren commented Nov 9, 2016

Great work @jeff-kilbride, its very good to have implemented it with options/type. We can do anything in the future with this implementation 👍

compress: true,
isCacheableValue: function (val) {
// allow undefined
if (val === undefined)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor for code style we should use parenthesis on if else.

done();
});
}
catch(e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code style } on same line as catch.

@mrister
Copy link
Contributor

mrister commented Nov 9, 2016

All good @jeff-kilbride nice work! minor code style changes.

Copy link
Contributor

@mrister mrister left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some smaller code style fixes on try/catch and if else blocks

@jeff-kilbride
Copy link
Contributor Author

@mrister Requested code style changes fixed.

if (ttl) {
conn.setex(key, ttl, pVal, handleResponse(conn, cb));
}
else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry again some small changes: else on same level as preceding }

if (compress) {
zlib.gzip(val, compress.params || {}, persist);
}
else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@jeff-kilbride
Copy link
Contributor Author

@mrister Done.

@PuKoren PuKoren merged commit bab8a57 into dial-once:develop Nov 10, 2016
@jeff-kilbride jeff-kilbride deleted the feature/zlib_support branch November 10, 2016 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants