Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing salimanes fork patch #130

Closed

Conversation

Dynom
Copy link

@Dynom Dynom commented Aug 31, 2013

This is an up-to-date patch, based on Salimane's work: https://github.com/salimane/php-resque-jobs-per-fork

This PR introduces an environment variable JOBS_PER_FORK, that will define the amount of jobs being processed per fork. A larger value introduces a slightly more risk (in case of errors or unexpected dependencies), but also reduces overhead quite a bit. I'm not sure yet what the optimum value is, but that will different for each job.

Tests show up to 10x faster jobs and fewer resources needed.

@chrisboulton
Copy link
Owner

I've been thinking about this, and I can see the obvious benefit that a slightly higher risk of lower isolation of job processing that this might give some people, so I'd like to merge it in.

That said, I'm sure that something like this could be built in without adding an additional class just for this purpose, and that's how I'd like to see the implementation - directly inside Resque_Worker.

I'd also love to see the documentation added and some tests added too. 😁

@Dynom
Copy link
Author

Dynom commented Sep 8, 2013

I've made some improvements and refactored it to be in a single class. I added documentation and the tests all pass.

However, I don't have time to improve the tests. This is largely because I feel that the library needs some work before that makes much sense.

@danhunsaker
Copy link
Contributor

Out of curiosity, what work is that?

@Dynom
Copy link
Author

Dynom commented Sep 11, 2013

To summarise; there are unnecessary logic complexities and encapsulation violations that makes phpresque hard to extend, improve and test.

@danhunsaker
Copy link
Contributor

It seems that it would be useful to have a more comprehensive list of such problem areas so they can be addressed. When you have the time, it would be nice if you could create a few issues to that effect. That way we can get more developers involved improving the library itself, or at least discussing the weaker points to determine how such improvements would be possible.

Because summaries are great for quick communication, but terrible for much else. :)

@danhunsaker
Copy link
Contributor

Just a side note that this is essentially the functionality provided to Ruby's Resque library via the resque-multi-job-forks plugin. So there's precedent here.

@chrisboulton
Copy link
Owner

To summarise; there are unnecessary logic complexities and encapsulation violations that makes phpresque hard to extend, improve and test.

I'd sure love to know what some of these are so we can go about improving the library. :)

@danhunsaker
Copy link
Contributor

@Dynom Just pinging for more feedback, again. :-)

@Dynom
Copy link
Author

Dynom commented Nov 19, 2013

Sure @danhunsaker. You definitely deserve a bit more details after my previous outburst. A couple of things that I found while I was bringing this fork up-to-date I'll mention below. Most of it is just my opinion obviously and it's my attempt to raise some point of discussion about what can be done to improve PHP-Resque and to make it even better.

Remove the dependency on Credis

Currently there is a hard-to-override dependency on CRedis. With some trickery it's possible to manually set Resque::$redis to something else, but it's just a work around and not something that gives me enough confidence to roll it in production. Credis has it's limitations and there are much nicer libraries out there, such as Predis [https://github.com/nrk/predis].

Improve injection

Currently the library uses quite a few public static fields and methods, which not only introduces some problems with writing tests, but also makes it very awkward to change (for example) the Redis client to something else.

Re-use more existing components

Currently a self-written event-system and logger are used. Perhaps it would be nice to switch to Symfony's event handler [https://github.com/symfony/EventDispatcher] or Evenement [https://github.com/igorw/evenement] and for logging Monolog [https://github.com/Seldaek/monolog] might fit the bill. Perhaps some choices were made because of the required PHP version of these libraries, but maybe they could be re-evaluated.

Reducing complexity

I believe that some implementations can be simplified. An example of this is for example the method Resque_Worker::work(). In this method there are up to 4 nested if statements.

Another example, is feature/responsibility discoverability. It's done using method_exists and is_object, instead of checking what interfaces are implemented, which would make sanity checks quite a bit easier and more self-explanatory, I think.

Reduce the scattering of concerns

Especially in this fork there is some additional complexity, because of the work in Resque_Worker::handle(). But there are more places where the separation of concerns could be improved.

Perhaps it would be an idea to introduce a Queue class that incorporates some logic that is now placed in the Resque and Resque_Worker classes. This would allow for method such as $queue->add($job), without exposing the underlaying structure.

Documentation

There is pretty good project documentation, especially for an OSS library. But some inline documentation is missleading or wrong. It would be nice to see this correct. An example would be in Resque_Failure::getBackend().

Tests

The tests currently depend on an actual running Redis instance, I would also like to see tests that test the workings of the resque implementation. This is always a bit of a grey area and a matter of taste I suppose. Using mock libraries such as Mockery [https://github.com/padraic/mockery] or Phake [https://github.com/mlively/Phake] might be a good start.

Refactor to PSR-2

A very minor "improvement" to the quality of the code, but a great improvement to the library as a whole, would be to use the PSR-2 coding style. It's publicly accepted and it might attract more people to contribute. Also it will become much easier to check if forks meet the style, by using the efforts of:

@danhunsaker
Copy link
Contributor

Trying to remember why I didn't respond more quickly to this. I'll address more of this post when I have a bit of sleep in me, but just real quick, I wanted to address PSR-2.

The majority of PSR-2 is actually pretty good. But any spec fails to gain my support, personally, the instant it opts for spaces to indent code. We're not using typewriters or consoles for writing code anymore. I mean, sure, you're more than welcome to vi(m) or emacs or whatever other beasts you've been using for the past 50 or so years if you want. But most of us are at least using Sublime Text or Notepad++, and many are using full-blown IDEs. Any argument I've heard to favor spaces over tabs fails to stand up to that simple reality. So while I'm not saying PSR-2 would be a bad thing, I refuse to implement it myself, and I will never have my code editor set up for it.

PSR-0, -1, and -3, on the other hand - brilliant! Full support!

Speaking of... We have PSR-3 logging support now, so plug in Monolog or any other PSR-3-compatible logger you desire and have at! As to events systems, I suspect they might be slightly overkill in most cases, but I haven't really looked into them, yet. As it stands now, PHP-Resque doesn't have an events system so much as plugin support in the form of hooks. Strictly speaking, yes, event hooks. But really? Not quite an actual event system. Still, I'll take a look at a few and see if we can be generic about it.

OK, 😴 time, now.

@danhunsaker
Copy link
Contributor

Alright, I'll take these in groups.

Injection

It would be good to add the ability to inject different Redis libs into the system, depending on the project's existing requirements and so forth. Credis is, for its part, much better than Redisent, its predecessor not only for the Credis project itself, but also within PHP-Resque. But no library is perfect. It was chosen over Predis because it is much smaller and simpler, and because it was, originally, bundled within the repo. It is pulled in via Composer (or manually, for those not using Composer), so it should be safe to replace this direct dependency with a recommendation or three and an injection mechanism. It will require that individuals upgrading to the latest version be aware that they'll need to inject a Redis lib manually in their code, but that's not too difficult to address.

As to injection of other things, we've since added logger injection. Without a PSR on it, I'm not aware of any other standard interfaces for event systems, so moving to an injector for this might be tricky. Also, I'm not convinced our events quite warrant a full event system the likes of which you've mentioned. Further research may change my mind on this, though. Like the logger, though, I'd probably recommend retaining the current event "system" as a fallback mechanism.

Are there any other components which would benefit from injection that I've missed?

Refactor

Much of the code very well could be simplified. We'd want to take care, though, not to use metrics such as control statement nesting depth to determine what needs to be streamlined. It is usually better to focus on reducing redundant code, by increasing code reuse. This can easily get out of hand with code spidering all over the library, though, if we aren't careful. Which I agree we may not have been in the past. The fact that there are three distinct perform() methods in any given job execution cycle might be a good example, though with the separation of responsibilities between each of these components, it's probably one of the better approaches.

As to using is_object and method_exists, these tend to be preferred because they give the developers using the library more flexibility with which external projects they inject. If we checked whether the PSR-3 interface was implemented on loggers, for example, we would disqualify a number of projects that are nonetheless compatible with PHP-Resque's logging requirements. PSR-3 compatibility is the requirement we wish to enforce, not use of the actual PSR-3 interface in getting there. Similar arguments apply to other aspects of the library.

Either way, though, much of the project could probably do with a refactor. I suspect much of the current state of things is a result of trying to make as direct a port from the Ruby version of the time as possible, despite differences between Ruby and PHP that make the Ruby approach less efficient. (Not the least of which is Ruby itself, but I digress.)

Supporting Materials

Documentation and tests are a constant battle for just about every project I've ever worked on. It would be nice to write documentation, build tests to the docs, then develop code to the tests. This almost never actually happens, though. Still, something to work on. Though I'll point out that the docs and tests, while vital to the project, are not actually part of the library, and thus aren't technically problems with the lib itself. But yeah, important work.

And I've already given my opinion on PSR-2.

These are my thoughts. Looking forward to seeing others'.

@arturo-c
Copy link

Agree with @Dynom on many points, I actually just added an issue here #183.

Some of the concerns of @danhunsaker:
On Injection, I believe that we can make use of containers such as http://pimple.sensiolabs.org/ to avoid burden on individuals needing to be aware of all dependencies.

On Refactor to PSR-2, I really don't want to favor one over the other and I believe that this has really minimal effect on the overall quality of code so I wouldn't prioritize work on it, but to be honest I do prefer PSR-2.

On Tests, I do strongly believe that improving test coverage and extracting some of the dependencies in order to provide mocks is essential to the overall direction of this project, I have seen many projects stall because nobody wants to change anything that might affect other classes.

On complexity and scattering of concerns, there is some work to be done.

I have looked over the issues in this repo, and it seems that there is an overall concern on maintaining the project in somewhat similarity and fork of ruby resque. While I agree with this concept, I also believe that there is a lot of improvement that can be done, not in adding or removing features, but in improving code quality and extensibility for the future of the project.

As I said on my other post, I'll be glad to start work on these issues because I have been given bandwidth on my project to improve this library as we're beginning to rely heavily on it, so I can lead these initiatives if @chrisboulton, @danhunsaker and any other maintainers agree.

@Dynom
Copy link
Author

Dynom commented Apr 24, 2014

That you can spent some time on this library is excellent news @aturo-c!

@danhunsaker
Copy link
Contributor

(Tagging #183 so my comment doesn't get lost there; I'll be addressing it here, too...)

The first thing to remember is that PHP-Resque is a port of the Ruby original, almost directly. The implementation decisions made here would have been simply to mirror the ones made there as directly as possible - not because they were the best design for PHP, but because they were known to work. (You can confirm this much of it by comparing the Ruby implementation which was current as this port was being built with the implementation this port opted for at the time. The logic is essentially identical.) Getting the library built would have been the primary concern. After it was working, refactoring for PHP best practice could become a priority.

But, as with most projects undertaken in an employer's interests, once it was working, other projects would have been introduced, with higher priorities, and the PHP improvements would be pushed further and further from the focus. Community improvements would be possible, but the time available to review them (as well as familiarity with what's being changed where, and how it works) would gradually dwindle. I'm just extrapolating these specifics from my own experiences and the information I do have about these various goings-on, so @chrisboulton may correct me on some of it, but I feel fairly confident that this is at least close to the chain of events we're dealing with, here.

So we have in-built implementations of pseudo-events and so forth, because the Ruby version of the time did as well. We have hard dependencies, because the Ruby version did as well. OK, fine, we accept that when Ruby did something a certain way, the port also did it that way for speed of implementation. But why was the underscore approach selected instead of 5.3+ namespacing? Most likely because 5.2 was still fairly prevalent when the code was written. It serves the same purpose with the same degree of effect, so there isn't really much to gain from switching now. It would certainly be worthy of a major version increment to implement such a change, because it would break literally every project using it when the change went into effect. (Of course, that could be worked around by having the underscored equivalents inherit from the namespaced master implementations so that both are available until enough versions have elapsed to remove support altogether.)

At any rate, yes, all of this could be improved. I support all of the improvements suggested here, though I may disagree with suggested specific options (for example, Redis library injections are great, but Predis amounts to bloat for PHP-Resque's purposes, so I would never recommend it to anyone who isn't using Redis for other things in the same project, especially with the Redis PECL available). I even support enforcing PSR-2, though I will resist the indentation portion until my dying days. In other words, let's make these changes happen, but let's also make sure we're making the right changes for the right reasons, and not just because it's what works for some other project.

@Dynom
Copy link
Author

Dynom commented Sep 10, 2014

I no longer recommend using this fork, the implementation is broken on some points and the library should be considered broken after merging this PR in.

All comments and information in this PR I still agree with, however.

@Dynom Dynom closed this Oct 14, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants