Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/5 3/background tasks interfaces #405

Merged

Conversation

chfsx
Copy link
Member

@chfsx chfsx commented Jan 9, 2017

As discussed in the JF (9.1.2017) we move the technical discussion of the deployment of BackgroundTasks as a general Service to github since we're able to discuss more precise in the code.

there are currently several questions in http://www.ilias.de/docu/goto_docu_wiki_wpage_4384_1357.html
I will respond to them here later this week

An Example using the new interfaces and namespaceing is available at: https://github.com/studer-raimann/ILIAS/blob/feature/5-3/BackgroundTasksInterfacesAndBasicImplementation/src/BackgroundTasks/Examples/SimpleExample.php

@@ -22,8 +22,8 @@ public function getOptions();

/**
* @param \ILIAS\BackgroundTasks\IO $input
* @param string $user_seleced_option
* @return \ILIAS\BackgroundTasks\IO
* @param $user_selected_option
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still a string, isn't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will be altered to a Option Container

@@ -13,4 +13,11 @@
* @return void
*/
public function doWork();

/**
* Returns true iff the worker wants to be called within the current HTTP request.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo iff --> if

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"if" stands for "if": =>
"iff" stands for "if and only if": <=>

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, you never stop learning ;)

* @param $user_ids int[]
* @return Bucket
*/
public function putInQueueAndObserve($bucket, $user_ids);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should throw an exception when no observer is registred, I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typehinting on Bucket?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if the user_ids cannot be resolved to a user we need to throw an exception!

/**
* @return bool
*/
public function isJob();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these isUserInteraction() and isJob() functions needed? If an object implements Task/Job it's a Job and if it's implementing Task/Userinteraction it's a UserInteraction. So using instanceof should be used if this information is needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I somehow second this, but would also go further: Why would you ever want to now this (may it by using isJob() or by using instanceof)? In most cases it is imho better to instead define the behaviour you want to switch on the class itself. Fabian and I had a similar discussion for the table lock implementation, he might remember.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right.

Back in university somewhere around year 1 I was told: Avoid instanceofs! And somehow that still clings to me even if "insanceof" makes more sense :-)

srgitlab pushed a commit to srsolutionsag/ILIAS that referenced this pull request Jan 13, 2017
Copy link
Member

@klees klees left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR with interfaces only, that helped a lot. I see a lot of open questions regarding the proposed interface and also have some general questions and remarks that I will put in a separate comment, as they are not really tied to a location in the code.

public function doWork();

/**
* Returns true iff the worker wants to be called within the current HTTP request.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Oskar wrotes this, I assume that iff means if and only if? Is this something every ILIAS dev will recognize?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chfsx Did not know about this, I thought it was a well known abbreviation. I'd prefer if we could use it as it's extremly comfy to use. Iff that won't be the case i'll be sad.

*
* The Basic Exception Class for BackgroundTasks. PLease Specify by extending
*/
class Exception extends \ilException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • abstract, I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we somehow specify, which Exceptions should derive from this? To me it looks as these should only be Exceptions in the general handling of background tasks, not exceptions in specific actions like ZipFolder, right?

*/
class Exception extends \ilException {

const E_BASIC = 10001;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this good for?

/**
* @return bool
*/
public function isJob();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I somehow second this, but would also go further: Why would you ever want to now this (may it by using isJob() or by using instanceof)? In most cases it is imho better to instead define the behaviour you want to switch on the class itself. Fabian and I had a similar discussion for the table lock implementation, he might remember.

* @param $user_selected_option
* @return IO
*/
public function interaction(IO $input, $user_selected_option);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do?

* The IO as a defined format of data passed between two tasks. IO MUST be serialisable since it will bes stored in the database or
* somewhere else
*/
interface IO extends \Serializable {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me this looks as if it should be named Value. I also feel that we will need this kind of things in other places soon (async, forms), so we might want to put this in a separate service. I think @Amstutz is playing with a similar thing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't really care about the name. Value sounds indeed better. @Amstutz I'm interested in your approach! 👍

* @return string Gets a hash for this IO. If two objects are the same the hash must be the same! if two objects are different you need to have
* as view collitions as possible.
*/
public function getHash();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the "Serializable", but what is this good for? Why must this interface implement equals if you already have the hash that you could compare?

/**
* @var string get the Type of the
*/
public function getType();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this? Why could it be different from the class name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not, I can't remember why I put it there. May have had my reasons but they are only in past me's head.

/**
* @return bool Returns true iff the job supports giving feedback about the percentage done.
*/
public function supportsPercentage();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A job not supporting percentage could still be "done" or "not done", right? We could map this to "0%" and "100%" and get rid of this then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concerning the UI it will look nicer if you can display a percentage independent loading bar instead of just jumping from 0 to 100.

/**
* @return int Returns 0 if !supportsPercentage and the percentage otherwise.
*/
public function getPercentage();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move this to Task?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would i use this and/or implement this? I'm currently wondering, about this: I have a php process, that somehow instantiated the Job and called $job->run. Where would i use $job->getPercentage then? Other way round: If I would want to use this from another process: How would the two instances of the job be identified? Where, as the job, would I put my percentage info? Would everyone need to implement this on its own?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you think about a model akin to the Cronjob, where I call a ping method from within the job? Some service supporting this should then be passed to run imo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea! I'll pass the observer to the run method, and add a ping method to the observer.

@klees
Copy link
Member

klees commented Jan 13, 2017

Hi Fabian, hi Oskar,

I promised some general remarks and questions, so here we go:

  • Did you look into other libraries/apps solving similar problems? I see that you are indeed using some terminology (Worker, Task), so I think you did. But which? Could you comment on how they compare to your proposal?
  • I think this will benefit from a separation into runtime (worker, manager, ...) and "task construction". I would also suggest to factor out this "IO" thingy, as I already mentioned in the comments.
  • After reading the entry in the feature wiki, I suggested to let Bucket inherit from Task (which you did, thx). Now I see, that this was a little misguided. I think the bucket somehow mixes two concerns that could be separated for good. The one concern is, that one or more tasks should be combinable to an overall task, that was what I had seen in the FR. The other concern seems to be, that the bucket captures one execution of a task. I didn't notice that before.
  • As already suggested in some mails, the task-part of the background tasks should be abstractable in a way that merges with the activities in the workflow engine.

Best regards!

@otruffer
Copy link
Contributor

otruffer commented Jan 16, 2017

Did you look into other libraries/apps solving similar problems? I see that you are indeed using some terminology (Worker, Task), so I think you did. But which? Could you comment on how they compare to your proposal?

I did not find an implementation in a framework that matches our exact needs: Composable tasks that can be laid into the background.

I think this will benefit from a separation into runtime (worker, manager, ...) and "task construction". I would also suggest to factor out this "IO" thingy, as I already mentioned in the comments.

Agreed, We'll reword it to "Value"

After reading the entry in the feature wiki, I suggested to let Bucket inherit from Task (which you did, thx). Now I see, that this was a little misguided. I think the bucket somehow mixes two concerns that could be separated for good. The one concern is, that one or more tasks should be combinable to an overall task, that was what I had seen in the FR. The other concern seems to be, that the bucket captures one execution of a task. I didn't notice that before.

Good idea, we should conceptually separate execution (run-time) from construction (compile-time) by namespacing/foldering the interfaces further. When it comes to the buckets, I think it will not matter: When constructing the background task you might want to reuse other buckets created by a method and combine buckets and tasks as you go. When serializing the top-level-bucket (containing other buckets and tasks recursivly) into the queue we might want to flatten the data structure into one bucket containing tasks. Thus in the "run-time" we get one bucket with a list of tasks. We still want that top level bucket as it will be similar to a closure: When we want to delete an operation (e.g. something went wrong) we want to identify all operations and values related to the bucket and wipe them.

As already suggested in some mails, the task-part of the background tasks should be abstractable in a way that merges with the activities in the workflow engine.

Add the moment i'd suggest to add the Activity interface to the Job Interface. This way we should be able to reuse Jobs as Workflow Engine activities.

* @package ILIAS\BackgroundTasks\Task
*
* A Task in the Bucket, which will need some User-Interaction before running the task. A
* User-Interaction is provided as
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@otruffer you just stopped writing within the sentence ;-)

@otruffer
Copy link
Contributor

otruffer commented Feb 10, 2017

Some information before reviewing the Code:

Currently there are implementations for task definition and composition. The tasks cannot be stored in the database yet. There is no implementation of a bucket yet. The tasks can be executed with the simplistic BasicTaskManager, this is just for testing currelty.

For restoring creating tasks we use dependency injection (newly implemented in this branch) see /src/DI/factory.php. This makes testing possible for this module.

For testing Jobs we implemented a PlusJob (see src/BackgroundTasks/Implementation/Tasks/PlusJob.php), which takes two integers and returns one integer. In the plus job you can easily see how you will implement tasks.

For composition testing we implemented a few tests (see tests/BackgroundTask). Most interesting here will be the TaskTest for the moment as it shows Task composition. The simplest task can be found in testValueWrapper()

                // The following part is only needed in tests. In production you will use global $factory.
                $dic = new Container();
		$dic[Observer::class] = function ($c) {
			return new ObserverMock();
		};
		$factory = new Factory($dic);

                // Here's the creation of the plus job and the setting of the values
		$t = $factory->createInstance(PlusJob::class);
		$t->setInput([1, 4]);
                
                // The following part is only needed in tests. In production you will use $taskManager->startTask($t), which will create a Bucket with one Task and enqueue the task.
		$taskManager = new BasicTaskManager();
		/** @var IntegerValue $finalValue */
		$finalValue = $taskManager->executeTask($t, new ObserverMock());
		$this->assertEquals($finalValue->getValue(), 5);

More complicated cases like putting tasks together can be found in the tests as well.

@ILIAS-eLearning/technical-board Would it be possible to discuss the job creation and composition at the technical board?

Copy link
Contributor

@Amstutz Amstutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a short readme with a short description and examples making the usage of this clear.

Note that we state in a rule for the ./src folder (see the according README.md there) we state that:
6. Libraries [in the src dirctory] SHOULD contain a README.md on the top-level, that explains the purpose and design of the library and gives usage examples.

It is not absolutely mandatory, but in this rather complex case, I would heavily recommend to do so.

@chfsx chfsx self-assigned this May 9, 2017
srgitlab pushed a commit to srsolutionsag/ILIAS that referenced this pull request May 11, 2017
@srgitlab srgitlab force-pushed the feature/5-3/BackgroundTasksInterfaces branch from d289e04 to 0eff972 Compare May 26, 2017 12:50
@srgitlab srgitlab force-pushed the feature/5-3/BackgroundTasksInterfaces branch from bd92ff9 to 5a12c23 Compare May 26, 2017 13:00
@otruffer
Copy link
Contributor

Okay: Brough everything up to date, added a readme and rebased. Now this pull request only contains the interfaces.

@Amstutz
Copy link
Contributor

Amstutz commented Jun 8, 2017

Thank you very much for the readme!

@chfsx
Copy link
Member Author

chfsx commented Jun 8, 2017

Hi @klees and @alex40724
We would like to proceed with this PR and merge it after at least 2 core-developers approved (with 2/3 over-all approvals).

Thanks you very much for your effort on reviewing this, I know this always takes some time and therefore I hope you could again have a look at it. Best regards!

@chfsx
Copy link
Member Author

chfsx commented Jun 8, 2017

Since i opened the PR I'm not allowed on github to approve it, so I do it here: I approve this PR! thanks you very much @otruffer

@alex40724
Copy link
Member

alex40724 commented Jun 8, 2017

The readme refers to an example implementation PlusJob.php - but this is not included, is it? Or has it been removed since the PR "should only include the interfaces"? I put a lot of implementation into my PRs for the KS. I thought this would help to understand the details sometime. For me every line of code helps...

I currently do not object to anything here. But I would prefer that @klees aproves this, too. Since he discussed the details more than I did.

@chfsx
Copy link
Member Author

chfsx commented Jun 8, 2017

no, it's only a fictional example to show how
it works. we provide only the interfaces in this PR (and the readme)

@alex40724
Copy link
Member

alex40724 commented Jun 8, 2017

Yes, I just saw the earlier comment (and edited my previous one). Thanks to all that put effort into this non trivial stuff. I approve, but suggest to wait for @klees comments...

@klees
Copy link
Member

klees commented Jun 12, 2017

I'm on it.

Copy link
Member

@klees klees left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @otruffer,
some nitpicks and suggestions, one major problem (DI/Factory). Thanks for the good work!
Regards!

*
* A meta bucket contains infos about a bucket like its percentage, name etc.
*/
interface BucketMeta {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, this duplicates some methods from the bucket. Let Bucket return an instance of this or extend Bucket from this. Maybe also rename this to BucketInfo.


use ILIAS\BackgroundTasks\Implementation\Bucket\State;

interface Persistence {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I like this pattern. For future reference: I think this is the "Repository Pattern". It helps testability a lot, imo, when contrasted with "Active Record".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, made the whole testing possible. Repository might have been a better name then :-)

interface Task {

/**
* @return string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be Type, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I guess it should not be Type, because we also have getOutputType. What is this, then? Is this the class of the Task? Why would I want to use this, then, instead of get_class?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is currently "$StringRepresentationOfInputType -> $StringRepresentationOfOutputType". It's not really necessary but for exception text.

* @package ILIAS\BackgroundTasks\Task
*
* A Task, which can be run without any interaction with the user, such as zipping files
* or just collecting some data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice explanation. Maybe move this to your glossary as well, the "without interaction" part is missing there.

*
* @author Oskar Truffer <ot@studer-raimann.ch>
*/
class Factory {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do understand that you need this kind of factory to initialize the tasks (there are other options, still), but I'm against considering this a general pattern for instantiation. Thus move this to the Background-Tasks-folder, please. The arguments I see against this kind of factory are:

  • Folklore says, ReflectionClass is slow. This is not too relevant for the background tasks, but might be a problem when this factory is used in ordinary requests. We should measure the impact of ReflectionClass before we make this generally available.
  • More severe: This pattern confuses "implements XYZ" with "is named XYZ". This is not the same. By announcing a dependency in a constructor, one (optimally) uses interfaces to describe the services an object depends on. This is the "implement XYZ" part. The array-access on the DIC, on the contrary, gives names to certain services. These are optimally only temporary/local when DI is used properly, but ILIAS is not there yet. The problem with this confusion is: the interfaces a service implements only accidentally coincide with its name. There already are services in ILIAS where this is in fact wrong ($tpl vs. ilTemplate). When announcing a dependency, I should only say: "I need some service implementing interface XYZ", but with this pattern I, in fact, say "I need the service named XYZ". This will prevent some moves in the future, e.g. having two services implementing the same interface.

Copy link
Contributor

@otruffer otruffer Jun 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Folklore says, ReflectionClass is slow. This is not too relevant for the background tasks, but might be a problem when this factory is used in ordinary requests. We should measure the impact of ReflectionClass before we make this generally available.

Injection needs reflection and the benchmarks I have found (admittedly they do not seem very thorough) indicate that the overhead generated by the reflection is neglectable. If we don't want reflection, we need to rename the "Dependency Injection" service to "Dependency Container".

More severe: This pattern confuses "implements XYZ" with "is named XYZ". This is not the same. By announcing a dependency in a constructor, one (optimally) uses interfaces to describe the services an object depends on. This is the "implement XYZ" part. The array-access on the DIC, on the contrary, gives names to certain services. These are optimally only temporary/local when DI is used properly, but ILIAS is not there yet. The problem with this confusion is: the interfaces a service implements only accidentally coincide with its name. There already are services in ILIAS where this is in fact wrong ($tpl vs. ilTemplate). When announcing a dependency, I should only say: "I need some service implementing interface XYZ", but with this pattern I, in fact, say "I need the service named XYZ". This will prevent some moves in the future, e.g. having two services implementing the same interface.

Here we have the problem that I provided the interface with the implementation. The Interface provided in this PR stayed the same the implementation was changed in my implementation branch. This is a mistake on my part, sorry. The current implementation uses a dependency map to match interfaces to instantiations in the DIC regarding the class that requests the dependency. Thus it will be possible to have two implementations of an interface and hand them out depending on who requests the dependency (the redis + mysql usecase we discussed).

This said I'm not in favor nor against keeping this interface in the DI and can move it to the BT as well. I thought it might be of interest to provide a basic injector to actually justify the services name as a first step. If you are still against keeping this and we get no other feedback by tomorrow I'll move it to the BT service.

Thanks a lot for the review! :-)

Edit: I added the dependency map's interface for you to see.

@otruffer
Copy link
Contributor

@klees Okay, factory now reads injector and is moved to the bt folder. Anything else?

@chfsx chfsx merged commit 20fc2fc into ILIAS-eLearning:trunk Jun 22, 2017
@chfsx
Copy link
Member Author

chfsx commented Jun 22, 2017

Thanks everybody for the comments, reviews and all the brainstorming!

@srgitlab srgitlab deleted the feature/5-3/BackgroundTasksInterfaces branch June 22, 2018 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants