Job with textual referenced function continually adds arg if using persistent job stores #466

Closed
spengjie opened this issue Sep 30, 2020 · 0 comments

Sample Code to Reproduce

from apscheduler.jobstores.mongodb import MongoDBJobStore
from apscheduler.schedulers.background import BackgroundScheduler

from myproject.config import mongodb_config


class DummyClass:

    def dummy_method(self, a, b):
        pass


instance = DummyClass()
jobstores = {
    'default': MongoDBJobStore(
        database='test',
        collection='scheduler_test',
        host=mongodb_config.connection_str,
    ),
}
scheduler = BackgroundScheduler(jobstores=jobstores)
job = scheduler.add_job('__main__:instance.dummy_method', args=(1, 2),
                        trigger='interval', seconds=1, id='1')

scheduler.start()
job = scheduler.reschedule_job('1', trigger='interval', seconds=2)
job = scheduler.reschedule_job('1', trigger='interval', seconds=2)
job = scheduler.reschedule_job('1', trigger='interval', seconds=2)
print(job.args)

while True:
    pass

Expected Behavior

The value of job.args in the sample code should always be ().

Current Behavior

When printed, the value of job.args in the sample code is (<__main__.DummyClass object at 0x000001DDAEB5D828>, <__main__.DummyClass object at 0x000001DDAEB5D898>, <__main__.DummyClass object at 0x000001DDAEB5DBA8>). It keeps growing for as long as the script is left running.

Environment

Python: 3.6.8
APScheduler: 3.6.3

Context

The script works fine if I change the add_job call to

job = scheduler.add_job('__main__:DummyClass.dummy_method', args=(instance, 1, 2),
                        trigger='interval', seconds=1, id='1')

or

job = scheduler.add_job(instance.dummy_method, args=(1, 2),
                        trigger='interval', seconds=1, id='1')

But I cannot use either of them. I'm using jsonrpc to add jobs remotely, so I have to use a textual function reference. I'm also using Celery to execute the job. Celery creates tasks from any callable by using a decorator, and the decorator converts the callable into an instance of another callable class. I cannot get hold of that class instance.

Detailed Description

I have located the root cause. Please see my analysis below.

  • instance.dummy_method is a bound method. instance.dummy_method.__self__ is not a class but an instance of the class, so each time __getstate__ is called, instance is prepended to args. The sample code works fine if I remove the logic below that adds self.func.__self__ for textually referenced functions, but then an exception is raised for functions passed as callables.
# job.py, starting at line 235
    def __getstate__(self):
        # Don't allow this Job to be serialized if the function reference could not be determined
        if not self.func_ref:
            raise ValueError(
                'This Job cannot be serialized since the reference to its callable (%r) could not '
                'be determined. Consider giving a textual reference (module:function name) '
                'instead.' % (self.func,))

        # Instance methods cannot survive serialization as-is, so store the "self" argument
        # explicitly
        if ismethod(self.func) and not isclass(self.func.__self__):
            args = (self.func.__self__,) + tuple(self.args)
        else:
            args = self.args
  • I compared a textually referenced function with a function passed as a callable and found that func_ref for the callable is __main__:DummyClass.dummy_method. I think it should be __main__:instance.dummy_method. get_callable_name in util.py returns only the class name, even if the input is a method bound to an instance.
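
The accumulation described above can be reproduced without APScheduler or MongoDB. The following is a minimal sketch, not the library's code: MiniJob, resolve, NAMESPACE and round_trip are hypothetical stand-ins that assume a persistent job store restores the callable from its textual reference on every load and calls __getstate__ on every save.

```python
from inspect import isclass, ismethod


class DummyClass:
    def dummy_method(self, a, b):
        pass


instance = DummyClass()

# Hypothetical stand-in for the module namespace that a textual reference
# such as "module:instance.dummy_method" would be resolved against.
NAMESPACE = {'instance': instance}


def resolve(ref):
    """Walk a dotted reference and return the object it names."""
    first, *rest = ref.split('.')
    obj = NAMESPACE[first]
    for name in rest:
        obj = getattr(obj, name)
    return obj


class MiniJob:
    """Hypothetical, heavily simplified job that mirrors the quoted check."""

    def __init__(self, func_ref, args):
        self.func_ref = func_ref
        self.func = resolve(func_ref)
        self.args = tuple(args)

    def __getstate__(self):
        # Same condition as in the excerpt: a bound method gets its
        # instance prepended to args every time the job is serialized.
        if ismethod(self.func) and not isclass(self.func.__self__):
            args = (self.func.__self__,) + tuple(self.args)
        else:
            args = self.args
        return {'func': self.func_ref, 'args': args}

    def __setstate__(self, state):
        # The textual reference resolves to a bound method again, so the
        # instance that was prepended on save stays in args as well.
        self.func_ref = state['func']
        self.func = resolve(self.func_ref)
        self.args = state['args']


def round_trip(job):
    """Simulate one save/load cycle through a persistent job store."""
    restored = MiniJob.__new__(MiniJob)
    restored.__setstate__(job.__getstate__())
    return restored


job = MiniJob('instance.dummy_method', (1, 2))
for _ in range(3):
    job = round_trip(job)

print(len(job.args))  # 5: three copies of instance, then 1 and 2
```

Each simulated save/load cycle adds one more reference to instance in front of the original arguments, which matches the growth seen after repeated reschedule_job calls.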

There are two possible solutions:

  1. Change the logic of get_callable_name. If the input is an instance, return the instance's name rather than its class name. The code for adding self.func.__self__ in job.py can be removed then.
  2. Change the logic of adding self.func.__self__ to:
        func = self.func
        if ismethod(func) and not isclass(func.__self__) and not hasattr(func, '__func__'):
            args = (func.__self__,) + tuple(self.args)
        else:
            args = self.args
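
As a rough illustration of the intent behind the second solution, here is a minimal sketch of a guard that prepends self only when the stored reference names the class-level function. It is not the proposed patch: class_level_ref and args_to_store are hypothetical helpers, and the reference strings are assumptions modelled on the values quoted in this issue. Because an ordinary bound method always has a __func__ attribute, the sketch compares references instead of testing for that attribute.

```python
from inspect import isclass, ismethod


class DummyClass:
    def dummy_method(self, a, b):
        pass


instance = DummyClass()


def class_level_ref(func):
    # Hypothetical helper: build "module:Class.method" from a bound method,
    # similar in shape to the func_ref reported for callables in this issue.
    return '%s:%s' % (func.__module__, func.__qualname__)


def args_to_store(func, func_ref, args):
    # Prepend the bound instance only when func_ref names the class-level
    # function, i.e. when the reference alone cannot recover "self".
    if (ismethod(func) and not isclass(func.__self__)
            and class_level_ref(func) == func_ref):
        return (func.__self__,) + tuple(args)
    return tuple(args)


method = instance.dummy_method

# Job added by passing the callable: the reference points at the class.
by_callable = args_to_store(method, class_level_ref(method), (1, 2))

# Job added by textual reference: the reference already reaches the instance.
by_text = args_to_store(method, '%s:instance.dummy_method' % __name__, (1, 2))

print(len(by_callable), len(by_text))  # 3 2
```

With a guard of this kind, a job added by textual reference keeps args unchanged across serializations, while a job added by passing the bound method still stores its instance.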

I prefer the first one. But v4.0 is under development and I couldn't find the code related to this issue there, so I will create a pull request against the 3.x branch with the second solution later.
