
Job picked from queue with default serializer while worker has json serializer #1357

Closed · tvcuyck opened this issue Oct 8, 2020 · 7 comments

tvcuyck commented Oct 8, 2020

I configured the worker and queue with serializer=json, but when a job is picked up from the queue it is deserialized with pickle, and I get pickle errors.

I added some logging: at startup you can see the json serializer in place, but when the job is processed, the queue and job are using the default serializer.

Starting Task Worker (pid: 13882)
-- Queue serializer <module 'json' from '/usr/lib/python3.8/json/__init__.py'>
-- Worker serializer <module 'json' from '/usr/lib/python3.8/json/__init__.py'>
16:04:47 Registering birth of worker python-consumer
16:04:47 Worker rq:worker:python-consumer: started, version 1.5.2
16:04:47 *** Listening on queue...
16:04:47 Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.
16:04:47 *** Listening on queue...
16:04:47 Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.
-- Queue serializer <class 'rq.serializers.DefaultSerializer'>
-- Job serializer <class 'rq.serializers.DefaultSerializer'>
16:04:56 queue: Export (52bb0e76-4387-4e46-8fab-f462ce54914c)
16:04:56 Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.
16:04:56 Sent heartbeat to prevent worker timeout. Next one should arrive within 90 seconds.
-- Job serializer <class 'rq.serializers.DefaultSerializer'>
16:04:56 Handling failed execution of job 52bb0e76-4387-4e46-8fab-f462ce54914c
-- Job serializer <class 'rq.serializers.DefaultSerializer'>
16:04:56 Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.
16:04:56 Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.
16:04:56 *** Listening on queue...
16:04:56 Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds. 

I could fix it by passing serializer=self.serializer from the worker to the queue and job via the dequeue_any class method in dequeue_job_and_maintain_ttl.
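
Roughly, the change looks like this (a sketch against the RQ 1.5.x worker internals; exact signatures may differ):

# sketch: inside Worker.dequeue_job_and_maintain_ttl, forward the
# worker's serializer so dequeue_any builds the Queue and Job with it
# instead of falling back to the default (pickle) serializer
result = self.queue_class.dequeue_any(
    self.queues,
    timeout,
    connection=self.connection,
    job_class=self.job_class,
    serializer=self.serializer,  # <-- the added argument
)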

Is this a bug with the json serializer, or was my setup wrong?

selwin (Collaborator) commented Oct 22, 2020

Are you initializing Worker with serializer=json? The worker CLI also needs to be updated to accept a serializer input; I just realized this myself, since I usually run with the default pickle serializer.
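
For example, a minimal programmatic setup looks something like this (queue name and Redis connection details are placeholders):

import json

from redis import Redis
from rq import Queue, Worker

redis = Redis()
# pass the serializer to both sides; anything exposing
# dumps()/loads() (such as the json module) is accepted
queue = Queue('queue', connection=redis, serializer=json)
worker = Worker([queue], connection=redis, serializer=json)
worker.work()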

f0cker (Contributor) commented Nov 6, 2020

+1 for this. I'm using the worker CLI and need to migrate to JSON. I started looking at creating a PR to update the worker CLI, but now I'm thinking it might be easier to add something to defaults.py and then wire that into the worker CLI?

For example:
DEFAULT_SERIALIZER=json
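
Fleshed out, something like this (the name is hypothetical, mirroring DEFAULT_WORKER_CLASS and friends in rq/defaults.py):

# hypothetical addition to rq/defaults.py
DEFAULT_SERIALIZER_CLASS = 'rq.serializers.DefaultSerializer'

The worker CLI could then fall back to that value whenever no explicit serializer is given.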

selwin (Collaborator) commented Nov 14, 2020

Changing the default serializer to json would break RQ for many users, so it's not an option at this point.

I'd be happy to accept a PR that adds a --serializer option to the worker CLI, though.
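
Assuming a dotted-path option in the style of the existing --job-class flag, usage would look something like:

# hypothetical invocation; the path would resolve to any object
# exposing dumps()/loads()
rq worker --serializer rq.serializers.JSONSerializer myqueue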

JackBoreczky (Contributor) commented
I've encountered this issue as well while trying to use a custom serializer with json. @nathanielweinman and I found that the default serializer was being used for jobs pulled off a queue because the serializer was not passed along as an argument when fetching jobs; the changes in #1381 worked for our purposes.

As an aside, the use case for the custom serializer was to pull RQ jobs off of Redis and process them in Ruby. We encountered two issues with that:

  1. Just passing in json as the serializer did not work, since json does not output a bytes object the way pickle does. We had to encode and decode to convert to and from bytes objects in our custom serializer (see the sketch below).
  2. There's a layer of zlib compression on at least the data field of jobs after the serializer is applied. Not entirely sure where or why this is used.
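
A sketch of the wrapper we ended up with (the class name is purely illustrative), plus where the zlib layer from point 2 bites an external consumer:

import json
import zlib

class JSONBytesSerializer:
    """json.dumps returns str, but RQ expects the bytes that
    pickle.dumps would produce, so encode/decode explicitly."""

    @staticmethod
    def dumps(obj):
        return json.dumps(obj).encode('utf-8')

    @staticmethod
    def loads(data):
        return json.loads(data.decode('utf-8'))

# point 2: the job's 'data' field is zlib-compressed when the job is
# saved, so reading the Redis hash directly (e.g. from Ruby) means
# inflating before parsing; in Python terms:
# raw = redis_client.hget('rq:job:<job-id>', 'data')
# payload = JSONBytesSerializer.loads(zlib.decompress(raw))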

CarlosUvaSilva commented
Hello @JackBoreczky @nathanielweinman

I'm trying to do the same as you but the other way around: queue jobs in Ruby and run them with Python. How did you set up your Python code? I'm using your PR version of RQ, but I'm not able to run it.
This is my current code:

main.py

import os
import json

from redis import Redis
from rq import Queue
from rq.job import Job

from print_url import print_url

client = Redis(
  host=os.getenv("REDIS_HOST"),
  port=os.getenv("REDIS_PORT"),
  db=os.getenv("REDIS_DB")
)

q = Queue(connection=client, serializer=json)
j = Job(connection=client, serializer=json)

def run_background_job(url):
  print("OI")
  job = j.create(print_url, ['http://www.google.com'])
  q.enqueue_job(job)

print_url.py

from rq import get_current_job

def print_url(url):
  print("PRINT_URL")
  print(f'URL TO PRINT: {url} - {get_current_job().id}')

I'm getting a "Could not resolve a Redis connection" error for some reason. If I try the example with pickle, it works.

JackBoreczky (Contributor) commented
@CarlosUvaSilva Are you testing with that same setup for the pickle serializer? I tried something similar myself and got the same error, but it went away when I passed connection=client to the enqueue_job call. Not sure what causes that, but it seems unrelated to the custom serializer. Even once that works, though, I believe you'll still run into the issue I mentioned with using plain serializer=json.
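
For illustration, one way to make the connection explicit is at job-creation time (Job.create does accept connection and serializer keyword arguments):

from rq.job import Job

# build the job with an explicit connection instead of relying on
# RQ's implicit connection resolution
job = Job.create(print_url, args=['http://www.google.com'],
                 connection=client, serializer=json)
q.enqueue_job(job)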

CarlosUvaSilva commented
@JackBoreczky hey again, I couldn't get it to work any way I tried (I even tried your PR, but my use case is the inverse of yours, so it didn't help).

I ended up forking and monkey-patching the deserializer method to read the extra parameters from the plain-text hash in Redis rather than from the decompressed pickle in the data field.

selwin closed this as completed in efe7032 on Jan 19, 2021