Files are not deleted although confirmations are sent by the receiver #38

@schooft

Logs are missing because of #37, so the cause is unclear, but a cleaner thread may have died:

[2022-05-10 22:19:38] [utils_network:stop_socket:682] [file_fetcher-0/8] [INFO] Closing cleaner_job_socket
[2022-05-10 22:19:38] [utils_network:stop_socket:682] [file_fetcher-4/8] [INFO] Closing cleaner_job_socket
[2022-05-10 22:19:38] [utils_network:stop_socket:682] [file_fetcher-5/8] [INFO] Closing cleaner_job_socket
[2022-05-10 22:19:38] [utils_network:stop_socket:682] [file_fetcher-7/8] [INFO] Closing cleaner_job_socket
[2022-05-10 22:19:38] [utils_network:stop_socket:682] [file_fetcher-6/8] [INFO] Closing cleaner_job_socket
[2022-05-10 22:19:38] [utils_network:stop_socket:682] [file_fetcher-1/8] [INFO] Closing cleaner_job_socket
[2022-05-10 22:19:38] [utils_network:stop_socket:682] [file_fetcher-3/8] [INFO] Closing cleaner_job_socket
[2022-05-10 22:19:38] [cleanerbase:run:274] [Cleaner] [ERROR] Stopping Cleaner due to unknown error condition.
Traceback (most recent call last):
  File "/opt/hidra/datafetchers/cleanerbase.py", line 269, in run
    self._run()
  File "/opt/hidra/datafetchers/cleanerbase.py", line 284, in _run
    socks = dict(self.poller.poll())
  File "/opt/python/cp27-cp27mu/lib/python2.7/site-packages/zmq/sugar/poll.py", line 103, in poll
  File "zmq/backend/cython/_poll.pyx", line 143, in zmq.backend.cython._poll.zmq_poll
  File "zmq/backend/cython/_poll.pyx", line 123, in zmq.backend.cython._poll.zmq_poll
  File "zmq/backend/cython/checkrc.pxd", line 13, in zmq.backend.cython.checkrc._check_rc
    PyErr_CheckSignals()
  File "/hidra/src/hidra/sender/datamanager.py", line 1121, in signal_term_handler
  File "/hidra/src/hidra/sender/datamanager.py", line 1050, in stop
  File "/hidra/src/hidra/sender/datamanager.py", line 1006, in check_hanging
  File "/opt/python/cp27-cp27mu/lib/python2.7/multiprocessing/process.py", line 158, in is_alive
AssertionError: can only test a child process

The error, together with the missing "Closing cleaner_job_socket" message from file_fetcher-2/8, points in that direction.

Edit: The missing log message simply appears later in the file:

[2022-05-10 22:19:38] [utils_network:stop_socket:682] [file_fetcher-2/8] [INFO] Closing cleaner_job_socket
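
The AssertionError in the traceback is raised by multiprocessing's Process.is_alive(), which asserts that the caller is the process that created the Process object. One possible reading, not confirmed by the logs, is that the SIGTERM handler from datamanager.py was inherited by the Cleaner process and ran there, so check_hanging called is_alive() on processes that are not children of the Cleaner. Below is a minimal sketch of a guard under that assumption. ParentOnlyHandler, stop_callback and install are hypothetical names for illustration, not part of hidra.

```python
import os
import signal


class ParentOnlyHandler(object):
    """Signal handler that only acts in the process that created it.

    Hypothetical helper: a handler installed before forking is inherited
    by the children, so it may run in a process that does not own the
    multiprocessing.Process objects it wants to inspect.
    """

    def __init__(self, stop_callback):
        # Remember which process created (and therefore owns) the handler.
        self.owner_pid = os.getpid()
        self.stop_callback = stop_callback

    def __call__(self, signum, frame):
        if os.getpid() != self.owner_pid:
            # Running in a forked child: skip the parent's shutdown logic,
            # which would call is_alive() on processes that are not ours.
            return
        self.stop_callback()


def install(stop_callback):
    """Register the guarded handler for SIGTERM (main thread only)."""
    handler = ParentOnlyHandler(stop_callback)
    signal.signal(signal.SIGTERM, handler)
    return handler
```

With such a guard, a SIGTERM delivered to a child would no longer trigger the parent's stop logic. The child would still need its own way to shut down, for example by restoring the default handler after the fork.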
