populate() with reserve_jobs=True ignores *restrictions #1413

@noahpettit

Bug

Computed.populate(*restrictions, reserve_jobs=True) ignores the *restrictions argument. It processes ALL pending jobs in the jobs table instead of only those matching the restriction.

How we found it

We have a Registration table with ~730 pending keys. We wanted to populate only a subset (5 keys) matching a filter:

restriction = ProcessScan & "pipeline_preset IS NOT NULL"
# Correctly shows 5 pending:
pending = restriction - Registration
print(len(pending))  # 5

# But this processes all 730 pending keys, not just 5:
Registration.populate(restriction, reserve_jobs=True, display_progress=True)

The progress bar shows 0/730 and it starts computing keys that don't match the restriction at all.

Diagnosis

In autopopulate.py, populate() delegates to _populate_distributed() when reserve_jobs=True. Looking at that method (~line 460):

def _populate_distributed(self, *restrictions, ...):
    ...
    if refresh:
        self.jobs.refresh(*restrictions, priority=priority, delay=-1)  # restrictions used here

    # But here, restrictions are NOT applied:
    pending_query = self.jobs.pending & "scheduled_time <= CURRENT_TIMESTAMP(3)"
    keys = pending_query.keys(order_by="priority ASC, scheduled_time ASC", limit=max_calls)

The *restrictions are passed to self.jobs.refresh() (which creates/updates job entries), but when fetching pending keys to process (line ~499), it queries self.jobs.pending without any restriction filter. So it picks up every pending job in the table regardless of what was passed to populate().

In contrast, _populate_direct() (used when reserve_jobs=False) correctly computes (key_source & restrictions) - target, so the restriction works as expected there.
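
The difference between the two paths can be illustrated with a small, self-contained model. This is not DataJoint code: the `Job` dataclass and the functions `refresh`, `select_pending_unrestricted`, and `select_pending_restricted` are hypothetical names, and the restriction is modeled as a plain Python predicate rather than a query expression. The sketch also assumes the jobs queue already holds pending entries from earlier, unrestricted runs. It only shows why filtering at refresh time is not enough once such entries exist.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Job:
    key: int
    preset: Optional[str]  # stands in for pipeline_preset
    priority: int = 5


Restriction = Callable[[Job], bool]


def refresh(queue: List[Job], candidates: Iterable[Job], restriction: Restriction) -> None:
    """Add matching candidates to the queue; the restriction IS applied here."""
    existing = {job.key for job in queue}
    for job in candidates:
        if restriction(job) and job.key not in existing:
            queue.append(job)


def select_pending_unrestricted(queue: List[Job]) -> List[Job]:
    """Model of the reported behavior: every pending job is returned."""
    return sorted(queue, key=lambda job: job.priority)


def select_pending_restricted(queue: List[Job], restriction: Restriction) -> List[Job]:
    """Model of the expected behavior: only pending jobs matching the restriction."""
    return sorted((job for job in queue if restriction(job)), key=lambda job: job.priority)


def has_preset(job: Job) -> bool:
    return job.preset is not None


# 730 candidate keys, 5 of which match the restriction.
candidates = [Job(key=i, preset="default" if i < 5 else None) for i in range(730)]

# Assumption: the 725 non-matching keys are already pending from earlier runs.
queue = [job for job in candidates if job.preset is None]

# The restricted refresh adds only the 5 matching keys, but the queue now holds 730.
refresh(queue, candidates, has_preset)

unrestricted = select_pending_unrestricted(queue)
restricted = select_pending_restricted(queue, has_preset)

print(len(unrestricted))  # 730
print(len(restricted))    # 5
```

In this model the restricted refresh behaves correctly, yet the unrestricted fetch still returns all 730 entries, which matches the 0/730 progress bar described above.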

Workaround

Use reserve_jobs=False:

Registration.populate(restriction, reserve_jobs=False)  # works correctly

Expected behavior

populate(*restrictions, reserve_jobs=True) should only process jobs matching the restrictions, consistent with reserve_jobs=False behavior.
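
One way to phrase this expectation is as a parity check between the two modes. The sketch below is a toy model over plain Python sets, not a test against DataJoint itself; `keys_direct` and `keys_reserved` are hypothetical helpers, and the restriction is a simple predicate on integer keys.

```python
from typing import Callable, Set


def keys_direct(key_source: Set[int], target: Set[int], restriction: Callable[[int], bool]) -> Set[int]:
    """Model of the reserve_jobs=False path: (key_source & restriction) - target."""
    return {key for key in key_source if restriction(key)} - target


def keys_reserved(pending_jobs: Set[int], restriction: Callable[[int], bool]) -> Set[int]:
    """Model of the expected reserve_jobs=True path: pending jobs filtered by the restriction."""
    return {key for key in pending_jobs if restriction(key)}


def matches_filter(key: int) -> bool:
    # Hypothetical restriction that selects 5 of the 730 keys.
    return key < 5


key_source = set(range(730))
target: Set[int] = set()          # nothing computed yet
pending_jobs = key_source - target

direct = keys_direct(key_source, target, matches_filter)
reserved = keys_reserved(pending_jobs, matches_filter)

# Expected contract: both modes select the same keys for the same restriction.
assert direct == reserved
print(len(reserved))  # 5
```

A regression test against the real library would presumably follow the same shape, comparing the keys processed under each mode for an identical restriction.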

Version

DataJoint 2.1.1
