Simplify task_migrate_institution_history_to_raw_institution by removing duplicate existence checks #1297

Merged
robertatakenaka merged 2 commits into copilot/create-data-migration-task from copilot/remove-unnecessary-imports
Feb 6, 2026

Copilot AI commented Feb 6, 2026

What does this PR do?

Removes duplicate existence checks and unnecessary imports from task_migrate_institution_history_to_raw_institution. The migration methods already handle empty querysets internally via _migrate_history_to_raw().

Changes:

  • Removes 4 History model imports that were used only for .exists() checks
  • Eliminates 4 redundant .exists() queries and the nested conditional logic
  • Reduces the code from 47 to 24 lines while keeping the correct behavior

Before:

has_publisher = PublisherHistory.objects.filter(
    journal=journal, institution__isnull=False
).exists()
# ... 3 more similar checks

if has_publisher or has_owner or has_sponsor or has_copyright_holder:
    if has_publisher:
        migrated = journal.migrate_publisher_history_to_raw()
        stats["publisher_history_migrated"] += len(migrated or [])
    # ... more nested conditionals

After:

total_migrated = 0
migrated = journal.migrate_publisher_history_to_raw()
stats["publisher_history_migrated"] += len(migrated or [])
total_migrated += len(migrated or [])
# ... direct calls to the other methods

if total_migrated > 0:
    stats["journals_processed"] += 1

Where could the review start?

journal/tasks.py, lines 15-21 (removed imports) and 697-719 (simplified logic).

How could this be tested manually?

Run the existing tests:

make django_test
# or specifically:
python manage.py test journal.tests.TestTaskMigrateInstitutionHistoryToRawInstitution

The tests validate that only journals whose history contains institutions are counted as processed.

Any background context you would like to provide?

The _migrate_history_to_raw() method (journal/models.py:1252) already checks whether history_item.institution exists and returns early if it is None. The .exists() checks in the task were redundant and issued unnecessary extra queries.

Screenshots

N/A: backend code change.

What are the relevant tickets?

Issue related to simplifying the migration task.

References

  • journal/models.py, lines 1239-1375: implementation of the migration methods
  • journal/tests.py, lines 680-907: tests for this functionality
Original prompt

Problem Statement

The task task_migrate_institution_history_to_raw_institution in journal/tasks.py can be simplified by removing unnecessary model imports and existence checks. The migration methods on the Journal model (migrate_publisher_history_to_raw(), migrate_owner_history_to_raw(), etc.) already handle checking if there's data to migrate internally, so we don't need to:

  1. Import the History models (CopyrightHolderHistory, OwnerHistory, PublisherHistory, SponsorHistory)
  2. Perform .exists() checks before calling the migration methods

Changes Required

1. Remove unnecessary imports (lines 17, 20, 21, 24)

Remove these imports from the imports section:

  • CopyrightHolderHistory
  • OwnerHistory
  • PublisherHistory
  • SponsorHistory

The import block should change from:

from journal.models import (
    AMJournal,
    CopyrightHolderHistory,
    Journal,
    JournalLicense,
    JournalLogo,
    OwnerHistory,
    PublisherHistory,
    SciELOJournal,
    SponsorHistory,
)

To:

from journal.models import (
    AMJournal,
    Journal,
    JournalLicense,
    JournalLogo,
    SciELOJournal,
)

2. Simplify the migration logic (approximately lines 698-742)

Replace the current logic that checks for existence before migrating:

# Check if journal has any history items with institution set
has_publisher = PublisherHistory.objects.filter(
    journal=journal,
    institution__isnull=False
).exists()
has_owner = OwnerHistory.objects.filter(
    journal=journal,
    institution__isnull=False
).exists()
has_sponsor = SponsorHistory.objects.filter(
    journal=journal,
    institution__isnull=False
).exists()
has_copyright_holder = CopyrightHolderHistory.objects.filter(
    journal=journal,
    institution__isnull=False
).exists()

# Only process if there are items to migrate
if has_publisher or has_owner or has_sponsor or has_copyright_holder:
    # Process PublisherHistory
    if has_publisher:
        migrated = journal.migrate_publisher_history_to_raw()
        stats["publisher_history_migrated"] += len(migrated or [])
    
    # Process OwnerHistory
    if has_owner:
        migrated = journal.migrate_owner_history_to_raw()
        stats["owner_history_migrated"] += len(migrated or [])
    
    # Process SponsorHistory
    if has_sponsor:
        migrated = journal.migrate_sponsor_history_to_raw()
        stats["sponsor_history_migrated"] += len(migrated or [])
    
    # Process CopyrightHolderHistory
    if has_copyright_holder:
        migrated = journal.migrate_copyright_holder_history_to_raw()
        stats["copyright_holder_history_migrated"] += len(migrated or [])
    
    stats["journals_processed"] += 1

With simpler logic that calls the migration methods directly:

# Process all history types directly - the methods handle checking if there's data to migrate
migrated = journal.migrate_publisher_history_to_raw()
stats["publisher_history_migrated"] += len(migrated or [])

migrated = journal.migrate_owner_history_to_raw()
stats["owner_history_migrated"] += len(migrated or [])

migrated = journal.migrate_sponsor_history_to_raw()
stats["sponsor_history_migrated"] += len(migrated or [])

migrated = journal.migrate_copyright_holder_history_to_raw()
stats["copyright_holder_history_migrated"] += len(migrated or [])

stats["journals_processed"] += 1

Rationale

This simplification:

  • Reduces code complexity
  • Removes duplicate logic (existence checking is already handled in the migration methods)
  • Reduces database queries (no need for separate .exists() checks)
  • Makes the code more maintainable by trusting the migration methods to handle their own logic
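
The four near-identical call blocks above could also be driven from a small table, which is one possible way to push the "reduces code complexity" point further. The following is only a sketch: `MIGRATIONS`, `migrate_journal_history`, and `FakeJournal` are hypothetical names introduced for illustration, and only the four `migrate_*_history_to_raw()` method names and the stats keys come from the task itself.

```python
from collections import defaultdict

# Hypothetical pairing of each stats key with the Journal method that feeds it.
MIGRATIONS = (
    ("publisher_history_migrated", "migrate_publisher_history_to_raw"),
    ("owner_history_migrated", "migrate_owner_history_to_raw"),
    ("sponsor_history_migrated", "migrate_sponsor_history_to_raw"),
    ("copyright_holder_history_migrated", "migrate_copyright_holder_history_to_raw"),
)


def migrate_journal_history(journal, stats):
    """Call every migration method on `journal` and update `stats` in place.

    Returns the total number of items the methods reported.
    """
    total = 0
    for stat_key, method_name in MIGRATIONS:
        migrated = getattr(journal, method_name)()
        count = len(migrated or [])  # the methods may return None
        stats[stat_key] += count
        total += count
    return total


class FakeJournal:
    """Stand-in for the real Journal model, for illustration only."""

    def migrate_publisher_history_to_raw(self):
        return ["p1", "p2"]

    def migrate_owner_history_to_raw(self):
        return None

    def migrate_sponsor_history_to_raw(self):
        return []

    def migrate_copyright_holder_history_to_raw(self):
        return ["c1"]


stats = defaultdict(int)
total = migrate_journal_history(FakeJournal(), stats)
if total > 0:
    stats["journals_processed"] += 1
```

Returning the total leaves the decision about when to increment `journals_processed` to the caller, so the helper does not bake in either the unconditional or the `total > 0` variant.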

File to Modify

journal/tasks.py

This pull request was created from Copilot chat.


…ory_to_raw_institution

Co-authored-by: robertatakenaka <505143+robertatakenaka@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Remove unnecessary model imports from task migration" to "Simplify task_migrate_institution_history_to_raw_institution by removing duplicate existence checks" Feb 6, 2026
Copilot AI requested a review from robertatakenaka February 6, 2026 16:31
@robertatakenaka robertatakenaka marked this pull request as ready for review February 6, 2026 16:35
Copilot AI review requested due to automatic review settings February 6, 2026 16:35
@robertatakenaka robertatakenaka merged commit c120405 into copilot/create-data-migration-task Feb 6, 2026
5 checks passed

Copilot AI left a comment

Pull request overview

This PR simplifies task_migrate_institution_history_to_raw_institution by removing redundant History-model imports and pre-migration .exists() checks, relying on the Journal migration methods instead.

Changes:

  • Removed imports of History models previously used only for .exists() checks.
  • Replaced conditional .exists()-gated migration calls with unconditional migration calls and a total_migrated counter.
  • Updated the logic for incrementing journals_processed based on total_migrated.

Comment on lines +698 to 719
+ # Process all history types directly - the methods handle checking if there's data to migrate
+ total_migrated = 0

- # Only process if there are items to migrate
- if has_publisher or has_owner or has_sponsor or has_copyright_holder:
-     # Process PublisherHistory
-     if has_publisher:
-         migrated = journal.migrate_publisher_history_to_raw()
-         stats["publisher_history_migrated"] += len(migrated or [])
-
-     # Process OwnerHistory
-     if has_owner:
-         migrated = journal.migrate_owner_history_to_raw()
-         stats["owner_history_migrated"] += len(migrated or [])
-
-     # Process SponsorHistory
-     if has_sponsor:
-         migrated = journal.migrate_sponsor_history_to_raw()
-         stats["sponsor_history_migrated"] += len(migrated or [])
-
-     # Process CopyrightHolderHistory
-     if has_copyright_holder:
-         migrated = journal.migrate_copyright_holder_history_to_raw()
-         stats["copyright_holder_history_migrated"] += len(migrated or [])

+ migrated = journal.migrate_publisher_history_to_raw()
+ stats["publisher_history_migrated"] += len(migrated or [])
+ total_migrated += len(migrated or [])
+
+ migrated = journal.migrate_owner_history_to_raw()
+ stats["owner_history_migrated"] += len(migrated or [])
+ total_migrated += len(migrated or [])
+
+ migrated = journal.migrate_sponsor_history_to_raw()
+ stats["sponsor_history_migrated"] += len(migrated or [])
+ total_migrated += len(migrated or [])
+
+ migrated = journal.migrate_copyright_holder_history_to_raw()
+ stats["copyright_holder_history_migrated"] += len(migrated or [])
+ total_migrated += len(migrated or [])
+
+ # Only count as processed if at least one history item was migrated
+ if total_migrated > 0:
      stats["journals_processed"] += 1

Copilot AI Feb 6, 2026

total_migrated is computed from len(migrated) where each migrate_*_history_to_raw() currently returns all history rows (it appends every history_item even when institution is None). This changes behavior: journals that have history rows but no institution set will be counted as journals_processed, and the *_history_migrated stats will be inflated (e.g., this will fail TestTaskMigrateInstitutionHistoryToRawInstitution.test_task_migrate_with_journal_without_institution). Consider counting only rows that actually had institution set (e.g., have the migrate_* methods return only migrated items / migrated count, or in the task use journal.<related>.filter(institution__isnull=False).count() for stats and the processed check).
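
One way to read the first suggestion (have the migrate_* methods return only the items that were actually migrated) is sketched below. This is not the code in journal/models.py: `HistoryItem`, `raw_institution_name`, and the standalone `migrate_history_to_raw` function are hypothetical stand-ins, and clearing `institution` after copying is an assumption about what the real migration does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HistoryItem:
    """Stand-in for a *History row; only the fields relevant here are modelled."""

    institution: Optional[str] = None
    raw_institution_name: Optional[str] = None  # hypothetical target field


def migrate_history_to_raw(history_items):
    """Copy institution data to the raw field, returning only rows that changed."""
    migrated = []
    for item in history_items:
        if item.institution is None:
            continue  # nothing to migrate, so this row is not counted
        item.raw_institution_name = item.institution
        item.institution = None  # assumption: the migration clears the old field
        migrated.append(item)
    return migrated


rows = [HistoryItem(institution="SciELO"), HistoryItem()]
migrated = migrate_history_to_raw(rows)
untouched = migrate_history_to_raw([HistoryItem(), HistoryItem()])
```

With a return value shaped like this, `len(migrated or [])` in the task would count only rows that had an institution set, so a journal whose history rows all lack an institution would contribute zero to `total_migrated`.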

Comment on lines +701 to +715
migrated = journal.migrate_publisher_history_to_raw()
stats["publisher_history_migrated"] += len(migrated or [])
total_migrated += len(migrated or [])

migrated = journal.migrate_owner_history_to_raw()
stats["owner_history_migrated"] += len(migrated or [])
total_migrated += len(migrated or [])

migrated = journal.migrate_sponsor_history_to_raw()
stats["sponsor_history_migrated"] += len(migrated or [])
total_migrated += len(migrated or [])

migrated = journal.migrate_copyright_holder_history_to_raw()
stats["copyright_holder_history_migrated"] += len(migrated or [])
total_migrated += len(migrated or [])

Copilot AI Feb 6, 2026

Calling all four migrate_*_history_to_raw() methods unconditionally will execute four full-table SELECTs over the related history managers and iterate every row, even when nothing needs migrating (e.g., all institution fields already NULL). Previously the task short-circuited those cases with cheap .exists() checks. To avoid a potential performance regression on journals with large history tables, consider filtering the querysets inside the migrate_* methods (only rows with institution__isnull=False) or otherwise skipping migration calls when there are no rows to migrate (without requiring extra model imports).
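
The difference between iterating every row and filtering first can be illustrated without a database. In the sketch below, `FakeHistoryManager` is a hypothetical in-memory stand-in that mimics only `all()` and the `filter(institution__isnull=...)` lookup of a Django related manager; the two `migrate_*` functions are illustrative and do not reproduce the real methods.

```python
class FakeHistoryManager:
    """In-memory stand-in for a Django related manager (illustration only)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def all(self):
        return iter(self._rows)

    def filter(self, *, institution__isnull):
        # Mimics the ORM lookup: keep rows whose NULL-ness matches the flag.
        return iter(
            row for row in self._rows
            if (row["institution"] is None) == institution__isnull
        )


def migrate_unfiltered(manager):
    """Visit every row and skip the ones without an institution in Python."""
    visited, migrated = 0, []
    for row in manager.all():
        visited += 1
        if row["institution"] is not None:
            migrated.append(row)
    return visited, migrated


def migrate_filtered(manager):
    """Ask the manager for only the rows that still have an institution."""
    visited, migrated = 0, []
    for row in manager.filter(institution__isnull=False):
        visited += 1
        migrated.append(row)
    return visited, migrated


manager = FakeHistoryManager(
    [{"institution": None}, {"institution": None},
     {"institution": None}, {"institution": "SciELO"}]
)
unfiltered_visited, unfiltered_migrated = migrate_unfiltered(manager)
filtered_visited, filtered_migrated = migrate_filtered(manager)
```

Both variants migrate the same single row, but the filtered one only ever sees that row. In a real queryset the filtering would happen in the database, which is the saving the comment is pointing at.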
