[25.04.19 / TASK-157] Feature: 배치, soft delete 기능 이를 위한 테스트 코드 업데이트와 데이터 마이그레이션 추가 #26

Nuung · 2025-04-19T08:07:17Z

🔥 변경 사항

배치에서 soft delete 기능이 작동하도록 변경했습니다.
핵심은 is_active 값을 활용해 비활성/활성 처리를 토글하는 것입니다.
아주 상세한 flow chart 는 노션을 참조해 주세요!

🏷 관련 이슈

없습니다!

📸 스크린샷 (UI 변경 시 필수)

ui 와 관련된 내용이 아니며, 노션을 꼭 참조해 주세요!

📌 체크리스트

기능이 정상적으로 동작하는지 테스트 완료
코드 스타일 가이드 준수 여부 확인
관련 문서 업데이트 완료 (필요 시)

Summary by CodeRabbit

New Features
- 원격 프로덕션 데이터베이스에서 로컬 데이터베이스로 사용자, 게시글, 게시글 통계 데이터를 마이그레이션하는 스크립트가 추가되었습니다.
- 게시글 활성 상태를 API 데이터와 동기화하는 기능이 추가되어, 게시글의 활성/비활성 상태가 최신 상태로 유지됩니다.
Bug Fixes
- 게시글 일괄 업서트 시, 기존 비활성 게시글의 재활성화 및 누락된 게시글의 비활성화 처리가 명확하게 동작하도록 개선되었습니다.
Tests
- 게시글 업서트 및 활성 상태 동기화 동작을 검증하는 비동기 테스트가 새롭게 추가되었습니다.
- 게시글 일일 통계 업데이트 기능에 대한 단위 및 통합 테스트가 추가되었습니다.
- 토큰 갱신 및 사용자 처리 흐름에 대한 포괄적인 비동기 테스트가 도입되었습니다.
- 기존 scraping/tests/test_main.py의 테스트가 제거되고, 기능별로 세분화된 테스트 모듈로 대체되었습니다.

notion-workspace · 2025-04-19T08:07:19Z

삭제된 게시글, 통계에는 어떻게 처리할지 결정 / 대응 개발 필요

coderabbitai · 2025-04-19T08:07:23Z

"""

Walkthrough

이번 변경 사항은 데이터 마이그레이션과 관련된 신규 스크립트 세 개의 추가, 게시글 업서트 로직의 내부 개선, 이에 대한 테스트 코드 보완, 그리고 설정 파일에 주석 형태의 데이터베이스 설정 예시 추가로 구성되어 있습니다. 마이그레이션 스크립트들은 사용자, 게시글, 게시글 통계 데이터를 원격 데이터베이스에서 로컬로 옮기는 절차를 구현합니다. 게시글 업서트 로직은 기존 게시글의 비활성화 및 재활성화 처리를 명확히 하도록 수정되었으며, 이에 대한 비활성화/재활성화 동작을 검증하는 테스트가 추가되었습니다.

Changes

파일 경로	변경 요약
backoffice/settings/local.py	prod 데이터베이스 설정 예시를 주석으로 추가
modules/data_migrate/users_migration.py modules/data_migrate/posts_migration.py modules/data_migrate/statistics_migration.py	사용자, 게시글, 게시글 통계 데이터의 원격→로컬 마이그레이션 스크립트 신규 추가
scraping/main.py	`Scraper` 클래스에 게시글 활성/비활성 상태 동기화 메서드(`sync_post_active_status`) 추가 및 기존 `process_user`에 호출 추가
scraping/tests/test_main.py	기존 `scraping/tests/test_main.py` 전체 삭제
scraping/tests/test_main_posts.py	게시글 관련 `Scraper` 클래스 메서드들에 대한 비동기 단위 테스트 신규 추가
scraping/tests/test_main_statistics.py	게시글 일간 통계 업데이트 기능에 대한 테스트 신규 추가
scraping/tests/test_main_tokens.py	토큰 갱신 및 사용자 처리 관련 `Scraper` 클래스 메서드들에 대한 비동기 단위 테스트 신규 추가

Sequence Diagram(s)

sequenceDiagram
    participant RemoteDB as Remote DB
    participant LocalDB as Local DB
    participant MigrationScript as Migration Script

    MigrationScript->>RemoteDB: 데이터 chunk 조회
    loop 각 chunk별
        MigrationScript->>LocalDB: 기존 데이터 존재 여부 확인
        alt 데이터 존재
            MigrationScript->>LocalDB: 데이터 갱신
        else 데이터 미존재
            alt 관련 사용자/게시글 존재
                MigrationScript->>LocalDB: 신규 데이터 생성
            else 관련 데이터 없음
                MigrationScript->>MigrationScript: 건너뛰기 및 로그
            end
        end
    end
    MigrationScript->>MigrationScript: 처리 결과 요약 출력

sequenceDiagram
    participant Scraper as Scraper
    participant DB as DB

    Scraper->>DB: 해당 사용자의 모든 게시글(활성/비활성) 조회
    Scraper->>Scraper: batch 내 UUID별 분류
    loop batch 내 각 게시글
        alt 활성 게시글 존재
            Scraper->>DB: 필드 업데이트
        else 비활성 게시글 존재
            Scraper->>DB: 재활성화 및 필드 업데이트
        else
            Scraper->>DB: 신규 게시글 생성(활성)
        end
    end
    Scraper->>DB: batch에 없는 기존 활성 게시글 비활성화
    Scraper->>DB: bulk update/create/deactivate 실행

Possibly related PRs

[25.02.07 / TASK-111] Feature - post 모델링 업데이트, 대응 개발 - slug 값 추가 #13: 게시글 모델에 slug 필드 추가 및 비동기 배치 업데이트 구현과 관련, 본 PR에서 게시글 slug 필드 업데이트 및 처리 로직과 연관됨
[25.03.07 / TASK-84] Test - Scraper 테스트 코드 구현 #20: scraping/tests/test_main.py 내 _upsert_batch 관련 비활성화/재활성화 테스트 추가와 관련, 기존 테스트 스위트 확장에 기여

Suggested labels

bug, documentation

Suggested reviewers

cheonHwi
Jihyun3478

Poem

( )
( . .) 데이터 토끼가 폴짝 뛰네
c(")(") 마이그레이션 스크립트 세 개나 추가!
오래된 글은 잠시 쉬고,
다시 돌아오면 반갑게 맞이해주고,
테스트도 꼼꼼히 챙겼지요.
오늘도 DB는 한층 더 튼튼해졌답니다!
🥕
"""

Tip

⚡💬 Agentic Chat (Pro Plan, General Availability)

We're introducing multi-step agentic chat in review comments and issue comments, within and outside of PR's. This feature enhances review and issue discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments and add commits to existing pull requests.

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b2b311 and 1303549.

📒 Files selected for processing (8)

modules/data_migrate/posts_migration.py (1 hunks)
modules/data_migrate/statistics_migration.py (1 hunks)
modules/data_migrate/users_migration.py (1 hunks)
scraping/main.py (2 hunks)
scraping/tests/test_main.py (0 hunks)
scraping/tests/test_main_posts.py (1 hunks)
scraping/tests/test_main_statistics.py (1 hunks)
scraping/tests/test_main_tokens.py (1 hunks)

💤 Files with no reviewable changes (1)

scraping/tests/test_main.py

🚧 Files skipped from review as they are similar to previous changes (3)

modules/data_migrate/statistics_migration.py
modules/data_migrate/users_migration.py
modules/data_migrate/posts_migration.py

🧰 Additional context used

🧬 Code Graph Analysis (3)

scraping/main.py (2)

users/models.py (1)

User (10-52)

posts/models.py (1)

Post (8-47)

scraping/tests/test_main_statistics.py (4)

posts/models.py (2)

Post (8-47)

PostDailyStatistics (50-88)

scraping/main.py (1)

update_daily_statistics (211-280)

users/models.py (1)

User (10-52)

utils/utils.py (1)

get_local_now (11-18)

scraping/tests/test_main_posts.py (4)

posts/models.py (1)

Post (8-47)

scraping/main.py (4)

Scraper (24-425)

bulk_upsert_posts (75-95)

_upsert_batch (97-141)

sync_post_active_status (143-209)

users/models.py (1)

User (10-52)

utils/utils.py (1)

get_local_now (11-18)

🔇 Additional comments (6)

scraping/tests/test_main_statistics.py (1)

1-87: 테스트 코드가 잘 작성되었습니다!

테스트 케이스가 명확하게 구성되어 있고, 단위 테스트와 통합 테스트 모두 구현되어 있습니다. 특히 pytest.mark.asyncio와 pytest.mark.django_db를 적절하게 사용하여 비동기 테스트와 데이터베이스 접근을 처리한 점이 좋습니다.

scraping/main.py (2)

143-209: 소프트 삭제 로직이 체계적으로 구현되었습니다.

sync_post_active_status 메서드가 다음과 같은 안전 장치를 포함하여 잘 설계되었습니다:

최소 게시글 개수 검증으로 API 오류 대응

50% 이상의 게시글이 비활성화될 경우 방어 로직 구현

로깅을 통한 모니터링 지원

게시글의 영구 삭제 대신 is_active 플래그를 토글하는 방식의 소프트 삭제 구현이 적절합니다.

361-372: 게시글 활성 상태 동기화 로직이 적절하게 구현되었습니다.

process_user 메서드에서 게시글 목록을 가져온 후 즉시 sync_post_active_status를 호출하여 동기화하는 흐름이 자연스럽고 효율적입니다. 사용자별 게시글 수를 로깅하는 것도 모니터링에 유용합니다.

scraping/tests/test_main_tokens.py (1)

1-328: 토큰 관련 테스트가 포괄적으로 구현되었습니다.

다양한 시나리오(토큰 업데이트 성공, 변경 없음, 만료, 복호화 실패 등)를 모두 테스트하여 견고성을 확보했습니다. 특히 mock 객체를 효과적으로 활용하여 외부 의존성을 제거한 점이 인상적입니다.

run 메서드 테스트에서 비동기 이터레이터를 모킹하는 접근 방식이 훌륭합니다.

scraping/tests/test_main_posts.py (2)

160-236: 게시글 활성 상태 동기화 테스트가 잘 설계되었습니다.

다양한 상태 변경 시나리오(활성 유지, 비활성화, 재활성화, 비활성 유지)를 모두 검증하는 테스트 케이스가 철저합니다. sync_to_async를 적절히 활용하여 Django ORM을 비동기적으로 사용한 점도 좋습니다.

237-277: 안전 임계값 테스트가 중요한 안전 장치를 검증합니다.

대규모 비활성화 시나리오에서 보호 메커니즘이 제대로 작동하는지 확인하는 테스트가 추가된 것은 매우 중요합니다. 이 테스트는 실수로 인한 데이터 손실을 방지하는 안전장치가 제대로 기능하는지 확인합니다.

✨ Finishing Touches

📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (3)

backoffice/settings/local.py (1)

10-12: 개발 환경이라도 CORS_ALLOW_ALL_ORIGINS = True 는 주의 필요

DEBUG=True‑인 로컬 설정이더라도 전체 오리진 허용 설정이 Git 에 커밋되면,
간혹 운영 설정으로 잘못 머지될 위험이 있습니다.

.env 플래그로 제어하거나

CORS_ALLOWED_ORIGINS 에서만 로컬 호스트들을 명시하는 편이 안전합니다.

# DATABASES["prod"] … 주석 또한 무심코 주석 해제될 수 있으므로,
운영 DB 설정은 별도 settings/prod.py 로 분리해 두면 사고를 줄일 수 있습니다.
scraping/main.py (1)
176-186: updated_at 가 갱신되지 않습니다

TimeStampedModel 을 상속한 경우 bulk_update 는 자동으로
updated_at 을 갱신하지 않습니다.
활성/재활성화 배치 모두 update 필드에 updated_at 을 포함해야
정확한 수정 시각이 남습니다.
- ["title", "slug", "released_at"]
+ ["title", "slug", "released_at", "updated_at"]
같은 이유로 비활성화 배치도 updated_at 를 넣어 두는 편이 좋습니다.
modules/data_migrate/users_migration.py (1)

22-35: 대량 데이터에 대비한 커서 / 메모리 최적화 제안

fetchall() 은 청크당 전체 레코드를 메모리에 적재합니다.
cursor.fetchmany(chunk_size) 또는 Django 의 iterator() 패턴을
사용하면 메모리 피크를 더 낮출 수 있습니다.

또한 SELECT … LIMIT … OFFSET … 는 데이터가 커질수록
OFFSET 비용이 급격히 증가합니다.
WHERE id > last_id ORDER BY id LIMIT … 방식이 더 효율적입니다.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 27b4b2f and 00b3a7e.

📒 Files selected for processing (6)

backoffice/settings/local.py (1 hunks)
modules/data_migrate/posts_migration.py (1 hunks)
modules/data_migrate/statistics_migration.py (1 hunks)
modules/data_migrate/users_migration.py (1 hunks)
scraping/main.py (1 hunks)
scraping/tests/test_main.py (1 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (3)

modules/data_migrate/users_migration.py (1)

users/models.py (1)

User (10-52)

modules/data_migrate/posts_migration.py (2)

posts/models.py (1)

Post (8-47)

users/models.py (1)

User (10-52)

modules/data_migrate/statistics_migration.py (1)

posts/models.py (2)

Post (8-47)

PostDailyStatistics (50-88)

🔇 Additional comments (15)

scraping/tests/test_main.py (2)

269-341: 비활성화 테스트 로직이 명확합니다.

test_upsert_batch_deactivates_old_posts 메서드는 배치 처리 로직이 이전에 존재하던 게시글 중 새 배치에 포함되지 않은 게시글을 적절히 비활성화하는지 검증합니다. 활성 상태인 게시글 3개를 생성하고 그 중 하나만 배치에 포함시켜 나머지 게시글이 비활성화되는지 확인하는 테스트 흐름이 명확하게 작성되었습니다.

342-388: 재활성화 테스트 로직이 명확합니다.

test_upsert_batch_reactivates_inactive_posts 메서드는 비활성화된 게시글이 새 배치에 포함될 경우 재활성화되는지 검증합니다. 비활성 상태인 게시글을 생성하고 이를 배치에 포함시켜 활성화되는지 확인하는 과정이 잘 구현되어 있습니다. 게시글의 제목, 슬러그 등의 정보도 올바르게 업데이트되는지 확인하는 검증 로직도 적절합니다.

modules/data_migrate/statistics_migration.py (6)

1-6: 스크립트 목적이 명확합니다.

스크립트의 목적과 기능이 주석을 통해 명확하게 설명되어 있습니다. 게시글 통계 데이터를 원격 데이터베이스에서 로컬 데이터베이스로 마이그레이션하는 역할을 수행합니다.

15-22: 초기 설정 및 카운터 변수들이 적절합니다.

마이그레이션 처리를 위한 청크 크기와 다양한 카운터 변수들이 명확하게 정의되어 있습니다. 새로 생성된 레코드, 업데이트된 레코드, 건너뛴 레코드를 추적하는 변수들이 포함되어 있어 마이그레이션 진행 상황을 모니터링하기에 좋습니다.

23-40: 청크 단위 데이터 조회 로직이 효율적입니다.

대용량 데이터 처리를 위해 청크 단위로 데이터를 조회하는 방식을 사용하고 있어 메모리 효율성이 좋습니다. SQL 쿼리에서 posts_postdailystatistics와 posts_post 테이블을 조인하여 필요한 정보를 한 번에 가져오는 것도 효율적입니다.

43-81: 데이터 마이그레이션 로직이 적절합니다.

트랜잭션을 사용하여 데이터 일관성을 유지하며, 각 통계 데이터에 대해:

post_uuid를 이용해 로컬 게시글을 찾고

이미 존재하는 통계 정보는 업데이트하고

없는 경우 새로 생성하는 로직이 잘 구현되어 있습니다.

예외 처리와 로깅도 적절히 포함되어 있어 마이그레이션 중 발생할 수 있는 문제를 파악하기 쉽게 되어 있습니다.

88-92: 결과 요약 로그가 유용합니다.

마이그레이션 완료 후 처리 결과에 대한 요약 정보(새로 생성된 개수, 업데이트된 개수, 건너뛴 개수, 총 처리된 레코드 수)를 출력하여 마이그레이션 결과를 한눈에 파악할 수 있게 해줍니다.

93-95: 예외 처리가 적절합니다.

전체 마이그레이션 과정을 try-except 블록으로 감싸서 예외 발생 시 스크립트가 중단되지 않고 오류 메시지를 출력하도록 하여 문제 진단이 용이합니다.

modules/data_migrate/posts_migration.py (7)

1-6: 스크립트 목적이 명확합니다.

스크립트의 목적과 기능이 주석을 통해 명확하게 설명되어 있습니다. 게시글 데이터를 원격 데이터베이스에서 로컬 데이터베이스로 마이그레이션하는 역할을 수행합니다.

15-23: 초기 설정 및 카운터 변수들이 적절합니다.

마이그레이션 처리를 위한 청크 크기와 다양한 카운터 변수들이 명확하게 정의되어 있습니다. 새로 생성된 레코드, 업데이트된 레코드, 건너뛴 레코드를 추적하는 변수들이 포함되어 있어 마이그레이션 진행 상황을 모니터링하기에 좋습니다.

24-41: 청크 단위 데이터 조회 로직이 효율적입니다.

대용량 데이터 처리를 위해 청크 단위로 데이터를 조회하는 방식을 사용하고 있어 메모리 효율성이 좋습니다. SQL 쿼리에서 posts_post와 users_user 테이블을 조인하여 필요한 정보를 한 번에 가져오는 것도 효율적입니다.

44-88: 데이터 마이그레이션 로직이 적절합니다.

트랜잭션을 사용하여 데이터 일관성을 유지하며, 각 게시글 데이터에 대해:

post_uuid를 이용해 로컬에 이미 존재하는 게시글을 확인하고

velog_uuid를 이용해 로컬 사용자를 찾고

사용자가 존재하는 경우에만 게시글을 처리하는 흐름이 적절합니다.

is_active 필드를 포함하여 게시글의 활성 상태를 유지하는 것도 soft delete 기능에 부합합니다.

로깅도 풍부하게 포함되어 있어 마이그레이션 진행 상황을 모니터링하기 쉽습니다.

95-99: 결과 요약 로그가 유용합니다.

마이그레이션 완료 후 처리 결과에 대한 요약 정보(새로 생성된 개수, 업데이트된 개수, 건너뛴 개수, 총 처리된 레코드 수)를 출력하여 마이그레이션 결과를 한눈에 파악할 수 있게 해줍니다.

100-102: 예외 처리가 적절합니다.

전체 마이그레이션 과정을 try-except 블록으로 감싸서 예외 발생 시 스크립트가 중단되지 않고 오류 메시지를 출력하도록 하여 문제 진단이 용이합니다.

48-72: 기존 게시글 업데이트 로직이 soft delete 기능을 잘 지원합니다.

기존에 존재하는 게시글의 정보를 업데이트할 때 is_active 필드를 포함하여 게시글의 활성화 상태를 유지하거나 변경함으로써 PR 목표인 soft delete 기능을 적절히 지원합니다. 이는 scraping/tests/test_main.py에 추가된 테스트와도 일관성이 있습니다.

scraping/main.py

modules/data_migrate/users_migration.py

ooheunda

코드 잘 봤습니다. migration은 로컬 테스트시 정말 유용하게 쓸 것 같아요!
큰 문제는 없어 보이고, 개인적으로 궁금해서 남긴 것 하나만 답변 부탁드릴게요.
고생하셨어요! 👍 🔥

좋았던 점

/data_migrate
- 주석과 print 내용이 간결하고 명확해 이해하기 쉬웠습니다.
- 로컬 데이터와 확실히 비교하고 청크별 하나의 트랜잭션으로 동기화하고 있어 문제 없을 것 같아요.
- 노션에 따로 작성하신 문서도 이해하기 쉽게 써져 있어서 너무 좋았습니다.
게시글 soft_delete 관련
- 방어 로직 추가하신게 섬세하게 핸들링 되는 것 같아 좋았습니다.
- deactivate, reactivate 두 작업이 명확하고 보기 좋게 분리되어 있어요.
- 테스트 코드는 새로 추가하신 두 케이스만 보았는데요, 이도 주석이 상세하고 읽기 좋았습니다.

ooheunda · 2025-04-22T07:57:48Z

scraping/main.py

+            # 너무 많은 게시글이 비활성화되는 경우 방어 로직
+            active_posts_count = Post.objects.filter(
+                user=user, is_active=True
+            ).count()
+
+            if (
+                active_posts_count > 0
+                and deactivation_count / active_posts_count > 0.5
+            ):
+                logger.warning(
+                    f"Suspicious deactivation detected for user {user.velog_uuid}: "
+                    f"Would deactivate {deactivation_count} out of {active_posts_count} posts. "
+                    f"Skipping post status sync as a safety measure."
+                )
+                return


비율로 방어 로직 추가하신게 섬세해서 너무 좋은 것 같아요.
예를 들어 유저가 3개의 글 중 2개를 직접 비활성화 했다면, 그 유저는 글을 하나 더 발행하기 전까진 이 작업이 계속 skip 될 것 같은데 제가 이해한 게 맞을까요? (+추가: 테스트 코드에 작성하신 주석을 보니 맞는 것 같네요!) 이런 케이스에 대해선 어떻게 처리하려고 하셨는지 궁금합니다.
VD에 헤비 유저가 많고, 저런 경우도 한 배치 사이클(45분) 내에 아주아주아주아주 드물 것 같긴 한데, 이런 경우는 로그 모니터링 정도로 대응해도 괜찮은 걸까요?

흐흐 아주 중요한 부분 짚어주셔서 감사해요. 알아봐주셔서 감격 ㅎ
" 3개의 글 중 2개를 직접 비활성화 했다면, 그 유저는 글을 하나 더 발행하기 전까진 이 작업이 계속 skip" -> 아주 정확하게 맞습니다! 일부러 좀 노렸어요. 정확하게는 "방어 로직" 을 생각하다 보니까 애초에 글이 적은 사람이 불리한 방어 로직이네? 싶었어요.

근데 이제 서비스 관점에서 언급한 것과 같이 3개 쓰고 2개 비활성화 하는 사람이 정말.. 대시보드 통계가 궁금할까..? 쪽이었고, 일단 주관으로 "아니다" 라고 결론을 내렸네요.

근데 [ 만약 3개 쓰고 2개 비활성화 하는 유저가 나타난다면! 그때 바꾸자! ] 에 가까운 것 같아요 :)

결론 내리신 부분 이해 했습니다! 저도 같은 생각입니다ㅎㅎ 👍

six-standard

코드 잘 읽었습니다!

좋았던 점

Migration 파일이 생겨서 이전보다 테스트 환경이 훨씬 개선될 듯 합니다.
(프론트 쪽은 실제 뷰를 봐야 하다 보니 프로덕션 서버에 붙이는 일이 많았는데, 이런 개선점은 정말 환영할 만한 부분입니다!)
유저 글 비활성화에 대한 안전 영역이 설정되어있던데, 개인적으로 생각해보지 못한 부분이라 놀랍네요.
- 다만 이 안전 영역 부분은 조금 더 변경하기 쉽도록 따로 상수나 환경 변수 값으로 빼는 것도 괜찮지 않을까 싶습니다. (유저들의 활동이 저희의 예측과 조금 다를 수 있으니까요)

Jihyun3478

좋았던 점

신규 데이터와 과거 데이터를 비교해서 soft delete를 구현하신 부분이 인상깊었습니다. 노션과 함께 코드를 살펴보니 여러 문제 해결 방법 중 해당 방법을 택하신 이유가 이해가 되는 것 같습니다! 신입 개발자로서 배워야 할 부분인 것 같네요:)
이제 운영계 데이터로 로컬 테스팅이 가능한 점이 신기하네요! 항상 자동화 설정 세팅을 잘 해두신 것 같아 이 부분은 빠르게 f/u 해야겠습니다!!

Jihyun3478 · 2025-04-24T02:10:10Z

scraping/main.py

+            logger.warning(
+                f"Skipping post status sync for user {user.velog_uuid} - Too few posts returned ({len(current_post_ids)})"
+            )
+            return


API 응답이 너무 적을 경우, is_active를 변경하지 않는다는 말씀이신가요?
최소 개수 이상의 기준을 세우신 이유가 어떤 상황을 고려하신건지 설명해주실 수 있으신가요?!

#26 (comment)

@ooheunda 님과 같은 질문 주셔서 영광임다! 위 코멘트로 대체 할게요!

Jihyun3478 · 2025-04-24T02:12:20Z

scraping/main.py

+                )
+                posts_to_reactivate.update(is_active=True)
+
+        await _execute_sync()


신규 데이터에 없는데 과거 데이터에는 있을 경우만 고려하신게 아니라, 반대의 상황 즉 활성화까지 고려하신 부분이 놀라웠습니다!
비즈니스 로직 설계 특히 배치 쪽은 다른 로직보다 복잡한데 이런 설계의 구조와 방향을 정하실 때, 현우님만의 기준이 있을까요??

사실 비즈니스로직 설계 때 보다 다루는 데이터의양, 그리고 사실 이게 핵심 비즈니스 로직에 가깝기때문에 저는 flow chart 와 시퀀스 다이어그램에 진심으로 힘을 쏟습니다!

저희 특성상 아직까진 API 에 엄청 복잡한 로직이 없을 뿐이지, 앞으로 만약 생긴다면 똑같이 갈거에요!! 제 기준이 있다기 보단 계속 엣지 케이스에 집착하다보니까 이렇게 까지 된 것 같아요.

이 배치가 잘 못 되었을 경우 사실 API 는 죄없이 다 터질꺼고, 그러면 client side 도 억울하게 다 터지니까요 ㅎㅎ

Jihyun3478 · 2025-04-24T02:14:13Z

scraping/tests/test_main_posts.py

테스트 파일을 3개로 분리하셨군요!
안그래도 main 파일 테스트코드 구현하면서 코드 길이가 다소 길다고 느꼈었는데 분리하니 역할이 명확히 보여 좋은 것 같습니다!

Nuung added 2 commits April 19, 2025 16:05

feature: 배치 과정 중 삭제 판단되는 게시글을 비활성 처리, 오판단된 비활성은 다시 살릴 수 있는 로직도 추가

f64e02b

feature: 운영계 에서 로컬로, 데이터 마이그레이션 하는 코드 추가

00b3a7e

Nuung added the enhancement New feature or request label Apr 19, 2025

Nuung requested review from Jihyun3478, ooheunda and six-standard April 19, 2025 08:07

Nuung self-assigned this Apr 19, 2025

coderabbitai bot reviewed Apr 19, 2025

View reviewed changes

scraping/main.py Outdated Show resolved Hide resolved

scraping/main.py Outdated Show resolved Hide resolved

modules/data_migrate/users_migration.py Show resolved Hide resolved

Nuung added 2 commits April 19, 2025 17:12

modify: 마이그레이션, 통계는 7일로 컷팅

4b2b311

feature: 테스트 규모가 커짐에 따라 집중 테스팅 섹션과 파일 분리

1303549

ooheunda mentioned this pull request Apr 21, 2025

[25.04.22 / TASK-100] Feature - Leaderboard 도메인 추가 및 조회 API 구현 Check-Data-Out/velog-dashboard-v2-api#27

Merged

3 tasks

ooheunda approved these changes Apr 22, 2025

View reviewed changes

six-standard approved these changes Apr 23, 2025

View reviewed changes

Jihyun3478 approved these changes Apr 24, 2025

View reviewed changes

Nuung merged commit 9ffde97 into main Apr 24, 2025
2 checks passed

Nuung deleted the feature/batch-post-soft-delete branch April 24, 2025 14:23

[25.04.19 / TASK-157] Feature: 배치, soft delete 기능 이를 위한 테스트 코드 업데이트와 데이터 마이그레이션 추가 #26

[25.04.19 / TASK-157] Feature: 배치, soft delete 기능 이를 위한 테스트 코드 업데이트와 데이터 마이그레이션 추가 #26

Uh oh!

Conversation

Nuung commented Apr 19, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔥 변경 사항

🏷 관련 이슈

📸 스크린샷 (UI 변경 시 필수)

📌 체크리스트

Summary by CodeRabbit

Uh oh!

notion-workspace bot commented Apr 19, 2025

Uh oh!

coderabbitai bot commented Apr 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ooheunda left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

좋았던 점

Uh oh!

ooheunda Apr 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Nuung Apr 22, 2025

Choose a reason for hiding this comment

Uh oh!

ooheunda Apr 23, 2025

Choose a reason for hiding this comment

Uh oh!

six-standard left a comment

Choose a reason for hiding this comment

좋았던 점

Uh oh!

Jihyun3478 left a comment

Choose a reason for hiding this comment

좋았던 점

Uh oh!

Jihyun3478 Apr 24, 2025

Choose a reason for hiding this comment

Uh oh!

Nuung Apr 24, 2025

Choose a reason for hiding this comment

Uh oh!

Jihyun3478 Apr 24, 2025

Choose a reason for hiding this comment

Uh oh!

Nuung Apr 24, 2025

Choose a reason for hiding this comment

Uh oh!

Jihyun3478 Apr 24, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Nuung commented Apr 19, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Apr 19, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

ooheunda left a comment •

edited

Loading

ooheunda Apr 22, 2025 •

edited

Loading