feat: Add school newsletter labeling data evaluation script by deli-minju · Pull Request #8 · GACHI-Project/GACHI-AI

deli-minju · 2026-05-25T03:47:02Z

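📌 Work summary

Summary:
- Add 30 school newsletter labeling JSON files as the ground-truth set for evaluation
- Add an evaluation script for /ai/newsletters/analyze based on the labeling data
- Add an option to choose between baseline and OpenAI modes at run time
- Implement output of item precision/recall/F1 and title/summary/type/date/dateStatus evaluation metrics
- Add a mismatch sample summary and an option to save a JSON report
- Update .gitignore so that original PDF/JPG/PNG files and run reports are excluded from commits
- Change the default OpenAI model to gpt-4.1-mini
- Document the evaluation data structure and how to run the evaluation
Related issue: closes [FEAT] Add analysis evaluation script based on school newsletter labeling data #7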
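
As an illustration of the mode selection and quality gate listed above, here is a minimal sketch of how such a command line could be wired up. Only the `--mode` and `--fail-under-f1` flag names come from this PR; the `baseline` default, the helper names `build_parser` and `exit_code`, and the overall structure are assumptions, not the actual contents of `scripts/evaluate_newsletter_labels.py`.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the two flags named in the PR."""
    parser = argparse.ArgumentParser(
        description="Evaluate newsletter analysis output against labeled data"
    )
    # Assumption: "baseline" is the default so no paid API call happens by accident.
    parser.add_argument("--mode", choices=["baseline", "openai"], default="baseline")
    # Optional quality gate: when set, a run below this F1 should fail.
    parser.add_argument("--fail-under-f1", type=float, default=None)
    return parser


def exit_code(f1: float, threshold: Optional[float]) -> int:
    """Return 1 when a threshold is set and F1 is below it, otherwise 0."""
    if threshold is not None and f1 < threshold:
        return 1
    return 0


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--mode", "openai", "--fail-under-f1", "0.8"])
print(args.mode, exit_code(0.75, args.fail_under_f1))
```

Keeping the exit-code decision in a small pure function makes the gate easy to test without running the evaluation itself.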

🌿 Branch info

Source: feat/#7-newsletter-label-eval
Target: develop (default)

✅ Checklist

Branch convention followed (feat/refac/hotfix/chore/design/bugfix)
Commit convention followed (feat/fix/refactor/docs/style/chore)
self-review completed
Tests and local run verified

🧪 Test results

PS C:\Users\mjmdm\PycharmProjects\GACHI-AI> .\.venv\Scripts\python.exe -m ruff check .

All checks passed!

Summary by CodeRabbit

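Release notes

New features
- Added evaluation of school newsletter analysis results (precision, recall, F1 score, and related metrics)
- Evaluation can run in either a baseline mode (no cost) or an OpenAI mode
Documentation
- Added a guide on running the newsletter evaluation and on the label format
- Documented the 30 sample newsletters and the labeling rules
Other
- Updated the default OpenAI model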

coderabbitai · 2026-05-25T03:47:19Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 82dd3dbd-dded-4223-af83-2725f5592e05

📥 Commits

Reviewing files that changed from the base of the PR and between 55b5261 and 01ba340.

📒 Files selected for processing (7)

data/newsletter-labels/README.md
data/newsletter-labels/newsletter-005.json
data/newsletter-labels/newsletter-014.json
data/newsletter-labels/newsletter-025.json
data/newsletter-labels/newsletter-027.json
data/newsletter-labels/newsletter-029.json
scripts/evaluate_newsletter_labels.py

📝 Walkthrough

Walkthrough

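Adds a complete system for evaluating newsletter analysis quality. It includes a dataset of 30 labeled JSON files, an evaluation script, and run documentation. It supports baseline and OpenAI modes and computes precision/recall/F1 metrics.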
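
For reference, item-level precision, recall, and F1 can be derived from three counts: matched items, predicted items, and expected (labeled) items. The sketch below uses the standard definitions; the function name and the choice to return 0.0 for empty denominators are assumptions and may differ from what the script does.

```python
from typing import Tuple


def precision_recall_f1(matched: int, predicted: int, expected: int) -> Tuple[float, float, float]:
    """Standard precision/recall/F1 from counts, with 0.0 for empty denominators."""
    precision = matched / predicted if predicted else 0.0
    recall = matched / expected if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 8 of 10 predicted items match, out of 16 labeled items.
p, r, f1 = precision_recall_f1(matched=8, predicted=10, expected=16)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```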

Changes

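School newsletter analysis evaluation framework

| Layer / File(s) | Summary |
| --- | --- |
| Evaluation dataset and schema: `data/newsletter-labels/README.md`, `data/newsletter-labels/newsletter-001.json`, `data/newsletter-labels/newsletter-002.json`, `data/newsletter-labels/newsletter-*.json`, `.gitignore` | Adds labeled JSON data for 30 newsletters. Each JSON consists of document metadata (`documentId`, `documentTitle`, `documentDate`, `school`), date candidates (`dateCandidates`), and a label array (`labels`); each label includes type, title, evidence text, date status, target, whether action is required, and so on. `README.md` documents the file matching rules, commit criteria, and JSON schema. `.gitignore` gains `reports/` and image file patterns. |
| Evaluation script and logic: `scripts/evaluate_newsletter_labels.py` | Loads the label data, normalizes it per sample, and compares it with the analysis results. For each sample, expected and predicted items are matched on type, date status, date, and title similarity. The script then aggregates precision, recall, F1, title/summary accuracy, and date similarity, and writes mismatched samples to a report. The `--mode openai` option controls whether OpenAI is called, and the `--fail-under-f1` option returns exit code 1 when the score falls below the threshold. |
| Configuration and run documentation: `app/config.py`, `docs/env.md`, `docs/newsletter-evaluation.md`, `README.md` | Changes the default model in `OpenAISettings.from_env()` from `gpt-4o-mini` to `gpt-4.1-mini`. `docs/newsletter-evaluation.md` documents in detail how to run the evaluation, the label format, the evaluation metrics, and the data management principles. An analysis evaluation section is added to `README.md`, and the environment variable defaults in `docs/env.md` are updated. |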
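
The matching step described in the table (type, date status, date, title similarity) could look roughly like the sketch below. This is a greedy one-to-one matcher built on `difflib.SequenceMatcher`; the `Item` class, the `0.5` similarity threshold, the exact-equality rule for type/status/date, and the sample values are all assumptions for illustration, not the script's actual algorithm.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


@dataclass
class Item:
    """Hypothetical normalized item; field names are illustrative only."""
    type: str
    date_status: str
    date: Optional[str]
    title: str


def title_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def match_items(
    expected: List[Item], predicted: List[Item], min_title_sim: float = 0.5
) -> List[Tuple[int, int]]:
    """Greedily pair each expected item with its most similar unused prediction."""
    pairs: List[Tuple[int, int]] = []
    used = set()
    for i, exp in enumerate(expected):
        best_j, best_sim = None, min_title_sim
        for j, pred in enumerate(predicted):
            if j in used:
                continue
            # Assumption: type, date status, and date must agree exactly.
            if (exp.type, exp.date_status, exp.date) != (pred.type, pred.date_status, pred.date):
                continue
            sim = title_similarity(exp.title, pred.title)
            if sim >= best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            used.add(best_j)
            pairs.append((i, best_j))
    return pairs


expected = [
    Item("EVENT", "CONFIRMED", "2025-06-19", "Field trip notice"),
    Item("SUBMISSION", "CONFIRMED", "2025-06-10", "Consent form due"),
]
predicted = [
    Item("SUBMISSION", "CONFIRMED", "2025-06-10", "Consent form deadline"),
    Item("EVENT", "CONFIRMED", "2025-06-19", "Field trip notice"),
    Item("EVENT", "UNKNOWN", None, "Unrelated extra item"),
]
print(match_items(expected, predicted))
```

The number of returned pairs is the "matched" count that feeds precision (divided by predictions) and recall (divided by labels). A greedy pass is simple but order-dependent; an optimal assignment would need something like the Hungarian algorithm.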

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

GACHI-Project/GACHI-AI#6: Both PRs apply the same change to the OpenAI default model in app/config.py (gpt-4o-mini → gpt-4.1-mini).
GACHI-Project/GACHI-AI#5: The evaluation script depends on the NewsletterAnalysisResponse schema, so it is related at the code level to PR #5's request/response schema restructuring and its change to the POST /ai/newsletters/analyze contract.

Suggested labels

feat

Suggested reviewers

Hminkyung

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

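| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly summarizes the main change. It concisely expresses the core feature: adding an evaluation script for school newsletter labeling data. |
| Description check | ✅ Passed | The PR description includes a summary, branch info, checklist, and test results, and mostly follows the template structure. |
| Linked Issues check | ✅ Passed | All major changes in the PR match the requirements of issue #7: adding 30 labeling JSON files, implementing the evaluation script (baseline/OpenAI mode selection), item-level precision/recall/F1 metrics, a mismatch report, control over OpenAI calls, and documentation. |
| Out of Scope Changes check | ✅ Passed | All changes fall within the scope of issue #7: the .gitignore update (for managing label files), the OpenAI default model change (used when running the evaluation script), and the README/docs expansion (documenting the evaluation method). |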


coderabbitai

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

scripts/evaluate_newsletter_labels.py (1)
1-562: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Pipeline failure: ruff formatting required

The ruff format --check step failed in the pipeline. Before merging, run the following command to apply formatting.
ruff format scripts/evaluate_newsletter_labels.py
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/evaluate_newsletter_labels.py` around lines 1 - 562, The file fails
the ruff formatting check; run the project formatter and commit the changes: run
`ruff format` on this module (which will reformat imports, spacing, line breaks
and any style issues in functions like main, load_samples, parse_sample,
evaluate_samples, etc.), verify no remaining formatting errors by re-running
`ruff format --check` (or `ruff check`), and commit the resulting changes before
pushing.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@data/newsletter-labels/newsletter-005.json`:
- Around line 1-98: The file's top-level is an array causing inconsistent
structure; update newsletter-005.json to match others by removing the
surrounding array and make the top-level a single object (the existing object
containing "documentId": "doc_005", "documentTitle", "labels", etc.), ensuring
the document object remains intact so downstream loaders expecting an object
(not an array) will parse it consistently.

In `@data/newsletter-labels/newsletter-014.json`:
- Line 2: The documentId value is formatted inconsistently with the other files
("doc_0014" vs. "doc_008", "doc_009", and so on). Update the "documentId" JSON key in
newsletter-014.json so that its value uses the same number of digits as the other
files in the project (for example "doc_008", "doc_009", "doc_010"), which would
presumably make it "doc_014". Confirm the project's actual convention before
changing it, and apply the same digit-width rule consistently.

In `@data/newsletter-labels/newsletter-019.json`:
- Line 4: The documentDate field in this label is "2024-09-04" which differs
from most other labels (2025–2026); verify whether this is intentional or a typo
and, if it’s a typo, update the documentDate value to the correct year (e.g.,
"2026-09-04") in newsletter-019.json; to validate across the dataset, run the
year-distribution check suggested in the review (the provided script that
extracts .documentDate from all JSON labels) and correct any other inconsistent
dates found.

In `@data/newsletter-labels/newsletter-021.json`:
- Around line 14-18: The labels file has mismatched time granularity:
dateCandidates[].resolved includes time (e.g., "2025-06-19T08:00") while
labels[].date is date-only ("2025-06-19"), and
scripts/evaluate_newsletter_labels.py currently truncates values with value[:10]
when comparing; update the project guideline/schema to state a single canonical
rule (either: resolved may include time but labels[].date must be date-only, or
require both fields to be date-only) and make a corresponding change to the
dataset or docs so they match that rule; reference the keys
dateCandidates[].resolved and labels[].date in newsletter-021.json and the
comparison logic in scripts/evaluate_newsletter_labels.py to ensure consistency
(and, if choosing the date-only rule, normalize resolved to date-only across
files or update the evaluator to parse ISO datetimes consistently).

In `@data/newsletter-labels/newsletter-025.json`:
- Around line 6-19: The dateCandidates IDs do not match the documentId: update
the two entries with "id": "dc_024_1" and "id": "dc_024_2" to "dc_025_1" and
"dc_025_2" to align with "documentId": "doc_025", and also update any other
references to those IDs (the reference mentioned around line 36) so all usages
of dc_024_1/dc_024_2 are replaced with dc_025_1/dc_025_2 (ensure you update both
the dateCandidates array and any pointer/lookup that references those IDs).

In `@data/newsletter-labels/newsletter-027.json`:
- Around line 6-13: The date candidate ID "dc_028_1" is inconsistent with the
documentId "doc_027"; rename the dateCandidates entry id to "dc_027_1" and
update any other occurrences or references of "dc_028_1" in this JSON (e.g., the
reference later in the file) to "dc_027_1" so the ID naming is consistent with
documentId ("doc_027").

In `@data/newsletter-labels/newsletter-029.json`:
- Around line 6-19: The dateCandidates IDs are using the wrong document prefix
("dc_027_1", "dc_027_2") while documentId is "doc_029"; update these to
"dc_029_1" and "dc_029_2" respectively and also update every place that
references those old IDs (the references tied to the same date candidate IDs
elsewhere in this JSON: the three other fields that currently point to
"dc_027_1"/"dc_027_2") so all references match the new "dc_029_*" IDs; ensure
you change the values for the id fields inside dateCandidates and any references
that previously pointed to "dc_027_1" or "dc_027_2".

In `@data/newsletter-labels/README.md`:
- Around line 22-31: Add a concrete example object inside the labels array in
the README schema showing all expected fields so validators and devs know the
shape; update the example JSON under "labels" to include a sample label object
with keys: type, title, evidenceText, selectedDateCandidateId, dateStatus, date,
target, actionRequired, and schoolContext, and ensure their sample values
illustrate formats (e.g., ISO date string for date, enum-like strings for
type/dateStatus) so the schema users can implement and validate correctly.
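
The structural findings above (a top-level array in newsletter-005, the `doc_0014` ID, and `dc_*` prefixes that do not match the document number) could be caught mechanically before review. The following is a minimal sketch, assuming IDs follow `doc_NNN` and `dc_NNN_k` with three digits, which is inferred from the examples in this review rather than from a documented rule. `check_label_ids` is a hypothetical helper, not part of the PR.

```python
import re
from typing import Any, List


def check_label_ids(doc: Any) -> List[str]:
    """Return human-readable problems found in one parsed label document."""
    if not isinstance(doc, dict):
        return ["top level must be a JSON object, got " + type(doc).__name__]

    problems: List[str] = []
    doc_id = str(doc.get("documentId", ""))
    match = re.fullmatch(r"doc_(\d{3})", doc_id)
    if match is None:
        # Without a valid document number the prefix checks cannot run.
        return [f"documentId {doc_id!r} does not match the assumed doc_NNN pattern"]

    prefix = f"dc_{match.group(1)}_"
    candidate_ids = set()
    for cand in doc.get("dateCandidates", []):
        cid = str(cand.get("id", ""))
        candidate_ids.add(cid)
        if not cid.startswith(prefix):
            problems.append(f"date candidate {cid!r} does not start with {prefix!r}")

    # Every label reference must point at a declared date candidate.
    for label in doc.get("labels", []):
        ref = label.get("selectedDateCandidateId")
        if ref is not None and ref not in candidate_ids:
            problems.append(f"label references unknown date candidate {ref!r}")
    return problems


bad = {
    "documentId": "doc_025",
    "dateCandidates": [{"id": "dc_024_1"}],
    "labels": [{"selectedDateCandidateId": "dc_024_1"}],
}
for problem in check_label_ids(bad):
    print(problem)
```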


🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: cee034ea-d0f9-4efe-9940-10367ff32a03

📥 Commits

Reviewing files that changed from the base of the PR and between 6399efe and 55b5261.

📒 Files selected for processing (37)

.gitignore
README.md
app/config.py
data/newsletter-labels/README.md
data/newsletter-labels/newsletter-001.json
data/newsletter-labels/newsletter-002.json
data/newsletter-labels/newsletter-003.json
data/newsletter-labels/newsletter-004.json
data/newsletter-labels/newsletter-005.json
data/newsletter-labels/newsletter-006.json
data/newsletter-labels/newsletter-007.json
data/newsletter-labels/newsletter-008.json
data/newsletter-labels/newsletter-009.json
data/newsletter-labels/newsletter-010.json
data/newsletter-labels/newsletter-011.json
data/newsletter-labels/newsletter-012.json
data/newsletter-labels/newsletter-013.json
data/newsletter-labels/newsletter-014.json
data/newsletter-labels/newsletter-015.json
data/newsletter-labels/newsletter-016.json
data/newsletter-labels/newsletter-017.json
data/newsletter-labels/newsletter-018.json
data/newsletter-labels/newsletter-019.json
data/newsletter-labels/newsletter-020.json
data/newsletter-labels/newsletter-021.json
data/newsletter-labels/newsletter-022.json
data/newsletter-labels/newsletter-023.json
data/newsletter-labels/newsletter-024.json
data/newsletter-labels/newsletter-025.json
data/newsletter-labels/newsletter-026.json
data/newsletter-labels/newsletter-027.json
data/newsletter-labels/newsletter-028.json
data/newsletter-labels/newsletter-029.json
data/newsletter-labels/newsletter-030.json
docs/env.md
docs/newsletter-evaluation.md
scripts/evaluate_newsletter_labels.py

coderabbitai · 2026-05-25T04:00:02Z

+{
+    "documentId": "doc_019",
+    "documentTitle": "학교폭력 예방을 위한 안내문",
+    "documentDate": "2024-09-04",


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

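Please confirm whether the document date is meant to refer to a past year.

documentDate is set to "2024-09-04", so its year differs from most other documents (2025 to 2026). Please confirm whether this is intentional historical data or a typo for 2026-09-04.

Check the year distribution across all label data with the following script:

#!/bin/bash
# Description: check the documentDate year distribution across all newsletter label files
fd -e json . data/newsletter-labels/ -x jq -r '.documentId + ": " + .documentDate' | sort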

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@data/newsletter-labels/newsletter-019.json` at line 4, The documentDate field in this label is "2024-09-04" which differs from most other labels (2025–2026); verify whether this is intentional or a typo and, if it’s a typo, update the documentDate value to the correct year (e.g., "2026-09-04") in newsletter-019.json; to validate across the dataset, run the year-distribution check suggested in the review (the provided script that extracts .documentDate from all JSON labels) and correct any other inconsistent dates found.
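
The same year-distribution check can be done in Python when `fd` and `jq` are not available. This is a rough equivalent, assuming each label file is a JSON object with a `documentDate` string in `YYYY-MM-DD` form; the function names are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Iterable, List


def year_distribution(dates: Iterable[str]) -> Counter:
    """Count documents per year, taking the first four characters as the year."""
    return Counter(d[:4] for d in dates if d)


def load_document_dates(label_dir: str) -> List[str]:
    """Collect documentDate values from newsletter-*.json files in a directory."""
    dates: List[str] = []
    for path in sorted(Path(label_dir).glob("newsletter-*.json")):
        doc = json.loads(path.read_text(encoding="utf-8"))
        # Skip files whose top level is not an object (for example, an array).
        if isinstance(doc, dict) and doc.get("documentDate"):
            dates.append(doc["documentDate"])
    return dates


# Example with inline values instead of reading the repository files.
sample = ["2024-09-04", "2025-06-19", "2025-03-02", "2026-01-05"]
for year, count in sorted(year_distribution(sample).items()):
    print(year, count)
```

Against the repository, `year_distribution(load_document_dates("data/newsletter-labels"))` would surface an outlier year at a glance.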

coderabbitai · 2026-05-25T04:00:02Z

+            "id": "dc_020_2",
+            "raw": "2025.06.19.(목) 08:00~17:30",
+            "resolved": "2025-06-19T08:00",
+            "note": "체험학습 당일 일정"
+        }


🧹 Nitpick | 🔵 Trivial | 💤 Low value

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: find dateCandidates whose resolved field includes a time
fd -e json . data/newsletter-labels/ -x jq -r 'select(.dateCandidates != null) | .documentId + ": " + (.dateCandidates[] | select(.resolved | contains("T")) | .id + " -> " + .resolved)'

Repository: GACHI-Project/GACHI-AI

Length of output: 870

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="data/newsletter-labels/newsletter-021.json"
echo "---- file header (first 80 lines) ----"
nl -ba "$FILE" | sed -n '1,90p'
echo "---- around lines 10-30 ----"
nl -ba "$FILE" | sed -n '10,30p'
echo "---- around lines 40-60 ----"
nl -ba "$FILE" | sed -n '40,70p'
echo "---- occurrences of dc_020_2 ----"
rg -n "dc_020_2" "$FILE" || true

Repository: GACHI-Project/GACHI-AI

Length of output: 166

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE="data/newsletter-labels/newsletter-021.json"
echo "---- lines 1-90 (with line numbers via awk) ----"
awk 'NR>=1 && NR<=90 {printf "%5d:%s\n", NR, $0}' "$FILE"
echo "---- exact lines 14-18 ----"
awk 'NR>=14 && NR<=18 {printf "%5d:%s\n", NR, $0}' "$FILE"
echo "---- exact lines 47-49 ----"
awk 'NR>=47 && NR<=49 {printf "%5d:%s\n", NR, $0}' "$FILE"
echo "---- occurrences of dc_020_2 (with surrounding lines) ----"
rg -n -C 3 "dc_020_2" "$FILE" || true

Repository: GACHI-Project/GACHI-AI

Length of output: 3610

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "---- search for usage of labels.date ----"
rg -n --hidden --glob='!**/node_modules/**' '"date"\s*:\s*' data/newsletter-labels -S || true
rg -n --hidden --glob='!**/node_modules/**' 'labels[^\\n\\r]{0,80}\\.(date|selectedDateCandidateId|dateStatus)' . -S || true
rg -n --hidden --glob='!**/node_modules/**' 'selectedDateCandidateId' . -S || true
rg -n --hidden --glob='!**/node_modules/**' 'dateCandidates' . -S || true
echo "---- search for usage of dateCandidates[].resolved ----"
rg -n --hidden --glob='!**/node_modules/**' 'resolved' . -S || true
echo "---- search for any date parsing/validation patterns ----"
rg -n --hidden --glob='!**/node_modules/**' '(ISO|datetime|T\\d\\d:\\d\\d|parse.*date|날짜|시간|timezone|date\\s*format|strict)' . -S || true

Repository: GACHI-Project/GACHI-AI

Length of output: 46797

🏁 Script executed:

#!/bin/bash
set -euo pipefail
files=(
  "app/schemas.py"
  "scripts/evaluate_newsletter_labels.py"
  "docs/newsletter-labeling-guide.md"
  "app/services/newsletter_prompt.py"
  "app/services/newsletter_extractor.py"
)
for f in "${files[@]}"; do
  echo "---- $f (line count) ----"
  wc -l "$f"
done
echo "---- app/schemas.py: around Date/Label definitions ----"
awk 'NR>=1 && NR<=140 {print NR ":" $0}' app/schemas.py
echo "---- scripts/evaluate_newsletter_labels.py: around where datetime/dateCandidates are normalized (lines ~240-360) ----"
awk 'NR>=160 && NR<=380 {print NR ":" $0}' scripts/evaluate_newsletter_labels.py
echo "---- docs/newsletter-labeling-guide.md: date format rules ----"
awk 'NR>=1 && NR<=120 {print NR ":" $0}' docs/newsletter-labeling-guide.md
echo "---- app/services/newsletter_prompt.py: dateCandidates/date usage hints ----"
awk 'NR>=1 && NR<=180 {print NR ":" $0}' app/services/newsletter_prompt.py

Repository: GACHI-Project/GACHI-AI

Length of output: 20258

🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Show helper date parsing / normalization and matching logic
awk 'NR>=240 && NR<=340 {print NR ":" $0}' scripts/evaluate_newsletter_labels.py
echo "---- find and show _date_part definition ----"
rg -n "def _date_part" scripts/evaluate_newsletter_labels.py
# print around the match
line=$(rg -n "def _date_part" scripts/evaluate_newsletter_labels.py | head -n1 | cut -d: -f1)
start=$((line-20)); end=$((line+80))
awk -v s=$start -v e=$end 'NR>=s && NR<=e {print NR ":" $0}' scripts/evaluate_newsletter_labels.py
echo "---- show _normalize_date and datetime_match logic (match_items) ----"
rg -n "def _normalize_date|datetime_match|_match_accuracy|normalize_date" scripts/evaluate_newsletter_labels.py
# print around key areas roughly
awk 'NR>=380 && NR<=520 {print NR ":" $0}' scripts/evaluate_newsletter_labels.py
echo "---- show prediction_to_label_item which fills expected datetime from raw label ----"
rg -n "def prediction_to_label_item" scripts/evaluate_newsletter_labels.py
line=$(rg -n "def prediction_to_label_item" scripts/evaluate_newsletter_labels.py | head -n1 | cut -d: -f1)
start=$((line-20)); end=$((line+140))
awk -v s=$start -v e=$end 'NR>=s && NR<=e {print NR ":" $0}' scripts/evaluate_newsletter_labels.py
echo "---- show label item -> expected datetime usage when comparing ----"
rg -n "expected\.datetime|predicted.*datetime|datetime_match" scripts/evaluate_newsletter_labels.py
awk 'NR>=520 && NR<=620 {print NR ":" $0}' scripts/evaluate_newsletter_labels.py

Repository: GACHI-Project/GACHI-AI

Length of output: 21876

🏁 Script executed:

#!/bin/bash
set -euo pipefail
awk 'NR>=1 && NR<=140 {print NR ":" $0}' scripts/evaluate_newsletter_labels.py

Repository: GACHI-Project/GACHI-AI

Length of output: 4289

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n "resolved|normalizedDate|datetime|T\\d\\d:\\d\\d|ISO|format.*date" docs scripts app -S

Repository: GACHI-Project/GACHI-AI

Length of output: 3255

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n '"date"\s*:\s*"\d{4}-\d{2}-\d{2}T' data/newsletter-labels -S || true
rg -n '"resolved"\s*:\s*"\d{4}-\d{2}-\d{2}T' data/newsletter-labels -S || true

Repository: GACHI-Project/GACHI-AI

Length of output: 1056

🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n '"date"\s*:\s*"\d{4}-\d{2}-\d{2}T' data/newsletter-labels -S || true

Repository: GACHI-Project/GACHI-AI

Length of output: 48

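Clarify the date/time format rule in newsletter-021.json

dateCandidates[].resolved includes a time, as in "2025-06-19T08:00", while the labels[].date that refers to it holds only the date, as in "2025-06-19" (lines 14-18 vs. 47-49).
However, the evaluation logic (scripts/evaluate_newsletter_labels.py) truncates both resolved and date with value[:10] and does not use the time in comparisons, so this mismatch does not lead to "data loss caused by evaluation".
To reduce confusion, it is recommended that the schema/guide state an explicit rule, such as "resolved may be a string that includes a time, but the label date uses the date only" (or "date-only is also recommended for resolved").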
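
To make the truncation behavior concrete, here is a small sketch contrasting the `value[:10]` prefix approach mentioned above with a stricter ISO-parsing alternative. The function names `date_part`, `date_part_strict`, and `dates_match` are hypothetical and are not taken from the script.

```python
from datetime import datetime
from typing import Optional


def date_part(value: Optional[str]) -> Optional[str]:
    """Prefix truncation: keep the first 10 characters ('YYYY-MM-DD')."""
    return value[:10] if value else None


def date_part_strict(value: Optional[str]) -> Optional[str]:
    """Parse as ISO 8601 and re-emit the date; raises ValueError on malformed input."""
    if not value:
        return None
    return datetime.fromisoformat(value).date().isoformat()


def dates_match(resolved: Optional[str], label_date: Optional[str]) -> bool:
    """Compare a candidate's resolved value and a label date at day granularity."""
    return date_part(resolved) == date_part(label_date)


print(date_part("2025-06-19T08:00"))         # 2025-06-19
print(date_part_strict("2025-06-19T08:00"))  # 2025-06-19
print(dates_match("2025-06-19T08:00", "2025-06-19"))  # True
```

Prefix truncation is tolerant but silently accepts malformed strings such as "2025.06.19", whereas the strict variant would reject them. Which behavior is preferable depends on how tightly the label guide constrains the format.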

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@data/newsletter-labels/newsletter-021.json` around lines 14 - 18, The labels file has mismatched time granularity: dateCandidates[].resolved includes time (e.g., "2025-06-19T08:00") while labels[].date is date-only ("2025-06-19"), and scripts/evaluate_newsletter_labels.py currently truncates values with value[:10] when comparing; update the project guideline/schema to state a single canonical rule (either: resolved may include time but labels[].date must be date-only, or require both fields to be date-only) and make a corresponding change to the dataset or docs so they match that rule; reference the keys dateCandidates[].resolved and labels[].date in newsletter-021.json and the comparison logic in scripts/evaluate_newsletter_labels.py to ensure consistency (and, if choosing the date-only rule, normalize resolved to date-only across files or update the evaluator to parse ISO datetimes consistently).

Hminkyung

Review complete!!! Great work~~

feat: Add school newsletter labeling data evaluation script

55b5261

deli-minju self-assigned this May 25, 2026

deli-minju added the feat (new feature work) label May 25, 2026

coderabbitai Bot reviewed May 25, 2026

fix: Clean up school newsletter label data format

01ba340

deli-minju requested a review from Hminkyung May 25, 2026 04:07

Hminkyung approved these changes May 25, 2026

deli-minju merged commit 64c91cd into develop May 25, 2026
3 checks passed

deli-minju deleted the feat/#7-newsletter-label-eval branch May 25, 2026 04:21

deli-minju mentioned this pull request May 25, 2026

[TASK] Release deployment: develop → main #10

Closed

6 tasks

feat: Add school newsletter labeling data evaluation script #8
deli-minju merged 2 commits into develop from feat/#7-newsletter-label-eval
