[Feat] : Add cron job for Korean translation of J-POP songs (#186) #187

Merged
GulSam00 merged 3 commits into develop from feat/186-translateSongKo on Apr 10, 2026

Conversation

@GulSam00 (Owner) commented Apr 10, 2026

User description

📌 PR Title

[Feat] : Add cron job for Korean translation of J-POP songs

📌 Changes

  • Add scripts that translate J-POP song titles and artists into Korean (translationJpn.ts, translateJpnToKo.ts)
    • Translate Japanese → Korean using OpenAI gpt-4o-mini
    • Skip translation for songs that contain no Japanese (e.g. English)
    • Skip songs that are already translated
  • Add Supabase DB functions
    • getJpopSongsForTranslationDB(): fetch songs with tag_id=101 (J-POP)
    • updateSongKoTranslationDB(): update title_ko and artist_ko
  • Add GitHub Actions workflow (translation_jpn.yml): runs automatically every day
  • Re-enable the tagging cron job (tagging_song.yml)
  • Add the pnpm trans-jpn script command
  • Update the CLAUDE.md documentation

💬 Additional Notes


PR Type

Enhancement


Description

  • Add J-POP song Korean translation cron job using OpenAI GPT-4o-mini

  • Implement database functions for querying and updating song translations

  • Create GitHub Actions workflow for daily automatic translation execution

  • Re-enable tagging songs workflow with updated schedule

  • Add pnpm trans-jpn script command and documentation
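
The skip rules described above (already translated, no Japanese text) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `SongRow` shape, the `decide` helper, and the body of `containsJapanese` are assumptions. Only the function name `containsJapanese` and the column names appear in the PR.

```typescript
// Hypothetical row shape; field names follow the columns named in the PR.
interface SongRow {
  id: string;
  title: string;
  artist: string;
  title_ko: string | null;
  artist_ko: string | null;
}

type Decision = 'skip-already-translated' | 'skip-non-japanese' | 'translate';

// Assumed implementation: matches Hiragana, Katakana, and CJK ideographs.
// Kanji-only text cannot be told apart from Chinese with this check.
function containsJapanese(text: string): boolean {
  return /[\u3040-\u30ff\u4e00-\u9fff]/.test(text);
}

// Decide what the cron loop should do with one song.
function decide(song: SongRow): Decision {
  if (song.title_ko && song.artist_ko) return 'skip-already-translated';
  if (!containsJapanese(song.title) && !containsJapanese(song.artist)) {
    return 'skip-non-japanese';
  }
  return 'translate';
}
```

Keeping the two skip reasons as distinct return values makes it straightforward to count them separately later.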


Diagram Walkthrough

flowchart LR
  A["J-POP Songs<br/>tag_id=101"] -->|getJpopSongsForTranslationDB| B["Translation Script<br/>translationJpn.ts"]
  B -->|containsJapanese check| C["Filter Japanese<br/>Content"]
  C -->|translateJpnToKo| D["OpenAI GPT-4o-mini<br/>Translation"]
  D -->|updateSongKoTranslationDB| E["Update DB<br/>title_ko, artist_ko"]
  F["GitHub Actions<br/>translation_jpn.yml"] -->|Daily 14:00 UTC| B

File Walkthrough

Relevant files

Enhancement (4 files)
  • translationJpn.ts: New J-POP song translation cron script (+75/-0)
  • translateJpnToKo.ts: OpenAI-based Japanese to Korean translation utility (+55/-0)
  • getDB.ts: Add J-POP songs query function for translation (+14/-0)
  • updateDB.ts: Add Korean translation update database function (+16/-0)

Configuration changes (3 files)
  • translation_jpn.yml: New GitHub Actions workflow for J-POP translation (+43/-0)
  • tagging_song.yml: Re-enable tagging workflow with updated schedule (+3/-3)
  • package.json: Add trans-jpn npm script command (+1/-0)

Miscellaneous (2 files)
  • taggingSongs.ts: Remove test comment from tagging script (+0/-1)
  • sitemap-0.xml: Update sitemap timestamp (+1/-1)

Documentation (1 file)
  • CLAUDE.md: Update documentation with translation workflow info (+8/-6)

Formatting (1 file)
  • getSongTag.ts: Reorder import statements for consistency (+1/-1)

GulSam00 and others added 2 commits April 11, 2026 01:36
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@GulSam00
Owner Author

/describe

@vercel
Contributor

vercel Bot commented Apr 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project  | Deployment | Actions          | Updated (UTC)       |
| -------- | ---------- | ---------------- | ------------------- |
| singcode | Ready      | Preview, Comment | Apr 10, 2026 4:47pm |

@GulSam00
Owner Author

/review

@GulSam00
Owner Author

/improve

@qodo-code-review

qodo-code-review Bot commented Apr 10, 2026

Code Review by Qodo

🐞 Bugs (8)   📘 Rule violations (0)   📎 Requirement gaps (3)   🎨 UX Issues (0)
🐞 ≡ Correctness (1) ☼ Reliability (1) ⛨ Security (1) ⚙ Maintainability (3) ➹ Performance (1) ◔ Observability (1) ⭐ New (4)
📎 ≡ Correctness (1) ➹ Performance (2) ⭐ New (2)


Action required

1. translationJpn.ts doesn’t update DB 📎
Description
The new translation cron computes Korean translations but never persists them because the DB update
call is commented out. This fails the requirement to write translated values back to
songs.title_ko/songs.artist_ko.
Code

packages/crawling/src/cron/translationJpn.ts[R50-57]

+    console.log('result : ', result);
+    // const success = await updateSongKoTranslationDB(song.id, result.title_ko, result.artist_ko);
+    // if (success) {
+    //   resultsLog.success++;
+    //   console.log(`[OK] ${song.title} → ${result.title_ko} / ${song.artist} → ${result.artist_ko}`);
+    // } else {
+    //   resultsLog.failed++;
+    // }
Evidence
PR Compliance ID 1 requires persisting translated title_ko/artist_ko back to the Song table, but
the only update call is commented out in the new cron script.

Add translation script to update Song.title_ko and Song.artist_ko for missing values
packages/crawling/src/cron/translationJpn.ts[50-57]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`translationJpn.ts` translates titles/artists but does not write results back to Supabase because `updateSongKoTranslationDB(...)` is commented out.

## Issue Context
Compliance requires the script to persist translated values into `songs.title_ko` and `songs.artist_ko`.

## Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[41-57]



2. J-POP query lacks ko filter 📎
Description
getJpopSongsForTranslationDB() fetches all J-POP songs by tag without filtering to only rows
missing title_ko or artist_ko, then relies on in-script skipping. This violates the requirement
to efficiently target only records needing backfill and may cause unnecessary reads/work.
Code

packages/crawling/src/supabase/getDB.ts[R102-110]

+export async function getJpopSongsForTranslationDB() {
+  const supabase = getClient();
+
+  const { data, error } = await supabase
+    .from('songs')
+    .select('id, title, artist, title_ko, artist_ko, song_tags!inner(tag_id)')
+    .eq('song_tags.tag_id', 101)
+    .limit(50000);
+
Evidence
PR Compliance ID 2 requires querying only Song rows where title_ko or artist_ko is empty. The
added DB query filters only by song_tags.tag_id = 101 and does not add any title_ko/artist_ko
missing/empty filter; the cron script then skips already-translated rows after fetching them.

Translation script only targets Song records missing title_ko or artist_ko
packages/crawling/src/supabase/getDB.ts[102-114]
packages/crawling/src/cron/translationJpn.ts[18-39]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`getJpopSongsForTranslationDB()` currently selects all J-POP songs (tag_id=101) and relies on application-side skipping. Compliance requires filtering at the DB query level to only return rows where `title_ko` OR `artist_ko` is missing/empty.
## Issue Context
The cron job should efficiently backfill only needed records; fetching already-translated rows increases read volume and processing time.
## Fix Focus Areas
- packages/crawling/src/supabase/getDB.ts[102-114]
- packages/crawling/src/cron/translationJpn.ts[18-39]
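
One way to push the "missing translation" condition into the query is sketched below. It relies on the `or(filters)` method that supabase-js filter builders expose, which takes a PostgREST filter string. The `OrFilterable` interface, `MISSING_KO_FILTER`, and `onlyMissingKo` are hypothetical names introduced for illustration, and the sketch covers only NULL columns, not empty strings.

```typescript
// Minimal structural type for the single builder method used here.
// Any object with a compatible `or(filters)` method fits, including
// (by assumption) the builder returned by supabase.from(...).select(...).
interface OrFilterable<T> {
  or(filters: string): T;
}

// PostgREST filter: rows where either Korean column is NULL.
// Empty-string values are NOT matched by this filter.
const MISSING_KO_FILTER = 'title_ko.is.null,artist_ko.is.null';

// Narrow a songs query to rows that still need a Korean translation.
function onlyMissingKo<T>(query: OrFilterable<T>): T {
  return query.or(MISSING_KO_FILTER);
}

// Assumed usage inside getJpopSongsForTranslationDB():
//   onlyMissingKo(
//     supabase.from('songs').select('...').eq('song_tags.tag_id', 101),
//   ).limit(50000);
```

With the filter applied in the database, the in-script "already translated" check becomes a safety net instead of the primary filter.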



3. Translation not persisted 🐞
Description
translationJpn.ts generates translations but never updates songs.title_ko/artist_ko because the
DB update call is commented out, so the scheduled workflow does no effective work. This also
guarantees resultsLog.success remains 0, making the job output misleading.
Code

packages/crawling/src/cron/translationJpn.ts[R50-57]

+    console.log('result : ', result);
+    // const success = await updateSongKoTranslationDB(song.id, result.title_ko, result.artist_ko);
+    // if (success) {
+    //   resultsLog.success++;
+    //   console.log(`[OK] ${song.title} → ${result.title_ko} / ${song.artist} → ${result.artist_ko}`);
+    // } else {
+    //   resultsLog.failed++;
+    // }
Evidence
The cron script imports updateSongKoTranslationDB but the call and the success/failure accounting
are commented out, so no persistence happens. The update function itself is implemented and returns
a boolean, indicating the intended behavior is to write back to DB.

packages/crawling/src/cron/translationJpn.ts[1-3]
packages/crawling/src/cron/translationJpn.ts[50-57]
packages/crawling/src/supabase/updateDB.ts[44-58]
.github/workflows/translation_jpn.yml[41-43]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`packages/crawling/src/cron/translationJpn.ts` computes a translation result but never writes it to Supabase because the update logic is commented out.
### Issue Context
- The workflow `.github/workflows/translation_jpn.yml` runs `pnpm run trans-jpn` daily.
- `updateSongKoTranslationDB()` already exists and returns `true/false`.
### Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[50-57]
- packages/crawling/src/supabase/updateDB.ts[44-58]
### What to change
- Uncomment (or re-implement) the `updateSongKoTranslationDB(song.id, ...)` call.
- Increment `resultsLog.success` on successful update, and `resultsLog.failed` otherwise.
- Ensure the final summary reflects actual counts.



4. Delay/cap bypass on failure 🐞
Description
When translateJpnToKo returns null, the loop continues before processedCount++ and the 200ms
delay, so both the 5,000-item cap and rate-limit protection are skipped for failures. This can cause
the job to hammer OpenAI rapidly and exceed intended per-run limits.
Code

packages/crawling/src/cron/translationJpn.ts[R44-67]

+    if (!result) {
+      resultsLog.failed++;
+      console.log(`[FAIL] ${song.title} - ${song.artist}: 번역 실패`);
+      continue;
+    }
+
+    console.log('result : ', result);
+    // const success = await updateSongKoTranslationDB(song.id, result.title_ko, result.artist_ko);
+    // if (success) {
+    //   resultsLog.success++;
+    //   console.log(`[OK] ${song.title} → ${result.title_ko} / ${song.artist} → ${result.artist_ko}`);
+    // } else {
+    //   resultsLog.failed++;
+    // }
+  } catch (error) {
+    resultsLog.failed++;
+    console.error(`[ERROR] ${song.title} - ${song.artist}:`, error);
+  }
+
+  processedCount++;
+
+  // OpenAI rate limit 대비 딜레이
+  await new Promise(resolve => setTimeout(resolve, 200));
+}
Evidence
The failure branch exits the iteration early via continue, skipping the counter increment and delay that occur
later in the loop body. As a result, repeated failures can lead to far more than 5,000 OpenAI
calls and effectively no throttling on the failure path.

packages/crawling/src/cron/translationJpn.ts[23-26]
packages/crawling/src/cron/translationJpn.ts[41-48]
packages/crawling/src/cron/translationJpn.ts[63-67]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
On `!result`, the code `continue`s before `processedCount++` and the 200ms delay, bypassing both the per-run limit and throttling on failure.
### Issue Context
This job calls OpenAI and should throttle consistently regardless of success/failure to avoid rate limits and unexpected cost.
### Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[41-48]
- packages/crawling/src/cron/translationJpn.ts[63-67]
### What to change
- Restructure the loop so that **every OpenAI attempt** increments `processedCount` and awaits the delay.
- Example: move `processedCount++` and the delay into a `finally` block that runs for all non-skip paths.
- Alternatively: increment/count and delay immediately after the OpenAI call, before any `continue`.
- Keep skip paths (already-translated / no-Japanese) fast (no delay) if desired, but ensure failure paths are throttled.
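
The `finally`-based restructuring suggested above might look like the sketch below. It is a generic illustration under stated assumptions: `runThrottled`, `RunStats`, and the injected `translate` / `onResult` callbacks are hypothetical stand-ins for the script's loop, `translateJpnToKo`, and `updateSongKoTranslationDB`. The defaults of 5,000 attempts and 200 ms come from the review text.

```typescript
type Translate<T, R> = (item: T) => Promise<R | null>;

interface RunStats {
  attempts: number;
  success: number;
  failed: number;
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Every translation attempt counts toward the cap and is followed by a delay,
// whether it succeeds, returns null, or throws.
async function runThrottled<T, R>(
  items: T[],
  translate: Translate<T, R>,
  onResult: (item: T, result: R) => Promise<boolean>,
  cap = 5000,
  delayMs = 200,
): Promise<RunStats> {
  const stats: RunStats = { attempts: 0, success: 0, failed: 0 };
  for (const item of items) {
    if (stats.attempts >= cap) break;
    try {
      const result = await translate(item);
      if (result === null) {
        stats.failed++;
        continue; // the `finally` block below still runs
      }
      if (await onResult(item, result)) stats.success++;
      else stats.failed++;
    } catch {
      stats.failed++;
    } finally {
      // Runs on success, on `continue`, and after `catch`.
      stats.attempts++;
      await sleep(delayMs);
    }
  }
  return stats;
}
```

Skip paths (already translated, no Japanese) would sit before the `try` and use a plain `continue`, so they stay fast and do not consume the cap.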




Remediation recommended

5. getJpopSongsForTranslationDB() lacks missing filter 📎
Description
The new DB query pulls all J-POP songs (tag 101) without filtering to rows missing title_ko and/or
artist_ko. This does not meet the requirement to specifically query songs with missing Korean
translation fields.
Code

packages/crawling/src/supabase/getDB.ts[R102-110]

+export async function getJpopSongsForTranslationDB() {
+  const supabase = getClient();
+
+  const { data, error } = await supabase
+    .from('songs')
+    .select('id, title, artist, title_ko, artist_ko, song_tags!inner(tag_id)')
+    .eq('song_tags.tag_id', 101)
+    .limit(50000);
+
Evidence
PR Compliance ID 1 requires querying Song rows where title_ko or artist_ko is missing/empty, but
the added query only filters by tag and limit.

Add translation script to update Song.title_ko and Song.artist_ko for missing values
packages/crawling/src/supabase/getDB.ts[102-110]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`getJpopSongsForTranslationDB()` currently selects all J-POP songs and relies on in-code skipping, instead of querying only rows with missing/empty `title_ko` and/or `artist_ko`.

## Issue Context
Compliance requires the script to query songs missing Korean translation values.

## Fix Focus Areas
- packages/crawling/src/supabase/getDB.ts[102-110]



6. Skipped count mislabeled 🐞
Description
translationJpn.ts increments resultsLog.skipped for both "already translated" and "non-Japanese"
skip cases, but the final report labels all skipped songs as "이미 번역됨" ("already translated"), producing incorrect
operational metrics. This makes it impossible to tell whether skipping is due to prior translations
or due to language filtering.
Code

packages/crawling/src/cron/translationJpn.ts[R69-75]

+// 결과 출력
+console.log(`
+  총 ${songs.length}곡 중:
+  - 스킵 (이미 번역됨): ${resultsLog.skipped}곡
+  - 성공: ${resultsLog.success}곡
+  - 실패: ${resultsLog.failed}곡
+`);
Evidence
Two distinct skip branches both increment resultsLog.skipped, but the final summary prints a
single label implying only one reason (already translated).

packages/crawling/src/cron/translationJpn.ts[27-39]
packages/crawling/src/cron/translationJpn.ts[69-75]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`resultsLog.skipped` merges multiple skip reasons, but the final log labels them only as "already translated".

### Issue Context
This script is intended to run unattended (GitHub Actions), so accurate counters are important for monitoring.

### Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[5-9]
- packages/crawling/src/cron/translationJpn.ts[27-39]
- packages/crawling/src/cron/translationJpn.ts[69-75]

### What to change
- Replace `skipped` with at least two counters (e.g., `skippedAlreadyTranslated`, `skippedNonJapanese`).
- Update the final summary output to print both counts with correct labels.
- (Optional) keep `skippedTotal` as a derived value if desired.
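
A possible shape for the split counters and the summary is sketched here. The counter names follow the suggestion above. The `formatSummary` helper and the English labels are assumptions, not the script's actual output format.

```typescript
interface ResultsLog {
  skippedAlreadyTranslated: number;
  skippedNonJapanese: number;
  success: number;
  failed: number;
}

// Build the end-of-run summary with one line per skip reason.
function formatSummary(total: number, log: ResultsLog): string {
  // Derived value, kept only for convenience when scanning the log.
  const skippedTotal = log.skippedAlreadyTranslated + log.skippedNonJapanese;
  return [
    `Total songs: ${total}`,
    `- skipped (already translated): ${log.skippedAlreadyTranslated}`,
    `- skipped (no Japanese text): ${log.skippedNonJapanese}`,
    `- skipped (total): ${skippedTotal}`,
    `- success: ${log.success}`,
    `- failed: ${log.failed}`,
  ].join('\n');
}
```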



7. Overprivileged workflow token 🐞
Description
translation_jpn.yml grants contents: write permissions even though the job only checks out code,
installs dependencies, writes a local .env, and runs a script; no repository writes are performed.
Unnecessary write permissions increase the blast radius if a dependency or script is compromised.
Code

.github/workflows/translation_jpn.yml[R8-10]

+permissions:
+  contents: write
+
Evidence
The workflow explicitly requests write access to repository contents, but its steps do not include
any commit/push/release operations that require that permission.

.github/workflows/translation_jpn.yml[8-43]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`translation_jpn.yml` requests `contents: write` but does not write to the repo. This is unnecessary privilege.

### Issue Context
GitHub Actions permissions should follow least-privilege to reduce impact of supply-chain or script compromise.

### Fix Focus Areas
- .github/workflows/translation_jpn.yml[8-10]

### What to change
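- Set `permissions: { contents: read }` explicitly. Removing the `permissions:` block is an alternative only if the repository's default token permissions are already read-only, since the job then falls back to that default.
- Ensure the workflow still runs successfully after tightening permissions.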



8. Doc schedule mismatch 🐞
Description
tagging_song.yml was changed to run at 10:00 UTC, but packages/crawling/CLAUDE.md still documents it
as running at 14:00 UTC. This will mislead anyone debugging cron timing or operational expectations.
Code

.github/workflows/tagging_song.yml[R3-6]

on:
-  # schedule:
-  #   - cron: "0 14 * * *" # 한국 시간 23:00 실행 (UTC+9 → UTC 14:00)
-  # workflow_dispatch:
+  schedule:
+    - cron: "0 10 * * *" # 한국 시간 19:00 실행 (UTC+9 → UTC 10:00)
+  workflow_dispatch:
Evidence
The workflow cron is 0 10 * * * (10:00 UTC), but the documentation table still states the tagging
workflow runs daily at 14:00 UTC.

.github/workflows/tagging_song.yml[3-6]
packages/crawling/CLAUDE.md[120-128]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Workflow schedule changed, but CLAUDE.md still documents the old schedule for `tagging_song.yml`.

### Issue Context
Operators will rely on CLAUDE.md to understand when cron workflows run.

### Fix Focus Areas
- .github/workflows/tagging_song.yml[4-6]
- packages/crawling/CLAUDE.md[122-126]

### What to change
- Update the CLAUDE.md workflow table row for `tagging_song.yml` to reflect `매일 10:00` ("daily 10:00", UTC) instead of `매일 14:00` ("daily 14:00").
- Double-check other rows are still accurate after this PR’s schedule changes.
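
The UTC-to-KST arithmetic behind the cron comments can be checked with a small helper. This is only a sketch: `cronUtcHour` and `utcHourToKst` are hypothetical names, and the parser handles only five-field expressions with a fixed numeric hour, such as the two schedules quoted above.

```typescript
const KST_OFFSET_HOURS = 9; // Korea Standard Time is UTC+9 with no DST

// Convert an hour of day in UTC to the hour of day in KST.
function utcHourToKst(utcHour: number): number {
  return (utcHour + KST_OFFSET_HOURS) % 24;
}

// Extract the hour field from a five-field cron expression.
// Wildcards, ranges, and lists in the hour field are rejected.
function cronUtcHour(cron: string): number {
  const fields = cron.trim().split(/\s+/);
  const hour = Number(fields[1]);
  if (fields.length !== 5 || !Number.isInteger(hour) || hour < 0 || hour > 23) {
    throw new Error(`unsupported cron expression: ${cron}`);
  }
  return hour;
}
```

For example, `utcHourToKst(cronUtcHour('0 10 * * *'))` evaluates to 19 and the old `'0 14 * * *'` schedule maps to 23, matching the comments in the workflow diff.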



9. Tagging schedule docs mismatch 🐞
Description
CLAUDE.md states tagging_song.yml runs daily at 14:00 UTC, but the workflow cron was changed to
10:00 UTC, so the documentation is now wrong and will mislead operators.
Code

packages/crawling/CLAUDE.md[R124-126]

+| `crawl_recent_tj.yml`   | 매일 14:00   | `pnpm recent-tj`  |
+| `tagging_song.yml`      | 매일 14:00   | `pnpm tag-songs`  |
+| `translation_jpn.yml`   | 매일 14:00   | `pnpm trans-jpn`  |
Evidence
The documentation table lists tagging_song.yml as 14:00 UTC, while the actual workflow cron is `0
10 * * *` (10:00 UTC).

packages/crawling/CLAUDE.md[122-126]
.github/workflows/tagging_song.yml[4-6]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`packages/crawling/CLAUDE.md` lists an outdated schedule for `tagging_song.yml`.
### Issue Context
The workflow cron was updated to 10:00 UTC but the table still says 14:00 UTC.
### Fix Focus Areas
- packages/crawling/CLAUDE.md[122-126]
- .github/workflows/tagging_song.yml[4-6]
### What to change
- Update the `tagging_song.yml` row to `매일 10:00` ("daily 10:00", UTC) or revert the workflow cron to match the doc. Pick one source of truth and make both consistent.




Advisory comments

10. Excessive per-song logging 🐞
Description
translationJpn.ts logs the full song object and translation result for every processed item, which
can flood GitHub Actions logs and slow execution when translating up to 5000 songs. This makes it
harder to find actual failures and can cause log truncation.
Code

packages/crawling/src/cron/translationJpn.ts[27]

+  console.log('song : ', song);
Evidence
The script prints verbose logs inside the main loop for each song (and again for each translation
result). At the configured upper bound (5000 processed songs), this can generate a very large volume
of logs.

packages/crawling/src/cron/translationJpn.ts[23-67]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Per-item `console.log` calls inside the translation loop can generate extremely large GitHub Actions logs.

### Issue Context
This script is designed to run on a schedule; logs should be high-signal (errors + periodic progress + summary).

### Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[23-67]

### What to change
- Remove `console.log('song :', song)` and `console.log('result :', result)` or replace with concise logs (e.g., song id + title).
- Optionally log progress every N items (e.g., every 50/100 translations).
- Keep error logs and the final summary.
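
Periodic progress logging, as suggested above, could be as small as the following sketch. Both helpers and the `[PROGRESS]` line format are hypothetical. The interval of 100 is one of the example values mentioned in the review.

```typescript
// True on every `every`-th processed item (100, 200, ...), never at 0.
function shouldLogProgress(processed: number, every = 100): boolean {
  return processed > 0 && processed % every === 0;
}

// Concise one-line progress message in place of dumping whole objects.
function progressLine(processed: number, total: number, failed: number): string {
  return `[PROGRESS] ${processed}/${total} processed, ${failed} failed`;
}

// Assumed usage inside the loop:
//   if (shouldLogProgress(processedCount)) {
//     console.log(progressLine(processedCount, songs.length, resultsLog.failed));
//   }
```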



11. Generated sitemap timestamp committed 🐞
Description
sitemap-0.xml was changed only to update the lastmod timestamp, which is likely output from
next-sitemap (runs on postbuild) and can create noisy diffs/merge conflicts. If sitemap files
are meant to be generated during build/deploy, this file should not be manually updated in PRs.
Code

apps/web/public/sitemap-0.xml[3]

+<url><loc>https://www.singcode.kr</loc><lastmod>2026-04-10T16:34:54.763Z</lastmod><changefreq>weekly</changefreq><priority>0.7</priority></url>
Evidence
apps/web runs next-sitemap in postbuild, which typically generates/updates sitemap files under
public/. The only change here is a timestamp bump in lastmod.

apps/web/package.json[6-13]
apps/web/public/sitemap-0.xml[1-4]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A sitemap file appears to have been regenerated (timestamp-only change), creating noisy diffs.
### Issue Context
`apps/web/package.json` runs `next-sitemap` on `postbuild`, which commonly generates these files.
### Fix Focus Areas
- apps/web/public/sitemap-0.xml[1-4]
- apps/web/package.json[6-13]
### What to change
- If sitemaps are generated in CI/CD: revert the sitemap file change from this PR and ensure generation happens in the deployment pipeline.
- If sitemaps must be committed: ensure updates are intentional (and consider a deterministic `lastmod` strategy if possible) and keep file formatting consistent (including newline at EOF).



ⓘ The new review experience is currently in Beta. Learn more

@qodo-code-review

qodo-code-review Bot commented Apr 10, 2026

Code Review by Qodo

New Review Started

This review has been superseded by a new analysis

@qodo-code-review

Review Summary by Qodo

Add J-POP song Korean translation automation with OpenAI integration

✨ Enhancement

Walkthroughs

Description
• Add J-POP song Korean translation cron job using OpenAI GPT-4o-mini
  - Translates Japanese song titles and artist names to Korean
  - Skips already-translated and non-Japanese songs
• Create new database functions for J-POP song retrieval and translation updates
• Add GitHub Actions workflow for daily automatic translation execution
• Re-enable tagging song cron workflow with updated schedule
• Add pnpm trans-jpn script command and documentation
Diagram
flowchart LR
  A["J-POP Songs<br/>tag_id=101"] -->|getJpopSongsForTranslationDB| B["Translation Script<br/>translationJpn.ts"]
  B -->|Check Japanese<br/>& Skip Logic| C["OpenAI GPT-4o-mini<br/>translateJpnToKo"]
  C -->|title_ko, artist_ko| D["Update Database<br/>updateSongKoTranslationDB"]
  E["GitHub Actions<br/>translation_jpn.yml"] -->|Daily 14:00 UTC| B

File Changes

1. packages/crawling/src/cron/translationJpn.ts ✨ Enhancement +75/-0

New J-POP song Korean translation cron script

packages/crawling/src/cron/translationJpn.ts


2. packages/crawling/src/utils/translateJpnToKo.ts ✨ Enhancement +55/-0

OpenAI-based Japanese to Korean translation utility

packages/crawling/src/utils/translateJpnToKo.ts


3. packages/crawling/src/supabase/getDB.ts ✨ Enhancement +14/-0

Add J-POP songs query function for translation

packages/crawling/src/supabase/getDB.ts


4. packages/crawling/src/supabase/updateDB.ts ✨ Enhancement +16/-0

Add Korean translation database update function

packages/crawling/src/supabase/updateDB.ts


5. .github/workflows/translation_jpn.yml ⚙️ Configuration changes +43/-0

New GitHub Actions workflow for daily translation

.github/workflows/translation_jpn.yml


6. .github/workflows/tagging_song.yml ⚙️ Configuration changes +3/-3

Re-enable tagging workflow with updated schedule

.github/workflows/tagging_song.yml


7. packages/crawling/package.json ⚙️ Configuration changes +1/-0

Add trans-jpn npm script command

packages/crawling/package.json


8. packages/crawling/CLAUDE.md 📝 Documentation +8/-6

Document new translation script and workflow

packages/crawling/CLAUDE.md


9. packages/crawling/src/cron/taggingSongs.ts Miscellaneous +0/-1

Remove test comment from tagging script

packages/crawling/src/cron/taggingSongs.ts


10. packages/crawling/src/utils/getSongTag.ts Formatting +1/-1

Reorder import statements for consistency

packages/crawling/src/utils/getSongTag.ts


11. apps/web/public/sitemap-0.xml Miscellaneous +1/-1

Update sitemap timestamp

apps/web/public/sitemap-0.xml


@qodo-code-review

qodo-code-review Bot commented Apr 10, 2026

Code Review by Qodo

🐞 Bugs (5)   📘 Rule violations (0)   📎 Requirement gaps (1)   🎨 UX Issues (0)
🐞 ≡ Correctness (1) ☼ Reliability (1) ⛨ Security (1) ⚙ Maintainability (2)
📎 ➹ Performance (1)


Action required

1. J-POP query lacks ko filter 📎
Description
getJpopSongsForTranslationDB() fetches all J-POP songs by tag without filtering to only rows
missing title_ko or artist_ko, then relies on in-script skipping. This violates the requirement
to efficiently target only records needing backfill and may cause unnecessary reads/work.
Code

packages/crawling/src/supabase/getDB.ts[R102-110]

+export async function getJpopSongsForTranslationDB() {
+  const supabase = getClient();
+
+  const { data, error } = await supabase
+    .from('songs')
+    .select('id, title, artist, title_ko, artist_ko, song_tags!inner(tag_id)')
+    .eq('song_tags.tag_id', 101)
+    .limit(50000);
+
Evidence
PR Compliance ID 2 requires querying only Song rows where title_ko or artist_ko is empty. The
added DB query filters only by song_tags.tag_id = 101 and does not add any title_ko/artist_ko
missing/empty filter; the cron script then skips already-translated rows after fetching them.

Translation script only targets Song records missing title_ko or artist_ko
packages/crawling/src/supabase/getDB.ts[102-114]
packages/crawling/src/cron/translationJpn.ts[18-39]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`getJpopSongsForTranslationDB()` currently selects all J-POP songs (tag_id=101) and relies on application-side skipping. Compliance requires filtering at the DB query level to only return rows where `title_ko` OR `artist_ko` is missing/empty.

## Issue Context
The cron job should efficiently backfill only needed records; fetching already-translated rows increases read volume and processing time.

## Fix Focus Areas
- packages/crawling/src/supabase/getDB.ts[102-114]
- packages/crawling/src/cron/translationJpn.ts[18-39]



2. Translation not persisted 🐞
Description
translationJpn.ts generates translations but never updates songs.title_ko/artist_ko because the
DB update call is commented out, so the scheduled workflow does no effective work. This also
guarantees resultsLog.success remains 0, making the job output misleading.
Code

packages/crawling/src/cron/translationJpn.ts[R50-57]

+    console.log('result : ', result);
+    // const success = await updateSongKoTranslationDB(song.id, result.title_ko, result.artist_ko);
+    // if (success) {
+    //   resultsLog.success++;
+    //   console.log(`[OK] ${song.title} → ${result.title_ko} / ${song.artist} → ${result.artist_ko}`);
+    // } else {
+    //   resultsLog.failed++;
+    // }
Evidence
The cron script imports updateSongKoTranslationDB but the call and the success/failure accounting
are commented out, so no persistence happens. The update function itself is implemented and returns
a boolean, indicating the intended behavior is to write back to DB.

packages/crawling/src/cron/translationJpn.ts[1-3]
packages/crawling/src/cron/translationJpn.ts[50-57]
packages/crawling/src/supabase/updateDB.ts[44-58]
.github/workflows/translation_jpn.yml[41-43]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`packages/crawling/src/cron/translationJpn.ts` computes a translation result but never writes it to Supabase because the update logic is commented out.

### Issue Context
- The workflow `.github/workflows/translation_jpn.yml` runs `pnpm run trans-jpn` daily.
- `updateSongKoTranslationDB()` already exists and returns `true/false`.

### Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[50-57]
- packages/crawling/src/supabase/updateDB.ts[44-58]

### What to change
- Uncomment (or re-implement) the `updateSongKoTranslationDB(song.id, ...)` call.
- Increment `resultsLog.success` on successful update, and `resultsLog.failed` otherwise.
- Ensure the final summary reflects actual counts.



3. Delay/cap bypass on failure 🐞
Description
When translateJpnToKo returns null, the loop continues before processedCount++ and the 200ms
delay, so both the 5,000-item cap and rate-limit protection are skipped for failures. This can cause
the job to hammer OpenAI rapidly and exceed intended per-run limits.
Code

packages/crawling/src/cron/translationJpn.ts[R44-67]

+    if (!result) {
+      resultsLog.failed++;
+      console.log(`[FAIL] ${song.title} - ${song.artist}: 번역 실패`);
+      continue;
+    }
+
+    console.log('result : ', result);
+    // const success = await updateSongKoTranslationDB(song.id, result.title_ko, result.artist_ko);
+    // if (success) {
+    //   resultsLog.success++;
+    //   console.log(`[OK] ${song.title} → ${result.title_ko} / ${song.artist} → ${result.artist_ko}`);
+    // } else {
+    //   resultsLog.failed++;
+    // }
+  } catch (error) {
+    resultsLog.failed++;
+    console.error(`[ERROR] ${song.title} - ${song.artist}:`, error);
+  }
+
+  processedCount++;
+
+  // OpenAI rate limit 대비 딜레이
+  await new Promise(resolve => setTimeout(resolve, 200));
+}
Evidence
The failure branch exits the iteration early via continue, skipping the counter increment and delay that occur
later in the loop body. As a result, repeated failures can lead to far more than 5,000 OpenAI
calls and effectively no throttling on the failure path.

packages/crawling/src/cron/translationJpn.ts[23-26]
packages/crawling/src/cron/translationJpn.ts[41-48]
packages/crawling/src/cron/translationJpn.ts[63-67]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
On `!result`, the code `continue`s before `processedCount++` and the 200ms delay, bypassing both the per-run limit and throttling on failure.

### Issue Context
This job calls OpenAI and should throttle consistently regardless of success/failure to avoid rate limits and unexpected cost.

### Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[41-48]
- packages/crawling/src/cron/translationJpn.ts[63-67]

### What to change
- Restructure the loop so that **every OpenAI attempt** increments `processedCount` and awaits the delay.
  - Example: move `processedCount++` and the delay into a `finally` block that runs for all non-skip paths.
  - Alternatively: increment/count and delay immediately after the OpenAI call, before any `continue`.
- Keep skip paths (already-translated / no-Japanese) fast (no delay) if desired, but ensure failure paths are throttled.
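
The `finally` approach suggested above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual code: `runThrottled`, `Deps`, and the injected `translate`/`sleep` functions are hypothetical names, and the skip checks for already-translated or non-Japanese songs are omitted.

```typescript
interface Song {
  id: number;
  title: string;
  artist: string;
}

interface Translation {
  title_ko: string;
  artist_ko: string;
}

// Dependencies are injected so the loop can run without OpenAI or real timers.
interface Deps {
  translate: (song: Song) => Promise<Translation | null>;
  sleep: (ms: number) => Promise<void>;
}

// Timer-based sleep that a real run could pass in as `deps.sleep`.
const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

async function runThrottled(
  songs: Song[],
  deps: Deps,
  maxAttempts = 5000,
  delayMs = 200,
): Promise<{ success: number; failed: number; processedCount: number }> {
  const log = { success: 0, failed: 0 };
  let processedCount = 0;

  for (const song of songs) {
    if (processedCount >= maxAttempts) break;

    try {
      const result = await deps.translate(song);
      if (!result) {
        log.failed++;
        continue; // the finally block below still runs before the next iteration
      }
      log.success++;
    } catch (error) {
      log.failed++;
    } finally {
      // Every attempt counts toward the cap and is throttled,
      // whether it succeeded, returned null, or threw.
      processedCount++;
      await deps.sleep(delayMs);
    }
  }

  return { ...log, processedCount };
}
```

In the cron script, the skip checks would sit before the `try`, so skipped songs bypass both the counter and the delay while every OpenAI attempt is throttled.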




Remediation recommended

4. Overprivileged workflow token 🐞
Description
The new translation_jpn.yml grants permissions: contents: write even though it only checks out
the repo and runs a script; this unnecessarily increases the blast radius if a dependency/action is
compromised. The re-enabled tagging_song.yml has the same issue.
Code

.github/workflows/translation_jpn.yml[R8-10]

+permissions:
+  contents: write
+
Evidence
Neither workflow contains any step that pushes commits/releases; they only checkout, install, create
an env file, and run a Node script. Therefore, contents: write is not required for their current
behavior.

.github/workflows/translation_jpn.yml[8-9]
.github/workflows/translation_jpn.yml[15-43]
.github/workflows/tagging_song.yml[8-9]
.github/workflows/tagging_song.yml[15-43]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Workflows grant `contents: write` without performing any git write operation.

### Issue Context
Principle of least privilege: scheduled workflows run automatically with repository token permissions.

### Fix Focus Areas
- .github/workflows/translation_jpn.yml[8-10]
- .github/workflows/tagging_song.yml[8-10]

### What to change
- Change to `permissions: { contents: read }`. Removing the `permissions` block entirely also works where the repository or organization default for `GITHUB_TOKEN` is read-only, but the explicit form does not depend on that setting.
- If a future step will push commits, add `contents: write` only when that step is introduced and document why.



5. Tagging schedule docs mismatch 🐞
Description
CLAUDE.md states tagging_song.yml runs daily at 14:00 UTC, but the workflow cron was changed to
10:00 UTC, so the documentation is now wrong and will mislead operators.
Code

packages/crawling/CLAUDE.md[R124-126]

+| `crawl_recent_tj.yml`   | 매일 14:00   | `pnpm recent-tj`  |
+| `tagging_song.yml`      | 매일 14:00   | `pnpm tag-songs`  |
+| `translation_jpn.yml`   | 매일 14:00   | `pnpm trans-jpn`  |
Evidence
The documentation table lists tagging_song.yml as 14:00 UTC, while the actual workflow cron is `0
10 * * *` (10:00 UTC).

packages/crawling/CLAUDE.md[122-126]
.github/workflows/tagging_song.yml[4-6]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`packages/crawling/CLAUDE.md` lists an outdated schedule for `tagging_song.yml`.

### Issue Context
The workflow cron was updated to 10:00 UTC but the table still says 14:00 UTC.

### Fix Focus Areas
- packages/crawling/CLAUDE.md[122-126]
- .github/workflows/tagging_song.yml[4-6]

### What to change
- Update the `tagging_song.yml` row to `매일 10:00` ("daily 10:00", UTC), or revert the workflow cron to match the doc. Pick one source of truth and make both consistent.




Advisory comments

6. Generated sitemap timestamp committed 🐞
Description
sitemap-0.xml was changed only to update the lastmod timestamp, which is likely output from
next-sitemap (runs on postbuild) and can create noisy diffs/merge conflicts. If sitemap files
are meant to be generated during build/deploy, this file should not be manually updated in PRs.
Code

apps/web/public/sitemap-0.xml[3]

+<url><loc>https://www.singcode.kr</loc><lastmod>2026-04-10T16:34:54.763Z</lastmod><changefreq>weekly</changefreq><priority>0.7</priority></url>
Evidence
apps/web runs next-sitemap in postbuild, which typically generates/updates sitemap files under
public/. The only change here is a timestamp bump in lastmod.

apps/web/package.json[6-13]
apps/web/public/sitemap-0.xml[1-4]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
A sitemap file appears to have been regenerated (timestamp-only change), creating noisy diffs.

### Issue Context
`apps/web/package.json` runs `next-sitemap` on `postbuild`, which commonly generates these files.

### Fix Focus Areas
- apps/web/public/sitemap-0.xml[1-4]
- apps/web/package.json[6-13]

### What to change
- If sitemaps are generated in CI/CD: revert the sitemap file change from this PR and ensure generation happens in the deployment pipeline.
- If sitemaps must be committed: ensure updates are intentional (and consider a deterministic `lastmod` strategy if possible) and keep file formatting consistent (including newline at EOF).



@qodo-code-review

PR Description updated to latest commit (c322d51)

@GulSam00 GulSam00 merged commit 82181e5 into develop Apr 10, 2026
1 of 2 checks passed
Comment on lines +102 to +110
export async function getJpopSongsForTranslationDB() {
  const supabase = getClient();

  const { data, error } = await supabase
    .from('songs')
    .select('id, title, artist, title_ko, artist_ko, song_tags!inner(tag_id)')
    .eq('song_tags.tag_id', 101)
    .limit(50000);


Action required

1. J-pop query lacks ko filter 📎 Requirement gap ➹ Performance

getJpopSongsForTranslationDB() fetches all J-POP songs by tag without filtering to only rows
missing title_ko or artist_ko, then relies on in-script skipping. This violates the requirement
to efficiently target only records needing backfill and may cause unnecessary reads/work.
Agent Prompt
## Issue description
`getJpopSongsForTranslationDB()` currently selects all J-POP songs (tag_id=101) and relies on application-side skipping. Compliance requires filtering at the DB query level to only return rows where `title_ko` OR `artist_ko` is missing/empty.

## Issue Context
The cron job should efficiently backfill only needed records; fetching already-translated rows increases read volume and processing time.

## Fix Focus Areas
- packages/crawling/src/supabase/getDB.ts[102-114]
- packages/crawling/src/cron/translationJpn.ts[18-39]
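
One way the DB-level filter could look is sketched below, assuming the Supabase query builder's `.or()` method with PostgREST filter syntax. `MISSING_KO_FILTER`, `JpopQuery`, and `selectUntranslatedJpop` are hypothetical names; the interface is a minimal structural stand-in for the real builder so the sketch stays self-contained. Only `NULL` values are matched here, so empty strings would need an extra condition.

```typescript
// PostgREST "or" filter: rows where either Korean column is NULL.
const MISSING_KO_FILTER = 'title_ko.is.null,artist_ko.is.null';

// Minimal stand-in for the parts of the query builder this sketch touches.
interface JpopQuery<R> {
  eq(column: string, value: number): JpopQuery<R>;
  or(filters: string): JpopQuery<R>;
  limit(count: number): PromiseLike<R>;
}

// Narrows a songs query to J-POP rows (tag_id = 101) that still need translation.
function selectUntranslatedJpop<R>(base: JpopQuery<R>, limit: number): PromiseLike<R> {
  return base
    .eq('song_tags.tag_id', 101)
    .or(MISSING_KO_FILTER)
    .limit(limit);
}
```

In `getJpopSongsForTranslationDB()` the same effect would come from chaining `.or('title_ko.is.null,artist_ko.is.null')` onto the existing `supabase.from('songs').select(...)` call, which keeps already-translated rows out of the result set instead of skipping them in the script.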


Comment on lines +50 to +57
console.log('result : ', result);
// const success = await updateSongKoTranslationDB(song.id, result.title_ko, result.artist_ko);
// if (success) {
//   resultsLog.success++;
//   console.log(`[OK] ${song.title} → ${result.title_ko} / ${song.artist} → ${result.artist_ko}`);
// } else {
//   resultsLog.failed++;
// }

Action required

2. Translation not persisted 🐞 Bug ≡ Correctness

translationJpn.ts generates translations but never updates songs.title_ko/artist_ko because the
DB update call is commented out, so the scheduled workflow does no effective work. This also
guarantees resultsLog.success remains 0, making the job output misleading.
Agent Prompt
### Issue description
`packages/crawling/src/cron/translationJpn.ts` computes a translation result but never writes it to Supabase because the update logic is commented out.

### Issue Context
- The workflow `.github/workflows/translation_jpn.yml` runs `pnpm run trans-jpn` daily.
- `updateSongKoTranslationDB()` already exists and returns `true/false`.

### Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[50-57]
- packages/crawling/src/supabase/updateDB.ts[44-58]

### What to change
- Uncomment (or re-implement) the `updateSongKoTranslationDB(song.id, ...)` call.
- Increment `resultsLog.success` on successful update, and `resultsLog.failed` otherwise.
- Ensure the final summary reflects actual counts.


Comment on lines +44 to +67
    if (!result) {
      resultsLog.failed++;
      console.log(`[FAIL] ${song.title} - ${song.artist}: 번역 실패`);
      continue;
    }

    console.log('result : ', result);
    // const success = await updateSongKoTranslationDB(song.id, result.title_ko, result.artist_ko);
    // if (success) {
    //   resultsLog.success++;
    //   console.log(`[OK] ${song.title} → ${result.title_ko} / ${song.artist} → ${result.artist_ko}`);
    // } else {
    //   resultsLog.failed++;
    // }
  } catch (error) {
    resultsLog.failed++;
    console.error(`[ERROR] ${song.title} - ${song.artist}:`, error);
  }

  processedCount++;

  // Delay to stay under the OpenAI rate limit
  await new Promise(resolve => setTimeout(resolve, 200));
}

Action required

3. Delay/cap bypass on failure 🐞 Bug ☼ Reliability

When translateJpnToKo returns null, the loop continues before processedCount++ and the 200ms
delay, so both the 5,000-item cap and rate-limit protection are skipped for failures. This can cause
the job to hammer OpenAI rapidly and exceed intended per-run limits.
Agent Prompt
### Issue description
On `!result`, the code `continue`s before `processedCount++` and the 200ms delay, bypassing both the per-run limit and throttling on failure.

### Issue Context
This job calls OpenAI and should throttle consistently regardless of success/failure to avoid rate limits and unexpected cost.

### Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[41-48]
- packages/crawling/src/cron/translationJpn.ts[63-67]

### What to change
- Restructure the loop so that **every OpenAI attempt** increments `processedCount` and awaits the delay.
  - Example: move `processedCount++` and the delay into a `finally` block that runs for all non-skip paths.
  - Alternatively: increment/count and delay immediately after the OpenAI call, before any `continue`.
- Keep skip paths (already-translated / no-Japanese) fast (no delay) if desired, but ensure failure paths are throttled.


Comment on lines +50 to +57
console.log('result : ', result);
// const success = await updateSongKoTranslationDB(song.id, result.title_ko, result.artist_ko);
// if (success) {
//   resultsLog.success++;
//   console.log(`[OK] ${song.title} → ${result.title_ko} / ${song.artist} → ${result.artist_ko}`);
// } else {
//   resultsLog.failed++;
// }

Action required

1. translationJpn.ts doesn’t update DB 📎 Requirement gap ≡ Correctness

The new translation cron computes Korean translations but never persists them because the DB update
call is commented out. This fails the requirement to write translated values back to
songs.title_ko/songs.artist_ko.
Agent Prompt
## Issue description
`translationJpn.ts` translates titles/artists but does not write results back to Supabase because `updateSongKoTranslationDB(...)` is commented out.

## Issue Context
Compliance requires the script to persist translated values into `songs.title_ko` and `songs.artist_ko`.

## Fix Focus Areas
- packages/crawling/src/cron/translationJpn.ts[41-57]


@GulSam00 GulSam00 deleted the feat/186-translateSongKo branch April 12, 2026 14:19

Development

Successfully merging this pull request may close these issues.

Add cron job to update title_ko and artist_ko translations in the Song table

1 participant