Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe PR adds two GitHub Actions CI/CD deployment workflows (production and staging) while removing numerous legacy data-crawling modules (notices, holidays, galleries, calendars, lectures), an entire Koreatech bus timetable Go service, and image scaling infrastructure. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
.github/workflows/deploy-stage.yml (1)
26-28: Consider excluding the archive itself to prevent issues on re-runs.If this workflow runs again in the same workspace (e.g., due to caching or self-hosted runners), the tar command could include a stale
batch.tar.gzfrom a previous run.Proposed fix
- name: Create tar archive run: | - tar -cvzf batch.tar.gz --exclude='.github' . + tar -cvzf batch.tar.gz --exclude='.github' --exclude='batch.tar.gz' .🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/deploy-stage.yml around lines 26 - 28, The tar step named "Create tar archive" currently archives the whole workspace and may include an existing batch.tar.gz from previous runs; update the tar invocation used in that step to exclude the archive itself (e.g., add an --exclude for batch.tar.gz or a pattern like *.tar.gz) so the produced archive cannot contain a stale batch.tar.gz from earlier workflow executions..github/workflows/deploy-prod.yml (3)
26-28: Same tar archive concern as staging workflow.Consider excluding
batch.tar.gzto prevent potential issues on re-runs.Proposed fix
- name: Create tar archive run: | - tar -cvzf batch.tar.gz --exclude='.github' . + tar -cvzf batch.tar.gz --exclude='.github' --exclude='batch.tar.gz' .🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/deploy-prod.yml around lines 26 - 28, The tar step named "Create tar archive" currently archives everything and can include an existing batch.tar.gz on reruns; update the tar invocation used in the "Create tar archive" step to exclude the output archive itself (batch.tar.gz) by adding an exclude for 'batch.tar.gz' so the archive won't recursively include or overwrite itself during subsequent runs.
3-8: Consider restricting branch selection inworkflow_dispatchinputs.The current
if: github.ref == 'refs/heads/master'guard silently skips the job if triggered from a non-master branch. Users may accidentally trigger from the wrong branch and see a skipped job without explanation. You could add an explicit branch restriction or provide feedback.Option: Add input with explicit restriction
on: workflow_dispatch: + # Note: This workflow only runs on the master branch. + # If triggered from another branch, the deploy job will be skipped.Alternatively, you could remove the
ifcondition entirely since GitHub allows limitingworkflow_dispatchto specific branches in repository settings, which provides a better UX.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/deploy-prod.yml around lines 3 - 8, The job currently uses a silent guard "if: github.ref == 'refs/heads/master'" which can lead to confusing skipped runs when triggered via workflow_dispatch; add a workflow_dispatch input (e.g., inputs.branch with a default of "master" and allowed choices) and update the deploy job's condition to explicitly validate the input when the event is a manual dispatch (use github.event_name == 'workflow_dispatch' and github.event.inputs.branch == 'master') while preserving the original ref check for non-dispatch runs, or alternatively remove the silent if and rely on repository-level branch restrictions—modify the workflow_dispatch block and the deploy job's if condition (referencing workflow_dispatch, inputs.branch, deploy job, and the existing if github.ref check) to implement one of these options so users get an explicit branch choice or clear behavior.
1-68: Consider extracting a reusable workflow to reduce duplication.Both
deploy-stage.ymlanddeploy-prod.ymlshare nearly identical logic. This duplication increases maintenance burden. You could create a reusable workflow (composite action or called workflow) that accepts environment-specific inputs/secrets.Example structure for a reusable workflow:
# .github/workflows/deploy-reusable.yml on: workflow_call: inputs: environment: required: true type: string secrets: SSH_HOST: required: true # ... other secretsThen both workflows can call it with their respective environment parameters.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/deploy-prod.yml around lines 1 - 68, Extract the duplicated "deploy" job steps (the tar creation, scp transfer, ssh deploy, and Slack notifications currently in the deploy job and steps like "Notify Slack deploy started", "Create tar archive", "Transfer archive via SCP", "Run deploy script via SSH", "Notify Slack deploy result") into a reusable workflow that uses on: workflow_call with inputs for environment, repo/branch/actor/run id (or derive via github context) and required secrets (PROD_SSH_HOST/PROD_SSH_USER/PROD_SSH_KEY/PROD_SSH_PORT/PROD_DEPLOY_PATH/PROD_DEPLOY_SCRIPT_PATH and PROD_SLACK_DEPLOY_WEBHOOK_URL as generic names like SSH_HOST, SSH_USER, SSH_KEY, SSH_PORT, DEPLOY_PATH, DEPLOY_SCRIPT_PATH, SLACK_WEBHOOK); then replace deploy-prod.yml and deploy-stage.yml with minimal caller workflows that call the new reusable workflow, passing environment-specific inputs and mapping the appropriate secrets from each environment, and ensure the reusable workflow preserves the start time/output behavior used by "Notify Slack deploy result".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/deploy-prod.yml:
- Around line 49-62: The duration calculation in the "Notify Slack deploy
result" step assumes steps.start.outputs.time exists and will fail if the
"Record start time" step didn't produce it; update the run block to first
compute a START timestamp variable by checking whether steps.start.outputs.time
is non-empty and using it, otherwise fallback to the current epoch seconds, then
compute DURATION, MINUTES and SECONDS from that START value (references: the run
block variables DURATION, MINUTES, SECONDS and steps.start.outputs.time in the
"Notify Slack deploy result" step).
In @.github/workflows/deploy-stage.yml:
- Around line 49-62: The notification step can fail if steps.start.outputs.time
is empty; update the "Notify Slack deploy result" step (where DURATION and
START_TIME are computed) to defensively handle a missing start time by checking
if steps.start.outputs.time is empty and using a safe fallback (e.g., treat
missing start time as now or 0) before performing arithmetic, e.g., set a
START_TIME variable from steps.start.outputs.time and use shell parameter
expansion to default it (START_TIME=${START_TIME:-0}) or skip duration
calculation when empty, then compute DURATION/MINUTES/SECONDS only when
START_TIME is valid and continue to use job.status/EMOJI/STATUS as before.
---
Nitpick comments:
In @.github/workflows/deploy-prod.yml:
- Around line 26-28: The tar step named "Create tar archive" currently archives
everything and can include an existing batch.tar.gz on reruns; update the tar
invocation used in the "Create tar archive" step to exclude the output archive
itself (batch.tar.gz) by adding an exclude for 'batch.tar.gz' so the archive
won't recursively include or overwrite itself during subsequent runs.
- Around line 3-8: The job currently uses a silent guard "if: github.ref ==
'refs/heads/master'" which can lead to confusing skipped runs when triggered via
workflow_dispatch; add a workflow_dispatch input (e.g., inputs.branch with a
default of "master" and allowed choices) and update the deploy job's condition
to explicitly validate the input when the event is a manual dispatch (use
github.event_name == 'workflow_dispatch' and github.event.inputs.branch ==
'master') while preserving the original ref check for non-dispatch runs, or
alternatively remove the silent if and rely on repository-level branch
restrictions—modify the workflow_dispatch block and the deploy job's if
condition (referencing workflow_dispatch, inputs.branch, deploy job, and the
existing if github.ref check) to implement one of these options so users get an
explicit branch choice or clear behavior.
- Around line 1-68: Extract the duplicated "deploy" job steps (the tar creation,
scp transfer, ssh deploy, and Slack notifications currently in the deploy job
and steps like "Notify Slack deploy started", "Create tar archive", "Transfer
archive via SCP", "Run deploy script via SSH", "Notify Slack deploy result")
into a reusable workflow that uses on: workflow_call with inputs for
environment, repo/branch/actor/run id (or derive via github context) and
required secrets
(PROD_SSH_HOST/PROD_SSH_USER/PROD_SSH_KEY/PROD_SSH_PORT/PROD_DEPLOY_PATH/PROD_DEPLOY_SCRIPT_PATH
and PROD_SLACK_DEPLOY_WEBHOOK_URL as generic names like SSH_HOST, SSH_USER,
SSH_KEY, SSH_PORT, DEPLOY_PATH, DEPLOY_SCRIPT_PATH, SLACK_WEBHOOK); then replace
deploy-prod.yml and deploy-stage.yml with minimal caller workflows that call the
new reusable workflow, passing environment-specific inputs and mapping the
appropriate secrets from each environment, and ensure the reusable workflow
preserves the start time/output behavior used by "Notify Slack deploy result".
In @.github/workflows/deploy-stage.yml:
- Around line 26-28: The tar step named "Create tar archive" currently archives
the whole workspace and may include an existing batch.tar.gz from previous runs;
update the tar invocation used in that step to exclude the archive itself (e.g.,
add an --exclude for batch.tar.gz or a pattern like *.tar.gz) so the produced
archive cannot contain a stale batch.tar.gz from earlier workflow executions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a6192775-5f7b-4130-96e0-9f1a66a91bda
⛔ Files ignored due to path filters (7)
crawling/graduation_credit_calculator/pdfs/2019대학요람.pdfis excluded by!**/*.pdfcrawling/graduation_credit_calculator/pdfs/2020대학요람.pdfis excluded by!**/*.pdfcrawling/graduation_credit_calculator/pdfs/2021대학요람.pdfis excluded by!**/*.pdfcrawling/graduation_credit_calculator/pdfs/2022대학요람.pdfis excluded by!**/*.pdfcrawling/graduation_credit_calculator/pdfs/2023대학요람.pdfis excluded by!**/*.pdfcrawling/graduation_credit_calculator/pdfs/2024대학요람.pdfis excluded by!**/*.pdfcrawling/koreatech_bus/go.sumis excluded by!**/*.sum
📒 Files selected for processing (27)
.github/workflows/deploy-prod.yml.github/workflows/deploy-stage.ymlcrawling/dcinside_gallery.pycrawling/get_holiday.pycrawling/graduation_credit_calculator/culture_hrd_msc.pycrawling/graduation_credit_calculator/major_subject.pycrawling/koreatech_bus/cheonan_commuting.yamlcrawling/koreatech_bus/cheonan_shuttle.yamlcrawling/koreatech_bus/cheongju_commuting.yamlcrawling/koreatech_bus/cheongju_shuttle.yamlcrawling/koreatech_bus/config.example.propertiescrawling/koreatech_bus/daejeon_commuting.yamlcrawling/koreatech_bus/go.modcrawling/koreatech_bus/main.gocrawling/koreatech_bus/sejong_commuting.yamlcrawling/koreatech_bus/seoul_commuting.yamlcrawling/koreatech_calendar.pycrawling/koreatech_closed_lecture.pycrawling/koreatech_lecture/lecture.pycrawling/koreatech_notice.pycrawling/mysql_connect.pycrawling/search_migration.pycrawling/slack_notice.pyimage_scaling/image_scaling.pyimage_scaling/image_scaling.shimage_scaling/s3.example.configimage_scaling/s3_config.example.py
💤 Files with no reviewable changes (24)
- crawling/koreatech_bus/config.example.properties
- crawling/koreatech_bus/daejeon_commuting.yaml
- crawling/get_holiday.py
- crawling/koreatech_bus/go.mod
- crawling/koreatech_bus/cheongju_shuttle.yaml
- crawling/koreatech_calendar.py
- crawling/koreatech_bus/cheonan_commuting.yaml
- crawling/slack_notice.py
- image_scaling/image_scaling.sh
- crawling/koreatech_bus/cheongju_commuting.yaml
- crawling/koreatech_closed_lecture.py
- crawling/mysql_connect.py
- crawling/koreatech_bus/main.go
- crawling/koreatech_bus/seoul_commuting.yaml
- image_scaling/image_scaling.py
- image_scaling/s3_config.example.py
- crawling/graduation_credit_calculator/culture_hrd_msc.py
- crawling/koreatech_bus/cheonan_shuttle.yaml
- crawling/koreatech_bus/sejong_commuting.yaml
- crawling/graduation_credit_calculator/major_subject.py
- image_scaling/s3.example.config
- crawling/search_migration.py
- crawling/koreatech_notice.py
- crawling/dcinside_gallery.py
| - name: Notify Slack deploy result | ||
| if: always() | ||
| run: | | ||
| DURATION=$(( $(date +%s) - ${{ steps.start.outputs.time }} )) | ||
| MINUTES=$((DURATION / 60)) | ||
| SECONDS=$((DURATION % 60)) | ||
|
|
||
| if [ "${{ job.status }}" = "success" ]; then | ||
| EMOJI=":white_check_mark:" | ||
| STATUS="성공" | ||
| else | ||
| EMOJI=":x:" | ||
| STATUS="실패" | ||
| fi |
There was a problem hiding this comment.
Same start time fallback concern as staging workflow.
If the "Record start time" step fails, the duration calculation will error. Add a fallback for robustness.
Proposed fix
- name: Notify Slack deploy result
if: always()
run: |
+ START_TIME="${{ steps.start.outputs.time }}"
+ if [ -z "$START_TIME" ]; then
+ START_TIME=$(date +%s)
+ fi
- DURATION=$(( $(date +%s) - ${{ steps.start.outputs.time }} ))
+ DURATION=$(( $(date +%s) - START_TIME ))
MINUTES=$((DURATION / 60))
SECONDS=$((DURATION % 60))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Notify Slack deploy result | |
| if: always() | |
| run: | | |
| DURATION=$(( $(date +%s) - ${{ steps.start.outputs.time }} )) | |
| MINUTES=$((DURATION / 60)) | |
| SECONDS=$((DURATION % 60)) | |
| if [ "${{ job.status }}" = "success" ]; then | |
| EMOJI=":white_check_mark:" | |
| STATUS="성공" | |
| else | |
| EMOJI=":x:" | |
| STATUS="실패" | |
| fi | |
| - name: Notify Slack deploy result | |
| if: always() | |
| run: | | |
| START_TIME="${{ steps.start.outputs.time }}" | |
| if [ -z "$START_TIME" ]; then | |
| START_TIME=$(date +%s) | |
| fi | |
| DURATION=$(( $(date +%s) - START_TIME )) | |
| MINUTES=$((DURATION / 60)) | |
| SECONDS=$((DURATION % 60)) | |
| if [ "${{ job.status }}" = "success" ]; then | |
| EMOJI=":white_check_mark:" | |
| STATUS="성공" | |
| else | |
| EMOJI=":x:" | |
| STATUS="실패" | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/deploy-prod.yml around lines 49 - 62, The duration
calculation in the "Notify Slack deploy result" step assumes
steps.start.outputs.time exists and will fail if the "Record start time" step
didn't produce it; update the run block to first compute a START timestamp
variable by checking whether steps.start.outputs.time is non-empty and using it,
otherwise fallback to the current epoch seconds, then compute DURATION, MINUTES
and SECONDS from that START value (references: the run block variables DURATION,
MINUTES, SECONDS and steps.start.outputs.time in the "Notify Slack deploy
result" step).
| - name: Notify Slack deploy result | ||
| if: always() | ||
| run: | | ||
| DURATION=$(( $(date +%s) - ${{ steps.start.outputs.time }} )) | ||
| MINUTES=$((DURATION / 60)) | ||
| SECONDS=$((DURATION % 60)) | ||
|
|
||
| if [ "${{ job.status }}" = "success" ]; then | ||
| EMOJI=":white_check_mark:" | ||
| STATUS="성공" | ||
| else | ||
| EMOJI=":x:" | ||
| STATUS="실패" | ||
| fi |
There was a problem hiding this comment.
Result notification may fail silently if start time step didn't complete.
If the "Record start time" step fails or is skipped, steps.start.outputs.time will be empty, causing the arithmetic in line 52 to fail. Since this runs with if: always(), this edge case is possible.
Proposed fix to add a fallback
- name: Notify Slack deploy result
if: always()
run: |
+ START_TIME="${{ steps.start.outputs.time }}"
+ if [ -z "$START_TIME" ]; then
+ START_TIME=$(date +%s)
+ fi
- DURATION=$(( $(date +%s) - ${{ steps.start.outputs.time }} ))
+ DURATION=$(( $(date +%s) - START_TIME ))
MINUTES=$((DURATION / 60))
SECONDS=$((DURATION % 60))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Notify Slack deploy result | |
| if: always() | |
| run: | | |
| DURATION=$(( $(date +%s) - ${{ steps.start.outputs.time }} )) | |
| MINUTES=$((DURATION / 60)) | |
| SECONDS=$((DURATION % 60)) | |
| if [ "${{ job.status }}" = "success" ]; then | |
| EMOJI=":white_check_mark:" | |
| STATUS="성공" | |
| else | |
| EMOJI=":x:" | |
| STATUS="실패" | |
| fi | |
| - name: Notify Slack deploy result | |
| if: always() | |
| run: | | |
| START_TIME="${{ steps.start.outputs.time }}" | |
| if [ -z "$START_TIME" ]; then | |
| START_TIME=$(date +%s) | |
| fi | |
| DURATION=$(( $(date +%s) - START_TIME )) | |
| MINUTES=$((DURATION / 60)) | |
| SECONDS=$((DURATION % 60)) | |
| if [ "${{ job.status }}" = "success" ]; then | |
| EMOJI=":white_check_mark:" | |
| STATUS="성공" | |
| else | |
| EMOJI=":x:" | |
| STATUS="실패" | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/deploy-stage.yml around lines 49 - 62, The notification
step can fail if steps.start.outputs.time is empty; update the "Notify Slack
deploy result" step (where DURATION and START_TIME are computed) to defensively
handle a missing start time by checking if steps.start.outputs.time is empty and
using a safe fallback (e.g., treat missing start time as now or 0) before
performing arithmetic, e.g., set a START_TIME variable from
steps.start.outputs.time and use shell parameter expansion to default it
(START_TIME=${START_TIME:-0}) or skip duration calculation when empty, then
compute DURATION/MINUTES/SECONDS only when START_TIME is valid and continue to
use job.status/EMOJI/STATUS as before.
Summary by CodeRabbit
Release Notes
New Features
Removed Features