Skip to content

Comments

chore(script): add option to delete logs folders on clean-up#31955

Closed
serrovsky wants to merge 5 commits intoapache:mainfrom
serrovsky:main
Closed

chore(script): add option to delete logs folders on clean-up#31955
serrovsky wants to merge 5 commits intoapache:mainfrom
serrovsky:main

Conversation

@serrovsky
Copy link

This PR tries to fix a problem that I had some weeks ago. If you are storing your logs locally, and since the logs folders tree structure by default follows the pattern <dag_id>/<run_id>/<task_id>/<attempt_id>.log this scales quickly.

After some time you got thousands of empty folders and the clean-logs.sh starts to consume a lot of memory due to the find command.

image

Where you can find an example of what was happening in our case. Work-log-groomer containers were consuming 6-7 GB of memory on average just to clean logs. 🤯

With this small change, I added a flag that allows us to delete not only the file but also all the folders, once it doesn't make sense to save empty folders.

In the screenshot below you can find the real impact of this change:
image

@boring-cyborg boring-cyborg bot added area:dev-tools area:helm-chart Airflow Helm Chart area:production-image Production image improvements and fixes labels Jun 16, 2023
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement.

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit but otherwise cool


readonly EVERY=$((15*60))

echo "Running the improved cleaning logs..."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo "Running the improved cleaning logs..."

Leftover from development? I don't think we need this :)

echo "Trimming airflow logs folders to ${RETENTION} days."
find "${DIRECTORY}"/logs \
-type d -name 'lost+found' -prune -o \
-type d -mtime +"${RETENTION}" -print0 | \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work as written. mtime updates don't propogate up, so you'd end up deleting all of a dags logs after the root folder has been around for $retention_days.

Instead, we could have a second command that goes through and deletes old empty folders? I think the "empty" aspect is critical.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point @jedcunningham adding -empty predicate in find should solve the issue.


readonly DIRECTORY="${AIRFLOW_HOME:-/usr/local/airflow}"
readonly RETENTION="${AIRFLOW__LOG_RETENTION_DAYS:-15}"
readonly DELETE_LOGS_FOLDERS="${AIRFLOW__LOG_DELETE_FOLDERS:-true}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I almost feel like this should just happen regardless and not be able to be turned off. @potiuk, thoughts?

args: ["bash", "/clean-logs"]
# Number of days to retain logs
retentionDays: 15
deleteLogFolders: true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do keep it confugurable, I think deleteFolders is better. Like with retentionDays, we know we are dealing with logs already.

@potiuk potiuk requested a review from jedcunningham June 28, 2023 10:26
echo "Trimming airflow logs folders to ${RETENTION} days."
find "${DIRECTORY}"/logs \
-type d -name 'lost+found' -prune -o \
-type d -mtime +"${RETENTION}" -print0 | \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-type d -mtime +"${RETENTION}" -print0 | \
-type d -empty -mtime +"${RETENTION}" -print0 | \

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Aug 13, 2023
@github-actions github-actions bot closed this Aug 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dev-tools area:helm-chart Airflow Helm Chart area:production-image Production image improvements and fixes stale Stale PRs per the .github/workflows/stale.yml policy file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants