Remove log rotation script from gcsfuse long haul tests #1543

Merged
merged 2 commits into from
Dec 11, 2023
Changes from 1 commit
3 changes: 0 additions & 3 deletions perfmetrics/scripts/ml_tests/pytorch/run_container.sh
@@ -25,8 +25,5 @@ echo "Running the docker image build in the previous step..."
 sudo docker run --gpus all --name=pytorch_automation_container --privileged -d -v $HOME/github/gcsfuse/container_artifacts:/pytorch_dino/run_artifacts:rw,rshared \
   --shm-size=128g pytorch-gcsfuse:latest
 
-# Setup the log_rotation.
-source perfmetrics/scripts/ml_tests/setup_log_rotation.sh $HOME/github/gcsfuse/container_artifacts/gcsfuse.log
-
 # Wait for the script completion as well as logs output.
 sudo docker logs -f pytorch_automation_container
13 changes: 9 additions & 4 deletions perfmetrics/scripts/ml_tests/pytorch/run_model.sh
@@ -30,16 +30,21 @@ cd -
 mkdir run_artifacts/gcsfuse_logs
 
 echo "Mounting GCSFuse..."
+echo "logging:
+  file-path: run_artifacts/gcsfuse.log
+  format: text
+  severity: trace
+  log-rotate:
+    max-file-size-mb: 1024
+    backup-file-count: 3
+    compress: true
+" > /tmp/gcsfuse_config.yaml
 nohup /pytorch_dino/gcsfuse/gcsfuse --foreground --type-cache-ttl=1728000s \
   --stat-cache-ttl=1728000s \
   --stat-cache-capacity=1320000 \
   --stackdriver-export-interval=60s \
   --implicit-dirs \
   --max-conns-per-host=100 \
-  --debug_fuse \
-  --debug_gcs \
-  --log-file run_artifacts/gcsfuse.log \
-  --log-format text \
   gcsfuse-ml-data gcsfuse_data > "run_artifacts/gcsfuse.out" 2> "run_artifacts/gcsfuse.err" &
 
 # Update the pytorch library code to bypass the kernel-cache
53 changes: 0 additions & 53 deletions perfmetrics/scripts/ml_tests/setup_log_rotation.sh

This file was deleted.

21 changes: 0 additions & 21 deletions perfmetrics/scripts/ml_tests/smart_log_deleter.sh

This file was deleted.

@@ -20,8 +20,5 @@ sudo docker run --gpus all --name tf_model_container --privileged -d \
 -v $HOME/github/gcsfuse/container_artifacts/logs:/home/logs:rw,rshared \
 -v $HOME/github/gcsfuse/container_artifacts/output:/home/output:rw,rshared --shm-size=24g tf-dlc-gcsfuse:latest
 
-# Setup the log_rotation.
-source perfmetrics/scripts/ml_tests/setup_log_rotation.sh $HOME/github/gcsfuse/container_artifacts/logs/gcsfuse.log
-
 # Wait for the script completion as well as logs output.
 sudo docker logs -f tf_model_container
@@ -17,7 +17,21 @@ cd -
 
 # Mount the bucket and run in background so that docker doesn't keep running after resnet_runner.py fails
 echo "Mounting the bucket"
-nohup gcsfuse/gcsfuse --foreground --implicit-dirs --debug_fuse --debug_gcs --max-conns-per-host 100 --log-format "text" --log-file /home/logs/gcsfuse.log --stackdriver-export-interval 60s ml-models-data-gcsfuse myBucket > /home/output/gcsfuse.out 2> /home/output/gcsfuse.err &
+echo "logging:
+  file-path: /home/logs/gcsfuse.log
+  format: text
+  severity: trace
+  log-rotate:
+    max-file-size-mb: 1024
+    backup-file-count: 3
+    compress: true
+" > /tmp/gcsfuse_config.yaml
+nohup gcsfuse/gcsfuse --foreground \
+  --implicit-dirs \
+  --max-conns-per-host 100 \
+  --stackdriver-export-interval 60s \
+  --config-file /tmp/gcsfuse_config.yaml \
+  ml-models-data-gcsfuse myBucket > /home/output/gcsfuse.out 2> /home/output/gcsfuse.err &
 
 # Install tensorflow model garden library
 pip3 install --user tf-models-official==2.13.2
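
Both model scripts write the same logging config through a multi-line `echo`, where the YAML nesting depends entirely on the leading spaces inside the quoted string. As a rough sketch only, the same file could be generated with a heredoc, which keeps the indentation visible and lets the log path be a parameter. The helper name `write_gcsfuse_config` and the use of `mktemp` are assumptions for illustration and are not part of this PR; the keys and values are copied from the diff.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the PR): writes the logging config shown in the
# diff to the given path. $1 = gcsfuse log file path, $2 = config file path.
write_gcsfuse_config() {
  local log_path="$1" config_path="$2"
  # Unquoted EOF so ${log_path} is expanded; indentation defines YAML nesting.
  cat > "${config_path}" <<EOF
logging:
  file-path: ${log_path}
  format: text
  severity: trace
  log-rotate:
    max-file-size-mb: 1024
    backup-file-count: 3
    compress: true
EOF
}

# The PR writes to /tmp/gcsfuse_config.yaml; a temp file is used here only so
# the sketch does not overwrite anything.
CONFIG_PATH="$(mktemp)"
write_gcsfuse_config /home/logs/gcsfuse.log "${CONFIG_PATH}"

# Print the result for inspection before mounting.
cat "${CONFIG_PATH}"
```

The generated file would then be passed to gcsfuse with `--config-file`, as the TensorFlow script in this diff does.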