Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix logging of images with special characters in the key (uploads are immediate now) #3187

Merged
merged 4 commits into from
Jan 28, 2022

Conversation

speezepearson
Copy link
Contributor

@speezepearson speezepearson commented Jan 28, 2022

Fixes WB-6076

Description

Escapes filenames that are passed into InterfaceBase.publish_files, which expects globs, not filenames, and therefore misbehaves when filenames contain * or [...].

Testing

Currently working on unit tests.

Manual testing: repro before this change:

~/tempdir $ (cd ~/src/client/ && git checkout master)

~/tempdir $ python repro.py
<...wait a bit...>
^Z
[2]+  Stopped                 python repro.py

~/tempdir $ grep inoffensive_0 wandb/debug-internal.log
2022-01-28 12:21:22,517 INFO    SenderThread:85370 [sender.py:_save_file():944] saving file media/images/inoffensive_0_2f385627229499da4152.png with policy now
2022-01-28 12:21:22,903 INFO    Thread-13 :85370 [upload_job.py:push():137] Uploaded file /var/folders/wp/vvlmjhbd6q3gdr_l4grs6cyc0000gn/T/tmp1eis0p45wandb/jkbz5u30-media/images/inoffensive_0_2f385627229499da4152.png
2022-01-28 12:21:23,407 INFO    Thread-8  :85370 [dir_watcher.py:_on_file_created():217] file/dir created: /Users/pears/tempdir/wandb/run-20220128_122121-dwg2634u/files/media/images/inoffensive_0_2f385627229499da4152.png

~/tempdir $ grep 'weird chars' wandb/debug-internal.log
2022-01-28 12:21:22,519 INFO    SenderThread:85370 [sender.py:_save_file():944] saving file media/images/[weird chars]_1_54753c907b4065af6e3e.png with policy now
2022-01-28 12:21:23,406 INFO    Thread-8  :85370 [dir_watcher.py:_on_file_created():217] file/dir created: /Users/pears/tempdir/wandb/run-20220128_122121-dwg2634u/files/media/images/[weird chars]_1_54753c907b4065af6e3e.png

Note how the Uploaded file line is missing for the key with weird characters.

And after:

~/tempdir $ (cd ~/src/client/ && git checkout spencerpearson/brackets-escape)
Switched to branch 'spencerpearson/brackets-escape'
Your branch is up to date with 'origin/spencerpearson/brackets-escape'.

~/tempdir $ python repro.py
<...wait a bit...>
^Z
[2]+  Stopped                 python repro.py

~/tempdir $ grep inoffensive_0 wandb/debug-internal.log
2022-01-28 12:22:42,265 INFO    SenderThread:90789 [sender.py:_save_file():944] saving file media/images/inoffensive_0_a99039c325e43a8e4705.png with policy now
2022-01-28 12:22:42,660 INFO    Thread-13 :90789 [upload_job.py:push():137] Uploaded file /var/folders/wp/vvlmjhbd6q3gdr_l4grs6cyc0000gn/T/tmpjna_a9tswandb/19edoarb-media/images/inoffensive_0_a99039c325e43a8e4705.png
2022-01-28 12:22:43,156 INFO    Thread-8  :90789 [dir_watcher.py:_on_file_created():217] file/dir created: /Users/pears/tempdir/wandb/run-20220128_122241-1nzjog93/files/media/images/inoffensive_0_a99039c325e43a8e4705.png

~/tempdir $ grep 'weird chars' wandb/debug-internal.log
2022-01-28 12:22:42,266 INFO    SenderThread:90789 [sender.py:_save_file():944] saving file media/images/[[]weird chars]_1_1692ab4626d0418997ca.png with policy now
2022-01-28 12:22:42,661 INFO    Thread-14 :90789 [upload_job.py:push():137] Uploaded file /var/folders/wp/vvlmjhbd6q3gdr_l4grs6cyc0000gn/T/tmpjna_a9tswandb/3li9x4nz-media/images/[weird chars]_1_1692ab4626d0418997ca.png
2022-01-28 12:22:43,155 INFO    Thread-8  :90789 [dir_watcher.py:_on_file_created():217] file/dir created: /Users/pears/tempdir/wandb/run-20220128_122241-1nzjog93/files/media/images/[weird chars]_1_1692ab4626d0418997ca.png

@speezepearson
Copy link
Contributor Author

speezepearson commented Jan 28, 2022

Future potential changes:

  • publish_files immediately converts its argument into a pb.FilesRecord. Is there any reason the callers shouldn't just construct the FilesRecord directly?
  • The FilesItem type has a field string path; but that value is treated (access -> plumbing -> use) as a glob, not as a path. I'd like to rename this field to glob, or path_glob. Is there any reason this Protobuf field rename wouldn't be perfectly safe?

Copy link
Contributor

@vanpelt vanpelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@vanpelt
Copy link
Contributor

vanpelt commented Jan 28, 2022

Actually, we should really test this. Probably belongs in test_data_types.py.

@codecov
Copy link

codecov bot commented Jan 28, 2022

Codecov Report

Merging #3187 (cb54358) into master (d66850b) will increase coverage by 0.05%.
The diff coverage is 100.00%.

❗ Current head cb54358 differs from pull request most recent head 3d47e03. Consider uploading reports for the commit 3d47e03 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3187      +/-   ##
==========================================
+ Coverage   79.01%   79.06%   +0.05%     
==========================================
  Files         210      210              
  Lines       27636    27638       +2     
==========================================
+ Hits        21837    21853      +16     
+ Misses       5799     5785      -14     
Flag Coverage Δ
functest 57.17% <100.00%> (+0.06%) ⬆️
unittest 68.90% <83.33%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
wandb/sdk/internal/meta.py 90.74% <100.00%> (+3.16%) ⬆️
wandb/sdk/internal/tb_watcher.py 89.47% <100.00%> (+0.03%) ⬆️
wandb/sdk/wandb_run.py 87.60% <100.00%> (-0.27%) ⬇️
wandb/sdk/internal/stats.py 67.22% <0.00%> (-0.56%) ⬇️
wandb/sdk/lib/git.py 76.35% <0.00%> (ø)
wandb/__init__.py 92.52% <0.00%> (+0.93%) ⬆️
wandb/sdk/lib/sock_client.py 94.64% <0.00%> (+1.78%) ⬆️
wandb/old/core.py 75.38% <0.00%> (+16.92%) ⬆️

@speezepearson speezepearson marked this pull request as ready for review January 28, 2022 22:51
Copy link
Contributor

@vanpelt vanpelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

@speezepearson speezepearson merged commit 0514f2d into master Jan 28, 2022
@speezepearson speezepearson deleted the spencerpearson/brackets-escape branch January 28, 2022 23:52
@raubitsj raubitsj changed the title [WB-6076] escape filenames passed to publish_files, which expects globs Fix logging of images with special characters in the key (uploads are immediate now) Mar 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants