Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial commit of sprite atlasmaker. #137

Merged
merged 9 commits into from
Aug 10, 2018
Merged

Initial commit of sprite atlasmaker. #137

merged 9 commits into from
Aug 10, 2018

Conversation

junjaytan
Copy link
Contributor

Command line utility for creating sprite atlases from a list of source image locations. Apologies in advance for the large PR.

@jimbojw
Copy link
Collaborator

jimbojw commented Jun 19, 2018

Thanks Junjay! I'm taking a look.

…rate list of images from wikipedia for testing,

Misc other minor bug fixes as well.
@junjaytan
Copy link
Contributor Author

Following up from last week, here are the updates in the latest commit:

  • It now handles file failures (although it doesn't retry requests outside of what the requests library handles by default). The two main failures are IOErrors (when it's not an image format recognized by PIL) and DecompressionBombError (when the image is beyond what PIL thinks is safe, which the docs say is >0.5GB). If we need to run files larger than this, PIL allows a setting to skip this exception that we can implement in the future.
  • Uses a default image for image failures if specified; otherwise it uses the background settings as the default image.
  • Added more debug/log messages, updated readme to list how you can show these via the verbosity flag.
  • The error that occurs when you run bazel run :atlasmaker -- --help seems to be an absl (or bazel?) bug. It happens even with a basic script that does nothing. However, once you've built it and run the binary (from blaze-bin/...) it doesn't appear. I haven't seen a bug report for this yet, but I assume it has something to do with the absl python wrapper over C++. I'll try to dig into this more to see what is causing this.
  • Made some flags required, updated readme accordingly.

Additionally, I added a utility in the 'utils' subdir patterned on your shell script that retrieves a list of featured images from wikipedia that can be used for testing. Running 1000 images through my mac pro (dual core) finished in under 8 minutes, with 12 images failed due to their not being PIL recognized image formats (webm, svg, etc).

Please let me how it's working for you.

@junjaytan
Copy link
Contributor Author

Update: the error when running bazel run :atlasmaker -- --help doesn't show up anymore once I upgraded my bazel to the latest version (0.15.0). Let me know if you see the same.

Copy link
Collaborator

@jimbojw jimbojw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great work here Junjay! Couple of comments. I'm excited to start using this tool. I think it'll help a lot.


```sh
# Create temp dir to hold outputs if it doesn't exist (optional)
mkdir $PWD/outputs/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you skip this step, the montage process works, but then fails at the end with a Python stacktrace ending with:

IOError: [Errno 2] No such file or directory: '</path/to/pwd>/outputs/spriteatlas.png'

Would be nice to fail fast and discover this missing dir before making the sprite atlas.

Copy link
Contributor Author

@junjaytan junjaytan Jul 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to create dir if not exists and also verify we can write to the output dir. Fails fast otherwise.

Example usage for getting a list of 1000 images with debug messages printed to stdout:

```sh
bazel run :wikipedia_sourcelist_generator -- --num_images=1000 --outputdir=$PWD --verbosity=1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sourcelist generator worked fine for creating a long list, thanks! Was able to make a list of 10k images.

However, when I tried to generate an atlas with these 10k images, it seemed to be doing OK for 15 minutes and then blew up. I'm not sure why, but here's the top and bottom of the very long stack:

[Parallel(n_jobs=-1)]: Done 2576 tasks      | elapsed: 14.6min
E0629 14:57:10.262512 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/2/2a/Coat_of_arms_of_Mexico.svg failed with error: cannot identify image file <_io.BytesIO object at 0x109333d10>
E0629 14:57:11.501877 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/d/d4/Coat_of_Arms_of_Siam_%281873-1910%29.svg failed with error: cannot identify image file <_io.BytesIO object at 0x1093286b0>
E0629 14:57:12.611440 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/a/a1/Coat_of_arms_of_Trinidad_and_Tobago.svg failed with error: cannot identify image file <_io.BytesIO object at 0x109333e30>
E0629 14:57:15.713234 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/d/d6/College_of_Arms-Lant%27s_Roll.svg failed with error: cannot identify image file <_io.BytesIO object at 0x116457710>
E0629 14:57:39.827016 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/6/62/Cheetahs_on_the_Edge_%28Director%27s_Cut%29.ogv failed with error: cannot identify image file <_io.BytesIO object at 0x1166dec50>
E0629 14:59:09.965570 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/c/c0/Big_Buck_Bunny_4K.webm failed with error: cannot identify image file <_io.BytesIO object at 0x109335b30>

Joblib's exception reporting continues...

Traceback (most recent call last):
  File "/private/var/tmp/_bazel_jimbo/bdc924cd5f9f9e5884088ec4eae7b2c1/execroot/ai_google_pair_facets/bazel-out/darwin-fastbuild/bin/facets_atlasmaker/atlasmaker.runfiles/ai_google_pair_facets/facets_atlasmaker/atlasmaker.py", line 182, in <module>
    app.run(main)
  File "/Users/jimbo/facets-env/lib/python2.7/site-packages/absl/app.py", line 274, in run
    _run_main(main, args)
  File "/Users/jimbo/facets-env/lib/python2.7/site-packages/absl/app.py", line 238, in _run_main
    sys.exit(main(argv))
  File "/private/var/tmp/_bazel_jimbo/bdc924cd5f9f9e5884088ec4eae7b2c1/execroot/ai_google_pair_facets/bazel-out/darwin-fastbuild/bin/facets_atlasmaker/atlasmaker.runfiles/ai_google_pair_facets/facets_atlasmaker/atlasmaker.py", line 164, in main
    n_jobs=FLAGS.num_parallel_jobs, verbose=FLAGS.parallelization_verbosity)
  File "/Users/jimbo/checkouts/facets/facets_atlasmaker/parallelize.py", line 70, in get_and_convert_images_parallel
    for location in image_src_locations)
  File "/Users/jimbo/facets-env/lib/python2.7/site-packages/joblib/parallel.py", line 789, in __call__
    self.retrieve()
  File "/Users/jimbo/facets-env/lib/python2.7/site-packages/joblib/parallel.py", line 740, in retrieve
    raise exception
joblib.my_exceptions.JoblibIOError: JoblibIOError
___________________________________________________________________________
Multiprocessing exception:

...

...........................................................................
/Users/jimbo/facets-env/lib/python2.7/site-packages/PIL/Image.py in resize(self=<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=648x432>, size=(80, 53), resample=1, box=(0, 0, 648, 432))
   1742             return self.convert('La').resize(size, resample, box).convert('LA')
   1743 
   1744         if self.mode == 'RGBA':
   1745             return self.convert('RGBa').resize(size, resample, box).convert('RGBA')
   1746 
-> 1747         self.load()
        self.load = <bound method JpegImageFile.load of <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=648x432>>
   1748 
   1749         return self._new(self.im.resize(size, resample, box))
   1750 
   1751     def rotate(self, angle, resample=NEAREST, expand=0, center=None,

...........................................................................
/Users/jimbo/facets-env/lib/python2.7/site-packages/PIL/ImageFile.py in load(self=<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=648x432>)
    223                                 if LOAD_TRUNCATED_IMAGES:
    224                                     break
    225                                 else:
    226                                     self.tile = []
    227                                     raise IOError("image file is truncated "
--> 228                                                   "(%d bytes not processed)" % len(b))
        b = '\x95^\xb0V\xb6\x15\xb9EC\xd6'
    229 
    230                             b = b + s
    231                             n, err_code = decoder.decode(b)
    232                             if n < 0:

IOError: image file is truncated (10 bytes not processed)
___________________________________________________________________________

You don't have to fix this, just pasting in case it's something obvious to you. I'm running atlasmaker again with 1,000 images.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1,000 image atlasmaker worked great! Here's the end of the log:

[Parallel(n_jobs=-1)]: Done 952 tasks      | elapsed:  5.6min
[Parallel(n_jobs=-1)]: Done 1000 out of 1000 | elapsed:  5.8min finished
I0629 15:12:59.930185 140735762211712 montage.py:175] Montaged 1000 images onto sprite atlas of size (2560, 2560) pixels.
W0629 15:12:59.930639 140735762211712 montage.py:178] 12 images had failures and were replaced by the default image

I liked that it told me how many images failed. Here's the top of the manifest.json file:

{
  "source image": "https://upload.wikimedia.org/wikipedia/commons/e/e4/1_taipei_sunrise_panorama_dxr_edit_pangen_141215_1.jpg",
  "offset_x": 0,
  "image_name": "1_taipei_sunrise_panorama_dxr_edit_pangen_141215_1.jpg",
  "success": true,
  "offset_y": 0
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message is currently the entire message, which is much more descriptive. I can truncate this to a more concise message if needed, but for now it may be more helpful to have the full error message.

@@ -0,0 +1,126 @@
# Facets Atlasmaker

Atlasmaker is a command line utility and library for creating sprite atlases. These atlases can be
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for writing all this documentation! Much appreciated.


# Manifest Key names
MANIFEST_IMAGE_NAME_KEY = 'image_name'
MANIFEST_SOURCE_IMAGE_KEY = 'source image'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: 'source image' should be 'source_image' (with an underscore) to match the other key names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

MANIFEST_SOURCE_IMAGE_KEY = 'source image'
MANIFEST_OFFSET_X_KEY = 'offset_x'
MANIFEST_OFFSET_Y_KEY = 'offset_y'
MANIFEST_IMAGE_SUCCESS_KEY = 'success'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that there's a flag to indicate whether download was successful. This allows for easy querying with jq to find out what failed:

$ cat outputs/manifest.json | jq 'select(.success == false)'

Since success is by far the majority case, it would lead to a slightly smaller manifest to omit the success key and instead have an error key which only appears for the failures.

Besides switching from success to error, how hard would it be for the value of the error to be a string describing the kind of error? For example, the two kinds of errors I'm seeing are image is too big (DecompressionBombWarning) and 'cannot identify image file' (e.g. SVG and webm files).

Here's an example of the DecompressionBombWarning:

/Users/jimbo/facets-env/lib/python2.7/site-packages/PIL/Image.py:2514: DecompressionBombWarning: Image size (124090014 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
  DecompressionBombWarning)

You don't have to implement this request, I'm just curious to get your assessment of how hard it might be to add a very brief reason for the error, e.g. TOO_BIG, NON_IMAGE or OTHER.

@junjaytan
Copy link
Contributor Author

Hi Jimbo-- made a couple of fixes:

  • Handles truncated images now, which was the error you were running into. Also contains unit tests for this case.
  • Success is not logged in the manifest, instead it now contains an 'errors' key if errors are found, along with the associated error message. Readme also contains a short description of the manifest format.
  • Now detects duplicate images specified in the source list. By default it ignores them with a warning (which is similar to the current behavior), but user can also opt to fail or only use uniques that retain the same order.

I ran successfully it on the 10k wikipedia test image set on my workstation, which finished in ~21 mins.

Also renamed source image key to have underscore.
@junjaytan
Copy link
Contributor Author

I just realized another bug: if one tries to generate an atlas as jpeg, it fails because RGBA can't be output as jpeg. Going to try to fix this, and also add a fail fast check if someone specifies an output format that PIL can't perform or is not possible with the current settings.

@jimbojw
Copy link
Collaborator

jimbojw commented Jul 9, 2018

In the RGBA/jpeg case, dropping the alpha channel is a reasonable approach. Partially or fully transparent pixels may appear strange, but this is OK.

location, image_convert_settings,
allow_truncated_images=allow_truncated_images,
disk_cache=disk_cache)
for location in image_src_locations)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the output of Parallel looks something like this:

[Parallel(n_jobs=-1)]: Done 562 tasks      | elapsed:  3.4min

It does not include the total number of tasks, just how many have been completed.

I tried casting image_src_locations to a list, hoping that would trigger Parallel to know how many jobs there were, but it didn't seem to make a difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a "--parallelize_verbosity" flag that defines the joblib verbosity (defaulted to a pretty high level of 10, which is why the number of finished tasks is output). I increased this higher but it doesn't give the count of total tasks, like it shows in the joblib docs. I'm not sure what determines when it shows the "Done X tasks out of Y" message. I also tried superficially adding tqdm to get a progress bar but it's crashing when using it with joblib parallel. Might be tricky to add this in, so for now I've just logged at the start of the process how many images are listed in the source file, and one can extrapolate from the existing "jobs done" count. Can look into a better implementation if this is a big issue down the road.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't worry about this, not worth tearing your hear out about. A log line right before the Parallel call to state something like "Scheduling 10000 tasks..." might be nice, but also optional.

… RGB if image format doesn't support RGBA (like jpeg)
@junjaytan
Copy link
Contributor Author

junjaytan commented Jul 11, 2018

A few fixes:

  • It now handles the converting RGBA to RGB when a user wants image formats that don't support RGBA, like jpg.
  • Doesn't fail for any image retrieval failures, and retries up to the max number of retries for images retrieved via http, but only for timeout errors (can add others if you want). For all other types of errors like invalid urls, it doesn't retry and just gives up after one time. The max number of retries and the request timeout can be customized via the '--http_max_retries' and '--http_request_timeout' flags.
  • Reran a new set of 10000 images with some bad urls sprinkled in for good measure, and it worked for me. (~24min on workstation)

Please let me know if you're able to run it. You seem to always discover new edge cases ;)

@junjaytan
Copy link
Contributor Author

Added log message of number of scheduled tasks before parallel runs.

@jimbojw
Copy link
Collaborator

jimbojw commented Jul 17, 2018

Thanks for all your hard work on this! I'm running the 10k example again from my laptop. If it works I'd say we're good to go. 🎉

@jimbojw
Copy link
Collaborator

jimbojw commented Jul 17, 2018

It worked! @jameswex please take a look when you get a chance.

@jameswex
Copy link
Collaborator

jameswex commented Aug 6, 2018

didn't do a full code review but saw that you did a review and tested it out @jimbojw . this looks great, glad to have this addition.

one note is that it would be nice to add a section to the existing Facets Dive readme about sprite making and have it link to the readme for this atlas maker.

@jimbojw
Copy link
Collaborator

jimbojw commented Aug 6, 2018

one note is that it would be nice to add a section to the existing Facets Dive readme about sprite making and have it link to the readme for this atlas maker.

Hi Junjay, can you add a line or two to the Facets Dive README about this? It doesn't need to be a lot, just a pointer to your documentation.

@junjaytan
Copy link
Contributor Author

Hi Jimbo and James,

I added a sentence to the "Providing Sprites For Dive to Render" section linking to atlasmaker. Let me know if you think it should be its own section to be more obvious, or if it looks good to you. Thanks!

@jimbojw jimbojw merged commit 49e0df9 into PAIR-code:master Aug 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants