Initial commit of sprite atlasmaker. #137

junjaytan · 2018-06-19T05:12:23Z

Command line utility for creating sprite atlases from a list of source image locations. Apologies in advance for the large PR.

jimbojw · 2018-06-19T17:57:54Z

Thanks Junjay! I'm taking a look.

…rate list of images from wikipedia for testing, Misc other minor bug fixes as well.

junjaytan · 2018-06-25T22:24:36Z

Following up from last week, here are the updates in the latest commit:

It now handles file failures (although it doesn't retry requests outside of what the requests library handles by default). The two main failures are IOErrors (when it's not an image format recognized by PIL) and DecompressionBombError (when the image is beyond what PIL thinks is safe, which the docs say is >0.5GB). If we need to run files larger than this, PIL allows a setting to skip this exception that we can implement in the future.
Uses a default image for image failures if specified; otherwise it uses the background settings as the default image.
Added more debug/log messages, updated readme to list how you can show these via the verbosity flag.
The error that occurs when you run bazel run :atlasmaker -- --help seems to be an absl (or bazel?) bug. It happens even with a basic script that does nothing. However, once you've built it and run the binary (from blaze-bin/...) it doesn't appear. I haven't seen a bug report for this yet, but I assume it has something to do with the absl python wrapper over C++. I'll try to dig into this more to see what is causing this.
Made some flags required, updated readme accordingly.

Additionally, I added a utility in the 'utils' subdir patterned on your shell script that retrieves a list of featured images from wikipedia that can be used for testing. Running 1000 images through my mac pro (dual core) finished in under 8 minutes, with 12 images failed due to their not being PIL recognized image formats (webm, svg, etc).

Please let me how it's working for you.

junjaytan · 2018-06-26T22:11:45Z

Update: the error when running bazel run :atlasmaker -- --help doesn't show up anymore once I upgraded my bazel to the latest version (0.15.0). Let me know if you see the same.

jimbojw

Really great work here Junjay! Couple of comments. I'm excited to start using this tool. I think it'll help a lot.

jimbojw · 2018-06-29T18:34:09Z

facets_atlasmaker/README.md

+
+```sh
+# Create temp dir to hold outputs if it doesn't exist (optional)
+mkdir $PWD/outputs/


If you skip this step, the montage process works, but then fails at the end with a Python stacktrace ending with:

IOError: [Errno 2] No such file or directory: '</path/to/pwd>/outputs/spriteatlas.png'

Would be nice to fail fast and discover this missing dir before making the sprite atlas.

Updated to create dir if not exists and also verify we can write to the output dir. Fails fast otherwise.

jimbojw · 2018-06-29T19:09:21Z

facets_atlasmaker/utils/README.md

+Example usage for getting a list of 1000 images with debug messages printed to stdout:
+
+```sh
+bazel run :wikipedia_sourcelist_generator -- --num_images=1000 --outputdir=$PWD --verbosity=1


The sourcelist generator worked fine for creating a long list, thanks! Was able to make a list of 10k images.

However, when I tried to generate an atlas with these 10k images, it seemed to be doing OK for 15 minutes and then blew up. I'm not sure why, but here's the top and bottom of the very long stack:

[Parallel(n_jobs=-1)]: Done 2576 tasks | elapsed: 14.6min E0629 14:57:10.262512 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/2/2a/Coat_of_arms_of_Mexico.svg failed with error: cannot identify image file <_io.BytesIO object at 0x109333d10> E0629 14:57:11.501877 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/d/d4/Coat_of_Arms_of_Siam_%281873-1910%29.svg failed with error: cannot identify image file <_io.BytesIO object at 0x1093286b0> E0629 14:57:12.611440 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/a/a1/Coat_of_arms_of_Trinidad_and_Tobago.svg failed with error: cannot identify image file <_io.BytesIO object at 0x109333e30> E0629 14:57:15.713234 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/d/d6/College_of_Arms-Lant%27s_Roll.svg failed with error: cannot identify image file <_io.BytesIO object at 0x116457710> E0629 14:57:39.827016 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/6/62/Cheetahs_on_the_Edge_%28Director%27s_Cut%29.ogv failed with error: cannot identify image file <_io.BytesIO object at 0x1166dec50> E0629 14:59:09.965570 140735762211712 parallelize.py:33] Retrieval of file https://upload.wikimedia.org/wikipedia/commons/c/c0/Big_Buck_Bunny_4K.webm failed with error: cannot identify image file <_io.BytesIO object at 0x109335b30> Joblib's exception reporting continues... Traceback (most recent call last): File "/private/var/tmp/_bazel_jimbo/bdc924cd5f9f9e5884088ec4eae7b2c1/execroot/ai_google_pair_facets/bazel-out/darwin-fastbuild/bin/facets_atlasmaker/atlasmaker.runfiles/ai_google_pair_facets/facets_atlasmaker/atlasmaker.py", line 182, in <module> app.run(main) File "/Users/jimbo/facets-env/lib/python2.7/site-packages/absl/app.py", line 274, in run _run_main(main, args) File "/Users/jimbo/facets-env/lib/python2.7/site-packages/absl/app.py", line 238, in _run_main sys.exit(main(argv)) File "/private/var/tmp/_bazel_jimbo/bdc924cd5f9f9e5884088ec4eae7b2c1/execroot/ai_google_pair_facets/bazel-out/darwin-fastbuild/bin/facets_atlasmaker/atlasmaker.runfiles/ai_google_pair_facets/facets_atlasmaker/atlasmaker.py", line 164, in main n_jobs=FLAGS.num_parallel_jobs, verbose=FLAGS.parallelization_verbosity) File "/Users/jimbo/checkouts/facets/facets_atlasmaker/parallelize.py", line 70, in get_and_convert_images_parallel for location in image_src_locations) File "/Users/jimbo/facets-env/lib/python2.7/site-packages/joblib/parallel.py", line 789, in __call__ self.retrieve() File "/Users/jimbo/facets-env/lib/python2.7/site-packages/joblib/parallel.py", line 740, in retrieve raise exception joblib.my_exceptions.JoblibIOError: JoblibIOError ___________________________________________________________________________ Multiprocessing exception:

...

........................................................................... /Users/jimbo/facets-env/lib/python2.7/site-packages/PIL/Image.py in resize(self=<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=648x432>, size=(80, 53), resample=1, box=(0, 0, 648, 432)) 1742 return self.convert('La').resize(size, resample, box).convert('LA') 1743 1744 if self.mode == 'RGBA': 1745 return self.convert('RGBa').resize(size, resample, box).convert('RGBA') 1746 -> 1747 self.load() self.load = <bound method JpegImageFile.load of <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=648x432>> 1748 1749 return self._new(self.im.resize(size, resample, box)) 1750 1751 def rotate(self, angle, resample=NEAREST, expand=0, center=None, ........................................................................... /Users/jimbo/facets-env/lib/python2.7/site-packages/PIL/ImageFile.py in load(self=<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=648x432>) 223 if LOAD_TRUNCATED_IMAGES: 224 break 225 else: 226 self.tile = [] 227 raise IOError("image file is truncated " --> 228 "(%d bytes not processed)" % len(b)) b = '\x95^\xb0V\xb6\x15\xb9EC\xd6' 229 230 b = b + s 231 n, err_code = decoder.decode(b) 232 if n < 0: IOError: image file is truncated (10 bytes not processed) ___________________________________________________________________________

You don't have to fix this, just pasting in case it's something obvious to you. I'm running atlasmaker again with 1,000 images.

1,000 image atlasmaker worked great! Here's the end of the log:

[Parallel(n_jobs=-1)]: Done 952 tasks | elapsed: 5.6min [Parallel(n_jobs=-1)]: Done 1000 out of 1000 | elapsed: 5.8min finished I0629 15:12:59.930185 140735762211712 montage.py:175] Montaged 1000 images onto sprite atlas of size (2560, 2560) pixels. W0629 15:12:59.930639 140735762211712 montage.py:178] 12 images had failures and were replaced by the default image

I liked that it told me how many images failed. Here's the top of the manifest.json file:

{ "source image": "https://upload.wikimedia.org/wikipedia/commons/e/e4/1_taipei_sunrise_panorama_dxr_edit_pangen_141215_1.jpg", "offset_x": 0, "image_name": "1_taipei_sunrise_panorama_dxr_edit_pangen_141215_1.jpg", "success": true, "offset_y": 0 }

Error message is currently the entire message, which is much more descriptive. I can truncate this to a more concise message if needed, but for now it may be more helpful to have the full error message.

jimbojw · 2018-06-29T19:10:14Z

facets_atlasmaker/README.md

@@ -0,0 +1,126 @@
+# Facets Atlasmaker
+
+Atlasmaker is a command line utility and library for creating sprite atlases. These atlases can be 


Thank you for writing all this documentation! Much appreciated.

jimbojw · 2018-06-29T19:17:13Z

facets_atlasmaker/montage.py

+
+# Manifest Key names
+MANIFEST_IMAGE_NAME_KEY = 'image_name'
+MANIFEST_SOURCE_IMAGE_KEY = 'source image'


Nit: 'source image' should be 'source_image' (with an underscore) to match the other key names.

jimbojw · 2018-06-29T19:21:23Z

facets_atlasmaker/montage.py

+MANIFEST_SOURCE_IMAGE_KEY = 'source image'
+MANIFEST_OFFSET_X_KEY = 'offset_x'
+MANIFEST_OFFSET_Y_KEY = 'offset_y'
+MANIFEST_IMAGE_SUCCESS_KEY = 'success'


I like that there's a flag to indicate whether download was successful. This allows for easy querying with jq to find out what failed:

$ cat outputs/manifest.json | jq 'select(.success == false)'

Since success is by far the majority case, it would lead to a slightly smaller manifest to omit the success key and instead have an error key which only appears for the failures.

Besides switching from success to error, how hard would it be for the value of the error to be a string describing the kind of error? For example, the two kinds of errors I'm seeing are image is too big (DecompressionBombWarning) and 'cannot identify image file' (e.g. SVG and webm files).

Here's an example of the DecompressionBombWarning:

/Users/jimbo/facets-env/lib/python2.7/site-packages/PIL/Image.py:2514: DecompressionBombWarning: Image size (124090014 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack. DecompressionBombWarning)

You don't have to implement this request, I'm just curious to get your assessment of how hard it might be to add a very brief reason for the error, e.g. TOO_BIG, NON_IMAGE or OTHER.

… tests.

…t tests.

junjaytan · 2018-07-04T20:58:51Z

Hi Jimbo-- made a couple of fixes:

Handles truncated images now, which was the error you were running into. Also contains unit tests for this case.
Success is not logged in the manifest, instead it now contains an 'errors' key if errors are found, along with the associated error message. Readme also contains a short description of the manifest format.
Now detects duplicate images specified in the source list. By default it ignores them with a warning (which is similar to the current behavior), but user can also opt to fail or only use uniques that retain the same order.

I ran successfully it on the 10k wikipedia test image set on my workstation, which finished in ~21 mins.

Also renamed source image key to have underscore.

junjaytan · 2018-07-06T17:19:02Z

I just realized another bug: if one tries to generate an atlas as jpeg, it fails because RGBA can't be output as jpeg. Going to try to fix this, and also add a fail fast check if someone specifies an output format that PIL can't perform or is not possible with the current settings.

jimbojw · 2018-07-09T16:47:11Z

In the RGBA/jpeg case, dropping the alpha channel is a reasonable approach. Partially or fully transparent pixels may appear strange, but this is OK.

jimbojw · 2018-07-10T18:24:27Z

facets_atlasmaker/parallelize.py

+          location, image_convert_settings,
+          allow_truncated_images=allow_truncated_images,
+          disk_cache=disk_cache)
+      for location in image_src_locations)


Currently, the output of Parallel looks something like this:

[Parallel(n_jobs=-1)]: Done 562 tasks | elapsed: 3.4min

It does not include the total number of tasks, just how many have been completed.

I tried casting image_src_locations to a list, hoping that would trigger Parallel to know how many jobs there were, but it didn't seem to make a difference.

There is a "--parallelize_verbosity" flag that defines the joblib verbosity (defaulted to a pretty high level of 10, which is why the number of finished tasks is output). I increased this higher but it doesn't give the count of total tasks, like it shows in the joblib docs. I'm not sure what determines when it shows the "Done X tasks out of Y" message. I also tried superficially adding tqdm to get a progress bar but it's crashing when using it with joblib parallel. Might be tricky to add this in, so for now I've just logged at the start of the process how many images are listed in the source file, and one can extrapolate from the existing "jobs done" count. Can look into a better implementation if this is a big issue down the road.

Don't worry about this, not worth tearing your hear out about. A log line right before the Parallel call to state something like "Scheduling 10000 tasks..." might be nice, but also optional.

… RGB if image format doesn't support RGBA (like jpeg)

junjaytan · 2018-07-11T14:07:03Z

A few fixes:

It now handles the converting RGBA to RGB when a user wants image formats that don't support RGBA, like jpg.
Doesn't fail for any image retrieval failures, and retries up to the max number of retries for images retrieved via http, but only for timeout errors (can add others if you want). For all other types of errors like invalid urls, it doesn't retry and just gives up after one time. The max number of retries and the request timeout can be customized via the '--http_max_retries' and '--http_request_timeout' flags.
Reran a new set of 10000 images with some bad urls sprinkled in for good measure, and it worked for me. (~24min on workstation)

Please let me know if you're able to run it. You seem to always discover new edge cases ;)

…set timeout and max retries.

junjaytan · 2018-07-11T17:07:21Z

Added log message of number of scheduled tasks before parallel runs.

jimbojw · 2018-07-17T15:01:33Z

Thanks for all your hard work on this! I'm running the 10k example again from my laptop. If it works I'd say we're good to go. 🎉

jimbojw · 2018-07-17T19:16:15Z

It worked! @jameswex please take a look when you get a chance.

jameswex · 2018-08-06T18:39:20Z

didn't do a full code review but saw that you did a review and tested it out @jimbojw . this looks great, glad to have this addition.

one note is that it would be nice to add a section to the existing Facets Dive readme about sprite making and have it link to the readme for this atlas maker.

jimbojw · 2018-08-06T18:56:53Z

one note is that it would be nice to add a section to the existing Facets Dive readme about sprite making and have it link to the readme for this atlas maker.

Hi Junjay, can you add a line or two to the Facets Dive README about this? It doesn't need to be a lot, just a pointer to your documentation.

… fixes.

junjaytan · 2018-08-07T21:06:56Z

Hi Jimbo and James,

I added a sentence to the "Providing Sprites For Dive to Render" section linking to atlasmaker. Let me know if you think it should be its own section to be more obvious, or if it looks good to you. Thanks!

junjaytan added 2 commits June 25, 2018 17:30

Initial commit of sprite atlasmaker.

300a857

Handle file retrieval failures, support default image, script to gene…

1dd32ec

…rate list of images from wikipedia for testing, Misc other minor bug fixes as well.

jimbojw reviewed Jun 29, 2018

View reviewed changes

junjaytan added 2 commits July 3, 2018 11:21

Creates output dir if not exists, or fail fast. A few additional unit…

09d6834

… tests.

Handles truncated images, passes error messages to manifest, more uni…

700032b

…t tests.

Detect and handle duplicate images specified in source list.

5e683a5

Also renamed source image key to have underscore.

jimbojw reviewed Jul 10, 2018

View reviewed changes

Verifies output image format early on or fails, handles converting to…

2ab02b1

… RGB if image format doesn't support RGBA (like jpeg)

Catch all fetch image errors, retry for http timeouts, allow user to …

3652a10

…set timeout and max retries.

jameswex mentioned this pull request Aug 6, 2018

How to load and view your data? #15

Open

junjaytan added 2 commits August 7, 2018 10:20

Merge remote-tracking branch 'upstream/master'

71a5044

Added reference to atlasmaker in facets dive readme, other minor typo…

7eb40a9

… fixes.

jimbojw merged commit 49e0df9 into PAIR-code:master Aug 10, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initial commit of sprite atlasmaker. #137

Initial commit of sprite atlasmaker. #137

junjaytan commented Jun 19, 2018

jimbojw commented Jun 19, 2018

junjaytan commented Jun 25, 2018

junjaytan commented Jun 26, 2018

jimbojw left a comment

jimbojw Jun 29, 2018

junjaytan Jul 6, 2018 •

edited

jimbojw Jun 29, 2018

jimbojw Jun 29, 2018

junjaytan Jul 6, 2018

jimbojw Jun 29, 2018

jimbojw Jun 29, 2018

junjaytan Jul 6, 2018

jimbojw Jun 29, 2018

junjaytan commented Jul 4, 2018

junjaytan commented Jul 6, 2018

jimbojw commented Jul 9, 2018

jimbojw Jul 10, 2018

junjaytan Jul 11, 2018

jimbojw Jul 11, 2018

junjaytan commented Jul 11, 2018 •

edited

junjaytan commented Jul 11, 2018

jimbojw commented Jul 17, 2018

jimbojw commented Jul 17, 2018

jameswex commented Aug 6, 2018

jimbojw commented Aug 6, 2018

junjaytan commented Aug 7, 2018

		@@ -0,0 +1,126 @@
		# Facets Atlasmaker

		Atlasmaker is a command line utility and library for creating sprite atlases. These atlases can be

Initial commit of sprite atlasmaker. #137

Initial commit of sprite atlasmaker. #137

Conversation

junjaytan commented Jun 19, 2018

jimbojw commented Jun 19, 2018

junjaytan commented Jun 25, 2018

junjaytan commented Jun 26, 2018

jimbojw left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

junjaytan Jul 6, 2018 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

junjaytan commented Jul 4, 2018

junjaytan commented Jul 6, 2018

jimbojw commented Jul 9, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

junjaytan commented Jul 11, 2018 • edited

junjaytan commented Jul 11, 2018

jimbojw commented Jul 17, 2018

jimbojw commented Jul 17, 2018

jameswex commented Aug 6, 2018

jimbojw commented Aug 6, 2018

junjaytan commented Aug 7, 2018

junjaytan Jul 6, 2018 •

edited

junjaytan commented Jul 11, 2018 •

edited