Injection error #14

Closed
shc261392 opened this issue May 14, 2021 · 3 comments

@shc261392

Image https://drive.google.com/file/d/17DgQCF-TOUEk9LaroHYa9OPaGhXw6soQ/view?usp=sharing

Image sha256sum: 55146f1665fa84fe2a76d13772f7f83ea02a188cde68a047cb9acd2e28005d90

$ git checkout 72735b
$ mv <downloaded-image> dog.jpeg
$ cp dog.jpeg dog-thumbnail.jpeg
$ python3 utils/starling_multiple_injection.py dog.jpeg
Traceback (most recent call last):
  File "/Users/shc/numbers/github/starling-cai/utils/starling_multiple_injection.py", line 166, in <module>
    starling = Starling(photo_bytes,
  File "/Users/shc/numbers/github/starling-cai/cai/starling.py", line 74, in __init__
    self.app11_headers = get_app11_marker_segment_headers(self.raw_bytes)
  File "/Users/shc/numbers/github/starling-cai/cai/jumbf.py", line 219, in get_app11_marker_segment_headers
    header['tbox']   = data_bytes[offset + 16 : offset + 20].decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 3: unexpected end of data
bafu commented May 14, 2021

Root Cause

The CAI module treats non-CAI data as CAI metadata and tries to parse it.

Analysis

Currently, the CAI module locates the CAI metadata (more precisely, the APP11 Marker Segments) only by searching for the bytes 0xFFEB, which represent the APP11 Marker.

Screenshot from 2021-05-14 17-42-42

Under this condition, any two bytes equal to 0xFFEB are treated as the beginning of CAI metadata, even when they are just payload bytes inside unrelated data.
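
To make the failure mode concrete, here is a minimal sketch of a marker search that relies on 0xFFEB alone. It is not the code in cai/jumbf.py; the function name and the sample bytes are hypothetical. The only detail taken from the traceback is that the `tbox` field is read from `offset + 16` to `offset + 20` and decoded as UTF-8.

```python
from typing import List

APP11_MARKER = b"\xff\xeb"


def naive_app11_offsets(data: bytes) -> List[int]:
    """Return every offset where the bytes FF EB occur, with no validation."""
    offsets = []
    pos = data.find(APP11_MARKER)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(APP11_MARKER, pos + 1)
    return offsets


# Hypothetical non-CAI data that happens to contain FF EB at offset 2.
# The four bytes at offset + 16 end in 0xED, a truncated UTF-8 sequence.
stray = b"\x12\x34\xff\xeb" + bytes(14) + b"abc\xed"

hits = naive_app11_offsets(stray)
offset = hits[0]

try:
    tbox = stray[offset + 16 : offset + 20].decode("utf-8")
except UnicodeDecodeError as exc:
    # Same class of failure as in the report above.
    error_reason = exc.reason
```

The search has no way of telling a real marker from a coincidental byte pair, so the header parser ends up decoding arbitrary bytes.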

Solution

Method 1: Workaround (I will take this approach because of resource constraints)

Checking both the APP11 Marker and the CI parameter is a quick workaround. It is only a workaround because it reduces, but does not eliminate, the probability of treating non-CAI data as CAI metadata.

Screenshot from 2021-05-14 17-44-36
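
A minimal sketch of what such a check could look like. The function name is hypothetical, and two things are assumptions rather than facts from this thread: that the CI field sits right after the 2-byte Le field (that is, at `offset + 4`), and that its value is 0x4A50 (ASCII "JP"). Both should be verified against the JUMBF / JPEG XT specification before relying on them.

```python
from typing import List

APP11_MARKER = b"\xff\xeb"
CI_JP = b"\x4a\x50"  # assumed CI value ("JP"); verify against the spec


def cai_candidate_offsets(data: bytes) -> List[int]:
    """Return offsets where FF EB is followed by Le (2 bytes) and then the CI bytes."""
    offsets = []
    pos = data.find(APP11_MARKER)
    while pos != -1:
        # Assumed layout: marker (2 bytes), Le (2 bytes), CI (2 bytes), ...
        if data[pos + 4 : pos + 6] == CI_JP:
            offsets.append(pos)
        pos = data.find(APP11_MARKER, pos + 1)
    return offsets


# Hypothetical data: a stray FF EB at offset 2, then a segment-like header at offset 22.
stray = b"\x12\x34\xff\xeb" + bytes(14) + b"abc\xed"
segment_like = b"\xff\xeb\x00\x10" + CI_JP + bytes(12)

candidates = cai_candidate_offsets(stray + segment_like)
```

This filters out the stray match but still accepts any data that happens to contain FF EB followed two bytes later by the CI value, which is why it only lowers the probability of a false match.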

Method 2: Root cause solution

To the best of my knowledge, fixing this issue completely requires finding the starting point of every Marker Segment between SOI and DQT and parsing only the APP11 Marker Segments.

Screenshot from 2021-05-14 17-49-15
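
A rough sketch of that approach, assuming the standard JPEG layout in which each marker segment after SOI is `FF xx` followed by a 2-byte big-endian length that counts itself but not the marker bytes. All function names are hypothetical, and the sketch ignores fill bytes and other corner cases a real parser would need to handle.

```python
import struct
from typing import Iterator, List, Tuple

DQT, SOS, EOI, APP11 = 0xDB, 0xDA, 0xD9, 0xEB


def iter_header_segments(data: bytes) -> Iterator[Tuple[int, int, int]]:
    """Yield (marker, offset, length) for each marker segment after SOI.

    `offset` points at the 0xFF of the marker. Iteration stops at SOS or EOI,
    or as soon as the bytes no longer look like a chain of segments.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return  # lost sync; give up rather than guess
        marker = data[pos + 1]
        if marker in (SOS, EOI):
            return
        (length,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        if length < 2:
            return
        yield marker, pos, length
        pos += 2 + length  # jump over the payload instead of scanning it


def app11_segment_offsets(data: bytes, stop_marker: int = DQT) -> List[int]:
    """Collect offsets of APP11 segments that start before `stop_marker`."""
    offsets = []
    for marker, offset, _length in iter_header_segments(data):
        if marker == stop_marker:
            break
        if marker == APP11:
            offsets.append(offset)
    return offsets


def segment(marker: int, payload: bytes) -> bytes:
    """Build a marker segment for the synthetic example below."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic file: an APP1 payload containing a stray FF EB, then a real APP11 segment.
sample = (
    b"\xff\xd8"
    + segment(0xE1, b"Exif\x00\x00" + b"\xff\xeb" + bytes(4))
    + segment(APP11, b"JP" + bytes(10))
    + segment(DQT, b"\x00")
    + b"\xff\xda"
)

found = app11_segment_offsets(sample)
```

Because the walker skips each payload using its length field, the FF EB inside the APP1 payload (offset 12 in this sample) is never examined, and only the real segment at offset 18 is reported.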

bafu commented May 14, 2021

Testing image information

$ file scott.jpg 
scott.jpg: JPEG image data, Exif standard: [TIFF image data, little-endian, direntries=12, height=3024, manufacturer=samsung, model=SM-N9810, orientation=upper-right, xresolution=210, yresolution=218, resolutionunit=2, software=N9810ZSU1ATI4, datetime=2020:10:24 15:03:33, width=4032], baseline, precision 8, 4032x3024, components 3

bafu commented May 19, 2021

The workaround seems to work (although with known issue #15)

  • Raw photo
    • scott
  • Thumbnail (100x100)
    • scott-thumbnail
  • Multi-injection photo
    • scott-cai-cai-cai
$ sha256sum scott*
47e148074e9a3f658119c82e1a2e5aebb148a2a3864f6a1e4d1f58a4bd31a0ee  scott-cai-cai-cai.jpg
55146f1665fa84fe2a76d13772f7f83ea02a188cde68a047cb9acd2e28005d90  scott.jpg
bfd0c280dfa195a0e8468a0f0d1d6beecb652a70cf591c19187d0c5166cef6a8  scott-thumbnail.jpg
