# How to Use jpegio
### (Last updated 2019.03.19)
##### *If you find any errors or need additional features, please let us know (leedaewon@nsr.re.kr, daewon4you@gmail.com)*

- `jpegio` is a Python package that provides an API by wrapping some JPEG input/output functions of [libjpeg](https://www.ijg.org) implemented in C.
- It references the source code of [image-forensics](https://github.com/MKLab-ITI/image-forensics) provided by the Greek ITI-CERTH research lab [MKLab](https://mklab.iti.gr).
- Uber Research also offers [similar code](https://github.com/uber-research/jpeg2dct).
- [Cython](https://cython.org/) was used to create the Python module from the C code.
- On Microsoft Windows, it uses [libjpeg-turbo](https://libjpeg-turbo.org).
- On UNIX-like operating systems, the `jpegio` package installation process includes compiling the source code of `libjpeg`.

In [1]:
import jpegio as jio

### Reading a JPEG Image
Compressed JPEG data is primarily handled through the `DecompressedJpeg` object.
Other objects (e.g., `ZigzagDct1d`) can be used to handle DCT coefficients in separate data structures (explained in detail later).

In [2]:
fpath = "../tests/images/cherries01.jpg"  # Path to the JPEG file
jpeg = jio.read(fpath)
type(jpeg)

jpegio.decompressedjpeg.DecompressedJpeg

#### Checking Image Size

In [3]:
jpeg.image_width, jpeg.image_height

(756, 504)

#### Checking the Number of YCbCr Channels

In [4]:
jpeg.num_components

3

#### Accessing DCT Coefficients
- The DCT coefficients are stored in the member variable `coef_arrays`, which is a plain Python list.
- The `coef_arrays` list contains one DCT coefficient array per channel.
- Each DCT coefficient array is a 2D `numpy.ndarray` object.
- The DCT coefficient arrays are not managed as a single 3D `numpy.ndarray` because their sizes can differ by channel.
- The sizes differ when downsampling is applied to the CbCr channels during JPEG compression.
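
As a minimal sketch of why a list is used, the snippet below builds zero-filled placeholder arrays with per-channel shapes typical of a chroma-downsampled JPEG and shows that NumPy refuses to stack them into one 3D array. The shapes and the `int16` dtype are assumptions chosen for illustration, and no real coefficients are involved.

```python
import numpy as np

# Placeholder arrays (all zeros) standing in for per-channel DCT
# coefficient arrays; shapes and dtype are assumptions for illustration.
y = np.zeros((504, 760), dtype=np.int16)   # luma (Y)
cb = np.zeros((256, 384), dtype=np.int16)  # chroma (Cb), downsampled
cr = np.zeros((256, 384), dtype=np.int16)  # chroma (Cr), downsampled

try:
    stacked = np.stack([y, cb, cr])  # needs identical shapes
except ValueError as err:
    stacked = None
    print("Cannot stack:", err)

# A plain list has no such restriction.
coef_arrays = [y, cb, cr]
print([a.shape for a in coef_arrays])
```
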

In [5]:
type(jpeg.coef_arrays)

list

In [6]:
type(jpeg.coef_arrays[0])

numpy.ndarray

In [7]:
len(jpeg.coef_arrays)  # Same as the number of channels

3

In [8]:
print(jpeg.coef_arrays[0].ndim)  # Number of dimensions of the first DCT coefficient array
print(jpeg.coef_arrays[1].ndim)  # Number of dimensions of the second DCT coefficient array
print(jpeg.coef_arrays[2].ndim)  # Number of dimensions of the third DCT coefficient array

2
2
2


In [9]:
# Print the first DCT coefficient block of each channel
for i in range(jpeg.num_components):
    coef = jpeg.coef_arrays[i]
    print("[Channel #%d] The 1st DCT coef. block" % (i + 1))
    print(coef[:8, :8])

[Channel #1] The 1st DCT coef. block
[[-567   11  -47   -6    4    2    1    1]
 [ -81  -41   22   13   -5   -2   -2   -4]
 [  19   15   10  -12    3   -1    1    1]
 [   8   -7   -7   -4    3   -3    0   -2]
 [  -7   -5    6    3   -2   -2   -2   -2]
 [  10    0  -10   -5   -3    1   -1   -1]
 [  -2   -4   -1    2    2   -2   -2    1]
 [  -3    0    4   -3   -3    0   -1   -1]]
[Channel #2] The 1st DCT coef. block
[[-58   6   1   8   0   0   2   0]
 [ 11   5  -4   3   6  -3   1  -2]
 [  5  -7  -2   1  -2  -2  -2   0]
 [ -3  -3   5  -1  -3   1   2   2]
 [  3   0  -6  -2   2   1  -1   0]
 [  4  -1   2   1  -1  -3  -2  -1]
 [ -2   0   2   0   0  -1   0   0]
 [  2   0  -2  -1   1   2   1  -1]]
[Channel #3] The 1st DCT coef. block
[[-4 -4 -3 -2  0  1  2 -1]
 [-2 -2 -3  3  0  1 -2  0]
 [-9  9  2 -1  3 -1  0  0]
 [ 4  1 -2 -5 -2  0  0  0]
 [-3 -2 -2 -1  0 -1  1  0]
 [-2  1  1 -1  1  1  0  0]
 [ 1 -2 -1 -1  0  1  0 -1]
 [ 0 -2  0  0  0 -1  0  0]]


In [10]:
# Print the size of the DCT coefficient array for each channel.
# (You can see that the array sizes differ between channels.)
for i in range(jpeg.num_components):
    coef = jpeg.coef_arrays[i]
    print("[Channel #%d] Size of DCT coef. array: %s" % (i + 1, coef.shape))

[Channel #1] Size of DCT coef. array: (504, 760)
[Channel #2] Size of DCT coef. array: (256, 384)
[Channel #3] Size of DCT coef. array: (256, 384)


#### Reshaping DCT Coefficient `numpy.ndarray` Arrays
- DCT coefficient arrays are often easier to work with after reshaping.
- For example, if you want to process data in blocks, it is more convenient to use indices like (block row, block column, 8x8 array row, 8x8 array column).
- Use `numpy.reshape` and `numpy.transpose` appropriately.
- `numpy.reshape` and `numpy.transpose` generally return views of the same memory rather than copies, so the performance cost is usually negligible.
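
The view behaviour described above can be checked on a small synthetic array. This is only a sketch: the 16x24 array of running indices is a stand-in for a real coefficient array, and the variable names are illustrative.

```python
import numpy as np

# Synthetic 16x24 "coefficient" array (2x3 blocks of 8x8); the values are a
# running index so that each element is easy to identify.
coef = np.arange(16 * 24, dtype=np.int16).reshape(16, 24)
nr_blk, nc_blk = coef.shape[0] // 8, coef.shape[1] // 8

# (block row, row in block, block col, col in block)
#   -> (block row, block col, row in block, col in block)
coef_blk = coef.reshape(nr_blk, 8, nc_blk, 8).transpose(0, 2, 1, 3)

# Block (1, 2) holds the same data as the corresponding 8x8 slice.
same = np.array_equal(coef_blk[1, 2], coef[8:16, 16:24])

# The block view shares memory with the original array (no copy was made),
# so writing through the view also changes `coef`.
is_view = np.shares_memory(coef, coef_blk)
coef_blk[1, 2, 0, 0] = -1
print(coef_blk.shape, same, is_view, coef[8, 16])
```

Because the result is a view, modifying a block in place also modifies the underlying array; call `.copy()` first if that is not wanted.
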

In [11]:
# To access the array in 8x8 blocks,
# you can reshape the array as follows.

coef = jpeg.coef_arrays[0]  # DCT coefficient array of the first channel
nr_blk = coef.shape[0] // 8  # Number of 8x8 block rows
nc_blk = coef.shape[1] // 8  # Number of 8x8 block columns
print(nr_blk, nc_blk)

63 95


In [12]:
# Split the rows into nr_blk groups of 8 and the columns into nc_blk groups of 8.
# The resulting axes are (block row, row in block, block column, column in block).

coef_blk = coef.reshape(nr_blk, 8, nc_blk, 8)
print(coef_blk.shape)

(63, 8, 95, 8)


In [13]:
# Change the axis positions of the array to make indexing easier.
# You only need to think about indexing as desired, without considering how the data is stored internally
# (since the internal data memory is a 1-dimensional array anyway).

coef_blk = coef.reshape(nr_blk, 8, nc_blk, 8).transpose(0, 2, 1, 3)
print(coef_blk.shape)

(63, 95, 8, 8)


In [14]:
# DCT coefficient block located at row 3, column 2
block = coef_blk[3, 2, :, :]
print(block)


[[-644   -6    3  -11    2   -2    7   -1]
 [ -26  -27   28    5   -5    0   -4    0]
 [  -4   -6    7   15   -1   -2   -4    3]
 [   2   17  -17    0   -1    1    1    0]
 [   0   -8    5    0    4   -2   -2    1]
 [  -5   -8    6    4   -1    0   -3   -2]
 [   1    2   -3   -2   -1    0    3    0]
 [   2    1    4   -1    0   -2   -1   -1]]


In [15]:
# DCT coefficient block located at row 10, column 10
block = coef_blk[10, 10, :, :]
print(block)


[[-674    0  -51   -3   16   -3    0   -3]
 [ -61   15  -58   14   15   10   -4    2]
 [  -4  -11  -13    2   15    4    3    1]
 [   8   -2  -13   -3    5   -2   -2    1]
 [   2    2   -2   -7    3   -5   -2    0]
 [ -16    6    8    1   -7   -5    3    0]
 [  -2   -1    8    1   -9   -4    0    1]
 [   4    0   -5   -3    5   -1    5    0]]


In [16]:
# The trailing slices can be omitted: giving only the block indices returns the whole 8x8 block.
block = coef_blk[3, 4]
print(block)


[[-680  -29    7   -8   -2   -3   -1    4]
 [ -19  -28    6   12   -4    3    0    2]
 [  -5  -14    0    9    1   -2   -1    1]
 [   0    8    6   -5   -1   -4    0    4]
 [  -2   -6    1    3   -5    0    0   -2]
 [  -8   -5    0    5    6   -3    2    0]
 [   1   -1    0    1   -1    0    1    2]
 [   0    3    1   -3   -3   -1    0    0]]


#### Checking Information by Channel
- To check information for each JPEG channel, use the `comp_info` member variable of `DecompressedJpeg`.
- It contains information related to downsampling, which is particularly useful for checking sizes in JPEGs where downsampling has been applied to the CbCr channels.
- For example, `h_samp_factor` and `v_samp_factor` of the `ComponentInfo` object are the horizontal and vertical sampling factors of each YCbCr channel. If you simply need the image size after downsampling, use `downsampled_width` and `downsampled_height` of the `ComponentInfo` object.
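
The relation between sampling factors and sizes can be sketched in plain Python. This assumes the usual libjpeg convention that a component's dimensions scale by its sampling factor divided by the largest sampling factor among all components, rounded up. `component_geometry` and `ceil_div` are hypothetical helpers written for this example, not part of `jpegio`.

```python
DCTSIZE = 8  # JPEG uses 8x8 DCT blocks


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def component_geometry(image_width, image_height, h_samp, v_samp,
                       max_h_samp, max_v_samp):
    """Return (downsampled_width, downsampled_height,
    width_in_blocks, height_in_blocks) for one component.

    Hypothetical helper: assumes sizes scale by samp / max_samp, rounded up.
    """
    ds_w = ceil_div(image_width * h_samp, max_h_samp)
    ds_h = ceil_div(image_height * v_samp, max_v_samp)
    w_blk = ceil_div(image_width * h_samp, max_h_samp * DCTSIZE)
    h_blk = ceil_div(image_height * v_samp, max_v_samp * DCTSIZE)
    return ds_w, ds_h, w_blk, h_blk


# A 756x504 image with Y sampled 2x2 and Cb/Cr sampled 1x1.
luma = component_geometry(756, 504, 2, 2, 2, 2)
chroma = component_geometry(756, 504, 1, 1, 2, 2)
print(luma)    # (756, 504, 95, 63)
print(chroma)  # (378, 252, 48, 32)
```

Multiplying the block counts by 8 gives the coefficient array shapes `(504, 760)` and `(256, 384)` (rows first, then columns).
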

In [17]:
# comp_info is a list object that contains ComponentInfo objects corresponding to each channel.
# You can think of "component" as corresponding to "channel".

type(jpeg.comp_info)


list

In [18]:
type(jpeg.comp_info[0])  # ComponentInfo object

jpegio.componentinfo.ComponentInfo

In [19]:
jpeg.comp_info[0]  # ComponentInfo of the first channel

<jpegio.componentinfo.ComponentInfo at 0x7bc1c80bbb30>

In [20]:
for ci in jpeg.comp_info:
    print("[Component #%d]" % (ci.component_id))
    print("Quantization table number:", ci.quant_tbl_no)
    print("DC table number:", ci.dc_tbl_no)
    print("AC table number:", ci.ac_tbl_no)
    print("Width after downsampling:", ci.downsampled_width)  # Width after downsampling
    print("Height after downsampling:", ci.downsampled_height)  # Height after downsampling
    print("Width in blocks:", ci.width_in_blocks)  # Number of block columns
    print("Height in blocks:", ci.height_in_blocks)  # Number of block rows
    print("Vertical sampling factor:", ci.v_samp_factor)  # Vertical sampling factor
    print("Horizontal sampling factor:", ci.h_samp_factor)  # Horizontal sampling factor
    print()

[Component #1]
Quantization table number: 0
DC table number: 0
AC table number: 0
Width after downsampling: 756
Height after downsampling: 504
Width in blocks: 95
Height in blocks: 63
Vertical sampling factor: 2
Horizontal sampling factor: 2

[Component #2]
Quantization table number: 1
DC table number: 1
AC table number: 1
Width after downsampling: 378
Height after downsampling: 252
Width in blocks: 48
Height in blocks: 32
Vertical sampling factor: 1
Horizontal sampling factor: 1

[Component #3]
Quantization table number: 1
DC table number: 1
AC table number: 1
Width after downsampling: 378
Height after downsampling: 252
Width in blocks: 48
Height in blocks: 32
Vertical sampling factor: 1
Horizontal sampling factor: 1


#### Counting Non-Zero DCT AC Coefficients
- In an 8x8 DCT coefficient block, the first coefficient (row 0, column 0) is called the DC coefficient, and the remaining coefficients are called AC coefficients.
- Most steganography tools that modify JPEG DCT coefficients target the AC coefficients, so it is often necessary to count the non-zero AC coefficients, excluding the DC coefficient.
- `jpegio` provides a member function called `count_nnz_ac`, which returns the number of non-zero AC coefficients over all DCT coefficient blocks.

In [21]:
jpeg.count_nnz_ac()

476659

If you want to count the number of non-zero AC coefficients for each channel, you can use the code below.


In [22]:
import numpy as np

for i in range(jpeg.num_components):
    coef = jpeg.coef_arrays[i]
    nnz_total = np.count_nonzero(coef)  # Number of non-zero coefficients among all DCT coefficients
    nnz_dc = np.count_nonzero(coef[::8, ::8])  # Number of non-zero DC coefficients
    print(
        "[Channel #%d] Number of non-zero DCT AC coefficients: %d"
        % (i + 1, nnz_total - nnz_dc)
    )


[Channel #1] Number of non-zero DCT AC coefficients: 327921
[Channel #2] Number of non-zero DCT AC coefficients: 76925
[Channel #3] Number of non-zero DCT AC coefficients: 71813
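
If a per-block count is needed rather than a per-channel total, the same idea can be combined with the block view. The sketch below uses a small synthetic array with hand-placed values (not real coefficients); the variable names are illustrative.

```python
import numpy as np

# Synthetic 16x16 array = 2x2 blocks of 8x8, mostly zeros.
coef = np.zeros((16, 16), dtype=np.int16)
coef[0, 0] = -500   # DC of block (0, 0)
coef[0, 1] = 3      # AC of block (0, 0)
coef[1, 0] = -2     # AC of block (0, 0)
coef[8, 8] = 120    # DC of block (1, 1)
coef[15, 15] = 1    # AC of block (1, 1)

# View as (block row, block col, row in block, col in block).
blocks = coef.reshape(2, 8, 2, 8).transpose(0, 2, 1, 3)

nnz_all = np.count_nonzero(blocks, axis=(2, 3))           # non-zeros per block
nnz_dc = (blocks[:, :, 0, 0] != 0).astype(nnz_all.dtype)  # 1 where DC != 0
nnz_ac = nnz_all - nnz_dc                                 # non-zero ACs per block
print(nnz_ac)
print(nnz_ac.sum())
```
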


#### Reading DCT Coefficients as a 1D Array Using Zig-Zag Scanning
- Depending on the need, you might require a 1D array of DCT coefficients read using the zig-zag scanning method.
- Performing the zig-zag scan block by block in pure Python is noticeably slower than doing it in C.
- `jpegio` provides a `ZigzagDct1d` class, which is a subclass of `DecompressedJpeg`.
- To read a JPEG as a `ZigzagDct1d` object, you need to specify a flag.
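
For reference, the zig-zag order itself can be generated programmatically and applied to all blocks at once with NumPy indexing. This is only a sketch on synthetic data: `zigzag_indices` is a hypothetical helper written for this example, and it is not how `jpegio` implements the scan internally.

```python
import numpy as np


def zigzag_indices(n=8):
    """Return the (row, col) pairs of an n x n block in zig-zag scan order.

    Hypothetical helper: cells are sorted by anti-diagonal (row + col);
    odd diagonals are walked with the row increasing, even diagonals
    with the column increasing.
    """
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


order = zigzag_indices()
rows = np.array([r for r, _ in order])
cols = np.array([c for _, c in order])

# Synthetic 16x24 array (2x3 blocks of 8x8); values are a running index.
coef = np.arange(16 * 24, dtype=np.int16).reshape(16, 24)
blocks = coef.reshape(2, 8, 3, 8).transpose(0, 2, 1, 3)

# Indexing the last two axes with the index arrays reorders every block at once.
zz = blocks[:, :, rows, cols]
print(zz.shape)   # (2, 3, 64)
print(order[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```
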

In [23]:
# For reference, the flag that selects DecompressedJpeg is jio.DECOMPRESSED.
jpeg_zz = jio.read(fpath, jio.ZIGZAG_DCT_1D)

In [24]:
type(jpeg_zz)

jpegio.zigzagdct1d.ZigzagDct1d

In [25]:
coef = jpeg_zz.coef_arrays[0]
coef.shape

(63, 95, 64)

- You can see that the size of the last dimension of the DCT coefficient array is 64: each 8x8 block is stored as a 1D array of 64 coefficients in zig-zag order instead of a 2D array.
- Below is a comparison of the time taken by the C-implemented zig-zag scan and by a naive pure-Python implementation.

In [26]:
import os
import glob
import time

BS = 8  # Side length of a square DCT block

list_fpaths = glob.glob(os.path.join("../tests/images", "*.jpg"))

for fpath in list_fpaths:
    # Read DCT with ZigzagDct1d
    time_beg_zz = time.time()
    jpeg_zz = jio.read(fpath, jio.ZIGZAG_DCT_1D)
    list_coef_zz = []
    for c in range(jpeg_zz.num_components):
        nrows_blk, ncols_blk = jpeg_zz.get_coef_block_array_shape(c)

        arr_zz = jpeg_zz.coef_arrays[c].reshape(nrows_blk * ncols_blk, BS * BS)
        list_coef_zz.append(arr_zz)
    # end of for
    time_elapsed_zz = time.time() - time_beg_zz

    # Read DCT with DecompressedJpeg
    time_beg_de = time.time()
    jpeg_de = jio.read(fpath, jio.DECOMPRESSED)
    list_coef_de = []
    for c in range(jpeg_de.num_components):
        arr_de = jpeg_de.coef_arrays[c]
        nrows_blk, ncols_blk = jpeg_de.get_coef_block_array_shape(c)
        arr_de = arr_de.reshape(nrows_blk, BS, ncols_blk, BS)
        arr_de = arr_de.transpose(0, 2, 1, 3)
        arr_de = arr_de.reshape(nrows_blk, ncols_blk, BS, BS)

        zz_de = np.zeros((nrows_blk, ncols_blk, BS * BS), dtype=np.int16)

        # Zigzag scanning over DCT blocks.
        for i in range(nrows_blk):
            for j in range(ncols_blk):
                zz_de[i, j][0] = arr_de[i, j][0, 0]

                zz_de[i, j][1] = arr_de[i, j][0, 1]
                zz_de[i, j][2] = arr_de[i, j][1, 0]

                zz_de[i, j][3] = arr_de[i, j][2, 0]
                zz_de[i, j][4] = arr_de[i, j][1, 1]
                zz_de[i, j][5] = arr_de[i, j][0, 2]

                zz_de[i, j][6] = arr_de[i, j][0, 3]
                zz_de[i, j][7] = arr_de[i, j][1, 2]
                zz_de[i, j][8] = arr_de[i, j][2, 1]
                zz_de[i, j][9] = arr_de[i, j][3, 0]

                zz_de[i, j][10] = arr_de[i, j][4, 0]
                zz_de[i, j][11] = arr_de[i, j][3, 1]
                zz_de[i, j][12] = arr_de[i, j][2, 2]
                zz_de[i, j][13] = arr_de[i, j][1, 3]
                zz_de[i, j][14] = arr_de[i, j][0, 4]

                zz_de[i, j][15] = arr_de[i, j][0, 5]
                zz_de[i, j][16] = arr_de[i, j][1, 4]
                zz_de[i, j][17] = arr_de[i, j][2, 3]
                zz_de[i, j][18] = arr_de[i, j][3, 2]
                zz_de[i, j][19] = arr_de[i, j][4, 1]
                zz_de[i, j][20] = arr_de[i, j][5, 0]

                zz_de[i, j][21] = arr_de[i, j][6, 0]
                zz_de[i, j][22] = arr_de[i, j][5, 1]
                zz_de[i, j][23] = arr_de[i, j][4, 2]
                zz_de[i, j][24] = arr_de[i, j][3, 3]
                zz_de[i, j][25] = arr_de[i, j][2, 4]
                zz_de[i, j][26] = arr_de[i, j][1, 5]
                zz_de[i, j][27] = arr_de[i, j][0, 6]

                zz_de[i, j][28] = arr_de[i, j][0, 7]
                zz_de[i, j][29] = arr_de[i, j][1, 6]
                zz_de[i, j][30] = arr_de[i, j][2, 5]
                zz_de[i, j][31] = arr_de[i, j][3, 4]
                zz_de[i, j][32] = arr_de[i, j][4, 3]
                zz_de[i, j][33] = arr_de[i, j][5, 2]
                zz_de[i, j][34] = arr_de[i, j][6, 1]
                zz_de[i, j][35] = arr_de[i, j][7, 0]

                zz_de[i, j][36] = arr_de[i, j][7, 1]
                zz_de[i, j][37] = arr_de[i, j][6, 2]
                zz_de[i, j][38] = arr_de[i, j][5, 3]
                zz_de[i, j][39] = arr_de[i, j][4, 4]
                zz_de[i, j][40] = arr_de[i, j][3, 5]
                zz_de[i, j][41] = arr_de[i, j][2, 6]
                zz_de[i, j][42] = arr_de[i, j][1, 7]

                zz_de[i, j][43] = arr_de[i, j][2, 7]
                zz_de[i, j][44] = arr_de[i, j][3, 6]
                zz_de[i, j][45] = arr_de[i, j][4, 5]
                zz_de[i, j][46] = arr_de[i, j][5, 4]
                zz_de[i, j][47] = arr_de[i, j][6, 3]
                zz_de[i, j][48] = arr_de[i, j][7, 2]

                zz_de[i, j][49] = arr_de[i, j][7, 3]
                zz_de[i, j][50] = arr_de[i, j][6, 4]
                zz_de[i, j][51] = arr_de[i, j][5, 5]
                zz_de[i, j][52] = arr_de[i, j][4, 6]
                zz_de[i, j][53] = arr_de[i, j][3, 7]

                zz_de[i, j][54] = arr_de[i, j][4, 7]
                zz_de[i, j][55] = arr_de[i, j][5, 6]
                zz_de[i, j][56] = arr_de[i, j][6, 5]
                zz_de[i, j][57] = arr_de[i, j][7, 4]

                zz_de[i, j][58] = arr_de[i, j][7, 5]
                zz_de[i, j][59] = arr_de[i, j][6, 6]
                zz_de[i, j][60] = arr_de[i, j][5, 7]

                zz_de[i, j][61] = arr_de[i, j][6, 7]
                zz_de[i, j][62] = arr_de[i, j][7, 6]

                zz_de[i, j][63] = arr_de[i, j][7, 7]
            # end of for (j)
        # end of for (i)
        list_coef_de.append(zz_de)
    # end of for (c)
    time_elapsed_de = time.time() - time_beg_de
    print("[File: %s]" % (os.path.basename(fpath)))
    print(
        "[Time] C-optimized: %f, Naive Python: %f" % (time_elapsed_zz, time_elapsed_de),
        end="\n\n",
    )

[File: football05.jpg]
[Time] C-optimized: 0.011345, Naive Python: 0.140260

[File: football02.jpg]
[Time] C-optimized: 0.010443, Naive Python: 0.135342

[File: greenlake03.jpg]
[Time] C-optimized: 0.007940, Naive Python: 0.154066

[File: greenlake06.jpg]
[Time] C-optimized: 0.008807, Naive Python: 0.132836

[File: greenlake09.jpg]
[Time] C-optimized: 0.009086, Naive Python: 0.134209

[File: football09.jpg]
[Time] C-optimized: 0.010123, Naive Python: 0.134594

[File: test01.jpg]
[Time] C-optimized: 0.013407, Naive Python: 0.232347

[File: cherries01.jpg]
[Time] C-optimized: 0.010126, Naive Python: 0.133333

[File: test06.jpg]
[Time] C-optimized: 0.064610, Naive Python: 0.848039

[File: greenlake04.jpg]
[Time] C-optimized: 0.011782, Naive Python: 0.154014

[File: cherries02.jpg]
[Time] C-optimized: 0.011211, Naive Python: 0.146698

[File: arborgreens02.jpg]
[Time] C-optimized: 0.009580, Naive Python: 0.134932

[File: football08.jpg]
[Time] C-optimized: 0.010058, Naive Python: 0.136592

[File: