# Visualizing TID2008

> We'll be logging TID2008 into WandB to visualize it.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
#| hide
import os; os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

In [28]:
from pathlib import Path

import pandas as pd
import wandb
from fastcore.foundation import L
from fastcore.parallel import parallel
from fastprogress.fastprogress import master_bar, progress_bar

from iqadatasets.datasets import TID2008

Logging a dataset to WandB as a `Table` is straightforward and makes it much easier to inspect the data. Because we have already built a set of helper classes to load the datasets, most of the work is done. We'll start by building the helper object and inspecting the original `.csv` file:

In [7]:
dst = TID2008(path = Path("/lustre/ific.uv.es/ml/uv075/Databases/IQA/TID/TID2008"))

In [8]:
dst.data

Unnamed: 0,Reference,Distorted,MOS,Reference_ID,Distortion_ID,Distortion_Intensity
0,I01.BMP,I01_01_1.bmp,5.9706,1,1,1
1,I01.BMP,I01_01_2.bmp,5.4167,1,1,2
2,I01.BMP,I01_01_3.bmp,4.5556,1,1,3
3,I01.BMP,I01_01_4.bmp,4.3143,1,1,4
4,I01.BMP,I01_02_1.bmp,6.1429,1,2,1
...,...,...,...,...,...,...
1695,I25.BMP,I25_16_4.bmp,4.6000,25,16,4
1696,I25.BMP,I25_17_1.bmp,7.2400,25,17,1
1697,I25.BMP,I25_17_2.bmp,5.0000,25,17,2
1698,I25.BMP,I25_17_3.bmp,6.4615,25,17,3


As we can see, the `.csv` file already contains all the information we need.

We could iterate over its rows and load the images one by one, but the `.dataset` attribute lets us load them in batches, which is considerably faster.

In [25]:
round(len(dst.data) / BATCH_SIZE)

13
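
Note that `round` goes to the nearest integer, so it undercounts whenever the last batch is only partially full. A minimal sketch of the arithmetic, assuming the full TID2008 set of 1,700 distorted images (`n_rows` and the `batch_sizes` helper are illustrative names, not part of `iqadatasets`):

```python
import math

def batch_sizes(n_items, batch_size):
    """Yield the size of each consecutive batch (hypothetical helper)."""
    for start in range(0, n_items, batch_size):
        yield min(batch_size, n_items - start)

n_rows = 1700      # assumed row count: 25 references x 17 distortions x 4 levels
batch_size = 128

sizes = list(batch_sizes(n_rows, batch_size))
print(round(n_rows / batch_size))      # 13: nearest integer, misses the partial batch
print(math.ceil(n_rows / batch_size))  # 14: counts the partial batch too
print(len(sizes), sizes[-1])           # 14 36
```

Under that assumption the progress bar needs a total of 14, with a final batch of 36 images.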

In [30]:
import math

BATCH_SIZE = 128
# Number of batches, rounded up so the final partial batch is counted.
total = math.ceil(len(dst.data) / BATCH_SIZE)
reference, distorted = L(), L()
mb = master_bar(dst.dataset.batch(BATCH_SIZE), total=total, total_time=True)
# The MOS values are already in `dst.data`, so only the images are collected.
for img, dist, _ in mb:
    img = L([wandb.Image(i) for i in progress_bar(img, leave=True, master=mb)])
    dist = L([wandb.Image(i) for i in progress_bar(dist, leave=True, master=mb)])
    reference.extend(img)
    distorted.extend(dist)

 |████████████████████████████████████████| 100.00% [36/36 00:01<00:00]

Having created all the necessary `wandb.Image` objects, we can replace the `Reference` and `Distorted` columns in the previous dataframe and log it into *WandB*:

In [31]:
wb_table = dst.data.copy()
wb_table["Reference"] = reference
wb_table["Distorted"] = distorted
wb_table.head()

Unnamed: 0,Reference,Distorted,MOS,Reference_ID,Distortion_ID,Distortion_Intensity
0,<wandb.sdk.data_types.image.Image object at 0x...,<wandb.sdk.data_types.image.Image object at 0x...,5.9706,1,1,1
1,<wandb.sdk.data_types.image.Image object at 0x...,<wandb.sdk.data_types.image.Image object at 0x...,5.4167,1,1,2
2,<wandb.sdk.data_types.image.Image object at 0x...,<wandb.sdk.data_types.image.Image object at 0x...,4.5556,1,1,3
3,<wandb.sdk.data_types.image.Image object at 0x...,<wandb.sdk.data_types.image.Image object at 0x...,4.3143,1,1,4
4,<wandb.sdk.data_types.image.Image object at 0x...,<wandb.sdk.data_types.image.Image object at 0x...,6.1429,1,2,1
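
The column replacement relies on each list holding exactly one object per row, in the same order as the rows of `dst.data`. A small sketch of that behaviour, assuming `pandas` is available and using plain strings as stand-ins for the `wandb.Image` objects (the toy frame and its values are made up for illustration):

```python
import pandas as pd

# Toy stand-in for `dst.data`; the file names and scores are illustrative only.
data = pd.DataFrame({"Reference": ["a.bmp", "a.bmp", "b.bmp"],
                     "MOS": [5.9, 5.4, 4.5]})
images = ["<img 0>", "<img 1>", "<img 2>"]  # placeholders for wandb.Image objects

table = data.copy()            # work on a copy so the original paths survive
table["Reference"] = images    # one object per row, matched by position

error = None
try:
    data.copy()["Reference"] = images[:2]   # one entry short
except ValueError as err:
    error = err
    print(f"pandas refuses the assignment: {err}")
```

A length mismatch fails loudly, but a wrong order would not, so it is worth spot-checking a few rows in the logged table.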


In [34]:
wb_table_table = wandb.Table(dataframe=wb_table)

In [32]:
wandb.init(job_type="viz_data",
           project="TID2008",
           name="VizTID2008")

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
[34m[1mwandb[0m: Currently logged in as: [33mjorgvt[0m. Use [1m`wandb login --relogin`[0m to force relogin


In [35]:
wandb.log({"TID2008": wb_table_table})

In [36]:
wandb.finish()

556.905 MB of 556.966 MB uploaded (0.000 MB deduped)