# Create Image Folder

This notebook downloads the WikiArt dataset (for research and educational use only),
extracts the archive, and organizes all images into a single folder for further processing.

- **Dataset**: WikiArt (see README for access instructions)
- **Output**: `../images/` — a flattened directory of all images
- **Usage**: Set `dataset_url` to the download link from the README, then run all cells in order.

In [None]:
import os
import gdown
import zipfile
import pandas as pd
from tqdm.notebook import tqdm
import shutil

In [None]:
# Dataset information
dataset_url = "<dataset_download_link>"   # replace with the actual URL (see README)
zip_path = "../wikiart_dataset.zip"

# Only download if the file is missing
if os.path.exists(zip_path):
    print("Dataset ZIP already exists, skipping download.")
elif dataset_url.startswith("<"):
    # The placeholder above has not been replaced yet
    print("Set dataset_url to the link from the README before running this cell.")
else:
    print("Downloading dataset ZIP file (this may take a while)...")
    gdown.download(dataset_url, zip_path, quiet=False)
    print("Download complete.")

In [None]:
extract_dir = "../WikiArt_images/"

if os.path.exists(zip_path):
    print("Extracting dataset...")
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall(extract_dir)
    print("Extraction complete.")
else:
    print("Dataset ZIP file not found. Download it first (see the README for access instructions).")
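
Because the download cell skips any ZIP that already exists, an interrupted download can leave a truncated file behind that then fails during extraction. Below is a minimal sketch of a pre-extraction check using only the standard library. The helper name `is_usable_zip` is hypothetical and not part of the original notebook.

```python
import zipfile


def is_usable_zip(path):
    """Return True if `path` is a readable ZIP whose members pass their CRC checks."""
    # is_zipfile() returns False for missing files and for files without a ZIP directory
    if not zipfile.is_zipfile(path):
        return False
    try:
        with zipfile.ZipFile(path, "r") as zf:
            # testzip() returns the name of the first corrupt member, or None if all are fine
            return zf.testzip() is None
    except zipfile.BadZipFile:
        return False
```

In the notebook this could guard the extraction step, for example `if is_usable_zip(zip_path): ...`. A `False` result suggests deleting the file and downloading it again. Note that `testzip()` reads every member, so it can take a while on a large archive.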

In [None]:
# Load the dataset metadata CSV; its 'relative_path' column is used below to locate each image
df = pd.read_csv('https://raw.githubusercontent.com/thefth/ArtSAGENet/main/Dataset/wikiart_full.csv')

In [None]:
create_dir = "../images/"

if not os.path.exists(create_dir):
    os.makedirs(create_dir)
    print(f"Created directory: {create_dir}")
else:
    print(f"Directory already exists: {create_dir}")

In [None]:
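# Move images into a flat directory for easier access
moved, missing = 0, 0
for rel_path in tqdm(df['relative_path'].tolist(), desc="Moving images"):
    src = os.path.join(extract_dir, rel_path)
    dst = os.path.join(create_dir, os.path.basename(rel_path))
    if os.path.exists(src):
        shutil.move(src, dst)
        moved += 1
    else:
        missing += 1  # already moved in an earlier run, or absent from the archive

# Count what this run actually moved; the folder may hold files from earlier runs
print(f"Moved {moved} images into {create_dir} ({missing} listed paths not found)")
print(f"{create_dir} now contains {len(os.listdir(create_dir))} files")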
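
Flattening by `os.path.basename` assumes that no two entries in `relative_path` share a file name. If two did, a later move could overwrite an earlier one. The sketch below shows one way to make that assumption explicit. The function name `flatten_images` and its return value are illustrative and not part of the original notebook, and nothing here establishes whether WikiArt actually contains such duplicates.

```python
import os
import shutil


def flatten_images(src_root, dst_dir, rel_paths):
    """Move each src_root/rel_path into dst_dir, never overwriting an existing file.

    Returns (moved, missing, collisions), each a list of relative paths.
    """
    os.makedirs(dst_dir, exist_ok=True)
    moved, missing, collisions = [], [], []
    for rel_path in rel_paths:
        src = os.path.join(src_root, rel_path)
        dst = os.path.join(dst_dir, os.path.basename(rel_path))
        if not os.path.exists(src):
            missing.append(rel_path)      # not in the archive, or moved earlier
        elif os.path.exists(dst):
            collisions.append(rel_path)   # same basename already present in dst_dir
        else:
            shutil.move(src, dst)
            moved.append(rel_path)
    return moved, missing, collisions
```

With the notebook's variables this would be called as `flatten_images(extract_dir, create_dir, df['relative_path'].tolist())`. A non-empty `collisions` list means some images would need renaming before they can share one folder.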

---

**Note:**  
This notebook is provided for academic and research purposes.  
The WikiArt dataset remains the property of its original authors and is distributed under its respective usage terms.  
Please cite the dataset’s original publication if used in research.