A rotation-invariant vector representation of planar shapes, demonstrated on the SQUID fish dataset. Every shape is turned into a star-shaped radial function on the discrete circle, then collapsed into a single signature vector. The Euclidean distance between two signatures is invariant to rotations of the underlying shapes, so clustering and nearest-neighbour search can run directly in vector space, with no per-shape alignment needed.
A separate non-star branch generalises the same idea to non-star objects via
an annulus decomposition.
.
|-- data/
| |-- SQUID/ raw SQUID dataset (1100 fish: .gif + .pts)
| |-- ExtractedFromGifs/ boundary points + visualisations recovered from the .gif files
| | (.pts and .png per shape; the pipeline reads from here)
| `-- experiment_ids.json fixed shape IDs used in the rotation and kNN experiments
|-- src/ pipeline scripts
|-- figures/ scripts that render headline figures
|-- docs/ static images used in this README
`-- README.md
After running the pipeline the following files appear (all git-ignored):
- data/combinedPts.json, data/combinedPtsNormalizedByArea.json: cached vertex data
- data/starShaped_signatures*ByArea.json, data/euclideanStarSignatures128ByArea.json: cached signatures and pairwise distances
- data/rotatedSampleByArea*.json: the random-rotation collections used for clustering
- figures/*.png, figures/*.pdf: rendered figures
- reports/knn_EuclideanStar/<query_id>/: per-query 5-NN PNGs

Everything shipped in the repo is enough to regenerate them.
data/SQUID/ is the upstream SQUID dataset (1100 fish outlines) of
Mokhtarian, Abbasi and Kittler. Each shape comes as a binary GIF plus a .pts
file with boundary samples.
data/ExtractedFromGifs/ contains the same shapes after re-extracting their
boundaries from the GIFs (so the point ordering is consistent across the
dataset) along with a PNG preview per shape; the whole pipeline reads from
this folder.
A sample preprocessed shape:
A 4x6 sample grid of fish from the dataset:
For a star-shaped function f : Z_m -> R (radial samples of a standardised
shape), the rotation-invariant signature is
V_f[k] = (1/m) * sum_{j=0..m-1} exp(-(f[j] - f[(j+k) mod m]))
Each summand factors as exp(-f[j]) * exp(f[(j+k) mod m]), so V_f is a circular
cross-correlation of the two pointwise exponentials exp(-f) and exp(f) and is
therefore computed in O(m log m) via the FFT (see src/starShapedSignatureMaker.py).
The Euclidean distance ||V_f - V_g|| is invariant to discrete rotations of
f as well as the Reverse-of-Complement operation f -> M - f(2*pi - .).
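
The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in src/starShapedSignatureMaker.py: the function names `star_signature` and `star_signature_direct` are hypothetical, and the input is assumed to be a 1-D array of m radial samples.

```python
import numpy as np

def star_signature(f):
    """V_f[k] = (1/m) * sum_j exp(-(f[j] - f[(j+k) mod m])), computed via the FFT."""
    f = np.asarray(f, dtype=float)
    m = f.size
    a = np.exp(-f)  # a[j] = exp(-f[j])
    b = np.exp(f)   # b[j] = exp(+f[j])
    # sum_j a[j] * b[(j+k) mod m] is a circular cross-correlation; for real a
    # its DFT is conj(FFT(a)) * FFT(b).
    corr = np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)).real
    return corr / m

def star_signature_direct(f):
    """O(m^2) reference implementation of the same formula, for checking."""
    f = np.asarray(f, dtype=float)
    # np.roll(f, -k)[j] equals f[(j+k) mod m]
    return np.array([np.mean(np.exp(-(f - np.roll(f, -k)))) for k in range(f.size)])

# Illustrative input: 64 random radial samples (not real SQUID data).
rng = np.random.default_rng(0)
f = rng.uniform(0.2, 1.0, size=64)

rotated = np.roll(f, 7)             # discrete rotation of f
rev_complement = f.max() - f[::-1]  # Reverse-of-Complement with M = max(f)

dist_rotated = np.linalg.norm(star_signature(f) - star_signature(rotated))
dist_roc = np.linalg.norm(star_signature(f) - star_signature(rev_complement))
```

Both `dist_rotated` and `dist_roc` should be zero up to floating-point error, since substituting either transformation into the sum only re-indexes its terms.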
The numbered scripts are the ones you actually run; the others are helpers imported / invoked by them.
| # | Script | What it does |
|---|---|---|
| 1 | src/extractor.py | Reads the raw .gif files, traces boundaries with OpenCV, and writes ordered .pts files into data/ExtractedFromGifs/. |
| 2 | src/combinePts.py | Combines the per-shape .pts files into a single data/combinedPts.json. |
| 3 | src/combinePtsNormalizeByArea.py | Translates each polygon so its area-weighted centroid is at the origin, then rescales so the maximum radius is 1. Writes data/combinedPtsNormalizedByArea.json. |
| 4 | src/draw.py | (Optional) Renders a PNG preview of every .pts file in data/ExtractedFromGifs/. |
| 5 | src/starShapedSignatureMaker.py | Approximates each polygon by a star-shaped function f: S^m -> R+ and computes its FFT-based signature V_f. Writes data/starShaped_signatures128ByArea.json. |
| 6 | src/sampleRotatedVersion.py | Picks 10 random shape IDs, generates 9 random-rotation copies of each (100 shapes total), and writes one data/rotatedSampleByArea*.json file. Run multiple times to get multiple trials. |
| 7 | src/euclideanOfStarSignatures.py | Computes the full 1100 x 1100 pairwise Euclidean distance matrix between signatures. Writes data/euclideanStarSignatures128ByArea.json. |
| 8 | src/kmeansForStarShaped.py | Simple stand-alone k-means clustering report on one rotated-sample file. Useful as a smoke test. |
| 9 | src/sign_kmeans_report.py | Sweeps m in {8,16,32,48,64,80,96,128,256,512,1024} across every data/rotatedSampleByArea*.json file and appends per-trial accuracy rows to figures/kmeans_results_per_file.csv. |
| 10 | src/sign_kmeans_reportFromCSV_with_std.py | Reads the CSV produced by step 9 and renders the accuracy-vs-m plot with a +/- 1 std band (figures/overall_accuracy_vs_ntheta_with_std.png). |
| 11 | src/knn_Euclidean.py | 5-NN retrieval for the 10 fixed query IDs (see data/experiment_ids.json). Writes reports/knn_EuclideanStar/<idx>/. |
| 12 | src/resultOfReports.py | Convenience report summariser used to inspect intermediate JSON outputs. |
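
The normalisation in step 3 can be sketched as follows. This is an illustration under assumptions, not the code in src/combinePtsNormalizeByArea.py: the helper names `polygon_centroid` and `normalize_polygon` are hypothetical, the input is assumed to be a simple (non-self-intersecting) polygon given as ordered (x, y) pairs, and the area-weighted centroid is computed with the standard shoelace formula.

```python
import math

def polygon_centroid(points):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    n = len(points)
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    # C = (1 / (6A)) * sum(...), and 6A = 3 * area2
    return cx / (3.0 * area2), cy / (3.0 * area2)

def normalize_polygon(points):
    """Move the centroid to the origin, then scale so the maximum radius is 1."""
    cx, cy = polygon_centroid(points)
    centred = [(x - cx, y - cy) for x, y in points]
    r_max = max(math.hypot(x, y) for x, y in centred)
    return [(x / r_max, y / r_max) for x, y in centred]

# Illustrative L-shaped polygon: its area-weighted centroid differs from
# the plain mean of its vertices.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
normalized = normalize_polygon(l_shape)
```

Using the area-weighted centroid rather than the vertex mean keeps the centre stable when the boundary is sampled unevenly, which matters because the radial function in step 5 is measured from that centre.
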
| Script | Output | What it shows |
|---|---|---|
| figures/shape_table.py | figures/shape_table_4x6.{png,pdf} | A 4x6 grid of sample fish outlines from the dataset. |
| figures/shape8_triple.py | figures/kk102_triple_styled.{png,pdf} | The same fish at three discretisation levels (m = 36, 120, 360). |
| figures/shape8_styled.py | figures/kk53_60_styled_COMBINED.{png,pdf} | Illustration of the Reverse-of-Complement operation on a single fish. |
| figures/knn_combined_figure.py | figures/combined_knn_accuracy.png | Side-by-side: clustering accuracy curve and a 5-NN retrieval gallery. |
Tested with Python 3.13.
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt

If data/ExtractedFromGifs/ is already populated (it ships with the repo) you
can skip step 1.
python3 src/extractor.py # GIF -> .pts (only needed once)
python3 src/combinePts.py # data/combinedPts.json
python3 src/combinePtsNormalizeByArea.py # data/combinedPtsNormalizedByArea.json
python3 src/starShapedSignatureMaker.py # data/starShaped_signatures128ByArea.json
python3 src/euclideanOfStarSignatures.py # data/euclideanStarSignatures128ByArea.json

# 6 independent trials (each samples 10 shapes + 9 random rotations each)
for i in 0 1 2 3 4 5; do python3 src/sampleRotatedVersion.py; done
python3 src/sign_kmeans_report.py # -> figures/kmeans_results_per_file.csv
python3 src/sign_kmeans_reportFromCSV_with_std.py # -> figures/overall_accuracy_vs_ntheta_with_std.png

sampleRotatedVersion.py writes to data/rotatedSampleByArea<N>.json; the
exact shape IDs used in each trial are recorded in data/experiment_ids.json
if you want to reproduce them byte-for-byte.
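
One trial of the rotated-sample step can be sketched as below. This is a hedged illustration, not the code in src/sampleRotatedVersion.py: the names `rotate` and `sample_rotated`, the `seed` parameter, and the record layout (`label`, `points`) are assumptions made for the example, and shapes are assumed to be already centred on the origin.

```python
import math
import random

def rotate(points, angle):
    """Rotate (x, y) points about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def sample_rotated(shapes, n_shapes=10, n_rotations=9, seed=None):
    """Pick n_shapes IDs and add n_rotations randomly rotated copies of each.

    `shapes` maps a shape ID to its list of (x, y) boundary points.
    Each record keeps the source ID as the ground-truth cluster label.
    """
    rng = random.Random(seed)  # a fixed seed makes the trial repeatable
    ids = rng.sample(sorted(shapes), n_shapes)
    sample = []
    for shape_id in ids:
        sample.append({"label": shape_id, "points": list(shapes[shape_id])})
        for _ in range(n_rotations):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            sample.append({"label": shape_id,
                           "points": rotate(shapes[shape_id], angle)})
    return sample

# Illustrative input: 20 toy triangles standing in for the fish outlines.
toy_shapes = {i: [(1.0, 0.0), (0.0, 2.0 + i), (-1.0, 0.0)] for i in range(20)}
trial = sample_rotated(toy_shapes, seed=0)
```

With the defaults each trial holds 10 originals plus 90 rotated copies, so a clustering run with k = 10 can be scored against the `label` field.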
python3 src/knn_Euclidean.py
python3 figures/knn_combined_figure.py

This writes figures/combined_knn_accuracy.png.
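
Because the signatures are plain vectors, the retrieval in steps 7 and 11 reduces to a distance matrix plus a sort. The sketch below shows that idea only; it is not the code in src/euclideanOfStarSignatures.py or src/knn_Euclidean.py, and the names `pairwise_euclidean` and `knn` are hypothetical. Signatures are assumed to be stacked as the rows of one array.

```python
import numpy as np

def pairwise_euclidean(signatures):
    """Full n x n Euclidean distance matrix between the rows of `signatures`."""
    X = np.asarray(signatures, dtype=float)
    sq = (X ** 2).sum(axis=1)
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y ; clip tiny negatives from rounding
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.clip(d2, 0.0, None))

def knn(dist, query, k=5):
    """Indices of the k rows closest to `query`, excluding the query itself."""
    order = np.argsort(dist[query], kind="stable")
    order = order[order != query]
    return order[:k].tolist()

# Illustrative 2-D "signatures" (real ones would have m entries each).
toy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]])
dist = pairwise_euclidean(toy)
neighbours = knn(dist, query=0, k=2)
```

The Gram-matrix form avoids materialising an n x n x m array of differences, which keeps memory modest for the 1100 x 1100 case.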
python3 figures/shape8_triple.py
python3 figures/shape8_styled.py
python3 figures/shape_table.py

This code accompanies the paper Rotation-Invariant Vectorized Shape Representations by Hamid Shafieasl and Jeff M. Phillips, available on arXiv: https://arxiv.org/abs/2605.27498.
@misc{shafieasl2025rotation,
author = {Hamid Shafieasl and Jeff M. Phillips},
title = {Rotation-Invariant Vectorized Shape Representations},
year = {2025},
eprint = {2605.27498},
archivePrefix = {arXiv},
primaryClass = {cs.CG},
url = {https://arxiv.org/abs/2605.27498}
}



