GitHub - FatikImran/Parallel-Triangle-Counting: C++17 Triangle Counting on real SNAP graphs with scalar, SIMD intrinsics, OpenMP SIMD, OpenMP threads, and hybrid variants; reproducible benchmarking + analysis.

Quick Start (3 commands)

make
./scripts/run_benchmarks.sh 8 1
./scripts/analyze_latest.sh

This builds the project, runs all main datasets (including LiveJournal), and generates analysis artifacts under results/analysis/.

Build

make

Run

./triangle_counter <edge_list_file> [threads]

Examples:

./triangle_counter data/sanity_edges.txt
./triangle_counter data/sanity_edges.txt 8

Project structure (what each file does)

Core C++ code

src/main.cpp
- Program entry point.
- Parses CLI args (input_path, optional threads).
- Prints architecture info, runs scalar baseline, then runs all optimized variants.
- Reports vertices, edges, triangle count, time, and speedup.
src/triangle_counter.hpp
- Public API and shared data structures.
- Declares Graph, Variant, preprocessing and counting function signatures.
src/triangle_counter.cpp
- Graph loading and preprocessing:
  - reads edge list,
  - removes self-loops,
  - removes duplicate undirected edges,
  - builds degree-based ordering,
  - builds oriented forward adjacency.
- Triangle counting kernels:
  - scalar,
  - SIMD intrinsics,
  - OpenMP SIMD,
  - OpenMP threading,
  - hybrid variants.
src/arch_info.hpp
- Architecture info struct and function declarations.
src/arch_info.cpp
- Detects ISA/SIMD capability and machine core/thread topology.
- Prints architecture details used in experiments/report.
Makefile
- Builds the triangle_counter executable with OpenMP enabled.

Scripts and analysis

scripts/run_benchmarks.sh
- Runs datasets automatically.
- Decompresses .txt.gz datasets to temp text files.
- Executes all variants and writes benchmark CSV under results/.
scripts/analyze_latest.sh
- Finds latest benchmark CSV and runs analysis.
scripts/analyze_results.py
- Aggregates benchmark rows, computes best variant per dataset, writes aggregated_results.csv and summary.md, and optionally plots PNGs.
scripts/extract_datasets.sh
- One-time extraction helper for all .txt.gz datasets.
scripts/create_submission_zip.sh
- Creates submission staging folder/zip with strict assignment naming.

Data and outputs

data/
- Small sanity input(s) for quick correctness testing.
datasets/
- Assignment datasets (.txt.gz compressed edge lists).
results/
- Raw benchmark CSV outputs and analysis artifacts.

Common input mistakes

triangle_counter expects a plain text edge list file as input.
Passing .tar.gz or .gz directly to triangle_counter can produce invalid parsing or meaningless output.
Correct usage options:
- Decompress first, then pass the extracted .txt file to triangle_counter.
- Or use scripts/run_benchmarks.sh, which handles .txt.gz decompression automatically.

Datasets in this workspace

datasets/email-Eu-core.txt.gz
datasets/facebook_combined.txt.gz
datasets/com-lj.ungraph.txt.gz

(com-friendster.ungraph.txt.gz is intentionally not included due to size.)

Benchmark automation

Run all available assignment datasets and export a CSV:

./scripts/run_benchmarks.sh [threads] [include_large]

threads: optional (default prefers detected physical cores; falls back to nproc)
include_large: optional 0/1; when 1, also runs com-lj.ungraph.txt.gz

The output CSV is written under results/.

Optional one-time extraction of all .txt.gz files:

./scripts/extract_datasets.sh

Analyze benchmark results

Analyze the latest benchmark CSV and generate report-ready artifacts:

./scripts/analyze_latest.sh

Or analyze a specific CSV:

python3 scripts/analyze_results.py results/benchmark_YYYYMMDD_HHMMSS.csv --out-dir results/analysis

Generated outputs:

results/analysis/aggregated_results.csv
results/analysis/summary.md
per-dataset PNG plots (if matplotlib is installed)

Input format

Plain text edge list with two vertex IDs per line.
Lines starting with # or % are treated as comments.
Self-loops are removed.
Duplicate undirected edges are removed.

Variants implemented

scalar
simd-intrinsics
openmp-simd
openmp-threads
hybrid-intrinsics-openmp
hybrid-openmp-simd-openmp

Notes

Vertex ordering is degree-based (tie-break by vertex ID).
Triangle counting uses oriented forward adjacency and counts each triangle exactly once.
OpenMP thread count is controlled by CLI argument; if omitted, defaults to detected physical cores.

Requirement mapping

Architecture awareness: ISA/SIMD width, vector lanes, physical cores, and hardware threads are printed at startup.
Input preprocessing correctness: self-loops and duplicate undirected edges are removed.
Duplicate triangle avoidance: degree-based vertex ordering with forward-oriented adjacency is used.
Variant coverage: scalar baseline + all 5 required comparison variants are executed from one binary.
Thread control: all OpenMP variants use explicit omp_set_num_threads; benchmark script defaults to physical cores when not specified.

Deliverables in this repo

Source code and build system: src/, Makefile
Reproducibility scripts: scripts/run_benchmarks.sh, scripts/analyze_latest.sh, scripts/analyze_results.py
Current experimental outputs: results/ and results/analysis/

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
Reference Papers		Reference Papers
data		data
datasets/1. dataset sources		datasets/1. dataset sources
scripts		scripts
src		src
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Quick Start (3 commands)

Build

Run

Project structure (what each file does)

Core C++ code

Scripts and analysis

Data and outputs

Common input mistakes

Datasets in this workspace

Benchmark automation

Analyze benchmark results

Input format

Variants implemented

Notes

Requirement mapping

Deliverables in this repo

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Quick Start (3 commands)

Build

Run

Project structure (what each file does)

Core C++ code

Scripts and analysis

Data and outputs

Common input mistakes

Datasets in this workspace

Benchmark automation

Analyze benchmark results

Input format

Variants implemented

Notes

Requirement mapping

Deliverables in this repo

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages