I wrote this code, chiefly comprising C++ algorithms, to collect my own data for my Senior Research Project (SRP) on the set cover problem (SCP), "Heuristics of the Set Cover Problem". The data provide insight on the accuracy, runtime, and feasibility of various exact algorithms and approximation heuristics.
The repository implements the following algorithms/heuristics:
ID | Name |
---|---|
NG | Naive-greedy |
OG | Optimized-greedy |
2ME | 2ᵐ-exact |
2NE | 2ⁿ-exact |
Documentation in solveScpInstance
supplies further descriptions on these algorithms.
The repository contains 5 C++ functions to help collect data on heuristics for SCP:
Function name | File | Description |
---|---|---|
generateSCPinstance |
generator.cpp |
Generates random SCP instances (input data sets) |
writeSCPinstance |
generator.cpp |
Writes a SCP instance to a file |
readSCPinstance |
solver.cpp |
Reads and parses a SCP instance from a file |
solveSCPinstance |
solver.cpp |
Solves or approximates a SCP instance using an algorithm, producing a SCP solution |
writeSCPsolution |
solver.cpp |
Writes a SCP solution to a file |
To manipulate how the functions are run, modify main.cpp
.
Before running the project, it is recommended that you close all other application windows to minimize interference with computing resources used by this program.
Then, to run the project, compile and run the entry point main.cpp
. Example: g++ -std=gnu++17 main.cpp -o main && ./main
.
The program will then run a full factorial experiment with kTrialsPerCondition
trials on all combinations of matrix sizes in kMatrixSizes
and densities in kDensities
. In each trial, all implemented algorithms with feasible time and memory complexities will be run.
- SCP input data sets are read from and written to
kInputDirectory
:- SCP instances (input data sets)
- Random instances generated by
generateSCPinstance
- Instances you download from the OR-Library or other sources
- Random instances generated by
- SCP instances (input data sets)
- SCP output data sets are written to
kOutputDirectory
, with a new directory generated for each run of the program:- SCP solutions (algorithm output data sets) generated by
solveSCPinstance
- Statistics files generated in
main.cpp
- Each file summarizes a single variable (runtime, total cost, or approximation ratio) over all algorithms and all densities grouped with an input size.
- Values are delimited with horizontal tabs such that the contents of the file can be pasted directly into a spreadsheet like Google Sheets.
- SCP solutions (algorithm output data sets) generated by
Style notes: The code is written in alignment with the Google C++ Style Guide and sometimes documented with Doxygen.
The OR-Library SCP page contains many real-world SCP data sets that can be used for testing.
Note: Lists of rows in each column and lists of columns in each row are not guaranteed to be in order. However, the algorithms in solveSCPinstance
are insensitive to unordered lists (they work normally).
[# of rows/elements (n)] [# of columns/sets (m)]
[cost_1] ... [cost_m]
// For each row 1 ≤ i ≤ n (each row is described by 2 lines):
[# of columns covering row i]
[a list of columns covering row i]
Problem set | File count | Files |
---|---|---|
4 | 10 | scp41, ..., scp410 |
5 | 10 | scp51, ..., scp510 |
6 | 5 | scp61, ..., scp65 |
A | 5 | scpa1, ..., scpa5 |
B | 5 | scpb1, ..., scpb5 |
C | 5 | scpc1, ..., scpc5 |
D | 5 | scpd1, ..., scpd5 |
E | 5 | scpe1, ..., scpe5 |
Problem set | File count | Files |
---|---|---|
E | 5 | scpnre1, ..., scpnre5 |
F | 5 | scpnrf1, ..., scpnrf5 |
G | 5 | scpnrg1, ..., scpnrg5 |
H | 5 | scpnrh1, ..., scpnrh5 |
Problem set | File count | Files |
---|---|---|
Corresponds to CYC set (number of edges required to hit every 4-cycle in a hypercube) | 6 | scpcyc06, ..., scpcyc11 |
Corresponds to the CLR set of problems (number of 4-tuples forming the smallest non-bi-chromatic hypergraph) | 4 | scpclr10, ..., scpclr13 |
[# of rows/elements (n)] [# of columns/sets (m)]
// For each column 1 ≤ j ≤ m (each column is described by 1 line):
[cost of column j] [# of rows that j covers] [a list of the rows j covers]
File | Row count (n) | Column count (m) |
---|---|---|
rail507 | 507 | 63009 |
rail516 | 516 | 47311 |
rail582 | 582 | 55515 |
rail2536 | 2536 | 1081841 |
rail2586 | 2586 | 920683 |
rail4284 | 4284 | 1092610 |
rail4872 | 4872 | 968672 |