Skip to content
/ SCP Public

Code for my Senior Research Project on the Set Cover Problem

Notifications You must be signed in to change notification settings

spiralsim/SCP

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SCP Data Collector

About

I wrote this code, chiefly comprising C++ algorithms, to collect my own data for my Senior Research Project (SRP) on the set cover problem (SCP), "Heuristics of the Set Cover Problem". The data provide insight on the accuracy, runtime, and feasibility of various exact algorithms and approximation heuristics.

Algorithms

The repository implements the following algorithms/heuristics:

ID Name
NG Naive-greedy
OG Optimized-greedy
2ME 2-exact
2NE 2-exact

Documentation in solveScpInstance supplies further descriptions on these algorithms.

Structure

The repository contains 5 C++ functions to help collect data on heuristics for SCP:

Function name File Description
generateSCPinstance generator.cpp Generates random SCP instances (input data sets)
writeSCPinstance generator.cpp Writes a SCP instance to a file
readSCPinstance solver.cpp Reads and parses a SCP instance from a file
solveSCPinstance solver.cpp Solves or approximates a SCP instance using an algorithm, producing a SCP solution
writeSCPsolution solver.cpp Writes a SCP solution to a file

Usage

To manipulate how the functions are run, modify main.cpp.

Before running the project, it is recommended that you close all other application windows to minimize interference with computing resources used by this program.

Then, to run the project, compile and run the entry point main.cpp. Example: g++ -std=gnu++17 main.cpp -o main && ./main.

The program will then run a full factorial experiment with kTrialsPerCondition trials on all combinations of matrix sizes in kMatrixSizes and densities in kDensities. In each trial, all implemented algorithms with feasible time and memory complexities will be run.

I/O

  • SCP input data sets are read from and written to kInputDirectory:
    • SCP instances (input data sets)
      • Random instances generated by generateSCPinstance
      • Instances you download from the OR-Library or other sources
  • SCP output data sets are written to kOutputDirectory, with a new directory generated for each run of the program:
    • SCP solutions (algorithm output data sets) generated by solveSCPinstance
    • Statistics files generated in main.cpp
      • Each file summarizes a single variable (runtime, total cost, or approximation ratio) over all algorithms and all densities grouped with an input size.
      • Values are delimited with horizontal tabs such that the contents of the file can be pasted directly into a spreadsheet like Google Sheets.

Style notes: The code is written in alignment with the Google C++ Style Guide and sometimes documented with Doxygen.

Description and Info for the OR-Library Data Sets

The OR-Library SCP page contains many real-world SCP data sets that can be used for testing.

Note: Lists of rows in each column and lists of columns in each row are not guaranteed to be in order. However, the algorithms in solveSCPinstance are insensitive to unordered lists (they work normally).

Group 1 (80 files)

Format: rows

[# of rows/elements (n)] [# of columns/sets (m)]
[cost_1] ... [cost_m]
// For each row 1 ≤ i ≤ n (each row is described by 2 lines):
[# of columns covering row i]
[a list of columns covering row i]

Subgroup 1A (50 files)

Problem set File count Files
4 10 scp41, ..., scp410
5 10 scp51, ..., scp510
6 5 scp61, ..., scp65
A 5 scpa1, ..., scpa5
B 5 scpb1, ..., scpb5
C 5 scpc1, ..., scpc5
D 5 scpd1, ..., scpd5
E 5 scpe1, ..., scpe5

Subgroup 1B (20 files)

Problem set File count Files
E 5 scpnre1, ..., scpnre5
F 5 scpnrf1, ..., scpnrf5
G 5 scpnrg1, ..., scpnrg5
H 5 scpnrh1, ..., scpnrh5

Subgroup 1C (10 files)

Problem set File count Files
Corresponds to CYC set (number of edges required to hit every 4-cycle in a hypercube) 6 scpcyc06, ..., scpcyc11
Corresponds to the CLR set of problems (number of 4-tuples forming the smallest non-bi-chromatic hypergraph) 4 scpclr10, ..., scpclr13

Group 2 (7 files)

Format: columns

[# of rows/elements (n)] [# of columns/sets (m)]
// For each column 1 ≤ j ≤ m (each column is described by 1 line):
[cost of column j] [# of rows that j covers] [a list of the rows j covers]

Subgroup 2A (7 files)

File Row count (n) Column count (m)
rail507 507 63009
rail516 516 47311
rail582 582 55515
rail2536 2536 1081841
rail2586 2586 920683
rail4284 4284 1092610
rail4872 4872 968672

About

Code for my Senior Research Project on the Set Cover Problem

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published