Mark Glinberg, Seth Nye, Steve Tang, Shih-Huan Chou
CSE 6140 - Final Project
04/24/2024
This is our submission for the Computational Science and Engineering Algorithms Final Project, which consists of four different algorithms that all solve the 0-1 Knapsack Problem. The algorithms are Branch-and-Bound, a Greedy Approximation, and two local search (ls) algorithms which are Simulated Annealing and Random Restart Hill Climbing.
Beyond these algorithms, we've also written code to compare the different algorithms along certain metrics as well as generate plots for the two local search algorithms.
README.md - this file!
requirements.txt - requirements file to easily set up conda environment
knapsack.yml - packaged conda environment in case requirements.txt does not work
exec.py - main file that you run for anything
BnB.py - contains the Branch-and-Bound implementation, called by exec.py
approximation.py - contains the greedy approximation implementation, called by exec.py
simulated_annealing.py - contains the simulated annealing implementation, called by exec.py
hill_climbing.py - contains the random restart hill climbing implementation, called by exec.py
generate_plots.py - script to generate lots of trace files and create QRTDs and SQDs from them (only for local search algorithms), called by exec.py
average.py - script to compute average time, quality, and relative error of the final solutions generated by different algorithms across all instances
run_BnB.ipynb - notebook with script to run simulatneous instances of BnB
run_ls1.ipynb - notebook with script to run simulatneous instances of simulated annealing
run_ls2.ipynb - notebook with script to run simulatneous instances of hill climbing
In each problem, the data is structured as follows: The first row indicates the number of items and the capacity of the knapsack. Subsequent rows provide the value and weight of each item
test - directory with test problems (5-24 items)
test_solution - directory with optimal solution values and optimal item choices for test instances
small_scale - directory with small-scale problems (4-23 items)
small_scale_solution - directory with optimal solution values for small-scale instances
large_scale - directory with large-scale problems (100-10000 items)
large_scale_solution - directory with optimal solution values for large-scale instances
The output direcotry contains all of our regular outputs for any algorithm running on any instance.
output/our_solutions - Contains all solution files generated with line one being the total value of the best solution found and line two containing the indices of the items selected
-
solutions are 0-indexed
-
file names follow this format: <instance>_<method>_<cutoff>_<randSeed>.sol
output/our_traces - Contains all traces files generated with each line containing a timestamp and the value of the newly discovered best solution. Trace files aren't generated for the approximation algorithm
- file names follow this format: <instance>_<method>_<cutoff>_<randSeed>.trace
output/plot_traces - contains all of our trace files for plotting the QRTD and SQD. It contains 200 total trace files: 50 trace files for each local search algorithm run on large_1 and large_3
output/table_instances - contains 10 runs of our local search algorithms for each instance in DATASET. This is used to generate an average for each instance for use in creating our table for evalutation in the report
ls*_averages.trace - Contains average times, qualities, and relative errors for each instance ran by a local search algorithm. Each local search algorithm is stochastic, so 10 different seeds were ran and these values were calculated from those. These files are generated by average.py
All images and pdfs are generated QRTDs, SQDs, and boxplots for the local search algorithms, simulated annealing and hill climbing.
Navigate to the knapsack-project/code/ directory using your conda command prompt. Then run the following commands to create a proper conda environment to run this code (you may skip creating the conda environment if you already can run numpy, pandas, and matplotlib):
>> conda create -n <env_name> python=3.10
>> conda activate <env_name>From here, make sure pip is installed using the following line, and afterwards install the requirements needed:
>> conda install pip
>> pip install -r requirements.txtAlternatively, you can create a conda environment using knapsack.yml:
>> conda env create -f knapsack.ymlOnce that's done, you can now run exec.py using the following command:
>> python exec.py -inst <filename> -alg [bnb|approx|ls1|ls2] -time <cutoff (secs)> -seed <random seed>The flag -inst followed by a string will denote the name of the input file from which you want to run the program. You can choose any of the instances in the DATASET directory.
The flag -alg followed by a string will denote the algorithm you want to run on a given instance. Your options for algorithms are "bnb", "approx", "ls1", or "ls2".
The flag -time denotes the max amount of time you'll let the algorithm run for (default 30 secs).
The flag -seed denotes the random seed you want to use (only applicable for local search algorithms). Default is 0.
Some examples are below:
>> python exec.py -inst small_1 -alg bnb -time 10
>> python exec.py -inst small_6 -alg approx -time 2
>> python exec.py -inst large_3 -alg ls1 -time 30 -seed 3
>> python exec.py -inst large_20 -alg ls2 -time 600 -seed 3000The flag -p can be used if you want the program to generate QRTD and SQD charts for a local search algorithm. Charts can only be generated on existing trace files in plot_traces.
If there are no trace files in plot_traces or you want to regenerate new ones, you can use the flag -g with -p and then it will first generate traces on instances "large_1" and "large_3", and then plot the graphs.
When plotting and generating, you must select which local search algorithm you want to use.
>> python exec.py -alg ls1 -p
>> python exec.py -alg ls1 -p -g