
This code implements a hybrid modeling technique that combines the strengths of surrogate-based modeling (SBM) and k nearest-neighbor regression ($k$NN) into a single method called HYPPO. The hybrid method is a piecewise polynomial model composed of many small, local models. We demonstrate that HYPPO significantly improves overall prediction accur…


TauferLab/HYPPO


HYPPO: A Hybrid, Piecewise Polynomial Modeling Technique for Non-Smooth Surfaces

Using the source code

Note: The code uses Python 2.7.

The modeling code, build_model.py, requires a .csv file as input. The file must have the following format:

x_(1,1), x_(1,2), x_(1,3), ..., x_(1,n), z_1

x_(2,1), x_(2,2), x_(2,3), ..., x_(2,n), z_2

...

x_(m,1), x_(m,2), x_(m,3), ..., x_(m,n), z_m

Here x_(i,j) is the value of the j-th variable in the i-th observation, and z_i is the corresponding value of the dependent variable. A sample data file, WordCount_times.txt, is included. It contains the run time of Word Count (using Spark) as two framework parameters are varied (the x and y values in the file).
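As a rough illustration of the input format described above, the following sketch splits each row into its independent variables and its dependent value. The function name `parse_observations` is hypothetical and is not taken from build_model.py, which may read its input differently. The sketch uses only the standard library and should run under Python 2.7 or 3.

```python
import csv


def parse_observations(lines):
    """Split rows of 'x_1, ..., x_n, z' into points and dependent values.

    `lines` is any iterable of strings, such as an open file handle.
    Hypothetical helper for illustration; not part of build_model.py.
    """
    points, values = [], []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        numbers = [float(cell) for cell in row]
        points.append(tuple(numbers[:-1]))  # the n independent variables
        values.append(numbers[-1])          # the dependent variable z_i
    return points, values
```

With a file on disk, this could be called as `points, values = parse_observations(open("WordCount_times.txt"))`, assuming the sample file follows the comma-separated layout described above.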

The build_model.py code accepts the .csv file of data and creates an n-dimensional grid of predictions. The usage is:

./build_model.py FileName --model HYPPO --K --ranges xmin xmax xstep ymin ymax ystep ...

For each independent variable, the code requires three parameters: min, max, and step. The --ranges argument is optional; its defaults are suitable for the provided data. However, if your model has a number of independent variables other than 2, you MUST set --ranges to fit your data. The model predicts the dependent variable at each point in the Cartesian product of the lists range(min, max, step). K is the number of nearest neighbors used to build the model, for either KNN or HYPPO. To run the code on the sample dataset:

./build_model.py WordCount_times.txt --model HYPPO --K 7 --ranges 1 60 1 100 1500 100 >> model.txt

This stores the output of build_model.py in a text file called model.txt. The file model.txt is in the correct format for use with the other piece of provided code, plot_3D_model.py.

The usage for this code is simply:

./plot_3D_model.py model.txt

Running:

./build_model.py --help

displays additional help information for the code.
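To make the --ranges semantics concrete, here is a minimal sketch of how a flat list of min, max, step triples could expand into the prediction grid. It assumes the bounds behave like Python's built-in range, as the description of --ranges states, so max itself is excluded. The name `build_grid` is hypothetical and is not taken from build_model.py.

```python
import itertools


def build_grid(ranges):
    """Expand [min, max, step, min, max, step, ...] into grid points.

    Hypothetical helper for illustration; not part of build_model.py.
    """
    if len(ranges) % 3 != 0:
        raise ValueError("expected min, max, step for each variable")
    axes = []
    for i in range(0, len(ranges), 3):
        lo, hi, step = ranges[i:i + 3]
        axes.append(list(range(lo, hi, step)))  # max is excluded, as in range()
    # Cartesian product of the per-variable axes
    return list(itertools.product(*axes))
```

Under these assumptions, the sample arguments `1 60 1 100 1500 100` give 59 values for the first variable and 14 for the second, for 826 grid points in total.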

Note: This code can also compute KNN regression (which fixes the polynomial degree at 0) with the --model KNN flag. It can likewise compute traditional surrogate-based models (SBM), where the degree of the polynomial is flexible but K = N-1 (one less than the number of available data points), with the --model SBM flag.
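The degree-0 case mentioned in the note can be sketched as follows: the prediction at a query point is the average of the dependent values of its K nearest observations. This is only an illustration under stated assumptions (Euclidean distance and an unweighted mean); the name `knn_predict` is hypothetical, and build_model.py may choose and weight neighbors differently. HYPPO, as described above, would instead fit a polynomial to the neighborhood.

```python
import math


def knn_predict(points, values, query, k):
    """Degree-0 local model: mean z over the k nearest observations.

    Assumes Euclidean distance and an unweighted average.
    Hypothetical helper for illustration; not part of build_model.py.
    """
    scored = []
    for point, z in zip(points, values):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(point, query)))
        scored.append((dist, z))
    scored.sort(key=lambda pair: pair[0])  # closest observations first
    nearest = scored[:k]
    return sum(z for _, z in nearest) / float(len(nearest))
```

Replacing the final average with a least-squares polynomial fit over the same `nearest` set would give a local model of higher degree, which is the general idea behind a piecewise polynomial model built from many small, local models.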
