
pdt-regressor

A collection of ensemble models that predict interfacial tension (beta) from the edge profile of a pendant drop.

Artificial Model

pdt-regressor can currently predict Beta values from a drop-profile feature set with an RMSE of 0.004. Feature datasets are currently generated by solving an ODE for the radii of the droplet across a range of Beta and Smax values.
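That generation step can be sketched as follows. This is a hedged illustration, not the repository's actual code: it assumes one common dimensionless (Bashforth-Adams) form of the Young-Laplace equation with the profile parameterized by arc length s, and the function name pendant_profile and the sign convention for Beta are assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp

def pendant_profile(beta, s_max, n=200):
    """Integrate a dimensionless Young-Laplace (Bashforth-Adams) system
    y = (x, z, phi) over arc length s, starting just past the drop apex."""
    def rhs(s, y):
        x, z, phi = y
        # curvature balance: dphi/ds = 2 - beta*z - sin(phi)/x
        return [np.cos(phi), np.sin(phi), 2.0 - beta * z - np.sin(phi) / x]

    s0 = 1e-6            # start slightly past the apex: sin(phi)/x is 0/0 at s=0
    y0 = [s0, 0.0, s0]   # near-apex series: x ~ s, z ~ 0, phi ~ s
    sol = solve_ivp(rhs, (s0, s_max), y0,
                    t_eval=np.linspace(s0, s_max, n), rtol=1e-8)
    return sol.t, sol.y[0], sol.y[1]  # arc length, radius x(s), height z(s)

# One profile for a single (Beta, Smax) pair; the radii x(s) form one feature row.
s, x, z = pendant_profile(beta=0.5, s_max=3.0)
```

Sweeping beta and s_max over their ranges and stacking the resulting radius vectors would produce a feature table like the one described above.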

Image Model

This model will also be trained and tested on real droplet image profiles, with feature data sets extracted from the output of the pdt-canny-edge-detector.
Profile data will be stored in the /data folder in .csv format.
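Loading such a profile CSV could look like the sketch below; the column names (r1..r3 for radii, Beta for the label) are assumptions standing in for whatever the edge detector actually emits, and the inline CSV stands in for a file under /data.

```python
from io import StringIO

import pandas as pd

# Inline stand-in for a file such as data/profiles.csv (hypothetical name).
csv_text = "r1,r2,r3,Beta\n0.98,0.95,0.90,0.45\n0.99,0.96,0.91,0.52\n"
df = pd.read_csv(StringIO(csv_text))

X = df.drop(columns=["Beta"])  # drop-profile features
y = df["Beta"]                 # regression target
```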

Table of Contents

  1. Requirements
  2. Setup
  3. Usage
  4. Feature Extraction
  5. Hyperparameter Tuning
  6. Results
  7. Appendix

Requirements

The finalized models, feature extraction, and data preparation will be placed in a single pdt_regressor.py application that can be run as a complete system. For development and understanding, however, it is often useful to work in Jupyter notebooks to visualize the data and step through parameter tuning. For that reason, a Jupyter-enabled IDE such as PyCharm is recommended; students and researchers can access the Professional edition for free.

Setup

To use or develop this project, either download the .zip file from the repository or run

git clone https://github.com/DmitriLyalikov/pdt_regressor.git

Open the project in your IDE and run

pip install -e .

in the PyCharm IDE terminal. This should install all the library dependencies for the project like scikit-learn, xgboost, and pandas.

Usage

Trained models are stored in the /models folder as .pkl files. They can be saved and loaded as an XGBRegressor object using the pickle Python package.

To load and use a model:

import pickle

# Load the model from the models folder
with open("../models/pdt-regression-model.pkl", "rb") as f:
    model = pickle.load(f)

To predict and get Root-Mean-Squared-Error of the prediction:

import numpy as np
from sklearn.metrics import mean_squared_error

y_pred = model.predict(X_test)

reg_mse = mean_squared_error(y_test, y_pred)
reg_rmse = np.sqrt(reg_mse)

Hyperparameter Tuning

XGBoost hyperparameters are used to improve the performance of the model, reduce variance, and minimize overfitting. Some important hyperparameters are learning_rate (eta), max_depth, n_estimators, and subsample. The complete list can be found in the XGBoost parameter documentation.

Hyperparameters depend on the model, the data, and the regression method, and are generally found empirically. XGBoost.ipynb includes a grid_search function that automates tuning by finding the best of the values provided in params:

grid_search(params={'max_depth': [1, 2, 3, 4, 5, 6]})

This will yield the output:

Best params: {'max_depth': 6}
Training score: 951.398

For full usage examples, consult the provided XGBoost.ipynb. Hyperparameters should generally be tuned together, as one may affect another's score.

Results

Train/Test with pdt-dataset (2500 entries, Beta in [0.4, 0.8], Smax=True):

  • n_estimators: 800
  • learning_rate: 0.1
  • max_depth: 5

Accuracy score on test data: 0.999, RMSE: 0.0034324513493428823

Appendix

About

An XGBoost Regression model that predicts interfacial tension and pressure from the edge profile of a pendant drop.