
A framework to predict ignition delay using GMM, Spath, multiple regression and error based tertiary tree algorithms to generate training models. Extreme points of the cluster are utilized to make the correct identification of cluster and test predictions.

Computational-Chemistry-and-Combustion/DataDrivenIDT

About The Project:

🔥 This repository can be used to build training models with tree-type, regression-based clustering and to make predictions with those models. The framework is customized for ignition delay data, but with minor changes it works with any data that has a continuous dependent (output) variable. The algorithm uses an error-based technique that divides the data into three clusters according to the relative error of the prediction and the sign of the prediction error, in order to obtain accurate regression models. Please see the manual for more information.
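
The three-way split described above can be illustrated with a minimal sketch. It assumes the relative error is (predicted - actual) / |actual| and that points inside the criterion form one cluster while over- and under-predicted points form the other two. The function name `split_by_error` and the sample values are hypothetical; the exact definitions used by the package are given in the manual and may differ.

```python
def split_by_error(y_true, y_pred, criterion=0.05):
    """Partition sample indices into three groups by relative prediction error.

    Assumed rule (not taken from the package source):
      |relative error| <= criterion  -> accepted
      relative error  >  criterion   -> over-predicted
      relative error  < -criterion   -> under-predicted
    """
    accepted, over, under = [], [], []
    for i, (actual, predicted) in enumerate(zip(y_true, y_pred)):
        rel_err = (predicted - actual) / abs(actual)  # assumes actual != 0
        if abs(rel_err) <= criterion:
            accepted.append(i)
        elif rel_err > 0:
            over.append(i)
        else:
            under.append(i)
    return accepted, over, under


# Made-up values, for illustration only.
y_true = [1.0, 2.0, 4.0, 5.0]
y_pred = [1.02, 2.5, 3.0, 5.1]
print(split_by_error(y_true, y_pred, criterion=0.05))  # ([0, 3], [1], [2])
```

In a tree-type scheme, each of the two rejected groups would then be fitted with its own regression model and split again until the criterion is met.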

Table of Contents
  1. About The Project
  2. System Requirements
  3. Installation
  4. Commands to run the program
  5. Examples
  6. Brought up by

System Requirements:

🔥 OS : Linux

🔥 Python 3.6+


Installation:

🔥 Clone the repository into a suitable directory.

🔥 Open your ~/.bashrc file and add the lines given below at the bottom of the file.


Add sourcing to find command:

🔥 Copy the following commands into your ~/.bashrc file

##Package command Finder:
export IDCODE="${HOME}/PathToDir/.../Data_driven_Kinetics/"
export PATH=$PATH:$IDCODE
alias IDprediction="pwd>${HOME}/PathToDir/.../Data_driven_Kinetics/filelocation.txt && Run.sh"

Replace "/PathToDir/.../" with your directory location.


Example:

If the repository is cloned into your home directory, configure .bashrc with the following lines:

##Package command Finder:
export IDCODE="${HOME}/Data_driven_Kinetics/"
export PATH=$PATH:$IDCODE
alias IDprediction="pwd>${HOME}/Data_driven_Kinetics/filelocation.txt && Run.sh"

Source the changes:

🔥 (IMPORTANT) To apply the changes made in .bashrc, run the following commands in the terminal.

cd
source .bashrc

Install dependency:

🔥 To install all the dependencies, use the INSTALL.sh file. Run the commands given below in the terminal.

chmod +x INSTALL.sh

./INSTALL.sh

Make Run.sh file executable:

🔥 To make the run file executable, go to ./Data_driven_Kinetics and run the following command.

chmod +x Run.sh

Commands to run the program:

All set!

Now open a terminal and type the following command to generate results.

IDprediction -flag file_name.csv

The input arguments to 'IDprediction' are listed below.

The data file is assumed to be 'file_name.csv'.

🔥 -a : 'Analyze' the dataset by selecting certain parameters

IDprediction -a  file_name.csv  

🔥 -b : Find the types of 'bond' associated with a given fuel

IDprediction -b  FuelSMILES
IDprediction -b CCC
IDprediction -b CCCCCC
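
To give a sense of what bond information a SMILES string such as CCC carries, here is a minimal sketch that counts C-C and C-H bonds. It is an illustration under stated assumptions, not the package's method: the function `count_alkane_bonds` is hypothetical, it handles only unbranched alkanes written as a run of 'C' characters, and the bond types reported by the -b flag may be more detailed.

```python
def count_alkane_bonds(smiles):
    """Count C-C and C-H bonds for an unbranched alkane SMILES such as 'CCC'."""
    if not smiles or set(smiles) != {"C"}:
        raise ValueError("this sketch only handles unbranched alkanes like 'CCC'")
    n_carbon = len(smiles)
    # A chain of n carbons has n - 1 carbon-carbon single bonds.
    c_c = n_carbon - 1
    # Each carbon has valence 4; every C-C bond uses one valence on two carbons,
    # and the remaining valences are filled with implicit hydrogens.
    c_h = 4 * n_carbon - 2 * c_c
    return {"C-C": c_c, "C-H": c_h}


print(count_alkane_bonds("CCC"))     # {'C-C': 2, 'C-H': 8}
print(count_alkane_bonds("CCCCCC"))  # {'C-C': 5, 'C-H': 14}
```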

🔥 -h : Generates 'histogram' plots of the parameters for each fuel individually

IDprediction -h  file_name.csv 

🔥 -c : To define the 'criteria' for error-based clustering

🔥 -l : To 'limit' the number of reference points

🔥 -r : To 'remove' features by backward elimination

🔥 -s : To specify the 'significance' level

🔥 -m : To fit a 'multiple' linear regression to the data

IDprediction -c 0.05 -l 10 -r True -s 0.05  -m  file_name.csv 

🔥 -t : 'Tree'-type regression-based clustering algorithm

IDprediction -c 0.05 -r False -t file_name.csv 

🔥 -e : 'External' dataset used for prediction (complete the model generation above first)

IDprediction -e  test_data.csv 

🔥 -k : To run the code multiple ('k') times and store all test prediction results in a different directory

IDprediction -k testset.csv

🔥 -f : Probability density 'function' plot of the testing results after running the code 'k' times

IDprediction -f testset.csv

🔥 -p : 'Plot' the coefficients and obtain their average values from a coefficient file (useful when the coefficients were obtained many times and vary between runs)

IDprediction -p  coefficient_3.csv 
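
The averaging step can be sketched as follows. This is only an illustration: the layout of the coefficient file is not described here, so the sketch assumes a hypothetical CSV with one column per coefficient and one row per run, and the column names and values in `sample` are invented. The helper `summarize_coefficients` is not part of the package.

```python
import csv
import io
from statistics import mean, stdev


def summarize_coefficients(csv_text):
    """Return {coefficient name: (mean, sample standard deviation)} over all runs."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for name in rows[0].keys():
        values = [float(row[name]) for row in rows]
        # stdev needs at least two values; report 0.0 for a single run.
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[name] = (mean(values), spread)
    return summary


# Invented data: three runs, three coefficients (assumed layout).
sample = "intercept,T_inv,pressure\n1.0,2.0,-0.5\n1.2,2.2,-0.7\n0.8,1.8,-0.3\n"
for name, (avg, spread) in summarize_coefficients(sample).items():
    print(f"{name}: mean={avg:.3f}, stdev={spread:.3f}")
```

A large standard deviation relative to the mean suggests that the coefficient is not stable across runs.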

🔥 -o : To run any 'other' dataset than fuel data

IDprediction -c 0.05 -l 10 -o anyFile.csv

Don't forget to make changes in the 'feature selection.py' file.


Examples:

Example 1: Run the following commands to generate models and make predictions using ignition delay data:

cd TryYourself/nAlkaneIDT/
IDprediction -c 0.1 -t trainset.csv
IDprediction -e testset.csv

Example 2: Run the following commands to generate models and make predictions using wine quality data:

cd TryYourself/WineQuality/
IDprediction -c 0.1 -o trainset.csv
IDprediction -e testset.csv

Make appropriate changes in the 'feature selection.py' file so that the features match the data. (Check the manual.)


Brought up by:

CCC Group SAFRAN Group
