UML Energy & Combustion Research Laboratory

ECabc : Feature tuning program


ECabc is a generic, small-scale feature tuning program based on the Artificial Bee Colony algorithm by D. Karaboga, which imitates the foraging behavior of honey bees. ECabc optimizes a user-supplied function, called the fitness function, over a given set of variables known as the value set. The bee colony consists of three types of bees: employers, onlookers and scouts. An employer bee is an object that stores a set of values, the fitness score for those values, and the bee's probability of being picked by an onlooker bee. An onlooker bee is an object that chooses employer bees, favoring those with a high probability, and calculates new positions for them. A scout bee creates a new set of random values, which is then assigned to a poorly performing employer bee as a replacement.
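The three roles described above can be illustrated with a minimal, self-contained sketch of an artificial bee colony loop. This is not ECabc's implementation: the names `abc_minimize`, `random_position`, `try_neighbour` and `remember_best`, the `limit` parameter, and the `1 / (1 + cost)` selection weight are assumptions chosen for illustration, and the sketch assumes a non-negative cost that is being minimized over float ranges.

```python
import random


def abc_minimize(cost, bounds, n_employers=10, iterations=200, limit=20, seed=0):
    """Illustrative artificial bee colony loop (a sketch, not ECabc's code)."""
    rng = random.Random(seed)

    def random_position():
        return [rng.uniform(low, high) for low, high in bounds]

    positions = [random_position() for _ in range(n_employers)]
    costs = [cost(p) for p in positions]
    trials = [0] * n_employers  # consecutive failed improvements per employer
    best_cost, best_position = float("inf"), None

    def remember_best():
        nonlocal best_cost, best_position
        i = min(range(n_employers), key=costs.__getitem__)
        if costs[i] < best_cost:
            best_cost, best_position = costs[i], list(positions[i])

    def try_neighbour(i):
        # Perturb one coordinate relative to another randomly chosen employer
        partner = rng.choice([k for k in range(n_employers) if k != i])
        j = rng.randrange(len(bounds))
        low, high = bounds[j]
        candidate = list(positions[i])
        step = rng.uniform(-1, 1) * (positions[i][j] - positions[partner][j])
        candidate[j] = min(high, max(low, positions[i][j] + step))
        candidate_cost = cost(candidate)
        if candidate_cost < costs[i]:  # greedy: keep only improvements
            positions[i], costs[i], trials[i] = candidate, candidate_cost, 0
        else:
            trials[i] += 1

    remember_best()
    for _ in range(iterations):
        # Employer phase: every employer explores near its own position
        for i in range(n_employers):
            try_neighbour(i)
        # Onlooker phase: better employers are picked with higher probability
        weights = [1.0 / (1.0 + c) for c in costs]
        for i in rng.choices(range(n_employers), weights=weights, k=n_employers):
            try_neighbour(i)
        remember_best()
        # Scout phase: replace employers that have stopped improving
        for i in range(n_employers):
            if trials[i] > limit:
                positions[i] = random_position()
                costs[i] = cost(positions[i])
                trials[i] = 0
        remember_best()
    return best_cost, best_position


def sphere(point):
    """Toy cost with its minimum of 0 at the origin."""
    return sum(x * x for x in point)


best_cost, best_position = abc_minimize(sphere, bounds=[(-5, 5), (-5, 5)])
print(best_cost, best_position)
```

The best result is recorded before the scout phase so that a good position is not lost when a stagnating employer is replaced.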

Research applications

While it has several applications, ECabc has been successfully used by the Energy and Combustion Research Laboratory (ECRL) at the University of Massachusetts Lowell to tune the hyperparameters of ECNet, a large-scale machine learning project for predicting fuel properties. ECNet provides scientists with an open source tool for predicting key fuel properties of potential next-generation biofuels, reducing the need for costly fuel synthesis and experimentation. By efficiently increasing the accuracy of ECNet and similar models, ECabc helps to provide a higher degree of confidence in discovering new, optimal fuels. A single run of ECabc on ECNet yielded a lower average root mean square error (RMSE) for cetane number (CN) and yield sooting index (YSI) than a year of manual tuning: manual tuning produced an RMSE of 10.13, while ECabc reached an RMSE of 8.06 in one run of 500 iterations.

Installation

Prerequisites:

  • Have Python 3.X installed
  • Have the ability to install Python packages

Method 1: pip

If you are working in a Linux/Mac environment:

sudo pip install ecabc

Alternatively, in a Windows environment, make sure you are running cmd as administrator:

pip install ecabc

To update your version of ECabc to the latest release version, use

pip install --upgrade ecabc

Note: if multiple Python releases are installed on your system (e.g. 2.7 and 3.6), you may need to execute the correct version of pip. For Python 3.6, change "pip install ecabc" to "pip3 install ecabc".

Method 2: From source

  • Download the ECabc repository, navigate to the download location on the command line/terminal, and execute:
python setup.py install

Additional package dependencies (Numpy) will be installed during the ECabc installation process.

Usage

To get started, import ECabc:

from ecabc import *

Then define your fitness function. The fitness function is the user-defined function whose output is being optimized. It receives the values and args and returns the output to optimize:

def fitness_function(values, args):
    # ... compute output from values (and args, if used) ...
    return output

After that, in the main function, define your value ranges, i.e. the user-defined ranges for the variables being optimized:

values = [('int', (0,10)), ('int', (0,100)), ('float',(0,80)), ('float', (0, 360))]

Optionally, you can also pass args: any additional arguments that your fitness function takes beyond the values given in value_ranges. This defaults to None.

arguments = {'test_argument': 10}
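As a concrete sketch of how a fitness function, its value ranges and its optional args fit together, the snippet below evaluates a function by hand before any optimizer is involved. It assumes that `values` arrives as a sequence ordered like the value ranges and that `args` arrives as the same dictionary that was supplied by the caller; the name `distance_to_target`, the `'target'` key and the target numbers are hypothetical.

```python
value_ranges = [('int', (0, 10)), ('int', (0, 100)), ('float', (0, 80)), ('float', (0, 360))]


def distance_to_target(values, args=None):
    """Summed absolute distance between values and a target (lower is better)."""
    # Assumption: args is either None or a dict that may carry a 'target' entry
    target = (args or {}).get('target', (5, 50, 40.0, 180.0))
    return sum(abs(v - t) for v, t in zip(values, target))


# Evaluate by hand at the midpoint of every range
midpoints = [(low + high) / 2 for _, (low, high) in value_ranges]
print(distance_to_target(midpoints))
print(distance_to_target(midpoints, {'target': (0, 0, 0.0, 0.0)}))
```

Checking the function on a few known inputs like this makes it easier to tell a bug in the fitness function apart from a poorly converging run.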

Then call ECabc as follows:

abc = ABC(fitness_fxn=fitness_function, value_ranges=values, args=arguments)

Certain settings also need to be set, such as:

abc._minimize = True

And the settings can be imported and saved as follows:

abc._import_settings = 'example.json'
abc._save_settings = 'output.json'

Then call create_employers on it to generate your population of employer bees. This only needs to be done once:

abc.create_employers()

After this, the code should enter a loop with a break condition. The contents of ECabc that should be in the loop have been encompassed in run_iteration() for simplicity.

while True:
    abc.run_iteration()
    if abc.best_performer[0] < 2:
        break

The above snippet shows the setup if one wants to run ECabc until a certain output value has been obtained. Alternatively one could just set it up so that it runs for a preset number of cycles as follows:

for i in range(500):
    abc.run_iteration()
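The two stopping rules above can also be combined so that a run ends when either the target is reached or the iteration budget is spent. The helper below is a sketch that works with any object exposing `run_iteration()` and a `best_performer` sequence as used above; `run_until` and the stand-in `FakeColony` class are hypothetical names, and the stand-in exists only so the snippet runs without ECabc installed.

```python
def run_until(colony, target, max_iterations):
    """Run colony.run_iteration() until best_performer[0] < target or the budget is spent.

    Returns the number of iterations performed.
    """
    for iteration in range(1, max_iterations + 1):
        colony.run_iteration()
        if colony.best_performer[0] < target:
            return iteration
    return max_iterations


class FakeColony:
    """Stand-in for an optimizer: its best score halves on every iteration."""

    def __init__(self, start):
        self.best_performer = (start, None, None)

    def run_iteration(self):
        self.best_performer = (self.best_performer[0] / 2, None, None)


print(run_until(FakeColony(16.0), target=2, max_iterations=500))
```

Capping the number of iterations guards against a loop that never terminates when the target value turns out to be unreachable.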

Other parameters that can be specified in the loop are file logging, which accepts 'debug', 'info', 'warn', 'error', 'crit' or 'disable':

abc._logger.file_level = 'info'
abc._logger.file_level = 'debug'
abc._logger.file_level = 'warn'
abc._logger.file_level = 'error'
abc._logger.file_level = 'crit'
abc._logger.file_level = 'disable'

print_level. This will print out log information to the console:

abc._logger.stream_level = 'info'
abc._logger.stream_level = 'debug'
abc._logger.stream_level = 'warn'
abc._logger.stream_level = 'error'
abc._logger.stream_level = 'crit'
abc._logger.stream_level = 'disable'

and processes:

processes = 1

Finally, to view the output:

print(abc.best_performer[2], abc.best_performer[1])

where best_performer[2] is the values and best_performer[1] is the fitness score associated with it.

Example

'''
Simple sample script to demonstrate how to use the artificial bee colony. This script is a simple example, used only to
demonstrate how the program works.

Suppose an ideal day is 70 degrees with 37.5% humidity. The fitness function takes four values and tests how 'ideal' they are.
The first two values are added to see how hot the day is, and the second two values are multiplied to see how much
humidity there is. The results are compared to 70 degrees and 37.5% humidity to determine how ideal the day those
values produce is.

The goal is to have the first two values add up to as close to 70 as possible, while the second two values multiply out to as
close to 37.5 as possible.
'''

from ecabc import *
import os
import time

def idealDayTest(values, args=None):          # Fitness function that will be passed to the abc
    temperature = values[0] + values[1]       # Calculate the day's temperature
    humidity = values[2] * values[3]          # Calculate the day's humidity
    
    cost_temperature = abs(70 - temperature)  # Check how close the daily temperature is to 70
    cost_humidity = abs(37.5 - humidity)      # Check how close the humidity is to 37.5

    return cost_temperature + cost_humidity   # This will be the cost of your fitness function generated by the values


if __name__ == '__main__':
            # First value      # Second Value     # Third Value      # Fourth Value
    values = [('int', (0,100)), ('int', (0,100)), ('float',(0,100)), ('float', (0, 100))]

    start = time.time()
    abc = ABC(fitness_fxn=idealDayTest, 
            value_ranges=values
            )
    abc.create_employers()
    while True:
        abc.save_settings('{}/settings.json'.format(os.getcwd()))
        abc.run_iteration()
        if (abc.best_performer[0] < 2):
            break
    print("execution time = {}".format(time.time() - start))
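Because the cost in the example is zero only when the first two values sum to 70 and the last two multiply to 37.5, the fitness function can be checked by hand before starting a run. The snippet below repeats the function from the example in condensed form so that it runs on its own; the test inputs are arbitrary choices.

```python
def idealDayTest(values, args=None):
    """Same cost as in the example script: distance from 70 degrees and 37.5% humidity."""
    temperature = values[0] + values[1]
    humidity = values[2] * values[3]
    return abs(70 - temperature) + abs(37.5 - humidity)


# 30 + 40 = 70 and 5.0 * 7.5 = 37.5, so this input matches the ideal day exactly
print(idealDayTest([30, 40, 5.0, 7.5]))
# 30 + 30 = 60 (off by 10) and 5.0 * 7.0 = 35.0 (off by 2.5)
print(idealDayTest([30, 30, 5.0, 7.0]))
```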

Contributing, Reporting Issues and Other Support:

To contribute to ECabc, make a pull request. Contributions should include tests for new features added, as well as extensive documentation.

To report problems with the software or feature requests, file an issue. When reporting problems, include information such as error messages, your OS/environment and Python version.

For additional support/questions, contact Sanskriti Sharma (sanskriti_sharma@student.uml.edu), Travis Kessler (travis.j.kessler@gmail.com), Hernan Gelaf-Romer (hernan_gelafromer@student.uml.edu) and/or John Hunter Mack (Hunter_Mack@uml.edu).