lahovniktadej/gatree

GATree

DOI JOSS

📋 About • 📦 Installation • 🚀 Usage • 🧬 Genetic Operators • 🫂 Community Guidelines • 📜 License

📋 About

GATree is a Python library designed for implementing evolutionary decision trees using a standard genetic algorithm approach. The library provides functionalities for selection, mutation, and crossover operations within the decision tree structure, allowing users to evolve and optimise decision trees for various classification and clustering tasks. 🌲🧬

The library's core objective is to empower users in creating and fine-tuning decision trees through an evolutionary process, opening avenues for innovative approaches to classification and clustering problems. GATree enables the dynamic growth and adaptation of decision trees, offering a flexible and powerful tool for machine learning enthusiasts and practitioners. 🚀🌿

GATree is currently limited to classification and clustering tasks, with support for regression tasks planned for future releases. 💡

📦 Installation

pip

To install GATree using pip, run the following command:

pip install gatree

🚀 Usage

The following example demonstrates how to perform classification of the iris dataset using GATree. More examples can be found in the examples directory.

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from gatree.methods.gatreeclassifier import GATreeClassifier

# Load the iris dataset
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target, name='target')

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=10)

# Create and fit the GATree classifier
gatree = GATreeClassifier(n_jobs=16, random_state=32)
gatree.fit(X=X_train, y=y_train, population_size=100, max_iter=100)

# Make predictions on the testing set
y_pred = gatree.predict(X_test)

# Evaluate the accuracy of the classifier
print(accuracy_score(y_test, y_pred))

🧬 Genetic Operators in GATree

The genetic algorithm for decision trees in GATree involves several key operators: selection, elitism, crossover, and mutation. Each of these operators plays a crucial role in the evolution and optimisation of the decision trees. Below is a detailed description of each operator within the context of the GATree class.

Selection

Selection is the process of choosing parent trees from the current population to produce offspring for the next generation. By default, the GATree class uses tournament selection, a method where a subset of the population is randomly chosen, and the best individual from this subset is selected.
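
As an illustration of the idea, tournament selection can be sketched in a few lines of plain Python. This is a generic sketch, not GATree's own implementation: the function name `tournament_select`, the `tournament_size` parameter, and the convention that a higher fitness value is better are all assumptions made for this example.

```python
import random


def tournament_select(population, fitness, tournament_size=3, rng=None):
    """Return the fittest of `tournament_size` randomly drawn individuals.

    Hypothetical helper for illustration; assumes higher fitness is better.
    """
    rng = rng or random.Random()
    # Draw the tournament without replacement, capped at the population size.
    k = min(tournament_size, len(population))
    contenders = rng.sample(population, k)
    # The best contender wins the tournament and becomes a parent.
    return max(contenders, key=fitness)


# Toy usage: individuals are integers and a larger value is fitter.
population = [3, 9, 1, 7, 5]
winner = tournament_select(
    population, fitness=lambda x: x, tournament_size=3, rng=random.Random(0)
)
```

A small tournament keeps selection pressure low, so weaker trees still get a chance to reproduce. A tournament as large as the population always returns the single best individual.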

Elitism

Elitism ensures that the best-performing individuals (trees) from the current generation are carried over to the next generation without any modification. This guarantees that the best fitness in the population does not decrease from one generation to the next.
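
The following generic sketch shows how elitism might fit into building the next generation. It is not taken from GATree's source: `next_generation_with_elitism`, `elite_size`, and the `make_offspring` callback are hypothetical names, and higher fitness is assumed to be better.

```python
def next_generation_with_elitism(population, fitness, make_offspring, elite_size=1):
    """Build a same-sized generation that keeps the `elite_size` best individuals.

    Hypothetical helper for illustration; assumes higher fitness is better.
    """
    # Rank the current generation from best to worst.
    ranked = sorted(population, key=fitness, reverse=True)
    # Elites are copied over unchanged.
    elites = ranked[:elite_size]
    # The remaining slots are filled by offspring (selection, crossover, mutation).
    n_offspring = len(population) - len(elites)
    offspring = [make_offspring(population) for _ in range(n_offspring)]
    return elites + offspring


# Toy usage: integers as individuals, and a deliberately poor offspring
# generator that always returns the worst individual.
population = [3, 9, 1, 7, 5]
new_gen = next_generation_with_elitism(
    population, fitness=lambda x: x, make_offspring=min, elite_size=2
)
# new_gen == [9, 7, 1, 1, 1]: the best fitness survives despite bad offspring.
```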

Crossover

Crossover is a genetic operator used to combine the genetic information of two parent trees to generate new offspring. This enables exploration, which helps in creating diversity in the population and combining good traits from both parents.
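
One common way to realise crossover on trees is subtree crossover, sketched below on a toy binary tree. The `Node` class, `all_nodes`, and `subtree_crossover` are hypothetical names introduced for this example, and GATree's actual tree representation and crossover operator may differ.

```python
import copy
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy tree node: a split description or a class label (illustrative only)."""
    label: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def all_nodes(tree):
    """Collect every node of the tree in pre-order."""
    if tree is None:
        return []
    return [tree] + all_nodes(tree.left) + all_nodes(tree.right)


def subtree_crossover(parent_a, parent_b, rng):
    """Copy parent_a and replace one random subtree with a random subtree of parent_b."""
    child = copy.deepcopy(parent_a)  # the parents are never modified
    target = rng.choice(all_nodes(child))
    donor = copy.deepcopy(rng.choice(all_nodes(parent_b)))
    # Overwrite the target node in place so its parent's link stays valid.
    target.label, target.left, target.right = donor.label, donor.left, donor.right
    return child


parent_a = Node("petal length < 2.5", Node("setosa"), Node("versicolor"))
parent_b = Node("petal width < 1.7", Node("versicolor"), Node("virginica"))
child = subtree_crossover(parent_a, parent_b, random.Random(0))
```

Working on deep copies keeps both parents intact, so they can still be selected again or carried over as elites.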

Mutation

Mutation introduces random changes to a tree to maintain genetic diversity and explore new solutions. This helps in avoiding local optima by introducing new genetic structures.
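
A minimal sketch of one possible mutation operator follows. It picks a random node and either redraws the class label of a leaf or redraws the split of an internal node. The `Node` layout, the `mutate` function, and the assumed feature value range of 0 to 10 are hypothetical choices for this example, not GATree's API.

```python
import copy
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy tree node (illustrative only)."""
    feature: Optional[int] = None      # set on internal nodes
    threshold: Optional[float] = None  # set on internal nodes
    label: Optional[int] = None        # set on leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def all_nodes(tree):
    """Collect every node of the tree in pre-order."""
    if tree is None:
        return []
    return [tree] + all_nodes(tree.left) + all_nodes(tree.right)


def mutate(tree, n_features, classes, rng):
    """Return a mutated copy of `tree`; the original is left untouched."""
    mutant = copy.deepcopy(tree)
    node = rng.choice(all_nodes(mutant))
    if node.label is not None:
        # Leaf: draw a class label (it may coincide with the old one).
        node.label = rng.choice(classes)
    else:
        # Internal node: draw a new split feature and threshold.
        node.feature = rng.randrange(n_features)
        node.threshold = rng.uniform(0.0, 10.0)  # assumed feature value range
    return mutant


tree = Node(feature=2, threshold=2.5, left=Node(label=0), right=Node(label=1))
mutant = mutate(tree, n_features=4, classes=[0, 1, 2], rng=random.Random(0))
```

This variant changes node contents only, so the tree keeps its shape. Operators that grow or prune subtrees are another common choice.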

🫂 Community Guidelines

Contributing

To contribute to the software, please read the contributing guidelines.

Reporting Issues

If you encounter any issues with the library, please report them using the issue tracker. Include a detailed description of the problem, including the steps to reproduce the problem, the stack trace, and details about your operating system and software version.

Seeking Support

If you need support, please first refer to the documentation. If you still require assistance, please open an issue on the issue tracker with the question tag. For private inquiries, you can contact us via e-mail at tadej.lahovnik1@um.si or saso.karakatic@um.si.

📜 License

This package is distributed under the MIT License. This license can be found online at http://www.opensource.org/licenses/MIT.

Disclaimer

This framework is provided as-is, and there are no guarantees that it fits your purposes or that it is bug-free. Use it at your own risk!