Exposes synthetic dataset generation code from L2X as a pip package. To install, run:
pip install l2x-synthetic
You can then create a synthetic dataset like this:
from l2x_synthetic import XORGenerator
generator = XORGenerator(n_samples=100)
X, y = generator.get_data()
This generates new data every time you call get_data() ✨. Use random_state to make data generation reproducible.
Available generators:
from l2x_synthetic import XORGenerator
Relevant features: [0, 1]
from l2x_synthetic import OrangeGenerator
Relevant features: [0, 1, 2, 3]
from l2x_synthetic import AdditiveGenerator
Relevant features: [0, 1, 2, 3]
from l2x_synthetic import SwitchGenerator
Relevant features for X[:n//2] (first half of the dataset): [0, 1, 2, 3]
Relevant features for X[n//2:] (second half of the dataset): [4, 5, 6, 7]
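Because the relevant features are known for every dataset, they can serve as ground truth when checking a feature-selection or explanation method. The sketch below is only an illustration: the dictionary keys, the `selection_recall` helper, and the `picked` list are hypothetical names and values, not part of the package. Only the index lists are taken from the list above.

```python
from typing import Dict, Iterable, List

# Ground-truth relevant feature indices, copied from the list above.
# The dictionary keys are illustrative labels, not package identifiers.
RELEVANT_FEATURES: Dict[str, List[int]] = {
    "xor": [0, 1],
    "orange": [0, 1, 2, 3],
    "additive": [0, 1, 2, 3],
    "switch_first_half": [0, 1, 2, 3],
    "switch_second_half": [4, 5, 6, 7],
}


def selection_recall(selected: Iterable[int], relevant: Iterable[int]) -> float:
    """Return the fraction of ground-truth relevant features found in `selected`."""
    relevant_set = set(relevant)
    if not relevant_set:
        raise ValueError("relevant must contain at least one feature index")
    return len(relevant_set & set(selected)) / len(relevant_set)


# Hypothetical output of a feature-selection method on the XOR dataset.
picked = [0, 3]
print(selection_recall(picked, RELEVANT_FEATURES["xor"]))  # prints 0.5
```

For the switch dataset, the two halves would need to be scored separately, since the relevant features differ between X[:n//2] and X[n//2:].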
All generators share the following interface:
class l2x_synthetic.DataGenerator:
    name: str = None  # contains a human-friendly name for the generator.
    n_samples: int = 100
    random_state: Optional[int] = None

    def get_data(self) -> Tuple[np.ndarray, np.ndarray]:
        ...

    def get_dataframe(self) -> pd.DataFrame:
        ...
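To make the interface more concrete, here is a minimal sketch of a class with the same fields and the same get_data() signature. It is not the package's implementation: the class name `SketchXORGenerator`, the `n_features` field, the use of `np.random.default_rng`, and the labelling rule (binary labels drawn with probability sigmoid(x0 * x1), so that only features 0 and 1 matter) are all assumptions made for illustration. get_dataframe() is left out to keep the sketch free of a pandas dependency.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import numpy as np


@dataclass
class SketchXORGenerator:
    """Illustrative stand-in for a generator; not the l2x_synthetic code."""

    name: str = "XOR (sketch)"
    n_samples: int = 100
    random_state: Optional[int] = None
    n_features: int = 10  # assumption: the number of features is not stated above

    def __post_init__(self) -> None:
        # One RNG per generator: repeated get_data() calls draw fresh samples,
        # while a fixed random_state makes the whole sequence repeatable.
        self._rng = np.random.default_rng(self.random_state)

    def get_data(self) -> Tuple[np.ndarray, np.ndarray]:
        X = self._rng.standard_normal((self.n_samples, self.n_features))
        # Assumed labelling rule: P(y = 1 | x) = sigmoid(x0 * x1),
        # so only features 0 and 1 carry information about y.
        p_one = 1.0 / (1.0 + np.exp(-X[:, 0] * X[:, 1]))
        y = (self._rng.random(self.n_samples) < p_one).astype(int)
        return X, y


# Two generators with the same seed produce the same first batch.
a = SketchXORGenerator(n_samples=5, random_state=0)
b = SketchXORGenerator(n_samples=5, random_state=0)
Xa, ya = a.get_data()
Xb, yb = b.get_data()
print(np.array_equal(Xa, Xb) and np.array_equal(ya, yb))  # prints True
```

In this sketch, a second call to a.get_data() continues the same random stream and therefore returns a different batch. Whether the package behaves the same way for a seeded generator is not stated above.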
To install the dependencies from a local checkout, run:
pip install -r requirements.txt
See the original repo: