# Explore fully parallel launching of all trainables
This is Option 1 from the conversation below with Claude Opus (via Phind):

**Me:** What is the recommended way of launching multiple trainables: Option 1 or Option 2? And why?


**Claude:** Both options have their merits, but I would recommend using Option 2 for launching multiple trainables. Here's why:

1. Separation of search spaces: Option 2 defines separate search spaces for each model type (`rf_search_space` and `linreg_search_space`). This approach allows for cleaner and more explicit definition of the hyperparameters specific to each model type. It makes the code more readable and maintainable, especially if you have multiple model types with different sets of hyperparameters.
2. Flexibility in tuning: With Option 2, you have the flexibility to run separate tuning sessions for each model type. This allows you to control the tuning process independently for each model type. You can easily modify the tuning configuration (e.g., `num_samples`, `metric`, `mode`) for each model type if needed. It also enables you to run the tuning sessions in parallel or sequentially, depending on your requirements.
3. Clarity in results: Option 2 provides separate reporting of the best hyperparameters and best metric value for each model type. This makes it easier to analyze and compare the results of different model types. You can clearly see the best configuration and performance for each model type, which can help in making informed decisions about which model to choose for your specific task.
4. Scalability: Option 2 is more scalable when you have multiple model types or want to add new model types in the future. You can easily add new search spaces for additional model types without modifying the existing code. This modularity and extensibility make it easier to maintain and expand your codebase.

In contrast, Option 1 combines the search spaces for all model types into a single search space. While this approach can work, it may become less manageable as the number of model types and hyperparameters grows. It also doesn't provide the same level of flexibility and clarity in tuning and reporting results for each model type separately.

However, it's worth noting that Option 1 can be suitable if you have a small number of model types and want to perform a single tuning session across all model types. It may be more concise and straightforward in such cases.

Ultimately, the choice between Option 1 and Option 2 depends on your specific requirements, the complexity of your model types, and the desired level of control and clarity in the tuning process. In general, Option 2 offers more flexibility, modularity, and clarity, making it the recommended approach for launching multiple trainables.
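
For comparison, here is a minimal sketch of what Option 2 could look like, since the conversation above describes it without showing it. The names `rf_search_space` and `linreg_search_space` come from the answer; `build_search_spaces` and `tune_each_model` are hypothetical helpers, and the value ranges simply mirror the Option 1 cell below. It assumes a trainable such as `train_model` that branches on `config["model_type"]` and reports `mse`.

```python
def build_search_spaces():
    """Return one Ray Tune search space per model type (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without Ray installed.
    from ray import tune

    rf_search_space = {
        "model_type": "rf",  # constant: selects the random-forest branch
        "n_estimators": tune.choice([1, 2]),
        "max_depth": tune.choice([None, 1, 2]),
        "min_samples_split": tune.choice([2, 5, 10]),
        "min_samples_leaf": tune.choice([1, 2, 4]),
    }
    linreg_search_space = {
        "model_type": "linreg",  # constant: selects the ElasticNet branch
        "alpha": tune.uniform(0.0, 1.0),
        "l1_ratio": tune.uniform(0.0, 1.0),
        "fit_intercept": tune.choice([True, False]),
    }
    return {"rf": rf_search_space, "linreg": linreg_search_space}


def tune_each_model(trainable, num_samples=2):
    """Run one tuning session per model type and return the best result of each."""
    from ray import tune

    best = {}
    for name, space in build_search_spaces().items():
        tuner = tune.Tuner(
            trainable,
            param_space=space,
            tune_config=tune.TuneConfig(
                num_samples=num_samples, metric="mse", mode="min"
            ),
        )
        # metric and mode are taken from TuneConfig, so no arguments are needed.
        best[name] = tuner.fit().get_best_result()
    return best
```

A call such as `best = tune_each_model(train_model)` would then give `best["rf"]` and `best["linreg"]` separately. Note that this loop runs the sessions one after another; trials are parallel within each session, but the two model types are not tuned at the same time, which is what Option 1 below provides.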


In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import ElasticNet
import ray
from ray import tune
from ray.air import session  # note: newer Ray releases deprecate this in favor of ray.train.report


def train_model(config):
    """Train the model selected by config["model_type"] and report its test MSE to Tune."""
    data = fetch_california_housing()
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, test_size=0.2, random_state=42
    )

    if config["model_type"] == "rf":
        model = RandomForestRegressor(
            n_estimators=config["n_estimators"],
            max_depth=config["max_depth"],
            min_samples_split=config["min_samples_split"],
            min_samples_leaf=config["min_samples_leaf"],
        )
    else:  # config["model_type"] == "linreg"
        model = ElasticNet(
            alpha=config["alpha"],
            l1_ratio=config["l1_ratio"],
            fit_intercept=config["fit_intercept"],
        )

    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    session.report({"mse": mse})


ray.init(ignore_reinit_error=True)
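# Option 1: a single search space shared by both model types. Every trial samples
# all keys below; train_model only reads the ones that belong to its model_type.
search_space = {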
    "model_type": tune.grid_search(["rf", "linreg"]),
    # rf
    "n_estimators": tune.choice([1, 2]),
    "max_depth": tune.choice([None, 1, 2]),
    "min_samples_split": tune.choice([2, 5, 10]),
    "min_samples_leaf": tune.choice([1, 2, 4]),
    # linreg
    "alpha": tune.uniform(0.0, 1.0),
    "l1_ratio": tune.uniform(0.0, 1.0),
    "fit_intercept": tune.choice([True, False]),
}

tuner = tune.Tuner(
    train_model,
    param_space=search_space,
    tune_config=tune.TuneConfig(
        num_samples=2,  # the grid over model_type is repeated twice: 2 x 2 = 4 trials
        metric="mse",
        mode="min",
    ),
)

results = tuner.fit()
best_result = results.get_best_result(metric="mse", mode="min")
print(f"Best hyperparameters: {best_result.config}")
print(f"Best MSE: {best_result.metrics['mse']}")
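
Because Option 1 samples every key for every trial, `best_result.config` also contains hyperparameters of the model type that was not used, and only the overall winner is printed. The sketch below shows one way to recover a per-model summary from a single combined run. `USED_KEYS`, `best_per_model`, and the trial records are hypothetical and for illustration only; the MSE values are made up, not real results.

```python
from typing import Any, Dict, Iterable

# Hyperparameters that each branch of train_model actually reads.
USED_KEYS = {
    "rf": ("n_estimators", "max_depth", "min_samples_split", "min_samples_leaf"),
    "linreg": ("alpha", "l1_ratio", "fit_intercept"),
}


def best_per_model(trials: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Return the lowest-MSE trial per model type, keeping only the keys it used."""
    best: Dict[str, Dict[str, Any]] = {}
    for trial in trials:
        config, mse = trial["config"], trial["mse"]
        model_type = config["model_type"]
        if model_type not in best or mse < best[model_type]["mse"]:
            params = {key: config[key] for key in USED_KEYS[model_type]}
            best[model_type] = {"mse": mse, "params": params}
    return best


# Made-up trial records for illustration (not output from a real run).
_shared = {
    "n_estimators": 1, "max_depth": None, "min_samples_split": 2,
    "min_samples_leaf": 1, "alpha": 0.5, "l1_ratio": 0.5, "fit_intercept": True,
}
example_trials = [
    {"config": {**_shared, "model_type": "rf"}, "mse": 0.9},
    {"config": {**_shared, "model_type": "rf", "n_estimators": 2}, "mse": 0.8},
    {"config": {**_shared, "model_type": "linreg"}, "mse": 0.7},
]

summary = best_per_model(example_trials)
for model_type, entry in summary.items():
    print(model_type, entry["mse"], entry["params"])
```

With the cell above, the records could probably be built as `[{"config": r.config, "mse": r.metrics["mse"]} for r in results]`, assuming every trial finished and reported `mse`; trials that errored may have no such metric and would need to be skipped.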