# Example usage

To use `autopredictor` in a project:

In [2]:
import autopredictor

print(autopredictor.__version__)


In [None]:
from autopredictor.select_model import select_model

In real-world settings such as healthcare, selecting candidate models, choosing relevant metrics, examining individual models, and settling on the most suitable one is time-consuming, and the result can influence decisions and patient outcomes. The `autopredictor` package is designed to streamline these steps for machine learning on continuous data, which shortens the workflow. Healthcare professionals, data scientists, and researchers can then spend more of their time interpreting data and making decisions. The diabetes dataset used in this example stands in for real-world health data, and the way models are trained, evaluated, and selected here mirrors the procedure followed in actual research.

In short, the package provides a framework for model selection in machine learning tasks. It trains and evaluates a range of models, reports several performance metrics for each one, and presents the results clearly so that the most effective model can be chosen on an informed basis. Applying it to the diabetes dataset illustrates its relevance to real-world data, both in healthcare analytics and in other domains that rely on predictive modeling.

## SELECT MODEL

After the models have been trained and evaluated, the `select_model` function lets the user pick a specific model and view its performance metrics. It checks that the inputs have the correct types and that the requested model is present in the results. If the model is found, the function returns that model's performance metrics; otherwise, it lists the available models. This is useful for zooming in on one model's performance, or for retrieving the scores of the model that performed best on a particular metric.
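The behavior described above can be sketched roughly as follows. This is an illustrative stand-in, not the package's actual implementation: the name `select_model_sketch`, the choice to raise exceptions (rather than, say, print a message), the metric columns `MSE` and `R2`, and the score values are all assumptions made for the example.

```python
import pandas as pd


def select_model_sketch(score_df, model_name):
    """Illustrative stand-in for select_model; not the package's actual code."""
    # Validate the input types before touching the data.
    if not isinstance(score_df, pd.DataFrame):
        raise TypeError("score_df must be a pandas DataFrame")
    if not isinstance(model_name, str):
        raise TypeError("model_name must be a string")

    # Model names are assumed to be stored in the index of the results table.
    if model_name not in score_df.index:
        available = ", ".join(map(str, score_df.index))
        raise ValueError(
            f"Model '{model_name}' not found. Available models: {available}"
        )

    # Return a one-row DataFrame holding the metrics of the chosen model.
    return score_df.loc[[model_name]]


# Made-up scores, for illustration only.
demo_scores = pd.DataFrame(
    {"MSE": [3100.0, 2900.0, 3300.0], "R2": [0.42, 0.46, 0.39]},
    index=["AdaBoost", "Linear Regression", "Random Forest"],
)
print(select_model_sketch(demo_scores, "Random Forest"))
```

Selecting with a list (`loc[[model_name]]`) keeps the result a DataFrame with the same columns as the full table, which makes it easy to compare or concatenate with other selections.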

With the diabetes dataset, for instance, an analyst might suspect that a Random Forest Regressor is well suited to the data because it can model non-linear relationships and interactions between variables. The `select_model` function lets them isolate that model and examine its performance on its own. This is especially useful when many models have been trained and evaluated and the analyst needs the details of one model without wading through the full set of results.

The function applies equally well outside healthcare. In finance, an analyst might want to evaluate how a particular model performs at predicting stock prices or market trends. In environmental science, a researcher could use it to assess a single model's accuracy in forecasting climate patterns or pollution levels.

Examining models one at a time matters when the candidates have different strengths and weaknesses depending on the context. A targeted look supports a more focused analysis and helps analysts decide which model to deploy based on the criteria relevant to their field or problem.
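As a minimal sketch of retrieving the best model for a given metric, the snippet below picks the model name with plain pandas. It assumes the results table has model names as its index and one column per metric; the column names `MSE` and `R2` and the score values are made up for illustration and may not match what `autopredictor` produces.

```python
import pandas as pd

# Made-up scores for illustration; real values come from your own evaluation.
score_df_demo = pd.DataFrame(
    {"MSE": [3100.0, 2900.0, 3300.0], "R2": [0.42, 0.46, 0.39]},
    index=["AdaBoost", "Linear Regression", "Random Forest"],
)

# Lower is better for error metrics such as MSE, so take the label of the minimum.
best_by_mse = score_df_demo["MSE"].idxmin()

# Higher is better for R2, so take the label of the maximum.
best_by_r2 = score_df_demo["R2"].idxmax()

print(best_by_mse, best_by_r2)
```

The resulting name can then be passed on, for example `select_model(score_df, best_by_mse)`, to view all metrics for that model.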

In [None]:
# score_df is assumed to be the DataFrame of model scores produced earlier in the workflow
selected_model_name = 'AdaBoost'
AdaBoost_scores = select_model(score_df, selected_model_name)

In [2]:
selected_model_name = 'Linear Regression'
LR_scores = select_model(score_df, selected_model_name)

In [None]:
selected_model_name = 'Random Forest'
RF_scores = select_model(score_df, selected_model_name)

### SELECT MODEL ADDITIONAL FEATURES

Besides focusing on one model's performance, the `select_model` function is a convenient way to verify that a model is present in the results. When an analyst names a model, the function checks whether that name appears in the DataFrame's index. If it does not, the function returns an error message together with a list of the models that are included. This is helpful in several ways:

- **Error Prevention and Troubleshooting:** It helps in preventing errors that might occur from typos or incorrect model names. By returning an error message and a list of included models, it guides the user towards the correct model names, making the troubleshooting process more intuitive and less time-consuming.

- **Model Inventory Check:** It essentially acts as a quick inventory check, allowing users to confirm which models have been trained and evaluated. This is especially useful in collaborative environments where multiple team members might be working on the same dataset but focusing on different models. It ensures that everyone is aware of the models that are already included in the analysis, helping to avoid redundant work.

- **Informed Decision Making:** By providing a list of available models, it aids in informed decision-making. Analysts can quickly scan through the available models and decide which ones to focus on based on their specific criteria or hypothesis, without having to look through the entire DataFrame.

For example, an environmental scientist analyzing an air quality dataset might want to examine how well a specific model predicts pollution levels. If they are not sure whether that model was included in the analysis, they can call `select_model` with its name. If the model is not found, the function's feedback tells them so and shows which models are available, so the scientist can decide whether to proceed with an already trained model or to train and evaluate the model of interest.
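On the user side, a misspelled name can also be resolved against the list of available models before calling `select_model`. The helper below is a hypothetical convenience, not part of `autopredictor`: the function name `suggest_model_name`, the model list, and the similarity cutoff are assumptions, and it relies only on the standard library's `difflib.get_close_matches`.

```python
import difflib

# Hypothetical list of model names, e.g. obtained with list(score_df.index).
available_models = ["AdaBoost", "Linear Regression", "Random Forest"]


def suggest_model_name(requested, available, cutoff=0.6):
    """Return the closest available name, or None if nothing is similar enough."""
    # An exact match needs no correction.
    if requested in available:
        return requested
    # Ask difflib for the single best candidate above the similarity cutoff.
    matches = difflib.get_close_matches(requested, available, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(suggest_model_name("Rand Fore", available_models))
```

The cutoff is a judgment call: a low value can map an unrelated name onto an existing model, so it is worth showing the suggestion to the user rather than applying it silently.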

In [None]:
# A model name that is not in score_df, to show the list of available models
selected_model_name = 'Other Regressor'
Other_Regressor_scores = select_model(score_df, selected_model_name)

In [None]:
# Misspelled model name: 'Rand Fore' instead of 'Random Forest'
selected_model_name = 'Rand Fore'
RF_scores = select_model(score_df, selected_model_name)