
Examples of use

Lidia edited this page Sep 17, 2025 · 12 revisions

We provide several examples to demonstrate the applicability of the tool across different domains: reading data from different data sources, performing several cleaning operations, using diverse training algorithms, and computing various evaluation metrics.

Example details

The first two pipelines were developed to evaluate the applicability of the tool: the Diabetes example uses a well-known toy dataset provided by the sklearn module (data read from a CSV file), and the Big Mart Sales example was developed to participate in a popular hackathon. The Iris Classification, Purchase Experience Score, and Image Classification pipelines were generated to replicate three existing ML pipelines hosted on GitHub.

| Example | Domain | Cleanings | Training algorithm | Evaluation metrics | URL |
|---|---|---|---|---|---|
| Diabetes | Health | No cleaning | SVM | Accuracy | www4.stat.ncsu.edu/~boos/var.select/diabetes.html |
| Big Mart Sales | Sales prediction | Replace nulls with mean and mode | Random Forest Regression | Accuracy | www.analyticsvidhya.com/datahack/contest/practice-problem-big-mart-sales-iii/ |
| Iris classification | Botany | No cleaning | Random Forest Regression, SVM | Accuracy | www.github.com/Ernesto905/Zenml-Sentiment-Analysis-Pipeline (training_pipeline.py) |
| Purchase Experience Score | Sales reviews | Replace nulls with median or text | Linear Regression | Accuracy, MSE, RMSE, R2 | www.github.com/Akurati-Kaustiki/MLOPS-Assignment-Group17 |
| Image Classification | Images | No cleaning | SVM | Confusion matrix, Precision, Recall, F1-score | [svm-image-classification](https://github.com/ahmdmohamedd/svm-image-classification/tree/main) |

Evaluation

To evaluate the quality of the generated code, we used well-known tools such as Pylint and Radon. Pylint provides a quality score from 0 (worst) to 10 (best code quality). Radon computes the Cyclomatic Complexity (CC) and Maintainability Index (MI) metrics. CC measures how many independent paths exist in the code: the more branching logic (e.g., if, for, while, try), the higher the number, and the harder the code is to understand, test, and maintain. MI is a composite score from 0 to 100 that estimates how maintainable the code is; the higher the number, the better the maintainability. Both metrics complement the quantitative value with a qualitative label from A (the best qualification) to F (the worst).
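As an illustration, CC can be counted by hand as the number of decision points plus one. The snippet below is a hypothetical toy function, not taken from any of the generated pipelines:

```python
def grade(scores):
    """Assign a letter grade to each score.

    Decision points: the `for` loop, the `if`, and the `elif`
    give a cyclomatic complexity of 3 + 1 = 4.
    """
    labels = []
    for s in scores:      # decision point 1
        if s >= 90:       # decision point 2
            labels.append("A")
        elif s >= 50:     # decision point 3
            labels.append("B")
        else:             # `else` adds no new independent path
            labels.append("C")
    return labels

print(grade([95, 60, 10]))  # ['A', 'B', 'C']
```

Radon computes this same count automatically for every function, method, and class in a module.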

The following table presents the results of the three code quality metrics for the examples generated with the MLSToolbox Code Generator. The generated code obtains good values for all three metrics: PyLint scores close to 10, Cyclomatic Complexity values close to 1, and Maintainability Index values close to 100.

| Example | PyLint score | Cyclomatic complexity | Maintainability index |
|---|---|---|---|
| Diabetes | 9.25 | 1.40 (A) | 97.95 (A) |
| Big Mart Sales | 9.03 | 1.41 (A) | 97.96 (A) |
| Iris classification | 8.82 | 1.39 (A) | 97.62 (A) |
| Purchase Experience Score | 8.94 | 1.39 (A) | 97.29 (A) |
| Image Classification | 8.87 | 1.48 (A) | 93.36 (A) |

The following table provides the quality metric results for the example implementations that were not generated by the MLSToolbox Code Generator.

| Example | Tool | PyLint score | Cyclomatic complexity | Maintainability index |
|---|---|---|---|---|
| Diabetes | Kubeflow | 4.29 | 1 (A) | 74.55 (A) |
| Diabetes | ZenML | 7.53 | 1 (A) | 71.35 (A) |
| Diabetes | scikit-learn | 3.57 | 1 (A) | 88.85 (A) |
| Diabetes | -- | 0.55 | 2 (A) | 81.32 (A) |
| Big Mart Sales | Kubeflow | 0 | 1 (A) | 61.77 (A) |
| Big Mart Sales | ZenML | 5.49 | 2.5 (A) | 63.09 (A) |
| Big Mart Sales | scikit-learn | 6.94 | 1.25 (A) | 100 (A) |
| Big Mart Sales | -- | 0.55 | 1 (A) | 72.51 (A) |
| Iris classification | ZenML | 0 | 1.97 (A) | 97.77 (A) |
| Purchase Experience Score | -- | 5.66 | 3 (A) | 71.90 (A) |
| Image Classification | -- | 7.1 | 2 (A) | 84.732 (A) |

Conclusion

The results consistently show that MLSToolbox-generated code achieves the highest Pylint scores and a better MI than most of the other implementations, highlighting its design focus on maintainability and evolution. The slightly higher CC can be attributed to a modular design that prioritizes reusability and extensibility.
