Add LogisticRegressionOperator with API training support and integration test by xristlamp · Pull Request #570 · apache/wayang

xristlamp · 2025-05-16T12:45:14Z

This pull request introduces support for training logistic regression models in Apache Wayang via the public API.

Summary of changes:

- Implemented LogisticRegressionOperator (logical operator).
- Added SparkLogisticRegressionOperator for Spark execution.
- Defined LogisticRegressionModel interface.
- Created LogisticRegressionMapping to map the logical operator to Spark.
- Registered the mapping in Spark's ML plugin.
- Extended the public API:
  - Added .trainLogisticRegression(...) to DataQuanta and DataQuantaBuilder.
  - Introduced LogisticRegressionDataQuantaBuilder class (consistent with dlTraining).
- Added an integration test, testLogisticRegressionWithAPI in SparkIntegrationIT.java, that verifies:
  - .loadCollection(...)
  - .trainLogisticRegression(...)
  - .predict(...)
  - .collect()
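
For readers unfamiliar with what the train and predict stages of that pipeline compute, here is a minimal plain-Java sketch of binary logistic regression. It is not the Wayang operator and not Spark MLlib's implementation: the class name, method names, the batch gradient-descent optimizer, and the iteration and learning-rate parameters are all assumptions made for illustration. Only the fitIntercept flag is taken from this PR.

```java
import java.util.Arrays;

// Illustrative stand-in only; hypothetical names, not Wayang or Spark code.
final class LogisticRegressionSketch {

    // Trains with batch gradient descent on the mean log-loss.
    // When fitIntercept is true, the last entry of the result is the bias term.
    static double[] train(double[][] features, double[] labels, boolean fitIntercept,
                          int iterations, double learningRate) {
        int d = features[0].length;
        double[] w = new double[fitIntercept ? d + 1 : d];
        for (int it = 0; it < iterations; it++) {
            double[] grad = new double[w.length];
            for (int i = 0; i < features.length; i++) {
                // Gradient of the log-loss w.r.t. the linear score is (p - y).
                double err = probability(w, features[i], fitIntercept) - labels[i];
                for (int j = 0; j < d; j++) {
                    grad[j] += err * features[i][j];
                }
                if (fitIntercept) {
                    grad[d] += err;
                }
            }
            for (int j = 0; j < w.length; j++) {
                w[j] -= learningRate * grad[j] / features.length;
            }
        }
        return w;
    }

    // Sigmoid of the linear score w.x (+ bias if fitIntercept).
    static double probability(double[] w, double[] x, boolean fitIntercept) {
        double z = fitIntercept ? w[x.length] : 0.0;
        for (int j = 0; j < x.length; j++) {
            z += w[j] * x[j];
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Thresholds the probability at 0.5 to produce 0.0 / 1.0 class labels.
    static double[] predict(double[] w, double[][] features, boolean fitIntercept) {
        return Arrays.stream(features)
                .mapToDouble(x -> probability(w, x, fitIntercept) >= 0.5 ? 1.0 : 0.0)
                .toArray();
    }

    public static void main(String[] args) {
        double[][] x = {{-2.0}, {-1.0}, {1.0}, {2.0}};
        double[] y = {0.0, 0.0, 1.0, 1.0};
        double[] w = train(x, y, true, 200, 0.5);
        System.out.println(Arrays.toString(predict(w, x, true)));
    }
}
```

In the PR itself, the equivalent work is delegated to Spark through SparkLogisticRegressionOperator; the sketch only shows the shape of the computation (features and labels in, a model out, predictions from the model).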

Testing:

Verified with mvn clean install: all tests pass successfully.

zkaoudi · 2025-05-19T08:56:54Z

wayang-api/wayang-api-scala-java/src/main/scala/org/apache/wayang/api/DataQuanta.scala

+    val operator = new LogisticRegressionOperator(fitIntercept)
+    this.connectTo(operator, 0)
+    labels.connectTo(operator, 1)
+    new DataQuanta[LogisticRegressionModel](operator)


Why create a new DataQuanta and not just output the operator as it's done for the other cases (see dlTrainingJava)?

python/src/pywy/basic/model/ops.py

python/src/pywy/basic/model/models.py

xristlamp and others added 4 commits May 14, 2025 15:58

testLogisticRegressionOperator API

6066b99

Merge branch 'apache:main' into main

2bd6827

LogisticRegressionOperator with Python API support

fecfa4f

LogisticRegressionOperator with Python API support

a915e16

zkaoudi requested changes May 19, 2025


xristlamp and others added 4 commits May 19, 2025 15:13

Merge branch 'apache:main' into main

e96c382

LogisticRegression changes as suggested

c2bcf97

Merge remote-tracking branch 'origin/main'

00b1d78

LogisticRegression changes as suggested UPDATE

691e535

zkaoudi reviewed May 20, 2025


python/src/pywy/basic/model/models.py (outdated)

Remove iputs_required() from LogisticRegression

160d04e

zkaoudi approved these changes May 20, 2025


zkaoudi merged commit e7426e1 into apache:main May 20, 2025
4 checks passed

zkaoudi merged 9 commits into apache:main from xristlamp:main
