Enhance Integration with Spark #1097

Merged on Jul 10, 2023 (15 commits)

levscaut
Contributor

Why are these changes needed?

Spark DataFrames and a new model class, SparkLGBMEstimator, have been introduced in FLAML, but there is no official documentation detailing their usage. This PR adds the necessary documentation for these Spark-related features.

Additionally, I encountered a minor issue with lgbm_spark while revising this model: users had to specify their label twice, in both the label and labelCol arguments. labelCol is no longer required and is aligned with label by default. This makes the Spark models more consistent with the FLAML style.
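
A minimal sketch of the defaulting behavior described above, in plain Python so it runs without Spark. The helper name `resolve_label_col` is hypothetical and is not part of FLAML; the actual handling in flaml/automl/model.py may differ, in particular in how an explicitly passed labelCol is treated (this sketch assumes it is kept).

```python
def resolve_label_col(label, **fit_kwargs):
    """Hypothetical helper: fill in labelCol from label when it is omitted."""
    resolved = dict(fit_kwargs)             # copy so the caller's kwargs are untouched
    resolved.setdefault("labelCol", label)  # assumption: an explicit labelCol is kept
    return resolved


# Before the change, callers repeated the label in both arguments.
before = resolve_label_col("target", labelCol="target")

# After the change, passing label alone is enough.
after = resolve_label_col("target")

assert before == after
```

With this pattern the label is stated once and the Spark-specific argument follows it.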

Related issue number

Closes #1088

Checks

levscaut requested a review from thinkall on June 28, 2023 16:28
levscaut changed the title from "Improve spark" to "Enhance Integration with Spark" on Jun 28, 2023
Resolved review threads: 9 on website/docs/Examples/Integrate - Spark.md, 1 on flaml/automl/model.py
thinkall (Collaborator) left a comment

LGTM. Thanks.

thinkall added this pull request to the merge queue on Jul 10, 2023
Merged via the queue into microsoft:main with commit 5eece5c Jul 10, 2023
13 checks passed
levscaut deleted the improve_spark branch on July 10, 2023 07:05
Development

Successfully merging this pull request may close these issues.

Add Spark dataframe input and Spark models support documentation
4 participants