Simplify spark session fixture usage #10915

Merged

@Cokral (Contributor) commented Jan 26, 2024

🛠 DevTools 🛠

Install mlflow from this PR

pip install git+https://github.com/mlflow/mlflow.git@refs/pull/10915/merge

Checkout with GitHub CLI

gh pr checkout 10915

Related Issues/PRs

#10045

What changes are proposed in this pull request?

Remove the duplicated definitions of the spark_session fixture from the following test modules:

  • tests.spark.autologging.ml.test_pyspark_ml_autologging
  • tests.evaluate.test_evaluation
  • tests.tensorflow.test_keras_pyfunc_model_works_with_all_input_types

Instead, reuse the spark_session fixture already defined in tests.utils.test_file_utils.

I had to add # noqa: F401 to the import so that the linter doesn't remove it as unused.
Is there a better way to do this?
One option would be to do the imports in the conftest.py files instead of directly in each test file.
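
The conftest.py idea raised above could look roughly like the following. This is a minimal sketch, not MLflow's actual code: the file location, the `local[1]` master setting, the module scope, and the lazy pyspark import are all assumptions made for illustration. It relies only on pytest's standard behavior of making fixtures defined in a conftest.py available to every test module in that directory and below.

```python
# conftest.py (hypothetical location: a directory shared by the Spark-dependent tests)
import pytest


@pytest.fixture(scope="module")
def spark_session():
    """Yield a local SparkSession and stop it once the module's tests finish."""
    # Imported lazily so that importing this conftest.py does not itself
    # require pyspark (an illustrative choice, not necessarily MLflow's).
    from pyspark.sql import SparkSession

    # "local[1]" is an assumed setting for this sketch.
    session = SparkSession.builder.master("local[1]").getOrCreate()
    try:
        yield session
    finally:
        session.stop()
```

With a layout like this, a test only names the fixture as a parameter, for example `def test_something(spark_session): ...`, and pytest resolves it from conftest.py. No import appears in the test file, so there is nothing for the linter to flag and no # noqa: F401 is needed.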

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

@github-actions bot added the rn/none label ("List under Small Changes in Changelogs.") Jan 26, 2024

github-actions bot commented Jan 26, 2024

Documentation preview for d1e750c will be available here when this CircleCI job completes successfully.

@Cokral Thank you for the contribution! Could you fix the following issue(s)?

⚠ DCO check

The DCO check failed. Please sign off your commit(s) by following the instructions here. See https://github.com/mlflow/mlflow/blob/master/CONTRIBUTING.md#sign-your-work for more details.

@Cokral force-pushed the feature/simplify-spark-session-fixture-usage branch from 900cf55 to 4dc98ee on January 26, 2024 20:29
@Cokral changed the title from "Moving back model_uri to predict" to "Simplify spark session fixture usage" Jan 26, 2024
Signed-off-by: Cokral <coquereau.thomas@gmail.com>
@Cokral force-pushed the feature/simplify-spark-session-fixture-usage branch from 4dc98ee to d1e750c on January 26, 2024 20:47
@serena-ruan (Collaborator) left a comment

LGTM!

@serena-ruan serena-ruan merged commit 8513570 into mlflow:master Jan 29, 2024
68 of 71 checks passed
@Cokral Cokral deleted the feature/simplify-spark-session-fixture-usage branch February 6, 2024 07:55