Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-25345][ML] Deprecate public APIs from ImageSchema #22349

Closed
wants to merge 5 commits into from

Conversation

WeichenXu123
Copy link
Contributor

What changes were proposed in this pull request?

Deprecate public APIs from ImageSchema.

How was this patch tested?

N/A

@SparkQA
Copy link

SparkQA commented Sep 6, 2018

Test build #95749 has finished for PR 22349 at commit 2397321.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 6, 2018

Test build #95751 has finished for PR 22349 at commit 38c33da.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -20,6 +20,9 @@

An attribute of this module that contains the instance of :class:`_ImageSchema`.

.. note:: Deprecated in 2.4.0. Use `spark.read.format("image").load(path)` instead and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, that datasource wouldn't be able to replace toNDArray and toImage which require numpy. Might be better leave a note that it needs manual conversion with NumPy API for both.

@@ -35,6 +35,8 @@ import org.apache.spark.sql.types._
*/
@Experimental
@Since("2.3.0")
@deprecated("use `spark.read.format(\"image\").load(path)` and this `ImageSchema` will be " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other methods defined under ImageSchema that are not covered by the image data source. So we shall only deprecate readImages and leave other public methods as experimental. Same for Python.

@SparkQA
Copy link

SparkQA commented Sep 7, 2018

Test build #95796 has finished for PR 22349 at commit f5a13a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -207,6 +207,9 @@ def readImages(self, path, recursive=False, numPartitions=-1,
.. note:: If sample ratio is less than 1, sampling uses a PathFilter that is efficient but
potentially non-deterministic.

.. note:: Deprecated in 2.4.0. Use `spark.read.format("image").load(path)` instead and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to issue a warning when people call this code path (e.g. warnings.warn)? We do it in some of the other Python deprecated APIs but not all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added.

@SparkQA
Copy link

SparkQA commented Sep 8, 2018

Test build #95817 has finished for PR 22349 at commit 5cefd23.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -30,6 +30,7 @@
from pyspark import SparkContext
from pyspark.sql.types import Row, _create_row, _parse_datatype_json_string
from pyspark.sql import DataFrame, SparkSession
import warnings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically builtin package should be ordered above per PEP 8. I wonder why this is not caught.

import sys
import warnings
 
import numpy as np

from pyspark import SparkContext
from pyspark.sql.types import Row, _create_row, _parse_datatype_json_string
from pyspark.sql import DataFrame, SparkSession

@@ -222,7 +226,8 @@ def readImages(self, path, recursive=False, numPartitions=-1,

.. versionadded:: 2.3.0
"""

warnings.warn("`ImageSchema.readImage` is deprecated. " +
"Use `spark.read.format(\"image\").load(path)` instead.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warnings.warn(..., DeprecationWarning)

@mengxr
Copy link
Contributor

mengxr commented Sep 8, 2018

@WeichenXu123 Could you address the comments?

@SparkQA
Copy link

SparkQA commented Sep 8, 2018

Test build #95831 has finished for PR 22349 at commit 576b7d8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Sep 8, 2018
## What changes were proposed in this pull request?

Deprecate public APIs from ImageSchema.

## How was this patch tested?

N/A

Closes #22349 from WeichenXu123/image_api_deprecate.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: Xiangrui Meng <meng@databricks.com>
(cherry picked from commit 08c02e6)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
@mengxr
Copy link
Contributor

mengxr commented Sep 8, 2018

LGTM. Merged into master and branch-2.4. Thanks!

@asfgit asfgit closed this in 08c02e6 Sep 8, 2018
@WeichenXu123 WeichenXu123 deleted the image_api_deprecate branch September 9, 2018 02:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants