[SPARK-7667] [MLLIB] MLlib Python API consistency check
MLlib Python API consistency check

Author: Yanbo Liang <ybliang8@gmail.com>

Closes apache#6856 from yanboliang/spark-7667 and squashes the following commits:

21bae35 [Yanbo Liang] remove duplicate code
eb12f95 [Yanbo Liang] fix doc inherit problem
9e7ec3c [Yanbo Liang] address comments
e763d32 [Yanbo Liang] MLlib Python API consistency check
yanboliang authored and jkbradley committed Jun 30, 2015
1 parent 4915e9e commit f9b6bf2
Showing 1 changed file with 10 additions and 5 deletions.
15 changes: 10 additions & 5 deletions python/pyspark/mllib/feature.py
@@ -111,6 +111,15 @@ class JavaVectorTransformer(JavaModelWrapper, VectorTransformer):
     """
 
     def transform(self, vector):
+        """
+        Applies transformation on a vector or an RDD[Vector].
+
+        Note: In Python, transform cannot currently be used within
+              an RDD transformation or action.
+              Call transform directly on the RDD instead.
+
+        :param vector: Vector or RDD of Vector to be transformed.
+        """
         if isinstance(vector, RDD):
             vector = vector.map(_convert_to_vector)
         else:
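
The docstring added in this hunk describes a method that accepts either a single vector or an RDD of vectors. A minimal, Spark-free sketch of that dispatch pattern might look like the following. `FakeRDD`, `to_vector`, and `SketchTransformer` are hypothetical stand-ins, not pyspark classes, and the doubling step only stands in for the JVM-side call.

```python
class FakeRDD:
    """Hypothetical stand-in for pyspark.RDD: an eager wrapper around a list."""

    def __init__(self, items):
        self.items = list(items)

    def map(self, fn):
        return FakeRDD(fn(x) for x in self.items)

    def collect(self):
        return list(self.items)


def to_vector(x):
    """Hypothetical stand-in for _convert_to_vector: coerce to a tuple of floats."""
    return tuple(float(v) for v in x)


class SketchTransformer:
    """Mirrors the branch shown in the hunk: RDD input vs. a single vector."""

    def transform(self, vector):
        if isinstance(vector, FakeRDD):
            # Distributed case: convert every element, then transform the whole RDD.
            vector = vector.map(to_vector)
            return vector.map(self._scale)
        # Local case: convert and transform one vector.
        return self._scale(to_vector(vector))

    @staticmethod
    def _scale(v):
        # Placeholder for the real JVM-side transformation (assumption).
        return tuple(2.0 * c for c in v)


model = SketchTransformer()
single = model.transform([1, 2])
many = model.transform(FakeRDD([[1, 2], [3, 4]])).collect()
```

In line with the note in the docstring, the RDD itself is passed to `transform`; the model is not called from inside a function handed to `map`.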
@@ -191,7 +200,7 @@ def fit(self, dataset):
         Computes the mean and variance and stores as a model to be used
         for later scaling.
 
-        :param data: The data used to compute the mean and variance
+        :param dataset: The data used to compute the mean and variance
                      to build the transformation model.
         :return: a StandardScalarModel
         """
@@ -346,10 +355,6 @@ def transform(self, x):
                   vector
         :return: an RDD of TF-IDF vectors or a TF-IDF vector
         """
-        if isinstance(x, RDD):
-            return JavaVectorTransformer.transform(self, x)
-
-        x = _convert_to_vector(x)
         return JavaVectorTransformer.transform(self, x)
 
     def idf(self):
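
The last hunk removes an override body that branched on the input type yet called `JavaVectorTransformer.transform` on both paths. Because the parent method already converts its input, the extra conversion appears to have been redundant. The sketch below illustrates that reasoning with hypothetical stand-in classes (`Batch`, `Base`, `WithBranch`, `WithoutBranch`); none of these names come from pyspark, and the `+ 1.0` step is only a placeholder transformation.

```python
class Batch:
    """Hypothetical stand-in for an RDD of vectors."""

    def __init__(self, rows):
        self.rows = list(rows)


def to_vector(x):
    """Hypothetical stand-in for _convert_to_vector."""
    return tuple(float(v) for v in x)


class Base:
    """Plays the role of the parent transformer, which converts its own input."""

    def transform(self, x):
        if isinstance(x, Batch):
            return [self._apply(to_vector(r)) for r in x.rows]
        return self._apply(to_vector(x))

    @staticmethod
    def _apply(v):
        # Placeholder transformation (assumption).
        return tuple(c + 1.0 for c in v)


class WithBranch(Base):
    def transform(self, x):
        # Shape of the removed code: both paths end in the same parent call.
        if isinstance(x, Batch):
            return Base.transform(self, x)
        x = to_vector(x)
        return Base.transform(self, x)


class WithoutBranch(Base):
    def transform(self, x):
        # Shape of the code after the commit: delegate directly.
        return Base.transform(self, x)


inputs = [[1, 2], Batch([[1, 2], [3, 4]])]
same = all(
    WithBranch().transform(i) == WithoutBranch().transform(i) for i in inputs
)
```

Under these assumptions `same` is true for both a single vector and a batch, which is the behavior-preserving property the commit message's "remove duplicate code" entry relies on.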