Commit

PR feedback.
MrBago committed Jan 19, 2018
1 parent 0cdfc1b commit 6228902
Showing 4 changed files with 9 additions and 10 deletions.
2 changes: 1 addition & 1 deletion docs/ml-features.md
@@ -1285,7 +1285,7 @@ for more details on the API.

## VectorSizeHint

-It can sometimes be useful to explicitly specify the size of the vectors a column of
+It can sometimes be useful to explicitly specify the size of the vectors for a column of
`VectorType`. For example, `VectorAssembler` uses size information from its input columns to
produce size information and metadata for its output column. While in some cases this information
can be obtained by inspecting the contents of the column, in a streaming dataframe the contents are
@@ -17,22 +17,21 @@

package org.apache.spark.examples.ml;

-import org.apache.spark.sql.SparkSession;
-
+// $example on$
+import java.util.Arrays;
+
import org.apache.spark.ml.feature.VectorAssembler;
import org.apache.spark.ml.feature.VectorSizeHint;
import org.apache.spark.ml.linalg.VectorUDT;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
+import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

-import java.util.Arrays;
-
import static org.apache.spark.sql.types.DataTypes.*;
-
-// $example on$
+// $example off$

public class JavaVectorSizeHintExample {
@@ -66,7 +65,7 @@ public static void main(String[] args) {
.setInputCols(new String[]{"hour", "mobile", "userFeatures"})
.setOutputCol("features");

-// This dataframe can be used by used by downstream transformers as before
+// This dataframe can be used by downstream transformers as before
Dataset<Row> output = assembler.transform(datasetWithSize);
System.out.println("Assembled columns 'hour', 'mobile', 'userFeatures' to vector column " +
"'features'");
2 changes: 1 addition & 1 deletion examples/src/main/python/ml/vector_size_hint_example.py
@@ -48,7 +48,7 @@
inputCols=["hour", "mobile", "userFeatures"],
outputCol="features")

-# This dataframe can be used by used by downstream transformers as before
+# This dataframe can be used by downstream transformers as before
output = assembler.transform(datasetWithSize)
print("Assembled columns 'hour', 'mobile', 'userFeatures' to vector column 'features'")
output.select("features", "clicked").show(truncate=False)
@@ -51,7 +51,7 @@ object VectorSizeHintExample {
.setInputCols(Array("hour", "mobile", "userFeatures"))
.setOutputCol("features")

-// This dataframe can be used by used by downstream transformers as before
+// This dataframe can be used by downstream transformers as before
val output = assembler.transform(datasetWithSize)
println("Assembled columns 'hour', 'mobile', 'userFeatures' to vector column 'features'")
output.select("features", "clicked").show(false)