Add DataFrameHelper disclaimers as these methods use collect
MrPowers committed Sep 2, 2017
1 parent de0770f commit 4402979
Showing 1 changed file with 14 additions and 10 deletions.
24 changes: 14 additions & 10 deletions README.md
@@ -367,7 +367,7 @@ Here are the contents of `actualDF`:

### `isNullOrBlank`

-The `isNullOrBlank` method retuns `true` if a column is `null` or `blank` and `false` otherwise.
+The `isNullOrBlank` method returns `true` if a column is `null` or `blank` and `false` otherwise.

Suppose you start with the following `sourceDF`:

@@ -492,7 +492,7 @@ Suppose you have the following `sourceDF`:

`sourceDF.printSchemaInCodeFormat()` will output the following rows in the console:

-```
+```scala
StructType(
List(
StructField("team", StringType, true),
@@ -506,6 +506,8 @@ StructType(

### `twoColumnsToMap`

+* N.B. This method uses `collect` and should only be called on small DataFrames.*
+
Converts two columns in a DataFrame to a Map.

Suppose we have the following `sourceDF`:
@@ -521,25 +523,25 @@ Suppose we have the following `sourceDF`:

Let's convert this DataFrame to a Map with `island` as the key and `fun_level` as the value.

-```
+```scala
val actual = DataFrameHelpers.twoColumnsToMap[String, Integer](
sourceDF,
"island",
"fun_level"
)

println(actual)
-```

-```
-Map(
-"boracay" -> 7,
-"long island" -> 9
-)
+// Map(
+// "boracay" -> 7,
+// "long island" -> 9
+// )
```

### `columnToArray`

+* N.B. This method uses `collect` and should only be called on small DataFrames.*
+
This function converts a column to an array of items.

Suppose we have the following `sourceDF`:
@@ -561,11 +563,13 @@ val actual = DataFrameHelpers.columnToArray[Int](sourceDF, "num")

println(actual)

-Array(1, 2, 3)
+// Array(1, 2, 3)
```

### `toArrayOfMaps`

+* N.B. This method uses `collect` and should only be called on small DataFrames.*
+
Converts a DataFrame to an array of Maps.

Suppose we have the following `sourceDF`:

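For context on why the disclaimers matter: `collect` copies every row of a DataFrame into the driver's memory, so any helper built on it is bounded by the driver's heap. The sketch below shows roughly how a helper like `twoColumnsToMap` could be built on `collect`. It is an illustration only, not the library's actual source; `CollectSketch` and `twoColumnsToMapSketch` are hypothetical names, and it assumes Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CollectSketch {

  // Hypothetical stand-in for DataFrameHelpers.twoColumnsToMap;
  // the real implementation may differ.
  def twoColumnsToMapSketch[K, V](
      df: DataFrame,
      keyCol: String,
      valueCol: String
  ): Map[K, V] =
    df.select(keyCol, valueCol)
      .collect() // pulls ALL rows to the driver, hence "small DataFrames only"
      .iterator
      .map(row => row.getAs[K](0) -> row.getAs[V](1))
      .toMap

  def main(args: Array[String]): Unit = {
    // Local session for demonstration purposes only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("collect-sketch")
      .getOrCreate()
    import spark.implicits._

    val sourceDF = Seq(("boracay", 7), ("long island", 9))
      .toDF("island", "fun_level")

    println(twoColumnsToMapSketch[String, Int](sourceDF, "island", "fun_level"))

    spark.stop()
  }
}
```

On a large DataFrame the `collect()` call is the step that can exhaust driver memory, which is the risk the added N.B. lines warn about.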