Commit c2656b7

Update III. Getting started with machine learning pipelines.py
1 parent 3a3a771 commit c2656b7

1 file changed: +6 -0 lines changed

Introduction to PySpark/III. Getting started with machine learning pipelines.py

Lines changed: 6 additions & 0 deletions
@@ -139,3 +139,9 @@
 ### Transform the data
 # Fit and transform the data
 piped_data = flights_pipe.fit(model_data).transform(model_data)
+#|
+#|
+### Split the data
+# Split the data into training and test sets
+# training with 60% of the data, and test with 40%
+training, test = piped_data.randomSplit([.6, .4])
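
Note on the added split: `randomSplit` assigns rows at random, so the 60/40 proportions are approximate rather than exact, and the result changes between runs unless the optional `seed` argument is passed. The sketch below is a simplified pure-Python model of that behavior, not Spark's implementation; the helper name `random_split` and the sample `rows` list are hypothetical and exist only for illustration.

```python
import random


def random_split(rows, weights, seed=None):
    """Assign each row to exactly one split using one uniform draw per row.

    Hypothetical helper that mimics the observable behavior of a weighted
    random split; it is not how Spark implements DataFrame.randomSplit.
    """
    total = sum(weights)
    # Normalize the weights so they sum to 1, then build cumulative bounds.
    bounds = []
    acc = 0.0
    for w in weights:
        acc += w / total
        bounds.append(acc)
    bounds[-1] = 1.0  # guard against floating-point drift in the last bound

    rng = random.Random(seed)
    splits = [[] for _ in weights]
    for row in rows:
        u = rng.random()  # uniform draw in [0, 1)
        for i, bound in enumerate(bounds):
            if u < bound:
                splits[i].append(row)
                break
    return splits


# Stand-in for the rows of piped_data (assumption: 1000 rows).
rows = list(range(1000))
training, test = random_split(rows, [.6, .4], seed=42)
print(len(training), len(test))  # sizes are close to 600/400, not exact
```

In the committed code, passing a seed (for example `piped_data.randomSplit([.6, .4], seed=42)`, where 42 is an arbitrary value) would make the split repeatable across runs.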
