Commit 6870f85

Update II Data engineering toolbox.py
1 parent d360b38 commit 6870f85

1 file changed, +11 -1 lines changed

Introduction to Data Engineering/II Data engineering toolbox.py

Lines changed: 11 additions & 1 deletion
@@ -136,4 +136,14 @@ def parallel_apply(apply_func, groups, nb_cores):
 print(athlete_events_spark.groupBy('Year').mean('Age').show())
 
 #---
-#
+#Running PySpark files
+"""spark-submit. This tool can help you submit your application to a spark cluster.
+
+spark-submit \
+--master local[4] \
+/home/repl/spark-script.py
+
+
+1 An error.
+ok 2 A DataFrame with average Olympian heights by year.
+3 A DataFrame with Olympian ages. """
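
Note on the spark-submit call added in this hunk: `--master local[4]` asks Spark to run locally with four worker threads, and the final argument is the PySpark script to execute. The sketch below is only an illustration of assembling that same command from Python; the helper name `build_spark_submit` is hypothetical and is not part of the committed file, and the script path is simply the one shown in the diff.

```python
import shlex


def build_spark_submit(script_path, master="local[4]"):
    """Return the argument list for a spark-submit invocation.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return ["spark-submit", "--master", master, script_path]


# Same arguments as the command in the docstring added by this commit.
argv = build_spark_submit("/home/repl/spark-script.py")

# shlex.join quotes 'local[4]' so a shell cannot treat the brackets as a glob.
command_line = shlex.join(argv)
print(command_line)
```

On a machine where Spark is installed, the list could be passed to `subprocess.run(argv, check=True)`; that step is left out here because it depends on the local environment.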
