forked from douban/dpark
README
DPark is a Python clone of Spark: a MapReduce-like computing
framework supporting iterative computation.
Word count example wc.py:

    from dpark import DparkContext

    ctx = DparkContext()
    # Read the input file as an RDD of lines.
    file = ctx.textFile("/tmp/words.txt")
    # Split each line into words, then pair each word with a count of 1.
    words = file.flatMap(lambda x: x.split()).map(lambda x: (x, 1))
    # Sum the counts per word and collect the result as a dict.
    wc = words.reduceByKey(lambda x, y: x + y).collectAsMap()
    print(wc)
This script can run locally or on a Mesos cluster without
any modification, just with different command-line arguments:
$ python wc.py                # local, single process
$ python wc.py -m process     # local, multiple processes
$ python wc.py -m mesos       # distributed on a Mesos cluster
See examples/ for more examples.
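To make the word-count pipeline concrete, here is a plain-Python sketch (no DPark required) of what flatMap, map, and reduceByKey compute; the input lines are made up for illustration:

```python
# Sample input standing in for the contents of /tmp/words.txt.
lines = ["the quick brown fox", "the lazy dog"]

# flatMap: split each line into words, flattening into one list.
words = [w for line in lines for w in line.split()]

# map: pair each word with an initial count of 1.
pairs = [(w, 1) for w in words]

# reduceByKey(lambda x, y: x + y): sum the counts per word.
wc = {}
for word, count in pairs:
    wc[word] = wc.get(word, 0) + count

print(wc)  # {'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'lazy': 1, 'dog': 1}
```

DPark performs the same transformation, but partitions the data and runs the steps in parallel across workers.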
Documentation in Chinese: https://github.com/jackfengji/test_pro/wiki
DPark runs on Mesos at revision 1292597 of trunk (svn), or commit 112ea04 of the GitHub mirror.
Mailing list: dpark-users@googlegroups.com (http://groups.google.com/group/dpark-users)