No-DFS and compression patches #3

Merged
merged 2 commits into from May 6, 2012

Ssmithcr commented May 6, 2012

  1. Remove reliance on DFS, but still allow the use of a DFS if you have one.
  2. Fast-enough compression of intermediate data files.

Ssmithcr added some commits May 6, 2012

Remove reliance on a distributed file system.
If DPARK_WORK_DIR is set instead of DPARK_SHARE_DIR, a web server will
be started on each slave to serve files to the other slaves.
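
For illustration, the choice between the two modes could be sketched as below. This is a minimal sketch, not the code from the patch: the function name `shuffle_location`, the URI layout, and the precedence when both variables are set are all assumptions; only the two environment variable names come from the commit message.

```python
import posixpath


def shuffle_location(env, host, port, rel_path):
    """Return (local_path, uri) for one shuffle output file.

    Hypothetical helper: `env` is a mapping such as os.environ.
    """
    work_dir = env.get("DPARK_WORK_DIR")
    share_dir = env.get("DPARK_SHARE_DIR")
    if work_dir:
        # No DFS: the file stays on this slave's local disk and the
        # slave's own web server exposes it to the other slaves.
        # (Assumption: DPARK_WORK_DIR wins if both variables are set.)
        local_path = posixpath.join(work_dir, rel_path)
        return local_path, "http://%s:%d/%s" % (host, port, rel_path)
    if share_dir:
        # DFS available: every slave sees the same path, so readers
        # can open the file directly.
        local_path = posixpath.join(share_dir, rel_path)
        return local_path, "file://" + local_path
    raise ValueError("set DPARK_WORK_DIR or DPARK_SHARE_DIR")
```

A reducer would then either open the `file://` path directly or fetch the `http://` URI from the slave that wrote the file.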
Use compression for the intermediate shuffle files.
I tried Google's Snappy compression, but it still left my nodes I/O
bound; I found zlib level 1 to be the best compromise between CPU and
I/O.  Compression in this case takes around the same amount of time
as marshalling; decompression, however, takes much less time than
unmarshalling.
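
As a rough illustration of the approach described in this commit message, the write and read paths might look like the sketch below. The function names `encode_bucket` and `decode_bucket` are hypothetical and are not taken from the patch; `marshal` and `zlib` are the standard-library modules, and level 1 is the setting the commit message reports.

```python
import marshal
import zlib


def encode_bucket(items, level=1):
    """Marshal a shuffle bucket, then compress it with zlib.

    Level 1 is zlib's fastest setting; the commit message reports it as
    the best CPU vs. I/O compromise on the author's cluster.
    """
    return zlib.compress(marshal.dumps(items), level)


def decode_bucket(blob):
    """Reverse of encode_bucket: decompress, then unmarshal."""
    return marshal.loads(zlib.decompress(blob))
```

Whether level 1 is the right trade-off depends on the hardware; on nodes with faster disks or slower CPUs a different level, or no compression, may do better.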

open, url = file, xxxx for readability

This patch should be updated to work with compression.

Ssmithcr replied May 6, 2012

The compression patch changes this code; I was trying to keep them independent (the nodfs patch applies first, then the compression patch).

davies commented on 2c190a4 May 6, 2012

Good, but it needs more work to handle fetch failures.

Ssmithcr replied May 6, 2012

Without reimplementing a DFS, there isn't a good way to deal with a slave failure -- recreating intermediate files from the shuffle jobs would basically require starting over.

If you're talking about a transient failure, though, that should be handled properly. If fetching fails, it will raise an exception. This will cause the task to fail. The scheduler will then reschedule the task.
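
The transient-failure path described above can be sketched as follows. This is a simplified stand-in rather than dpark's scheduler: `FetchFailed`, `run_with_retries`, and `make_flaky_fetch` are hypothetical names, and the retry limit of 3 is an assumption.

```python
class FetchFailed(Exception):
    """Raised when a shuffle file cannot be fetched from another slave."""


def run_with_retries(task, max_attempts=3):
    """Run `task`, rescheduling it when a fetch fails.

    The exception makes the task fail; the loop plays the role of the
    scheduler by running the task again until attempts are exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except FetchFailed:
            if attempt == max_attempts:
                # Persistent failure (e.g. the slave is gone): give up.
                raise


def make_flaky_fetch(failures, payload):
    """Build a task that fails `failures` times, then returns `payload`."""
    state = {"left": failures}

    def fetch():
        if state["left"] > 0:
            state["left"] -= 1
            raise FetchFailed("transient fetch error")
        return payload

    return fetch
```

A task that fails twice and then succeeds is recovered by the retries; a task whose source slave never comes back exhausts them, which is the lost-slave case this approach does not cover.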

davies added a commit that referenced this pull request May 6, 2012

Merge pull request #3 from Ssmithcr/master
No-DFS and compression patches.

TODO: handle fetching failure when slave lost.

@davies davies merged commit aa3edd5 into douban:master May 6, 2012

davies pushed a commit that referenced this pull request Jul 17, 2013

windreamer added a commit to windreamer/dpark that referenced this pull request May 25, 2016
