Commit

fixtypos
GuoPhilipse committed Apr 22, 2022
1 parent 6f29c61 commit dc16cb4
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions hadoop-tools/hadoop-distcp/src/site/markdown/DistCp.md.vm
@@ -49,7 +49,7 @@ Overview

[The erstwhile implementation of DistCp]
(http://hadoop.apache.org/docs/r1.2.1/distcp.html) has its share of quirks
-and drawbacks, both in its usage, as well as its extensibility and
+and drawbacks, both in its usage and its extensibility and
performance. The purpose of the DistCp refactor was to fix these
shortcomings, enabling it to be used and extended programmatically. New
paradigms have been introduced to improve runtime and setup performance,
@@ -179,7 +179,7 @@ $H3 Update and Overwrite
hdfs://nn2:8020/target/10 32
hdfs://nn2:8020/target/20 64

-Will effect:
+The result will be:

hdfs://nn2:8020/target/1 32
hdfs://nn2:8020/target/2 32
@@ -190,7 +190,7 @@ $H3 Update and Overwrite
because it doesn't exist at the target. `10` and `20` are overwritten since
the contents don't match the source.

-If `-update` is used, `1` is skipped because the file-length and contents match. `2` is copied because it doesnt exist at the target. `10` and `20` are overwritten since the contents don’t match the source. However, if `-append` is additionally used, then only `10` is overwritten (source length less than destination) and `20` is appended with the change in file (if the files match up to the destination's original length).
+If `-update` is used, `1` is skipped because the file-length and contents match. `2` is copied because it doesn't exist at the target. `10` and `20` are overwritten since the contents don’t match the source. However, if `-append` is additionally used, then only `10` is overwritten (source length less than destination) and `20` is appended with the change in file (if the files match up to the destination's original length).

If `-overwrite` is used, `1` is overwritten as well.

@@ -269,7 +269,7 @@ $H4 Experiment 1: Syncing diff of two adjacent snapshots

$H4 Experiment 2: syncing diff of two non-adjacent snapshots

-First do a clean up from Experiment 1.
+First do a cleanup from Experiment 1.

hdfs dfs -rm -skipTrash /dst/1.txt

@@ -514,7 +514,7 @@ $H3 InputFormats and MapReduce Components
* A file with the same name exists at target, but `-overwrite` is
specified.
* A file with the same name exists at target, but differs in block-size
-(and block-size needs to be preserved.
+and block-size needs to be preserved.

* **CopyCommitter:** This class is responsible for the commit-phase of the
DistCp job, including:
@@ -576,7 +576,7 @@ $H3 MapReduce and other side-effects
map on a re-execution will be marked as "skipped".
* If a map fails `mapreduce.map.maxattempts` times, the remaining map tasks
will be killed (unless `-i` is set).
-* If `mapreduce.map.speculative` is set set final and true, the result of the
+* If `mapreduce.map.speculative` is set to be true, the result of the
copy is undefined.

$H3 DistCp and Object Stores
@@ -691,7 +691,7 @@ Frequently Asked Questions
directory is copied over, rather than the source-directory itself. This
behaviour is consistent with the legacy DistCp implementation as well.

-2. **How does the new DistCp differ in semantics from the Legacy DistCp?**
+2. **How does the new DistCp differs in semantics from the Legacy DistCp?**

* Files that are skipped during copy used to also have their
file-attributes (permissions, owner/group info, etc.) unchanged, when