From 8af1bf8f0d1a59aea35633780ca439f6c459bb78 Mon Sep 17 00:00:00 2001
From: WeichenXu
Date: Mon, 6 Jun 2016 07:23:10 -0700
Subject: [PATCH 1/2] fix typo in documents

---
 docs/graphx-programming-guide.md    | 2 +-
 docs/hardware-provisioning.md       | 2 +-
 docs/streaming-programming-guide.md | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/graphx-programming-guide.md b/docs/graphx-programming-guide.md
index 9dea9b5904d2d..81cf17475fb60 100644
--- a/docs/graphx-programming-guide.md
+++ b/docs/graphx-programming-guide.md
@@ -132,7 +132,7 @@ var graph: Graph[VertexProperty, String] = null
 
 Like RDDs, property graphs are immutable, distributed, and fault-tolerant. Changes to the values or
 structure of the graph are accomplished by producing a new graph with the desired changes. Note
-that substantial parts of the original graph (i.e., unaffected structure, attributes, and indicies)
+that substantial parts of the original graph (i.e., unaffected structure, attributes, and indices)
 are reused in the new graph reducing the cost of this inherently functional data structure. The
 graph is partitioned across the executors using a range of vertex partitioning heuristics. As with
 RDDs, each partition of the graph can be recreated on a different machine in the event of a failure.
diff --git a/docs/hardware-provisioning.md b/docs/hardware-provisioning.md
index 60ecb4f483afa..bb6f616b18a24 100644
--- a/docs/hardware-provisioning.md
+++ b/docs/hardware-provisioning.md
@@ -22,7 +22,7 @@ Hadoop and Spark on a common cluster manager like [Mesos](running-on-mesos.html)
 
 * If this is not possible, run Spark on different nodes in the same local-area network as HDFS.
 
-* For low-latency data stores like HBase, it may be preferrable to run computing jobs on different
+* For low-latency data stores like HBase, it may be preferable to run computing jobs on different
 nodes than the storage system to avoid interference.
 
 # Local Disks
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index 78ae6a7407467..efcda7ff9afd2 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -1259,7 +1259,7 @@ dstream.foreachRDD(sendRecord)
 
 
 This is incorrect as this requires the connection object to be serialized and sent from the
-driver to the worker. Such connection objects are rarely transferrable across machines. This
+driver to the worker. Such connection objects are rarely transferable across machines. This
 error may manifest as serialization errors (connection object not serializable), initialization
 errors (connection object needs to be initialized at the workers), etc. The correct solution is
 to create the connection object at the worker.
@@ -2037,7 +2037,7 @@ and configuring them to receive different partitions of the data stream from the
 For example, a single Kafka input DStream receiving two topics of data can be split into two Kafka
 input streams, each receiving only one topic. This would run two receivers, allowing data to be
 received in parallel, thus increasing overall throughput. These multiple
-DStreams can be unioned together to create a single DStream. Then the transformations that were
+DStreams can be united together to create a single DStream. Then the transformations that were
 being applied on a single input DStream can be applied on the unified stream. This is done as
 follows.

From 0e359a396aa3bd708157f1e5a235fd0bb78f18bb Mon Sep 17 00:00:00 2001
From: WeichenXu
Date: Mon, 6 Jun 2016 17:05:36 -0700
Subject: [PATCH 2/2] recover 'unioned' not typo

---
 docs/streaming-programming-guide.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index efcda7ff9afd2..0a6a0397d9570 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -2037,7 +2037,7 @@ and configuring them to receive different partitions of the data stream from the
 For example, a single Kafka input DStream receiving two topics of data can be split into two Kafka
 input streams, each receiving only one topic. This would run two receivers, allowing data to be
 received in parallel, thus increasing overall throughput. These multiple
-DStreams can be united together to create a single DStream. Then the transformations that were
+DStreams can be unioned together to create a single DStream. Then the transformations that were
 being applied on a single input DStream can be applied on the unified stream. This is done as
 follows.