[SPARK-13888][DOC] Remove Akka Receiver doc and refer to the DStream Akka project

## What changes were proposed in this pull request?

I have copied the Streaming Akka docs to https://github.com/spark-packages/dstream-akka/blob/master/README.md so they can now be removed from Spark.

## How was this patch tested?

Documentation changes only.


Author: Shixiong Zhu <shixiong@databricks.com>

Closes #11711 from zsxwing/remove-akka-doc.
zsxwing authored and rxin committed Mar 15, 2016
1 parent e649580 commit 43304b1
Showing 2 changed files with 7 additions and 78 deletions.
61 changes: 0 additions & 61 deletions docs/streaming-custom-receivers.md
@@ -256,64 +256,3 @@ The following table summarizes the characteristics of both types of receivers
<td></td>
</tr>
</table>

## Implementing and Using a Custom Actor-based Receiver

Custom [Akka Actors](http://doc.akka.io/docs/akka/2.3.11/scala/actors.html) can also be used to
receive data. Here are the instructions.

1. **Linking:** You need to add the following dependency to your SBT or Maven project (see [Linking section](streaming-programming-guide.html#linking) in the main programming guide for further information).

groupId = org.apache.spark
artifactId = spark-streaming-akka_{{site.SCALA_BINARY_VERSION}}
version = {{site.SPARK_VERSION_SHORT}}
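
    For example, with SBT the same coordinates might be declared as follows (a minimal sketch; `sparkVersion` is a hypothetical stand-in for the concrete Spark version you build against, and `%%` appends the Scala binary version to the artifact name):

        // Hypothetical version value; substitute the Spark release you actually use.
        val sparkVersion = "2.0.0"
        libraryDependencies += "org.apache.spark" %% "spark-streaming-akka" % sparkVersion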

2. **Programming:**

<div class="codetabs">
<div data-lang="scala" markdown="1" >

You need to extend [`ActorReceiver`](api/scala/index.html#org.apache.spark.streaming.akka.ActorReceiver)
so as to store received data into Spark using `store(...)` methods. The supervisor strategy of
this actor can be configured to handle failures, etc.

    import akka.actor.Props
    import org.apache.spark.streaming.StreamingContext
    import org.apache.spark.streaming.akka.{ActorReceiver, AkkaUtils}

    class CustomActor extends ActorReceiver {
      def receive = {
        case data: String => store(data)
      }
    }

    // A new input stream can be created with this custom actor as
    val ssc: StreamingContext = ...
    val lines = AkkaUtils.createStream[String](ssc, Props[CustomActor](), "CustomReceiver")
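
The resulting DStream supports the usual operations. For instance, a minimal word-count sketch over the `lines` stream created above (standard DStream API, shown here only for illustration):

    // Split incoming messages into words, count them per batch, and print.
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(word => (word, 1)).reduceByKey(_ + _)
    wordCounts.print()

    ssc.start()             // start receiving and processing
    ssc.awaitTermination()  // block until the job is stopped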

See [ActorWordCount.scala](https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/ActorWordCount.scala) for an end-to-end example.
</div>
<div data-lang="java" markdown="1">

You need to extend [`JavaActorReceiver`](api/scala/index.html#org.apache.spark.streaming.akka.JavaActorReceiver)
so as to store received data into Spark using `store(...)` methods. The supervisor strategy of
this actor can be configured to handle failures, etc.

    import akka.actor.Props;
    import org.apache.spark.streaming.akka.AkkaUtils;
    import org.apache.spark.streaming.akka.JavaActorReceiver;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    class CustomActor extends JavaActorReceiver {
      @Override
      public void onReceive(Object msg) throws Exception {
        store((String) msg);
      }
    }

    // A new input stream can be created with this custom actor as
    JavaStreamingContext jssc = ...;
    JavaDStream<String> lines = AkkaUtils.<String>createStream(jssc, Props.create(CustomActor.class), "CustomReceiver");

See [JavaActorWordCount.java](https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaActorWordCount.java) for an end-to-end example.
</div>
</div>

3. **Deploying:** As with any Spark application, `spark-submit` is used to launch your application.
You need to package `spark-streaming-akka_{{site.SCALA_BINARY_VERSION}}` and its dependencies into
the application JAR. Make sure `spark-core_{{site.SCALA_BINARY_VERSION}}` and `spark-streaming_{{site.SCALA_BINARY_VERSION}}`
are marked as `provided` dependencies as those are already present in a Spark installation. Then
use `spark-submit` to launch your application (see [Deploying section](streaming-programming-guide.html#deploying-applications) in the main programming guide).
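
For example, the SBT scoping might look like the following (a sketch using the same hypothetical `sparkVersion` value as in the linking sketch above; only the Akka connector is bundled into the application JAR):

    libraryDependencies ++= Seq(
      // Provided by the Spark installation at runtime, so not bundled:
      "org.apache.spark" %% "spark-core"           % sparkVersion % "provided",
      "org.apache.spark" %% "spark-streaming"      % sparkVersion % "provided",
      // Must ship with the application JAR:
      "org.apache.spark" %% "spark-streaming-akka" % sparkVersion
    )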

<span class="badge" style="background-color: grey">Python API</span> Since actors are available only in the Java and Scala libraries, AkkaUtils is not available in the Python API.
24 changes: 7 additions & 17 deletions docs/streaming-programming-guide.md
@@ -594,7 +594,7 @@ data from a source and stores it in Spark's memory for processing.
Spark Streaming provides two categories of built-in streaming sources.

- *Basic sources*: Sources directly available in the StreamingContext API.
Examples: file systems, socket connections, and Akka actors.
Examples: file systems and socket connections.
- *Advanced sources*: Sources like Kafka, Flume, Kinesis, Twitter, etc. are available through
extra utility classes. These require linking against extra dependencies as discussed in the
[linking](#linking) section.
@@ -631,7 +631,7 @@ as well as to run the receiver(s).
We have already taken a look at the `ssc.socketTextStream(...)` in the [quick example](#a-quick-example)
which creates a DStream from text
data received over a TCP socket connection. Besides sockets, the StreamingContext API provides
methods for creating DStreams from files and Akka actors as input sources.
methods for creating DStreams from files as input sources.

- **File Streams:** For reading data from files on any file system compatible with the HDFS API (that is, HDFS, S3, NFS, etc.), a DStream can be created as:
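
    For the common case of plain text files, a minimal sketch (the directory path is a placeholder, and `ssc` is an existing `StreamingContext`):

        // Monitor a directory for newly created text files and read them line by line.
        val fileLines = ssc.textFileStream("hdfs://namenode:8020/logs/")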

@@ -658,17 +658,12 @@

<span class="badge" style="background-color: grey">Python API</span> `fileStream` is not available in the Python API, only `textFileStream` is available.

- **Streams based on Custom Actors:** DStreams can be created with data streams received through Akka
actors by using `AkkaUtils.createStream(ssc, actorProps, actor-name)`. See the [Custom Receiver
Guide](streaming-custom-receivers.html) for more details.

<span class="badge" style="background-color: grey">Python API</span> Since actors are available only in the Java and Scala
libraries, `AkkaUtils.createStream` is not available in the Python API.
- **Streams based on Custom Receivers:** DStreams can be created with data streams received through custom receivers. See the [Custom Receiver
Guide](streaming-custom-receivers.html) and [DStream Akka](https://github.com/spark-packages/dstream-akka) for more details.

- **Queue of RDDs as a Stream:** For testing a Spark Streaming application with test data, one can also create a DStream based on a queue of RDDs, using `streamingContext.queueStream(queueOfRDDs)`. Each RDD pushed into the queue will be treated as a batch of data in the DStream, and processed like a stream.
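
For example, a minimal sketch (assuming an existing `StreamingContext` named `ssc`):

    import scala.collection.mutable
    import org.apache.spark.rdd.RDD

    // Each RDD pushed into the queue is consumed as one batch of the DStream.
    val rddQueue = mutable.Queue[RDD[Int]]()
    val inputStream = ssc.queueStream(rddQueue)
    inputStream.map(x => (x % 10, 1)).reduceByKey(_ + _).print()

    ssc.start()
    rddQueue += ssc.sparkContext.makeRDD(1 to 1000, 10)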

For more details on streams from sockets, files, and actors,
see the API documentations of the relevant functions in
For more details on streams from sockets and files, see the API documentation of the relevant functions in
[StreamingContext](api/scala/index.html#org.apache.spark.streaming.StreamingContext) for
Scala, [JavaStreamingContext](api/java/index.html?org/apache/spark/streaming/api/java/JavaStreamingContext.html)
for Java, and [StreamingContext](api/python/pyspark.streaming.html#pyspark.streaming.StreamingContext) for Python.
@@ -2439,13 +2434,8 @@ that can be called to store the data in Spark. So, to migrate your custom networ
BlockGenerator object (does not exist any more in Spark 1.0 anyway), and use `store(...)` methods on
received data.

**Actor-based Receivers**: Data could have been received using any Akka Actors by extending the actor class with the
`org.apache.spark.streaming.receivers.Receiver` trait. This has been renamed to
[`org.apache.spark.streaming.receiver.ActorHelper`](api/scala/index.html#org.apache.spark.streaming.receiver.ActorHelper),
and the `pushBlock(...)` methods to store received data have been renamed to `store(...)`. Other helper classes in
the `org.apache.spark.streaming.receivers` package were also moved
to the [`org.apache.spark.streaming.receiver`](api/scala/index.html#org.apache.spark.streaming.receiver.package)
package and renamed for better clarity.
**Actor-based Receivers**: The Actor-based Receiver APIs have been moved to [DStream Akka](https://github.com/spark-packages/dstream-akka).
Please refer to the project for more details.

***************************************************************************************************
***************************************************************************************************
