From c60122d854f05b2ccdad07612ecde9a024cc61b9 Mon Sep 17 00:00:00 2001 From: Andrew Feldman Date: Mon, 30 Mar 2020 03:47:37 -0700 Subject: [PATCH] First run at Reactor RxJava guide --- migration-guide.md | 8 +-- reactor-pattern-guide.md | 44 +++++++-------- reactor-rxjava-guide.md | 119 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 145 insertions(+), 26 deletions(-) create mode 100644 reactor-rxjava-guide.md diff --git a/migration-guide.md b/migration-guide.md index f3f1622..39e0143 100644 --- a/migration-guide.md +++ b/migration-guide.md @@ -21,7 +21,7 @@ The purpose of this guide is to help easily upgrade to Azure Cosmos DB Java SDK If you have been using a pre-3.x.x Java SDK, it is recommended to review our [Reactor pattern guide](reactor-pattern-guide.md) for an introduction to async programming and Reactor. -Users of the Async Java SDK 2.x.x will want to review our [Reactor vs RxJava Guide]() for additional guidance on converting RxJava code to use Reactor. +Users of the Async Java SDK 2.x.x will want to review our [Reactor vs RxJava Guide](reactor-rxjava-guide.md) for additional guidance on converting RxJava code to use Reactor. ### Java SDK 4.0 implements **Direct Mode** in Async and Sync APIs @@ -44,7 +44,7 @@ If you are user of the "Legacy" Sync Java SDK 2.x.x note that a **Direct** **Con Java SDK 4.0 and Java SDK 3.x.x introduce a hierarchical API which organizes clients, databases and containers in a nested fashion, as shown in this Java SDK 4.0 code snippet: -```java +```java CosmosContainer = client.getDatabase("MyDatabaseName").getContainer("MyContainerName"); ``` @@ -71,9 +71,9 @@ In Java SDK 3.x.x ```CosmosItemProperties``` was exposed by the public API and s * ```PartitionKey``` * ```IndexingPolicy``` * ```IndexingMode``` - * ...etc. + * ...etc. -### Accessors +### Accessors Java SDK 4.0 exposes ```get``` and ```set``` methods for accessing instance members.
* Example: a ```CosmosContainer``` instance has ```container.getId()``` and ```container.setId()``` methods. diff --git a/reactor-pattern-guide.md b/reactor-pattern-guide.md index acaa89d..356c1fd 100644 --- a/reactor-pattern-guide.md +++ b/reactor-pattern-guide.md @@ -26,7 +26,7 @@ How this differs from imperative programming, is that the coder is describing th ### 2. Reactive Streams Frameworks for Java/JVM -Reactive Streams frameworks implement the Reactive Streams Standard for specific programming languages. [RxJava](https://github.com/ReactiveX/RxJava) ([ReactiveX](http://reactivex.io/) for JVM) was the basis of past Azure Java SDKs, but will not be going forward. +A Reactive Streams framework implements the Reactive Streams Standard for specific programming languages. The [RxJava](https://github.com/ReactiveX/RxJava) ([ReactiveX](http://reactivex.io/) for JVM) framework was the basis of past Azure Java SDKs, but will not be going forward. [Project Reactor](https://projectreactor.io/) or just *Reactor* is the Reactive Programming framework being used for new Azure Java SDKs. The purpose of the rest of this document is to help you get started with Reactor. @@ -44,7 +44,7 @@ To write a program using Reactor, you will need to describe one or more async op Reactor follows a "hybrid push-pull model": the ```Publisher``` pushes events and data into the pipeline as they are available, but ***only*** once you request events and data from the ```Publisher``` by **subscribing**. -To put this in context, consider a "normal" non-Reactor program you might write that takes takes a dependency on some other code with unpredictable response time. For example, maybe you write a function to perform a calculation, and one input comes from calling a function that requests data over HTTP. You might deal with this by implementing a control flow which first calls the dependency code, waits for it to return output, and then provides that output to your code as input. 
So your code is “pulling” output from its dependency on an on-demand basis. This can be inefficient if there is latency in the dependency (as is the case for the aforementioned HTTP request example); your code has to loop waiting for the dependency. +To put this in context, consider a "normal" non-Reactor program you might write that takes a dependency on some other code with unpredictable response time. For example, maybe you write a function to perform a calculation, and one input comes from calling a function that requests data over HTTP. You might deal with this by implementing a control flow which first calls the dependency code, waits for it to return output, and then provides that output to your code as input. So your code is "pulling" output from its dependency on an on-demand basis. This can be inefficient if there is latency in the dependency (as is the case for the aforementioned HTTP request example); your code has to loop waiting for the dependency. In a "push" model the dependency signals your code to consume the HTTP request response on an "on-availability" basis; the rest of the time, your code lies dormant, freeing up CPU cycles. This is an event-driven and async approach. But in order for the dependency to signal your code, ***the dependency has to know that your code depends on it*** – and that is the purpose of defining async operation pipelines in Reactor; each pipeline stage is really a piece of async code servicing events and data from the previous stage on an on-availability basis. By defining the pipeline, you tell each stage where to forward events and data to. @@ -54,15 +54,15 @@ Now I will illustrate this with Reactor code examples. Consider a Reminders app.
```java Flux reminderPipeline = ReminderAsyncService.getRemindersPublisher() // Pipeline Stage 1 - .flatMap(reminder -> “Don’t forget: ” + reminder) // Stage 2 - .flatMap(strIn -> LocalDateTime.now().toString() + “: ”+ strIn); // Stage 3 + .flatMap(reminder -> "Don't forget: " + reminder) // Stage 2 + .flatMap(strIn -> LocalDateTime.now().toString() + ": "+ strIn); // Stage 3 ``` **Subscribe phase (execute pipeline on incoming events)** ```java reminderPipeline.subscribe(System.out::println); // Async – returns immediately, pipeline executes in the background -while (true) doOtherThings(); // We’re freed up to do other tasks 😊 +while (true) doOtherThings(); // We're freed up to do other tasks 😊 ``` The ```Flux``` class internally represents an async operation pipeline as a DAG and provides instance methods for operating on the pipeline. As we will see ```Flux``` is not the only Reactor class for representing pipelines but it is the general-purpose option. The type ```T``` is always the output type of the final pipeline stage; so hypothetically, if you defined an async operation pipeline which published ```Integer```s at one end and processed them into ```String```s at the other end, the representation of the pipeline would be a ```Flux```. @@ -71,15 +71,15 @@ In the **Assembly phase** shown above, you describe program logic as an async op * **Stage 1**: ```ReminderAsyncService.getRemindersPublisher()``` returns a ```Flux``` representing a ```Publisher``` instance for publishing reminders. -* **Stage 2**: ```.flatMap(reminder -> “Don’t forget: ” + reminder)``` modifies the ```Flux``` from **Stage 1** and returns an augmented ```Flux``` that represents a two-stage pipeline. The pipeline consists of +* **Stage 2**: ```.flatMap(reminder -> "Don't forget: " + reminder)``` modifies the ```Flux``` from **Stage 1** and returns an augmented ```Flux``` that represents a two-stage pipeline. 
The pipeline consists of * the ```RemindersPublisher```, followed by - * the ```reminder -> “Don’t forget: ” + reminder``` operation which prepends "Don't forget: " to the ```reminder``` string (```reminder``` is a variable that can have any name and represents the previous stage output.) - -* **Stage 3**: ```.flatMap(strIn -> LocalDateTime.now().toString() + “: ”+ strIn)``` modifies the ```Flux``` from **Stage 2** and returns a further-augmented ```Flux``` that represents a three-stage pipeline. The pipeline consists of + * the ```reminder -> "Don't forget: " + reminder``` operation which prepends "Don't forget: " to the ```reminder``` string (```reminder``` is a variable that can have any name and represents the previous stage output.) + +* **Stage 3**: ```.flatMap(strIn -> LocalDateTime.now().toString() + ": "+ strIn)``` modifies the ```Flux``` from **Stage 2** and returns a further-augmented ```Flux``` that represents a three-stage pipeline. The pipeline consists of * the ```RemindersPublisher```, * the **Stage 2** operation, and finally - * the ```strIn -> LocalDateTime.now().toString() + “: ”+ strIn``` operation, which timestamps the **Stage 2** output string. - + * the ```strIn -> LocalDateTime.now().toString() + ": "+ strIn``` operation, which timestamps the **Stage 2** output string. + Although we "ran" the Assembly phase code, all it did was build up the structure of your program, not run it. In the **Subscribe phase** you execute the pipeline that you defined in the Assembly phase. Here is how that works. You call ```java @@ -94,7 +94,7 @@ and * The ```RemindersPublisher``` instance reads the ```Subscription``` details and responds by pushing an event into the pipeline every time there is a new reminder. 
The ```RemindersPublisher``` will continue to push an event every time a reminder becomes available, until it has pushed as many events as were requested in the ```Subscription``` (which is infinity in this case, so the ```Publisher``` will just keep going.) -When I say that the ```RemindersPublisher``` "pushes events into the pipeline", I mean that the ```RemindersPublisher``` issues an ```onNext``` signal to the second pipeline stage (```.flatMap(reminder -> “Don’t forget: ” + reminder)```) paired with a ```String``` argument containing the reminder. ```flatMap()``` responds to an ```onNext``` signal by taking the ```String``` data passed in and applying the transformation that is in ```flatMap()```'s argument parentheses to the input data (in this case, by prepending the words “Don’t forget: ”). This signal propagates down the pipeline: pipeline Stage 2 issues an ```onNext``` signal to pipeline Stage 3 (```.flatMap(strIn -> LocalDateTime.now().toString() + “: ”+ strIn)```) with its output as the argument; and then pipeline Stage 3 issues its own output along with an ```onNext``` signal. +When I say that the ```RemindersPublisher``` "pushes events into the pipeline", I mean that the ```RemindersPublisher``` issues an ```onNext``` signal to the second pipeline stage (```.flatMap(reminder -> "Don't forget: " + reminder)```) paired with a ```String``` argument containing the reminder. ```flatMap()``` responds to an ```onNext``` signal by taking the ```String``` data passed in and applying the transformation that is in ```flatMap()```'s argument parentheses to the input data (in this case, by prepending the words "Don't forget: "). This signal propagates down the pipeline: pipeline Stage 2 issues an ```onNext``` signal to pipeline Stage 3 (```.flatMap(strIn -> LocalDateTime.now().toString() + ": "+ strIn)```) with its output as the argument; and then pipeline Stage 3 issues its own output along with an ```onNext``` signal. 
Now what happens after pipeline Stage 3 is different – the ```onNext``` signal reached the last pipeline stage, so what happens to the final-stage ```onNext``` signal and its associated ```String``` argument? The answer is that when you called ```subscribe()```, ```subscribe()``` also created a ```Subscriber``` instance which implements a method for handling ```onNext``` signals and serves as the last stage of the pipeline. The ```Subscriber```'s ```onNext``` handler will call whatever code you wrote in the argument parentheses of ```subscribe()```, allowing you to customize for your application. In the Subscribe phase snippet above, we called @@ -106,11 +106,11 @@ which means that every time an ```onNext``` signal reaches the end of the operat In ```subscribe()``` you typically want to handle the pipeline output with some finality, i.e. by printing it to the terminal, displaying it in a GUI, running a calculation on it, etc. or doing something else before discarding the data entirely. That said, Reactor does allow you to call ```subscribe()``` with no arguments and just discard incoming events and data - in that case you would implement all of the logic of your program in the preceding pipeline stages, including saving the results to a global variable or printing them to the terminal. -That was a lot. So let’s step back for a moment and mention a few key points. +That was a lot. So let's step back for a moment and mention a few key points. * Keep in mind that Reactor is following a hybrid push-pull model where async events are published at a rate requested by the ```Subscriber```. * Observe that a ```Subscription``` for N events is a type of pull operation from the ```Subscriber```. The ```Publisher``` controls the rate and timing of pushing events, until it exhausts the N events requested by the ```Subscriber```, and then it stops. 
* This approach enables the implementation of ***backpressure***, whereby the ```Subscriber``` can size ```Subscription``` counts to adjust the rate of ```Publisher``` events if they are coming too slow or too fast to process. -* ```subscribe()``` is Reactor’s built-in ```Subscription``` generator, by default it requests all events from the ```Publisher``` ("unbounded request".) [See the Project Reactor documentation here](https://projectreactor.io/docs/core/3.1.2.RELEASE/reference/) for more guidance on customizing the subscription process. +* ```subscribe()``` is Reactor's built-in ```Subscription``` generator, by default it requests all events from the ```Publisher``` ("unbounded request".) [See the Project Reactor documentation here](https://projectreactor.io/docs/core/3.1.2.RELEASE/reference/) for more guidance on customizing the subscription process. And the most important takeaway: **Nothing happens until you subscribe.** @@ -120,9 +120,9 @@ The ```Subscriber``` and ```Publisher``` are independent entities; just because ```java Flux reminderPipeline = - Flux.just(“Wash the dishes”,“Mow the lawn”,”Sleep”) // Publisher, 3 events - .flatMap(reminder -> “Don’t forget: ” + reminder) - .flatMap(strIn -> LocalDateTime.now().toString() + “: ”+ strIn); // Nothing executed yet + Flux.just("Wash the dishes","Mow the lawn","Sleep") // Publisher, 3 events + .flatMap(reminder -> "Don't forget: " + reminder) + .flatMap(strIn -> LocalDateTime.now().toString() + ": "+ strIn); // Nothing executed yet ``` ```Flux.just()``` is a [Reactor factory method](https://projectreactor.io/docs/core/release/reference/) which contrives to create a custom ```Publisher``` based on its input arguments. You could fully customize your ```Publisher``` implementation by writing a class that implements ```Publisher```; that is outside the scope of this discussion. 
The output of ```Flux.just()``` in the example above is a ```Publisher``` which will immediately and asynchronously push ```"Wash the dishes"```, ```"Mow the lawn"```, and ```"Sleep"``` into the pipeline as soon as it gets a ```Subscription```. Thus, upon subscription, @@ -133,7 +133,7 @@ reminderPipeline.subscribe(System.out::println); will output the three Strings shown and then end. -Suppose now we want to add two special behaviors to our program: (1) After all M Strings have been printed, print “End of reminders.” so the user knows we are finished. (2) Print the stack trace for any ```Exception```s which occur during execution. A modification to the ```subscribe()``` call handles all of this: +Suppose now we want to add two special behaviors to our program: (1) After all M Strings have been printed, print "End of reminders." so the user knows we are finished. (2) Print the stack trace for any ```Exception```s which occur during execution. A modification to the ```subscribe()``` call handles all of this: ```java reminderPipeline.subscribe(strIn -> { @@ -143,11 +143,11 @@ err -> { err.printStackTrace(); }, () -> { - System.out.println(“End of reminders.”); + System.out.println("End of reminders."); }); ``` -Let’s break this down. Remember we said that the argument to ```subscribe()``` determines how the ```Subscriber``` handles ```onNext```? I will mention two additional signals which Reactor uses to propagate status information along the pipeline: ```onComplete```, and ```onError```. Both signals denote completion of the Stream; only ```onComplete``` represents successful completion. The ```onError``` signal is associated with an ```Exception``` instance related to an error; the ```onComplete``` signal has no associated data. +Let's break this down. Remember we said that the argument to ```subscribe()``` determines how the ```Subscriber``` handles ```onNext```? 
I will mention two additional signals which Reactor uses to propagate status information along the pipeline: ```onComplete```, and ```onError```. Both signals denote completion of the Stream; only ```onComplete``` represents successful completion. The ```onError``` signal is associated with an ```Exception``` instance related to an error; the ```onComplete``` signal has no associated data. As it turns out, we can supply additional code to ```subscribe()``` in the form of Java 8 lambdas and handle ```onComplete``` and ```onError``` as well as ```onNext```! Picking apart the code snippet above, @@ -160,8 +160,8 @@ For the special cases of M=0 and M=1 for the ```Publisher```, Reactor provides a ```java Mono reminderPipeline = Mono.just("Are you sure you want to cancel your Reminders service?") // Publisher, 1 event - .flatMap(reminder -> “Act now: ” + reminder) - .flatMap(strIn -> LocalDateTime.now().toString() + “: ”+ strIn); + .flatMap(reminder -> "Act now: " + reminder) + .flatMap(strIn -> LocalDateTime.now().toString() + ": "+ strIn); ``` Again, ```Mono.just()``` is a Reactor factory method which creates the single-event publisher. This ```Publisher``` will push its argument into the Reactive Stream pipeline with an ```onNext``` signal and then optionally issue an ```onComplete``` signal indicating completion. diff --git a/reactor-rxjava-guide.md b/reactor-rxjava-guide.md new file mode 100644 index 0000000..972ac86 --- /dev/null +++ b/reactor-rxjava-guide.md @@ -0,0 +1,119 @@ +# Reactor vs RxJava guide + +The purpose of this guide is to help those who are more familiar with the RxJava framework to familiarize themselves with the Reactor framework and Azure Cosmos DB Java SDK 4.0 for Core (SQL) API ("Java SDK 4.0" from here on out.) + +Users of Async Java SDK 2.x.x should read this guide to understand how familiar async tasks can be performed in Reactor. 
We recommend first reading the [Reactor pattern guide](reactor-pattern-guide.md) for a more general introduction to Reactor. + +A quick refresher on Java SDK versions: + +| Java SDK | Release Date | Bundled APIs | Maven Jar | Java package name | API Reference | Release Notes | +|-------------------------|--------------|----------------------|-----------------------------------------|----------------------------------|-----------------------------------------------------------|------------------------------------------------------------------------------------------| +| Async 2.x.x | June 2018 | Async(RxJava) | com.microsoft.azure::azure-cosmosdb | com.microsoft.azure.cosmosdb.rx | [API](https://azure.github.io/azure-cosmosdb-java/2.0.0/) | [Release Notes](https://docs.microsoft.com/en-us/azure/cosmos-db/sql-api-sdk-async-java) | +| "Legacy" Sync 2.x.x | Sept 2018 | Sync | com.microsoft.azure::azure-documentdb | com.microsoft.azure.cosmosdb | [API](https://azure.github.io/azure-cosmosdb-java/2.0.0/) | [Release Notes](https://docs.microsoft.com/en-us/azure/cosmos-db/sql-api-sdk-java) | +| 3.x.x | July 2019 | Async(Reactor)/Sync | com.microsoft.azure::azure-cosmos | com.azure.data.cosmos | [API](https://azure.github.io/azure-cosmosdb-java/3.0.0/) | - | +| 4.0 | April 2020 | Async(Reactor)/Sync | com.azure::azure-cosmos | com.azure.cosmos | - | - | + +## Background + +[Reactive Streams](http://www.reactive-streams.org/) is an industry standard for declarative dataflow programming in an asynchronous environment. More detail on design principles can be found in the [Reactive Manifesto](https://www.reactivemanifesto.org/). Reactive Streams is the basis for Azure's async Java SDKs going forward. + +A Reactive Streams framework implements the Reactive Streams Standard for specific programming languages. + +The [RxJava](https://github.com/ReactiveX/RxJava) ([ReactiveX](http://reactivex.io/) for JVM) framework was the basis of past Azure Java SDKs, but will not be going forward.
Async Java SDK 2.x.x was implemented using RxJava 1; in this guide we will assume that RxJava 1 is the version you are already familiar with, i.e. as a result of working with the Async Java SDK 2.x.x. + +[Project Reactor](https://projectreactor.io/) or just *Reactor* is the Reactive Programming framework being used for new Azure Java SDKs. The purpose of the rest of this document is to help you get started with Reactor. + +## Comparison between Reactor and RxJava + +RxJava 1 provides a framework for implementing the **Observer Pattern** in your application. In the Observer Pattern, +* ```Observable```s are entities that receive events and data (e.g. UI, keyboard, TCP, ...) from outside sources, and make those events and data available to your program. +* ```Observer```s are the entities which subscribe to the ```Observable``` events and data. + +The [Reactor pattern guide](reactor-pattern-guide.md) gives a brief conceptual overview of Reactor. In summary: +* ```Publisher```s are the entities which make events and data from outside sources available to the program +* ```Subscriber```s subscribe to the events and data from the ```Publisher``` + +Both frameworks facilitate asynchronous, event-driven programming. Both frameworks allow you to chain together a pipeline of operations between Observable/Observer or Publisher/Subscriber. + +Roughly, what you would use an ```Observable``` for in RxJava, you would use a ```Flux``` for in Reactor. And what you would use a ```Single``` for in RxJava, you would use a ```Mono``` for in Reactor. + +The critical difference between the two frameworks is really in the core implementation: +Reactor operates a service which receives event/data pairs serially from a ```Publisher```, demultiplexes them, and forwards them to registered ```Subscriber```s. This model was designed to help servers efficiently dispatch requests in a distributed system. +The RxJava approach is more general-purpose.
```Observer```s subscribe directly to the ```Observable``` and the ```Observable``` sends events and data directly to ```Observer```s, with no central service handling dispatch. + +### Summary: rules of thumb to convert RxJava code into Reactor code + +* An RxJava ```Observable``` will become a Reactor ```Flux``` + +* An RxJava ```Single``` will become a Reactor ```Mono``` + +* An RxJava ```Subscriber``` is still a ```Subscriber``` in Reactor + +* Operators such as ```map()```, ```filter()```, and ```flatMap()``` are the same + +## Examples of tasks in Reactor and RxJava + +* Reminder app example from the [Reactor pattern guide](reactor-pattern-guide.md) + +**Reactor:** +```java +ReminderAsyncService.getRemindersPublisher() // Pipeline Stage 1 + .map(reminder -> "Don't forget: " + reminder) // Stage 2 + .map(strIn -> LocalDateTime.now().toString() + ": " + strIn) // Stage 3 + .subscribe(System.out::println); +``` + +**RxJava:** +```java +ReminderAsyncService.getRemindersObservable() // Pipeline Stage 1 + .map(reminder -> "Don't forget: " + reminder) // Stage 2 + .map(strIn -> LocalDateTime.now().toString() + ": " + strIn) // Stage 3 + .subscribe(item -> System.out.println(item)); +``` + +* Three-event ```Publisher``` example from the [Reactor pattern guide](reactor-pattern-guide.md) + +**Reactor:** +```java +Flux.just("Wash the dishes","Mow the lawn","Sleep") // Publisher, 3 events + .map(reminder -> "Don't forget: " + reminder) + .map(strIn -> LocalDateTime.now().toString() + ": " + strIn) // Nothing executed yet + .subscribe(strIn -> { + System.out.println(strIn); + }, + err -> { + err.printStackTrace(); + }, + () -> { + System.out.println("End of reminders."); +}); +``` + +**RxJava:** +```java +Observable.just("Wash the dishes","Mow the lawn","Sleep") // Observable, 3 events + .map(reminder -> "Don't forget: " + reminder) + .map(strIn -> LocalDateTime.now().toString() + ": " + strIn) // Nothing executed yet + .subscribe(strIn -> System.out.println(strIn), + err -> err.printStackTrace(), + () -> System.out.println("End of reminders.") +); +``` + +* Mono example from the [Reactor pattern guide](reactor-pattern-guide.md) + +**Reactor:** +```java +Mono.just("Are you sure you want to cancel your Reminders service?") // Publisher, 1 event + .map(reminder -> "Act now: " + reminder) + .map(strIn -> LocalDateTime.now().toString() + ": " + strIn) + .subscribe(System.out::println); +``` + +**RxJava:** +```java +Single.just("Are you sure you want to cancel your Reminders service?") // Single, 1 event + .map(reminder -> "Act now: " + reminder) + .map(strIn -> LocalDateTime.now().toString() + ": " + strIn) + .subscribe(item -> System.out.println(item)); +```
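
The ```subscribe()``` examples above handle up to three signals: ```onNext```, ```onError``` and ```onComplete```. As a dependency-free way to experiment with those signals, the sketch below uses the JDK's ```java.util.concurrent.Flow``` interfaces (Java 9+), which mirror the Reactive Streams interfaces, instead of Reactor or RxJava. It is an illustrative substitute under that assumption, not a description of how either framework or the Java SDK is implemented. The class name ```FlowSignalsSketch``` and its ```run()``` helper are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

// Hypothetical class for illustration; JDK only (Java 9+), no Reactor or RxJava.
class FlowSignalsSketch {

    /** Publishes each reminder and returns, in order, what the Subscriber recorded. */
    static List<String> run(String... reminders) throws InterruptedException {
        List<String> output = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        Flow.Subscriber<String> subscriber = new Flow.Subscriber<>() {
            @Override
            public void onSubscribe(Flow.Subscription subscription) {
                // "Unbounded request": ask the Publisher for every event it has.
                subscription.request(Long.MAX_VALUE);
            }

            @Override
            public void onNext(String reminder) {
                output.add("Don't forget: " + reminder);
            }

            @Override
            public void onError(Throwable err) {
                output.add("Error: " + err);
                done.countDown();
            }

            @Override
            public void onComplete() {
                output.add("End of reminders.");
                done.countDown();
            }
        };

        // Nothing is delivered until a Subscriber is attached and requests events.
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (String reminder : reminders) {
                publisher.submit(reminder);
            }
        } // close() issues onComplete once the submitted events have been delivered

        // Delivery is asynchronous, so wait for the terminal signal before returning.
        if (!done.await(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("Timed out waiting for a terminal signal");
        }
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        run("Wash the dishes", "Mow the lawn", "Sleep").forEach(System.out::println);
    }
}
```

Running ```main``` should print the three reminders, each prefixed with "Don't forget: ", followed by "End of reminders.". The explicit ```request(Long.MAX_VALUE)``` call in ```onSubscribe``` plays the role of the unbounded request that the lambda-based ```subscribe()``` overloads make on your behalf.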