Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GroupBy "time gap" Issue #844

Closed
benjchristensen opened this issue Feb 10, 2014 · 16 comments
Closed

GroupBy "time gap" Issue #844

benjchristensen opened this issue Feb 10, 2014 · 16 comments

Comments

@benjchristensen
Copy link
Member

The groupBy operator has a "time gap" issue when used with subscribeOn and observeOn. This exists in Rx.Net as well and was written about at http://blogs.msdn.com/b/rxteam/archive/2012/06/14/testing-rx-queries-using-virtual-time-scheduling.aspx

However, if you introduce asynchrony in the pipeline – e.g. by adding an ObserveOn operator to the mix – you’re effectively introducing a time gap during which we’ve handed out the sequence to you, control has been released on the OnNext channel, but subscription happens at a later point in time, causing you to miss elements. We can’t do any caching of elements because we don’t know when – if ever – someone will subscribe to the inner sequence, so the cache could grow in an unbounded fashion.

In discussion with @headinthebox I have decided to alter the behavior to remove this "time gap" issue so that non-deterministic data loss does not happen for the common use cases of using observeOn and subscribeOn with GroupedObservables from groupBy.

Why? It is common to want to use observeOn or subscribeOn with GroupedObservable do process different groups in parallel.

It comes with a trade-off though: all GroupedObservable instances emitted by groupBy must be subscribed to otherwise it will block. The reason for this is that to solve the "time gap" one of two things must be done:

a) use unbounded buffering (such as ReplaySubject)
b) block the onNext calls until GroupedObservable is subscribed to and receiving the data

We can not choose (a) for the reasons given in the Rx.Net blog post because it breaks backpressure and could buffer bloat until the system fails.

In general it is an appropriate thing to expect people to subscribe to all groups, except in one case where it will be expected to work – using filter.

In this case we can solve the common case by special-casing filter to be aware of GroupedObservable. It's not decoupled or elegant, but it solves the common problem.

Thus, the trade-offs are:

  1. Allow for non-deterministic data loss if observeOn/subscribeOn are used and expect people to learn about this by reading docs.

  2. Behave deterministically when observeOn/subscribeOn are used but block if groups are manually skipped.

Option 2 seems to be easier for developers to run into during dev and solve than option 1 which could often show up randomly – in prod – and be difficult to figure out and solve.

@akarnokd
Copy link
Member

How about 1) and present a new operator to cover the parallel processing case directly.

@benjchristensen
Copy link
Member Author

That doesn't prevent people from using observeOn on GroupedObservable, such as for rendering to a UI.

@benjchristensen
Copy link
Member Author

It's a tough trade-off ... non-determinism when using things that should not inject non-determism ... or risk of blocking.

Another possible solution is we could special case observeOn and subscribeOn instead to not return from onNext until they have subscribed. This would mean I need a hook inside the Subscriber type that tells me once a subscription is registered. I've looked at that as well, not sure which is better yet.

@benjchristensen
Copy link
Member Author

Actually ... modifying observeOn and subscribeOn may not result in the deadlock risks so I'm going to test that implementation.

benjchristensen added a commit to benjchristensen/RxJava that referenced this issue Feb 11, 2014
benjchristensen added a commit to benjchristensen/RxJava that referenced this issue Feb 11, 2014
See ReactiveX#844

Not completely thrilled with or 100% confident in this solution, but it does make the groupBy unit tests pass.
benjchristensen added a commit to benjchristensen/RxJava that referenced this issue Feb 11, 2014
See ReactiveX#844

Not completely thrilled with or 100% confident in this solution, but it does make the groupBy unit tests pass.
@benjchristensen
Copy link
Member Author

I've submitted an attempt at a solution. The issue appears to be resolved but I don't completely like the solution nor do I trust it 100% yet.

I need to sleep and think about it again tomorrow, but I'd appreciate a review and feedback, or a better solution from someone :-)

benjchristensen added a commit to benjchristensen/RxJava that referenced this issue Feb 11, 2014
See ReactiveX#844

Not completely thrilled with or 100% confident in this solution, but it does make the groupBy unit tests pass.
@benjchristensen
Copy link
Member Author

There are trade-offs to solving this solution that probably are not worth it.

I'd like people's opinion on whether we should figure out how to make these unit tests pass, or if it's acceptable for them not to:

    @Test
    public void testGroupsWithNestedSubscribeOn() throws InterruptedException {
        final ArrayList<String> results = new ArrayList<String>();
        Observable.create(new OnSubscribe<Integer>() {

            @Override
            public void call(Subscriber<? super Integer> sub) {
                sub.onNext(1);
                sub.onNext(2);
                sub.onNext(1);
                sub.onNext(2);
                sub.onCompleted();
            }

        }).groupBy(new Func1<Integer, Integer>() {

            @Override
            public Integer call(Integer t) {
                return t;
            }

        }).flatMap(new Func1<GroupedObservable<Integer, Integer>, Observable<String>>() {

            @Override
            public Observable<String> call(final GroupedObservable<Integer, Integer> group) {
                return group.subscribeOn(Schedulers.newThread()).map(new Func1<Integer, String>() {

                    @Override
                    public String call(Integer t1) {
                        System.out.println("Received: " + t1 + " on group : " + group.getKey());
                        return "first groups: " + t1;
                    }

                });
            }

        }).toBlockingObservable().forEach(new Action1<String>() {

            @Override
            public void call(String s) {
                results.add(s);
            }

        });

        System.out.println("Results: " + results);
        assertEquals(4, results.size());
    }

@Test
    public void testFirstGroupsCompleteAndParentSlowToThenEmitFinalGroupsWhichThenSubscribesOnAndDelaysAndThenCompletes() throws InterruptedException {
        final CountDownLatch first = new CountDownLatch(2); // there are two groups to first complete
        final ArrayList<String> results = new ArrayList<String>();
        Observable.create(new OnSubscribe<Integer>() {

            @Override
            public void call(Subscriber<? super Integer> sub) {
                sub.onNext(1);
                sub.onNext(2);
                sub.onNext(1);
                sub.onNext(2);
                try {
                    first.await();
                } catch (InterruptedException e) {
                    sub.onError(e);
                    return;
                }
                sub.onNext(3);
                sub.onNext(3);
                sub.onCompleted();
            }

        }).groupBy(new Func1<Integer, Integer>() {

            @Override
            public Integer call(Integer t) {
                return t;
            }

        }).flatMap(new Func1<GroupedObservable<Integer, Integer>, Observable<String>>() {

            @Override
            public Observable<String> call(final GroupedObservable<Integer, Integer> group) {
                if (group.getKey() < 3) {
                    return group.map(new Func1<Integer, String>() {

                        @Override
                        public String call(Integer t1) {
                            return "first groups: " + t1;
                        }

                    })
                            // must take(2) so an onCompleted + unsubscribe happens on these first 2 groups
                            .take(2).doOnCompleted(new Action0() {

                                @Override
                                public void call() {
                                    first.countDown();
                                }

                            });
                } else {
                    return group.subscribeOn(Schedulers.newThread()).delay(400, TimeUnit.MILLISECONDS).map(new Func1<Integer, String>() {

                        @Override
                        public String call(Integer t1) {
                            return "last group: " + t1;
                        }

                    });
                }
            }

        }).toBlockingObservable().forEach(new Action1<String>() {

            @Override
            public void call(String s) {
                results.add(s);
            }

        });

        System.out.println("Results: " + results);
        assertEquals(6, results.size());
    }

@benjchristensen
Copy link
Member Author

Here is a subscribeOn unit test that fails when PublishSubject is used if the Scheduler is slower than the time to emit onNext:

    @Test
    public void testSubscribeOnPublishSubjectWithSlowScheduler() {
        PublishSubject<Integer> ps = PublishSubject.create();
        TestSubscriber<Integer> ts = new TestSubscriber<Integer>();
        ps.subscribeOn(new SlowScheduler()).subscribe(ts);
        ps.onNext(1);
        ps.onNext(2);
        ps.onCompleted();

        ts.awaitTerminalEvent();
        ts.assertReceivedOnNext(Arrays.asList(1, 2));
    }

@akarnokd
Copy link
Member

Maybe subscribeOn has to be smarter about its source (and not everyone else).
I.e., if subscribeOn detects that the source is PublishSubject, It enters a buffering mode and once the time of the actual subscription has come, replay the buffered values and resume as a regular Observer would. This is is different from my previous hack where only the first observer would get the values. Here, everyone who is on a delayed subscription would eventually get the values from the time gap.

@benjchristensen
Copy link
Member Author

That's an interesting idea, especially since PublishSubject is definitely the culprit.

What do you think about the lack of back pressure during that buffering period?

The reason is the same as observeOn being changed - not allowing buffer-bloat unless someone consciously chooses to do buffering or windowing.

@akarnokd
Copy link
Member

Bounded buffering like in observeOn seems to be a workable solution, since the source needs to be blocked until the replay catches up anyway. I'm not sure, however, that my idea is deadlock-free or not. I'll do some experiments tomorrow.

@benjchristensen
Copy link
Member Author

Thanks, I look forward to seeing your idea. I appreciate your back-and-forth on this with me.

benjchristensen added a commit to benjchristensen/RxJava that referenced this issue Mar 20, 2014
I ran head-on into the "time gap" (ReactiveX#844) issue while working on a stream processing use case (and new 'pivot' operator I'm writing).
This is a solution. It's still not ideal as the Javadocs of BufferUntilSubscriber mention, but this is working better than nothing and does not require blocking threads.
A better solution will come as part of the back pressure work where BufferUntilSubscriber will evolve to have a bounded buffer.
benjchristensen added a commit to benjchristensen/RxJava that referenced this issue Mar 20, 2014
I ran head-on into the "time gap" (ReactiveX#844) issue while working on a stream processing use case (and new 'pivot' operator I'm writing).
This is a solution. It's still not ideal as the Javadocs of BufferUntilSubscriber mention, but this is working better than nothing and does not require blocking threads.
A better solution will come as part of the back pressure work where BufferUntilSubscriber will evolve to have a bounded buffer.
@akarnokd
Copy link
Member

akarnokd commented May 8, 2014

Is this settled?

@benjchristensen
Copy link
Member Author

I believe so.

@lJoublanc
Copy link

Hi Ben, I found your post after worrying that I could run into this situation in my own code. It is implied in your post, but I wanted to be entirely sure: Can this problem arise if you call subscribe on the same scheduler? i.e. if I do

grpdObs subscribe { keyAndObs => 
  val (key,obs) = keyAndObs
  obs subscribe( someObserver )
 }

is it guaranteed that the notification that evokes the first (outer) subscribe will also be pushed to someObserver? in other words, that the notification on the inner subscription won't be skipped on someObserver?
Now, the issue discussed here only applies when the inner subscription is done on another scheduler? correct?

@benjchristensen
Copy link
Member Author

As of 0.18 this problem shouldn’t exist any longer as it will buffer until the subscriber arrives.

If the subscribe is occurring synchronously, then it is never a problem, only when the subscribe happens asynchronously (such as using subscribeOn).

Now the choice of same or different scheduler is nuanced if using subscribeOn. Even if it’s the same scheduler, it could schedule it to be done later, so the question is not about what scheduler it is, it is about whether there is a delay (async queueing/scheduling) of the subscribe occurring - which typically is going to happen when using a scheduler.

Your example code would be fine, because the obs subscribe is happening synchronously. If however you changed it to the following it would be vulnerable:

grpdObs subscribe { keyAndObs => 
  val (key,obs) = keyAndObs
  obs subscribeOn(aScheduler) subscribe( someObserver )
 }

The reason it becomes vulnerable is that the function returns and groupBy continues forward while the obs subscribe then happens sometime later.

We put a solution in place for this in #975. The downside is it makes it possible to have a memory leak if groups are skipped (purposefully not subscribed to), and we don’t have a solution for that yet.

We decided that we'd rather avoid the non-deterministic data loss and risk a memory leak as that's easier to explain to people. At some point we'll need to solve the memory leak issue as well (such as emit an error, auto-unsubscribe or something similar if a group is emitted and not subscribed to).

@benjchristensen
Copy link
Member Author

I opened #1280 to cover the side-effect of this solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants