Time-based replication filtering #836

Closed
amazkovoi opened this issue May 6, 2015 · 7 comments

Comments

@amazkovoi

amazkovoi commented May 6, 2015

The reason for this feature is to minimise the number of documents that are stored on mobile clients, as our users do not want to see thousands of documents on their mobile devices.

To achieve this, our mobile device users would like to configure criteria that are used to sync only a subset of the documents they are allowed to access.

Our documents (which represent audits) have a completed and uncompleted state. The state is recorded as a simple boolean flag in the documents.

Users would like to specify a configurable time window on the modified date of the documents, which restricts which documents are sync'ed to the client (i.e. Couchbase Lite). They would like to specify a different window (e.g. 3 days, 1 week, 2 weeks, 1 month, 3 months, 6 months, unrestricted) for completed and uncompleted docs.

So, for example, sync completed documents that have been modified in the last 1 week, and uncompleted documents which have been modified in the last 3 months.

The parameters for the criteria should be passed from the client, as different users would like to set different period lengths.
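A minimal sketch of the per-state criteria described above, to make the request concrete. It is not an existing Sync Gateway or Couchbase Lite API: `SyncCriteria`, `should_sync`, and the `completed` / `modified` field names are hypothetical, and "3 months" is approximated as 90 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class SyncCriteria:
    """Client-supplied windows (hypothetical). None means "unrestricted"."""
    completed_window: Optional[timedelta]
    uncompleted_window: Optional[timedelta]


def should_sync(doc: dict, criteria: SyncCriteria, now: datetime) -> bool:
    """Return True if the document's modified date is inside the window for its state."""
    if doc["completed"]:
        window = criteria.completed_window
    else:
        window = criteria.uncompleted_window
    if window is None:
        return True  # unrestricted: always sync
    return now - doc["modified"] <= window


# The example from the text: completed docs from the last week,
# uncompleted docs from the last 3 months (approximated as 90 days).
criteria = SyncCriteria(
    completed_window=timedelta(weeks=1),
    uncompleted_window=timedelta(days=90),
)
```

Because the windows come from the client, two users with access to the same documents can receive different subsets, which is what makes this hard to express with static channel assignment.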

Using channels for this is not scalable, as the criteria are time-based, and as time goes on:

  1. The number of channels will become extremely large.
  2. Documents need to be moved between channels. For a large number of documents this will not scale.
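The two scaling concerns above can be illustrated with a small sketch. Both channel-naming schemes here are hypothetical (they are not something Sync Gateway provides); the window labels are taken from the periods listed earlier in this issue, with months approximated in days.

```python
from datetime import date, timedelta
from typing import List

# Hypothetical window labels and their lengths in days.
WINDOWS = [("3d", 3), ("1w", 7), ("2w", 14), ("1m", 30), ("3m", 90), ("6m", 180)]


def day_channel(modified: date) -> str:
    """Absolute scheme: one channel per calendar day of modification."""
    return f"day-{modified.isoformat()}"


def relative_channels(modified: date, today: date) -> List[str]:
    """Relative scheme: every window that still contains the document today."""
    age = (today - modified).days
    return [label for label, days in WINDOWS if age <= days]


# Concern 1: the absolute scheme creates a new channel every day.
start = date(2015, 1, 1)
channels_in_a_year = {day_channel(start + timedelta(days=i)) for i in range(365)}

# Concern 2: in the relative scheme an unmodified document changes channels
# simply because time passes, so something has to rewrite it.
modified = date(2015, 5, 1)
before = relative_channels(modified, date(2015, 5, 6))   # age 5 days
after = relative_channels(modified, date(2015, 5, 10))   # age 9 days
```

In this sketch `channels_in_a_year` holds 365 distinct channels, and the document drops out of the `1w` channel between the two dates without ever being edited.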

Some discussion on this:
https://forums.couchbase.com/t/date-based-sync/3651/7

@leonid-s-usov

I have been thinking about a similar problem, which is just a little simpler than what is stated here: I want to limit the number of documents based on some time-dependent criteria.
The solution I have come up with (thought through, but not implemented yet) might work for this case as well, with some adjustments.

First of all, it is clear that as time goes on the client will be responsible for purging the documents that are no longer of interest to it. This narrows the problem down to the initial pull, which has to be limited.
There is already a mechanism that limits the content of the changes stream: the sequence number.

My idea is to have a custom view that maps sequences in my database to a time key defined by my logic. Before performing an initial pull, I would look up the earliest interesting sequence number and make the local database sync start from that one instead of 0.
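A rough sketch of that lookup, under stated assumptions: the view output is modelled as a plain list of `(time_key, sequence)` pairs sorted by `time_key`, and `earliest_sequence` is a hypothetical helper, not part of any Couchbase API.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


def earliest_sequence(rows: List[Tuple[int, int]], cutoff: int) -> Optional[int]:
    """Return the lowest sequence among rows whose time key is >= cutoff.

    rows: (time_key, sequence) pairs sorted by time_key, standing in for
    the output of a view keyed on time. Returns None if nothing qualifies.
    """
    keys = [time_key for time_key, _ in rows]
    start = bisect_left(keys, cutoff)  # first row at or after the cutoff
    if start == len(rows):
        return None
    # Sequences are not guaranteed to be ordered by time key, so take the min.
    return min(seq for _, seq in rows[start:])


# Illustrative data only: time keys are arbitrary integers.
rows = [(100, 5), (200, 9), (300, 7), (400, 12)]
first_interesting = earliest_sequence(rows, cutoff=200)
```

Since a changes feed returns changes after the `since` value, the pull would start from one less than the returned sequence. Anything with a higher sequence is still pulled even if it falls outside the time window, so the client would still have to purge those documents locally.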

I haven't yet researched in depth how best to force a minimum sequence, but my working suggestion is to manually create the corresponding _local document where the sync algorithm keeps the last sequence, so that the first time it starts the document is already there, providing the starting sequence I want.

In your case you could have several sequence mappings, one per criterion, and then choose among them. For example, download from the oldest and then purge the others locally.
Or, if the idea works, maybe it is possible to set this minimum sequence per channel, and people here could advise on how to accomplish this.

@ndouglas

ndouglas commented May 7, 2015

+1. Filtering pulls might be necessary for the scenario I'm working with right now. At the very least filtering pulls would immensely simplify the work I need to do.

(Filtering by sequence number won't work for my scenario, just FWIW.)

@leonid-s-usov

I just hope that it won't be anything like the original CouchDB filtering, which is in my opinion absolutely not scalable.

I think that if there is some really complex logic that depends on external state, then it should be implemented as a separate server-side process that constantly revisits documents and changes them according to the external logic. Otherwise this will have to be done every time a pull is invoked, which will kill the client interface.

@tleyden tleyden assigned tleyden and unassigned tleyden May 22, 2015
@zgramana zgramana self-assigned this May 22, 2015
@zgramana zgramana added the ready label May 22, 2015
@zgramana zgramana added backlog and removed ready labels Jun 8, 2015
@sagarrao

sagarrao commented Mar 8, 2016

Has this feature been incorporated in the newer versions? We are looking at a scenario where we get a lot of data from IoT devices and we only want to show data that is, say, 1-2 hours old (an arbitrary figure). One option is to add channels based on timestamp and then push data accordingly, but I guess something like this would be a cleaner approach?

@adamcfraser adamcfraser changed the title from "Allow mobile clients (e.g. Couchbase Lite) to ask the server to filter which docs are sync'ed" to "Time-based replication filtering" Oct 12, 2016
@basememara

+1

The client doesn't have to purge local documents, in my opinion; that can be handled by a separate custom maintenance task written by the end developer.

If the user logs out (which deletes the local database) or switches devices, login would be greatly improved, since it would only grab, for example, the last 30 days of data.

Currently, a user with 2 years' worth of data would have a hard time logging into a reinstalled app, logging in on a new device, or logging back in after logging out.

@djpongh djpongh added this to the 2.2.0 milestone Jan 31, 2018
@djpongh djpongh modified the milestones: Iridium, Cobalt Aug 28, 2018
@djpongh djpongh removed the P3: low label Aug 30, 2018
@djpongh djpongh modified the milestones: Cobalt, Mercury Dec 13, 2018
@Fujio-Turner

Fujio-Turner commented Jul 1, 2019

SG puts a timestamp in the _sync metadata recording when it wrote a document into CB. Could we do _changes?sg_time=true and have the changes feed pass that timestamp along, letting CBL figure out what it wants? Note that SG's timestamp is not related to when the document was created, which may have been on CBL.
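A sketch of what the client side of that proposal might look like. The `sg_time` parameter and the per-entry `sg_time` field are only the suggestion made in this comment, not an existing Sync Gateway feature; the value is assumed here to be epoch seconds, and `wanted_doc_ids` is a hypothetical helper.

```python
from typing import Dict, Iterable, List


def wanted_doc_ids(changes: Iterable[Dict], cutoff: int) -> List[str]:
    """Pick the document IDs whose (hypothetical) sg_time is at or after cutoff.

    changes: entries shaped like changes-feed results, each with an "id"
    and the proposed "sg_time" field (assumed to be epoch seconds).
    """
    return [entry["id"] for entry in changes if entry["sg_time"] >= cutoff]


# Illustrative feed entries; the sg_time values are made up.
feed = [
    {"seq": 1, "id": "audit-1", "sg_time": 1_400_000_000},
    {"seq": 2, "id": "audit-2", "sg_time": 1_430_000_000},
    {"seq": 3, "id": "audit-3", "sg_time": 1_430_900_000},
]
recent = wanted_doc_ids(feed, cutoff=1_430_000_000)
```

With this approach the server still sends the full changes feed; only the document bodies the client decides to fetch would be reduced.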

@adamcfraser
Collaborator

Closing based on age, and that there isn't an obvious solution that scales well and maintains client consistency.

10 participants