Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delta chain - when announcement fails - consumer gets confused [bug/question] #164

Open
rpalcolea opened this issue Jan 25, 2018 · 0 comments

Comments

@rpalcolea
Copy link
Member

rpalcolea commented Jan 25, 2018

Hi @toolbear

Writing some tests around scenarios that we could face, for example when announcement fails for a delta state:

  def 'incremental failure on announcer'() {
    setup:
    config.producerConfig.incremental = true
    HollowProducer.Announcer spyAnnouncer = Spy(config.producerConfig.announcer.get().class, constructorArgs: [config.getDefaultRootDirectory()])
    def localProducer = getProducer(new Module() {
      @Override
      void configure(Binder binder) {
        binder.bind(HollowProducer.Announcer.class).toInstance(spyAnnouncer)
      }
    })
    localProducer.hollowProducer
    consumer = getConsumer()
    List<Map> dataSet = (1..2).collect { createRandomAutomobile() }

    when: 'initial snapshot cycle succeeds'
    List<Automobile> automobiles = dataSet.collect { autoJson ->
      parseAutomobile(autoJson)
    }
    localProducer.runCycle(automobiles)
    consumer.triggerRefresh()

    then: 'data should be there'
    consumer.allDomainObjects.size() == 2

    when: 'delta cycle with fails on announce'
    List<Automobile> redAutomobiles = dataSet.collect { autoJson ->
      autoJson.trim = 'red'
      parseAutomobile(autoJson)
    }
    localProducer.runCycle(redAutomobiles)

    then: 'announcer should throw exception'
    1 * spyAnnouncer.announce(_) >> {
      throw new RuntimeException('ouch') }
    thrown(RuntimeException)

    when:
    consumer.triggerRefresh()

    then: 'consumer should reflect old data and trim should stay yellow'
    consumer.allDomainObjects.size() == 2
    consumer.allDomainObjects.collect { it.getObject('trim') }.unique().first().toString() == 'yellow'

    when: 'delta cycle with addition'
    List<Automobile> purpleAutomobiles = dataSet.collect { autoJson ->
      autoJson.trim = 'purple'
      parseAutomobile(autoJson)
    }
    long version = localProducer.runCycle(purpleAutomobiles)
    consumer.triggerRefresh()

    then: 'addition should be there'
    consumer.allDomainObjects.size() == 2
    consumer.allDomainObjects.collect { it.getObject('trim') }.unique().first().toString() == 'purple'
  }

In this case what we observe is that Hollow will generate deltas from version 20180125180054001 to 20180125180054002 and 20180125180054001 to 20180125180054003. Because 20180125180054002 failed, the 3rd cycle will try to write a delta from "1" to "3".

screen shot 2018-01-25 at 12 01 51 pm

Then if trigger a refresh in the consumer, since the latest announced version is "20180125180054003", it will try to apply that, however, HollowUpdatePlanner determines that the next deltaDestinationVersion should be "20180125180054002"

long deltaDestinationVersion = deltaPlan.destinationVersion(currentVersion);

Looks like destinationVersion in HollowUpdatePlan only knows about 1 transition:

screen shot 2018-01-25 at 12 06 25 pm

In this case when triggerRefresh happens, it will go to version 20180125180054002 and if you do another triggerRefresh it will fail because there is no update plan from 20180125180054002 to 20180125180054003.

Our workaround for now is to restart the producer that way it creates a snapshot once it's restored.

We don't know if this is an issue or if we are doing something wrong. Also, wonder if we should not fail when announcement fails. Wonder if it makes sense to commit a change and then announce?

. Guess the consumers will pick up the "missing delta" on the next announcement since the file should be available.

Wonder if this would also be a use case for HollowStateDeltaPatcherTest

Any thoughts?

Sunjeet pushed a commit that referenced this issue Apr 10, 2019
…rride to master

* commit 'b24c3267ae5a5b6de30f27f7eb13eceb46ba66c3':
  Use NOT_STACK to specify dataspec namespace not to use stack
  Use special stack value "default" to specify don't use stack
  Use NetflixConfiguration instead of System.getProperty
  DataSpec Stack override
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant