Add mechanism for invalidating property cache in server processes #4995

@keith-turner

Description

Is your feature request related to a problem? Please describe.

When table or system configuration is updated in ZooKeeper, there is no way to know when all servers in the cluster have seen the update. This can lead to problems like the following sequence.

  1. Table config is updated; this could be table iterators, classloader context, etc.
  2. Scans are started that access the tables whose config was updated.
  3. Only a subset of the tablet/scan servers have seen the config update when the scan requests come in, so some scans run with the old configuration.

PR #4990 is a very narrow fix for a single situation.

Describe the solution you'd like

Create a new server RPC that invalidates specified property caches (such as the system cache or the caches for specific tables) if their cached versions are below a given version. Because the version is an int that can wrap, the version comparison would need to account for wraparound.
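One way the wraparound concern could be handled is with a circular (serial-number style) comparison. The sketch below is only an illustration: `VersionCompare`, `isBehind`, and `shouldInvalidate` are hypothetical names, not existing Accumulo code, and it assumes the cached and required versions are never more than 2^31 increments apart.

```java
/** Hypothetical helper for illustration; not part of Accumulo. */
final class VersionCompare {
  private VersionCompare() {}

  /**
   * Returns true if {@code cached} is older than {@code required}, treating the
   * int version as a circular counter. Java int subtraction wraps on overflow,
   * so the sign of the difference gives the ordering as long as the two values
   * are fewer than 2^31 increments apart (an assumption of this sketch).
   */
  static boolean isBehind(int cached, int required) {
    return cached - required < 0;
  }

  /** A server would drop its cached properties only when it is behind. */
  static boolean shouldInvalidate(int cachedVersion, int requiredVersion) {
    return isBehind(cachedVersion, requiredVersion);
  }

  public static void main(String[] args) {
    // Ordinary case: cached version 5 is older than required version 7.
    System.out.println(shouldInvalidate(5, 7)); // true
    // Wrapped case: the counter went from MAX_VALUE to MIN_VALUE.
    System.out.println(shouldInvalidate(Integer.MAX_VALUE, Integer.MIN_VALUE)); // true
    // Already current: nothing to invalidate.
    System.out.println(shouldInvalidate(7, 7)); // false
  }
}
```

A plain `cached < required` check would get the wrapped case wrong, since `Integer.MAX_VALUE < Integer.MIN_VALUE` is false even though the counter has advanced by one.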

Unsure of the best way to expose this new server RPC for use. Below are some possible ways this could be done.

  1. Always call this new RPC after setting a property. This could make something like setting 10 properties in the shell much slower, as it would reach out to all servers after setting each property.
  2. Add a new API like invalidatePropertyCache(Set<ServerId> servers), which would work well with the changes in #4851 (Changes to public API to expose resource groups) and could be called after making many property updates on a subset of servers. This would make an RPC to each server. However, this API does not narrow which caches are cleared; for example, a caller may want to clear only a single table's cache. This would need an associated shell command.
  3. Create new property mutation API methods that allow specifying whether caches should be invalidated, and possibly on which servers. This would need options on the shell command to set properties.

Option 3 seems to offer a good balance between correctness and performance. Option 1 is good for correctness, but could cause performance problems for existing code. Option 2 seems good for performance, but not correctness.
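To make the performance difference between option 1 and an option 3 style call concrete, here is a rough, self-contained sketch. Everything in it is hypothetical: `PropertyUpdateSketch`, `InvalidationRpc`, and `FakeStore` are stand-ins invented for illustration, and the in-memory store only imitates a versioned property store. It is not a proposal for the actual API shape.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these types exist in Accumulo. */
final class PropertyUpdateSketch {
  private PropertyUpdateSketch() {}

  /** Stand-in for the proposed server RPC. */
  interface InvalidationRpc {
    void invalidate(String server, String scope, int minVersion);
  }

  /** In-memory stand-in for the versioned property store. */
  static final class FakeStore {
    private final Map<String, String> props = new LinkedHashMap<>();
    private int version = 0;

    /** Stores the property and returns the new version. */
    int set(String key, String value) {
      props.put(key, value);
      return ++version;
    }
  }

  /** Option 1: contact every server after each individual property set. */
  static void setEachThenInvalidate(FakeStore store, Map<String, String> updates,
      String scope, List<String> servers, InvalidationRpc rpc) {
    for (Map.Entry<String, String> e : updates.entrySet()) {
      int v = store.set(e.getKey(), e.getValue());
      for (String server : servers) {
        rpc.invalidate(server, scope, v);
      }
    }
  }

  /** Option 3 style: apply all updates, then invalidate once if the caller asks. */
  static void setAllThenInvalidate(FakeStore store, Map<String, String> updates,
      String scope, List<String> servers, boolean invalidate, InvalidationRpc rpc) {
    int v = 0;
    for (Map.Entry<String, String> e : updates.entrySet()) {
      v = store.set(e.getKey(), e.getValue());
    }
    if (invalidate && !updates.isEmpty()) {
      for (String server : servers) {
        rpc.invalidate(server, scope, v);
      }
    }
  }
}
```

With 10 property updates and 3 servers, the first method makes 30 invalidation calls while the second makes 3, and callers that do not need the guarantee can skip the calls entirely by passing `invalidate = false`.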

Metadata

Labels: enhancement (This issue describes a new feature, improvement, or optimization.)
