fix(events): Fix paused events for resources due to pausing application #842

Merged
merged 16 commits into from
Mar 4, 2020

Conversation

@luispollo luispollo commented Feb 29, 2020

Introduces the following changes:

  • Generalizes the event model so it can be used for other "scopes" beyond resources
  • Modifies the database structure to repurpose the existing resource_event table into a generic event table as per the above
  • Introduces a new abstract ApplicationEvent base class for application-level events
  • Introduces new ApplicationActuationPaused and ApplicationActuationResumed events
  • Modifies the logic in EventController to insert a matching, "immutable" ResourceActuationPaused event for every ApplicationActuationPaused in the event history, at the correct position in the timestamp-ordered list

Sample table records:

scope        uid                          type                           timestamp
RESOURCE     01E26DWCV4STWP39WQ06RPA8AJ   "ResourceValid"                2020-02-29 02:24:57
RESOURCE     01E26DWCV4STWP39WQ06RPA8AJ   "ResourceActuationResumed"     2020-02-29 02:24:40
APPLICATION  serverlablpollo              "ApplicationActuationResumed"  2020-02-29 02:24:37
APPLICATION  serverlablpollo              "ApplicationActuationPaused"   2020-02-29 02:23:57
RESOURCE     01E26DWCV4STWP39WQ06RPA8AJ   "ResourceValid"                2020-02-29 01:48:00
RESOURCE     01E26DWCV4STWP39WQ06RPA8AJ   "ResourceActuationResumed"     2020-02-29 01:47:52
APPLICATION  serverlablpollo              "ApplicationActuationResumed"  2020-02-29 01:47:48
APPLICATION  serverlablpollo              "ApplicationActuationPaused"   2020-02-29 01:39:04
RESOURCE     01E26DWCV4STWP39WQ06RPA8AJ   "ResourceValid"                2020-02-28 17:35:42
RESOURCE     01E26DWCV4STWP39WQ06RPA8AJ   "ResourceCreated"              2020-02-28 17:35:37

Matching API response:

---
- type: "ResourceValid"
  apiVersion: "titus.spinnaker.netflix.com/v1"
  kind: "cluster"
  id: "titus:cluster:titustestvpc:serverlablpollo"
  application: "serverlablpollo"
  timestamp: "2020-02-29T02:24:56.684Z"
  scope: "RESOURCE"
- type: "ResourceActuationResumed"
  apiVersion: "titus.spinnaker.netflix.com/v1"
  kind: "cluster"
  id: "titus:cluster:titustestvpc:serverlablpollo"
  application: "serverlablpollo"
  timestamp: "2020-02-29T02:24:39.906Z"
  scope: "RESOURCE"
- type: "ResourceActuationPaused"
  apiVersion: "titus.spinnaker.netflix.com/v1"
  kind: "cluster"
  id: "titus:cluster:titustestvpc:serverlablpollo"
  application: "serverlablpollo"
  reason: "Resource actuation paused at the application level"
  timestamp: "2020-02-29T02:23:57.265Z"
  scope: "RESOURCE"
- type: "ResourceValid"
  apiVersion: "titus.spinnaker.netflix.com/v1"
  kind: "cluster"
  id: "titus:cluster:titustestvpc:serverlablpollo"
  application: "serverlablpollo"
  timestamp: "2020-02-29T01:47:59.870Z"
  scope: "RESOURCE"
- type: "ResourceActuationResumed"
  apiVersion: "titus.spinnaker.netflix.com/v1"
  kind: "cluster"
  id: "titus:cluster:titustestvpc:serverlablpollo"
  application: "serverlablpollo"
  timestamp: "2020-02-29T01:47:51.579Z"
  scope: "RESOURCE"
- type: "ResourceActuationPaused"
  apiVersion: "titus.spinnaker.netflix.com/v1"
  kind: "cluster"
  id: "titus:cluster:titustestvpc:serverlablpollo"
  application: "serverlablpollo"
  reason: "Resource actuation paused at the application level"
  timestamp: "2020-02-29T01:39:03.582Z"
  scope: "RESOURCE"
- type: "ResourceValid"
  apiVersion: "titus.spinnaker.netflix.com/v1"
  kind: "cluster"
  id: "titus:cluster:titustestvpc:serverlablpollo"
  application: "serverlablpollo"
  timestamp: "2020-02-28T17:35:42.055Z"
  scope: "RESOURCE"
- type: "ResourceCreated"
  apiVersion: "titus.spinnaker.netflix.com/v1"
  kind: "cluster"
  id: "titus:cluster:titustestvpc:serverlablpollo"
  application: "serverlablpollo"
  timestamp: "2020-02-28T17:35:37.426Z"
  scope: "RESOURCE"
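
The merge illustrated by the records and response above could be sketched roughly as follows. This is a minimal illustration, not the EventController code from this PR: the `ResourceEvent` and `ApplicationEvent` data classes and the `mergeHistory` function are hypothetical, simplified stand-ins, and only the event type names and the reason string are taken from the sample data.

```kotlin
import java.time.Instant

// Hypothetical, simplified event types; the real classes carry more fields.
data class ResourceEvent(
  val type: String,
  val resourceId: String,
  val application: String,
  val timestamp: Instant,
  val reason: String? = null
)

data class ApplicationEvent(
  val type: String,
  val application: String,
  val timestamp: Instant
)

/**
 * Returns the resource history, newest first, with one synthetic
 * ResourceActuationPaused event for every ApplicationActuationPaused event.
 */
fun mergeHistory(
  resourceId: String,
  resourceEvents: List<ResourceEvent>,
  applicationEvents: List<ApplicationEvent>
): List<ResourceEvent> {
  val synthetic = applicationEvents
    .filter { it.type == "ApplicationActuationPaused" }
    .map {
      ResourceEvent(
        type = "ResourceActuationPaused",
        resourceId = resourceId,
        application = it.application,
        timestamp = it.timestamp, // reuse the application event's timestamp
        reason = "Resource actuation paused at the application level"
      )
    }
  // Sorting the combined list puts each synthetic event at the right position.
  return (resourceEvents + synthetic).sortedByDescending { it.timestamp }
}
```

In this sketch, application-level resume events are dropped, which matches the sample response, where the resume already appears as a stored ResourceActuationResumed record.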

Closes #837
Closes #634

@luispollo luispollo marked this pull request as ready for review March 3, 2020 01:21
@luispollo
Contributor Author

@lorin If you want to review... (I think someone with admin permissions needs to add you to the org -- @robzienert @ajordens?)

import java.time.Clock
import java.time.Instant

abstract class PersistentEvent {
Contributor Author

This is the new base class for all persistent events. It defines the common properties I expect to see in all such events regardless of the subclass.

Contributor

Again, I find the name a bit confusing... maybe it's just me :) but can't it just be Event? Curious to hear other opinions.

resource.apiVersion,
resource.kind,
resource.id,
resource.application,
reason,
clock.instant()
)

constructor(resource: Resource<*>, reason: String? = null, timestamp: Instant) : this(
Contributor Author

Added to allow the synthetic events of this type to be added with a specific timestamp in EventController.

Contributor

If I understand my Kotlin correctly, I think you can avoid defining a second constructor by using default arguments.

  constructor(resource: Resource<*>, reason: String? = null, timestamp: Instant = clock.instant()) : this(

(I haven't figured out how to suggest a change as a diff yet).
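
For illustration, here is a small self-contained sketch of the default-argument idea. The `ActuationPaused` class and its fields are hypothetical stand-ins for the event under review, and the sketch assumes `Clock.systemUTC()` where the real code presumably refers to a `clock` already in scope.

```kotlin
import java.time.Clock
import java.time.Instant

// Hypothetical stand-in for the event class; field names are illustrative only.
data class ActuationPaused(
  val resourceId: String,
  val reason: String? = null,
  // The default expression is evaluated on every call that omits the argument,
  // so regular events get "now" without a dedicated secondary constructor.
  val timestamp: Instant = Clock.systemUTC().instant()
)

// Regular event: the timestamp falls back to the current time.
fun pausedNow(resourceId: String): ActuationPaused =
  ActuationPaused(resourceId)

// Synthetic event: the caller pins the timestamp explicitly.
fun pausedAt(resourceId: String, timestamp: Instant): ActuationPaused =
  ActuationPaused(
    resourceId = resourceId,
    reason = "Resource actuation paused at the application level",
    timestamp = timestamp
  )
```

One thing to keep in mind with this approach: Java callers only see the full signature unless the constructor is annotated with `@JvmOverloads`.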

@@ -31,7 +34,7 @@ import org.springframework.context.ApplicationEventPublisher
 import org.springframework.stereotype.Component

 @Component
-class ResourcePauser(
+class ActuationPauser(
Contributor Author

Renamed this since it handles pausing of things other than resources.

* @param application the name of the application.
* @param limit the maximum number of events to return.
*/
fun applicationEventHistory(application: String, limit: Int = DEFAULT_MAX_EVENTS): List<ApplicationEvent>
Contributor Author

We don't have an ApplicationRepository since we don't store application metadata, so I decided to add these here for now. Let me know if there's a better place.

@@ -4,6 +4,7 @@ plugins {
   `java-library`
   id("kotlin-spring")
   id("ch.ayedo.jooqmodelator") version "3.6.0"
+  id("org.liquibase.gradle") version "2.0.2"
Contributor Author

I added this so we can run Liquibase migrations from the command line.

@luispollo luispollo self-assigned this Mar 3, 2020
@@ -201,28 +202,27 @@ abstract class ResourceRepositoryTests<T : ResourceRepository> : JUnit5Minutests
context("updating the state again") {
before {
tick()
// TODO: ensure persisting a map with actual data
subject.appendHistory(ResourceDeltaDetected(resource, emptyMap(), clock))
subject.appendHistory(ResourceValid(resource, clock))
Contributor Author

This was unrelated, but ResourceDeltaDetected has ignoreRepeatedInHistory set to false, so I changed this to an event that had it set to true.
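
To make the effect of the flag concrete, here is a rough sketch. It assumes that `ignoreRepeatedInHistory = true` means an event is not appended when it repeats the most recent event of the same type; that reading is an assumption, not something confirmed in this thread. `HistoryEvent` and `History` are hypothetical names, not the repository classes in this PR.

```kotlin
// Hypothetical, simplified event: only the type and the flag matter here.
data class HistoryEvent(
  val type: String,
  val ignoreRepeatedInHistory: Boolean = false
)

class History {
  private val events = mutableListOf<HistoryEvent>()

  fun append(event: HistoryEvent) {
    // Assumed semantics: skip the event if it repeats the most recent one.
    if (event.ignoreRepeatedInHistory && events.lastOrNull()?.type == event.type) return
    events += event
  }

  // Returns a snapshot of the history, oldest first.
  fun all(): List<HistoryEvent> = events.toList()
}
```

Under that assumption, appending the same flagged event twice leaves a single entry, while an unflagged event is stored every time.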

Contributor

@robfletcher robfletcher left a comment

I think this looks good. A couple of minor comments/questions.

Comment on lines 50 to 53
events.remove(it)
resourceEvents.remove(it)
Contributor

Should we remove application level events too?

Contributor Author

I thought about that, but wasn't 100% sure we weren't using application references elsewhere (meaning, those events might still be relevant even if all the resources were removed). Also, the only place this method is called is from deleteByApplication, which in turn is only called by CombinedRepository.deleteResourcesByApplication, which, in turn... is not called from anywhere.

I think this may have been a side effect of Emily's change from delete-by-application to delete-by-delivery-config. We might be able to remove this method altogether...

Contributor

@lorin lorin left a comment

I had a few comments and questions, but nothing that would block a merge.

* @param application the name of the application.
* @param downTo the time of the oldest event to return.
*/
fun applicationEventHistory(application: String, downTo: Instant): List<ApplicationEvent>
Contributor

My instinct is to add a limit parameter to all API calls that return a list of objects, just like there is for the previous call.

Contributor Author

I decided not to add this now because the whole point of this variant of the method was not to be bound by an arbitrary limit and instead make sure we look into the application event history in the exact same time window as the resource events we're returning. Hopefully, since those are subject to a limit, the number of application events returned will not be a problem.
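
The difference between the two variants can be sketched with an in-memory store. Everything here is hypothetical and simplified: `AppEvent`, `InMemoryEventStore`, the default limit of 10, and the choice to treat `downTo` as an inclusive bound are all assumptions made for illustration.

```kotlin
import java.time.Instant

// Hypothetical, simplified application event.
data class AppEvent(val type: String, val application: String, val timestamp: Instant)

class InMemoryEventStore(private val events: List<AppEvent>) {
  // Variant bounded by count: newest first, at most `limit` events.
  fun applicationEventHistory(application: String, limit: Int = 10): List<AppEvent> =
    events.filter { it.application == application }
      .sortedByDescending { it.timestamp }
      .take(limit)

  // Variant bounded by time: every event at or after `downTo`, newest first.
  fun applicationEventHistory(application: String, downTo: Instant): List<AppEvent> =
    events.filter { it.application == application && !it.timestamp.isBefore(downTo) }
      .sortedByDescending { it.timestamp }
}
```

A caller that already holds a limited list of resource events could pass the timestamp of the oldest one as `downTo`, so that both histories cover the same time window.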

* Records an event associated with an application.
* TODO: adding this here as there's no ApplicationRepository or EventRepository, but might want to move it.
*/
fun appendHistory(event: ApplicationEvent)
Contributor

Is there an advantage to having separate appendHistory calls for the different subclasses of PersistentEvent versus just having one method for the base class:

fun appendHistory(event: PersistentEvent)

Contributor Author

@luispollo luispollo Mar 3, 2020

I actually tried to merge them, but the issue I ran into is that we dynamically determine the value to insert in the uid column for ResourceEvents by doing a select against the resources table, which prevents adding events for resources that don't exist in the database (which is a good thing since we don't have FKs anywhere). We could add the uid to insert as a parameter, but I think this would complicate life unnecessarily for callers.

Maybe worth a refactor in a separate PR?
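
A rough in-memory sketch of why the two paths differ. `EventRow`, `InMemoryEventRepository`, and the method names are hypothetical and do not reflect the SQL implementation in this PR; the map stands in for the resources table, and using the application name as the uid follows the sample table records in the description.

```kotlin
import java.time.Instant

// Hypothetical, simplified row of the generic event table.
data class EventRow(val scope: String, val uid: String, val type: String, val timestamp: Instant)

class InMemoryEventRepository(
  // Stand-in for the resources table: resource id -> uid.
  private val resourceUids: Map<String, String>
) {
  val rows = mutableListOf<EventRow>()

  // Resource events: the uid is looked up, so unknown resources are rejected.
  fun appendResourceHistory(resourceId: String, type: String, timestamp: Instant) {
    val uid = resourceUids[resourceId]
      ?: throw IllegalArgumentException("No resource with id $resourceId")
    rows += EventRow("RESOURCE", uid, type, timestamp)
  }

  // Application events: the application name is used as the uid directly.
  fun appendApplicationHistory(application: String, type: String, timestamp: Instant) {
    rows += EventRow("APPLICATION", application, type, timestamp)
  }
}
```

A single merged method would need either the lookup made conditional on the event's scope or the uid passed in by the caller, which is the trade-off described above.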

@luispollo luispollo force-pushed the fix-paused-events branch 2 times, most recently from 2471d58 to 9e05ea0 Compare March 4, 2020 01:25
private val lastCheckTimes = mutableMapOf<String, Instant>()

override fun deleteByApplication(application: String): Int {
Contributor Author

This was not used anywhere...

@@ -35,37 +40,6 @@ open class SqlResourceRepository(
private val sqlRetry: SqlRetry
) : ResourceRepository {

override fun deleteByApplication(application: String): Int {
Contributor Author

This was not used anywhere...

@luispollo
Contributor Author

@robfletcher Hopefully this is in decent shape now, if you want to take a final look.

@gal-yardeni
Contributor

btw, here's the issue for this: #634

@gal-yardeni
Contributor

Looks good to me and makes total sense. Small comments about naming :)

Contributor

@gal-yardeni gal-yardeni left a comment

Thanks for addressing my feedback!

Labels
auto merged ready to merge Approved and ready for merge
4 participants