This repository has been archived by the owner on Jul 11, 2022. It is now read-only.

BZ1093948 - Metric traits, call times in Cassandra #70

Open
wants to merge 1 commit into
base: master
from

Conversation

genman
Contributor

@genman genman commented Jun 30, 2014

Traits are stored in two tables, call times in one. There is a secondary index
on call name in Cassandra. (It is unclear whether this index is really needed,
but it is required when doing filtered queries.)

TTL is used for expiry. No more need to purge this data hourly.
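
As a rough illustration of TTL-based expiry, the sketch below builds a CQL insert that carries a per-write TTL. The table name `trait_history`, its columns, the one-year retention, and the helper `ttl_insert` are all hypothetical and not taken from this patch; only the `USING TTL` clause itself is standard CQL.

```python
from datetime import timedelta


def ttl_insert(table: str, columns: list[str], retention: timedelta) -> str:
    """Build a parameterized CQL INSERT whose rows expire after `retention`."""
    ttl_seconds = int(retention.total_seconds())
    if ttl_seconds <= 0:
        raise ValueError("retention must be positive")
    cols = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({cols}) VALUES ({marks}) USING TTL {ttl_seconds}"


# Hypothetical table, columns, and retention, for illustration only.
statement = ttl_insert(
    "trait_history", ["schedule_id", "time", "value"], timedelta(days=365)
)
```

Because Cassandra attaches the TTL at write time, a statement like this only governs rows written with it; rows written earlier keep whatever TTL they were given.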

Known bugs:

  • The UI shows the latest trait value with the timestamp of when the data was
    last reported, not when the value changed. Displaying this properly requires
    reading the history table, but that might be too many round trips for this
    piece of data.
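
A minimal sketch of what "when it changed" could mean, assuming the history rows for one trait are available as `(timestamp, value)` pairs sorted newest first. The function name `last_change_time` and the row shape are hypothetical; the patch may represent history differently.

```python
from typing import Optional, Sequence, Tuple


def last_change_time(history: Sequence[Tuple[int, str]]) -> Optional[int]:
    """Return the oldest timestamp in the newest run of identical values.

    `history` must be sorted newest first. If every row in the given
    window carries the same value, the result is only an upper bound on
    the real change time, since the change happened at or before it.
    """
    if not history:
        return None
    changed_at, latest_value = history[0]
    for timestamp, value in history[1:]:
        if value != latest_value:
            break  # an older, different value: the run ends here
        changed_at = timestamp
    return changed_at


# Hypothetical rows: value "b" was reported at 400 and 300, "a" before that.
rows = [(400, "b"), (300, "b"), (200, "a"), (100, "a")]
changed = last_change_time(rows)
```

Here `changed` is 300, the first report of the current value, rather than 400, the most recent report.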

Possible issues:

  • Paging control of results is basically ignored. Is paging really necessary?
    Cassandra does have some support for ranges on hashed fields, but we would
    need to build infrastructure around this. PageControl could hold this hash.
  • Secondary filtering and sorting needs testing.
  • Hard to say if the UI is badly affected. The APIs are a bit
    eccentric; they pull extra data in without any clear explanation.
  • Purging of duplicate history items happens weekly. The reason is that traits
    are usually updated every 1 to 24 hours, so with a small window there are
    few duplicates. The look-back window is about 8 days. Even if a trait
    doesn't change, the history table will have duplicates.
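
The duplicate purge described in the last item could be sketched roughly as follows, assuming one trait's history rows inside the look-back window are loaded as `(timestamp, value)` pairs sorted oldest first. The name `duplicates_to_purge` and the row shape are hypothetical and not the patch's actual code.

```python
from typing import List, Sequence, Tuple


def duplicates_to_purge(history: Sequence[Tuple[int, str]]) -> List[int]:
    """Return timestamps of rows that merely repeat the previous value.

    `history` must be sorted oldest first. The first row of each run of
    identical values is kept, so the time of each change is preserved.
    """
    purge: List[int] = []
    previous = object()  # sentinel that never equals a real value
    for timestamp, value in history:
        if value == previous:
            purge.append(timestamp)  # same value as the row before it
        else:
            previous = value  # value changed: keep this row
    return purge


# Hypothetical window: "a" reported twice, then "b" twice, then "a" again.
window = [(100, "a"), (200, "a"), (300, "b"), (400, "b"), (500, "a")]
to_delete = duplicates_to_purge(window)
```

With this input, `to_delete` is `[200, 400]`; the rows at 100, 300, and 500 survive because each marks a change.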

Questions:

  • Should the measurement request contain resource ID? It seems like this isn't
    really necessary, although I do like the idea of grouping traits by resource.
  • Should data be denormalized into the traits table, like trait name? How could
    this work? This may be unneeded optimization.
  • How to support runtime change of TTL? Currently the server must be restarted
    to change the TTL. Older data is not changed.
  • Merge two trait tables into one?

TODOs:

  • Migration tooling (support copying data to new state)
  • Remove old SQL stuff and remove entities from appearing
  • Lots of sub-optimal queries to adjust; possibly dupe data in traits table
  • Paging support (necessary?)
  • Documentation (of course)

@genman
Contributor Author

genman commented Jun 30, 2014

I'm putting this out there because:

  1. Personally, I'd love it if traits (if not call times) moved to Cassandra.
  2. It does need to be discussed seriously and this seems to be the venue for it.

I don't really want to see the patch nit-picked unless somebody's really serious about merging it.

@rhqci

rhqci commented Oct 16, 2017

Can one of the admins verify this patch?
