Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add custom metrics and metric history fork to LiveDashboard #613

Merged
merged 7 commits into from Jun 3, 2020

Conversation

bglusman
Copy link
Contributor

Hey @doomspork / @SophieDeBenedetto This is a working v1 of discussed custom metrics and use of historical data fork of LiveDashboard. Right now it's only saving history for two of the pages of metrics, and one of them is a kind of "duplicate"/fork of the existing Ecto telemetry metrics, so one of them has history one doesn't. Could remove the old/non-saved one or not, but up to you guys! Let me know any thoughts and feel free to review the code on my fork of LiveDashboard as well as part of this, opened a PR into my own master here bglusman/phoenix_live_dashboard#2 !

Also rename metric and simplify history to include metadata

poller was based on previous custom count logic,
doesn’t make sense for shadowing something that
works as is without poller.
@bglusman bglusman requested a review from a team as a code owner May 22, 2020 15:26
@@ -56,6 +56,9 @@ config :companies, CompaniesWeb.Endpoint,
]
]

# Do not run appsignal in local dev
config :appsignal, :config, active: false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is technically out of scope, but without this I accidentally got some errors submitted to my work AppSignal account, because I guess I set the env var for APPSIGNAL_PUSH_API_KEY at some point and it was still set... very confusing seeing errors from an unrelated project show up in work project 🙃

@bglusman bglusman marked this pull request as draft May 22, 2020 15:36
defp pruned_metadata(metadata) do
# for now keep it all, reminder to either keep or drop selected fields based on dashboard usage to conserve memory,
# ideally via some published source of truth hook from dahsboard module to ensure correctness
metadata
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

José expressed concern about memory, and I think this will be the main way we address that concern in practice, though this method will likely need to consider the metric as well as the metadata to decide what to save vs discard... but I think for now we can either leave this as is, or kill, and look at memory usage without discarding anything before deciding if it's a practical issue for us, though we may want to tackle even if it's not a big issue here so that visibility is in place in LiveDashboard to support hooks doing this pruning intelligently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On newer versions of my fork, I'm actually just leaving out the metadata entirely and most of the metrics seem to work exactly the same, so possibly we can do the same here, though I'm curious to see the difference in memory impact both ways/it's entirely possible updates to LiveDashboard may start making use of this metadata in new releases that it's not using now.

@bglusman bglusman marked this pull request as ready for review May 22, 2020 15:56
@GregMefford
Copy link

GregMefford commented May 22, 2020

I'm curious if it would make more sense to leverage something existing for the lower-level task of aggregating the metrics so that they're available for querying by LiveDashboard (to fill in the historical data from before the live streaming part takes over). For example, this seems like it does exactly what you'd want to do, but in a more full-featured way: https://github.com/beam-telemetry/telemetry_metrics_prometheus_core

I think it's something that you'd want to make configurable if it were to become a built-in part of LiveDashboard, for example, because it's going to store the aggregate data in ETS, but from a brief skim, it looks pretty similar to what you're proposing to do "manually" in a GenServer's state.

EDIT: Sorry if I should have posted this to the PR on your actual fork of LiveDashboard. 😅

@bglusman
Copy link
Contributor Author

Hey Greg, no worries! I’ve looked at that stuff a tiny bit, but thought that was specific to StatsD and getting telemetry immediately to an outside data store so I didn’t look at using it much for this... if you think we can do that more cleanly I’m definitely open to it, but I might need more guidance what you have in mind.. you are talking about the code here, not in live dashboard fork right? The idea is no storage in live dashboard will ever occur, only the hook and formatting of data coming in to the hook on that side, so I think you’re commenting in the right place, but Github was down when I saw this note at my computer and now I’m at the park on mobile, so I’ll look a little more later and see if I understand any better what you have in mind, but either way thanks for the thought!

@bglusman
Copy link
Contributor Author

bglusman commented May 22, 2020

Looking a tiny bit more @GregMefford I think I maybe partly understand what you're thinking for the aggregation purposes, in that I see that package is sort of agnostic about consumption of the metrics... I'm guessing you're talking mostly about the aggregation going on in CompaniesWeb.ViewingStats module, not the CompaniesWeb.RepoMetricsHistory module? I guess it's possible that this package could be used in place of the CompaniesWeb.Telemetry and/or ViewingStats module and passed to live_dashboard in router in place of it, but I don't see how/if that would do anything for gaining metric history, and the main part of my motivation for doing this is to get real world usage of my live_dashboard fork, which was a condition to improve docs and guidance around managing memory footprint for users of it, since only the hook and not a GenServer or ETS table will ever be part of live_dashboard proper, based on José's comments in my original PR/discussion here phoenixframework/phoenix_live_dashboard#110

Does that clarify/did I get anything wrong there in what you meant?

Also, I need to play more, but though I saw the RepoWebHistory charts working for historical data, I also had them fail to load as expected once or twice after I opened the PR while playing around to check memory consumption ... the viewing stats charts are working well and reliably for me, but there may still be something funny with the repo charts, if anyone else can run a little locally and see if a) they see consistently working or not working, and b) if they have any idea what the difference is between when it works and doesn’t, let me know, either way I'll investigate further. We could pull that part out and just merge the view stats and history for now while I keep working on the other stuff for a future PR, up to you guys of course.

Copy link
Collaborator

@gemantzu gemantzu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to work ok. My testing methodology might be dumb, please inform me if so:

for i in `seq 1 20`; do curl http://localhost:4000/en/companies; done

Idk if it's relevant, the only issue is that in dev mode, I saw the graph blank for a few seconds, and then it came back as:

Screenshot from 2020-06-01 19-43-26

The second time I tried it, it didn't blank. Also, the beam memory stayed cool during the whole process.

LGTM.

@bglusman
Copy link
Contributor Author

bglusman commented Jun 1, 2020

Thanks @gemantzu ! Was going to add an update to also add VM history, but up to you guys if want that here or a seperate PR... also can probably use style there and stop re-emitting events under a different namespace, not sure why I bothered with that, it's needed for the poller stuff in page_views but not for the repo or VM history

@gemantzu
Copy link
Collaborator

gemantzu commented Jun 1, 2020

I think you should not add more to this PR. @SophieDeBenedetto can you take a look at this before we merge please? You are far more experienced than me on telemetry :)

@bglusman
Copy link
Contributor Author

bglusman commented Jun 2, 2020

I updated my fork with latest upstream master on LiveDashboard, the commit referenced in the mix.lock here 9259f448 is no longer part of the history of that branch because of rebase, but I pushed up that commit in a new branch historical_data_elixir_companies_PR in case anyone is playing/in case mix has trouble pulling it down, don't think there will be any issue, but just in case...

@doomspork
Copy link
Member

@bglusman given folks limited time these days how do you feel about merging this in now to unblock your other related efforts and we can collectively make improvements to the implementation as we go forward.

We'd be glad to have you use this application as a platform to explore LiveDashboard and share those learnings with the community. Happy to discuss other ways we can generate metrics 😁

@bglusman
Copy link
Contributor Author

bglusman commented Jun 3, 2020 via email

@doomspork
Copy link
Member

I'm good to merge this today and go from there. I'll merge now and it should autodeploy. If we run into any hickups I can rollback or ping you.

@doomspork doomspork merged commit bd96d16 into beam-community:master Jun 3, 2020
@bglusman
Copy link
Contributor Author

bglusman commented Jun 3, 2020

@doomspork hmm, I guess some issue in the deploy? LMK if I can help, but I don't see it deployed live yet in any case...

@doomspork
Copy link
Member

@bglusman it looks like there is a compilation issue:

== Compilation error in file lib/companies_web/historical_data.ex ==

** (CompileError) lib/companies_web/historical_data.ex:12: unsupported option :reduce given to for

    lib/companies_web/historical_data.ex:11: (module)

 !     Push rejected, failed to compile Elixir app.

 !     Push failed

We may need to update the Elixir version

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants