Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for MeterReportService #231

Merged
merged 11 commits into from
Sep 11, 2022

Conversation

jiang1997
Copy link
Contributor

@jiang1997 jiang1997 commented Aug 22, 2022

Add support for MeterReportService

Behavior

This service runs along with the agent like other services and reports roughly every 20 seconds.

Syntactic sugar

Counter

c = Counter('c2', CounterMode.INCREMENT)
c.build()

# increase Counter c by the time the with-wrapped codes consumed
with c.create_timer():
    # some codes may consume a certain time
c = Counter('c3', CounterMode.INCREMENT)
c.build()

# increase Counter c by num once counter_decorator_test gets called
@Counter.increase(name='c3', num=2)
def counter_decorator_test():
    # some codes
c = Counter('c4', CounterMode.INCREMENT)
c.build()

# increase Counter c by the time counter_decorator_test consumed
@Counter.timer(name='c4')
def counter_decorator_test(s):
    # some codes may consume a certain time

Histogram

h = Histogram('h3', [i / 10 for i in range(10)])
h.build()

# Histogram h will record the time the with-wrapped codes consumed
with h.create_timer():
    # some codes may consume a certain time
h = Histogram('h2', [i / 10 for i in range(10)])
h.build()

# Histogram h will record the time histogram_decorator_test consumed
@Histogram.timer(name='h2')
def histogram_decorator_test(s):
    time.sleep(s)

Simple test

insert following codes below this line agent/__init__.py#L151

from skywalking.meter.gauge import Gauge
import gc

def generator_for_gc_g0():
    while(True):
        yield gc.get_stats()[0]['collected']

def generator_for_gc_g1():
    while(True):
        yield gc.get_stats()[1]['collected']

def generator_for_gc_g2():
    while(True):
        yield gc.get_stats()[2]['collected']


gc_g0_meter = Gauge("gc_g0", generator_for_gc_g0())
gc_g0_meter.build()

gc_g1_meter = Gauge("gc_g1", generator_for_gc_g1())
gc_g1_meter.build()

gc_g2_meter = Gauge("gc_g2", generator_for_gc_g2())
gc_g2_meter.build()

@jiang1997 jiang1997 changed the title feat: meter service feat: Add support for MeterReportService Aug 22, 2022
@Superskyyy Superskyyy added feature New feature core labels Aug 22, 2022
@Superskyyy Superskyyy added this to the 1.0.0 milestone Aug 22, 2022
@Superskyyy
Copy link
Member

Please try to run the linter with make lint or make dev-check(if you don't want venv egg. to be purged) command. It will be quicker to fix things.

super().__init__(name, tags)
self.count = 0
self.previous = 0
self.mode = mode
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incompatible attribute type: Attribute mode declared in class Counter has type BoundMethod[typing.Callable(Counter.mode)[[Named(self, Counter), Named(mode, unknown)], typing.Any], Counter] but is used as type CounterMode.


Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.

When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.


Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]

@@ -28,6 +28,9 @@ class TraceSegmentReportService(object):
def report(self, generator):
raise NotImplementedError()

class MeterReportService(object):
def report(self, generator):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also report_batch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it could be used. These two report receive different generators. The former needs a generator that generates MeterData and the latter MeterDataCollection.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiang1997 I collected the call times of an interface through Counter and reported it to skywalking. Now I want to display this metric on the ui page. How do I do it? Thank you!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@jiang1997 jiang1997 Mar 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@March225
This is just a simple supplement to wu-sheng's advice for the sake of simplicity.

  1. Try to append your metric's definition to the server-side config file like these predefined metrics as below
    https://github.com/apache/skywalking/blob/e2f1c0eadb52082e16f3abfc29dd8860000b596b/oap-server/server-starter/src/main/resources/meter-analyzer-config/python-runtime.yaml#L31-L36
  2. Then edit the dashboard and add a new widget to show your metric.
    If nothing goes wrong, you should find your metric here.
    image

skywalking/meter/meter.py Outdated Show resolved Hide resolved
@jiang1997 jiang1997 force-pushed the feat_meter branch 2 times, most recently from 99a357d to 5c05dcd Compare August 23, 2022 11:31
@@ -0,0 +1,120 @@
#
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parsing failure: Could not parse file at skywalking/meter/histogram.py


Reply with "@sonatype-lift help" for info about LiftBot commands.
Reply with "@sonatype-lift ignore" to tell LiftBot to leave out the above finding from this PR.
Reply with "@sonatype-lift ignoreall" to tell LiftBot to leave out all the findings from this PR and from the status bar in Github.

When talking to LiftBot, you need to refresh the page to see its response. Click here to get to know more about LiftBot commands.


Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]

@jiang1997
Copy link
Contributor Author

Sorry for messing up the commit history.

@@ -37,7 +38,7 @@
__started = False
__protocol = Protocol() # type: Protocol
__heartbeat_thread = __report_thread = __log_report_thread = __query_profile_thread = __command_dispatch_thread \
= __send_profile_thread = __queue = __log_queue = __snapshot_queue = __finished = None
= __send_profile_thread = meter_service_thread = __queue = __log_queue = __snapshot_queue = __finished = None
Copy link
Contributor Author

@jiang1997 jiang1997 Aug 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to name meter_service_thread as __meter_service_thread to comply with the naming convention here. But only to find that double underscore has its own specific behaviour when importing double underscore variable from another file. More details here. https://stackoverflow.com/questions/1301346/what-is-the-meaning-of-single-and-double-underscore-before-an-object-name

Copy link
Member

@Superskyyy Superskyyy Aug 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jiang1997 Thanks, good finding! I believe __ prefixed variables are not supposed to be referred to by external classes, but when it is defined outside of classes, it doesn't do much other than the meaning of "you probably shouldn't use (import) this".

Anyway, I see the current implementation bit out of sync with the previous reporter styles.

That is, you provided a new MeterService(Thread), and the result of defining threads outside of agent/__init__.py is that the code base becomes weird in terms of calling topology, and there are some odd references because of it.

Maybe try two steps to make our agent startup behaviour better organized, given __init__.py was never intended to be this big and also not supposed to have logic scattered around in other modules.

  1. Follow the existing style to see how you can write the MeterService class without the Threading part, like the existing profiling service (put the threading part back to agent/__init__.py. (If you understand this already, look at step 2)
  2. Try to make the existing __init__.py into more organized modules at the agent/ directory; essentially, minimize agent/__init__.py and decouple the current mess. (If you meet obstacles, revert to step 1, and it will be good enough)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

@Superskyyy
Copy link
Member

Sorry for messing up the commit history.

It's okay, everything will be squashed in the end.

@jiang1997
Copy link
Contributor Author

Sorry for messing up the commit history.

It's okay, everything will be squashed in the end.

Yes, it makes sense for both the end of this pr and the universe.

@wu-sheng
Copy link
Member

Sorry for messing up the commit history.

It's okay, everything will be squashed in the end.

Let's clear this policy. The result is YES, we would squash all the commits to merge as one single commit in the main branch.
But this is only a tradeoff, as many new contributors don't know how to make sure the commit logs clear and readable.

A real good commit log should be like this

apache/skywalking@8a80700

Introducing the clear context of why this PR changed in this way and how we apply the changes. Multiple commit logs could be automatically merged into one commit log in merging.

@Superskyyy
Copy link
Member

A real good commit log should be like this

apache/skywalking@8a80700

Introducing the clear context of why this PR changed in this way and how we apply the changes. Multiple commit logs could be automatically merged into one commit log in merging.

Really good example, thanks!

@jiang1997
Copy link
Contributor Author

Sorry for messing up the commit history.

It's okay, everything will be squashed in the end.

Let's clear this policy. The result is YES, we would squash all the commits to merge as one single commit in the main branch. But this is only a tradeoff, as many new contributors don't know how to make sure the commit logs clear and readable.

A real good commit log should be like this

apache/skywalking@8a80700

Introducing the clear context of why this PR changed in this way and how we apply the changes. Multiple commit logs could be automatically merged into one commit log in merging.

Thanks for elaborating!

@jiang1997
Copy link
Contributor Author

It's kind of weird. I passed make lint in my local environment but still failed in Actions.
Here is the message from CI / License and Lint (pull_request)

venv/bin/flake8 .
./tests/plugin/base.py:35:1: B024 TestPluginBase is an abstract base class, but it has no abstract methods. Remember to use @abstractmethod, @abstractclassmethod and/or @abstractproperty decorators.
class TestPluginBase(ABC):
^
./skywalking/agent/protocol/__init__.py:22:1: B024 Protocol is an abstract base class, but it has no abstract methods. Remember to use @abstractmethod, @abstractclassmethod and/or @abstractproperty decorators.
class Protocol(ABC):
^
./skywalking/trace/span.py:36:1: B024 Span is an abstract base class, but it has no abstract methods. Remember to use @abstractmethod, @abstractclassmethod and/or @abstractproperty decorators.
class Span(ABC):
^
3     B024 Protocol is an abstract base class, but it has no abstract methods. Remember to use @abstractmethod, @abstractclassmethod and/or @abstractproperty decorators.
3
make: *** [Makefile:46: lint] Error 1
Error: Process completed with exit code 2.

@Superskyyy
Copy link
Member

It's kind of weird. I passed make lint in my local environment but still failed in Actions. Here is the message from CI / License and Lint (pull_request)

Indeed weird, but what the linter pointed out was right. I'm not sure what happened.

@jiang1997
Copy link
Contributor Author

It's kind of weird. I passed make lint in my local environment but still failed in Actions. Here is the message from CI / License and Lint (pull_request)

Indeed weird, but what the linter pointed out was right. I'm not sure what happened.

How about creating a new pr to fix these errors that linter reported? I'm willing to.

@Superskyyy
Copy link
Member

It's kind of weird. I passed make lint in my local environment but still failed in Actions. Here is the message from CI / License and Lint (pull_request)

Indeed weird, but what the linter pointed out was right. I'm not sure what happened.

How about creating a new pr to fix these errors that linter reported? I'm willing to.

Well technically you could, but the PR cannot be allowed to merge until CI passes, unless you can bypass it.

@kezhenxu94
Copy link
Member

@jiang1997 can you please fix the linter errors in another PR. This should be because we didn’t pin the linter version and the linter updated recently. I just tried rerun a previously successful CI and it now failed. Anyway, the linter errors in this PR look reasonable to be fixed. Please fix them in another PR

@Superskyyy
Copy link
Member

Generally looks good; please add the PVM metrics in the following PR and provide additional E2E test cases (refer to the currently commented TODO in the e2e .yaml file)

@kezhenxu94 Please also take a look :)

@jiang1997
Copy link
Contributor Author

Generally looks good; please add the PVM metrics in the following PR and provide additional E2E test cases (refer to the currently commented TODO in the e2e .yaml file)

@kezhenxu94 Please also take a look :)

Got it.

@jiang1997 jiang1997 marked this pull request as ready for review September 8, 2022 04:50
@wu-sheng
Copy link
Member

wu-sheng commented Sep 8, 2022

I think you should update the docs about meter APIs. @Superskyyy We also miss the tracing APIs doc.

@Superskyyy
Copy link
Member

I think you should update the docs about meter APIs. @Superskyyy We also miss the tracing APIs doc.

Ok, we will provide both API docs next!

@wu-sheng
Copy link
Member

wu-sheng commented Sep 9, 2022

reports roughly every one second.

I noticed this, is it configurable? Usually metrics don't report so frequently considering backend load and necessarily.
10-20s period is a better default value

@jiang1997
Copy link
Contributor Author

reports roughly every one second.

I noticed this, is it configurable? Usually metrics don't report so frequently considering backend load and necessarily. 10-20s period is a better default value

Thanks for the suggestion. I will make it configurable later.

@Superskyyy
Copy link
Member

Superskyyy commented Sep 10, 2022

Thanks! Looking good.

Copy link
Member

@kezhenxu94 kezhenxu94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jiang1997, thanks for the patience and hard work!! I'm going to merge this PR to unblock your next steps, also I left some comments that you might want address in a following minor PR, or in your next large PR with the PVM metrics.

@@ -0,0 +1,110 @@
# Python Agent Meter Reporter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please also add a link in the menu to this doc?

catalog:
- name: "Log Reporter"
path: "/en/setup/advanced/LogReporter"
- name: "Manual Instrumentation"
path: "/en/setup/advanced/API"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

## Counter
* `Counter` API represents a single monotonically increasing counter, automatic collect data and report to backend.
```python
tg_ls = [MeterTag("key", "value")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be awesome if we can just pass in a Python tuple

Suggested change
tg_ls = [MeterTag("key", "value")]
tg_ls = [("key", "value")]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a good idea.

### Syntactic sugars
```python
c = Counter('c2', CounterMode.INCREMENT)
c.build()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like every meter should invoke build after construction, what about just embed the build inside the construction of the meters? So that users won't forget to call the build method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. This is the sequelae of imitating java agent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason we need build method because we separate tags definition from construction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason we need build method because we separate tags definition from construction.

Thanks for reminding. Here also separate tags definition from construction, so seems build is still needed.

@kezhenxu94 kezhenxu94 merged commit 473ece7 into apache:master Sep 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core feature New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants