This repository has been archived by the owner on Jul 11, 2022. It is now read-only.

Preliminary refactor for supporting spans over HTTP #186

Closed
wants to merge 2 commits into jaegertracing:master from ProvoK:send_spans_over_http

Conversation

@ProvoK (Contributor) commented May 31, 2018

Reporter now delegates span sending to new Sender classes.

Related to #176 but made backwards compatible.

I'm not sure this is the best approach to solving the backwards compatibility problem, though.

Signed-off-by: vitto vittorio.camisa@gmail.com
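
For context, the overall shape of the delegation (pieced together from the diff hunks quoted in the review below) is roughly the following. This is a minimal sketch, not the PR's actual implementation; the UDPSender body and the channel attributes (host, reporting_port) are assumptions.

```python
class Sender(object):
    """Owns the transport-specific details of emitting a span batch."""
    def send(self, batch):
        raise NotImplementedError


class UDPSender(Sender):
    def __init__(self, host, port, io_loop=None):
        self.host = host
        self.port = port
        self.io_loop = io_loop

    def send(self, batch):
        # Assumed: emit the Thrift batch to the local agent over UDP.
        pass


class Reporter(object):
    def __init__(self, channel, sender=None, **kwargs):
        # Backwards compatible: callers that still pass the legacy channel
        # get a default UDP sender built from it.
        self.sender = sender or UDPSender(channel.host, channel.reporting_port)

    def _send(self, batch):
        # The reporter no longer calls the Thrift agent client directly.
        return self.sender.send(batch)
```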

@coveralls commented May 31, 2018

Coverage Status

Coverage increased (+0.05%) to 95.425% when pulling 8f070d1 on ProvoK:send_spans_over_http into 2e0b5bd on jaegertracing:master.

@@ -75,12 +74,14 @@ def report_span(self, span):

 class Reporter(NullReporter):
     """Receives completed spans from Tracer and submits them out of process."""
-    def __init__(self, channel, queue_capacity=100, batch_size=10,
+    def __init__(self, channel, sender=None, queue_capacity=100, batch_size=10,
Contributor:
shouldn't the new sender parameter be the last parameter, since users could be calling this function positionally, without explicitly naming the parameters?
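
To make the concern concrete, a toy illustration (hypothetical functions, not this repo's code) of how an existing positional call re-binds silently when a parameter is inserted mid-signature:

```python
def old_init(channel, queue_capacity=100, batch_size=10):
    return dict(channel=channel, queue_capacity=queue_capacity, batch_size=batch_size)


def new_init(channel, sender=None, queue_capacity=100, batch_size=10):
    return dict(channel=channel, sender=sender,
                queue_capacity=queue_capacity, batch_size=batch_size)


# A caller that passed arguments positionally against the old signature:
print(old_init('channel', 500, 20))  # queue_capacity=500, batch_size=20
print(new_init('channel', 500, 20))  # sender=500 (!), queue_capacity=20, batch_size=10
```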

Member:
and add a TODO to fix that in the next major rev.

@ProvoK (author) commented May 31, 2018:

@black-adder you're right! That should now be fixed, along with the TODO @yurishkuro asked for.
Would it be a good idea to open an issue for that and assign it to the v4 milestone (if it exists)?

Let me know if anything else is necessary :)

@@ -78,9 +77,11 @@ class Reporter(NullReporter):
     def __init__(self, channel, queue_capacity=100, batch_size=10,
                  flush_interval=DEFAULT_FLUSH_INTERVAL, io_loop=None,
                  error_reporter=None, metrics=None, metrics_factory=None,
-                 **kwargs):
+                 sender=None, **kwargs):
Contributor:
would making sender a kwarg be the safest bet? Won't this still run into the issue where sender might be incorrectly interpreted as part of kwargs?

Member:
I think sender=None, **kwargs is good

@ProvoK (author):

@black-adder I think that putting sender in kwargs would hide it more than necessary, without any benefit.
Honestly, if a developer writes down 8 or 9 positional arguments without naming any of them (i.e. as kwargs), they deserve some kind of error 😆

Member:
I agree

@ghost assigned black-adder Jun 3, 2018
@ghost added the review label Jun 3, 2018
@yurishkuro (Member) left a comment:

An open question is whether we want parity between the Sender in Python and all other Jaeger clients. The main difference is that Python is the only client that does not pre-compute the size of the Thrift batch, which is a sender-specific function: UDP has a hard 65000-byte limit on packet size, while HTTP can send larger chunks. Instead, in Python the Reporter is responsible for batching N spans (usually 10), which is OK in most cases, but often doesn't use UDP packets efficiently and in some cases might drop packets because their size exceeds 65000 bytes.

In contrast, other clients have the Sender expose two methods, append and flush. Even if we don't implement byte counting right now, I think it makes sense to move in that direction: it simply means the batching is now done in the Sender, and the Reporter is only responsible for queueing and async behavior.
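
A hedged sketch of that append/flush shape (the method names follow the description above; the early-flush policy and all other details are assumptions, not this repo's code):

```python
class Sender(object):
    def __init__(self, max_batch_bytes=None):
        self.spans = []
        # e.g. roughly 65000 for a UDP sender, None (unbounded) for HTTP.
        self._max_batch_bytes = max_batch_bytes

    @property
    def span_count(self):
        return len(self.spans)

    def append(self, span):
        """Buffer a span; a byte-counting sender would flush here when adding
        the span would push the encoded batch past its transport limit."""
        self.spans.append(span)

    def flush(self):
        """Turn the buffered spans into a batch, emit it, and reset the buffer."""
        if not self.spans:
            return 0
        num_spans = len(self.spans)
        self.send(self.spans)
        self.spans = []
        return num_spans

    def send(self, batch):
        raise NotImplementedError('transport-specific: UDP packet vs. HTTP request')
```

With this split, the Reporter only queues spans and calls append()/flush(); each concrete sender decides how large its batches can safely be.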


        if queue_capacity < batch_size:
            raise ValueError('Queue capacity cannot be less than batch size')

-       self.io_loop = io_loop or channel.io_loop
+       self.io_loop = io_loop or channel.io_loop or self.sender.io_loop
Member:
if channel is None this will blow up?

@ProvoK (author):
I'm going to add a test case for this failure scenario (sender not None, but io_loop and channel None) and then fix the code. Sorry I missed that.
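
A minimal sketch of a null-safe resolution (a hypothetical helper; the PR's eventual fetch_io_loop may differ):

```python
def resolve_io_loop(channel, sender):
    # Tolerate either argument being None instead of raising AttributeError.
    if channel is not None and getattr(channel, 'io_loop', None):
        return channel.io_loop
    if sender is not None and getattr(sender, 'io_loop', None):
        return sender.io_loop
    return None
```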

@ProvoK (author):
I did it, and I also fixed the linting problem that broke the pipeline.
About your open question, I think you're right: given those differences, it's better for the sender to do that job.
In that case, I really should look at the code and do another refactor.

@@ -97,19 +98,19 @@ def __init__(self, channel, queue_capacity=100, batch_size=10,
         """
         from threading import Lock

         self._channel = channel
+        # TODO for next major rev: remove channel param in favor of sender
+        self.sender = sender or self._create_default_sender(channel)
Member:
make it private, _sender

@@ -213,7 +213,7 @@ def _send(self, batch):
         Send batch of spans out via thrift transport. Any exceptions thrown
         will be caught above in the exception handler of _submit().
         """
-        return self.agent.emitBatch(batch)
+        return self.sender.send(batch)
Member:
This is a breaking change if the user still passes the channel. There needs to be an adapter to wrap the old channel and make it look like a sender with a send() method.

@ProvoK (author):
Actually, if the user passes the channel, the channel itself is used to generate a default UDPSender (_create_default_sender(channel)).
I don't understand where the breaking change is here.

Member:
+1
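
A hedged sketch of that fallback (it assumes the UDPSender class added by this PR and the channel attributes shown in the local_agent_net hunk further down):

```python
def create_default_sender(channel):
    # Backwards compatibility: a caller that still passes the legacy channel
    # gets a UDP sender built from it, preserving the old behaviour.
    return UDPSender(
        host=channel.host,
        port=channel.reporting_port,
        io_loop=getattr(channel, 'io_loop', None),
    )
```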

`Reporter` now delegates span sending to new `Sender` classes.

Signed-off-by: Vittorio Camisa <vittorio.camisa@gmail.com>
@codecov (bot) commented Jul 13, 2018

Codecov Report

Merging #186 into master will increase coverage by 0.47%.
The diff coverage is 98.92%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #186      +/-   ##
==========================================
+ Coverage   94.75%   95.23%   +0.47%     
==========================================
  Files          25       26       +1     
  Lines        1773     1824      +51     
  Branches      224      227       +3     
==========================================
+ Hits         1680     1737      +57     
+ Misses         60       55       -5     
+ Partials       33       32       -1
Impacted Files Coverage Δ
jaeger_client/config.py 90.69% <ø> (ø) ⬆️
jaeger_client/local_agent_net.py 95.55% <100%> (+0.2%) ⬆️
jaeger_client/reporter.py 95.86% <100%> (+3.98%) ⬆️
jaeger_client/senders.py 98.43% <98.43%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00d3d4a...fbefb07.

* Adopt Sender.append() and flush()

Signed-off-by: Ryan Fitzpatrick <rmfitzpatrick@signalfx.com>
@ProvoK (author) commented Jul 22, 2018

Hello there,
@rmfitzpatrick was very kind and continued the work, sending me a PR with the changes that adopt what I think @yurishkuro asked for.

EDIT: there is only a slight problem with the DCO, which I'll fix later, once the changes are validated and accepted.

@rmfitzpatrick (Contributor):

@yurishkuro, has there potentially been any update on this review?

@black-adder (Contributor):

@rmfitzpatrick sorry about the delay, I'll take a look at this today

@black-adder (Contributor) left a comment:

Sorry for the long delay; I think the changes are reasonable given we want to maintain backwards compatibility.

I started another ticket #205 where I discuss whether we want to just make all the breaking changes as needed and release 4.0.0.

I think we can continue with the approach outlined in this PR. I think offloading all the batching to the Senders will be a good next step before we tackle adding an HTTP sender.

        self.queue_capacity = queue_capacity
        self.batch_size = batch_size
        self.metrics_factory = metrics_factory or LegacyMetricsFactory(metrics or Metrics())
        self.metrics = ReporterMetrics(self.metrics_factory)
        self.error_reporter = error_reporter or ErrorReporter(Metrics())
        self.logger = kwargs.get('logger', default_logger)
        self.agent = Agent.Client(self._channel, self)
Contributor:
this is technically a breaking change since it was already public before

@@ -69,6 +69,8 @@ def __init__(self, host, sampling_port, reporting_port, io_loop=None, throttling
         # IOLoop
         self._thread_loop = None
         self.io_loop = io_loop or self._create_new_thread_loop()
+        self.reporting_port = reporting_port
Contributor:
given that the sole purpose of saving these is initializing the UDP sender, I'd rather make them private. We're going to nuke this class in the future anyway; let's not increase the public API if we don't have to.

        self._process_lock = Lock()
        self._process = None

    @staticmethod
    def fetch_io_loop(channel, sender):
Contributor:
can this be private?

        while not stopped:
-           while len(spans) < self.batch_size:
+           while self._sender.span_count < self.batch_size:
Contributor:
as Yuri mentioned, this is fine for now, but ideally the reporter only appends to the sender and both the HTTP sender and the UDP sender maintain their own batch size.
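
A hedged sketch of that direction (the reporter only appends, the sender decides when its batch is full; the coroutine machinery of the real reporter is left out):

```python
def consume_queue(queue, sender, stop_sentinel=None):
    """Drain spans from the queue and hand each one to the sender.

    The sender owns its batching policy: flush on a span count (as in this
    PR) or on an encoded-size limit (as other Jaeger clients do). The
    reporter only appends and requests a final flush on shutdown.
    """
    while True:
        span = queue.get()
        if span is stop_sentinel:
            break
        sender.append(span)  # the sender may flush internally when full
    sender.flush()           # emit whatever is still buffered
```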


    # method for protocol factory
    def getProtocol(self, transport):
Contributor:
breaking change

        logger.info('Initializing Jaeger Tracer with UDP reporter')
        return LocalAgentSender(
            host=self.host,
            sampling_port=DEFAULT_SAMPLING_PORT,
Contributor:
since we're only using the LocalAgentSender here as a means to send spans and not using any of the HTTP functionality, let's not make DEFAULT_SAMPLING_PORT a global and just hard-code it here for now.

class UDPSender(Sender):
    def __init__(self, host, port, io_loop=None):
        super(UDPSender, self).__init__(io_loop=io_loop)
        self.host = host
Contributor:
can we make all these variables private?
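
A hedged sketch of that suggestion (underscore-prefixed attributes; the base class and the agent-client setup are reduced to stubs):

```python
class Sender(object):
    def __init__(self, io_loop=None):
        self._io_loop = io_loop


class UDPSender(Sender):
    def __init__(self, host, port, io_loop=None):
        super(UDPSender, self).__init__(io_loop=io_loop)
        # Implementation details of the UDP transport stay private; callers
        # interact only through the sender's public methods.
        self._host = host
        self._port = port
        # The Thrift agent client would be built here from a local-agent
        # channel (omitted in this sketch).
```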

class Sender(object):
    def __init__(self, io_loop=None):
        from threading import Lock
        self.io_loop = io_loop or self._create_new_thread_loop()
Contributor:
can these variables be private?

    def flush(self):
        """Examine span and process state before yielding to _flush() for batching and transport."""
        if self.spans:
            with self._process_lock:
Contributor:
for another PR, but the locking here seems a little weird looking at it now. I wonder why we only lock on the process but skip locking on the spans which could be flushed and emptied in another thread.
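
For illustration of that point only (this is the "another PR" idea, not the current change): a toy sender that snapshots and clears its span buffer under the same lock it uses for the process, so a concurrent flush cannot race between the emptiness check and the copy:

```python
from threading import Lock


class LockedSender(object):
    def __init__(self):
        self._process_lock = Lock()
        self._process = None
        self.spans = []

    def set_process(self, process):
        with self._process_lock:
            self._process = process

    def flush(self):
        # Snapshot and clear the buffer while holding the lock, then emit
        # outside the critical section.
        with self._process_lock:
            if not self.spans or self._process is None:
                return 0
            spans, self.spans = self.spans, []
            process = self._process
        return self._emit(spans, process)

    def _emit(self, spans, process):
        # Stand-in for assembling a Thrift batch and handing it to send().
        return len(spans)
```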

@rmfitzpatrick (Contributor):

At @ProvoK's request, I'm attempting to carry the baton to #208 to continue this effort.

@shuaichang:
#208

Just curious, is this PR still an active effort?
