This repository has been archived by the owner on Jul 11, 2022. It is now read-only.

Preliminary refactor for supporting spans over HTTP #186

Closed
wants to merge 2 commits into jaegertracing:master from ProvoK:send_spans_over_http

Conversation

@ProvoK (Contributor) commented May 31, 2018

Reporter now delegates span sending to new Sender classes.

Related to #176 but made backwards compatible.

I'm not sure this is the best approach to solving the backwards compatibility problem, though.

Signed-off-by: vitto vittorio.camisa@gmail.com
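
For context, the overall shape of the delegation (pieced together from the diff hunks quoted in the review below) is roughly the following. This is a minimal sketch, not the PR's actual implementation; the UDPSender body and the channel attributes (host, reporting_port) are assumptions.

```python
class Sender(object):
    """Owns the transport-specific details of emitting a span batch."""
    def send(self, batch):
        raise NotImplementedError


class UDPSender(Sender):
    def __init__(self, host, port, io_loop=None):
        self.host = host
        self.port = port
        self.io_loop = io_loop

    def send(self, batch):
        # Assumed: emit the Thrift batch to the local agent over UDP.
        pass


class Reporter(object):
    def __init__(self, channel, sender=None, **kwargs):
        # Backwards compatible: callers that still pass the legacy channel
        # get a default UDP sender built from it.
        self.sender = sender or UDPSender(channel.host, channel.reporting_port)

    def _send(self, batch):
        # The reporter no longer calls the Thrift agent client directly.
        return self.sender.send(batch)
```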

@coveralls commented May 31, 2018

Coverage Status

Coverage increased (+0.05%) to 95.425% when pulling 8f070d1 on ProvoK:send_spans_over_http into 2e0b5bd on jaegertracing:master.

@@ -75,12 +74,14 @@ def report_span(self, span):

 class Reporter(NullReporter):
     """Receives completed spans from Tracer and submits them out of process."""
-    def __init__(self, channel, queue_capacity=100, batch_size=10,
+    def __init__(self, channel, sender=None, queue_capacity=100, batch_size=10,
Contributor:
shouldn't the new sender parameter be the last parameter, since users could be calling this function positionally, without explicitly naming the parameters?
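
To make the concern concrete, a toy illustration (hypothetical functions, not this repo's code) of how an existing positional call re-binds silently when a parameter is inserted mid-signature:

```python
def old_init(channel, queue_capacity=100, batch_size=10):
    return dict(channel=channel, queue_capacity=queue_capacity, batch_size=batch_size)


def new_init(channel, sender=None, queue_capacity=100, batch_size=10):
    return dict(channel=channel, sender=sender,
                queue_capacity=queue_capacity, batch_size=batch_size)


# A caller that passed arguments positionally against the old signature:
print(old_init('channel', 500, 20))  # queue_capacity=500, batch_size=20
print(new_init('channel', 500, 20))  # sender=500 (!), queue_capacity=20, batch_size=10
```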

Member:
and add a TODO to fix that in the next major rev.

@ProvoK (author) commented May 31, 2018:

@black-adder you're right! That should now be fixed, along with the TODO @yurishkuro asked for.
Would it be a good idea to open an issue for that and assign it to the v4 milestone (if it exists)?

Let me know if anything else is necessary :)

@@ -78,9 +77,11 @@ class Reporter(NullReporter):
     def __init__(self, channel, queue_capacity=100, batch_size=10,
                  flush_interval=DEFAULT_FLUSH_INTERVAL, io_loop=None,
                  error_reporter=None, metrics=None, metrics_factory=None,
-                 **kwargs):
+                 sender=None, **kwargs):
Contributor:
would making sender a kwarg be the safest bet? Won't this still run into the issue where sender might be incorrectly interpreted as part of kwargs?

Member:
I think sender=None, **kwargs is good

@ProvoK (author):

@black-adder I think that putting sender in kwargs would hide it more than necessary, without any benefit.
Honestly, if a developer writes down 8 or 9 positional arguments without naming any of them (i.e. as kwargs), they deserve some kind of error 😆

Member:
I agree

@ghost assigned black-adder Jun 3, 2018
@ghost added the review label Jun 3, 2018
@yurishkuro (Member) left a comment:

An open question is whether we want parity between the Sender in Python and all other Jaeger clients. The main difference is that Python is the only client that does not pre-compute the size of the Thrift batch, which is a sender-specific function: UDP has a hard 65000-byte limit on packet size, while HTTP can send larger chunks. Instead, in Python the Reporter is responsible for batching N spans (usually 10), which is OK in most cases, but often doesn't use UDP packets efficiently and in some cases might drop packets because their size exceeds 65000 bytes.

In contrast, other clients have the Sender expose two methods, append and flush. Even if we don't implement byte counting right now, I think it makes sense to move in that direction: it simply means the batching is now done in the Sender, and the Reporter is only responsible for queueing and async behavior.
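
A hedged sketch of that append/flush shape (the method names follow the description above; the early-flush policy and all other details are assumptions, not this repo's code):

```python
class Sender(object):
    def __init__(self, max_batch_bytes=None):
        self.spans = []
        # e.g. roughly 65000 for a UDP sender, None (unbounded) for HTTP.
        self._max_batch_bytes = max_batch_bytes

    @property
    def span_count(self):
        return len(self.spans)

    def append(self, span):
        """Buffer a span; a byte-counting sender would flush here when adding
        the span would push the encoded batch past its transport limit."""
        self.spans.append(span)

    def flush(self):
        """Turn the buffered spans into a batch, emit it, and reset the buffer."""
        if not self.spans:
            return 0
        num_spans = len(self.spans)
        self.send(self.spans)
        self.spans = []
        return num_spans

    def send(self, batch):
        raise NotImplementedError('transport-specific: UDP packet vs. HTTP request')
```

With this split, the Reporter only queues spans and calls append()/flush(); each concrete sender decides how large its batches can safely be.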


        if queue_capacity < batch_size:
            raise ValueError('Queue capacity cannot be less than batch size')

-       self.io_loop = io_loop or channel.io_loop
+       self.io_loop = io_loop or channel.io_loop or self.sender.io_loop
Member:
if channel is None this will blow up?

@ProvoK (author):
I'm going to add a test case for this failure scenario (sender not None, but io_loop and channel None) and then fix the code. Sorry I missed that.
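
A minimal sketch of a null-safe resolution (a hypothetical helper; the PR's eventual fetch_io_loop may differ):

```python
def resolve_io_loop(channel, sender):
    # Tolerate either argument being None instead of raising AttributeError.
    if channel is not None and getattr(channel, 'io_loop', None):
        return channel.io_loop
    if sender is not None and getattr(sender, 'io_loop', None):
        return sender.io_loop
    return None
```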

@ProvoK (author):
I did it, and I also fixed the linting problem that broke the pipeline.
About your open question, I think you're right: given those differences, it's better for the sender to do that job.
In that case, I really should look at the code and do another refactor.

@@ -97,19 +98,19 @@ def __init__(self, channel, queue_capacity=100, batch_size=10,
         """
         from threading import Lock

         self._channel = channel
+        # TODO for next major rev: remove channel param in favor of sender
+        self.sender = sender or self._create_default_sender(channel)
Member:
make it private, _sender

@@ -213,7 +213,7 @@ def _send(self, batch):
         Send batch of spans out via thrift transport. Any exceptions thrown
         will be caught above in the exception handler of _submit().
         """
-        return self.agent.emitBatch(batch)
+        return self.sender.send(batch)
Member:
This is a breaking change if the user still passes the channel. There needs to be an adapter to wrap the old channel and make it look like a sender with a send() method.

@ProvoK (author):
Actually, if the user passes the channel, the channel itself is used to generate a default UDPSender (_create_default_sender(channel)).
I don't understand where the breaking change is here.

Member:
+1
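
A hedged sketch of that fallback (it assumes the UDPSender class added by this PR and the channel attributes shown in the local_agent_net hunk further down):

```python
def create_default_sender(channel):
    # Backwards compatibility: a caller that still passes the legacy channel
    # gets a UDP sender built from it, preserving the old behaviour.
    return UDPSender(
        host=channel.host,
        port=channel.reporting_port,
        io_loop=getattr(channel, 'io_loop', None),
    )
```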

`Reporter` now delegates span sending to new `Sender` classes.

Signed-off-by: Vittorio Camisa <vittorio.camisa@gmail.com>
@codecov (bot) commented Jul 13, 2018

Codecov Report

Merging #186 into master will increase coverage by 0.47%.
The diff coverage is 98.92%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #186      +/-   ##
==========================================
+ Coverage   94.75%   95.23%   +0.47%     
==========================================
  Files          25       26       +1     
  Lines        1773     1824      +51     
  Branches      224      227       +3     
==========================================
+ Hits         1680     1737      +57     
+ Misses         60       55       -5     
+ Partials       33       32       -1
Impacted Files Coverage Δ
jaeger_client/config.py 90.69% <ø> (ø) ⬆️
jaeger_client/local_agent_net.py 95.55% <100%> (+0.2%) ⬆️
jaeger_client/reporter.py 95.86% <100%> (+3.98%) ⬆️
jaeger_client/senders.py 98.43% <98.43%> (ø)


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00d3d4a...fbefb07.

* Adopt Sender.append() and flush()

Signed-off-by: Ryan Fitzpatrick <rmfitzpatrick@signalfx.com>
@ProvoK (author) commented Jul 22, 2018

Hello there,
@rmfitzpatrick was very kind and continued the work, sending me a PR with the changes that adopt what I think @yurishkuro asked for.

EDIT: there is only a slight problem with the DCO, which I'll fix later, once the changes are validated and accepted.

@rmfitzpatrick (Contributor):

@yurishkuro, has there potentially been any update on this review?

@black-adder (Contributor):

@rmfitzpatrick sorry about the delay, I'll take a look at this today

@black-adder (Contributor) left a comment:

Sorry for the long delay; I think the changes are reasonable given we want to maintain backwards compatibility.

I started another ticket #205 where I discuss whether we want to just make all the breaking changes as needed and release 4.0.0.

I think we can continue with the approach outlined in this PR. I think offloading all the batching to the Senders will be a good next step before we tackle adding an HTTP sender.

        self.queue_capacity = queue_capacity
        self.batch_size = batch_size
        self.metrics_factory = metrics_factory or LegacyMetricsFactory(metrics or Metrics())
        self.metrics = ReporterMetrics(self.metrics_factory)
        self.error_reporter = error_reporter or ErrorReporter(Metrics())
        self.logger = kwargs.get('logger', default_logger)
        self.agent = Agent.Client(self._channel, self)
Contributor:
this is technically a breaking change since it was already public before

@@ -69,6 +69,8 @@ def __init__(self, host, sampling_port, reporting_port, io_loop=None, throttling
         # IOLoop
         self._thread_loop = None
         self.io_loop = io_loop or self._create_new_thread_loop()
+        self.reporting_port = reporting_port
Contributor:
given that the sole purpose of saving these is initializing the UDP sender, I'd rather make them private. We're going to nuke this class in the future anyway; let's not increase the public API if we don't have to.

        self._process_lock = Lock()
        self._process = None

    @staticmethod
    def fetch_io_loop(channel, sender):
Contributor:
can this be private?

        while not stopped:
-           while len(spans) < self.batch_size:
+           while self._sender.span_count < self.batch_size:
Contributor:
as Yuri mentioned, this is fine for now, but ideally the reporter only appends to the sender and both the HTTP sender and the UDP sender maintain their own batch size.
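
A hedged sketch of that direction (the reporter only appends, the sender decides when its batch is full; the coroutine machinery of the real reporter is left out):

```python
def consume_queue(queue, sender, stop_sentinel=None):
    """Drain spans from the queue and hand each one to the sender.

    The sender owns its batching policy: flush on a span count (as in this
    PR) or on an encoded-size limit (as other Jaeger clients do). The
    reporter only appends and requests a final flush on shutdown.
    """
    while True:
        span = queue.get()
        if span is stop_sentinel:
            break
        sender.append(span)  # the sender may flush internally when full
    sender.flush()           # emit whatever is still buffered
```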


    # method for protocol factory
    def getProtocol(self, transport):
Contributor:
breaking change

        logger.info('Initializing Jaeger Tracer with UDP reporter')
        return LocalAgentSender(
            host=self.host,
            sampling_port=DEFAULT_SAMPLING_PORT,
Contributor:
since we're only using the LocalAgentSender here as a means to send spans and not using any of the HTTP functionality, let's not make DEFAULT_SAMPLING_PORT a global and just hard-code it here for now.

class UDPSender(Sender):
    def __init__(self, host, port, io_loop=None):
        super(UDPSender, self).__init__(io_loop=io_loop)
        self.host = host
Contributor:
can we make all these variables private?
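
A hedged sketch of that suggestion (underscore-prefixed attributes; the base class and the agent-client setup are reduced to stubs):

```python
class Sender(object):
    def __init__(self, io_loop=None):
        self._io_loop = io_loop


class UDPSender(Sender):
    def __init__(self, host, port, io_loop=None):
        super(UDPSender, self).__init__(io_loop=io_loop)
        # Implementation details of the UDP transport stay private; callers
        # interact only through the sender's public methods.
        self._host = host
        self._port = port
        # The Thrift agent client would be built here from a local-agent
        # channel (omitted in this sketch).
```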

class Sender(object):
    def __init__(self, io_loop=None):
        from threading import Lock
        self.io_loop = io_loop or self._create_new_thread_loop()
Contributor:
can these variables be private?

    def flush(self):
        """Examine span and process state before yielding to _flush() for batching and transport."""
        if self.spans:
            with self._process_lock:
Contributor:
for another PR, but the locking here seems a little weird looking at it now. I wonder why we only lock on the process but skip locking on the spans which could be flushed and emptied in another thread.
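
For illustration of that point only (this is the "another PR" idea, not the current change): a toy sender that snapshots and clears its span buffer under the same lock it uses for the process, so a concurrent flush cannot race between the emptiness check and the copy:

```python
from threading import Lock


class LockedSender(object):
    def __init__(self):
        self._process_lock = Lock()
        self._process = None
        self.spans = []

    def set_process(self, process):
        with self._process_lock:
            self._process = process

    def flush(self):
        # Snapshot and clear the buffer while holding the lock, then emit
        # outside the critical section.
        with self._process_lock:
            if not self.spans or self._process is None:
                return 0
            spans, self.spans = self.spans, []
            process = self._process
        return self._emit(spans, process)

    def _emit(self, spans, process):
        # Stand-in for assembling a Thrift batch and handing it to send().
        return len(spans)
```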

@rmfitzpatrick (Contributor):

At @ProvoK's request, I'm attempting to carry the baton to #208 to continue this effort.

@shuaichang:
#208

Just curious, is this PR still an active effort?
