-
Notifications
You must be signed in to change notification settings - Fork 107
Enable HTTP log reporting #149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: YihaoChen <Superskyyy@outlook.com>
Not sure what the INTENDED usage of that endpoint was, but the same individual send inefficiency applies to how the spans are sent currently to "The Session object allows you to persist certain parameters across requests. It also persists cookies across all requests made from the Session instance, and will use urllib3’s connection pooling. So if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase (see HTTP persistent connection)." In any case, if it works but may not yet be optional it is a step forward. |
skywalking/client/http.py
Outdated
def report(self, generator): | ||
for log_data in generator: | ||
json_string = json_format.MessageToJson(log_data) | ||
res = self.session.post(self.url_report, json=[json.loads(json_string)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, looking at this suggests that the /v3/logs
endpoint can take an array of logs so a batch send could be done. Change to:
def report(self, generator):
json = [json_format.MessageToJson(log_data) for log_data in generator]
res = self.session.post(self.url_report, json=json)
Your call if you want to do now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm, should I make a new config entry allowing the user to choose whether batch or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, batch is the right way to go (assuming this endpoint does take arrays, which is what you should check).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, it does take arrays.
Yes it's correct usage. Http protocol is provided for those language that don't (or hard to ) support gRPC protocol.
Yes, Java agent can eliminate all possible side effects via shading the gRPC libs but for Python, we still need http protocol in case that users' applications are using a different (incompatible) gRPC package version, we decided to implement http protocol in Python agent from day one to give the users a secondary choice. |
So far looks good to me, looking forward to your Kafka part, thank you. |
I would say merge this first then do Kafka as a separate PR. |
I agree. @Superskyyy let's do one thing at a time, in a single PR |
No problem, but let me add a commit on the batch reporting first. Then lets merge. |
Oops, messed up a bit. |
It's ok, we will squash the commits into one when merging |
@Superskyyy ping me when it's ready to merge |
Signed-off-by: YihaoChen <Superskyyy@outlook.com>
@kezhenxu94 Checks done, ready to merge. |
Thank you @Superskyyy very much 🙇🏻 , excellent work! |
This is a work in progress.
Now reporting logs in JSON through HTTP to
http://oap/v3/logs
does work.But, I'm not sure if the
oap/v3/logs
endpoint is for such usage(It seems designed for fluent-bit batch reporting?). Reporting logs one by one through HTTP may not be ideal in terms of performance. I'm not sure whether the Java agent only implements gRPC reporter intentionally out of this reason.Please advise.
Signed-off-by: YihaoChen Superskyyy@outlook.com