Add a section in Components about "agents" & what to use instead of "agent"? #1689

Closed
svrnm opened this issue Aug 31, 2022 · 36 comments
Labels
discussion Input from everyone is helpful to drive this forward docs

Comments

@svrnm
Member

svrnm commented Aug 31, 2022

This is a follow-up to @trask's question (#1661):

Sub-projects like opentelemetry-java-instrumentation, opentelemetry-dotnet-instrumentation, and Python's opentelemetry-distro provide a solution that OTEPS-0001 describes as a "layer that adds OpenTelemetry instrumentation to a service without modifying the source code for that service."

I would like to add a section to Components that calls out the existence of those, the value they bring to the end user ("zero source code modification", auto-instrumentation for all my libraries, packaging of SDK, exporters, resource detectors, and other building blocks, extensibility, and more), when to use them, when to choose manual instrumentation instead, etc.

Coming from APM, I would like to call them "Agents", but as stated in the OTEP, "that term is overloaded and ambiguous", so we have tried to avoid it throughout our docs so far. I'm OK with that, but to add that section to the docs, I first need a canonical term that describes that layer properly. There are a few alternatives throughout the docs & code:

  • Java: OpenTelemetry (Automatic) Instrumentation for <Language> (plus Javaagent of course)
  • .NET: OpenTelemetry <Language> Automatic Instrumentation
  • Python: OpenTelemetry Distro for <Language>, based on what is described here: Distributions

So, what is that "layer that adds OpenTelemetry instrumentation to a service without modifying the source code for that service" called?

cc @open-telemetry/docs-approvers

@svrnm svrnm added the question Further information is requested label Aug 31, 2022
@svrnm svrnm added docs discussion Input from everyone is helpful to drive this forward and removed question Further information is requested labels Sep 8, 2022
@svrnm
Member Author

svrnm commented Sep 15, 2022

@martinjt have a look here :-)

@reyang
Member

reyang commented Sep 23, 2022

Here is my suggestion:

  1. We should avoid the word "agent" when talking about auto-instrumentation, given that "auto-instrumentation" is the official name recommended by OpenTelemetry. I understand that "JVM Agent" is a technology that has been used for decades, so I would vote that in the context of Java we always use "JVM Agent" (or "JVM agent").
  2. I would suggest that we use "agent" to refer to something that runs as a separate process (has a different PID) or something that is not a process at all (e.g. a device driver running in kernel mode with its own thread, or firmware that hooks into hardware interrupts).

@svrnm
Member Author

svrnm commented Sep 26, 2022

  1. We should avoid the word "agent" when talking about auto-instrumentation, given that "auto-instrumentation" is the official name recommended by OpenTelemetry. I understand that "JVM Agent" is a technology that has been used for decades, so I would vote that in the context of Java we always use "JVM Agent" (or "JVM agent").
  2. I would suggest that we use "agent" to refer to something that runs as a separate process (has a different PID) or something that is not a process at all (e.g. a device driver running in kernel mode with its own thread, or firmware that hooks into hardware interrupts).

I am ok with avoiding the word "agent", but I don't see "auto-instrumentation" being the exact equivalent for that today:

  • For an APM end-user an agent is not only that "layer that adds instrumentation to a service without modifying the source code for that service.", but also some packaging around it that takes care of backend connection (exporters), processing (filtering, batching, etc.), runtime configuration, self-telemetry, plugin support and some more things.
  • The glossary of the spec defines Automatic Instrumentation as: Refers to telemetry collection methods that do not require the end-user to write or access application code to use the OpenTelemetry APIs. Methods vary by programming language, and examples include bytecode injection or monkey patching.
  • If I look into different language documentations of the otel projects auto-instrumentation can mean different things:
    • Java does the "APM agent" for Automatic Instrumentation, i.e. there are instrumentation libraries for a bunch of frameworks, there are exporters & there is configuration
    • .NET docs state that "OpenTelemetry .NET Automatic Instrumentation does the following: (1) injects and configures the OTel SDK into the application. (2) Adds OTel Instrumentation to key packages and APIs used by the application", which is also more like an "APM agent"
    • JavaScript has a package called auto-instrumentations-node that bundles instrumentation libraries into one package. The wiring (exporters, config, etc.) of everything else happens via the SDK setup.
    • Python docs describe Automatic Instrumentation as "You only need to install a few Python packages to successfully instrument your application’s code." The wiring, e.g. setting up exporters, is done independently; the distro covers some of the more common options for users, which follows what is also described in the concept page of Distributions
    • Ruby docs describe Automatic instrumentation as "Automatic instrumentation in ruby is done via instrumentation packages, and most commonly, the opentelemetry-instrumentation-all package. These are called Instrumentation Libraries."
  • There is also the term Distribution, which is described as "a wrapper around an upstream OpenTelemetry repository with some customizations" and is used by Python to provide that "agent-like" experience, and by the collector for pre-built and vendor-specific packaging. Some vendors also use it to describe their "agent-like" wrappers, but since it is also used for the collector and for the overall vendor-specific wrapping of the OTel projects, it is a confusing term as well.
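
To make the glossary's "monkey patching" concrete, here is a minimal sketch of how instrumentation can be added at runtime without editing application code. The `HttpClient` class, the `instrument` helper, and the `recorded_spans` list are hypothetical names invented for this illustration; they are not part of any OpenTelemetry package, and a real instrumentation library would create spans through the OpenTelemetry API rather than append to a list.

```python
import functools
import time


class HttpClient:
    """Hypothetical third-party client that the application already uses."""

    def get(self, url):
        return f"200 OK from {url}"


# Stand-in for telemetry; a real instrumentation would emit spans instead.
recorded_spans = []


def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that records each call."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return original(self, *args, **kwargs)
        finally:
            # Recorded even if the wrapped call raises.
            recorded_spans.append({
                "name": f"{cls.__name__}.{method_name}",
                "args": args,
                "duration_s": time.perf_counter() - start,
            })

    setattr(cls, method_name, wrapper)


# Applied once at startup, outside the application's own source files.
instrument(HttpClient, "get")

# Unmodified application code: it still calls the client as before.
response = HttpClient().get("https://example.com")
```

Whether the `instrument(...)` call is made by the developer in a setup file or by a launcher that runs before the application is roughly the difference between "loading an instrumentation library" and the "agent-like" experience discussed in this thread.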

To make a long story short, and in combination with your ask in #1777, what we need in order to avoid confusing OTel end users is the following:

  • A consistent name for the language specific project (see Consistent naming? #1777 for details & examples)
  • A consistent word for that "layer that adds instrumentation to a service without modifying the source code for that service." (sounds like that's what should be called "Auto Instrumentation")
  • A consistent word for that auto-instrumentation plus X "APM agent"-like experience (Distro? Agent?)
  • A consistent word for that vendor-specific "distribution" of all (or many) otel projects.

What I don't know is how to go about that: is this something the GC/TC needs to pick up? Do we need a spec change or an OTEP?

@cartermp
Contributor

Just a nit about .NET - it's explicitly not an agent (@pellared can confirm) but rather a custom profiler & assorted components that inject the instrumentation. So we'd want to use a different name for that.

@pellared
Member

pellared commented Sep 26, 2022

I would maintain that it is simply "automatic-instrumentation".

Note that for Python auto-instrumentation you do not need to write code, as you can use env vars to configure things (see here).
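
As a rough illustration of configuration through environment variables instead of code, the sketch below reads a few settings and falls back to defaults. The variable names `OTEL_SERVICE_NAME`, `OTEL_TRACES_EXPORTER`, and `OTEL_EXPORTER_OTLP_ENDPOINT` come from the OpenTelemetry environment variable specification; the `load_config` helper and the default values shown are assumptions made for this example, not the actual implementation used by the Python auto-instrumentation.

```python
import os

# Defaults used when a variable is unset; the values are assumptions for this sketch.
DEFAULTS = {
    "OTEL_SERVICE_NAME": "unknown_service",
    "OTEL_TRACES_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317",
}


def load_config(environ=None):
    """Return the effective settings, preferring environment values over defaults."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, default) for name, default in DEFAULTS.items()}


# A dict stands in for the process environment so the example is reproducible.
config = load_config({
    "OTEL_SERVICE_NAME": "checkout",
    "OTEL_TRACES_EXPORTER": "console",
})
```

The point for this discussion is that the operator changes behavior by setting variables before the process starts, so the application source stays untouched.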

Maybe the same is true for Node.js and Ruby? It is possible that the automatic instrumentation term is abused there, but that is something that should be addressed in their projects.

I like the current description that I see here and here.

Probably this issue can be closed.

@svrnm
Member Author

svrnm commented Sep 26, 2022

This blog post says: "now with the .NET Automatic Instrumentation project, developers can run the agent side-by-side with their application and get telemetry automatically."

We confuse our end-users, and that's why I raised this issue. This adds to the constant complaint of end-users that it is so complicated to get started with OpenTelemetry.

And yes, the term automatic instrumentation is abused in some places, but my point of view is that this comes from the fact that there is no better term for it, if we avoid "agent".

My question is, what do we call the piece of software that is attached to an application (ideally without touching the code) and then does the following as an all-in-one solution:

  • auto-instrumentation (following the definitions linked above, especially from the spec)
  • backend connection (exporters, connection management, encryption)
  • telemetry processing (sensitive data filters, sampling, batching, ...)
  • debugging options / self-telemetry (writing logs, otel on otel, ...)
  • extensions (as Java describes here)
  • runtime configuration (SDK Configuration, Exporter Configuration, Instrumentation Configuration, Debugging Configuration, ...)
  • decent defaults
  • packaging for all of that.

@pellared
Member

pellared commented Sep 26, 2022

I see your concern... But right now I have no good answer (nor opinion) for it 😢

My question is, what do we call the piece of software that is attached to an application (ideally without touching the code) and then does the following as an all-in-one solution:

"auto-instrumentation" (sic!)? I mean, it is hard to draw hard boundaries here, as what gets instrumented automatically, and how, is very language/technology specific. Some of the things you mentioned are provided by "auto-instrumentation", some by the SDKs, and it depends on the implementation. I think it is even possible that the auto-instrumentation has different defaults than the SDK.

My only idea is "auto-instrumentation distribution". I do not think it violates the term described here. However, I am not sure if it will not bring even more confusion...

@svrnm
Member Author

svrnm commented Sep 27, 2022

I see your concern... But right now I have no good answer (nor opinion) for it 😢

That makes two of us :D

My question is, what do we call the piece of software that is attached to an application (ideally without touching the code) and then does the following as an all-in-one solution:

"auto-instrumentation" (sic!)? I mean, it is hard to draw hard boundaries here, as what gets instrumented automatically, and how, is very language/technology specific. Some of the things you mentioned are provided by "auto-instrumentation", some by the SDKs, and it depends on the implementation. I think it is even possible that the auto-instrumentation has different defaults than the SDK.

I hear you, but I am not happy with that: based on the conversation we have had, I learned that "auto-instrumentation" is well-defined in the spec as telemetry collection methods that do not require the end user to modify the application's source code, so it's only about the part where instrumentation libraries are injected into the service without touching the code. I like that very much.

My only idea is "auto-instrumentation distribution". I do not think it violates the term described here. However, I am not sure if it will not bring even more confusion...

Yes, I think that's probably the term I am looking for. It follows the description of Distribution that we already have, and Python is doing exactly that. Also, after checking some vendors, I see many having a "Distribution of OpenTelemetry (for) Language" that includes the things I called out above, so maybe it's something like

  • Distribution of OpenTelemetry for Language
  • Reference Distribution of OpenTelemetry for Language
  • Community Distribution of OpenTelemetry for Language
  • Vanilla Distribution of OpenTelemetry for Language
  • OpenTelemetry Distribution of OpenTelemetry for Language (just kidding...)
  • ...

Is this the right direction???

@austinlparker
Member

I don't have full commentary available for this yet, but I wanted to briefly discuss the 'distribution' concept and why it's been bothering me.

We spent a significant amount of time in the planning, ideation, and initial implementation of OpenTelemetry to preserve the OpenTracing goal of decomposition between interface and implementation. The goal behind this was to reduce the chance that a full-SDK wrap would be used for end-user implementations of OpenTelemetry.

This, however, isn't necessarily how things have come to pass. I think we need to find wording that talks about extensions to OpenTelemetry, and perhaps that can be the sword that cuts this Gordian knot.

If we assume that the OpenTelemetry API and SDK are, respectively, the interface and the implementation, then how do we classify instrumentation libraries today? We don't do a great job at it; we toss them in contrib and colloquially refer to them as 'instrumentation libraries'. This is a distinction without a difference; the API itself is the 'instrumentation library', as it's the interface to the instrumentation methods that produce telemetry data.
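
The interface/implementation split described above can be sketched in a few lines. This is a simplified illustration only: `Tracer`, `NoOpTracer`, `RecordingTracer`, `set_tracer`, and `get_tracer` are hypothetical names for this example and do not mirror the real OpenTelemetry API or SDK signatures.

```python
from abc import ABC, abstractmethod


class Tracer(ABC):
    """The interface (the "API") that libraries and applications code against."""

    @abstractmethod
    def start_span(self, name):
        ...


class NoOpTracer(Tracer):
    """Default used when no implementation is registered; does nothing."""

    def start_span(self, name):
        return None


class RecordingTracer(Tracer):
    """A stand-in implementation (the "SDK") that keeps span names in memory."""

    def __init__(self):
        self.spans = []

    def start_span(self, name):
        self.spans.append(name)
        return name


_tracer = NoOpTracer()


def set_tracer(tracer):
    """Register an implementation; typically done once at startup."""
    global _tracer
    _tracer = tracer


def get_tracer():
    return _tracer


def handle_request():
    """Instrumented code depends only on the interface, never on the SDK."""
    get_tracer().start_span("handle_request")
    return "done"


handle_request()          # no implementation registered yet: the call is a no-op
sdk = RecordingTracer()
set_tracer(sdk)           # an SDK, agent, or distribution would do this wiring
handle_request()          # now the same code produces telemetry
```

In this picture, the open naming question in the thread is what to call the piece that performs the `set_tracer(...)` step, and everything bundled with it, on the user's behalf.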

My suggestion is that we classify components somewhat as such:

  • AGENTS (JVM agent, .net agent, etc.) are distributions of the OpenTelemetry API and SDK that provide configuration, initialization, instrumentation, and registration helpers. They allow for "zero code change" implementations of OpenTelemetry and are always external processes.
  • INSTRUMENTATION EXTENSIONS (net/http wrapper, various contrib libraries, etc.) are plug-ins for the OpenTelemetry API and SDK that add telemetry to existing libraries.
  • DISTRIBUTIONS (lightstep launcher/honeycomb launcher, aws distro, etc.) are bundles of the api, sdk, extensions, and agents that aid in specific goals such as vendor compatibility

@reyang
Member

reyang commented Sep 29, 2022

My suggestion is that we classify components somewhat as such:

  • AGENTS (JVM agent, .net agent, etc.) are distributions of the OpenTelemetry API and SDK that provide configuration, initialization, instrumentation, and registration helpers. They allow for "zero code change" implementations of OpenTelemetry and are always external processes.

The term "agent" is already super confusing (https://en.wikipedia.org/wiki/Software_agent); I hope we can avoid adding more confusion.

@svrnm
Member Author

svrnm commented Sep 30, 2022

  • AGENTS (JVM agent, .net agent, etc.) are distributions of the OpenTelemetry API and SDK that provide configuration, initialization, instrumentation, and registration helpers. They allow for "zero code change" implementations of OpenTelemetry and are always external processes.

The term "agent" is already super confusing (https://en.wikipedia.org/wiki/Software_agent); I hope we can avoid adding more confusion.

The more I think about that topic, the more I get back to thinking that "agent" is a good term:

  • It's an industry standard that has been used across vendors for many years now. People know what to expect from an agent, and people will use that term anyway. I have tried for a long time to tell people that there is no OTel agent, but everybody around me uses that term anyway.
  • You can also read those Wikipedia definitions as supporting that: the agent does instrumentation on behalf of the end users. The subsection on that page on Monitoring-and-surveillance (predictive) agents links to the page Monitoring and surveillance agents, which states "Monitoring and surveillance agents are often used to monitor complex computer networks to predict when a crash or some other defect may occur."
  • We can help remove some of that confusion by being specific about the type of agent we are talking about, so in writing let's call it "(auto)instrumentation agent".

If we decide not to use it, we need to provide an alternative to describe that bundle of "things".

@svrnm
Member Author

svrnm commented Sep 30, 2022

  • AGENTS (JVM agent, .net agent, etc.) are distributions of the OpenTelemetry API and SDK that provide configuration, initialization, instrumentation, and registration helpers. They allow for "zero code change" implementations of OpenTelemetry and are always external processes.
  • INSTRUMENTATION EXTENSIONS (net/http wrapper, various contrib libraries, etc.) are plug-ins for the OpenTelemetry API and SDK that add telemetry to existing libraries.
  • DISTRIBUTIONS (lightstep launcher/honeycomb launcher, aws distro, etc.) are bundles of the api, sdk, extensions, and agents that aid in specific goals such as vendor compatibility

To clarify that a little bit, what's the difference for you between AGENT and DISTRIBUTION?

@svrnm
Member Author

svrnm commented Sep 30, 2022

A third one on "auto-instrumentation" based on our slack discussion at #otel-comms yesterday & this existing discussion on that topic via @cartermp.

I am starting to understand that using the term "auto-instrumentation" for different things is OK, because it's a broad term that just says "I didn't instrument that manually". As @cartermp said: "[...] the presence of many different ways to get some degree of instrumentation created for you makes it better to just call it “automatic instrumentation”. Since there’s several mechanisms by which you can get this instrumentation, some offering more instrumentation than others, it’s important to distinguish those mechanisms, but I don’t think we should be in the business of only calling something automatic instrumentation if it comes from some “agent-like” thingy (which we do today)"

So, you can say "I auto-instrumented my application with an agent" or "I auto-instrumented my dependencies by loading instrumentation libraries" or "I use library x which has otel natively, so I get instrumentation automatically", etc.

What we (aka the docs team) need to do eventually is make sure that when end users read about "automatic instrumentation" for Java, Python, Ruby, or Node.js, they are presented with something that gives them details on the different "automatic instrumentations":

  • How they can use a library that is natively instrumented
  • How they can use a single instrumentation library
  • How they can use a bundle of instrumentation libraries
  • How they can use an "agent" / "distribution" that does some things extra

If this makes sense, I would like to split this out from this discussion into a separate issue, and we can keep the discussion here going on agents, instrumentation extensions & distributions :D

@pellared
Member

Writing my thoughts before I forget.

I think that also each "auto-instrumentation" tool/agent/library/whatever should provide a high-level description of what this auto-instrumentation is and how it works. Here is how we try to do it for .NET.

@austinlparker
Member

To clarify that a little bit, what's the difference for you between AGENT and DISTRIBUTION?

The primary distinction between an agent and a distribution in my mind is who is providing it. I would suggest that the OpenTelemetry project will never release a distribution of OpenTelemetry. However, I would speculate that eventually a company like GCP or AWS will release a distribution of OpenTelemetry for their clouds. Similarly, I could see point monitoring solutions creating a distribution of OpenTelemetry for their tool, or serverless frameworks, etc.

An agent, meanwhile, could be distributed by OpenTelemetry or third parties; the distinction there is that an agent must provide instrumentation of a system or service with no code changes.

On Auto-Instrumentation as a Noun vs. a Verb

As someone who originally was a strong proponent of grouping agents and instrumentation extensions into the catch-all term of 'automatic instrumentation', my mind has been changed by speaking to some users who are very confused by our usage of the word. Effectively, the "auto-instrumentation == instrumentation libraries == agents" comes from the concept that using a noun to describe a verb is sus. However, this isn't something that's readily apparent to the lay community. People hear 'auto instrumentation' as a noun; it describes a class of thing, not an action.

On not using 'instrumentation library' as a class

Instrumentation library is problematic as a class descriptor for libraries that perform automatic instrumentation because the OpenTelemetry SDK is also an instrumentation library. To illustrate this point, I'm going to write the same statement twice but drop proper nouns in the second.

  1. "In order to generate telemetry data from a service, you need to instrument it for observability. First, you'll install the OpenTelemetry SDK, as well as instrumentation libraries for frameworks and clients that you may be using."
  2. "In order to generate telemetry data from a service, you need to instrument it for observability. First, you'll install the instrumentation library, as well as instrumentation libraries for frameworks and clients that you may be using."

See the problem? If we rework this, though...

  1. "In order to generate telemetry data from a service, you need to instrument it for observability. First, you'll install the instrumentation library, as well as instrumentation extensions for frameworks and clients that you may be using."

@tsloughter
Member

@svrnm I don't see how your examples of Java, Ruby, Python and Javascript are different meanings for "automatic instrumentation". All of them require no code added for a particular instrumentation, the only difference I see is (maybe, this isn't clear) whether the SDK requires code changes to be initialized or not?

(And yes, please do not use "agent" for anything :)

@pellared
Member

pellared commented Sep 30, 2022

whether the SDK requires code changes to be initialized or not

This is how I understand the main difference. It should be possible to set up without making ANY changes in the application code 😉 E.g. I would say that JavaScript has instrumentation libraries (but no auto-instrumentation).

I would say that Python has auto-instrumentation. Ruby at first glance looks like library instrumentation.

@svrnm
Member Author

svrnm commented Oct 4, 2022

@austinlparker thanks for the clarification on AGENT vs DISTRIBUTION, makes sense to me, a few comments nevertheless:

The primary distinction between an agent and a distribution in my mind is who is providing it. I would suggest that the OpenTelemetry project will never release a distribution of OpenTelemetry.

Python is doing that already

However, I would speculate that eventually a company like GCP or AWS will release a distribution of OpenTelemetry for their clouds. Similarly, I could see point monitoring solutions creating a distribution of OpenTelemetry for their tool, or serverless frameworks, etc.

Those exist already, right? We have ADOT from Amazon and a bunch of vendors having their "distribution".

Effectively, the "auto-instrumentation == instrumentation libraries == agents" comes from the concept that using a noun to describe a verb is sus.

Agreed.

"In order to generate telemetry data from a service, you need to instrument it for observability. First, you'll install the instrumentation library, as well as instrumentation extensions for frameworks and clients that you may be using."

Now I get it, and I agree with that one as well. I see 2 problems:

  1. the word "instrumentation library" is out there already, so we would need to verify if we can change that to something else without breaking things (e.g. if a language has a package called framework-instrumentation-library things get complicated)
  2. I am not 100% sure about extension, we have Extensions for java already, so you have to be very clear that it is "instrumentation extension"

@tsloughter

@svrnm I don't see how your examples of Java, Ruby, Python and Javascript are different meanings for "automatic instrumentation". All of them require no code added for a particular instrumentation, the only difference I see is (maybe, this isn't clear) whether the SDK requires code changes to be initialized or not?

Similar to what @austinlparker said, they are all "automatic instrumentation", and they do not actually have different meanings, but the difference in mechanisms (using @cartermp's word here :) ) is still relevant to the end user. In the end we want them to understand that they have multiple ways of accomplishing "auto instrumentation", and all of them are valid based on where they are coming from:

  • automatic instrumentation + all the things listed before via an agent (the "all-in-one" solution) is great for Ops, to get otel out of a "black-box" application
  • automatic instrumentation through bundles of instrumentation extensions/libraries is great for "grey-box" applications, i.e. where an ops or dev person doesn't know for sure what is in that application, so they just add everything and see what happens, but they still make code changes like SDK initialization, etc.
  • automatic instrumentation through a single instrumentation extension/library is great for developers who know exactly what libraries they use and just want to add instrumentation to libs that don't have it yet ("white box")
  • automatic instrumentation coming from libraries that have otel natively is our happy place :)

@pellared

whether the SDK requires code changes to be initialized or not

This is how I understand the main difference. It should be possible to set up without making ANY changes in the application code 😉 E.g. I would say that JavaScript has instrumentation libraries (but no auto-instrumentation).

I would say that Python has auto-instrumentation. Ruby at first glance looks like library instrumentation.

If I understand you correctly, you are saying that something only counts as "auto-instrumentation" if no code is touched at all? That comes down to "auto instrumentation == agent". I have my issues with that, for the same reasons @austinlparker gave above: saying/writing things can get confusing:

  • Auto-instrumentation of an application => true auto instrumentation?
    vs
  • Auto-instrumentation of a library => don't say that?

(And yes, please do not use "agent" for anything :)

Sorry for this flippant comment: give me a better word and I am going to use it -- that has been the purpose of this ticket all along, right? And "auto instrumentation" is not a good replacement: seeing the discussion we have on that word, I think agent is the lesser of two evils. (A combination of both, "auto instrumentation agent", is something I use a lot lately as well.)

I understand that agent as a term is ambiguous (and disliked by many in the OTel community), but for a year now I have tried within my circles to steer people away from "agent", and everybody (including myself) goes back to using that word, because (a) we do not have a better word and (b) it has been around for 15+ years now, used by APM vendors (AppDynamics, Dynatrace, NewRelic ...) & OSS projects (SkyWalking, PinPoint) alike. Changing that in the minds of end users is a gigantic task.

Note: There are some vendors using different terms, because they reserve "agent" for that out-of-process piece (DataDog with tracers, Instana with sensors, if I checked both correctly), so those are alternatives, we just have to agree on one :-) (I don't like tracer & sensor for a variety of reasons...)

@pellared
Member

pellared commented Oct 4, 2022

@svrnm @austinlparker

I am missing something. Where is auto-instrumentation used as a verb? For me it is a noun. Personally, I understand auto-instrumentation as "a method of getting the application instrumented without touching the application's source code". Using a .NET Profiler is a method to instrument C# apps, using a JVM Agent is a method to instrument Java apps, using eBPF uprobes is a method to instrument Go apps.

The agent suggests that it is a "separate process". I personally think that e.g. providing a compiler that would build an application with instrumentation is also auto-instrumentation. Would you call it an agent?

EDIT:

I do not have a better word than "auto-instrumentation". I think we should clarify the word and maybe give more examples in the definition. I already tried it (open-telemetry/opentelemetry-specification#2700), but maybe someone else could do it a lot better than me 😉

EDIT 2:

Another try open-telemetry/opentelemetry-specification#2853 😄

PS. OTel Collector is an agent 😄

@theletterf
Member

Agent is overloaded (as "component"), old, and not great, but it does the job, in my opinion. It also bears nasty negative SecOps connotations, but everyone seems to understand what an agent is in the context of automatic instrumentation. I've asked users and they understood what agent implied. So I wrote this definition for the Splexicon:

A software tool or component that processes and forwards software telemetry to an observability back end. In the context of application monitoring, agents instrument applications to collect spans, traces, logs, and metrics.

There are two types of observability agents:

  • Agents that collect infrastructure telemetry, which run in the background as daemons or services
  • Agents that instrument software applications by attaching to the application as packages or components

On the other hand, auto-instrumentation is long, not entirely true, and has a hyphen that causes lots of trouble to documentarians.

I'd stick with agent whenever we're talking about automatic or semi-automatic layers that help apply instrumentation to software. Another option is going the AWS Lambda route and calling them "Layers". But it'll take years of promotion and user education to achieve a change like that.

Back to the original question: if we don't have an alternative to agent or layer, the problem cannot be solved right now and we should continue with what we have, that is, agent or layer.

@svrnm
Member Author

svrnm commented Oct 4, 2022

@svrnm @austinlparker

I am missing something. Where is auto-instrumentation used as a verb? For me it is a noun. Personally, I understand auto-instrumentation as "a method of getting the application instrumented without touching the application's source code". Using a .NET Profiler is a method to instrument C# apps, using a JVM Agent is a method to instrument Java apps, using eBPF uprobes is a method to instrument Go apps.

I agree that instrumentation achieved via bytecode instrumentation, monkey patching, eBPF uprobes, or any similar means is "auto-instrumentation"; for me it's the overarching term. What I want a term for is that layer of software that is applied at runtime to code so that instrumentation (+everything else) is accomplished for me automatically.

The agent suggests that it is a "separate process".

I disagree: a JVM Agent, for example, has never been a "separate process".

I personally think that e.g. providing a compiler that would build an application with instrumentation is also auto-instrumentation. Would you call it an agent?

No. Here's where I would like to use agent: for an in-process, at-runtime, code-changing layer that is injected into your application without you touching your code yourself -> the agent is acting on your behalf to accomplish auto instrumentation.

EDIT:

I do not have a better word than "auto-instrumentation". I think we should clarify the word and maybe give more examples in the definition. I already tried it (open-telemetry/opentelemetry-specification#2700), but maybe someone else could do it a lot better than me 😉

I am fine with "auto-instrumentation" as the overarching term; what I want is to be able to say that there are different building blocks/mechanisms and that one of them is an "agent".

EDIT 2:

Another try open-telemetry/opentelemetry-specification#2853 😄

👍

PS. OTel Collector is an agent 😄

Yes, but a different kind of agent;-)

@svrnm
Member Author

svrnm commented Oct 4, 2022

Agent is overloaded (as "component"), old, and not great, but it does the job, in my opinion.

💯

It also bears nasty negative SecOps connotations,

That's not a bug, it's a feature: agents ARE a security issue and I had many customers who were not keen on having self-installing, self-updating agents ...

but everyone seems to understand what an agent is in the context of automatic instrumentation. I've asked users and they understood what agent implied.

👍

On the other hand, auto-instrumentation is long, not entirely true, and has a hyphen that causes lots of trouble to documentarians.

I think that's what @austinlparker meant when he said "a noun vs a verb": The agent is doing auto-instrumentation, aka the agent is the acting noun, the auto-instrumentation is the thing the noun is doing (the verb), even if it's not used as a verb in that sentence.

I'd stick with agent whenever we're talking about automatic or semi-automatic layers that help applying instrumentation to software. Another option is going the AWS Lambda route and call them "Layers". But it'll take years of promotion and user education to achieve a change like that.

And layer is close to "thing"...

Back to the original question: if we don't have an alternative to agent or layer, the problem cannot be solved right now and we should continue with what we have, that is, agent or layer.

:-)

@reyang
Member

reyang commented Oct 4, 2022

How do we describe the following scenario:

Process A                             Process B
+--------------------------------+    +------------+
|                                |    |            |
| +------------------+           |    |            |
| | User application |           +----> Collector  |
| +------------------+           |    |            |
|                                |    |            |
| Some auto-instrument mechanism |    |            |
+--------------------------------+    +------------+

Many existing solutions call Process B an "agent". If we also call the "auto-instrumentation mechanism" an "agent", it'll be very confusing. In addition, if the auto-instrumentation technology is something like code weaving, do we still call it an "agent"?

@tsloughter
Member

I disagree, a JVM Agent for example never has been a "separate process".

That may be unique to the JVM. I always think of an agent as a separate process -- though I may be the "unique" one :)

@reyang
Member

reyang commented Oct 4, 2022

I disagree, a JVM Agent for example never has been a "separate process".

That may be unique to the JVM. I always think of an agent as a separate process -- though I may be the "unique" one :)

@tsloughter You're definitely not the "unique" one. I also feel that JVM Agent is a special case, as I've suggested here #1689 (comment)

@theletterf
Member

theletterf commented Oct 5, 2022

@reyang @tsloughter I suspect we might be falling into a "curse of knowledge" issue here, in that "agent" has very specific meanings for certain software development environments or languages. But so do other terms, like "component" or "method" or "layer". We have to compromise a bit.

The diagram by @reyang is a good starting point for a language conversation. In my opinion, it's not a big issue to have more than one agent in the picture: one is an APM agent, the other is an infrastructure / forwarding agent. Pardon the frivolity here, but: Let there be agents. :-)

image

The picture is actually quite related to @svrnm 's excellent remark, which is a benefit of using the word agent:

That's not a bug, it's a feature: agents ARE a security issue and I had many customers who were not keen on having self-installing, self-updating agents ...

@svrnm
Member Author

svrnm commented Oct 5, 2022

@tsloughter you are unique 🤩 but not because you're thinking that "agent" has to be a separate process. I read & hear this a lot, although I disagree. I used the JVM Agent just as an example. It's common among APM vendors to call the ".NET Profiler" or the equivalent in-process layer doing instrumentation for Python, PHP, Ruby, Node.JS an "agent" as well.

Thanks @theletterf for bringing up that point that there can be many agents (like on that picture) and this has been the case in monitoring forever (Infra Agents, Forwarding Agents, Synthetic Agents, Browser Agents, APM Agents, Log Collection Agent, etc.)

I am OK with getting rid of the word "agent", if we have an alternative word for a software layer that is injected into an application to modify code at runtime to accomplish not only auto-instrumentation, but also initialization, exporting, runtime configuration, self-telemetry & some more. I am happy to brainstorm on that (some candidates so far are "auto-instrumentation", "profiler", "tracer" and "sensor"), but until then I will stick with agent :-)

The purpose of this issue is that, I want to write a few words in the documentation for end-users that look for their Java,.NET,Python,Ruby,PHP,Node.JS Agent, to say "there is no agent, but there is X"

Right now I need to write: Hey dear APM user, who is used to "throw an agent against an application and what falls down into your backend are traces & metrics", in OpenTelemetry we don't have an agent for all the languages. What we have is a JVM Agent for Java, a CLR Profiler for .NET, a Python tool called "opentelemetry-instrument", a tracing.js for Node.js that you have to cobble together yourself, etc.

@theletterf
Member

theletterf commented Oct 5, 2022

Entering brainstorming mode...

@svrnm Some alternatives to agent that might fit your description:

  • Launcher (my favorite)
  • Enabler
  • Sidecar
  • Driver
  • Catalyst
  • Companion
  • Carrier

@reyang
Member

reyang commented Oct 5, 2022

The purpose of this issue is that, I want to write a few words in the documentation for end-users that look for their Java,.NET,Python,Ruby,PHP,Node.JS Agent, to say "there is no agent, but there is X"

Right now I need to write: Hey dear APM user, who is used to "throw an agent against an application and what falls down into your backend are traces & metrics", in OpenTelemetry we don't have an agent for all the languages. What we have is a JVM Agent for Java, a CLR Profiler for .NET, a Python tool called "opentelemetry-instrument", a tracing.js for Node.js that you have to cobble together yourself, etc.

I think that thing is called auto-instrumentation? Here is my simple proposal:

Hey dear APM user, if you want to instrument your applications without having to manually instrument your code, use the OpenTelemetry auto-instrumentation.

It seems multiple implementation SIGs already kind of chose to use it in the repo URI:

Process A                             Process B
+--------------------------------+    +---------------+
|                                |    |               |
| Your application, where you    |    | OpenTelemetry |
| can use auto-instrumentation,  +----> Collector     |
| manual-instrumentation, or a   |    |               |
| combination of both.           |    |               |
|                                |    |               |
+--------------------------------+    +---------------+

@pellared
Member

pellared commented Oct 6, 2022

I think we are nowhere near changing the name to "agent".

Still, this issue uncovers some problems that we currently have (as a whole community).

One of the problems is that the repository names for automatic instrumentation have -instrumentation suffixes. Is it manual? Or an instrumentation library? I try to address it here

I also want to point out that the term agent is not even mentioned in the spec's glossary. If we want to change the terminology, then the Specification SIG is the place to proceed.

Note that some OTel components like https://opentelemetry.io/docs/instrumentation/java/ and https://opentelemetry.io/docs/collector/ are indeed agents, and this is explained in their descriptions.

As a community, we should also pay more attention to properly naming things according to our defined and agreed terminology. For example, if someone says something about OTel .NET Agent, then we must tell them that there is no such thing. We should describe that in OTel we use the term "auto-instrumentation" to stress that it is not a standalone process but a way of instrumenting the application without touching the app's source code. Finally, the word "auto-instrumentation" is more precisely defined than "agent". One of the goals of OpenTelemetry is to establish standards that help in communication. For me, the name "agent" would not help with that in the long term.

@svrnm
Member Author

svrnm commented Oct 6, 2022

I think we are nowhere near changing the name to "agent".

I can live with that

Still, this issue uncovers some problems that we currently have (as a whole community).

Agreed

As a community, we should also pay more attention to properly naming things according to our defined and agreed terminology.

💯

That's the thing we are circling right now! And I called it out before, that different projects use the term "auto instrumentation" for things that are similar but not the same:

  • Java Auto Instrumentation is that "all in one" thing what I have in mind when I talk about an agent (it does not only include do-not-touch-my-code instrumentation of my application but also "do-not-touch-my-code" SDK initialization, exporter setup, sampling setup, resource detection, runtime configuration, self-telemetry, extensions (java has them), eventually integration of a control plane client (like OpAMP), etc. etc. )
  • .NET seems to move towards a similar direction
  • Python confuses me a little bit, to be honest: there is the opentelemetry-instrument tool for auto instrumentation, but there is also the Distro, which sets defaults for exporters and some more. But opentelemetry-instrumentation is the package that loads all the instrumentation libraries, which then do the real instrumentation.
  • Node.JS asks you to wire everything together via the SDK, and then use a package that bundles all instrumentation libraries and that one is called auto instrumentation
  • ruby says that Automatic instrumentation in ruby is done via instrumentation packages, and most commonly, the opentelemetry-instrumentation-all package.
  • PHP & Go just started their journey on automatic instrumentation
  • Erlang has a section on Library Instrumentation which states that Library instrumentations, broadly speaking, refers to instrumentation code that you didn’t write but instead include through another library. -- the same thing that Ruby & Node.JS call "Auto Instrumentation"

One of the problems is that the repository names for automatic instrumentation have -instrumentation suffixes. Is it manual? Or an instrumentation library? I try to address it here

Adding to that: there is no consistency what you can find in an opentelemetry--<core|contrib|instrumentation> repository

We should describe that in OTel we use the term "auto-instrumentation" to stress that it is not a standalone process but a way of instrumenting the application without touching the app's source code. Finally, the word "auto-instrumentation" is more precisely defined than "agent".

Again, I agree that "agent" is a shitty term, but I disagree that "auto-instrumentation" is more precisely defined. Copying a little bit from what I wrote above: does auto instrumentation only mean instrumentation of my code via instrumentation libraries? Or does it also include do-not-touch-my-code for everything else the end-user needs (SDK initialization, exporter setup, sampling setup, resource detection, runtime configuration, self-telemetry, extensions, control plane client)?

Right now we have no common ground for that. I see two options now:

  1. Automatic Instrumentation = All-In-One Solution , <new word> = bundle to instrument all my libraries
  2. <new word> = All-In-One Solution, Automatic Instrumentation = bundle to instrument all my libraries
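
As a rough illustration of the two meanings, the sketch below contrasts a "bundle" that only switches on instrumentation libraries with an "all-in-one" layer that also configures the pipeline. `instrument_all`, `all_in_one`, `INSTRUMENTED`, and `PIPELINE` are hypothetical names; the environment variable names are borrowed from the OpenTelemetry SDK configuration conventions purely for flavour, and nothing here calls a real OpenTelemetry API.

```python
# Hypothetical state, standing in for patched libraries and SDK configuration.
INSTRUMENTED = set()
PIPELINE = {}


def instrument_all(libraries):
    """Bundle: only enables instrumentation for each listed library."""
    for lib in libraries:
        INSTRUMENTED.add(lib)


def all_in_one(libraries, env):
    """All-in-one: the bundle plus pipeline setup driven by configuration."""
    instrument_all(libraries)
    PIPELINE["exporter"] = env.get("OTEL_TRACES_EXPORTER", "otlp")
    PIPELINE["service_name"] = env.get("OTEL_SERVICE_NAME", "unknown_service")


# With the bundle alone, the user still has to wire up PIPELINE by hand.
all_in_one(["http", "sql"], {"OTEL_SERVICE_NAME": "checkout"})
```

Whichever option is picked, the naming question is which of these two functions gets to be called "Automatic Instrumentation".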

@tsloughter
Member

Erlang has a section on Library Instrumentation which states that Library instrumentations, broadly speaking, refers to instrumentation code that you didn’t write but instead include through another library. -- the same thing that Ruby & Node.JS call "Auto Instrumentation"

We try to be clear that this isn't auto-instrumentation with the following sentence:

OpenTelemetry for Erlang/Elixir supports this process through wrappers and helper functions around many popular frameworks and libraries.

Maybe it should explicitly say it is "not auto-instrumentation".

@pellared
Member

pellared commented Oct 6, 2022

@svrnm Based on what you described Ruby io docs should be changed from "Automatic" to "Libraries" (or at least "Instrumentation").

@open-telemetry/ruby-maintainers Do you agree? 👆

@svrnm Regarding auto in package names. You can try creating an issue but I think it would be a breaking change 😒

@svrnm Regarding

Or does it also include do-not-touch-my-code for everything else the end-user needs (SDK initialization, exporter setup, sampling setup, resource detection, runtime configuration, self-telemetry, extensions, control plane client)?

I think all current (Java, .NET, Python) auto-instrumentation does all of that, right?

@pellared
Member

pellared commented Oct 6, 2022

-> Automatic Instrumentation = All-In-One Solution, <new word> = bundle to instrument all my libraries

You found the name "instrumentation libraries bundle" 🎉
I think they could be named e.g. like lang-instrumentation-all 😉

@svrnm
Member Author

svrnm commented Oct 6, 2022

@tsloughter I don't think that you need to change anything in the erlang doc here, I just raised it in comparison to ruby/nodejs.

Based on what you described Ruby io docs should be changed from "Automatic" to "Libraries" (or at least "Instrumentation").

From an end-user perspective I find this problematic: people are looking for "Automatic", and would not expect to find what they are looking for in a page called "Libraries" ... "Instrumentation" might work (I think we have it with some languages) but it's still confusing because we also have "Manual Instrumentation". So

  1. we need an "Auto-Instrumentation" for all languages eventually.
  2. we need something good to close the gap until we have it. This is more a documentation issue than a semantics issue; maybe we can keep "Automatic" and call out that language X does not have this yet, but in the meantime you can get as close an experience as possible by doing A, B, C

Regarding auto in package names. You can try creating an issue but I think it would be a breaking change 😒

We should at least raise it with the communities.

I think all current (Java, .NET, Python) auto-instrumentation does all of that, right?

Java, .NET: yes
python: as said, I am still not sure if I understand exactly how things are cobbled together here, since there are multiple building blocks (opentelemetry-distro, opentelemetry-instrumentation, opentelemetry-instrument)

@svrnm
Member Author

svrnm commented Dec 12, 2022

The initial purpose of this ticket is accomplished: the concepts now mention that someone who is looking for an APM agent should look for Automatic Instrumentation. The rest of the discussion remains with the spec issue (2866).
