Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identify and move components from core to contrib to support trace stability milestone #3474

Closed
alolita opened this issue Jun 21, 2021 · 32 comments
Assignees
Labels
priority:p0 Critical

Comments

@alolita
Copy link
Member

alolita commented Jun 21, 2021

After discussion with the Collector maintainers, we plan to identify and move components from collector core to collector-contrib to support an imminent trace stability goal for the core Collector. This move will help us declare a stable Collector core.

This includes:

  • Identifying components to be moved (e.g. processors) to contrib
  • Building a list of these components to be moved
  • Track the move of each of these components via individual issues
  • Communicate this move on Slack channels and with a blog post
  • Ensure build scripts are updated to reflect a smaller core and larger contrib
@tigrannajaryan
Copy link
Member

tigrannajaryan commented Jun 21, 2021

I suggest that we:

Keep in the core:

Exporters

  • jaeger
  • zipkin
  • opencensus
  • otlp
  • otlphttp
  • logging

Receivers:

  • jaeger
  • opencensus
  • zipkin
  • otlp
  • prometheus (?) <-- Not sure about this one. It may be considered essential since this is the only way to scrape Collector's own metrics.

Processors

  • batch
  • memorylimiter

Extensions:

  • ballast
  • healthcheck
  • pprof
  • zpages

Move to contrib:

Exporters:

  • file
  • kafka
  • prometheus
  • prometheusremotewrite

Receivers:

  • hostmetrics
  • kafka

Processors

  • attributes
  • filter
  • probabilisticsampler
  • resource
  • span

Extensions:

  • bearertokenauth
  • oidcauth

Thoughts @bogdandrutu @alolita ?

@bogdandrutu
Copy link
Member

bogdandrutu commented Jun 22, 2021

  • I think "ballast" is an important functionality and we want to keep it in core (unless we don't finish the implementation).
  • Do we need OpenCensus in core? Maybe we can consider to move it to contrib?

@jrcamp
Copy link
Contributor

jrcamp commented Jun 22, 2021

  • Do we need OpenCensus in core? Maybe we can consider to move it to contrib?

Prometheus is still using OpenCensus so would need to be migrated to pdata or both go to contrib. (edit: it's only using OpenCensus modules so actually unrelated to receiver) I think Prometheus should go to contrib though since it is responsible for a large number of the dependencies in core (10's of MB as I recall).

@tigrannajaryan
Copy link
Member

I think Prometheus should go to contrib though since it is responsible for a large number of the dependencies in core (10's of MB as I recall).

I don't mind moving. Everybody happy with moving Prometheus receiver to contrib?

@tigrannajaryan
Copy link
Member

  • Do we need OpenCensus in core? Maybe we can consider to move it to contrib?

Seems like OpenCensus is a pretty stable codebase and does not require much maintenance. I don't mind moving to contrib though if that's what we want.

@alolita
Copy link
Member Author

alolita commented Jun 22, 2021

@tigrannajaryan this list looks like a good first start. A couple of concerns -

  1. Will moving the Prometheus receiver to contrib still enable us to continue using Collector releases for Prometheus pipelines? What is the impact on the build and release process?
  2. Will moving Open Census break Prometheus pipelines? What is the impact on releases?
    I'd like to have a better handle on the release workflows.

@alolita
Copy link
Member Author

alolita commented Jun 22, 2021

Since we are working on ensuring first-class Prometheus interoperability, I propose -

Preferred Option: we keep the Prometheus components in core as-is where the Prometheus receiver, Prometheus remote write exporter (push-based) and Prometheus export (pull-based) stay in Collector core repo. At a minimum, we would like to see the Prometheus receiver and Prometheus remote write exporter in core to maintain stability criteria to ensure all PRW compliance tests are passing.

Alternative Option - Since we are aiming for stability for core Collector functionality, I suggest we have a repo hosting all Prometheus components - Prometheus receiver, PRW exporter and Prometheus exporter (which needs some TLC).

I don't think moving Prometheus components into contrib would be an optimal choice.

Look forward to discussing any concerns, thoughts.

@jmacd
Copy link
Contributor

jmacd commented Jun 23, 2021

It may be considered essential since this is the only way to scrape Collector's own metrics.

This is a bit surprising. Is there a long term plan to remove this dependency? Could we imagine using an OTel-Go SDK?

@tigrannajaryan
Copy link
Member

It may be considered essential since this is the only way to scrape Collector's own metrics.

This is a bit surprising. Is there a long term plan to remove this dependency? Could we imagine using an OTel-Go SDK?

Yes, most likely that's what we will do. IIRC, we had some dependency conflicts that prevented us to do this easily.

@alolita
Copy link
Member Author

alolita commented Jun 24, 2021

Can we have a wider audience for these discussions on metrics scraping rearchitecture? Sorry I missed the IIRC discussion.

@jrcamp
Copy link
Contributor

jrcamp commented Jun 24, 2021

See #1093

@alolita
Copy link
Member Author

alolita commented Jun 29, 2021

I don't think we will be ready to move Prometheus receiver into contrib until the Go SDK is 1.0 (stable).

@alolita
Copy link
Member Author

alolita commented Jun 29, 2021

Here is the current snapshot that @bogdandrutu and I discussed for list of components in core and to be moved to contrib.

Stays in opentelemetry-collector

CORE Receivers Processors Exporters Extensions
client/ jaeger batch jaeger ballast
cmd/ zipkin memorylimiter zipkin healthcheck
component/ otlp processorhelper otlp pprof
config/ prometheus*   otlphttp zpages
consumer/     logging extensionhelper
internal/     exporterhelper  
model/        
obsreport/        
service/        
translator/        
website_docs/        
docs/        
examples/        

Move to opentelemetry-collector-contrib:

CONTRIB Receivers Processors Exporters Extensions
testbed/ #3636 hostmetrics #3638 attributes #3643 file #3648 bearertokenauth #3651
testutil/ #3637 kafka #3639 filter #3644 kafka #3649 oidcauth #3652
opencensus #3640 probabilisticsampler #3645 prometheus*   
scrapererror #3641 resource #3646 prometheusremotewrite*  
scraperhelper #3642 span #3647 opencensus #3650  

NOTE: Please note we're still discussing location of Prometheus components.
Prometheus receiver
Prometheus exporter
Prometheusremotewrite exporter

@bogdandrutu
Copy link
Member

@alolita we also discussed that we need to look more deeper into config/

@jpkrohling
Copy link
Member

We talked in the past about having distributions as a concept that is unrelated to the code repositories. With that, we could keep the core without concrete receivers/exporters/processors, serving only as an API for downstream distributions (such as Jaeger v2). The core components could be located in a different repository, with distributions being built using the OpenTelemetry Collector Builder. The core distribution would include the components listed in this ticket here. Later on, we could have officially supported alternative distributions, such as a "sidecar" distribution containing only OTLP receivers/exporters.

Are we still considering doing this?

@alolita
Copy link
Member Author

alolita commented Jun 30, 2021

@tigrannajaryan
Copy link
Member

Why do we want to move testbed to contrib? I think it is a critical part of component tests. If we keep the components in the core then we need to keep their tests in the core. If we decide to move all components to contrib (like it was discussed in SIG meeting today) then in that case I agree we need to move the testbed too.

@jpkrohling
Copy link
Member

As discussed during this week's SIG call, here's a draft proposal for splitting the concerns (API consumers vs. distribution users): https://docs.google.com/document/d/1XyYyaYtc1R8-es0BL65XE21dORLQpCPM1Pinw4MyEYs/edit

@jrcamp
Copy link
Contributor

jrcamp commented Jul 2, 2021

I'd like to make a separate proposal: I believe we should combine core and contrib. core would be the API as a separate go module and then each component that was previously in core (zipkin, prometheus, etc.) would become its own go.mod component just like every other contrib module works. This could live in the same repository and prevent the whole "update core then go update contrib". I believe it'd streamline a lot of development. Things would still stay logically separated as they are separate go modules.

I also described this here https://docs.google.com/document/d/1XyYyaYtc1R8-es0BL65XE21dORLQpCPM1Pinw4MyEYs/edit?disco=AAAANPPTudk

@bogdandrutu
Copy link
Member

@jrcamp why is it important that the core and contrib are in the same repository especially when core will become stable?

@bogdandrutu
Copy link
Member

@jpkrohling

We talked in the past about having distributions as a concept that is unrelated to the code repositories. With that, we could keep the core without concrete receivers/exporters/processors, serving only as an API for downstream distributions (such as Jaeger v2). The core components could be located in a different repository, with distributions being built using the OpenTelemetry Collector Builder. The core distribution would include the components listed in this ticket here. Later on, we could have officially supported alternative distributions, such as a "sidecar" distribution containing only OTLP receivers/exporters.

I would like to see this, maybe all distributions are built from the "builder" repository. Maybe a short diagram of where things will be and in what module would help clarifying.

Few things we need to do:

  • The builder should be able to build docker images, linux binaries, macos binaries, windows msi, etc. Need to find where to publish these binaries (right now we publish them to github release as part of the release, but that may not work, or it may).
  • Right now we are discussing mostly about code structure not distributions in this ticket, but we need to determine even with distributions do we enforce OTLP recv/exp to always be present (a.k.a. embed in the core module)?

That's why I don't like issues in github and I think a short document would help us answer all the questions, I feel that everyone has the ideas in their mind but nobody put them all together to see the full picture.

@jrcamp
Copy link
Contributor

jrcamp commented Jul 2, 2021

@jrcamp why is it important that the core and contrib are in the same repository especially when core will become stable?

Just because it's stable doesn't mean new API's won't be added. What that rate will be though I don't really know. I guess I would flip the question and ask what benefit are we gaining by having them be separate?

@bogdandrutu
Copy link
Member

This is done.

@bogdandrutu
Copy link
Member

For the work left we filed separate issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority:p0 Critical
Projects
None yet
Development

No branches or pull requests

7 participants