Add observability capabilities to functions #9

Closed
tpiperatgod opened this issue Dec 2, 2021 · 39 comments

Labels: enhancement (New feature or request)

@tpiperatgod
Member

We need to add observability capabilities to functions, which facilitate observing and tracking the operation of functions in large-scale scenarios.

Referring to #7, we can use a plugin mechanism in functions-framework to invoke the observability component at specific points in a function's lifecycle.

For example, in functions-framework-go, we can add plugin hooks before and after the function runs, and execute the observability logic inside those hooks.

For reference:

func registerHTTPFunction(path string, fn func(http.ResponseWriter, *http.Request), h *http.ServeMux) error {
	h.HandleFunc(path, func(w http.ResponseWriter, r *http.Request) {
		defer recoverPanicHTTP(w, "Function panic")
		// execute pre-run plugins
		fn(w, r)
		// execute post-run plugins
	})
	return nil
}
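
The pre-run and post-run comments above could be filled in roughly as follows. This is only a sketch: the `Plugin` interface, `withPlugins`, and the `recorder` plugin are hypothetical names that are not part of functions-framework-go, and running the post hooks in reverse order is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Plugin is a hypothetical hook interface; the real framework API may differ.
type Plugin interface {
	PreHook(r *http.Request)
	PostHook(r *http.Request)
}

// withPlugins wraps fn so that every PreHook runs before it and every
// PostHook runs after it. Post hooks run in reverse order (an assumption).
func withPlugins(fn http.HandlerFunc, plugins ...Plugin) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		for _, p := range plugins {
			p.PreHook(r)
		}
		fn(w, r)
		for i := len(plugins) - 1; i >= 0; i-- {
			plugins[i].PostHook(r)
		}
	}
}

// recorder is a toy plugin that records the order in which hooks fire.
type recorder struct {
	name string
	log  *[]string
}

func (p recorder) PreHook(*http.Request)  { *p.log = append(*p.log, "pre:"+p.name) }
func (p recorder) PostHook(*http.Request) { *p.log = append(*p.log, "post:"+p.name) }

// runDemo serves one in-memory request and returns the recorded call order.
func runDemo() []string {
	var log []string
	h := withPlugins(func(w http.ResponseWriter, r *http.Request) {
		log = append(log, "fn")
		fmt.Fprint(w, "ok")
	}, recorder{"a", &log}, recorder{"b", &log})
	h(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return log
}

func main() {
	fmt.Println(runDemo())
}
```

An observability plugin would start a span in `PreHook` and finish it in `PostHook`, which keeps the user function itself free of tracing code.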

We should also keep the implementation of this scheme as consistent as possible across languages.

We can complete the details of the design of this solution in this document.

@benjaminhuo
Member

benjaminhuo commented Dec 3, 2021

We might need to change the signatures of all sync and async functions to the one below. That way we can put tracing options such as skywalking, opentelemetry, or none into OpenFunctionContext.

We can then use these tracing parameters in OpenFunctionContext to create a wrapper function that wraps the user function with tracing ability.

The tracing options (skywalking, opentelemetry, or none) could perhaps be put into the Function CRD.

What do you think?
@tpiperatgod @wanjunlei @FeynmanZhou @arugal :

func func1(ctx *ofctx.OpenFunctionContext, in []byte) ofctx.RetValue 
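
To make the wrapper idea concrete, here is a minimal sketch. The types below are simplified, hypothetical stand-ins for the `ofctx` types, and the `report` callback is a placeholder for a real tracer call; none of this is the actual OpenFunction API.

```go
package main

import "fmt"

// TracingOptions, OpenFunctionContext, and RetValue are simplified,
// hypothetical stand-ins for the ofctx types mentioned in this thread.
type TracingOptions struct {
	Provider string // "skywalking", "opentelemetry", or "none"
}

type OpenFunctionContext struct {
	FunctionName string
	Tracing      TracingOptions
}

type RetValue struct {
	Code int
}

// UserFunction mirrors the unified signature proposed above.
type UserFunction func(ctx *OpenFunctionContext, in []byte) RetValue

// WithTracing wraps fn so that start and finish are reported when a
// provider is configured. report is a placeholder for a real tracer call.
func WithTracing(fn UserFunction, report func(event string)) UserFunction {
	return func(ctx *OpenFunctionContext, in []byte) RetValue {
		provider := ctx.Tracing.Provider
		if provider == "" || provider == "none" {
			return fn(ctx, in) // tracing disabled: call through unchanged
		}
		report(provider + ":start:" + ctx.FunctionName)
		ret := fn(ctx, in)
		report(provider + ":finish:" + ctx.FunctionName)
		return ret
	}
}

func main() {
	echo := func(ctx *OpenFunctionContext, in []byte) RetValue {
		return RetValue{Code: len(in)}
	}
	traced := WithTracing(echo, func(event string) { fmt.Println(event) })
	ctx := &OpenFunctionContext{
		FunctionName: "func1",
		Tracing:      TracingOptions{Provider: "skywalking"},
	}
	fmt.Println(traced(ctx, []byte("hello")).Code)
}
```

Because sync and async functions would share one signature, a single wrapper like this could serve both.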

@tpiperatgod
Member Author

I think it would make sense to use the OpenFunctionContext to pass tracing options to the functions-framework, which is the job the OpenFunctionContext should take on.

And I agree with putting the options about function tracing in function crd.

@wu-sheng

wu-sheng commented Dec 3, 2021

I want to give a heads-up to the OpenFunction team. I am going to put a core-level proposal to the SkyWalking project, which means we are going to officially move to SkyWalking v9.
As for migration, the OpenFunction project doesn't need to worry about breaking changes, because we will handle that. All agents (go2sky, nodejs, and python) stay the same as before, and the v3 tracing protocol will not change.

The thing I want to mention is that a new concept, layer, is going to be added to the v9 core. I suggest adding layer=faas as a specific tag on the root span of the segment, which would help SkyWalking route the logical service and endpoint to the FAAS page.

More information will be shared next week or this weekend. Once the 8.9.0 release (currently in progress) is done, the new proposal will be out.

@wu-sheng

wu-sheng commented Dec 3, 2021

Besides the APIs you are discussing, we should also consider:

  1. Shipping logs. Some agents (SkyWalking, not OpenTelemetry) have a bundled channel to forward logs directly, rather than collecting them from K8s or files.
  2. Manual instrumentation metrics APIs. These should follow Prometheus-like concepts, but stay closer to a metric API than to a specific implementation.
  3. Tracing. Besides before/after/context as mentioned, we should provide at least manual tagging APIs to add custom information when needed.

@benjaminhuo
Member

> Besides the APIs you are discussing, we should also consider:
>
>   1. Shipping logs. Some agents (SkyWalking, not OpenTelemetry) have a bundled channel to forward logs directly, rather than collecting them from K8s or files.
>   2. Manual instrumentation metrics APIs. These should follow Prometheus-like concepts, but stay closer to a metric API than to a specific implementation.
>   3. Tracing. Besides before/after/context as mentioned, we should provide at least manual tagging APIs to add custom information when needed.

Thanks a lot for these suggestions, @wu-sheng! We'll think about these points.

@benjaminhuo
Member

> I want to give a heads-up to the OpenFunction team. I am going to put a core-level proposal to the SkyWalking project, which means we are going to officially move to SkyWalking v9. As for migration, the OpenFunction project doesn't need to worry about breaking changes, because we will handle that. All agents (go2sky, nodejs, and python) stay the same as before, and the v3 tracing protocol will not change.
>
> The thing I want to mention is that a new concept, layer, is going to be added to the v9 core. I suggest adding layer=faas as a specific tag on the root span of the segment, which would help SkyWalking route the logical service and endpoint to the FAAS page.
>
> More information will be shared next week or this weekend. Once the 8.9.0 release (currently in progress) is done, the new proposal will be out.

Great to have this info! Sure, we'll add the layer=faas tag to the root span.

@benjaminhuo
Member

I've created an initial proposal for tracing : https://hackmd.io/@UrcJbEg9R_mxQy4aRXO5tA/H1A4vDe9K
@wu-sheng @arugal @webup @tpiperatgod @wanjunlei @FeynmanZhou

@wu-sheng

I think we need to provide tags (as in the proposal) for users, and also consider the correlation context. OpenTracing has the same concept, called baggage (OpenTelemetry should have it too).

Also, @arugal, we should consider how to add the timestamp of the previous function's end and propagate it through sw8-x. Then the SkyWalking server could compute the scheduling latency from functionA to functionB.

@benjaminhuo
Member

benjaminhuo commented Dec 10, 2021

> I think we need to provide tags (as in the proposal) for users, and also consider the correlation context. OpenTracing has the same concept, called baggage (OpenTelemetry should have it too).
>
> Also, @arugal, we should consider how to add the timestamp of the previous function's end and propagate it through sw8-x. Then the SkyWalking server could compute the scheduling latency from functionA to functionB.

There is a customTags section for the tags a user wants to add. Should we rename it to tags?

      customTags:
      - func: function-with-tracing
      - layer: faas
      - tag1: value1
      - tag2: value2

@wu-sheng

I think tags work.

@benjaminhuo
Member

Changed customTags to tags already

@benjaminhuo
Member

Added baggage like below:

apiVersion: core.openfunction.io/v1alpha2
kind: Function
metadata:
  name: function-with-tracing
spec:
  serving:
    runtime: "OpenFuncAsync"
    tracing:
      # Switch for tracing, defaults to false
      enabled: true
      # Provider name can be set to "skywalking", "opentelemetry"
      # A valid provider must be set if tracing is enabled.
      provider: 
        name: "skywalking"
      # Custom tags to add to tracing
      tags:
      - func: function-with-tracing
      - layer: faas
      - tag1: value1
      - tag2: value2
      baggage:
      # baggage key is `sw8-correlation` for skywalking and `baggage` for opentelemetry
      # Correlation context for skywalking: https://skywalking.apache.org/docs/main/latest/en/protocols/skywalking-cross-process-correlation-headers-protocol-v1/
      # baggage for opentelemetry: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md
      # W3C Baggage Specification: https://w3c.github.io/baggage/
        key: sw8-correlation # key should be baggage for opentelemetry
        value: "base64(string key):base64(string value),base64(string key2):base64(string value2)"
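
As an illustration of the `value` format in the baggage section, the sketch below builds and parses a string of comma-separated `base64(key):base64(value)` pairs. The helper names are hypothetical, and it assumes the standard Base64 alphabet with padding; check the SkyWalking correlation header protocol for the authoritative rules.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// KV is one correlation entry. The type name is illustrative only.
type KV struct{ Key, Value string }

// EncodeCorrelation joins pairs as base64(key):base64(value), separated by commas.
func EncodeCorrelation(pairs []KV) string {
	parts := make([]string, 0, len(pairs))
	for _, p := range pairs {
		k := base64.StdEncoding.EncodeToString([]byte(p.Key))
		v := base64.StdEncoding.EncodeToString([]byte(p.Value))
		parts = append(parts, k+":"+v)
	}
	return strings.Join(parts, ",")
}

// DecodeCorrelation reverses EncodeCorrelation and rejects malformed input.
func DecodeCorrelation(header string) ([]KV, error) {
	if header == "" {
		return nil, nil
	}
	var out []KV
	for _, part := range strings.Split(header, ",") {
		kv := strings.SplitN(part, ":", 2)
		if len(kv) != 2 {
			return nil, fmt.Errorf("malformed pair %q", part)
		}
		k, err := base64.StdEncoding.DecodeString(kv[0])
		if err != nil {
			return nil, fmt.Errorf("bad key in %q: %w", part, err)
		}
		v, err := base64.StdEncoding.DecodeString(kv[1])
		if err != nil {
			return nil, fmt.Errorf("bad value in %q: %w", part, err)
		}
		out = append(out, KV{Key: string(k), Value: string(v)})
	}
	return out, nil
}

func main() {
	header := EncodeCorrelation([]KV{{"tag1", "value1"}, {"tag2", "value2"}})
	fmt.Println(header)
	pairs, err := DecodeCorrelation(header)
	fmt.Println(pairs, err)
}
```

Splitting on `,` and `:` is safe here because neither character occurs in the standard Base64 alphabet.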

@wu-sheng

Makes sense. Looks good to me.

@benjaminhuo
Member

benjaminhuo commented Dec 17, 2021

Proposal for plugin mechanism to function framework by @tpiperatgod https://hackmd.io/O8o01-mjT6uv6L9F25pYsA?view=

@wu-sheng

@arugal apache/skywalking#8367 The SkyWalking v9 core upgrade is almost done.

Is there any update on the OpenFunction side? As we are going to add tracing (go2sky) to it first, @arugal, you need to follow this v9 update, and we need to make sure OpenFunction's traces can be identified as a faas-layer service and instance.

Also, we need a definition of what the service and instance are in the OpenFunction, or general FAAS, scope.
@benjaminhuo @tpiperatgod Any suggestions about this?

@benjaminhuo
Member

benjaminhuo commented Dec 31, 2021

@wu-sheng, thanks very much for the reminder.

To add SkyWalking tracing, we need to refactor functions-framework and add a plugin mechanism. The design is almost finished.

With the previous tracing proposal and this design in place, SkyWalking tracing is now our most important work item.

Once the coding of the plugin mechanism is finished, we'll need @arugal's help to add the SkyWalking tracing code as a plugin.

From my understanding, a function is a service and its replica is an instance. Do we need to add service and instance tags to SkyWalking tracing?

@wu-sheng

SkyWalking has service and instance fields directly (no tags needed) to declare that. The reason I am asking is that a FAAS-level function usually seems (from my limited FAAS understanding, please CMIIW) closer to the endpoint concept in SkyWalking.

So I am just double-checking whether OpenFunction has a higher-level concept for a set of function replicas (instances) grouped as a unit. There is no problem with treating a function as a service; it is just that, if we do, SkyWalking's endpoint concept does not seem very useful for the OpenFunction case. Or am I missing something in OpenFunction that could be defined as a subset of a function?

@benjaminhuo
Member

OpenFunction has sync functions that can be accessed through HTTP, so the endpoint concept could be valuable for sync functions.
Async functions are triggered by events from middleware such as MQ, so it may not be applicable there.
We'll take a look at SkyWalking's Service/Instance/Endpoint concepts to find out how to integrate with them.

@wu-sheng

Sync and async both work in SkyWalking. A Kafka consumer or an async scheduled task is defined as an endpoint in SkyWalking.
My question is more about whether the endpoint concept still makes sense in OpenFunction, since here the Function is the executable unit. Do we have a larger concept that maps to a service?

@benjaminhuo
Member

Got you. We'll add serverless workflow capability, and a workflow is a set of related functions, so maybe a serverless workflow is a SkyWalking service.

@wu-sheng

Does a workflow always run in one process (OS level)? I ask because service-to-service is better than endpoint-to-endpoint for measuring network performance in today's SkyWalking.

@benjaminhuo
Member

A workflow itself will run in different processes (functions) actually.

@arugal
Member

arugal commented Dec 31, 2021

> To add SkyWalking tracing, we need to refactor functions-framework and add a plugin mechanism. The design is almost finished.
>
> With the previous tracing proposal and this design in place, SkyWalking tracing is now our most important work item.
> Once the coding of the plugin mechanism is finished, we'll need @arugal's help to add the SkyWalking tracing code as a plugin.

Sounds good to me, I'll start after the framework is complete :)

@wu-sheng

> A workflow itself will run in different processes (functions) actually.

OK, then we need to think more about how to define a service in OpenFunction. Let's set each function as a service for now, as a PoC version.

@benjaminhuo
Member

Sure, the refactoring of the functions framework is almost done.
Maybe we can start the integration with SkyWalking at the first community meeting of 2022 (Jan 6th).
@tpiperatgod @arugal

@wu-sheng

> Sure, the refactoring of the functions framework is almost done.
> Maybe we can start the integration with SkyWalking at the first community meeting of 2022 (Jan 6th).

This seems good. SkyWalking's first release is planned for March.
For developers, the new backend core should be available in the first week of January, and the first draft will be around Chinese New Year. @Fine0830 Do you have a solid timeline for the booster UI?

@Fine0830

> > Sure, the refactoring of the functions framework is almost done.
> > Maybe we can start the integration with SkyWalking at the first community meeting of 2022 (Jan 6th).
>
> This seems good. SkyWalking's first release is planned for March. For developers, the new backend core should be available in the first week of January, and the first draft will be around Chinese New Year. @Fine0830 Do you have a solid timeline for the booster UI?

Uh... I'm not sure about the timeline. March is probably okay for me.

@wu-sheng

> Uh... I'm not sure about the timeline. March is probably okay for me.

OK, let's see. Anyway, I think OpenFunction will move faster than SkyWalking itself :)

@benjaminhuo
Member

Now the functions-framework refactoring proposal is ready:
https://github.com/OpenFunction/OpenFunction/blob/main/docs/proposals/202112_functions_framework_refactoring.md

Skywalking tracing will be implemented as a plugin of the above functions-framework
https://github.com/OpenFunction/OpenFunction/blob/main/docs/proposals/202112_support_function_tracing.md

@tpiperatgod
Member Author

> Skywalking tracing will be implemented as a plugin of the above functions-framework https://github.com/OpenFunction/OpenFunction/blob/main/docs/proposals/202112_support_function_tracing.md

On the basis of this proposal, how about setting the configuration of the plugin section to this?

apiVersion: core.openfunction.io/v1alpha2
kind: Function
metadata:
  name: function-with-tracing
  annotations:
    plugins.pre:
      - pluginA
      - pluginB
      - pluginC
    plugins.post:
      - pluginC
      - pluginA
    plugins.tracing: |
      # Switch for tracing, default to false
      enabled: true
      # Provider name can be set to "skywalking", "opentelemetry"
      # A valid provider must be set if tracing is enabled.
      provider:
        name: "skywalking"
        oapServer: "localhost:xxx"
      # Custom tags to add to tracing
      tags:
      - func: function-with-tracing
      - layer: faas
      - tag1: value1
      - tag2: value2
      baggage:
      # baggage key is `sw8-correlation` for skywalking and `baggage` for opentelemetry
      # Correlation context for skywalking: https://skywalking.apache.org/docs/main/latest/en/protocols/skywalking-cross-process-correlation-headers-protocol-v1/
      # baggage for opentelemetry: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md
      # W3C Baggage Specification: https://w3c.github.io/baggage/
        key: sw8-correlation # key should be baggage for opentelemetry
        value: "base64(string key):base64(string value),base64(string key2):base64(string value2)"

@benjaminhuo
Member

> Skywalking tracing will be implemented as a plugin of the above functions-framework https://github.com/OpenFunction/OpenFunction/blob/main/docs/proposals/202112_support_function_tracing.md
>
> On the basis of this proposal, how about setting the configuration of the plugin section to this?

An annotation is just a map[string]string, so make sure the data you want to store fits this data structure.

@wu-sheng

As an annotation is just a map[string]string, it seems the value (such as the one for key=plugins.tracing) has to be parsed by the tracer implementation. I am not sure what OpenFunction recommends. Do you prefer key/value pairs or a proto-object-oriented style like Envoy?

@benjaminhuo
Member

@wu-sheng Yes, the value of the key has to be parsed by OpenFunction itself before passing it to skywalking.
No need for skywalking to parse it in my opinion.

@tpiperatgod
Member Author

tpiperatgod commented Jan 21, 2022

My mistake. It has been adjusted to the following format:

apiVersion: core.openfunction.io/v1alpha2
kind: Function
metadata:
  name: function-with-tracing
  annotations:
    plugins: |
      # Default order option. During the preHooks phase the plugins will be executed in the following order:
      #   pluginA -> pluginB -> pluginC
      # In the postHooks phase the plugins will be executed in the following order:
      #   pluginC -> pluginB -> pluginA
      order:
      - pluginA
      - pluginB
      - pluginC
      # The "pre" and "post" options will override the order in the "order" option,
      # and you can specify the order of execution of the plugins in the prehooks and posthooks phases separately
      pre:
      - pluginA
      - pluginC
      - pluginB
      post:
      - pluginB
      - pluginA
    plugins.tracing: |
      # Switch for tracing, default to false
      enabled: true
      # Provider name can be set to "skywalking", "opentelemetry"
      # A valid provider must be set if tracing is enabled.
      provider:
        name: "skywalking"
        oapServer: "localhost:xxx"
      # Custom tags to add to tracing
      tags:
      - func: function-with-tracing
      - layer: faas
      - tag1: value1
      - tag2: value2
      baggage:
      # baggage key is `sw8-correlation` for skywalking and `baggage` for opentelemetry
      # Correlation context for skywalking: https://skywalking.apache.org/docs/main/latest/en/protocols/skywalking-cross-process-correlation-headers-protocol-v1/
      # baggage for opentelemetry: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/baggage/api.md
      # W3C Baggage Specification: https://w3c.github.io/baggage/
        key: sw8-correlation # key should be baggage for opentelemetry
        value: "base64(string key):base64(string value),base64(string key2):base64(string value2)"
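
The ordering rules described in the comments of the `plugins` annotation can be sketched as below. `PluginOrder` and `Resolve` are hypothetical names, and the struct is assumed to have already been parsed from the annotation string; the real framework may resolve the order differently.

```go
package main

import "fmt"

// PluginOrder mirrors the order/pre/post keys of the plugins annotation.
// It is an illustrative type, not the framework's own.
type PluginOrder struct {
	Order []string
	Pre   []string
	Post  []string
}

// Resolve returns the execution order for the pre and post hook phases.
// Pre defaults to Order, Post defaults to Order reversed, and explicit
// Pre or Post lists override those defaults.
func (o PluginOrder) Resolve() (pre, post []string) {
	if len(o.Pre) > 0 {
		pre = append([]string(nil), o.Pre...)
	} else {
		pre = append([]string(nil), o.Order...)
	}
	if len(o.Post) > 0 {
		post = append([]string(nil), o.Post...)
	} else {
		post = make([]string, len(o.Order))
		for i, name := range o.Order {
			post[len(o.Order)-1-i] = name
		}
	}
	return pre, post
}

func main() {
	defaults := PluginOrder{Order: []string{"pluginA", "pluginB", "pluginC"}}
	pre, post := defaults.Resolve()
	fmt.Println(pre, post)

	overridden := PluginOrder{
		Order: []string{"pluginA", "pluginB", "pluginC"},
		Pre:   []string{"pluginA", "pluginC", "pluginB"},
		Post:  []string{"pluginB", "pluginA"},
	}
	pre, post = overridden.Resolve()
	fmt.Println(pre, post)
}
```

Reversing the default post order gives the usual nesting behaviour: the first plugin to start is the last to finish.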

@wu-sheng

So, there will be an object in the OpenFunction codebase that defines the configuration and carries the parsed values. The SkyWalking tracer then accepts it and sets it on the go2sky kernel.

@benjaminhuo
Member

> So, there will be an object in the OpenFunction codebase that defines the configuration and carries the parsed values. The SkyWalking tracer then accepts it and sets it on the go2sky kernel.

Yes, that's correct!

@benjaminhuo
Member

benjaminhuo commented Mar 8, 2022

@wu-sheng @arugal OpenFunction v0.6.0-rc.0 has been released and now SkyWalking has a perfect integration with OpenFunction Async and Sync functions!
Thanks, @arugal for the huge effort on this integration!

Skywalking tracing can be enabled either as a global option or as a per-function option as described in https://github.com/OpenFunction/OpenFunction/blob/main/docs/proposals/202112_support_function_tracing.md

@wu-sheng

wu-sheng commented Mar 8, 2022

Fantastic! We are going to prepare the v9 release in the next 2 weeks, I will ask @arugal to set up the FAAS dashboard for you. This dashboard will be included as a default active function(on the top-level menu), I think you would love that.

I will update after we have that.

@benjaminhuo
Member

Looking forward to SkyWalking v9!
