-
Notifications
You must be signed in to change notification settings - Fork 777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add function logs support (#53) #131
Conversation
Awesome stuff, @life1347 !! Excited to have this feature and thanks for thinking it through and talking about it in the issue. I'll review and make comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @life1347 ,
Sorry for the review delay. This is a pretty big PR :)
I've suggested lots of changes and most of them can be deferred to later PRs. However the suggestions for the code inside poolmgr/gp.go specializePod
are more important, I think they should be addressed before merge.
Thanks again for doing this work. It's going to improve the experience of using fission for lots of people.
environments/fluentd/build.sh
Outdated
@@ -0,0 +1,2 @@ | |||
#!/bin/sh | |||
docker build -t fission-daemonset-fulentd:latest . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo (fluentd)
environments/fluentd/fluent.conf
Outdated
|
||
<match fission.**> | ||
type record_reformer | ||
enable_ruby true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this can be set to false?
environments/fluentd/fluent.conf
Outdated
namespace ${tag_parts[4]} | ||
pod ${tag_parts[5]} | ||
container ${tag_parts[6]} | ||
funcuid ${tag_parts[7]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add the function name as well? Seems like it would be useful.
environments/fluentd/fluent.conf
Outdated
sequence_tag _seq | ||
buffer_type memory | ||
buffer_chunk_limit 524288 # 512 * 1024 | ||
buffer_queue_limit 1024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do lines 39-41 mean that in the worst case we can end up using 512*1024*1024 i.e 512M of memory? That's a lot. Maybe we should use buffer_type file
and a smaller buffer_queue_limit
?
environments/fluentd/run.sh
Outdated
@@ -0,0 +1,23 @@ | |||
#!/bin/sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment to this file saying where it's originally from?
poolmgr/gp.go
Outdated
@@ -262,6 +262,18 @@ func (gp *GenericPool) specializePod(pod *v1.Pod, metadata *fission.Metadata) er | |||
return errors.New(fmt.Sprintf("Error from fetcher: %v", resp.Status)) | |||
} | |||
|
|||
logRequest := fmt.Sprintf("{\"namespace\":\"%s\",\"pod\":\"%s\",\"container\":\"%s\",\"funcuid\":\"%s\"}", pod.Namespace, pod.Name, gp.env.Metadata.Name, metadata.Uid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have a struct and then marshal it with json.Marshal?
poolmgr/gp.go
Outdated
// Failed | ||
resp.Body.Close() | ||
if err == nil { | ||
return errors.New(fmt.Sprintf("Error from fetcher: %v", resp.Status)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- error from logging helper, not fetcher
- we should make this error not fatal
- do we need to do this synchronously? anything in specializePod adds to cold start time -- can we do this http post in a goroutine?
poolmgr/gp.go
Outdated
loggerUrl := fmt.Sprintf("http://%s:1234/v1/log/%s", pod.Spec.NodeName, pod.Name) | ||
req, err := http.NewRequest("DELETE", loggerUrl, nil) | ||
resp, err := http.DefaultClient.Do(req) | ||
if err != nil || resp.StatusCode != 200 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this error should not be fatal; we should still continue trying to clean up
fission/logdb/logdb.go
Outdated
log.WithFields(log.Fields{ | ||
"FISSION_LOGDB_URL": cnf.Endpoint, | ||
}).Fatalf("FISSION_LOGDB_URL is incorrect, now only support %s", INFLUXDB) | ||
return dummyDB{}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have a dummyDB? Why not just return nil and an error?
fission/function.go
Outdated
}(ctx, requestChan, responseChan) | ||
for { | ||
select { | ||
case <-time.After(1 * time.Second): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we shouldn't wait a second for the first batch -- how about this: immediately do the first batch, and then either (a) quit if no -f
(b) time.sleep if -f
.
Oh I forgot to say: could you add the diagram from the issue into the Documentation/ directory in this PR? It's helpful to anyone trying to understand the architecture. Also, we're trying to keep the environments/ directory for languages (I know fetcher is there, it should be moved out too, but later). Could you move |
Wondering when this feature will be merged? |
51f1e68
to
13dd5ec
Compare
Hi @soamvasani, I've pushed the commit with following modification.
And thanks for the code review. :D |
Don't merge it right now, seems the logger failed to create symlink with latest commit. I am investigating it. |
With minikube, pod may fail to resolve node name (kubernetes/minikube#757). In the latest commit, poolmgr use host ip instead of node name to prevent dns problem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @life1347 , this is in pretty good shape. Just a few small requests below. If you choose not to address some of the feedback in this PR, you can file issues for them.
logger/types.go
Outdated
ContainerID string `json:"-"` | ||
} | ||
|
||
logRequestTacker struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: logRequestTracker (missing 'r')
fission-logger.yaml
Outdated
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
name: controller |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this here intentionally? it seems to just be a duplicate of the controller svc definition.
fission-logger.yaml
Outdated
value: "8086" | ||
- name: INFLUXDB_DBNAME | ||
value: "fissionFunctionLog" | ||
- name: INFLUXDB_USERNAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So to be clear, this username and password is the only thing in this file the user needs to edit, right? Can you just add a comment here?
README.md
Outdated
|
||
``` | ||
# Currently only influxDB is supported, use --help to see more options | ||
$ export FISSION_LOGDB=http://$(kubectl --namespace fission get svc influxdb -o=jsonpath='{..clusterIP}'):8086 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why clusterIP would work here? The clusterIP of a service is usually not accessible from outside the cluster.
fatal("Need name of function, use --name") | ||
} | ||
|
||
dbHost := c.String("dbhost") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also all users to use env var here, instead of specifying this every time (also ok to do in a later change)
return err | ||
} | ||
|
||
loggerUrl := fmt.Sprintf("http://%s:1234/v1/log/%s", pod.Spec.NodeName, pod.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This port should be configurable (as an env var); but we can do that in a later change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add the env support in later PR with changes that cli gets log stream from controller instead of connecting to database directly.
poolmgr/gp.go
Outdated
@@ -262,6 +264,31 @@ func (gp *GenericPool) specializePod(pod *v1.Pod, metadata *fission.Metadata) er | |||
return errors.New(fmt.Sprintf("Error from fetcher: %v", resp.Status)) | |||
} | |||
|
|||
logReq := logger.LogRequest{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move this whole thing into a separate function? So we can just have gp.setupLogging(pod)
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I should have been more clear -- I meant the whole functionality of setting up the logger, not just the one statement (upto line 285)
So I'm trying to set this up locally to test it out, and I'm not sure I understand the instructions -- after the influxdb username and password are created, how does one retrieve them? |
Yes, that's the drawback of the current design. A user needs to know the username/password in advance when tries to access function logs. |
Yes, I understand the drawback and I'm ok with it, I'm just trying to understand how to get it working on my setup :) Once I run the db, where do I get the credentials from? |
1. Move logger to top-level 2. Integrate logger with fission-bundle 3. Add influxdb deployment 4. Fix minor issues
1. Modify README.md for logger usage 2. Retrieve logdb host from env 3. Fix minor issues
4ea06fc
to
fd5b99c
Compare
(I rebased life1347/function-logs-support to fission/master) |
Also make the instructions a bit clearer.
This PR add function logs support for issue #53.
Display function logs
Display function logs from specific pod
Also, this PR implement the log database interface and use fluentd to pipe logs. Developer can implement their own log database solution for storing/searching function logs.