Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added functionality for running the same trace request again with the same parameters. #2202

Merged
merged 3 commits into from Jun 9, 2021

Conversation

Dhruv-J
Copy link
Contributor

@Dhruv-J Dhruv-J commented May 22, 2021

A "RUN TRACE AGAIN" option has been added to the Octant UI. The new graph's label is changed to reflect the time the trace was run again.

Fixes: #2178

Signed-off-by: Dhruv Jain 2dhruv@gmail.com

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Dhruv-J for the PR

addTfAction = "traceflow/addTf"
addLiveTfAction = "traceflow/addLiveTf"
showGraphAction = "traceflow/showGraphAction"
runTraceAgain = "traceflow/runTagain"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you name the variable same style as other actions? i.e.

Suggested change
runTraceAgain = "traceflow/runTagain"
runTraceAgainAction = "traceflow/runTraceAgain"

@@ -165,6 +167,7 @@ func (p *antreaOctantPlugin) actionHandler(request *service.ActionRequest) error
}

updateIPHeader(tf, hasSrcPort, hasDstPort, srcPort, dstPort)
mostRecentTraceflow = tf;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semicolon is redundant. https://golang.org/doc/effective_go#semicolons

Idiomatic Go programs have semicolons only in places such as for loop clauses, to separate the initializer, condition, and continuation elements. They are also necessary to separate multiple statements on a line, should you write code that way.

@@ -313,6 +316,14 @@ func (p *antreaOctantPlugin) actionHandler(request *service.ActionRequest) error
return nil
}
return nil
case runTraceAgain:
// Get name of new traceflow
mostRecentTraceflow.Name = mostRecentTraceflow.Spec.Source.Pod + "-" + mostRecentTraceflow.Spec.Destination.Pod
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works for pod to pod only, could you make it more generic?
Maybe extract a name generator function and call it in addTfAction, addLiveTfAction, and runTraceAgain

Copy link
Contributor Author

@Dhruv-J Dhruv-J May 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each case sets the different name a different way, but they have the time in common, so wouldn't it make sense to only change the time?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that works too. I suggested a common name generator function as this kind of code seems a bit redundant now, but of course not necessary to handle it in this PR. Up to you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has the comment been addressed? My point was some traceflow requests' destination may be service or IP, so mostRecentTraceflow.Spec.Destination.Pod is empty. The new traceflow's name will be weird.

Copy link
Contributor Author

@Dhruv-J Dhruv-J May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The earlier way I had done it was to trim 15 characters off the end of the string, because those were guaranteed to have the date and time. I feel like that might be a better solution considering each traceflow name has a date and time. Should I do this instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it works for me.

mostRecentTraceflow.Name = mostRecentTraceflow.Spec.Source.Pod + "-" + mostRecentTraceflow.Spec.Destination.Pod
mostRecentTraceflow.Name += "-" + time.Now().Format(TIME_FORMAT_YYYYMMDD_HHMMSS)

log.Printf("Ran traceflow again successfully, Traceflow Results: %+v\n", mostRecentTraceflow)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It hasn't run traceflow successfully, right? Maybe just remove this line if it doesn't provide more information than existing logs in createTfCR


log.Printf("Ran traceflow again successfully, Traceflow Results: %+v\n", mostRecentTraceflow)
p.createTfCR(mostRecentTraceflow, request, context.Background(), mostRecentTraceflow.Name)
return nil;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

tAgainForm := component.Form{Fields: []component.FormField{
component.NewFormFieldHidden("action", runTraceAgain),
}}
runTagain := component.Action{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable should be runTAgain to meet the mixedCaps converntion. But maybe just use a full name here is more clear - runTraceAgain.

runTagain := component.Action{
Name: "Run Traceflow Again",
Title: "Run Traceflow Again",
Form: tAgainForm,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, could be traceAgainForm

@@ -313,6 +316,14 @@ func (p *antreaOctantPlugin) actionHandler(request *service.ActionRequest) error
return nil
}
return nil
case runTraceAgain:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could it support rerun a live traceflow too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any feedback on this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it can't since a Live Traceflow doesn't create a Traceflow Request.

Copy link
Member

@tnqn tnqn May 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It creates a Traceflow request:

p.createTfCR(tf, request, context.Background(), tfName)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm adding functionality to be able to rerun live traceflows. My mistake, I didn't see that line.

@tnqn
Copy link
Member

tnqn commented May 24, 2021

The commit message needs to be formatted:

  1. Insert a newline and an empty line between the title and the body (and makes the title as short as possible)
  2. Remove the trailing period of the title
  3. Wrap the message in ~72 chars per line

For example:

Support running the same trace request again with the same parameters

A "RUN TRACE AGAIN" option has been added to the Octant UI. The new
graph's label is changed to reflect the time the trace was run again.

Fixes #2178

Signed-off-by: Dhruv Jain <2dhruv@gmail.com>

@codecov-commenter
Copy link

codecov-commenter commented May 24, 2021

Codecov Report

Merging #2202 (8ac0de3) into main (382cd88) will decrease coverage by 9.40%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2202      +/-   ##
==========================================
- Coverage   61.95%   52.55%   -9.41%     
==========================================
  Files         276      273       -3     
  Lines       21178    20504     -674     
==========================================
- Hits        13120    10775    -2345     
- Misses       6699     8372    +1673     
+ Partials     1359     1357       -2     
Flag Coverage Δ
kind-e2e-tests 52.55% <ø> (-0.58%) ⬇️
unit-tests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/controller/egress/id_allocator.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/controller/networkpolicy/endpoint_querier.go 2.85% <0.00%> (-88.58%) ⬇️
pkg/controller/egress/controller.go 0.00% <0.00%> (-83.88%) ⬇️
pkg/agent/util/iptables/lock.go 0.00% <0.00%> (-81.82%) ⬇️
pkg/apis/controlplane/v1beta1/conversion.go 1.16% <0.00%> (-79.85%) ⬇️
pkg/cni/client.go 0.00% <0.00%> (-75.52%) ⬇️
pkg/apiserver/handlers/endpoint/handler.go 0.00% <0.00%> (-70.59%) ⬇️
pkg/agent/apiserver/handlers/ovsflows/handler.go 14.06% <0.00%> (-60.94%) ⬇️
.../agent/flowexporter/connections/conntrack_linux.go 0.00% <0.00%> (-59.14%) ⬇️
pkg/util/ip/ip.go 21.21% <0.00%> (-59.10%) ⬇️
... and 85 more

addLiveTfAction = "traceflow/addLiveTf"
showGraphAction = "traceflow/showGraphAction"
runTraceAgain = "traceflow/runTagain"
mostRecentTraceflow = &crdv1alpha1.Traceflow{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we initialized it to nil and display an error if someone tries to run a Traceflow again without first running a normal Traceflow?

another question: since this is mutable global state for the plugin, do we need to protect it with a mutex or is it always accessed from a single goroutine? @mengdie-song do you know?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot initialize it to null right there, but I can check if the structure is still empty before returning an error. The variable should only be accessed by traceflow.go .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dhruv-J why can't you initialize it to nil? The following should work:

var mostRecentTraceflow *crdv1alpha1.Traceflow = nil

Actually, the = nil is optional

@Dhruv-J Dhruv-J force-pushed the bugfix2178 branch 2 times, most recently from 8f36c37 to 3ec336d Compare May 25, 2021 20:17
case runTraceAgainAction:
// Check if traceflow has been run before
if mostRecentTraceflow == nil {
log.Printf("Failed to run traceflow again: Run a traceflow before attempting to run a traceflow again.\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the log.Printf will only add a log in backend, it's better to return an alert to UI as well so user can learn what's to do first. eg:

alert := action.CreateAlert(action.AlertTypeError,
`Failed to run traceflow again: Use 'START NEW TRACE' or 'START NEW LIVE-TRAFFIC TRACE' to 
run a traceflow before attempting to run a traceflow again.`,
action.DefaultAlertExpiration)
request.DashboardClient.SendAlert(request.Context(), request.ClientID, alert)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I'll add this.

Dhruv Jain added 2 commits June 1, 2021 10:57
A "RUN TRACE AGAIN" option has been added to the Octant UI. The new
graph's label is changed to reflect the time the trace was run again.

Fixes antrea-io#2178

Signed-off-by: Dhruv Jain <2dhruv@gmail.com>
Modified name changing scheme for re-running traceflows
and added user alert for attempting to re-run traceflow
without running a traceflow first. Also added support
for running a live traceflow again.

Signed-off-by: Dhruv Jain <2dhruv@gmail.com>
tnqn
tnqn previously approved these changes Jun 3, 2021
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dhruv-J I tested the feature and it works great. However, I noticed that we already had a lastTf field in the antreaOctantPlugin struct, and it seems that we could just reuse this instead of adding mostRecentTraceflow?

BTW after clicking on "RUN TRACEFLOW AGAIN" (shouldn't it be "RUN TRACE AGAIN" for consistency?), there is another step required (clicking "SUBMIT"). It's too bad, but after looking at the Octant, looks like there may not be an easy way around it.

I'm still concerned about concurrent access to global plugin state, but that's not limited to this PR, so no need to change anything there, I'll keep thinking about this issue.

Removed mostRecentTraceflow variable. Also modified
RUN TRACEFLOW AGAIN button to read RUN TRACE AGAIN
button to maintain consistency.

Signed-off-by: Dhruv Jain <2dhruv@gmail.com>
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
This will be very useful. Thanks @Dhruv-J for working on this and for being patient during the review process.

@antoninbas
Copy link
Contributor

/skip-all
(all changes are to the Octant plugin)

@tnqn tnqn merged commit 92cb71c into antrea-io:main Jun 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Supporting running a new Traceflow request with the same exact parameters in Octant UI
5 participants