@memodi
Member

@memodi memodi commented Feb 28, 2025

Description

NETOBSERV-2133 CLI flows JSON

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

@memodi memodi requested a review from jpinsonneau February 28, 2025 19:45
@memodi
Member Author

memodi commented Feb 28, 2025

/ok-to-test

@github-actions

New image:
quay.io/netobserv/network-observability-cli:bb25ca5

It will expire after two weeks.

To use this build, update your commands using:

USER=netobserv VERSION=bb25ca5 make commands

or download the updated commands.

@memodi
Member Author

memodi commented Feb 28, 2025

or download the updated commands.

I tested using this script, and we're now getting valid JSON:

$ jq '.[0]' < output/flow/2025-02-28T194919Z.json
{
  "AgentIP": "10.0.48.141",
  "Bytes": 1893,
  "Dscp": 0,
  "DstAddr": "10.0.48.141",
  "DstK8S_HostIP": "10.0.48.141",
  "DstK8S_HostName": "ip-10-0-48-141.us-east-2.compute.internal",
  "DstK8S_Name": "ip-10-0-48-141.us-east-2.compute.internal",
  "DstK8S_NetworkName": "primary",
  "DstK8S_OwnerName": "ip-10-0-48-141.us-east-2.compute.internal",
  "DstK8S_OwnerType": "Node",
  "DstK8S_Type": "Node",
  "DstK8S_Zone": "us-east-2b",
  "DstMac": "06:A4:C7:67:22:B9",
  "DstPort": 6081,
  "Etype": 2048,
  "FlowDirection": 0,
  "IfDirections": [
    0,
    0
  ],
  "Interfaces": [
    "ens5",
    "br-ex"
  ],
  "Packets": 3,
  "Proto": 17,
  "SrcAddr": "10.0.25.233",
  "SrcK8S_HostIP": "10.0.25.233",
  "SrcK8S_HostName": "ip-10-0-25-233.us-east-2.compute.internal",
  "SrcK8S_Name": "ip-10-0-25-233.us-east-2.compute.internal",
  "SrcK8S_NetworkName": "primary",
  "SrcK8S_OwnerName": "ip-10-0-25-233.us-east-2.compute.internal",
  "SrcK8S_OwnerType": "Node",
  "SrcK8S_Type": "Node",
  "SrcK8S_Zone": "us-east-2a",
  "SrcMac": "06:78:B8:81:70:CF",
  "SrcPort": 57984,
  "TimeFlowEndMs": 1740772152596,
  "TimeFlowStartMs": 1740772152595,
  "TimeReceived": 1740772153,
  "Udns": [
    "",
    ""
  ]
}

echo "Copying collector output files..."
mkdir -p ./output
${K8S_CLI_BIN} cp -n "$namespace" collector:output ./output
flowFile=$(find ./output -name "*json" | sort | tail -1)
Contributor

copyOutput is called for both flows and packets. Usually you can rely on $command to know which command is running, but the user can also run oc netobserv copy, and in that case we don't know what has been copied since it was a background run.

We should refactor the collector to output a .txt file instead, which will be parsed by the script and converted to a .json file properly formatted. Doing so, we will be able to skip the format step when the text file is not found.

WDYT ?
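
A minimal sketch of the flow proposed above, assuming flow captures are copied as timestamped `.txt` files somewhere under the output directory. The function names `find_latest_txt` and `format_if_present` are hypothetical and not taken from the repository, and the conversion itself is left out:

```shell
# Hypothetical helper: print the newest .txt capture under a directory, if any.
# Assumes timestamped names (e.g. 2025-02-28T194919Z.txt) sort chronologically.
find_latest_txt() {
  find "$1" -type f -name "*.txt" 2>/dev/null | sort | tail -n 1
}

# Hypothetical wrapper: run the format step only when a text file was copied.
format_if_present() {
  dir="$1"
  latest=$(find_latest_txt "$dir")
  if [ -z "$latest" ]; then
    # e.g. a packet capture, or a background run that produced no flow text file
    echo "no flow text file found, skipping JSON formatting"
    return 0
  fi
  # placeholder for the real conversion step
  echo "would convert $latest"
}
```

With this shape the script does not need to know which command produced the output: the presence of a text file decides whether formatting runs.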

Contributor

FYI, to pretty print the output once the json is correct, we can use the following command:

$ yq --inplace -p=json -o=json '.' ./output/flow/<filename>.json
  • I'm relying on yq instead of jq here, as we already have it as a dependency when we run the script
  • using --inplace to edit the same file
  • -p=json means we are parsing a JSON file, and -o=json means we output a JSON file

Member Author

copyOutput is called for both flows and packets. Usually you can rely on $command to know which command is running, but the user can also run oc netobserv copy, and in that case we don't know what has been copied since it was a background run.

I don't see the $command variable set in functions.sh. I see we're checking numbered arguments, $1 or $3. Any reason why the argument numbers are not consistent?

We should refactor the collector to output a .txt file instead, which will be parsed by the script and converted to a .json file properly formatted. Doing so, we will be able to skip the format step when the text file is not found.

yes, that makes sense. I'll look into this.

Contributor

@jpinsonneau jpinsonneau Mar 3, 2025

$command is inside the netobserv main script:

$1 / $3 is relative to the function called, so it does indeed differ!

Let's focus on the txt / json files which should be good enough 👍

@codecov

codecov bot commented Mar 3, 2025

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Project coverage is 23.70%. Comparing base (51e6676) to head (718c95a).
Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
cmd/flow_capture.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #201   +/-   ##
=======================================
  Coverage   23.70%   23.70%           
=======================================
  Files          11       11           
  Lines        1333     1333           
=======================================
  Hits          316      316           
  Misses       1000     1000           
  Partials       17       17           
Flag Coverage Δ
unittests 23.70% <0.00%> (ø)

Files with missing lines Coverage Δ
cmd/flow_capture.go 31.78% <0.00%> (ø)

@memodi
Member Author

memodi commented Mar 3, 2025

/ok-to-test

@github-actions

github-actions bot commented Mar 3, 2025

New image:
quay.io/netobserv/network-observability-cli:0be6d3a

It will expire after two weeks.

To use this build, update your commands using:

USER=netobserv VERSION=0be6d3a make commands

or download the updated commands.

@github-actions

github-actions bot commented Mar 3, 2025

New image:
quay.io/netobserv/network-observability-cli:5df14c7

It will expire after two weeks.

To use this build, update your commands using:

USER=netobserv VERSION=5df14c7 make commands

or download the updated commands.

@github-actions

github-actions bot commented Mar 3, 2025

New image:
quay.io/netobserv/network-observability-cli:4ec33dc

It will expire after two weeks.

To use this build, update your commands using:

USER=netobserv VERSION=4ec33dc make commands

or download the updated commands.

@memodi
Member Author

memodi commented Mar 3, 2025

/ok-to-test

@github-actions

github-actions bot commented Mar 3, 2025

New image:
quay.io/netobserv/network-observability-cli:178ab12

It will expire after two weeks.

To use this build, update your commands using:

USER=netobserv VERSION=178ab12 make commands

or download the updated commands.

@memodi memodi requested a review from jpinsonneau March 3, 2025 20:09
Comment on lines 299 to 303
# output dir may already include files from previous runs
oldFlowFile=$(find ./output -name "*txt" | sort | tail -1)
${K8S_CLI_BIN} cp -n "$namespace" collector:output ./output
flowFile=$(find ./output -name "*txt" | sort | tail -1)
if [[ -n "$flowFile" && "$oldFlowFile" != "$flowFile" ]] ; then
Contributor

@jpinsonneau jpinsonneau Mar 4, 2025

We can assume the latest txt file is the new one since we delete it after the conversion
or even apply the json format to all the txt files found under the output/flow folder 😸

Member Author

or even apply the json format to all the txt files found under the output/flow folder 😸

I think that would be a bit of an overkill.

We can assume the latest txt file is the new one since we delete it after the conversion

I can delete L300 and remove the comparison condition.

Comment on lines 312 to 324
TMP_FILE="/$MANIFEST_OUTPUT_PATH/$filename"
cp "$file" "$TMP_FILE"
filenamePrefix=$(echo "$filename" | sed -E 's/(.*)\..*/\1/')
UPDATED_JSON_FILE="/$MANIFEST_OUTPUT_PATH/$filenamePrefix.json"
{
echo "["
# remove last line and "," (last character) of the last flowlog for valid json
sed '$d' "$TMP_FILE" | sed '$ s/.$//'
echo "]"
} >> "$UPDATED_JSON_FILE"
dirpath=$(dirname "$file")
mv "$UPDATED_JSON_FILE" "$dirpath"
rm "$TMP_FILE"
Contributor

Now that we have a txt file as input and a json file as output, we don't need to rely on a TMP_FILE anymore.
The json file can be written directly in the final directory, since it is produced all at once using the braces + redirect.

We should be able to simplify all of this to:

Suggested change
- TMP_FILE="/$MANIFEST_OUTPUT_PATH/$filename"
- cp "$file" "$TMP_FILE"
- filenamePrefix=$(echo "$filename" | sed -E 's/(.*)\..*/\1/')
- UPDATED_JSON_FILE="/$MANIFEST_OUTPUT_PATH/$filenamePrefix.json"
- {
- echo "["
- # remove last line and "," (last character) of the last flowlog for valid json
- sed '$d' "$TMP_FILE" | sed '$ s/.$//'
- echo "]"
- } >> "$UPDATED_JSON_FILE"
- dirpath=$(dirname "$file")
- mv "$UPDATED_JSON_FILE" "$dirpath"
- rm "$TMP_FILE"
+ dirpath=$(dirname "$file")
+ filenamePrefix=$(echo "$filename" | sed -E 's/(.*)\..*/\1/')
+ {
+ echo "["
+ # remove last line and "," (last character) of the last flowlog for valid json
+ sed '$d' "$file" | sed '$ s/.$//'
+ echo "]"
+ } >> "$dirpath/$filenamePrefix.json"
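
For illustration, the conversion discussed above can be sketched as a small standalone function. This is only a sketch under stated assumptions: the name `txt_to_json` is hypothetical, and the input layout (one JSON flow per line, each ending with ",", followed by a final line that is dropped) is inferred from the sed comment rather than confirmed by the collector code. It differs from the suggestion in two ways: it redirects with ">" instead of ">>", and it strips only a trailing "," rather than any last character.

```shell
# Hypothetical function: wrap a flow .txt file into a JSON array next to it.
# Assumed input: one JSON object per line, each followed by ",",
# plus a final line that is not kept (mirrors the sed '$d' above).
txt_to_json() {
  file="$1"
  json="${file%.*}.json"   # same directory, .txt extension replaced by .json
  {
    echo "["
    # drop the last line, then strip the trailing "," from the new last line
    sed '$d' "$file" | sed '$ s/,$//'
    echo "]"
  } > "$json"              # ">" truncates, so a re-run does not append a second array
  rm "$file"               # the text file is removed once converted
}
```

Because the text file is deleted after conversion, any `.txt` file found on a later run can be treated as new, which is the assumption made earlier in this thread.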

Contributor

What about the format suggested in #201 (comment) ?

Member Author

I just tried with yq, it takes about 30 seconds to return:

$ time yq --inplace -p=json -o=json '.' /tmp/2025-03-03T180301Z1.json
yq --inplace -p=json -o=json '.' /tmp/2025-03-03T180301Z1.json  34.34s user 2.23s system 165% cpu 22.116 total

vs jq:

$ time jq empty < /tmp/2025-03-03T180301Z1.json
jq empty < /tmp/2025-03-03T180301Z1.json  2.11s user 0.25s system 97% cpu 2.421 total

I am hesitant to use yq to validate the JSON given the high processing time. We can probably rely on jq, since it's more widely used than yq. If jq is absent, we can simply skip validation and assume the JSON is valid. WDYT?
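
The optional validation described above could look roughly like the following. This is a sketch only: `validate_json_if_possible` is a hypothetical name, and this approach was ultimately not kept in the PR.

```shell
# Hypothetical helper: validate a JSON file with jq when available, otherwise skip.
validate_json_if_possible() {
  file="$1"
  if ! command -v jq >/dev/null 2>&1; then
    # no hard dependency on jq: assume the file is valid
    echo "jq not found, skipping JSON validation"
    return 0
  fi
  # "jq empty" parses the input and prints nothing;
  # a parse error normally results in a non-zero exit status
  if jq empty < "$file" 2>/dev/null; then
    echo "valid JSON: $file"
  else
    echo "invalid JSON: $file"
    return 1
  fi
}
```

The trade-off is that behaviour then depends on whether jq is installed, and on which version, on the user's machine.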

Contributor

Interesting 👀 Indeed yq seems to be 3 times slower than jq on my machine.

I'm not in favor of adding jq as a dependency, and we may run into issues depending on the jq version installed on the machine, so I would rather simply remove the format part to keep things simple.

Users can still run the formatting themselves.

Contributor

cc @msherif1234 thoughts on that ?

Member Author

@jpinsonneau I am okay with removing the part where we test the format. We remove the txt file regardless, correct?

Contributor

Yeah, keeping only the unformatted JSON is good enough, I feel.

@memodi memodi force-pushed the 2133 branch 2 times, most recently from e5b2516 to 6383b30 Compare March 4, 2025 16:40
@memodi memodi mentioned this pull request Mar 6, 2025
10 tasks
@memodi
Member Author

memodi commented Mar 13, 2025

@jpinsonneau are we good here? let me know if there's anything more to address. thanks!

Contributor

@jpinsonneau jpinsonneau left a comment

Looks good to me ! Thank you sir !

@memodi
Member Author

memodi commented Mar 13, 2025

Looks good to me ! Thank you sir !

thanks, I think it needs the approve label.

@jpinsonneau
Contributor

Don't we wait for the QE review ? 😆

@openshift-ci

openshift-ci bot commented Mar 13, 2025

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@memodi
Member Author

memodi commented Mar 13, 2025

Don't we wait for the QE review ? 😆

oh, of course, I shouldn't have thought of my testing as QE testing for this one and probably should have asked @Amoghrd to review :D but on a serious note, would you mind giving it a quick try if you have not already?

@Amoghrd
Member

Amoghrd commented Mar 13, 2025

/ok-to-test

@github-actions

New image:
quay.io/netobserv/network-observability-cli:27bc89d

It will expire after two weeks.

To use this build, update your commands using:

USER=netobserv VERSION=27bc89d make commands

or download the updated commands.

@Amoghrd
Member

Amoghrd commented Mar 13, 2025

/label qe-approved

@memodi
Member Author

memodi commented Mar 13, 2025

thanks @Amoghrd

/retest

@jpinsonneau
Contributor

/retest

@jpinsonneau
Contributor

Don't worry I tested it too before approving 😄
Thanks @memodi & @Amoghrd !

@openshift-merge-bot openshift-merge-bot bot merged commit cb62cac into netobserv:main Mar 14, 2025
13 checks passed

3 participants