- 
                Notifications
    
You must be signed in to change notification settings  - Fork 16
 
NETOBSERV-2133 CLI flows JSON #201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| 
           /ok-to-test  | 
    
| 
           New image: It will expire after two weeks. To use this build, update your commands using: USER=netobserv VERSION=bb25ca5 make commands | 
    
| 
           I tested using this script, we're now getting valid json:  | 
    
        
          
                scripts/functions.sh
              
                Outdated
          
        
      | echo "Copying collector output files..." | ||
| mkdir -p ./output | ||
| ${K8S_CLI_BIN} cp -n "$namespace" collector:output ./output | ||
| flowFile=$(find ./output -name "*json" | sort | tail -1) | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyOutput is called for both flows and packets. Usually you can rely on $command to know which command is running but the user can also run oc netobserv copy and we don't know what been copied in that case since it was a background run.
We should refactor the collector to output a .txt file instead, which will be parsed by the script and converted to a .json file properly formatted. Doing so, we will be able to skip the format step when the text file is not found.
WDYT ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, to pretty print the output once the json is correct, we can use the following command:
$ yq --inplace -p=json -o=json '.' ./output/flow/<filename>.json
- I'm relying on yq instead of jq here as we have it as dependency when we run the script
 - using inplace to edit the same file
 - mentionning -p=json means we are parsing a json file and -o=json to output a json file
 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyOutput is called for both flows and packets. Usually you can rely on $command to know which command is running but the user can also run oc netobserv copy and we don't know what been copied in that case since it was a background run.
I don't see$command variable set in functions.sh, I see we're checking for numbered arguments for $1 or $3 , any reason why arg numbers are not consistent?
We should refactor the collector to output a .txt file instead, which will be parsed by the script and converted to a .json file properly formatted. Doing so, we will be able to skip the format step when the text file is not found.
yes, that makes sense. I'll look into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$command is inside netobserv main script: 
| command="" | 
$1 / $3 is relative to the fonction called. So it differs indeed !
Let's focus on the txt / json files which should be good enough 👍
          Codecov ReportAttention: Patch coverage is  
 
 Additional details and impacted files@@           Coverage Diff           @@
##             main     #201   +/-   ##
=======================================
  Coverage   23.70%   23.70%           
=======================================
  Files          11       11           
  Lines        1333     1333           
=======================================
  Hits          316      316           
  Misses       1000     1000           
  Partials       17       17           
 Flags with carried forward coverage won't be shown. Click here to find out more. 
 🚀 New features to boost your workflow:
  | 
    
| 
           /ok-to-test  | 
    
| 
           New image: It will expire after two weeks. To use this build, update your commands using: USER=netobserv VERSION=0be6d3a make commands | 
    
| 
           New image: It will expire after two weeks. To use this build, update your commands using: USER=netobserv VERSION=5df14c7 make commands | 
    
| 
           New image: It will expire after two weeks. To use this build, update your commands using: USER=netobserv VERSION=4ec33dc make commands | 
    
| 
           /ok-to-test  | 
    
| 
           New image: It will expire after two weeks. To use this build, update your commands using: USER=netobserv VERSION=178ab12 make commands | 
    
        
          
                scripts/functions.sh
              
                Outdated
          
        
      | # output dir may already include files from previous runs | ||
| oldFlowFile=$(find ./output -name "*txt" | sort | tail -1) | ||
| ${K8S_CLI_BIN} cp -n "$namespace" collector:output ./output | ||
| flowFile=$(find ./output -name "*txt" | sort | tail -1) | ||
| if [[ -n "$flowFile" && "$oldFlowFile" != "$flowFile" ]] ; then | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can assume the latest txt file is the new one since we delete it after the conversion
or even apply the json format to all the txt files found under the output/flow folder 😸
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or even apply the json format to all the txt files found under the output/flow folder 😸
I think that would be bit overkill.
We can assume the latest txt file is the new one since we delete it after the conversion
I can delete L300 and remove condition to compare.
        
          
                scripts/functions.sh
              
                Outdated
          
        
      | TMP_FILE="/$MANIFEST_OUTPUT_PATH/$filename" | ||
| cp "$file" "$TMP_FILE" | ||
| filenamePrefix=$(echo "$filename" | sed -E 's/(.*)\..*/\1/') | ||
| UPDATED_JSON_FILE="/$MANIFEST_OUTPUT_PATH/$filenamePrefix.json" | ||
| { | ||
| echo "[" | ||
| # remove last line and "," (last character) of the last flowlog for valid json | ||
| sed '$d' "$TMP_FILE" | sed '$ s/.$//' | ||
| echo "]" | ||
| } >> "$UPDATED_JSON_FILE" | ||
| dirpath=$(dirname "$file") | ||
| mv "$UPDATED_JSON_FILE" "$dirpath" | ||
| rm "$TMP_FILE" | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have a txt file as input and a json file as output, we don't need to rely on a TMP_FILE anymore.
The json file can be written in the final directory as it's done all at once using the brackets + redirect
We sould be able to simplify all of this by:
| TMP_FILE="/$MANIFEST_OUTPUT_PATH/$filename" | |
| cp "$file" "$TMP_FILE" | |
| filenamePrefix=$(echo "$filename" | sed -E 's/(.*)\..*/\1/') | |
| UPDATED_JSON_FILE="/$MANIFEST_OUTPUT_PATH/$filenamePrefix.json" | |
| { | |
| echo "[" | |
| # remove last line and "," (last character) of the last flowlog for valid json | |
| sed '$d' "$TMP_FILE" | sed '$ s/.$//' | |
| echo "]" | |
| } >> "$UPDATED_JSON_FILE" | |
| dirpath=$(dirname "$file") | |
| mv "$UPDATED_JSON_FILE" "$dirpath" | |
| rm "$TMP_FILE" | |
| dirpath=$(dirname "$file") | |
| filenamePrefix=$(echo "$filename" | sed -E 's/(.*)\..*/\1/') | |
| { | |
| echo "[" | |
| # remove last line and "," (last character) of the last flowlog for valid json | |
| sed '$d' "$file" | sed '$ s/.$//' | |
| echo "]" | |
| } >> "$dirpath/$filenamePrefix.json" | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the format suggested in #201 (comment) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tried with yq, it takes about 30 seconds to return:
$ time yq --inplace -p=json -o=json '.' /tmp/2025-03-03T180301Z1.json
yq --inplace -p=json -o=json '.' /tmp/2025-03-03T180301Z1.json  34.34s user 2.23s system 165% cpu 22.116 total
vs jq:
$ time jq empty < /tmp/2025-03-03T180301Z1.json
jq empty < /tmp/2025-03-03T180301Z1.json  2.11s user 0.25s system 97% cpu 2.421 total
I am hesitant to use yq way to validate json given the high processing time,  we can probably rely on jq since it's widely prevalent in usage than yq . If jq is absent we can simply skip validation and assume it'd be valid json, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting 👀  Indeed yq seems to be 3 times slower than jq on my machine.
I'm not in favor to set jq as dependency and we may have issues depending on the jq version installed on the machine so I would be in favor to simply remove the format part to keep things simple.
The user can still run the format by himself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @msherif1234 thoughts on that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpinsonneau I am okay to remove that part where we test format. We remove the txt file regardless, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah keeping only the json non formated is good enough I feel
e5b2516    to
    6383b30      
    Compare
  
    | 
           @jpinsonneau are we good here? let me know if there's anything more to address. thanks!  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me ! Thank you sir !
          
 thanks, I think it needs approve label.  | 
    
| 
           Don't we wait for the QE review ? 😆  | 
    
| 
           [APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: The full list of commands accepted by this bot can be found here. The pull request process is described here 
Needs approval from an approver in each of these files:
 
      Approvers can indicate their approval by writing   | 
    
          
 oh, of course, I shouldn't have thought of my testing as QE testing for this one and probably should have asked @Amoghrd to review :D but on a serious note, would you mind giving a quick try if you have not already?  | 
    
| 
           /ok-to-test  | 
    
| 
           New image: It will expire after two weeks. To use this build, update your commands using: USER=netobserv VERSION=27bc89d make commands | 
    
| 
           /label qe-approved  | 
    
| 
           thanks @Amoghrd /retest  | 
    
| 
           /retest  | 
    
Description
NETOBSERV-2133 CLI flows JSON
Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.