Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NIFI-13082] Created SplitPcap processor, Pcap supporting class, and … #8691

Closed

Conversation

JackHintonSmartDCSIT
Copy link
Contributor

@JackHintonSmartDCSIT JackHintonSmartDCSIT commented Apr 23, 2024

Created SplitPcap processor, Pcap supporting class, and attendant unit tests.

This processor is used to split a flowfile containing a large PCAP file (in binary format) into smaller files, depending on the max size set by the user. Each of the resultant flowfiles contains a valid PCAP that may be used as normal.

If the incoming PCAP is malformed, in the wrong format, or the incoming PCAP contains a packet that is larger than the maximum desired size, the incoming flowfile will be transferred to the 'failure' relationship and an 'error' attribute will be added to that flowfile stating the reason for failure.

Summary

NIFI-13082

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, as such NIFI-00000

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using mvn clean install -P contrib-check
    • JDK 21

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for proposing this new Processor @JackHintonSmartDCSIT.

There are some coding convention details that I would highlight, but before going into those details, I first recommend moving this processor to the nifi-network-bundle instead of the nifi-standard-processors bundle. Although PCAP is a common format, this Processor seems to fit more naturally under a different bundle.

I also recommend naming this SplitPCAP, following the general NiFi convention of using capital letters for acronyms in Processor class names.

As far as the substance of the code, the relationship of the Pcap class to the source from the Kaitai Project was not quite clear. Do you have a source code reference as well as the general PCAP page listed in the NOTICE file?

I would also be interested in feedback from another project contributor about the maintainability of this Processor, as we want to avoid bringing in components that might not be actively maintained.

@exceptionfactory
Copy link
Contributor

@JackHintonSmartDCSIT I was looking with the GitHub repositories for the Kaitai Project instead of the PCAP home page, now I see the source for the file mentioned.

@exceptionfactory
Copy link
Contributor

As the Pcap class already includes notable differences from the example, it is worth retaining the link, but also refactoring the code to align it with other project conventions, such as removing the use of _ as a prefix in variable names. The use of the ByteBufferInterface also raises some questions about whether the standard Java ByteBuffer could be used. That's on a cursory review, a more detailed evaluation is still necessary.

@dan-s1
Copy link
Contributor

dan-s1 commented May 6, 2024

@JackHintonSmartDCSIT Can you please rebase as there are conflicts due to the latest changes. Thanks!

@dan-s1
Copy link
Contributor

dan-s1 commented May 6, 2024

Thanks for proposing this new Processor @JackHintonSmartDCSIT.

There are some coding convention details that I would highlight, but before going into those details, I first recommend moving this processor to the nifi-network-bundle instead of the nifi-standard-processors bundle. Although PCAP is a common format, this Processor seems to fit more naturally under a different bundle.

I also recommend naming this SplitPCAP, following the general NiFi convention of using capital letters for acronyms in Processor class names.

As far as the substance of the code, the relationship of the Pcap class to the source from the Kaitai Project was not quite clear. Do you have a source code reference as well as the general PCAP page listed in the NOTICE file?

I would also be interested in feedback from another project contributor about the maintainability of this Processor, as we want to avoid bringing in components that might not be actively maintained.

@exceptionfactory My team would certainly be interested in this processor as we have our own custom processor but not with the use of kaitai. I am trying to better understand how kaitai works.

@exceptionfactory
Copy link
Contributor

Thanks for proposing this new Processor @JackHintonSmartDCSIT.
There are some coding convention details that I would highlight, but before going into those details, I first recommend moving this processor to the nifi-network-bundle instead of the nifi-standard-processors bundle. Although PCAP is a common format, this Processor seems to fit more naturally under a different bundle.
I also recommend naming this SplitPCAP, following the general NiFi convention of using capital letters for acronyms in Processor class names.
As far as the substance of the code, the relationship of the Pcap class to the source from the Kaitai Project was not quite clear. Do you have a source code reference as well as the general PCAP page listed in the NOTICE file?
I would also be interested in feedback from another project contributor about the maintainability of this Processor, as we want to avoid bringing in components that might not be actively maintained.

@exceptionfactory My team would certainly be interested in this processor as we have our own custom processor but not with the use of kaitai. I am trying to better understand how kaitai works.

Thanks for the comments @dan-s1, that is helpful.

As noted in the initial review, there are a number of coding convention concerns, along with the location of this Processor in the standard-nar. It should be possible to address those and move this forward, and it looks like there are no dependencies on the Kaitai project, which is useful. I will be glad to give this a closer review after our initial comments have been addressed.

@JackHintonSmartDCSIT
Copy link
Contributor Author

I've made the changes suggested, let me know if I need to do anything else. Thanks!

@exceptionfactory
Copy link
Contributor

Thanks for the updates @JackHintonSmartDCSIT. Please review the comments from @dan-s1 as it looks like there are still several outstanding recommendations, such as using static imports for JUnit assert methods. I plan on taking a closer look soon.

@JackHintonSmartDCSIT
Copy link
Contributor Author

Thanks for the updates @JackHintonSmartDCSIT. Please review the comments from @dan-s1 as it looks like there are still several outstanding recommendations, such as using static imports for JUnit assert methods. I plan on taking a closer look soon.

Ah, I missed those ones - thanks for pointing them out.

Copy link
Contributor

@dan-s1 dan-s1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JackHintonSmartDCSIT Thanks for making all the changes so far.
There are some other changes mostly Intellij suggestions. which I have noted.
In addition I noticed that there is no entry for the SplitPcap processor in nifi-nar-bundles/nifi-network-bundle/nifi-network-processors/src/main/resources/META-INF/services/org.apache.nifi.processor.Processor. That is necessary for NIFI to discover your processor and display it in the UI.

Copy link
Contributor

@dan-s1 dan-s1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JackHintonSmartDCSIT Thanks for the changes, they look great. I have some more stylistic change requests but more critical are the request changes to align this processor with the current family of split processors (SplitAvro, SplitContent, SplitXml, SplitJson and SplitRecord). Also I think it may be helpful in addition to the capability description perhaps adding an additional details section where more details regarding PCAP can be added. You will need to add the directory src/main/resources/docs/org.apache.nifi.processors.network.SplitPCAP and have in there an additional Details.html file. You can copy the one from src/main/resources/docs/org.apache.nifi.processors.network.ParseNetflowv5 and edit accordingly.

@dan-s1
Copy link
Contributor

dan-s1 commented May 16, 2024

@JackHintonSmartDCSIT Looks like you have Checkstyle violations. Please fix them and then I will restart the build.

@JackHintonSmartDCSIT
Copy link
Contributor Author

@JackHintonSmartDCSIT Looks like you have Checkstyle violations. Please fix them and then I will restart the build.

Forgot to add the file prior to committing, oops.

@dan-s1
Copy link
Contributor

dan-s1 commented May 17, 2024

@JackHintonSmartDCSIT Please run mvn clean install -Pcontrib-check locally which will run the checkstyle checks which you can then fix. Thanks!

@dan-s1
Copy link
Contributor

dan-s1 commented May 17, 2024

@exceptionfactory Please advise whether the SplitPcap processor should remain in package org.apache.nifi.processors.network and classes LinkType and PCAP remain in org.apache.nifi.processors.network.util. I was thinking they should all go under a new package named org.apache.nifi.processors.network.pcap.

@exceptionfactory
Copy link
Contributor

@exceptionfactory Please advise whether the SplitPcap processor should remain in package org.apache.nifi.processors.network and classes LinkType and PCAP remain in org.apache.nifi.processors.network.util. I was thinking they should all go under a new package named org.apache.nifi.processors.network.pcap.

Moving the new classes under a pcap package sounds like the best way forward.

@JackHintonSmartDCSIT
Copy link
Contributor Author

I've recently determined that the linkstyle enum is unnecessary for the use case of splitting PCAP files - it would only be useful if further processing was done after parsing. I'll move the other bits into a new package now.

@dan-s1
Copy link
Contributor

dan-s1 commented May 17, 2024

@exceptionfactory Is it okay to have the PCAP class named in all caps (pardon the pun) or should it be camel cased Pcap?

@JackHintonSmartDCSIT
Copy link
Contributor Author

I've noticed it's got checkstyle violations again - the same ones as before - but when I run it using mvn clean install -P contrib-check it doesn't flag them up. Is there some config item I'm missing?

@dan-s1
Copy link
Contributor

dan-s1 commented May 17, 2024

@JackHintonSmartDCSIT I believe @exceptionfactory just had a comment on a PR found here indicating there are new Checkstyle rules . I assume you would have to rebase in order to pick them up.

@JackHintonSmartDCSIT
Copy link
Contributor Author

Ah good to know, thanks. I'll do that now

@exceptionfactory
Copy link
Contributor

@exceptionfactory Is it okay to have the PCAP class named in all caps (pardon the pun) or should it be camel cased Pcap?

Yes, I had originally recommend naming the Processor SplitPCAP, so it seems reasonable to carry that name over. However, it might be better to name it more specifically to something like PcapUtils. I'm fine as it is for now.

@exceptionfactory
Copy link
Contributor

Ah good to know, thanks. I'll do that now

Thanks for making adjustments @JackHintonSmartDCSIT, part of the challenge of continual adjustments and improvements, I appreciate the responsiveness!

@JackHintonSmartDCSIT
Copy link
Contributor Author

Had a minor heart attack there - turns out that rebase went really badly wrong and I managed to revert everything by 2 weeks. A minor scramble through this repo later, I've gotten it back to where it was prior to the 'rebase' I attempted.

@exceptionfactory
Copy link
Contributor

@JackHintonSmartDCSIT I pushed one comment to reformat all of the files. You may have addressed it in an earlier commit that was lost in the rebase, but I noticed almost all the files had a single extra space indenting each line.

Aside from that, the code looks good and runtime testing with various files from WireShark SampleCaptures worked well.

Do you have any further feedback @dan-s1?

Otherwise, I think this should be ready to go pending successful builds!

@JackHintonSmartDCSIT
Copy link
Contributor Author

Ah fair enough, good spot!

@dan-s1
Copy link
Contributor

dan-s1 commented Jun 20, 2024

@exceptionfactory I am good. I think everything looks great!
@JackHintonSmartDCSIT Thank you so much for working this ticket so diligently! You have done a great job to address all issues. This feature will certainly add great value.

@JackHintonSmartDCSIT
Copy link
Contributor Author

@dan-s1 No worries! Thank you both for helping to get it where it is!

Copy link
Contributor

@exceptionfactory exceptionfactory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your perseverance and responsiveness @JackHintonSmartDCSIT, and thank you @dan-s1 for tag-teaming on this review! This highlights the value of attention to detail on new features, and the importance of responsive collaboration. Thanks again for the work @JackHintonSmartDCSIT! +1 merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants