Memory leak in ndpi_free_flow() #994
Comments
|
@FitzBC Hi, can you please provide the version of nfstream you are using, and which interpreter (PyPy or CPython) and its version? How did you confirm that it is a memory leak? Was it in live or offline mode? BR, |
|
@aouinizied Thanks for your reply. I am using offline mode, and the relevant versions are as follows:
To confirm the memory leak: first, while my program runs, memory usage keeps rising until the process is eventually killed. You can verify this with the simple test code below. Because it is a single process, memory usage grows slowly, at about 2 MB/s; you can use a process pool to speed this up.

    from nfstream import NFStreamer
    import os


    def nfs_judge_process(pcap_name, return_dict=None):
        my_awesome_streamer = NFStreamer(source=pcap_name,  # or network interface
                                         snaplen=1,
                                         idle_timeout=0,
                                         active_timeout=0,
                                         # plugins=(),
                                         dissect=True,
                                         max_tcp_dissections=1,
                                         max_udp_dissections=1,
                                         statistics=False,
                                         enable_guess=True,
                                         decode_tunnels=True,
                                         bpf_filter=None,
                                         promisc=False)
        data_pandas = my_awesome_streamer.to_pandas(ip_anonymization=False)
        # Application name of the first flow in the capture
        application_name = data_pandas[0:1]['application_name'][0]
        return application_name


    def main():
        pcap_dir = 'your_pcap_dir'
        while True:  # loop forever so the memory growth is easy to observe
            for path, dir_list, file_list in os.walk(pcap_dir):
                file_list.sort()
                for pcap_name in file_list:
                    nfs_judge_process(os.path.join(pcap_dir, pcap_name))


    if __name__ == "__main__":
        main()

Second, I used muppy to help locate the leak. My code is as follows:

    from pympler import muppy, summary

    # Fragment: pcap_dir and pcap_name come from the loop in main() above
    all_objects_1 = muppy.get_objects()
    sum1 = summary.summarize(all_objects_1)
    nfs_judge_process(os.path.join(pcap_dir, pcap_name))
    sum2 = summary.summarize(muppy.get_objects())
    diff = summary.get_diff(sum1, sum2)
    summary.print_(diff)

The output of each loop iteration is similar to the following: |
|
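The pympler comparison above only counts objects managed by the Python interpreter, so memory allocated inside a C library such as nDPI would not show up in it. One rough way to tell the two apart is to compare the growth of the Python heap (via `tracemalloc`) with the growth of the whole process's resident memory. The sketch below assumes Linux, because it reads `/proc/self/statm`; `rss_bytes`, `measure_growth` and `leaky` are hypothetical names used only for illustration, and `leaky` is a stand-in workload, not nfstream.

```python
import os
import tracemalloc


def rss_bytes():
    """Current resident set size of this process in bytes (Linux only)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


def measure_growth(func, iterations=10):
    """Call func repeatedly; return (python_heap_growth, rss_growth) in bytes."""
    tracemalloc.start()
    heap_before, _ = tracemalloc.get_traced_memory()
    rss_before = rss_bytes()
    for _ in range(iterations):
        func()
    heap_after, _ = tracemalloc.get_traced_memory()
    rss_after = rss_bytes()
    tracemalloc.stop()
    return heap_after - heap_before, rss_after - rss_before


if __name__ == "__main__":
    kept = []

    def leaky():
        # Stand-in workload (assumption): keeps about 1 MB alive per call.
        # Replace with a call such as nfs_judge_process(path) to test nfstream.
        kept.append(bytearray(b"x" * 1_000_000))

    heap_growth, rss_growth = measure_growth(leaky)
    print("Python heap growth:", heap_growth, "bytes")
    print("Process RSS growth:", rss_growth, "bytes")
```

If the Python heap figure stays roughly flat across iterations while the RSS figure keeps climbing, the growth is probably happening in native code rather than in Python objects. RSS is a noisy measure, so this is a hint rather than proof.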
@FitzBC Thanks for the information.
|
In addition, I think that even if every packet ends up in its own flow, memory should not rise rapidly as long as each flow releases its memory normally after use. Finally, I found something interesting. In the nfs_judge_process() function of the sample code I provided, if you comment out the following two lines and replace them with del my_awesome_streamer:

    # data_pandas = my_awesome_streamer.to_pandas(ip_anonymization=False)
    # application_name = data_pandas[0:1]['application_name'][0]
    del my_awesome_streamer |
|
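Until the root cause is found, one possible workaround is to process each pcap in a short-lived worker process, so that any memory a worker fails to release goes back to the operating system when it exits. The sketch below uses the standard library's `multiprocessing.Pool` with its `maxtasksperchild` parameter. `classify_pcap` and `run_isolated` are hypothetical names, and `classify_pcap` is only a stand-in for nfs_judge_process() from the sample code; no nfstream call is made here.

```python
import multiprocessing
import os


def classify_pcap(pcap_path):
    """Hypothetical stand-in for nfs_judge_process(); returns (path, worker pid)."""
    return pcap_path, os.getpid()


def run_isolated(pcap_paths, workers=2):
    # maxtasksperchild=1 makes the pool replace each worker after one task,
    # so memory still held by a finished worker is freed when it exits.
    # chunksize=1 ensures every pcap counts as its own task.
    with multiprocessing.Pool(processes=workers, maxtasksperchild=1) as pool:
        return pool.map(classify_pcap, pcap_paths, chunksize=1)


if __name__ == "__main__":
    for path, pid in run_isolated(["a.pcap", "b.pcap", "c.pcap"]):
        print(path, "handled by worker", pid)
```

This hides the leak rather than fixing it, and starting a new process per file adds overhead, so a larger `maxtasksperchild` may be a better trade-off for many small captures.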
@FitzBC Thank you for the provided details.
BR, |
I am using nfstream to detect traffic with the dissect option turned on, which means that nfstream uses nDPI to complete this task. However, I ran into a memory leak. nfstream is written in Python, where memory leaks are rarely encountered, so I set out to locate the cause.
In nfstream, I traced the problem to this line of code: plugin.py. That code in fact calls the ndpi_flow_free() function in nDPI: ndpi_main.c. Comparing the memory release operations in ndpi_flow_free() with the definition of the ndpi_flow_struct structure, I think that ndpi_flow_free() does not completely release all the members of ndpi_flow_struct. I think this may be the root cause of the memory leak.