Unable to determine when files are finished #11

Closed
sjkjs opened this issue Oct 11, 2012 · 8 comments

Comments

@sjkjs

sjkjs commented Oct 11, 2012

I'm trying to feed the output files from tcpflow into another program. I use inotify to monitor for files that have been closed and pick them up that way.

However, for each packet tcpflow receives, it seems to open the output file, write the data, then close the file. This means that inotify picks up the file after the very first packet.

Is it possible to keep the files open until a FIN/RST has been received (or a timeout), then close them?

@simsong
Owner

simsong commented Oct 11, 2012

Files are closed either when the file is finished or when the file handles are needed elsewhere, and not before. There are better ways to know when a file is done. For example, you could monitor the XML file, have the XML output go to a pipe, or use some other IPC mechanism. What's your preference? Having the XML go to a named pipe or a numbered pipe would be easiest.

@sjkjs
Author

sjkjs commented Oct 11, 2012

Thanks for the quick reply!

Interesting - all my files seem to be getting closed almost immediately. I wonder if inotify is interfering with it somehow.

I can monitor the XML file for changes, but I was concerned that it would just keep growing and become massive if I don't restart tcpflow occasionally. Is a file guaranteed to be finished once it's been entered into the XML file?

Alternatively, do you think it makes sense to write the files in /tmp until they're complete, at which point they get moved to the output directory (and the XML gets written to)?
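The write-then-move idea above can be illustrated with a small sketch. This is not how tcpflow writes its output; `publish` and its parameters are hypothetical names, and the sketch assumes the staging directory sits on the same filesystem as the output directory, since a rename is only atomic within one filesystem (which may rule out /tmp on many systems).

```python
import os
import tempfile


def publish(data, final_path, staging_dir):
    """Write `data` to a temporary file in `staging_dir`, then rename it into place.

    Hypothetical helper: a watcher on the output directory only ever sees
    the file once it is complete, because the rename is a single step.
    """
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk before the rename
        # os.replace is atomic only when both paths are on the same filesystem;
        # across filesystems it raises OSError instead of copying.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this pattern, an inotify consumer would watch the output directory for IN_MOVED_TO rather than IN_CLOSE_WRITE. It still would not help with the re-open case for late packets discussed in this thread.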

@simsong
Owner

simsong commented Oct 11, 2012

Nothing is guaranteed in this life. If out-of-order packets are delivered, the file will be re-opened so that they can be written. Many people who use tcpflow think that TCP is much cleaner than it actually is.

@sjkjs
Author

sjkjs commented Oct 11, 2012

So if I want to process then delete the output files, is there any way I can be certain that the file won't be reopened, without completely killing tcpflow?

Am I better off doing something like waiting for the entry in the XML file, then waiting another 5 minutes before processing the file?
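A minimal sketch of the "wait for the XML entry, then wait a while" approach: treat a file as ready only once it has not been modified for a quiet period. The helper names and the 300-second default are assumptions for illustration. As noted earlier in the thread, this is a heuristic and cannot guarantee that a late out-of-order packet will not re-open the file.

```python
import os
import time

QUIET_SECONDS = 300  # the "another 5 minutes" from the comment; tune as needed


def is_quiet(path, quiet_seconds=QUIET_SECONDS, now=None):
    """Return True if `path` has not been modified for at least `quiet_seconds`."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime >= quiet_seconds


def ready_files(paths, quiet_seconds=QUIET_SECONDS, now=None):
    """Filter candidate paths (e.g. filenames seen in the XML) down to quiet ones."""
    ready = []
    for path in paths:
        try:
            if is_quiet(path, quiet_seconds, now):
                ready.append(path)
        except FileNotFoundError:
            # Already processed and deleted, or not created yet: skip it.
            continue
    return ready
```

A consumer could call `ready_files` periodically on the set of filenames it has seen in the XML, process whatever comes back, and keep the rest for the next pass.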

@simsong
Owner

simsong commented Oct 11, 2012

What problem are you really trying to solve? We are building an API for tcpflow that will allow you to link in a shared library.

@sjkjs
Author

sjkjs commented Oct 12, 2012

I'm using tcpflow as one component for a network-based IDS. I feed the output files from tcpflow to foremost (to carve actual PDF/JPG/etc files from things like HTTP responses), then do analysis on the output files from foremost to look for things like shellcode.

So, tcpflow will be essentially running 24/7, constantly outputting more data.

I did a quick test where I did what I described in my previous comment (wait for the XML entry, then sleep for a little while before touching the file) and it appeared to work so far - but I haven't extensively tested it yet.

@simsong
Owner

simsong commented Oct 12, 2012

If you have the XML file go to a pipe and read the pipe, you should be fine. I do not think that the XML file flushes after each connection, but you could easily add that.
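Reading the XML from a pipe could look roughly like the sketch below. It assumes the report is XML in which each finished flow appears as a `<fileobject>` element containing a `<filename>` element; that layout, the function names, and the FIFO path are assumptions rather than something confirmed in this thread. It parses incrementally, so entries are reported as they arrive instead of when the document ends.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}fileobject' down to 'fileobject'."""
    return tag.rsplit("}", 1)[-1]


def completed_flows(stream, chunk_size=4096):
    """Yield the <filename> text of each <fileobject> as soon as it has been read.

    `stream` is any binary file-like object, e.g. a FIFO opened unbuffered.
    """
    parser = ET.XMLPullParser(events=("end",))
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # the writer closed the pipe
            break
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if local_name(elem.tag) != "fileobject":
                continue
            for child in elem.iter():
                if local_name(child.tag) == "filename" and child.text:
                    yield child.text
            elem.clear()  # free memory; the report may run for days


def main(fifo_path="/tmp/tcpflow-report.fifo"):  # hypothetical path
    # buffering=0 makes read() return whatever is available instead of
    # waiting for a full chunk, so new entries are seen promptly.
    with open(fifo_path, "rb", buffering=0) as fifo:
        for name in completed_flows(fifo):
            print(name)
```

Because the pipe is consumed as it is written, nothing accumulates on disk, which addresses the concern about the XML file growing without bound.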

@simsong
Owner

simsong commented Dec 16, 2012

The XML file now flushes after each connection. I want to modify the system so that the connections only terminate when both the FIN is received and all of the data is complete.
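The condition described here (FIN received and all data complete) can be expressed as a small check. This is an illustrative sketch only, not tcpflow's implementation. `is_complete` is a hypothetical name, segments are assumed to be `(start, end)` byte offsets relative to the start of the stream with `end` exclusive, and `fin_offset` is the stream offset at which the FIN was seen.

```python
def is_complete(segments, fin_offset):
    """Return True if a FIN was seen and the received segments cover [0, fin_offset).

    segments:   iterable of (start, end) byte ranges, end exclusive
    fin_offset: stream offset of the FIN, or None if no FIN has been seen
    """
    if fin_offset is None:
        return False  # connection has not signalled its end yet
    covered = 0
    for start, end in sorted(segments):
        if start > covered:
            return False  # gap: at least one segment is still missing
        covered = max(covered, end)
    return covered >= fin_offset


# Out-of-order delivery is fine as long as there are no gaps.
print(is_complete([(100, 250), (0, 100)], 250))  # True
print(is_complete([(0, 100), (150, 250)], 250))  # False
```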

simsong closed this as completed Dec 16, 2012