Implement file type identification heuristics based on:
* Text file extensions and text file paths
* Binary file extensions and binary file paths
* A final catch-all "unknown" category (treated as potentially binary)
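
As a rough illustration of how these three tiers might fit together, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than the actual implementation: the names `FileKind`, `classify`, and `is_potentially_binary`, the contents of the lookup tables, and the choice to check text rules before binary rules.

```python
from enum import Enum
from pathlib import PurePosixPath


class FileKind(Enum):
    TEXT = "text"
    BINARY = "binary"
    UNKNOWN = "unknown"


# Hypothetical lookup tables; a real project would tune these lists.
TEXT_EXTENSIONS = {".txt", ".md", ".py", ".json", ".yaml"}
TEXT_NAMES = {"makefile", "dockerfile", "license", "readme"}
BINARY_EXTENSIONS = {".png", ".jpg", ".zip", ".exe", ".pdf"}
BINARY_PATH_PARTS = {"__pycache__"}  # directory names assumed to hold binaries


def classify(path: str) -> FileKind:
    """Classify a path by name alone, without reading file contents."""
    # Normalize Windows separators so one code path handles both styles.
    p = PurePosixPath(path.replace("\\", "/"))
    ext = p.suffix.lower()
    name = p.name.lower()
    parent_parts = {part.lower() for part in p.parts[:-1]}

    # Tier 1: known text extensions or well-known text file names.
    if ext in TEXT_EXTENSIONS or name in TEXT_NAMES:
        return FileKind.TEXT
    # Tier 2: known binary extensions or directories assumed to be binary.
    if ext in BINARY_EXTENSIONS or parent_parts & BINARY_PATH_PARTS:
        return FileKind.BINARY
    # Tier 3: catch-all.
    return FileKind.UNKNOWN


def is_potentially_binary(path: str) -> bool:
    """Treat anything not positively identified as text as possibly binary."""
    return classify(path) is not FileKind.TEXT
```

With these tables, `classify("Makefile")` yields `FileKind.TEXT`, `classify("img/logo.PNG")` yields `FileKind.BINARY`, and an extensionless path such as `data/blob` falls through to `FileKind.UNKNOWN`, which `is_potentially_binary` then reports as `True`.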