Tar on Windows is very slow when extracting many small files from tar files #27

@warpdesign

Environment

| Item | Value |
| --- | --- |
| OS, Version / Build | Windows 10 version 10.0.19041.388 |
| Processor Architecture | AMD64 |
| Processor Type & Model | Core i5 (Skylake) |
| Memory | 8GB |
| Storage Type, free / capacity (e.g. C: SSD 128GB / 512GB) | SSD 256GB |
| Relevant apps installed | ___ |

Description

Creating many small files is very slow on Windows. Windows Defender's real-time protection appears to make things even slower, but even with it disabled Windows appears to lag behind Linux and macOS when handling many small files.

To show how slow Windows is at creating lots of small files, I downloaded the Firefox sources. Since the download is a tar.bz2 file, I first decompressed the .bz2 layer to get the .tar file, and all tests used that .tar file as the source. It is an 852 MB file that extracts to a directory containing 119 954 files.
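For anyone who wants to check the size and entry count of an archive before running the test, here is a minimal sketch using Python's standard `tarfile` module. The function name `summarize_tar` is made up for this example, and it counts regular files and directories separately, which may or may not match how the 119 954 figure above was obtained.

```python
import tarfile


def summarize_tar(path):
    """Return (regular_files, directories, total_file_bytes) for an uncompressed tar."""
    files = 0
    directories = 0
    total_bytes = 0
    # Mode "r:" opens the archive for reading without any compression,
    # which matches the already-decompressed .tar used in the tests.
    with tarfile.open(path, "r:") as archive:
        for member in archive:
            if member.isfile():
                files += 1
                total_bytes += member.size
            elif member.isdir():
                directories += 1
    return files, directories, total_bytes
```

Calling `summarize_tar("firefox-40.0.source.tar")` only reads the archive headers and does not write anything to disk, so it should not be affected by the slowdown described here.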

I know it's an extreme case, but it's not rare to deal with thousands of small files when working with npm, git, development tools, etc.

For comparison, I added macOS results (from a slow MacBook with a Core m3).

| Operation | Windows 10 (protection on) | Windows 10 (protection on, folder excluded) | Windows 10 (protection disabled) | Ubuntu (WSL2) | macOS |
| --- | --- | --- | --- | --- | --- |
| extract tar file | 28m38s | 3m14s | 1m29s | 15s | 50s |
| delete directory | 58s | 57s | 45s | 5s | 14s |
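To make the gap easier to read, the timings in the table above can be converted to seconds and expressed as a slowdown factor. The sketch below uses Ubuntu (WSL2) as the baseline, which is my own choice for illustration; the helper names are hypothetical and the numbers are copied from the table.

```python
import re


def to_seconds(text):
    """Convert a duration such as '28m38s' or '15s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + int(seconds)


# Timings copied from the results table in this issue.
TIMINGS = {
    "tar extraction": {
        "Windows 10 (protection on)": "28m38s",
        "Windows 10 (protection on, folder excluded)": "3m14s",
        "Windows 10 (protection disabled)": "1m29s",
        "Ubuntu (WSL2)": "15s",
        "macOS": "50s",
    },
    "directory deletion": {
        "Windows 10 (protection on)": "58s",
        "Windows 10 (protection on, folder excluded)": "57s",
        "Windows 10 (protection disabled)": "45s",
        "Ubuntu (WSL2)": "5s",
        "macOS": "14s",
    },
}


def slowdowns(row, baseline="Ubuntu (WSL2)"):
    """Return each configuration's time divided by the baseline time."""
    base = to_seconds(row[baseline])
    return {name: round(to_seconds(value) / base, 1) for name, value in row.items()}


for operation, row in TIMINGS.items():
    print(operation)
    for name, factor in slowdowns(row).items():
        print(f"  {name}: {factor}x")
```

With these numbers, extraction with protection on takes 1718 seconds against 15 seconds on WSL2, roughly 115 times longer, and even with protection disabled it is about 6 times slower.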

The commands I used were:

  • Mac & Linux: time tar -xf firefox-40.0.source.tar and time rm -Rf mozilla-release
  • Windows (PowerShell): Measure-Command { tar xf .\firefox-40.0.source.tar } and Measure-Command { rm -r -fo .\mozilla-release\ }
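If you don't want to download the Firefox sources, a rough way to reproduce the same kind of workload on any OS is a small script that creates and then deletes many small files. This is only a sketch and not what I used for the numbers above: it does not go through tar at all, and the function names, file size (1 KiB), and directory fan-out (100 files per directory) are arbitrary assumptions.

```python
import shutil
import tempfile
import time
from pathlib import Path
from typing import Optional, Tuple


def create_small_files(root: Path, count: int, size: int = 1024, per_dir: int = 100) -> None:
    """Write `count` files of `size` bytes under `root`, `per_dir` files per subdirectory."""
    payload = b"x" * size
    for index in range(count):
        subdir = root / f"dir{index // per_dir:05d}"
        if index % per_dir == 0:
            # First file of each group: create its subdirectory.
            subdir.mkdir(parents=True)
        (subdir / f"file{index:07d}.bin").write_bytes(payload)


def benchmark(count: int = 10_000, base_dir: Optional[str] = None) -> Tuple[float, float]:
    """Return (create_seconds, delete_seconds) for `count` small files.

    `base_dir` selects where the temporary tree is created, e.g. a folder
    that is excluded from real-time protection; None uses the system default.
    """
    root = Path(tempfile.mkdtemp(prefix="small-files-", dir=base_dir))
    try:
        start = time.perf_counter()
        create_small_files(root, count)
        create_seconds = time.perf_counter() - start

        start = time.perf_counter()
        shutil.rmtree(root)
        delete_seconds = time.perf_counter() - start
    finally:
        # Clean up if something failed before the timed delete finished.
        shutil.rmtree(root, ignore_errors=True)
    return create_seconds, delete_seconds
```

For example, `create_s, delete_s = benchmark(120_000)` roughly matches the number of files in the Firefox archive, although the file sizes and directory layout are different, so the absolute timings are not directly comparable with the table.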

I am sure that improving these file operations would benefit many different Windows use cases:

  • installing Windows apps
  • installing dev packages, like npm, ruby gems,...
  • git operations on large repositories
  • building apps/websites (webpack, C/C++ compilers,...)
  • apps like VSCode that need to deal with lots of files in the background
  • installing Windows updates
  • even browsers like Chrome/Edge have caches with lots of files in them
  • ...

Windows & Linux (WSL2) tests were run on a Surface Book 1 (Core i5, 8 GB RAM, 256 GB SSD).
Mac tests were run on a 2017 MacBook 12" (Core m3, 8 GB RAM, 256 GB SSD) running Catalina 10.15.3.

Applications used:

| App | Windows 10 | Ubuntu (WSL2) | macOS |
| --- | --- | --- | --- |
| tar | bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.5.F-ipp (tar.exe that's included in Windows 10) | tar (GNU tar) 1.30 | bsdtar 3.3.2 - libarchive 3.3.2 zlib/1.2.11 liblzma/5.0.5 bz2lib/1.0.6 |
| delete tool | rm -r -fo (PowerShell alias) | rm (GNU coreutils) 8.30 | rm from Catalina 10.15.3 |
