How to use SharpZipLib to work with GZip and Tar files

GZip and Tar files are commonly encountered together. These samples cover handling them both individually and combined.

Table of Contents on this page

Extract the file within a GZip
Simple full extract from a Tar archive
Simple full extract from a TGZ (.tar.gz)
Extract from a Tar with full control
Create a TGZ (.tar.gz)
Create a TAR or TGZ with control over filenames and data source
Updating files within a .tgz

Extract the file within a GZip

Create a new instance of GZipInputStream, passing in a stream (of any kind) that contains the compressed data, then read from it until the end of the stream. This example extracts the contents of a gzip file and writes them to a file in the nominated directory.

using System;
using System.IO;
using ICSharpCode.SharpZipLib.Core;
using ICSharpCode.SharpZipLib.GZip;

// Extracts the file contained within a GZip to the target dir.
// A GZip can contain only one file, which by default is named the same as the GZip except
// without the extension.
//
public void ExtractGZipSample(string gzipFileName, string targetDir) {

    // Use a 4K buffer. Any larger is a waste.    
    byte[ ] dataBuffer = new byte[4096];

    using (System.IO.Stream fs = new FileStream(gzipFileName, FileMode.Open, FileAccess.Read)) {
        using (GZipInputStream gzipStream = new GZipInputStream(fs)) {

            // Change this to your needs
            string fnOut = Path.Combine(targetDir, Path.GetFileNameWithoutExtension(gzipFileName));

            using (FileStream fsOut = File.Create(fnOut)) {
                StreamUtils.Copy(gzipStream, fsOut, dataBuffer);
            }
        }
    }
}

VB

Imports System
Imports System.IO
Imports ICSharpCode.SharpZipLib.Core
Imports ICSharpCode.SharpZipLib.GZip

' Extracts the file contained within a GZip to the target dir.
' A GZip can contain only one file, which by default is named the same as the GZip except
' without the extension.
'
Public Sub ExtractGZipSample(gzipFileName As String, targetDir As String)

    ' Use a 4K buffer. Any larger is a waste.  
    Dim dataBuffer As Byte() = New Byte(4095) {}

    Using fs As System.IO.Stream = New FileStream(gzipFileName, FileMode.Open, FileAccess.Read)
        Using gzipStream As New GZipInputStream(fs)

            ' Change this to your needs
            Dim fnOut As String = Path.Combine(targetDir, Path.GetFileNameWithoutExtension(gzipFileName))

            Using fsOut As FileStream = File.Create(fnOut)
                StreamUtils.Copy(gzipStream, fsOut, dataBuffer)
            End Using
        End Using
    End Using
End Sub
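
Because GZipInputStream accepts a stream of any kind, the same read-until-end-of-stream loop works without touching the disk. The following is a minimal sketch rather than part of the original samples: the class name GZipMemory and the method name Decompress are hypothetical, and it assumes the compressed data is small enough to hold in memory.

```csharp
using System.IO;
using ICSharpCode.SharpZipLib.GZip;

public static class GZipMemory
{
    // Decompresses gzip data held in a byte array and returns the raw bytes.
    // (Hypothetical helper for illustration only.)
    public static byte[] Decompress(byte[] gzipData)
    {
        using (MemoryStream source = new MemoryStream(gzipData))
        using (GZipInputStream gzipStream = new GZipInputStream(source))
        using (MemoryStream result = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int numRead;

            // Read returns 0 once the end of the compressed stream is reached.
            while ((numRead = gzipStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                result.Write(buffer, 0, numRead);
            }
            return result.ToArray();
        }
    }
}
```

The explicit loop does the same job as StreamUtils.Copy in the sample above; it is written out here only to show the read-until-end pattern.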

Simple full extract from a Tar archive

A Tar file or archive is essentially a simple concatenation of multiple files. If you only need to extract all the contents of the tar to a folder path with no conditionals or name transformations, this easy example may be all you need.

using System;
using System.IO;
using ICSharpCode.SharpZipLib.Tar;

public void ExtractTar(String tarFileName, String destFolder) {

    // The using block closes the file even if extraction throws.
    using (Stream inStream = File.OpenRead(tarFileName)) {

        TarArchive tarArchive = TarArchive.CreateInputTarArchive(inStream);
        tarArchive.ExtractContents(destFolder);
        tarArchive.Close();
    }
}

VB

Imports System
Imports System.IO
Imports ICSharpCode.SharpZipLib.Tar

Public Sub ExtractTar(tarFileName As String, destFolder As String)

    ' The Using block closes the file even if extraction throws.
    Using inStream As Stream = File.OpenRead(tarFileName)

        Dim tarArchive As TarArchive = TarArchive.CreateInputTarArchive(inStream)
        tarArchive.ExtractContents(destFolder)
        tarArchive.Close()
    End Using
End Sub

Simple full extract from a TGZ (.tar.gz)

A Unix TGZ provides concatenation of multiple files (tar) with compression (gzip). This sample illustrates the automatic extraction capabilities of the library. The folder structure of the Tar archive is preserved, within the nominated target directory.

using System;
using System.IO;
using ICSharpCode.SharpZipLib.GZip;
using ICSharpCode.SharpZipLib.Tar;

// example: ExtractTGZ(@"c:\temp\test.tar.gz", @"C:\DestinationFolder")

public void ExtractTGZ(String gzArchiveName, String destFolder) {

    // The using blocks close both streams even if extraction throws.
    using (Stream inStream = File.OpenRead(gzArchiveName))
    using (Stream gzipStream = new GZipInputStream(inStream)) {

        TarArchive tarArchive = TarArchive.CreateInputTarArchive(gzipStream);
        tarArchive.ExtractContents(destFolder);
        tarArchive.Close();
    }
}

VB

Imports System
Imports System.IO
Imports ICSharpCode.SharpZipLib.GZip
Imports ICSharpCode.SharpZipLib.Tar

' for example:  ExtractTGZ("c:\temp\test.tar.gz", "C:\DestinationFolder")

Public Sub ExtractTGZ(ByVal gzArchiveName As String, ByVal destFolder As String)

    ' The Using blocks close both streams even if extraction throws.
    Using inStream As Stream = File.OpenRead(gzArchiveName)
        Using gzipStream As Stream = New GZipInputStream(inStream)

            Dim tarArchive As TarArchive = TarArchive.CreateInputTarArchive(gzipStream)
            tarArchive.ExtractContents(destFolder)
            tarArchive.Close()
        End Using
    End Using
End Sub

Extract from a Tar with full control

In contrast to the samples above, this sample steps through the tar one entry at a time, extracting each entry to the nominated folder, which allows individual entries to be skipped or renamed. It can also translate ASCII line endings (LF to CRLF), strips a leading root such as "\" from entry names, and sets the modification date/time of each extracted file.

using System;
using System.IO;
using ICSharpCode.SharpZipLib.Tar;

// Iterates through each file entry within the supplied tar,
// extracting them to the nominated folder.
//
public void ExtractTarByEntry(string tarFileName, string targetDir, bool asciiTranslate) {

    using (FileStream fsIn = new FileStream(tarFileName, FileMode.Open, FileAccess.Read)) {
        TarInputStream tarIn = new TarInputStream(fsIn);
        TarEntry tarEntry;
        while ((tarEntry = tarIn.GetNextEntry()) != null) {

            if (tarEntry.IsDirectory) {
                continue;
            }
            // Converts the unix forward slashes in the filenames to windows backslashes
            //
            string name = tarEntry.Name.Replace('/', Path.DirectorySeparatorChar);

            // Remove any root e.g. '\' because a PathRooted filename defeats Path.Combine
            if (Path.IsPathRooted(name)) {
                name = name.Substring(Path.GetPathRoot(name).Length);
            }

            // Apply further name transformations here as necessary
            string outName = Path.Combine(targetDir, name);

            string directoryName = Path.GetDirectoryName(outName);
            Directory.CreateDirectory(directoryName);       // Does nothing if directory exists

            FileStream outStr = new FileStream(outName, FileMode.Create);

            if (asciiTranslate) {
                CopyWithAsciiTranslate(tarIn, outStr);
            }
            else {
                tarIn.CopyEntryContents(outStr);
            }
            outStr.Close();
            // Set the modification date/time. This approach seems to solve timezone issues.
            DateTime myDt = DateTime.SpecifyKind(tarEntry.ModTime, DateTimeKind.Utc);
            File.SetLastWriteTime(outName, myDt);
        }
        tarIn.Close();
    }
}

private void CopyWithAsciiTranslate(TarInputStream tarIn, Stream outStream) {
    byte[ ] buffer = new byte[4096];
    bool isAscii = true;
    bool cr = false;

    int numRead = tarIn.Read(buffer, 0, buffer.Length);
    int maxCheck = Math.Min(200, numRead);
    for (int i = 0; i < maxCheck; i++) {
        byte b = buffer[i];
        if (b < 8 || (b > 13 && b < 32) || b == 255) {
            isAscii = false;
            break;
        }
    }
    while (numRead > 0) {
        if (isAscii) {
            // Convert LF without CR to CRLF. Handle CRLF split over buffers.
            for (int i = 0; i < numRead; i++) {
                byte b = buffer[i]; // assuming plain Ascii and not UTF-16
                if (b == 10 && !cr)     // LF without CR
                    outStream.WriteByte(13);
                cr = (b == 13);

                outStream.WriteByte(b);
            }
        }
        else {
            outStream.Write(buffer, 0, numRead);
        }
        numRead = tarIn.Read(buffer, 0, buffer.Length);
    }
}

VB

Imports System
Imports System.IO
Imports ICSharpCode.SharpZipLib.Tar

' Iterates through each file entry within the supplied tar,
' extracting them to the nominated folder.
'
Public Sub ExtractTarByEntry(tarFileName As String, targetDir As String, asciiTranslate As Boolean)

    Using fsIn As New FileStream(tarFileName, FileMode.Open, FileAccess.Read)

        ' The TarInputStream reads a UNIX tar archive as an InputStream.
        '
        Dim tarIn As New TarInputStream(fsIn)

        Dim tarEntry As TarEntry

        While (InlineAssignHelper(tarEntry, tarIn.GetNextEntry())) IsNot Nothing

            If tarEntry.IsDirectory Then
                Continue While
            End If
            ' Converts the unix forward slashes in the filenames to windows backslashes
            '
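            Dim name As String = tarEntry.Name.Replace("/"C, Path.DirectorySeparatorChar)

            ' Remove any root e.g. '\' because a PathRooted filename defeats Path.Combine
            If Path.IsPathRooted(name) Then
                name = name.Substring(Path.GetPathRoot(name).Length)
            End If
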
            ' Apply further name transformations here as necessary
            Dim outName As String = Path.Combine(targetDir, name)

            Dim directoryName As String = Path.GetDirectoryName(outName)
            Directory.CreateDirectory(directoryName)

            Dim outStr As New FileStream(outName, FileMode.Create)
            If asciiTranslate Then
                CopyWithAsciiTranslate(tarIn, outStr)
            Else
                tarIn.CopyEntryContents(outStr)
            End If
            outStr.Close()
            ' Set the modification date/time. This approach seems to solve timezone issues.
            Dim myDt As DateTime = DateTime.SpecifyKind(tarEntry.ModTime, DateTimeKind.Utc)
            File.SetLastWriteTime(outName, myDt)
        End While
        tarIn.Close()
    End Using
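End Sub

' Helper used by the While loop above: assigns the value and returns it,
' so the assignment and the Nothing test fit in one expression.
Private Function InlineAssignHelper(Of T)(ByRef target As T, value As T) As T
    target = value
    Return value
End Function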

Private Sub CopyWithAsciiTranslate(tarIn As TarInputStream, outStream As Stream)
    Dim buffer As Byte() = New Byte(4095) {}
    Dim isAscii As Boolean = True
    Dim cr As Boolean = False

    Dim numRead As Integer = tarIn.Read(buffer, 0, buffer.Length)
    Dim maxCheck As Integer = Math.Min(200, numRead)
    For i As Integer = 0 To maxCheck - 1
        Dim b As Byte = buffer(i)
        If b < 8 OrElse (b > 13 AndAlso b < 32) OrElse b = 255 Then
            isAscii = False
            Exit For
        End If
    Next
    While numRead > 0
        If isAscii Then
            ' Convert LF without CR to CRLF. Handle CRLF split over buffers.
            For i As Integer = 0 To numRead - 1
                Dim b As Byte = buffer(i)   ' assuming plain Ascii and not UTF-16
                If b = 10 AndAlso Not cr Then   ' LF without CR
                    outStream.WriteByte(13)
                End If
                cr = (b = 13)

                outStream.WriteByte(b)
            Next
        Else
            outStream.Write(buffer, 0, numRead)
        End If
        numRead = tarIn.Read(buffer, 0, buffer.Length)
    End While
End Sub
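
The line-ending rule inside CopyWithAsciiTranslate can be tried in isolation. The sketch below applies the same LF to CRLF logic to ordinary streams so it runs without the library; the class LineEndingDemo and its methods are hypothetical names introduced for illustration, and the buffer size is made a parameter only so that a CR and its LF can be forced into separate reads.

```csharp
using System;
using System.IO;
using System.Text;

public static class LineEndingDemo
{
    // Copies input to output, inserting a CR before every LF that is not
    // already preceded by one. The cr flag survives across buffer refills,
    // so a CRLF pair split over two reads is not doubled.
    public static void CopyLfToCrLf(Stream input, Stream output, int bufferSize)
    {
        byte[] buffer = new byte[bufferSize];
        bool cr = false;    // true when the previous byte was CR
        int numRead;

        while ((numRead = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < numRead; i++)
            {
                byte b = buffer[i];
                if (b == 10 && !cr)     // LF without CR
                {
                    output.WriteByte(13);
                }
                cr = (b == 13);
                output.WriteByte(b);
            }
        }
    }

    // Convenience wrapper for plain ASCII text.
    public static string Translate(string text, int bufferSize)
    {
        using (MemoryStream input = new MemoryStream(Encoding.ASCII.GetBytes(text)))
        using (MemoryStream output = new MemoryStream())
        {
            CopyLfToCrLf(input, output, bufferSize);
            return Encoding.ASCII.GetString(output.ToArray());
        }
    }
}
```

For example, LineEndingDemo.Translate("a\nb\r\nc", 4) returns "a\r\nb\r\nc": the lone LF gains a CR, while the existing CRLF, which is split across the two 4-byte reads, is left unchanged.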

Create a TGZ (.tar.gz)

This shows how to create a tar archive and gzip it at the same time. The example recurses down a directory structure, adding all the files.

For more advanced options giving control over filenames and data source, see the next example.

using System;
using System.IO;
using ICSharpCode.SharpZipLib.GZip;
using ICSharpCode.SharpZipLib.Tar;

// Calling example
    CreateTarGZ(@"c:\temp\gzip-test.tar.gz", @"c:\data");


private void CreateTarGZ(string tgzFilename, string sourceDirectory) {

    Stream outStream = File.Create(tgzFilename);
    Stream gzoStream = new GZipOutputStream(outStream);
    TarArchive tarArchive = TarArchive.CreateOutputTarArchive(gzoStream);

    // Note that the RootPath is currently case sensitive and must be forward slashes e.g. "c:/temp"
    // and must not end with a slash, otherwise cuts off first char of filename
    // This is scheduled for fix in next release
    tarArchive.RootPath = sourceDirectory.Replace('\\', '/');
    if (tarArchive.RootPath.EndsWith("/"))
        tarArchive.RootPath = tarArchive.RootPath.Remove(tarArchive.RootPath.Length - 1);

    AddDirectoryFilesToTar(tarArchive, sourceDirectory, true);

    tarArchive.Close();
}
private void AddDirectoryFilesToTar(TarArchive tarArchive, string sourceDirectory, bool recurse) {

    // Optionally, write an entry for the directory itself.
    // Specify false for recursion here if we will add the directory's files individually.
    //
    TarEntry tarEntry = TarEntry.CreateEntryFromFile(sourceDirectory);
    tarArchive.WriteEntry(tarEntry, false);

    // Write each file to the tar.
    //
    string[] filenames = Directory.GetFiles(sourceDirectory);
    foreach (string filename in filenames) {
        tarEntry = TarEntry.CreateEntryFromFile(filename);
        tarArchive.WriteEntry(tarEntry, true);
    }

    if (recurse) {
        string[] directories = Directory.GetDirectories(sourceDirectory);
        foreach (string directory in directories)
            AddDirectoryFilesToTar(tarArchive, directory, recurse);
    }
}

VB

Imports System
Imports System.IO
Imports ICSharpCode.SharpZipLib.GZip
Imports ICSharpCode.SharpZipLib.Tar

' Calling example
    CreateTarGZ("c:\temp\gzip-test.tar.gz", "c:\data")


Private Sub CreateTarGZ(tgzFilename As String, sourceDirectory As String)
    Dim outStream As Stream = File.Create(tgzFilename)
    Dim gzoStream As Stream = New GZipOutputStream(outStream)
    Dim tarArchive__1 As TarArchive = TarArchive.CreateOutputTarArchive(gzoStream)

    ' Note that the RootPath is currently case sensitive and must be forward slashes e.g. "c:/temp"
    ' and must not end with a slash, otherwise cuts off first char of filename
    ' This is scheduled for fix in next release
    tarArchive__1.RootPath = sourceDirectory.Replace("\"C, "/"C)
    If tarArchive__1.RootPath.EndsWith("/") Then
        tarArchive__1.RootPath = tarArchive__1.RootPath.Remove(tarArchive__1.RootPath.Length - 1)
    End If

    AddDirectoryFilesToTar(tarArchive__1, sourceDirectory, True)

    tarArchive__1.Close()
End Sub
Private Sub AddDirectoryFilesToTar(tarArchive As TarArchive, sourceDirectory As String, recurse As Boolean)

    ' Optionally, write an entry for the directory itself.
    ' Specify false for recursion here if we will add the directory's files individually.
    '
    Dim tarEntry__1 As TarEntry = TarEntry.CreateEntryFromFile(sourceDirectory)
    tarArchive.WriteEntry(tarEntry__1, False)

    ' Write each file to the tar.
    '
    Dim filenames As String() = Directory.GetFiles(sourceDirectory)
    For Each filename As String In filenames
        tarEntry__1 = TarEntry.CreateEntryFromFile(filename)
        tarArchive.WriteEntry(tarEntry__1, True)
    Next

    If recurse Then
        Dim directories As String() = Directory.GetDirectories(sourceDirectory)
        For Each directory__2 As String In directories
            AddDirectoryFilesToTar(tarArchive, directory__2, recurse)
        Next
    End If
End Sub
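
The RootPath requirements noted in the comments above (forward slashes, no trailing slash) can be isolated into a small helper. This is only a sketch: the class TarRootPath and the method Normalize are hypothetical names, and unlike the sample it removes every trailing slash rather than just one.

```csharp
using System;

public static class TarRootPath
{
    // Converts a Windows-style directory path into the form described above
    // for TarArchive.RootPath: forward slashes and no trailing slash.
    public static string Normalize(string sourceDirectory)
    {
        string rootPath = sourceDirectory.Replace('\\', '/');
        return rootPath.TrimEnd('/');
    }
}
```

With this helper, the assignment in CreateTarGZ could be written as tarArchive.RootPath = TarRootPath.Normalize(sourceDirectory). The case sensitivity mentioned in the comments is not addressed here.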

Create a TAR or TGZ with control over filenames and data source

This shows how to create a TAR or TAR.GZ archive, using manual creation of entries and copying data to output. This sample shows the processing of files in a directory, and recursing down the directory structure.

To illustrate how to create TAR entries from any stream data, in this example we use the following construct: (Note that the type is the abstract Stream class.)

Stream inputStream = File.OpenRead(filename);

You can replace this with a Stream sourced in any other way - for example a MemoryStream (it does not have to be a File stream).

using System;
using System.IO;
using ICSharpCode.SharpZipLib.Tar;

public void TarCreateFromStream() {

    // Create an output stream. Does not have to be disk, could be MemoryStream etc.
    string tarOutFn = @"c:\temp\test.tar";
    Stream outStream = File.Create(tarOutFn);

    // If you wish to create a .tar.gz (.tgz):
    // - set the filename above to a ".tar.gz",
    // - add "using ICSharpCode.SharpZipLib.GZip;" and create a GZipOutputStream here
    //   (declared as GZipOutputStream, not Stream, so that SetLevel is available)
    // - change "new TarOutputStream(outStream)" to "new TarOutputStream(gzoStream)"
    // GZipOutputStream gzoStream = new GZipOutputStream(outStream);
    // gzoStream.SetLevel(3); // 1 - 9, 1 is best speed, 9 is best compression

    TarOutputStream tarOutputStream = new TarOutputStream(outStream);

    CreateTarManually(tarOutputStream, @"c:\temp\debug");

    // Closing the archive also closes the underlying stream.
    // If you don't want this (e.g. writing to memorystream), set tarOutputStream.IsStreamOwner = false
    tarOutputStream.Close();
}

private void CreateTarManually(TarOutputStream tarOutputStream, string sourceDirectory) {

    // Optionally, write an entry for the directory itself.
    //
    TarEntry tarEntry = TarEntry.CreateEntryFromFile(sourceDirectory);
    tarOutputStream.PutNextEntry(tarEntry);

    // Write each file to the tar.
    //
    string[] filenames = Directory.GetFiles(sourceDirectory);

    foreach (string filename in filenames) {

        // You might replace these 3 lines with your own stream code

        using (Stream inputStream = File.OpenRead(filename)) {

            string tarName = filename.Substring(3); // strip off "C:\"

            long fileSize = inputStream.Length;

            // Create a tar entry named as appropriate. You can set the name to anything,
            // but avoid names starting with drive or UNC.

            TarEntry entry = TarEntry.CreateTarEntry(tarName);

            // The size must be set before writing: TarOutputStream fails if more data
            // is written than the size declared for the entry.
            entry.Size = fileSize;

            // Add the entry to the tar stream, before writing the data.
            tarOutputStream.PutNextEntry(entry);

            // this is copied from TarArchive.WriteEntryCore
            byte[] localBuffer = new byte[32 * 1024];
            while (true) {
                int numRead = inputStream.Read(localBuffer, 0, localBuffer.Length);
                if (numRead <= 0) {
                    break;
                }
                tarOutputStream.Write(localBuffer, 0, numRead);
            }
        }
        tarOutputStream.CloseEntry();
    }


    // Recurse. Delete this if unwanted.

    string[] directories = Directory.GetDirectories(sourceDirectory);
    foreach (string directory in directories) {
        CreateTarManually(tarOutputStream, directory);
    }
}
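
As noted above, neither the data source nor the output has to be a file. The following sketch builds a tar containing a single text entry entirely in memory. The class TarInMemory and the method CreateTarWithTextEntry are hypothetical names, and the code has not been checked against a particular SharpZipLib release; IsStreamOwner is the property mentioned in the comments of the sample above.

```csharp
using System.IO;
using System.Text;
using ICSharpCode.SharpZipLib.Tar;

public static class TarInMemory
{
    // Builds a tar archive with one text entry and returns it as a byte array.
    // (Hypothetical helper for illustration only.)
    public static byte[] CreateTarWithTextEntry(string entryName, string text)
    {
        byte[] data = Encoding.UTF8.GetBytes(text);

        using (MemoryStream archiveStream = new MemoryStream())
        {
            TarOutputStream tarOut = new TarOutputStream(archiveStream);
            tarOut.IsStreamOwner = false;   // leave archiveStream open when tarOut closes

            TarEntry entry = TarEntry.CreateTarEntry(entryName);
            entry.Size = data.Length;       // must match the number of bytes written
            tarOut.PutNextEntry(entry);

            // Any readable Stream can be the data source; here it is a MemoryStream.
            using (Stream inputStream = new MemoryStream(data))
            {
                byte[] buffer = new byte[4096];
                int numRead;
                while ((numRead = inputStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    tarOut.Write(buffer, 0, numRead);
                }
            }
            tarOut.CloseEntry();
            tarOut.Close();                 // finishes the archive

            return archiveStream.ToArray();
        }
    }
}
```

The returned bytes can be written to disk, sent over a network, or wrapped in a GZipOutputStream to produce a .tar.gz.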

Updating files within a .tgz (.tar.gz)

The Unix .tgz or .tar.gz format is almost the equivalent of a Zip archive in Windows, but this combination does not allow directly adding or replacing files within the archive. This is because all the files are concatenated into a single file (tar) which is then compressed as a unit.

Updating items within a .tgz therefore requires decompressing it back into the original tar, creating a new tar from the old one plus the changes, and recompressing the entire thing.
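
One way to sketch that rebuild is to stream the old archive into a new one, skipping the entry to be replaced and appending the new version at the end. The class TgzUpdater and the method ReplaceEntry below are hypothetical names, not part of SharpZipLib, and the code has not been verified against a particular release. It assumes the new data fits in memory, that entry names match exactly as stored in the tar, and that the result is written to a separate file rather than over the original.

```csharp
using System.IO;
using ICSharpCode.SharpZipLib.GZip;
using ICSharpCode.SharpZipLib.Tar;

public static class TgzUpdater
{
    // Copies oldTgz to newTgz, replacing (or adding) the entry named entryName.
    // (Hypothetical helper for illustration only.)
    public static void ReplaceEntry(string oldTgz, string newTgz, string entryName, byte[] newData)
    {
        using (Stream inFile = File.OpenRead(oldTgz))
        using (Stream gzIn = new GZipInputStream(inFile))
        using (TarInputStream tarIn = new TarInputStream(gzIn))
        using (Stream outFile = File.Create(newTgz))
        using (Stream gzOut = new GZipOutputStream(outFile))
        using (TarOutputStream tarOut = new TarOutputStream(gzOut))
        {
            TarEntry entry;
            while ((entry = tarIn.GetNextEntry()) != null)
            {
                if (entry.Name == entryName)
                {
                    continue;   // drop the old copy; its data is skipped on the next GetNextEntry
                }

                // Copy the header, then the data, of every other entry unchanged.
                tarOut.PutNextEntry(entry);
                if (!entry.IsDirectory)
                {
                    tarIn.CopyEntryContents(tarOut);
                }
                tarOut.CloseEntry();
            }

            // Append the replacement entry.
            TarEntry replacement = TarEntry.CreateTarEntry(entryName);
            replacement.Size = newData.Length;
            tarOut.PutNextEntry(replacement);
            tarOut.Write(newData, 0, newData.Length);
            tarOut.CloseEntry();
        }
    }
}
```

Once the new archive has been written successfully, it can be moved over the original. The whole archive is decompressed and recompressed in the process, which is the cost described above.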
