This repository has been archived by the owner on Jun 6, 2024. It is now read-only.
Ralf Kilian edited this page Jun 6, 2024 · 6 revisions

Dive



Components

Dive Content File Builder

The content file builder stores the content information of a directory or medium in a content file. Entries can be left out of the file by specifying an exclude pattern, which supports wildcards as well as regular expression syntax.
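The idea of a builder with an exclude pattern can be illustrated with a minimal Python 3 sketch. This is not Dive's actual implementation: the function names (`is_excluded`, `build_content_file`), the parameters and the one-path-per-line file layout are assumptions made for illustration only.

```python
import fnmatch
import os
import re


def is_excluded(path, pattern, use_regex=False):
    """Return True if the path matches the exclude pattern (hypothetical helper)."""
    if use_regex:
        return re.search(pattern, path) is not None
    # Note: with fnmatch, "*" also matches across "/" separators.
    return fnmatch.fnmatch(path, pattern)


def build_content_file(root, content_file, exclude=None, use_regex=False):
    """Write one relative path per line for everything below root.

    The one-path-per-line layout is an assumption, not Dive's real format.
    """
    with open(content_file, "w", encoding="utf-8") as out:
        for dirpath, dirnames, filenames in os.walk(root):
            for name in sorted(dirnames + filenames):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                rel = rel.replace(os.sep, "/")
                if exclude and is_excluded(rel, exclude, use_regex):
                    continue
                out.write(rel + "\n")
```

In this sketch the content file should be written outside the scanned directory, otherwise it would end up listed in itself. An excluded directory name is dropped from the listing, but its contents are still visited and filtered individually.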

Dive Content Finder

The content finder searches the content files for a given search term. Instead of a plain term, it also accepts a search pattern, which supports wildcards as well as regular expression syntax.
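The three kinds of search (plain term, wildcard pattern, regular expression) could look roughly like the following Python 3 sketch. The function names, the `mode` parameter and the case-insensitive matching are assumptions for illustration; they do not describe Dive's real interface.

```python
import fnmatch
import re


def find_in_content(lines, term, mode="plain"):
    """Return the entries that match term.

    mode is "plain" (case-insensitive substring), "wildcard" or "regex".
    All three modes ignore case in this sketch.
    """
    if mode == "regex":
        rx = re.compile(term, re.IGNORECASE)
        return [ln for ln in lines if rx.search(ln)]
    if mode == "wildcard":
        return [ln for ln in lines
                if fnmatch.fnmatchcase(ln.lower(), term.lower())]
    return [ln for ln in lines if term.lower() in ln.lower()]


def find_in_files(content_files, term, mode="plain"):
    """Search several content files; return {file name: matching entries}."""
    results = {}
    for name in content_files:
        with open(name, encoding="utf-8") as handle:
            entries = [line.rstrip("\n") for line in handle]
        hits = find_in_content(entries, term, mode)
        if hits:
            results[name] = hits
    return results
```

Keeping the matching logic in `find_in_content` separate from file handling makes it easy to try the three modes on a plain list of entries.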


Content files

Before Dive can be used to find any data, it needs to read the content information from the media whose contents should be searchable and write that information into content files, which (for now) are simply unencrypted plain text files.

These content files can be loosely compared to image files, with the difference that image files contain the actual data rather than just content information such as file and directory names.

They can be created from all kinds of storage media (such as hard disks, flash drives, memory cards, network drives, floppy disks, data CDs, data DVDs, etc.) as long as the medium that contains the data is supported and readable by the system.

As soon as the content file has been created, the medium is no longer required to locate data on it, although it is of course still required to access the data. However, if you later change any data on that medium (e.g. by adding or deleting files), you have to create a new content file, because the existing one will not be updated automatically.


Deep Dive feature

This feature additionally stores the content information of certain archive file types, provided the archives are not password protected.

The following archive formats can be processed:

  • ACE (requires the UnACE tool, see requirements below for further information)
  • RAR (requires the UnRAR tool, see requirements below for further information)
  • TAR (supported natively, including bzip2 or gzip compressed archives)
  • ZIP (supported natively)
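For the two natively supported formats, reading the member names of an archive can be sketched with Python's standard `zipfile` and `tarfile` modules. The helper name `list_archive` is hypothetical, and the sketch does not claim to show how Dive itself processes archives.

```python
import tarfile
import zipfile


def list_archive(path):
    """Return the member names of a ZIP or TAR archive, or None if unsupported."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return archive.namelist()
    if tarfile.is_tarfile(path):
        # Mode "r:*" lets tarfile detect gzip or bzip2 compression itself.
        with tarfile.open(path, "r:*") as archive:
            return archive.getnames()
    return None
```

Only the archive's listing is read here; no member is extracted to disk.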

Requirements

Python framework

The Deep Dive feature requires Python 2.7 or higher; it will not work with earlier versions.

Third-party tools

Depending on the archive file types whose content information you want to read, additional file archivers or extraction tools are required. These are available for various platforms and can be downloaded for free.

The following tools are supported:

  • UnACE (version 2.50 or higher is recommended, may also work with earlier versions)
  • UnRAR (version 4.2.3 or higher is recommended, may also work with earlier versions)
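Whether the external tools are available can be checked before processing ACE or RAR archives. The following Python 3 sketch looks the tools up on the search path; the executable names `unace` and `unrar` are assumptions and may differ depending on platform and installation, and `find_tools` is a hypothetical helper rather than part of Dive.

```python
import shutil

# Assumed executable names; adjust them to match the local installation.
TOOLS = {"ACE": "unace", "RAR": "unrar"}


def find_tools(tools=TOOLS):
    """Map each archive format to the tool's full path, or None if not found."""
    return {fmt: shutil.which(exe) for fmt, exe in tools.items()}
```

A format whose value is `None` would have to be skipped, because the required tool was not found on the search path.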


End-of-life

In April 2018 the project was officially discontinued.

The main reason for this decision was that compact discs have by now been largely replaced by flash media and external hard disk drives with plenty of space, so Dive has lost most of its importance.

Furthermore, the project is far too immature by today's standards. For example, as in all its predecessors, the information is stored in plain, unencrypted text files instead of a database.

However, it may still be useful for people with modest requirements who use, for example, multiple flash media or hard disk drives.

Feel free to fork!


Useless facts

Miscellaneous

  • The first version uploaded to this GitHub repository was Dive 2.1.3, built on February 19th, 2018.
  • Before the upload, the project had been neither changed nor even touched for almost three years.

History

In the late 1990s I had many compact discs containing tools, projects and backup data. Finding data often took some time, because I had to insert the disc that I thought contained the files I needed. Sometimes I inserted the wrong disc, so I had to remove it from the drive again and look for the data on another one.

Because of this, I wrote CDCF (Compact Disc Content Finder), a very basic but quite fast interactive command line tool for MS-DOS. The tool consisted of two simple components: a content file builder, which stored the content information of a disc in a file, and a content finder, which allowed searching the created content files for a given search term.

A few days later, a buddy saw me using this tool and was very interested in it, because he also had quite a few data discs. At that time CDCF was still very much an alpha version, so I had to revise it a little to make it more user-friendly. I gave him the revised copy of the tool, and after some weeks quite a few of my buddies were using it.

In 2004 some of them asked me if I could rewrite CDCF as a Windows application with a graphical user interface. I did not have much spare time back then, so I created CDCFWin, a temporary solution that simply had the requested interface but no new features at all.

Three years later CDCFWin was replaced by DataInventory, a completely new project that did not contain any source code from its predecessors. It also had a new easy-to-use interface as well as more features like excluding data when creating content files or using wildcards when searching for some content.

The DataInventory project worked fine, but I did not like that it was slower than both of its predecessors and that it ran only on Windows operating systems. So I decided to completely redevelop the tool once again (also without any code from its predecessors), with the primary goal of making it faster as well as platform independent.

The new project also needed a name. A buddy suggested calling it MILF (Media I'd Like to Find) just for fun, but due to the ambiguity of that abbreviation, I did not want to use that name. Then I decided to call it Dive (Data Inventory with Various Enhancements), because it provides the features of its predecessor as well as some new ones.

Planned features

The following project features were planned, but never implemented.

  • Additional database support to write the content information into a database instead of files.
  • Additional content information in general (e.g. file size as well as creation, access and modification dates).
  • Additional content information depending on the file type (e.g. ID3 tags from MP3 files).
  • Feature to automatically update existing content files if the content on a medium has changed.
  • Optional encryption of content files.
  • Platform-independent graphical user interface (based on Qt, using PyQt or PySide).
