Pictures Database Manager

Description

This project is a command-line utility for managing images stored in a custom database format. It is a simplified version inspired by the "Haystack" system used by Facebook.

Social networks have to manage hundreds of millions of images. Conventional file systems (such as the one used on your hard disk) become inefficient with that many files. Moreover, they are not designed for keeping each image in several resolutions, for example very small (icon), medium (for a quick "preview") and normal size (original resolution).

In the “Haystack” approach, several images are stored in a single file, and the different resolutions of each image are stored automatically. This single file contains both data (the images) and metadata (information about each image). The key idea is that the image server keeps a copy of this metadata in memory, which allows very fast access to a specific photo in the correct resolution.
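
As a rough illustration of this idea (not the project's actual data structures), the in-memory metadata can be pictured as an array of records, each mapping an image identifier and a resolution to a byte range inside the single database file. All names below (`struct pict_metadata`, `locate_image`, `MAX_PIC_ID`, the field layout) are hypothetical and chosen only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PIC_ID 127 /* assumed maximum identifier length */
#define NB_RES 3       /* thumbnail, small, original */

enum resolution { RES_THUMB = 0, RES_SMALL = 1, RES_ORIG = 2 };

/* One metadata record per image (hypothetical layout). */
struct pict_metadata {
    char     pict_id[MAX_PIC_ID + 1]; /* user-visible identifier */
    uint64_t offset[NB_RES];          /* byte offset of each resolution in the file */
    uint32_t size[NB_RES];            /* byte size of each resolution, 0 if not stored */
    uint8_t  is_valid;                /* 1 if this slot is in use */
};

/* Look up where one resolution of an image is stored, using only the
 * in-memory copy of the metadata (no disk access).
 * Returns 0 and fills *offset and *size on success, -1 otherwise. */
int locate_image(const struct pict_metadata *table, size_t count,
                 const char *pict_id, enum resolution res,
                 uint64_t *offset, uint32_t *size)
{
    assert((int)res >= 0 && (int)res < NB_RES);
    for (size_t i = 0; i < count; ++i) {
        if (table[i].is_valid && strcmp(table[i].pict_id, pict_id) == 0) {
            if (table[i].size[res] == 0) {
                return -1; /* image known, but this resolution is not stored */
            }
            *offset = table[i].offset[res];
            *size = table[i].size[res];
            return 0;
        }
    }
    return -1; /* no valid image with this identifier */
}
```

With the offset and size in hand, a server would need only one seek and one read on the database file to serve the image. A linear scan is used here for brevity.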

This approach has a number of advantages. First, it reduces the number of files managed by the operating system. Second, it makes it possible to elegantly implement two important aspects of image database management:

  1. automatic management of different image resolutions, in our case the three supported resolutions;
  2. the possibility of not duplicating identical images submitted under different names (e.g. by different users at Facebook), which is an extremely useful optimization in any social network.

This “deduplication” relies on a “hash function”, which summarizes binary content (in our case an image) into a much shorter signature. We use the “SHA-256” function, which summarizes any binary content into 256 bits and has the useful cryptographic property of collision resistance: it is practically impossible to find two different images that have the same signature.
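
The following is a minimal sketch of how such a deduplication check might look once the digest of a new image is known. It assumes the 32-byte SHA-256 digest has already been computed elsewhere by a SHA-256 implementation. The names `struct pict_entry`, `find_duplicate` and `register_image` are hypothetical and do not come from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHA256_LEN 32 /* a SHA-256 digest is 256 bits = 32 bytes */

/* Hypothetical per-image record, reduced to the fields needed here. */
struct pict_entry {
    uint8_t  sha[SHA256_LEN]; /* digest of the image content */
    uint64_t offset;          /* where the content lives in the database file */
    uint32_t size;            /* content size in bytes */
    uint8_t  is_valid;        /* 1 if this slot is in use */
};

/* Return the index of a valid entry with the same digest, or -1 if none. */
long find_duplicate(const struct pict_entry *table, size_t count,
                    const uint8_t digest[SHA256_LEN])
{
    assert(table != NULL && digest != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (table[i].is_valid &&
            memcmp(table[i].sha, digest, SHA256_LEN) == 0) {
            return (long)i;
        }
    }
    return -1;
}

/* Fill entry `slot` for a new image. If the same content is already stored,
 * point at the existing bytes instead of the new location.
 * Returns 1 if the content was shared (nothing new to write), 0 otherwise. */
int register_image(struct pict_entry *table, size_t count, size_t slot,
                   const uint8_t digest[SHA256_LEN],
                   uint64_t new_offset, uint32_t new_size)
{
    assert(slot < count);
    long dup = find_duplicate(table, count, digest);

    memcpy(table[slot].sha, digest, SHA256_LEN);
    table[slot].is_valid = 1;
    if (dup >= 0) {
        table[slot].offset = table[dup].offset; /* share the stored content */
        table[slot].size = table[dup].size;
        return 1;
    }
    table[slot].offset = new_offset;
    table[slot].size = new_size;
    return 0;
}
```

Comparing 32-byte digests instead of whole images keeps the check cheap, and collision resistance is what makes it safe to treat equal digests as equal content.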

Preview

[Screenshot: pictDBM_server]

How to install

  1. Clone this git repository locally

  2. Make sure the required packages, libvips and pkg-config, are installed.

    If not, on macOS: brew install libvips pkg-config

  3. From the root of the project, run cd libmongoose && make clean && make all.

  4. From the root of the project, run make clean-all && make all.

  5. Copy libmongoose/libmongoose.so into the root folder: cp libmongoose/libmongoose.so libmongoose.so.

  6. Run the server with make server.

  7. Open localhost:8000 in any browser.

Makefile commands

  • make clean-all Remove all object files and executables generated by a call to make
  • make server Launch the server, reachable from your web browser at localhost:8000 (the default address)
  • make style Run astyle on all of the project's .c and .h files

Commands available

./pictDBM [COMMAND] [ARGUMENTS]
  • help displays this help.

  • list <dbfilename> list pictDB content.

  • create <dbfilename> [options] create a new pictDB.

      options are: 
      	-max_files <MAX_FILES> : maximum number of files.
      	-thumb_res <X_RES> <Y_RES> : resolution for thumbnail images.
      	-small_res <X_RES> <Y_RES> : resolution for small images.
    
  • read <dbfilename> <pictID> [original|orig|thumbnail|thumb|small]
    read an image from the pictDB and save it to a file.
    default resolution is "original".

  • insert <dbfilename> <pictID> <filename>
    insert a new image in the pictDB.

  • delete <dbfilename> <pictID>
    delete picture pictID from pictDB.

  • gc <dbfilename> <tmp dbfilename>
    performs garbage collecting on pictDB. Requires a temporary filename for copying the pictDB.
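
To illustrate how the resolution argument of the read command could be interpreted, here is a small sketch that maps the accepted spellings to a resolution code and falls back to "original" when the argument is omitted. The enum and the function name `parse_resolution` are hypothetical. The project's real parser may differ.

```c
#include <stddef.h>
#include <string.h>

enum pict_res {
    PICT_RES_THUMB,
    PICT_RES_SMALL,
    PICT_RES_ORIG,
    PICT_RES_INVALID
};

/* Map the optional resolution argument of `read` to a resolution code.
 * A missing argument (NULL) selects the original resolution, matching the
 * default stated in the help text above. */
enum pict_res parse_resolution(const char *arg)
{
    if (arg == NULL) {
        return PICT_RES_ORIG;
    }
    if (strcmp(arg, "original") == 0 || strcmp(arg, "orig") == 0) {
        return PICT_RES_ORIG;
    }
    if (strcmp(arg, "thumbnail") == 0 || strcmp(arg, "thumb") == 0) {
        return PICT_RES_THUMB;
    }
    if (strcmp(arg, "small") == 0) {
        return PICT_RES_SMALL;
    }
    return PICT_RES_INVALID; /* unknown spelling: the caller should report an error */
}
```

For instance, with this sketch, both `thumb` and `thumbnail` select the thumbnail resolution, while an unrecognized word is rejected rather than silently treated as the default.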

Authors

Note

This project was completed as part of the EPFL course « System programming project ».