Documentation updates.
csparker247 committed Jun 20, 2013
1 parent 57e2377 commit 37bc31c
Showing 7 changed files with 101 additions and 27 deletions.
120 changes: 96 additions & 24 deletions MANUAL.md
@@ -57,26 +57,28 @@ organization scheme you please. The VisCenter uses `MVDaily_[YYYYMMDD]`. It real
you pick or even if you're consistent in naming. As long as you can navigate your data, what's much more
important is what goes inside of these Daily folders.

Each Daily folder should have a folder for each folio shot on that day. The names of the folio folders
Each Daily folder should have a folder for each page shot on that day. The names of the folio folders
should be the volume identifier, a dash, and the folio identifying number. _(e.g. The folio folder name
for the 53rd page of The Iliad could be "Iliad-53")_ The volume identifier is, of course, variable, but
should remain consistent for all folios from a single volume. The '-' acts as a delimiter to separate
the volume ID from the folio ID, therefore dashes **should not** be used in the volume ID. We prefer
underscores for further volume differentiation. For example, in a collection of numbered volumes 1-5,
your folio folder names could be: `NPM_1-1`, `NPM_1-2`, `NPM_2-1`, etc. Any character preceding the `-`
will be used as part of the volume ID during file reorganization. However, folio IDs should ONLY be
numbers. If you have more than 10 pages, it's a good idea to use leading zeros in your folio IDs
_(e.g. Iliad-001)_. This will make the report generated by `summarize.sh` a bit easier to read.
will be used as part of the volume ID during file reorganization. Any characters following the `-`
will be used as the folio ID. Normally these will just be numbers, but any alphanumeric character can
be used. This is useful if you prefer to use a recto-verso file organization. If you have more than 10
pages, it's a good idea to use leading zeros in your folio IDs _(e.g. Iliad-001)_. This will make the
report generated by `summarize.sh` a bit easier to read.
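
The naming rule above can be sketched as a small shell function. This is only an illustration: `split_folio_name` is a hypothetical helper, not part of the toolkit, and it assumes the name is split at the first dash, which follows from the rule that the volume ID contains no dashes.

```shell
#!/bin/sh
# Split a folio folder name into its volume ID and folio ID.
# Hypothetical helper for illustration; assumes the split happens at the first dash.
split_folio_name() {
    name=$1
    case $name in
        *-*) ;;
        *) echo "no dash delimiter in: $name" >&2; return 1 ;;
    esac
    volume=${name%%-*}   # everything before the first dash
    folio=${name#*-}     # everything after the first dash
    printf '%s %s\n' "$volume" "$folio"
}

split_folio_name "NPM_1-002"
```

A name without a dash is rejected rather than guessed at, since there is then no way to tell where the volume ID ends.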

Inside each folio folder should be two subdirectories named Mega and Processed. Mega houses a RAW DNG version
of each exposure for a particular folio. Processed contains a 16-bit TIFF version of the same.
multispectral-toolkit only ever works with the TIFFs in the Processed folder, so you can safely ignore
the DNGs in your post-processing workflow.
the Mega folder and the DNGs in your post-processing workflow.

The TIFFs inside of the Processed folder also have their own naming conventions. Using the EurekaVision
imaging system, they will often automatically be named `[Volume ID]-[Folio ID #]_###.tif`. Note
the sequential numbering for each exposure and the file extension. multispectral-toolkit expects 14
exposures in this Processed folder, each exposed under specific circumstances and embedded with certain
the sequential numbering for each exposure and the file extension. The multispectral-toolkit was built to expect
14 exposures in this Processed folder, each exposed under specific circumstances and embedded with certain
metadata. This will be discussed in more detail in the **_[Metadata](#metadata)_** section. For filename purposes, these
sequential numbers should be `001-014`. The file extension should also be the three-letter `.tif` and not the
four-letter `.tiff` variant.
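
As a rough way to check a Processed folder against this convention, the sketch below lists which of the 14 expected exposure numbers have no matching file. `missing_exposures` is a hypothetical helper, not part of the toolkit, and it assumes files are named exactly `[Volume ID]-[Folio ID #]_###.tif`.

```shell
#!/bin/sh
# Print the exposure numbers (001-014) that have no matching .tif file.
# Hypothetical helper for illustration.
#   $1 = path to a Processed folder
#   $2 = folio folder name, e.g. "Iliad-053"
missing_exposures() {
    processed_dir=$1
    prefix=$2
    n=1
    while [ "$n" -le 14 ]; do
        file=$(printf '%s/%s_%03d.tif' "$processed_dir" "$prefix" "$n")
        # Report the zero-padded number when the expected file is absent
        [ -f "$file" ] || printf '%03d\n' "$n"
        n=$((n + 1))
    done
}
```

For example, `missing_exposures Iliad-053/Processed Iliad-053` prints nothing when all 14 files are present. Files with a `.tiff` extension are reported as missing, which matches the three-letter extension rule.
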
@@ -86,7 +88,7 @@ Thus, a completely compliant file path for a TIFF of the sixth exposure in a Pro

One last note: If you find files aren't getting processed, check for special characters in the paths to
your files. Spaces, punctuation, and other special characters can cause issues. A good rule of thumb is
to only use the A\-Z characters, numbers, underscores, and dashes.
to only use the alphanumeric characters, underscores, and dashes.
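
One way to look for offending names before processing is sketched below. `find_unsafe_names` is a hypothetical helper, not part of the toolkit. It assumes a period is also acceptable, since file extensions need one, and it relies on the `-mindepth` option available in GNU and BSD `find`.

```shell
#!/bin/sh
# Print every file or folder below a directory whose name contains a character
# other than A-Z, a-z, 0-9, underscore, dash, or period.
# Hypothetical helper for illustration; the period is an added assumption.
find_unsafe_names() {
    find "$1" -mindepth 1 -name '*[!A-Za-z0-9_.-]*' -print
}
```

Running `find_unsafe_names .` from the folder that holds your Daily folders prints nothing when every name follows the rule of thumb.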


<a id="flatfields" />_**FLATFIELDS**_:
@@ -97,7 +99,7 @@ images that can correct for exposure variations (caused by lens vignetting or un
to images shot under the same exposure conditions. Applying a flatfield to itself would create an all-white
image.

The multispectral-toolkit assumes that each Daily folder will also contain a corresponding flatfields folder.
The multispectral-toolkit assumes that each Daily folder will also contain a single corresponding flatfields folder.
This flatfields folder should have the prefix `FLATS_`. Any characters following the underscore are
ignored, though it is usually most useful to include a date that matches the date of the Daily folder.
The subdirectory structure of this folder should match that of a folio folder and should include a
@@ -109,14 +111,18 @@ environment in which imaging occurs. Any variation in exposure times, position o
even zoom should come with a corresponding reacquisition of flatfield images. It is usually easiest to
treat these changes as the time to switch to a new Dailies folder, acquiring a new set of flatfields
appropriately.

In some cases, datasets might contain multiple flatfield folders. In these instances, it can often be
difficult to identify which flatfields should be used for a particular folio. Running `summarize.sh` on
such a dataset will produce a log file that will aid in identifying the correct flatfields.
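
A quick manual check for this situation is to count the `FLATS_` folders in a Daily folder. The sketch below is not how `summarize.sh` does it; `count_flats` is a hypothetical helper shown only to make the "single flatfields folder" expectation concrete.

```shell
#!/bin/sh
# Count the FLATS_* folders directly inside one Daily folder.
# Hypothetical helper for illustration.
count_flats() {
    count=0
    for dir in "$1"/FLATS_*/; do
        # When nothing matches, the pattern stays literal and fails the -d test
        if [ -d "$dir" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

A result of `1` is the expected case. A result of `0` or more than `1` suggests the Daily folder needs attention before processing.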


<a id="metadata" />_**METADATA**_:
Each image acquired using the "Standard" EurekaVision workflow is embedded with metadata in EXIF tags. For
the multispectral-toolkit, the most relevant information is the exposure information, particularly the
wavelength used to expose the image. The VisCenter workflow involves capturing a set of images
under a particular sequence of wavelengths. Though exposure times may vary across exposure environments,
the order in which particular wavelengths are acquired does not change. For example, in a good data set, the
the order in which particular wavelengths are acquired typically does not change. For example, in a good data set, the
second image will always represent an image acquired under 365nm (Ultraviolet) light. See the N-Shot list
below for more information.

@@ -141,6 +147,9 @@ table. The "Standard" N-shot table that the multispectral-toolkit ideally operat
* 013 \- 940nm IR940
* 014 \- 450nm Royal Blue

`spectralize.sh` has been updated so that it is more resilient to inconsistencies in file naming and metadata.
It should automatically arrange files for measurement based on embedded metadata and not file organization.

Running `summarize.sh` on your dataset will generate a report that summarizes your dataset's organization and
embedded multispectral information. Warnings in this report can point you to images that don't have a corresponding
flatfield or that don't match the "Standard" EurekaVision workflow.
@@ -158,9 +167,9 @@ the install instructions. From there, run the following commands to install all
> $ cd ~/source/multispectral-toolkit/flatfield
> $ make
_NOTE: If you have ffmpeg installed (or any other package that requires libav), OpenCV will link against
_NOTE: If you have ffmpeg installed (or any other package that uses or installs libav), OpenCV will link against
your specific build of libav. If libav is later updated (as it would be if you updated ffmpeg), this
will cause OpenCV to crash. Make sure that if you update your libav packages, you also reinstall OpenCV
will cause OpenCV and pngflatten to crash. Make sure that if you update your libav packages, you also reinstall OpenCV
at the same time._

## The Scripts ##
@@ -179,7 +188,7 @@ results. It is also meant to simplify post-processing such that a minimally trai
processing of data sets.

`mstk.sh` should be run from inside the folder containing all of the Daily folders to be processed. It needs no
arguments to run, though it accepts `--minimal`, `--standard`, or `--mega` as preset flags.
arguments to run, though it accepts certain preset flags.

> $ ~/source/multispectral-toolkit/mstk.sh
@@ -189,14 +198,42 @@ window. If this is not a valid location or if the directory is not writable, you
_NOTE: As of now, only the directory is checked for writable flags; its root volume is not checked. If you encounter
unwritable file errors, ensure you have full permissions to write to the directory._

`mstk.sh` will then ask for copyright information. This information will be embedded into all image files at the end of
If you did not run `mstk.sh` with a preset flag, it will then prompt for which output formats you want to create. In some cases,
`mstk.sh` will create intermediate versions of files even if you choose not to keep them. For example, all multispectral measurement
outputs require flatfielded PNGs. If PNG output is disabled, the PNGs will be deleted after multispectral processing.

`mstk.sh` will ask for copyright information. This information will be embedded into all image files at the end of
the processing procedure. `mstk.sh` will only attempt to write copyright information to files whose EXIF copyright field does
not already match the pattern created during this input phase. Make sure to double-check the spelling! Processing will begin
after copyright information has been confirmed. _NOTE: Unlike many of the other steps in the multispectral-toolkit where the
`mstk.sh` versions of scripts are specialized versions of preexisting utilities, the copyrighting functions of `mstk.sh` are
shared with `copyrighter.sh`. If you find you need to cancel the copyright procedure in the middle of post-processing, it
is usually better to run `copyrighter.sh` in your output folder than to rerun `mstk.sh`._

#### Preset Flags ####
`mstk.sh` has preset flags that provide a simple way to process datasets into preselected outputs. These flags should
be added at runtime. The script will accept only one flag at a time.

> $ ~/source/multispectral-toolkit/mstk.sh --standard
There are currently five preset flags that can be used:

* **--minimal**: Minimal output mode
Otherwise known as All JPEG Mode. This will output JPG files of the flatfielded, RGB, and multispectral measurement
images. No uncompressed or intermediate files are kept. Non-histogram equalized output of multispectral measurements is disabled.
* **--standard**: Standard output mode
This mode generates TIFs of flatfielded and RGB images, as well as PNGs of multispectral measurements. NRRD files created
for multispectral measurement are kept. Non-histogram equalized output of multispectral measurements is disabled.
* **--mega**: Mega output mode
This mode generates every type of output possible: TIF, PNG, and JPG outputs of flatfielded images; TIF and JPG outputs of
RGB images; NRRD, PNG, and JPG outputs of multispectral measurements. Non-histogram equalized output of multispectral measurements is enabled.
* **--google**: Google CI output mode
Specifically created for the VisCenter's internal workflow for ingesting assets into the Google Cultural Institute site. The same as
Minimal output mode, except that only skew, intercept, and standard deviation measurements are generated during spectralize processing.
* **--multijpg**: Multispectral JPEG output mode
Generates only JPGs of multispectral measurements. Non-histogram equalized output of multispectral measurements is disabled.
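
The "one flag at a time" rule could be enforced along the lines of the sketch below. This does not reproduce the actual parsing in `mstk.sh`; `parse_preset` and the preset names it prints are hypothetical and exist only to illustrate the behaviour described above.

```shell
#!/bin/sh
# Map a single preset flag to a preset name.
# Hypothetical sketch; not the parsing code used by mstk.sh.
parse_preset() {
    if [ "$#" -gt 1 ]; then
        echo "only one preset flag may be given" >&2
        return 1
    fi
    case ${1:-} in
        "")         echo "interactive" ;;  # no flag: prompt for output formats
        --minimal)  echo "minimal" ;;
        --standard) echo "standard" ;;
        --mega)     echo "mega" ;;
        --google)   echo "google" ;;
        --multijpg) echo "multijpg" ;;
        *)          echo "unknown flag: $1" >&2; return 1 ;;
    esac
}

parse_preset --standard
```

Treating the no-flag case as its own mode mirrors the manual's description of `mstk.sh` prompting for output formats when no preset is given.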


### summarize.sh ###

`summarize.sh` is a script that generates a pre-processing report on a folder containing a shoot's Daily folders. This report
@@ -207,11 +244,18 @@ The script should be run from the flatfields' Processed folder. It takes no argu

> $ ~/source/multispectral-toolkit/mstk.sh
A full processing log will be generated in the working directory. Any warnings generated will be outputted on the console.
A full processing log will be generated in the working directory. Any warnings generated will output to the console.

If multiple flatfield folders (any folders named _FLATS\__) are detected in a daily folder, `summarize.sh` will compare
flat and folio exposure information and notify the user of which flatfield matches a particular folio. Users should use
this information to help identify the correct flatfield for a particular daily folder. In cases where multiple flatfields
still match folios based on exposure information, visual identification of the flatfield may be necessary. Additional checks
for missing folio and flatfield exposures will not be run until all multiple flatfield issues are resolved.

_NOTE: Running `summarize.sh` requires that you have installed evix2. See **[Prerequisites](#prerequisites)**
_NOTE: Running `summarize.sh` requires that you have installed exiv2. See **[Prerequisites](#prerequisites)**
for more information._



### applyflats.sh ###

`applyflats.sh` flat-field corrects all folios in all Daily folders using flatfields found in the corresponding Daily folder.
@@ -220,22 +264,24 @@ but takes a setup log created by `mstk.sh` as an argument.

> $ ~/source/multispectral\-toolkit/applyflats.sh [~/output\_folder/2013\-05\-09\_14/02/37\_setup.log]
Arguments are used internally by `mstk.sh` to pass output information to `applyflats.sh`. Advanced users may use
this functionality to automatically run `applyflats.sh` with preselected output types.

_NOTE: Running `applyflats.sh` requires that you have previously built the `pngflatten` application. See **[Prerequisites](#prerequisites)**
for more information._


### spectralize.sh ###

`spectralize.sh` takes sets of flatfielded folios and applies various measurements to their data. The
output is generally referred to as a "multispectral rendering". The script should be run from the output
`spectralize.sh` takes sets of flatfielded folio PNGs and applies various measurements to their data. The
output is generally referred to as a "multispectral measurement". The script should be run from the output
folder created by `applyflats.sh`. It requires no arguments.

> $ cd ~/output\_folder
> $ ~/source/multispectral\-toolkit/spectralize.sh
It's important that all files are numbered according to the "Standard" EurekaVision Workflow. Misnumbered
files will cause `spectralize.sh` to crash. See **_[Metadata](#metadata)_** for more information.
_NOTE: `spectralize.sh` requires ImageMagick, teem, and GNU parallel. See **[Prerequisites](#prerequisites)** for more information._
_NOTE: `spectralize.sh` requires ImageMagick, teem, exiv2, and GNU parallel. See **[Prerequisites](#prerequisites)** for more information._


### copyrighter.sh ###

@@ -250,6 +296,13 @@ This information will be written to the images' EXIF tags in the format specifie

_NOTE: `copyrighter.sh` requires exiv2. See **[Prerequisites](#prerequisites)** for more information._



## Utilities ##

The Utilities folder contains a number of scripts that aid in very specific processing tasks. They are mostly for internal use at
the VisCenter.

### despot.sh ###

`despot.sh` attempts to remove spots or blemishes in flatfield images that might cause irregularities when the flatfields are applied
@@ -273,4 +326,23 @@ of the image. This causes issues during in-painting and `despot` has been writte
that do not have these borders should be wary of `despot` or should modify `despot.cpp` appropriately.

_NOTE: Running `despot.sh` requires that you have previously built the `despot` application. See **[Prerequisites](#prerequisites)**
for more information._
for more information._


### dng2tif.sh ###

Used to convert images in a folio's Mega folder to TIFs in the corresponding Processed folder. Converts format and maintains appropriate
metadata for multispectral processing. Run from inside a folio's Mega folder.

_NOTE: `dng2tif.sh` requires exiv2. See **[Prerequisites](#prerequisites)** for more information._


### mstk2ci.sh ###

Easily copy and rename multispectral measurements generated by the `mstk.sh` process for upload to the Google Cultural Institute site.
Specifically copies only measurements selected for use by the VisCenter.


### renameFolders.sh ###

Used to convert folder names from format `Name###` to format `Name-###`.
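
The conversion can be illustrated on a single name as follows. This is a sketch rather than the code in `renameFolders.sh`: `dashed_name` is a hypothetical helper, and it assumes the name is a run of letters followed by a folio ID that starts with a digit.

```shell
#!/bin/sh
# Convert one folder name from Name### to Name-###.
# Hypothetical helper for illustration; names that do not fit the pattern
# (already dashed, or containing underscores) are printed unchanged.
dashed_name() {
    printf '%s\n' "$1" | sed 's/^\([A-Za-z][A-Za-z]*\)\([0-9][0-9A-Za-z]*\)$/\1-\2/'
}

dashed_name "Iliad053"
```

Because the pattern is anchored at both ends, names such as `FLATS_20130509` or an already converted `Iliad-053` pass through untouched.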
2 changes: 0 additions & 2 deletions TODO.md
@@ -24,8 +24,6 @@ multispectral-toolkit v2 to-do list

**Spectralize**

* Auto-detect captured wavelength
* output type option (PNG, TIF, JPG)


**All Scripts**
3 changes: 2 additions & 1 deletion summarize.sh
@@ -125,7 +125,8 @@ for i in */; do
# Substitution magic
FLATTEMP="\${${flatname}_$wavelength}"
FLATTEMP=`eval echo $FLATTEMP`


# Check variables to ensure accuracy
#echo "$(basename $image): $exposure"
#echo " ${flatname}_${wavelength}: $FLATTEMP"

File renamed without changes.
1 change: 1 addition & 0 deletions utilities/dng2tif.sh
@@ -1,5 +1,6 @@
#!/bin/sh
# Uses ufraw to convert DNGs to TIFs
# Run from Mega folder

#cp ~/source/multispectral-toolkit/utilities/dng2tif.ufraw $PWD/dng2tif.ufraw

1 change: 1 addition & 0 deletions utilities/mstk2ci.sh
@@ -1,5 +1,6 @@
#!/bin/sh
# From mstk to Google CI
# Copies and renames multispectral measurement JPGs specific to Google CI upload process
# Run from output folder created by mstk.sh

echo
1 change: 1 addition & 0 deletions utilities/renameFolders.sh
@@ -1,3 +1,4 @@
# Converts folder names from format Name### to format Name-###
for i in */; do
if [[ "$(basename $i)" != FLATS_* ]]; then
volume=$(basename $i | sed 's/\([A-Za-z]*\)[0-9]*[A-Za-z]*/\1/')