View the GitHub project here or download the latest release here.
This script creates an item set that is deduplicated using a custom MD5 digest, generated for each item from the concatenation of the values a MetadataProfile yields for that item.
When creating an item set through the Nuix API, you can supply a custom expression. This expression can be thought of as a function that is given an item and is expected to return a value relating to that item. The returned value is then used in place of the MD5 digest Nuix calculated for the item at the time of processing.
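The idea of deduplicating by a custom expression can be sketched outside of Nuix. The following is a minimal illustration only, not the Nuix API: `build_item_set`, the dictionary-based items, and the choice of the first item seen as the "original" are all assumptions made for this example.

```python
def build_item_set(items, expression):
    """Group items by the value expression(item) returns.

    Hypothetical stand-in for item set deduplication: items sharing
    the same expression value are treated as duplicates of each other.
    """
    groups = {}
    for item in items:
        groups.setdefault(expression(item), []).append(item)
    # Assumption: the first item seen in each group is the "original".
    originals = [members[0] for members in groups.values()]
    duplicates = [m for members in groups.values() for m in members[1:]]
    return originals, duplicates


items = [
    {"name": "a.msg", "subject": "Quarterly Report"},
    {"name": "b.msg", "subject": "quarterly report"},
    {"name": "c.msg", "subject": "Lunch"},
]

# The expression decides what "the same item" means; here, a lowercased subject.
originals, duplicates = build_item_set(items, lambda item: item["subject"].lower())
print([i["name"] for i in originals])
print([i["name"] for i in duplicates])
```

In this sketch, `a.msg` and `c.msg` end up as originals and `b.msg` as a duplicate, because the expression maps `a.msg` and `b.msg` to the same value.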
This script leverages the ProfileDigester class of SuperUtilities to generate a custom MD5 digest by using the concatenated values yielded by a provided metadata profile for each item.
Technical Note: A concatenation of values is not actually used. Instead, for a given item, each metadata profile field is evaluated against that item, and each resulting string value is converted to a byte array and successively used to update the digest. See the code for ProfileDigester.generateMd5Bytes for more detail. Effectively this should yield the same result as digesting the concatenation of the fields, but with potentially lower resource overhead.
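The equivalence described in the technical note can be checked with a short sketch. This is not the ProfileDigester code; the function names, the sample field values, and the use of UTF-8 for converting strings to bytes are assumptions made for illustration.

```python
import hashlib


def digest_fields_incrementally(values):
    """Update a single MD5 digest with each field value in turn."""
    md5 = hashlib.md5()
    for value in values:
        # Assumption: strings are converted to bytes as UTF-8.
        md5.update(value.encode("utf-8"))
    return md5.hexdigest()


def digest_concatenation(values):
    """Build one concatenated string first, then digest it in one call."""
    return hashlib.md5("".join(values).encode("utf-8")).hexdigest()


# Hypothetical values a metadata profile might yield for one item.
field_values = ["report.docx", "alice@example.com", "2021-03-04"]

assert digest_fields_incrementally(field_values) == digest_concatenation(field_values)
```

The incremental form never holds the full concatenated string in memory, which is the likely source of the lower overhead. One consequence in this sketch is that no separator sits between fields, so the values `["ab", "c"]` and `["a", "bc"]` produce the same digest.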
Begin by downloading the latest release of this code. Extract the contents of the archive into your Nuix scripts directory. On Windows, the script directory is likely one of the following:
- %appdata%\Nuix\Scripts - User level script directory
- %programdata%\Nuix\Scripts - System level script directory
- Choose the metadata profile to use. The fields present in the selected metadata profile dictate which values are used to generate the custom MD5 digest.
- Choose whether the item's content text should be included when generating the custom MD5 digest.
- Choose the name of the item set. If no item set with the provided name exists, one will be created. If an item set with the given name does exist, items will be added to that item set. Important: Adding items to an existing item set that was previously populated by any other means, or by this script with different settings (such as a different metadata profile), will produce undefined results. If adding to an existing item set, make sure to use this script and the same settings each time!
- Choose whether deduplication is performed per item or per family.
- Choose whether the custom MD5 digest is recorded onto the item as custom metadata.
- Choose the name of the custom metadata field to record the custom MD5 digest into.
- Choose whether to use existing values in the custom metadata field if they are present. See below if you wish to enable this setting!
If items are selected in the result view when the script is run, those selected items will be added to the designated item set. If no items are selected when the script is run, all items in the case will be added.
Please read this section thoroughly if using this setting!
When checked, the code that would normally generate the customized digest for an item during item set creation first checks whether the item already has a value stored in the specified custom metadata field. If the item has a custom metadata field with the name specified for the setting Digest Custom Metadata Field, the script will further check:
- Is the value of the field non null?
- Is the value a String?
- Is the value neither empty nor made up only of whitespace characters?
If the value passes all of these checks, then the existing value will be used rather than generating the value from the profile.
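The checks above can be sketched as a small function. This is an illustration of the described logic rather than the script's actual code; `resolve_digest`, the dictionary standing in for an item's custom metadata, and the `generate_digest` callback are hypothetical names.

```python
def resolve_digest(custom_metadata, field_name, generate_digest):
    """Return the stored digest if it passes every check, else generate one.

    custom_metadata: dict standing in for an item's custom metadata
    field_name:      name of the field that may hold an earlier digest
    generate_digest: zero-argument callable producing a digest from the profile
    """
    value = custom_metadata.get(field_name)  # None when the field is absent
    if value is not None and isinstance(value, str) and value.strip():
        return value
    return generate_digest()


# Hypothetical placeholder for a freshly generated digest.
fresh = lambda: "0123456789abcdef0123456789abcdef"

# A usable stored value is reused as-is.
print(resolve_digest({"Custom MD5": "feedface"}, "Custom MD5", fresh))
# A whitespace-only value fails the last check, so a new digest is generated.
print(resolve_digest({"Custom MD5": "   "}, "Custom MD5", fresh))
```

Note that nothing in these checks confirms where the stored value came from, which is why the setting needs care.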
Care should be taken when using this setting! The script has no way to know whether the digest value present in the field was generated by this script, or whether it was generated using the same metadata profile! It is entirely possible to run the script once with profile A on some items, then run it again later with profile B (or even a modified profile A) on some of those same items, in which case incompatible digests could be used!
This script relies on code from Nx to present a settings dialog and progress dialog. This JAR file is not included in the repository (although it is included in release downloads). If you clone this repository, you will also want to obtain a copy of Nx.jar by either:
- Building it from the source
- Downloading an already built JAR file from the Nx releases
Once you have a copy of Nx.jar, make sure to include it in the same directory as the script.
This script also relies on code from SuperUtilities. This JAR file is not included in the repository (although it is included in release downloads). If you clone this repository, you will also want to obtain a copy of SuperUtilities.jar by either:
- Building it from the source
- Downloading an already built JAR file from the SuperUtilities releases
Once you also have a copy of SuperUtilities.jar, make sure to include it in the same directory as the script.
Copyright 2021 Nuix
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.