document update and bump up to version 0.2.1
beniyama committed Aug 9, 2017
1 parent 6126868 commit b38ee7c
Showing 2 changed files with 31 additions and 4 deletions.
33 changes: 30 additions & 3 deletions README.md
@@ -1,23 +1,30 @@
# Mask filter plugin for Embulk

-mask columns with asterisks (still in initial development phase and missing basic functionalities to use in production )
+Mask columns with asterisks in a variety of patterns (still in initial development phase and missing basic features to use in production).

## Overview

* **Plugin type**: filter

## Configuration

*Caution* : Now we use `type` to specify mask types such as `all` and `email`, instead of `pattern` which was used in version 0.1.1 or earlier.

- **columns**: target columns which would be replaced with asterisks (string, required)
- **name**: name of the column (string, required)
-- **type**: mask type, `all` or `email` (string, default: `all`)
+- **type**: mask type, `all`, `email`, `regex` or `substring` (string, default: `all`)
- **paths**: list of JSON path and type, works if the column type is JSON
- `[{key: $.json_path1}, {key: $.json_path2}]` would mask both `$.json_path1` and `$.json_path2` nodes
- Elements under the nodes would be converted to string and then masked (e.g., `[0,1,2]` -> `*******`)
-- **length**: if specified, this filter replaces the column with fixed number of asterisks (integer, optional)
+- **length**: if specified, this filter replaces the column with fixed number of asterisks (integer, optional. supported only in `all`, `email`, `substring`.)
- **pattern**: Regex pattern such as "[0-9]+" (string, required for `regex` type)
- **start**: The beginning index for `substring` type. The value starts from 0 and inclusive (integer, default: 0)
- **end**: The ending index for `substring` type. The value is exclusive (integer, default: length of the target column)
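
As an illustration only, the `all` and `email` types and the JSON-node rule described above can be sketched in Python. This is not the plugin's Java implementation: the function names `mask_all`, `mask_email`, and `mask_json_node` are hypothetical, and two behaviors are assumptions rather than facts from this README (a value without `@` falls back to `all`, and JSON nodes are stringified in compact form so that `[0,1,2]` is seven characters long).

```python
import json


def mask_all(value, length=None):
    """`all`: one asterisk per character, or a fixed count if `length` is set."""
    return "*" * (len(value) if length is None else length)


def mask_email(value, length=None):
    """`email`: mask only the part before the first '@' and keep the domain."""
    local, at, domain = value.partition("@")
    if not at:
        # Assumption: a value without '@' is masked like `all`.
        return mask_all(value, length)
    return mask_all(local, length) + at + domain


def mask_json_node(node, length=None):
    """JSON nodes are converted to a string first, then masked like `all`."""
    # Assumption: compact separators, so [0, 1, 2] becomes "[0,1,2]".
    text = node if isinstance(node, str) else json.dumps(node, separators=(",", ":"))
    return mask_all(text, length)
```

Under these assumptions, `mask_all("Reid")` gives four asterisks, `mask_email("christian.reid@example.com", length=5)` keeps `@example.com` behind five asterisks (the address itself is an assumed input), and `mask_json_node([0, 1, 2])` gives seven asterisks, matching the `[0,1,2]` example in the list.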

## Example



If you have below data in csv or other format file,

|first_name | last_name | gender | age | contact |
@@ -49,6 +56,26 @@ would produce
| Christian | **** | male | ** | *****@example.com |
| Amy | ***** | female | ** | *****@example.com |

If you use `regex` or `substring` types,

```yaml
filters:
- type: mask
columns:
- { name: first_name, type: regex, pattern: "[a-z]"}
- { name: contact, type: substring, start: 5, length: 5}
```

would produce

|first_name | last_name | gender | age | contact |
|---|---|---|---|---|
| B******* | Bell | male | 30 | bell.***** |
| L**** | Duncan | male | 20 | lucas***** |
| E******* | May | female | 25 | eliza***** |
| C******** | Reid | male | 15 | chris***** |
| A** | Avery | female | 40 | amy.a***** |
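
The `first_name` and `contact` cells in the table above are consistent with the following Python sketch of the `regex` and `substring` types. It is a hedged illustration, not the plugin's code: `mask_regex` and `mask_substring` are hypothetical names, replacing each regex match with asterisks of the same length is an assumption, and the input address used in the usage note is assumed because the input table is not shown here.

```python
import re


def mask_regex(value, pattern):
    """`regex`: replace every match with as many asterisks as it has characters."""
    # Assumption: a multi-character match such as "[0-9]+" keeps its length.
    return re.sub(pattern, lambda m: "*" * len(m.group(0)), value)


def mask_substring(value, start=0, end=None, length=None):
    """`substring`: mask value[start:end]; `start` is inclusive, `end` exclusive."""
    # `end` defaults to the length of the target value, as documented above.
    end = len(value) if end is None else end
    # `length`, if given, fixes the number of asterisks regardless of the span.
    masked = "*" * (len(value[start:end]) if length is None else length)
    return value[:start] + masked + value[end:]
```

For example, `mask_regex("Amy", "[a-z]")` returns `A**`, and `mask_substring("amy.avery@example.com", start=5, length=5)` returns `amy.a*****`: the first five characters are kept and everything from index 5 onward collapses into five asterisks because `end` defaults to the end of the value.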

JSON type column is also partially supported.

If you have a `user` column with this JSON data structure
2 changes: 1 addition & 1 deletion build.gradle
@@ -15,7 +15,7 @@ configurations {
provided
}

-version = "0.1.1"
+version = "0.2.1"

sourceCompatibility = 1.7
targetCompatibility = 1.7