This repository was archived by the owner on Sep 26, 2023. It is now read-only.

Add basic JSON support #25

Open: albertpastrana wants to merge 1 commit into master from add-basic-json-support


albertpastrana (Member)

With this commit, the anon tool can read, anonymise and output basic JSON files.

It currently supports only one level of JSON fields, so nested objects are not supported.

Some small refactoring has been done, and the solution could very probably be a bit DRYer, but I don't have time to do that now.

Signed-off-by: Albert Pastrana <albert.pastrana@intenthq.com>
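For readers skimming the thread, here is a minimal sketch of what "one level" of JSON support could look like. It is not the code from this PR: the `Action` type, `anonymiseLine` and the local `outcode` helper are hypothetical names, and leaving nested objects untouched is an assumption about how unsupported depth might be handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Action transforms a single field value. Hypothetical type, not anon's own.
type Action func(string) string

// outcode keeps only the part of a UK postcode before the space.
func outcode(postcode string) string {
	return strings.SplitN(postcode, " ", 2)[0]
}

// anonymiseLine decodes one JSON object, applies each action to the matching
// top-level string field and re-encodes the result. Nested objects and
// non-string values are passed through unchanged (an assumption of this sketch).
func anonymiseLine(line string, actions map[string]Action) (string, error) {
	var record map[string]interface{}
	if err := json.Unmarshal([]byte(line), &record); err != nil {
		return "", err
	}
	for field, action := range actions {
		if value, ok := record[field].(string); ok {
			record[field] = action(value)
		}
	}
	out, err := json.Marshal(record)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	actions := map[string]Action{"postcode": outcode}
	line := `{"id":"42","postcode":"W1W 8BE","address":{"postcode":"N1 9GU"}}`
	out, err := anonymiseLine(line, actions)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// encoding/json writes map keys in sorted order, so field order may
	// differ from the input.
	fmt.Println(out)
}
```

Only the top-level `postcode` is shortened here; the one nested under `address` is left as it was, which is the limitation the description refers to.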

albertpastrana force-pushed the add-basic-json-support branch from 8239ffa to bdb5070 on January 2, 2019 14:49

codecov bot commented Jan 2, 2019

Codecov Report

Merging #25 into master will decrease coverage by 6.62%.
The diff coverage is 65.47%.


@@            Coverage Diff             @@
##           master      #25      +/-   ##
==========================================
- Coverage      85%   78.37%   -6.63%     
==========================================
  Files           3        5       +2     
  Lines         140      185      +45     
==========================================
+ Hits          119      145      +26     
- Misses         18       37      +19     
  Partials        3        3
| Impacted Files | Coverage Δ |
|---|---|
| config.go | 100% <ø> (ø) ⬆️ |
| main.go | 43.24% <0%> (-24.45%) ⬇️ |
| anonymisations.go | 100% <100%> (ø) ⬆️ |
| csv_processor.go | 70% <70%> (ø) |
| json_processor.go | 72% <72%> (ø) |
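The headline figures can be checked from the Hits, Misses and Partials rows. The sketch below assumes coverage is hits divided by all tracked lines (hits + misses + partials), which matches the Lines row of the report; the `coverage` helper is a hypothetical name.

```go
package main

import "fmt"

// coverage returns hits as a percentage of all tracked lines.
// Assumes lines = hits + misses + partials, as in the report above.
func coverage(hits, misses, partials int) float64 {
	return 100 * float64(hits) / float64(hits+misses+partials)
}

func main() {
	before := coverage(119, 18, 3) // master: 119 of 140 lines
	after := coverage(145, 37, 3)  // #25: 145 of 185 lines
	fmt.Printf("master: %.2f%%\n", before)
	fmt.Printf("#25:    %.2f%%\n", after)
	fmt.Printf("change: %.2f%%\n", after-before)
}
```

The unrounded figure for #25 is 145/185 ≈ 78.378%, so the drop is about 6.62 points, as the summary sentence says. The 78.37% and -6.63% shown in the diff block appear to come from truncating rather than rounding the percentage.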

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fecd661...744ecb9.

@nathankleyn (Contributor) left a comment

Starting with documentation and high-level review for now — a few comments from me! 👍

README.md Outdated
// column.
"idColumn": "0"
},
"json": {
@nathankleyn (Contributor), Apr 1, 2019

I think it's a bit confusing to have both csv and json in the same example — should give a bit of preamble and have two examples. Maybe something like this:

At a high-level, the config looks like this:

{
  // Name of the format of the input file
  // Currently supports "csv" and "json"
  "formatName": {
    // Options for the format you have picked go here.
    // See the documentation for the format you choose below.
  },
  "sampling": {
    // FIXME: Steal from below.
  },
  "actions": [
    // FIXME: Steal from below.
  ]
}

Formats

You can use CSV or JSON files as input.

CSV

For a CSV file you will need a config like this:

"csv": {
  "delimiter": ",",
  // Specify in which column a unique ID exists on which the sampling can
  // be performed. Indices are 0 based, so this would sample on the first
  // column.
  "idColumn": "0"
}

JSON

For a JSON file you will need to define config like this:

"json": {
  // Specify in which field a unique ID exists on which the sampling can
  // be performed.
  "idField": "id"
}
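To illustrate the layout proposed above, here is a rough sketch of decoding a config that carries one format section. The struct and function names (`Config`, `CsvConfig`, `JsonConfig`, `parseConfig`, `format`) are hypothetical, and rejecting a config that defines both sections is an assumption rather than something the tool is known to do. Note that `encoding/json` does not accept `//` comments, so the annotated examples would need their comments stripped before decoding.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// CsvConfig and JsonConfig mirror the two format sections shown above.
// The type names are hypothetical.
type CsvConfig struct {
	Delimiter string `json:"delimiter"`
	IDColumn  string `json:"idColumn"`
}

type JsonConfig struct {
	IDField string `json:"idField"`
}

// Config holds at most one format section; a nil pointer means "absent".
type Config struct {
	Csv  *CsvConfig  `json:"csv"`
	Json *JsonConfig `json:"json"`
}

// format reports which section is present. Treating "both" and "neither"
// as errors is an assumption of this sketch.
func (c Config) format() (string, error) {
	switch {
	case c.Csv != nil && c.Json != nil:
		return "", errors.New("config defines both csv and json")
	case c.Csv != nil:
		return "csv", nil
	case c.Json != nil:
		return "json", nil
	}
	return "", errors.New("config defines neither csv nor json")
}

// parseConfig decodes a comment-free JSON config.
func parseConfig(raw string) (Config, error) {
	var c Config
	err := json.Unmarshal([]byte(raw), &c)
	return c, err
}

func main() {
	c, err := parseConfig(`{"json": {"idField": "id"}}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	name, err := c.format()
	fmt.Println(name, err)
}
```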

README.md Outdated
"name": "outcode",
// what field in the json this action needs to be applied. If a field in
// the json doesn't have an action defined, then it will be left untouched.
"JsonField": "postcode"
Contributor

Can we camel-case this to match the rest? eg. jsonField instead of JsonField

Member Author

`jsonField` is actually how it should be; `JsonField` was a typo (probably from copy-pasting from the Go code).

albertpastrana force-pushed the add-basic-json-support branch from bdb5070 to 744ecb9 on April 1, 2019 13:17
@nathankleyn (Contributor) left a comment

Couple of final things here — and codecov is complaining about coverage, worth a look! 👍

@@ -8,7 +8,7 @@
</a> [![Go Report Card](https://goreportcard.com/badge/github.com/intenthq/anon)](https://goreportcard.com/report/github.com/intenthq/anon) [![License](https://img.shields.io/npm/l/express.svg)](https://github.com/intenthq/anon/LICENSE)
![GitHub release](https://img.shields.io/github/release/intenthq/anon.svg)

- Anon is a tool for taking delimited files and anonymising or transforming columns until the output is useful for applications where sensitive information cannot be exposed.
+ Anon is a tool for taking delimited files and anonymising or transforming columns/fields until the output is useful for applications where sensitive information cannot be exposed. Currently this tools supports both CSV and JSON files (with one level of depth).
Contributor

Suggested change
- Anon is a tool for taking delimited files and anonymising or transforming columns/fields until the output is useful for applications where sensitive information cannot be exposed. Currently this tools supports both CSV and JSON files (with one level of depth).
+ Anon is a tool for taking delimited or JSON files and anonymising or transforming columns/fields until the output is useful for applications where sensitive information cannot be exposed. Currently this tool supports both CSV and JSON files (with one level of depth).

@@ -61,7 +63,10 @@ In order to be useful, Anon needs to be told what you want to do to each column
  {
  // Takes a UK format postcode (eg. W1W 8BE) and just keeps the outcode
  // (eg. W1W).
- "name": "outcode"
+ "name": "outcode",
+ // what field in the json this action needs to be applied. If a field in
Contributor

This feels a bit out of place — it suddenly talks about JSON when above we've changed it all to talk about there being multiple formats. Worth breaking out into a little explanation about how CSVs will infer the column number whereas JSON needs the jsonField to tell it which one you wanted?
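Something like the following could back up that explanation in the README. It is only a sketch under assumptions: the `Action` struct, `applyCSV`, `applyJSON` and the `identity` and `outcode` helpers are hypothetical names, and it assumes the i-th action applies to the i-th CSV column while a JSON action is looked up by its `jsonField`.

```go
package main

import (
	"fmt"
	"strings"
)

// Action is a hypothetical action entry. JsonField is only consulted for
// JSON input; for CSV input the position in the list selects the column.
type Action struct {
	Name      string
	JsonField string
	Apply     func(string) string
}

// identity leaves a value unchanged.
func identity(s string) string { return s }

// outcode keeps only the part of a UK postcode before the space.
func outcode(s string) string { return strings.SplitN(s, " ", 2)[0] }

// applyCSV applies the i-th action to the i-th column. Columns without a
// matching action are copied as they are.
func applyCSV(row []string, actions []Action) []string {
	out := make([]string, len(row))
	copy(out, row)
	for i, a := range actions {
		if i < len(out) {
			out[i] = a.Apply(out[i])
		}
	}
	return out
}

// applyJSON applies each action to the field named by its JsonField.
// Fields without an action are left untouched.
func applyJSON(record map[string]string, actions []Action) map[string]string {
	out := make(map[string]string, len(record))
	for k, v := range record {
		out[k] = v
	}
	for _, a := range actions {
		if v, ok := out[a.JsonField]; ok {
			out[a.JsonField] = a.Apply(v)
		}
	}
	return out
}

func main() {
	actions := []Action{
		{Name: "nothing", JsonField: "id", Apply: identity},
		{Name: "outcode", JsonField: "postcode", Apply: outcode},
	}
	fmt.Println(applyCSV([]string{"42", "W1W 8BE"}, actions))
	fmt.Println(applyJSON(map[string]string{"postcode": "W1W 8BE", "id": "42"}, actions))
}
```

With CSV the order of the actions carries the meaning, so reordering them changes which column is transformed; with JSON the order is irrelevant because each action names its field.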
