From 70760773d2e4cbd30eac13e51a3f8171c20cffed Mon Sep 17 00:00:00 2001
From: rightlag
Date: Mon, 7 Aug 2017 22:24:29 -0400
Subject: [PATCH] Update readme.md
---
readme.md | 281 +++++++++++++++++++++++++++++-------------------------
1 file changed, 151 insertions(+), 130 deletions(-)
diff --git a/readme.md b/readme.md
index 2507a72..4c61e24 100644
--- a/readme.md
+++ b/readme.md
@@ -7,17 +7,52 @@
-> A module that parses [JSON Schema](http://json-schema.org/) documents to validate client-submitted data and convert JSON schema documents to Avro schema documents.
+> Validate client-submitted data using [JSON Schema](http://json-schema.org/) documents and convert JSON Schema documents into different data-interchange formats.
+
+## Contents
+
+- [Installation](#installation)
+- [Usage](#usage)
+- [Data Validation](#data-validation)
+- [Data Validation CLI](#data-validation-cli)
+- [Data Validation API](#data-validation-api)
+- [Structured Message Generation](#structured-message-generation)
+- [Supported Data-Interchange Formats](#supported-data-interchange-formats)
+- [Avro](#avro)
+- [Data-Interchange CLI](#data-interchange-cli)
+- [Data-Interchange API](#data-interchange-api)
+- [Testing](#testing)
+- [Additional Resources](#additional-resources)
+- [Maintainers](#maintainers)
+- [Contributing](#contributing)
+
+## Why aptos?
+
+- Validate client-submitted data
+- Convert JSON Schema documents into different data-interchange formats
+- Simple syntax
+- CLI support for data validation and JSON Schema conversion
+- [Swagger](https://swagger.io/) specification support
+
+## Installation
+
+**via git**
+
+ $ git clone https://github.com/pennsignals/aptos.git && cd aptos
+ $ python setup.py install
## Usage
-`aptos` supports validating client-submitted data and generates Avro structured messages from a given JSON Schema document.
+`aptos` supports the following capabilities:
+
+ - **Data Validation:** Validate client-submitted data using [validation keywords](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6) described in the JSON Schema specification.
+ - **Schema Conversion:** Convert JSON Schema documents into different data-interchange formats. See the list of [supported data-interchange formats](#supported-data-interchange-formats) for more information.
```
usage: aptos [arguments] SCHEMA
aptos is a tool for validating client-submitted data using the JSON Schema
-vocabulary and converts JSON Schema documents to different data-interchange
+vocabulary and converts JSON Schema documents into different data-interchange
formats.
positional arguments:
@@ -29,7 +64,7 @@ optional arguments:
Arguments:
{validate,convert}
validate Validate a JSON instance
- convert Convert a JSON Schema to a different data-interchange
+ convert Convert a JSON Schema into a different data-interchange
format
More information on JSON Schema: http://json-schema.org/
@@ -38,64 +73,53 @@ More information on JSON Schema: http://json-schema.org/
## Data Validation
-Given a JSON Schema document, `aptos` can validate client-submitted data to ensure that it satisfies a certain number of criteria.
+Here is a basic example of a JSON Schema:
```json
{
- "title": "Product",
+ "title": "Person",
"type": "object",
- "definitions": {
- "geographical": {
- "title": "Geographical",
- "description": "A geographical coordinate",
- "type": "object",
- "properties": {
- "latitude": { "type": "number" },
- "longitude": { "type": "number" }
- }
- }
- },
"properties": {
- "id": {
- "description": "The unique identifier for a product",
- "type": "number"
- },
- "name": {
+ "firstName": {
"type": "string"
},
- "price": {
- "type": "number",
- "minimum": 0,
- "exclusiveMinimum": true
- },
- "tags": {
- "type": "array",
- "items": {
- "type": "string"
- },
- "minItems": 1,
- "uniqueItems": true
- },
- "dimensions": {
- "title": "Dimensions",
- "type": "object",
- "properties": {
- "length": {"type": "number"},
- "width": {"type": "number"},
- "height": {"type": "number"}
- },
- "required": ["length", "width", "height"]
+ "lastName": {
+ "type": "string"
},
- "warehouseLocation": {
- "description": "Coordinates of the warehouse with the product",
- "$ref": "#/definitions/geographical"
+ "age": {
+ "description": "Age in years",
+ "type": "integer",
+ "minimum": 0
}
},
- "required": ["id", "name", "price"]
+ "required": ["firstName", "lastName"]
}
```
-Validation keywords such as `uniqueItems`, `required`, and `minItems` can be used in a schema to impose requirements for successful validation of an instance.
+Given a JSON Schema, `aptos` can validate client-submitted data to ensure that it satisfies the constraints declared in the schema.
+
+JSON Schema [validation keywords](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6) such as `minimum` and `required` impose requirements that an instance must meet to validate successfully. In the JSON Schema above, both the `firstName` and `lastName` properties are required, and the `age` property *MUST* have a value greater than or equal to 0.
+
+| Valid Instance :heavy_check_mark: | Invalid Instance :heavy_multiplication_x: |
+|-------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------|
+| `{"firstName": "John", "lastName": "Doe", "age": 42}` | `{"firstName": "John", "age": -15}` (missing required property `lastName` and `age` is not greater than or equal to 0) |
+
+`aptos` can validate client-submitted data using either the CLI or the API:
+
+### Data Validation CLI
+
+ $ aptos validate -instance INSTANCE SCHEMA
+
+**Arguments:**
+
+ - **INSTANCE:** JSON document being validated
+ - **SCHEMA:** JSON Schema document that the instance is validated against
+
+| Successful Validation :heavy_check_mark: | Unsuccessful Validation :heavy_multiplication_x: |
+|----------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------|
+| ![](https://user-images.githubusercontent.com/2184329/29053486-5c787966-7bbe-11e7-8fd3-4cb51d87d7d9.png) | ![](https://user-images.githubusercontent.com/2184329/29053538-afcce9c6-7bbe-11e7-8be5-61ac1d876fc1.png) |
+
+### Data Validation API
```python
import json
@@ -106,33 +130,66 @@ from aptos.visitor import ValidationVisitor
with open('/path/to/schema') as fp:
schema = json.load(fp)
-component = SchemaParser.parse('/path/to/schema')
-# Valid client-submitted data (instance)
+component = SchemaParser.parse(schema)
+# Invalid client-submitted data (instance)
instance = {
- "id": 2,
- "name": "An ice sculpture",
- "price": 12.50,
- "tags": ["cold", "ice"],
- "dimensions": {
- "length": 7.0,
- "width": 12.0,
- "height": 9.5
- },
- "warehouseLocation": {
- "latitude": -78.75,
- "longitude": 20.4
- }
+ 'firstName': 'John'
}
-component.accept(ValidationVisitor(instance))
+try:
+ component.accept(ValidationVisitor(instance))
+except AssertionError as e:
+ print(e) # instance {'firstName': 'John'} is missing required property 'lastName'
```
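To make the two keywords from the `Person` example concrete, here is a minimal sketch in plain Python of what checking `required` and `minimum` amounts to. It does not use `aptos` and is not how `aptos` implements validation; the function name `check_instance` and its error strings are hypothetical, and it only handles a flat object schema.

```python
# Illustrative only: a hand-rolled check for two JSON Schema keywords.
# `check_instance` is a hypothetical helper, not part of the aptos API.
PERSON_SCHEMA = {
    'title': 'Person',
    'type': 'object',
    'properties': {
        'firstName': {'type': 'string'},
        'lastName': {'type': 'string'},
        'age': {'description': 'Age in years', 'type': 'integer', 'minimum': 0},
    },
    'required': ['firstName', 'lastName'],
}


def check_instance(instance, schema):
    """Return a list of violations of `required` and `minimum`."""
    errors = []
    # `required`: every listed property name must be present in the instance.
    for name in schema.get('required', []):
        if name not in instance:
            errors.append('missing required property %r' % name)
    # `minimum`: a present value must be greater than or equal to the bound.
    for name, subschema in schema.get('properties', {}).items():
        if name in instance and 'minimum' in subschema:
            if instance[name] < subschema['minimum']:
                errors.append('%r must be >= %r' % (name, subschema['minimum']))
    return errors


valid = {'firstName': 'John', 'lastName': 'Doe', 'age': 42}
invalid = {'firstName': 'John', 'age': -15}
print(check_instance(valid, PERSON_SCHEMA))    # []
print(check_instance(invalid, PERSON_SCHEMA))
```

Unlike the `ValidationVisitor` example above, this sketch collects every violation instead of raising on the first one, which is a design choice made here for illustration.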
## Structured Message Generation
-Given a JSON Schema document, `aptos` can generate Avro structured messages.
+Given a JSON Schema, `aptos` can generate structured messages in different data-interchange formats.
+
+:warning: **Note:** The JSON Schema being converted *MUST* be a valid [JSON Object](https://spacetelescope.github.io/understanding-json-schema/reference/object.html).
+
+## Supported Data-Interchange Formats
+
+| Format | Supported | Notes |
+|---------------------------------------------------------------------|:------------------------:|-----------------------------|
+| [Apache Avro](https://avro.apache.org/) | :heavy_check_mark: | |
+| [Protocol Buffers](https://developers.google.com/protocol-buffers/) | :heavy_multiplication_x: | Planned for future releases |
+| [Apache Thrift](https://thrift.apache.org/) | :heavy_multiplication_x: | Planned for future releases |
+| [Apache Parquet](https://parquet.apache.org/) | :heavy_multiplication_x: | Planned for future releases |
### Avro
-For brevity, the [Product](https://github.com/pennsignals/aptos/blob/master/tests/schema/product) schema is omitted from the example.
+Using the `Person` schema in the previous example, `aptos` can convert the schema into the Avro data-interchange format using either the CLI or the API.
+
+`aptos` maps the following JSON Schema types to Avro types:
+
+| JSON Schema Type | Avro Type |
+|------------------|-----------|
+| `string` | `string` |
+| `boolean` | `boolean` |
+| `null` | `null` |
+| `integer` | `long` |
+| `number` | `double` |
+| `object` | `record` |
+| `array` | `array` |
+
+JSON Schema documents containing the `enum` validation keyword are mapped to the `symbols` attribute of an Avro [`enum`](http://avro.apache.org/docs/current/spec.html#Enums).
+
+JSON Schema documents with the `type` keyword as an array are mapped to Avro [Union](http://avro.apache.org/docs/current/spec.html#Unions) types.
+
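The mapping rules above can be sketched as a small lookup in plain Python. This is an illustration of the table, not the `aptos` implementation: the names `JSON_TO_AVRO`, `avro_type`, and `avro_enum` are hypothetical, and the sketch only returns Avro type names, so `record` and `array` would still need their `fields` and `items` filled in.

```python
# Illustrative only: the JSON Schema -> Avro type table as a dictionary.
# All names here are hypothetical and not part of the aptos API.
JSON_TO_AVRO = {
    'string': 'string',
    'boolean': 'boolean',
    'null': 'null',
    'integer': 'long',
    'number': 'double',
    'object': 'record',
    'array': 'array',
}


def avro_type(json_type):
    """Map a JSON Schema `type` value to an Avro type name or union."""
    if isinstance(json_type, list):
        # An array-valued `type` becomes an Avro union,
        # which Avro writes as a JSON array of types.
        return [JSON_TO_AVRO[t] for t in json_type]
    return JSON_TO_AVRO[json_type]


def avro_enum(name, values):
    """Build an Avro enum whose `symbols` come from a JSON Schema `enum`."""
    return {'type': 'enum', 'name': name, 'symbols': list(values)}


print(avro_type('integer'))             # long
print(avro_type(['string', 'null']))    # ['string', 'null']
print(avro_enum('Suit', ['HEARTS', 'SPADES']))
```

Note that Avro enum symbols must be valid Avro names, so a JSON Schema `enum` with arbitrary values may not convert directly; the sketch does not check for this.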
+## Data-Interchange CLI
+
+ $ aptos convert -format FORMAT SCHEMA
+
+**Arguments:**
+
+ - **FORMAT:** Data-interchange format
+ - **SCHEMA:** JSON Schema document to convert
+
+
+
+
+
+## Data-Interchange API
```python
import json
@@ -153,82 +210,46 @@ The above code generates the following Avro schema:
```json
{
"type": "record",
- "name": "Product",
"fields": [
{
- "type": "double",
- "name": "price",
- "doc": ""
- },
- {
+ "doc": "",
"type": "string",
- "name": "name",
- "doc": ""
- },
- {
- "type": {
- "type": "record",
- "name": "Geographical",
- "fields": [
- {
- "type": "double",
- "name": "latitude",
- "doc": ""
- },
- {
- "type": "double",
- "name": "longitude",
- "doc": ""
- }
- ]
- },
- "name": "warehouseLocation",
- "doc": "Coordinates of the warehouse with the product"
+ "name": "lastName"
},
{
- "type": {
- "type": "record",
- "name": "Dimensions",
- "fields": [
- {
- "type": "double",
- "name": "height",
- "doc": ""
- },
- {
- "type": "double",
- "name": "length",
- "doc": ""
- },
- {
- "type": "double",
- "name": "width",
- "doc": ""
- }
- ]
- },
- "name": "dimensions",
- "doc": ""
- },
- {
- "type": {
- "type": "array",
- "items": "string"
- },
- "name": "tags",
- "doc": ""
+ "doc": "",
+ "type": "string",
+ "name": "firstName"
},
{
- "type": "double",
- "name": "id",
- "doc": "The unique identifier for a product"
+ "doc": "Age in years",
+ "type": "long",
+ "name": "age"
}
- ]
+ ],
+ "name": "Person"
}
```
+## Testing
+
+All unit tests are located in the [tests](tests) directory.
+
+To run tests, execute the following command:
+
+ $ python setup.py test
+
+## Additional Resources
+
+ - [Stop Being a "Janitorial" Data Scientist](https://medium.com/@rightlag/stop-being-a-janitorial-data-scientist-5959cccbeac) - *A blog post explaining why aptos was created*
+ - [Understanding JSON Schema](https://spacetelescope.github.io/understanding-json-schema/) - *An excellent guide for schema authors, from the [Space Telescope Science Institute](http://www.stsci.edu/portal/)*
+
## Maintainers
| ![Jason Walsh](https://avatars3.githubusercontent.com/u/2184329?v=3&s=128) |
|----------------------------------------------------------------------------|
| [Jason Walsh](https://github.com/rightlag) |
+
+## Contributing
+
+Contributions welcome! Please read the [`contributing.json`](contributing.json) file first.