Merge d2e941e into a6c3545
SmetDenis committed Apr 5, 2024
2 parents a6c3545 + d2e941e commit 0fdc408
Showing 21 changed files with 1,344 additions and 120 deletions.
48 changes: 29 additions & 19 deletions README.md
@@ -15,7 +15,7 @@
[![Static Badge](https://img.shields.io/badge/Rules-118-green?label=Cell%20rules&labelColor=blue&color=gray)](src/Rules/Cell)
[![Static Badge](https://img.shields.io/badge/Rules-206-green?label=Aggregate%20rules&labelColor=blue&color=gray)](src/Rules/Aggregate)
[![Static Badge](https://img.shields.io/badge/Rules-8-green?label=Extra%20checks&labelColor=blue&color=gray)](#extra-checks)
[![Static Badge](https://img.shields.io/badge/Rules-17/11/25-green?label=Plan%20to%20add&labelColor=gray&color=gray)](tests/schemas/todo.yml)
[![Static Badge](https://img.shields.io/badge/Rules-17/11/20-green?label=Plan%20to%20add&labelColor=gray&color=gray)](tests/schemas/todo.yml)
<!-- auto-update:/rules-counter -->

A console utility designed for validating CSV files against a strictly defined schema and validation rules outlined
@@ -152,21 +152,6 @@ You can find launch examples in the [workflow demo](https://github.com/JBZoo/Csv
```
<!-- auto-update:/github-actions-yml -->

To see user-friendly error outputs in your pull requests (PRs), specify `report: github`. This
utilizes [annotations](https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#setting-a-warning-message)
to highlight bugs directly within the GitHub interface at the PR level. This feature allows errors to be displayed in
the exact location within the CSV file, right in the diff of your Pull Requests. For a practical example,
view [this live demo PR](https://github.com/JBZoo/Csv-Blueprint-Demo/pull/1/files).

![GitHub Actions - PR](.github/assets/github-actions-pr.png)

<details>
<summary>Click to see example in GitHub Actions terminal</summary>

![GitHub Actions - Terminal](.github/assets/github-actions-termintal.png)

</details>

### Docker container

Ensure you have Docker installed on your machine.
@@ -307,6 +292,10 @@ description: | # Any description of the CSV file. Not u
supporting a wide range of data validation rules from basic type checks to complex regex validations.
This example serves as a comprehensive guide for creating robust CSV file validations.
includes:
parent-alias: ./readme_sample.yml # Include another schema and define an alias for it.


# Regular expression to match the file name. If not set, then no pattern check.
# This allows you to pre-validate the file name before processing its contents.
# Feel free to check parent directories as well.
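The comment above describes a pre-check of the file name before the contents are read. A minimal sketch of that idea follows, assuming the pattern is applied to the whole path with PHP's `preg_match`. The helper name `matchesFilenamePattern` is hypothetical and not part of the tool; the pattern itself is the one from `schema-examples/full_clean.yml` in this commit.

```php
<?php
// Pattern copied from schema-examples/full_clean.yml in this commit.
const FILENAME_PATTERN = '/demo(-\d+)?\.csv$/i';

// Hypothetical helper (an assumption, not the tool's API):
// returns true when the path matches, or when no pattern is configured.
function matchesFilenamePattern(string $path, ?string $pattern = FILENAME_PATTERN): bool
{
    if ($pattern === null || $pattern === '') {
        return true; // "If not set, then no pattern check."
    }

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $path) === 1;
}
```

Because the pattern is anchored only at the end (`$`), a path such as `data/demo-12.csv` would pass, which is how a check on parent directories could be expressed by extending the expression to the left.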
@@ -513,9 +502,9 @@ columns:

# Identifications
phone: ALL # Validates if the input is a phone number. Specify the country code to validate the phone number for a specific country. Examples: "ALL", "US", "BR".
postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org
is_iban: true # IBAN - International Bank Account Number. See: https://en.wikipedia.org/wiki/International_Bank_Account_Number
is_bic: true # Validates a Bank Identifier Code (BIC) according to ISO 9362 standards. See: https://en.wikipedia.org/wiki/ISO_9362
postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org
is_imei: true # Validates an International Mobile Equipment Identity (IMEI). See: https://en.wikipedia.org/wiki/International_Mobile_Station_Equipment_Identity
is_isbn: true # Validates an International Standard Book Number (ISBN). See: https://www.isbn-international.org/content/what-isbn

@@ -1037,6 +1026,8 @@ The validation process culminates in a human-readable report detailing any error
the default report format is a table, the tool supports various output formats, including text, GitHub, GitLab,
TeamCity, JUnit, among others, to best suit your project's needs and your personal or team preferences.

### Table format

When using the `table` format (default), the output is organized in a clear, easily interpretable table that lists all
discovered errors. This format is ideal for quick reviews and sharing with team members for further action.

@@ -1088,12 +1079,30 @@ Summary:
<!-- auto-update:/output-table -->


### GitHub Action format

To see user-friendly error outputs in your pull requests (PRs), specify `report: github`. This
utilizes [annotations](https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#setting-a-warning-message)
to highlight bugs directly within the GitHub interface at the PR level. This feature allows errors to be displayed in
the exact location within the CSV file, right in the diff of your Pull Requests. For a practical example,
view [this live demo PR](https://github.com/JBZoo/Csv-Blueprint-Demo/pull/1/files).

![GitHub Actions - PR](.github/assets/github-actions-pr.png)

<details>
<summary>Click to see example in GitHub Actions terminal</summary>

![GitHub Actions - Terminal](.github/assets/github-actions-termintal.png)

</details>
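
For context, GitHub annotations are produced by printing workflow commands of the form `::error file=...,line=...::message` to standard output. The sketch below shows how such a line could be built for a CSV error. It is an illustration under assumptions, not the tool's actual implementation: the function names are hypothetical, and the escaping rules are the ones GitHub applies to workflow command data and properties.

```php
<?php
// Escape the message part of a workflow command (percent, CR, LF).
function escapeCommandData(string $value): string
{
    return str_replace(['%', "\r", "\n"], ['%25', '%0D', '%0A'], $value);
}

// Property values (file, title, ...) additionally escape ':' and ','.
function escapeCommandProperty(string $value): string
{
    return str_replace(
        ['%', "\r", "\n", ':', ','],
        ['%25', '%0D', '%0A', '%3A', '%2C'],
        $value
    );
}

// Hypothetical helper: builds one annotation line for an error in a CSV file.
function formatAnnotation(string $level, string $file, int $line, string $message): string
{
    return sprintf(
        '::%s file=%s,line=%d::%s',
        $level,
        escapeCommandProperty($file),
        $line,
        escapeCommandData($message)
    );
}
```

Printing the result, for example with `echo formatAnnotation('error', 'data/demo.csv', 3, 'Value is empty') . PHP_EOL;`, inside a GitHub Actions step is what makes the message appear next to the given line in the PR diff.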


### Text format
Optional format `text` with highlighted keywords:
```sh
./csv-blueprint validate:csv --report=text
```


![Report - Text](.github/assets/output-text.png)


@@ -1102,6 +1111,7 @@ Optional format `text` with highlighted keywords:
* The tool uses [JBZoo/CI-Report-Converter](https://github.com/JBZoo/CI-Report-Converter) as an SDK to convert reports to
different formats, so you can easily integrate it with any CI system.


## Benchmarks

Understanding the performance of this tool is crucial, but it's important to note that its efficiency is influenced by
@@ -1354,12 +1364,12 @@ It's random ideas and plans. No promises and deadlines. Feel free to [help me!](
* Flag to ignore file name pattern. It's useful when you have a lot of files, and you don't want to validate the file name.

* **Validation**
* Multiple `filename_pattern` values. Support a list of regexes.
* Multi values in one cell.
* Custom cell rule as a callback. It's useful when you have a complex rule that can't be described in the schema file.
* Custom aggregate rule as a callback. It's useful when you have a complex rule that can't be described in the schema file.
* Configurable keyword for null/empty values. By default, it's an empty string, but you could use `null`, `nil`, `none`, `empty`, etc. Overridable on the column level.
* Handle empty files and files with only a header row, or only with one line of data. One column without a header is also possible.
* Inheritance of schemas, rules and columns. Define parent schema and override some rules in the child schemas. Make it DRY and easy to maintain.
* If option `--schema` is not specified, then validate only super base level things (like "is it a CSV file?").
* Complex rules (like "if field `A` is not empty, then field `B` should be not empty too").
* Extending with custom rules and custom report formats. Plugins?
6 changes: 5 additions & 1 deletion schema-examples/full.json
@@ -2,6 +2,10 @@
"name" : "CSV Blueprint Schema Example",
"description" : "This YAML file provides a detailed description and validation rules for CSV files\nto be processed by CSV Blueprint tool. It includes specifications for file name patterns,\nCSV formatting options, and extensive validation criteria for individual columns and their values,\nsupporting a wide range of data validation rules from basic type checks to complex regex validations.\nThis example serves as a comprehensive guide for creating robust CSV file validations.\n",

"includes" : {
"parent-alias" : ".\/readme_sample.yml"
},

"filename_pattern" : "\/demo(-\\d+)?\\.csv$\/i",

"csv" : {
@@ -147,9 +151,9 @@
"is_luhn" : true,

"phone" : "ALL",
"postal_code" : "US",
"is_iban" : true,
"is_bic" : true,
"postal_code" : "US",
"is_imei" : true,
"is_isbn" : true,

6 changes: 5 additions & 1 deletion schema-examples/full.php
@@ -23,6 +23,10 @@
This example serves as a comprehensive guide for creating robust CSV file validations.
',

'includes' => [
'parent-alias' => './readme_sample.yml',
],

'filename_pattern' => '/demo(-\\d+)?\\.csv$/i',

'csv' => [
@@ -167,9 +171,9 @@
'is_luhn' => true,

'phone' => 'ALL',
'postal_code' => 'US',
'is_iban' => true,
'is_bic' => true,
'postal_code' => 'US',
'is_imei' => true,
'is_isbn' => true,

6 changes: 5 additions & 1 deletion schema-examples/full.yml
@@ -22,6 +22,10 @@ description: | # Any description of the CSV file. Not u
supporting a wide range of data validation rules from basic type checks to complex regex validations.
This example serves as a comprehensive guide for creating robust CSV file validations.
includes:
parent-alias: ./readme_sample.yml # Include another schema and define an alias for it.


# Regular expression to match the file name. If not set, then no pattern check.
# This allows you to pre-validate the file name before processing its contents.
# Feel free to check parent directories as well.
@@ -228,9 +232,9 @@ columns:

# Identifications
phone: ALL # Validates if the input is a phone number. Specify the country code to validate the phone number for a specific country. Examples: "ALL", "US", "BR".
postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org
is_iban: true # IBAN - International Bank Account Number. See: https://en.wikipedia.org/wiki/International_Bank_Account_Number
is_bic: true # Validates a Bank Identifier Code (BIC) according to ISO 9362 standards. See: https://en.wikipedia.org/wiki/ISO_9362
postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org
is_imei: true # Validates an International Mobile Equipment Identity (IMEI). See: https://en.wikipedia.org/wiki/International_Mobile_Station_Equipment_Identity
is_isbn: true # Validates an International Standard Book Number (ISBN). See: https://www.isbn-international.org/content/what-isbn

5 changes: 4 additions & 1 deletion schema-examples/full_clean.yml
@@ -21,6 +21,9 @@ description: |
supporting a wide range of data validation rules from basic type checks to complex regex validations.
This example serves as a comprehensive guide for creating robust CSV file validations.
includes:
parent-alias: ./readme_sample.yml

filename_pattern: '/demo(-\d+)?\.csv$/i'

csv:
@@ -161,9 +164,9 @@ columns:
is_luhn: true

phone: ALL
postal_code: US
is_iban: true
is_bic: true
postal_code: US
is_imei: true
is_isbn: true

17 changes: 11 additions & 6 deletions src/Csv/Column.php
@@ -33,21 +33,21 @@ final class Column

private ?int $csvOffset = null;
private int $schemaId;
private Data $column;
private Data $data;
private array $rules;
private array $aggRules;

public function __construct(int $schemaId, array $config)
{
$this->schemaId = $schemaId;
$this->column = new Data($config);
$this->data = new Data($config);
$this->rules = $this->prepareRuleSet('rules');
$this->aggRules = $this->prepareRuleSet('aggregate_rules');
}

public function getName(): string
{
return $this->column->getString('name', self::FALLBACK_VALUES['name']);
return $this->data->getString('name', self::FALLBACK_VALUES['name']);
}

public function getCsvOffset(): ?int
@@ -62,7 +62,7 @@ public function getSchemaId(): int

public function getDescription(): string
{
return $this->column->getString('description', self::FALLBACK_VALUES['description']);
return $this->data->getString('description', self::FALLBACK_VALUES['description']);
}

public function getHumanName(): string
@@ -78,7 +78,7 @@ public function getHumanName(): string

public function isRequired(): bool
{
return $this->column->getBool('required', self::FALLBACK_VALUES['required']);
return $this->data->getBool('required', self::FALLBACK_VALUES['required']);
}

public function getRules(): array
@@ -106,11 +106,16 @@ public function setCsvOffset(int $csvOffset): void
$this->csvOffset = $csvOffset;
}

public function getData(): Data
{
return clone $this->data;
}

private function prepareRuleSet(string $schemaKey): array
{
$rules = [];

$ruleSetConfig = $this->column->getSelf($schemaKey, [])->getArrayCopy();
$ruleSetConfig = $this->data->getSelf($schemaKey, [])->getArrayCopy();
foreach ($ruleSetConfig as $ruleName => $ruleValue) {
$rules[$ruleName] = $ruleValue;
}
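
The new `getData()` method in this file returns `clone $this->data` rather than the object itself. A minimal sketch of what that buys is shown below, using hypothetical stand-in classes (`DataSketch`, `ColumnSketch`) instead of the real `Data` and `Column` classes, whose full behavior is not shown in this diff. The assumption is that the wrapper keeps its values in a plain array, so a shallow clone is enough to isolate the copy.

```php
<?php
// Hypothetical stand-in for the config wrapper; not the real Data class.
final class DataSketch
{
    public function __construct(private array $values)
    {
    }

    public function set(string $key, mixed $value): void
    {
        $this->values[$key] = $value;
    }

    public function getString(string $key, string $default = ''): string
    {
        return (string)($this->values[$key] ?? $default);
    }
}

// Hypothetical stand-in for Column, reduced to the part that matters here.
final class ColumnSketch
{
    private DataSketch $data;

    public function __construct(array $config)
    {
        $this->data = new DataSketch($config);
    }

    public function getName(): string
    {
        return $this->data->getString('name');
    }

    // Same idea as getData() in the diff: hand out a copy, keep the original private.
    public function getData(): DataSketch
    {
        return clone $this->data;
    }
}

$column = new ColumnSketch(['name' => 'id']);
$copy = $column->getData();
$copy->set('name', 'changed'); // Only the copy is modified.
```

After this runs, `$column->getName()` still returns the original name, while `$copy` carries the modified value. Callers can inspect or adjust the configuration without changing the column's own state.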
