Commit

Merge 00bcb66 into ad046c0
SmetDenis committed Mar 13, 2024
2 parents ad046c0 + 00bcb66 commit f189843
Showing 16 changed files with 165 additions and 52 deletions.
9 changes: 3 additions & 6 deletions .github/workflows/demo.yml
@@ -105,13 +105,12 @@ jobs:
- name: 👎 Invalid CSV file
run: |
docker run \
! docker run \
-v `pwd`:/parent-host \
--rm jbzoo/csv-blueprint \
validate:csv \
--csv=/parent-host/tests/fixtures/batch/*.csv \
--schema=/parent-host/tests/schemas/demo_invalid.yml
continue-on-error: true
phar:
@@ -138,11 +137,10 @@ jobs:
- name: 👎 Invalid CSV file
run: |
./build/csv-blueprint.phar \
! ./build/csv-blueprint.phar \
validate:csv \
--csv=./tests/fixtures/batch/*.csv \
--schema=./tests/schemas/demo_invalid.yml
continue-on-error: true
php:
@@ -169,8 +167,7 @@ jobs:
- name: 👎 Invalid CSV file
run: |
./csv-blueprint \
! ./csv-blueprint \
validate:csv \
--csv=./tests/fixtures/batch/*.csv \
--schema=./tests/schemas/demo_invalid.yml
continue-on-error: true
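The `!` prefix added to the invalid-CSV steps inverts the command's exit status, so the step passes only when validation reports a failure. Below is a minimal shell sketch of that behavior. The function `validate_invalid_csv` is a hypothetical stand-in for the real `csv-blueprint` invocation, which is assumed here to exit non-zero when it finds issues.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a validator run that finds issues.
# Assumption: the real tool exits with a non-zero status in that case.
validate_invalid_csv() {
  return 1
}

# Without "!", the failing command leaves a non-zero status.
validate_invalid_csv
plain_status=$?

# With "!", the shell inverts the status: failure becomes success.
! validate_invalid_csv
negated_status=$?

echo "plain=${plain_status} negated=${negated_status}"
```

One caveat: in bash, a command negated with `!` is exempt from `set -e`, so the inverted status decides the outcome of a script only when the negated command is the last one executed, as it is in these steps.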
34 changes: 28 additions & 6 deletions README.md
@@ -254,9 +254,16 @@ Found CSV files: 3
| 7 | 0:Name | min_length | Value "Lois" (length: 4) is too short. Min length is 5 |
+------+------------+------------+----- demo-2.csv ---------------------------------------+
(3/3) OK: ./tests/fixtures/batch/sub/demo-3.csv
(3/3) Invalid file: ./tests/fixtures/batch/sub/demo-3.csv
+------+-----------+------------------+---- demo-3.csv -------------------------------------------+
| Line | id:Column | Rule | Message |
+------+-----------+------------------+-----------------------------------------------------------+
| 0 | | filename_pattern | Filename "./tests/fixtures/batch/sub/demo-3.csv" does not |
| | | | match pattern: "/demo-[12].csv$/i" |
+------+-----------+------------------+---- demo-3.csv -------------------------------------------+
Found 7 issues in 2 out of 3 CSV files.
Found 8 issues in 3 out of 3 CSV files.
```

@@ -307,6 +314,11 @@ This gives you great flexibility when validating CSV files.
```yml
# It's a full example of the CSV schema file in YAML format.

# Regular expression to match the file name. If not set, the file name is not checked.
# This lets you validate the file name before the file contents are validated.
# Feel free to check parent directories as well.
filename_pattern: /demo(-\d+)?\.csv$/i

csv: # Here are default values. You can skip this section if you don't need to override the default values
header: true # If the first row is a header. If true, name of each column is required
delimiter: , # Delimiter character in CSV file
@@ -362,6 +374,8 @@ columns:
cardinal_direction: true # Valid cardinal direction. Examples: "N", "S", "NE", "SE", "none", ""
usa_market_name: true # Check if the value is a valid USA market name. Example: "New York, NY"

- name: "another_column"

```
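To get a feel for what `filename_pattern` accepts, the sketch below approximates the example pattern `/demo(-\d+)?\.csv$/i` in shell. It is not how the tool evaluates the pattern: the tool uses PCRE, while this sketch uses POSIX extended regular expressions, so `\d` is written as `[0-9]` and the `i` flag becomes `grep -i`. The function name `matches_pattern` and the sample paths are illustrative only.

```shell
#!/usr/bin/env bash
# Approximation of /demo(-\d+)?\.csv$/i with POSIX ERE.
# Assumption: [0-9] stands in for \d, and grep -i for the "i" flag.
matches_pattern() {
  printf '%s\n' "$1" | grep -Eiq 'demo(-[0-9]+)?\.csv$'
}

# Illustrative paths; the whole path is tested, not only the base name.
for path in ./tests/fixtures/batch/demo-1.csv ./data/DEMO.CSV ./data/report.csv; do
  if matches_pattern "$path"; then
    echo "match:    $path"
  else
    echo "no match: $path"
  fi
done
```

Because the pattern is anchored only at the end, anything may precede `demo`, which is what makes it possible to include parent directories in the pattern.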


@@ -370,15 +384,16 @@ columns:

```json
{
"csv" : {
"filename_pattern" : "/demo(-\\d+)?\\.csv$/i",
"csv" : {
"header" : true,
"delimiter" : ",",
"quote_char" : "\\",
"enclosure" : "\"",
"encoding" : "utf-8",
"bom" : false
},
"columns" : [
"columns" : [
{
"name" : "csv_header_name",
"description" : "Lorem ipsum",
@@ -412,7 +427,8 @@ columns:
"cardinal_direction" : true,
"usa_market_name" : true
}
}
},
{"name" : "another_column"}
]
}

@@ -422,6 +438,7 @@ columns:




<details>
<summary>Click to see: PHP Format</summary>

@@ -430,6 +447,8 @@ columns:
declare(strict_types=1);

return [
'filename_pattern' => '/demo(-\\d+)?\\.csv$/i',

'csv' => [
'header' => true,
'delimiter' => ',',
@@ -438,6 +457,7 @@ return [
'encoding' => 'utf-8',
'bom' => false,
],

'columns' => [
[
'name' => 'csv_header_name',
@@ -473,6 +493,7 @@ return [
'usa_market_name' => true,
],
],
['name' => 'another_column'],
],
];

@@ -481,6 +502,7 @@ return [
</details>



## Coming soon

These are random ideas and plans, in no particular order and with no deadlines. <u>But batch processing is the priority #1</u>.
@@ -494,7 +516,7 @@ Batch processing
* [ ] Discovering CSV files by `filename_pattern` in the schema file. In case you have a lot of schemas and a lot of CSV files and want to automate the process as one command.

Validation
* [ ] `filename_pattern` validation with regex (like "all files in the folder should be in the format `/^[\d]{4}-[\d]{2}-[\d]{2}\.csv$/`").
* [x] ~~`filename_pattern` validation with regex (like "all files in the folder should be in the format `/^[\d]{4}-[\d]{2}-[\d]{2}\.csv$/`").~~
* [ ] Aggregate rules (like "at least one of the fields should be not empty" or "all values must be unique").
* [ ] Handle empty files and files with only a header row, or only with one line of data. One column without header is also possible.
* [ ] Using multiple schemas for one csv file.
8 changes: 5 additions & 3 deletions schema-examples/full.json
@@ -1,13 +1,14 @@
{
"csv" : {
"filename_pattern" : "/demo(-\\d+)?\\.csv$/i",
"csv" : {
"header" : true,
"delimiter" : ",",
"quote_char" : "\\",
"enclosure" : "\"",
"encoding" : "utf-8",
"bom" : false
},
"columns" : [
"columns" : [
{
"name" : "csv_header_name",
"description" : "Lorem ipsum",
@@ -41,6 +42,7 @@
"cardinal_direction" : true,
"usa_market_name" : true
}
}
},
{"name" : "another_column"}
]
}
4 changes: 4 additions & 0 deletions schema-examples/full.php
@@ -15,6 +15,8 @@
declare(strict_types=1);

return [
'filename_pattern' => '/demo(-\\d+)?\\.csv$/i',

'csv' => [
'header' => true,
'delimiter' => ',',
@@ -23,6 +25,7 @@
'encoding' => 'utf-8',
'bom' => false,
],

'columns' => [
[
'name' => 'csv_header_name',
@@ -58,5 +61,6 @@
'usa_market_name' => true,
],
],
['name' => 'another_column'],
],
];
7 changes: 7 additions & 0 deletions schema-examples/full.yml
@@ -12,6 +12,11 @@

# It's a full example of the CSV schema file in YAML format.

# Regular expression to match the file name. If not set, the file name is not checked.
# This lets you validate the file name before the file contents are validated.
# Feel free to check parent directories as well.
filename_pattern: /demo(-\d+)?\.csv$/i

csv: # Here are default values. You can skip this section if you don't need to override the default values
header: true # If the first row is a header. If true, name of each column is required
delimiter: , # Delimiter character in CSV file
@@ -66,3 +71,5 @@ columns:
is_longitude: true # Can be integer or float. Example: -89.123456
cardinal_direction: true # Valid cardinal direction. Examples: "N", "S", "NE", "SE", "none", ""
usa_market_name: true # Check if the value is a valid USA market name. Example: "New York, NY"

- name: "another_column"
39 changes: 37 additions & 2 deletions src/Csv/CsvFile.php
@@ -17,6 +17,7 @@
namespace JBZoo\CsvBlueprint\Csv;

use JBZoo\CsvBlueprint\Schema;
use JBZoo\CsvBlueprint\Utils;
use JBZoo\CsvBlueprint\Validators\Error;
use JBZoo\CsvBlueprint\Validators\ErrorSuite;
use League\Csv\Reader as LeagueReader;
@@ -82,7 +83,9 @@ public function validate(bool $quickStop = false): ErrorSuite
{
$errors = new ErrorSuite($this->getCsvFilename());

$errors->addErrorSuit($this->validateHeader())
$errors
->addErrorSuit($this->validateFile($quickStop))
->addErrorSuit($this->validateHeader($quickStop))
->addErrorSuit($this->validateEachCell($quickStop))
->addErrorSuit(self::validateAggregateRules($quickStop));

@@ -106,7 +109,7 @@ private function prepareReader(): LeagueReader
return $reader;
}

private function validateHeader(): ErrorSuite
private function validateHeader(bool $quickStop = false): ErrorSuite
{
$errors = new ErrorSuite();

@@ -125,6 +128,10 @@ private function validateHeader(): ErrorSuite

$errors->addError($error);
}

if ($quickStop && $errors->count() > 0) {
return $errors;
}
}

return $errors;
@@ -152,6 +159,34 @@ private function validateEachCell(bool $quickStop = false): ErrorSuite
return $errors;
}

private function validateFile(bool $quickStop = false): ErrorSuite
{
$errors = new ErrorSuite();

$filenamePattern = $this->schema->getFilenamePattern();
if (
$filenamePattern !== null
&& $filenamePattern !== ''
&& \preg_match($filenamePattern, $this->csvFilename) === 0
) {
$error = new Error(
'filename_pattern',
'Filename "<c>' . Utils::cutPath($this->csvFilename) .
"</c>\" does not match pattern: \"<c>{$filenamePattern}</c>\"",
'',
0,
);

$errors->addError($error);

if ($quickStop && $errors->count() > 0) {
return $errors;
}
}

return $errors;
}

private static function validateAggregateRules(bool $quickStop = false): ErrorSuite
{
$errors = new ErrorSuite();
4 changes: 2 additions & 2 deletions src/Schema.php
@@ -114,9 +114,9 @@ public function getColumn(int|string $columNameOrId): ?Column
return $column;
}

public function getFinenamePattern(): ?string
public function getFilenamePattern(): ?string
{
return $this->data->getStringNull('finename_pattern');
return Utils::prepareRegex($this->data->getStringNull('filename_pattern'));
}

public function getIncludes(): array
2 changes: 1 addition & 1 deletion src/Utils.php
@@ -47,7 +47,7 @@ public static function prepareRegex(?string $pattern, string $addDelimiter = '/'
}
}

return $addDelimiter . $pattern . $addDelimiter . 'u';
return $addDelimiter . $pattern . $addDelimiter;
}

/**
2 changes: 1 addition & 1 deletion tests/Blueprint/MiscTest.php
@@ -47,7 +47,7 @@ public function testPrepareRegex(): void
{
isSame(null, Utils::prepareRegex(null));
isSame(null, Utils::prepareRegex(''));
isSame('/.*/u', Utils::prepareRegex('.*'));
isSame('/.*/', Utils::prepareRegex('.*'));
isSame('#.*#u', Utils::prepareRegex('#.*#u'));
isSame('/.*/', Utils::prepareRegex('/.*/'));
isSame('/.*/ius', Utils::prepareRegex('/.*/ius'));
2 changes: 1 addition & 1 deletion tests/Blueprint/RulesTest.php
@@ -591,7 +591,7 @@ public function testRegex(): void
isSame(null, $rule->validate('aaa'));
isSame(null, $rule->validate('a'));
isSame(
'"regex" at line 0, column "prop". Value "1bc" does not match the pattern "/^a/u".',
'"regex" at line 0, column "prop". Value "1bc" does not match the pattern "/^a/".',
\strip_tags((string)$rule->validate('1bc')),
);
}
4 changes: 2 additions & 2 deletions tests/Blueprint/SchemaTest.php
@@ -44,10 +44,10 @@ public function testFilename(): void
public function testGetFinenamePattern(): void
{
$schemaEmpty = new Schema(self::SCHEMA_EXAMPLE_EMPTY);
isSame(null, $schemaEmpty->getFinenamePattern());
isSame(null, $schemaEmpty->getFilenamePattern());

$schemaFull = new Schema(self::SCHEMA_EXAMPLE_FULL);
isSame('^example\.csv$', $schemaFull->getFinenamePattern());
isSame('/^example\.csv$/', $schemaFull->getFilenamePattern());
}

public function testScvStruture(): void