From c52a7afc7327bbb00661b32d276201fa6afe4db8 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 01:46:23 +0400 Subject: [PATCH 01/24] Implement schema inheritance --- README.md | 44 +- schema-examples/full.json | 6 +- schema-examples/full.php | 6 +- schema-examples/full.yml | 6 +- schema-examples/full_clean.yml | 5 +- src/Csv/Column.php | 17 +- src/Schema.php | 99 ++- src/SchemaDataPrep.php | 273 +++++++ src/Utils.php | 35 +- src/Validators/ValidatorSchema.php | 18 +- tests/Commands/ValidateCsvBasicTest.php | 4 +- tests/Commands/ValidateCsvBatchSchemaTest.php | 2 +- tests/Commands/ValidateCsvReportsTest.php | 16 +- tests/SchemaInheritTest.php | 748 ++++++++++++++++++ tests/SchemaTest.php | 5 +- tests/Tools.php | 4 +- tests/schemas/inherit/child-of-child.yml | 40 + tests/schemas/inherit/child.yml | 76 ++ tests/schemas/inherit/parent.yml | 54 ++ 19 files changed, 1358 insertions(+), 100 deletions(-) create mode 100644 src/SchemaDataPrep.php create mode 100644 tests/SchemaInheritTest.php create mode 100644 tests/schemas/inherit/child-of-child.yml create mode 100644 tests/schemas/inherit/child.yml create mode 100644 tests/schemas/inherit/parent.yml diff --git a/README.md b/README.md index 5b53b052..e1051432 100644 --- a/README.md +++ b/README.md @@ -152,21 +152,6 @@ You can find launch examples in the [workflow demo](https://github.com/JBZoo/Csv ``` -To see user-friendly error outputs in your pull requests (PRs), specify `report: github`. This -utilizes [annotations](https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#setting-a-warning-message) -to highlight bugs directly within the GitHub interface at the PR level. This feature allows errors to be displayed in -the exact location within the CSV file, right in the diff of your Pull Requests. For a practical example, -view [this live demo PR](https://github.com/JBZoo/Csv-Blueprint-Demo/pull/1/files). - -![GitHub Actions - PR](.github/assets/github-actions-pr.png) - -
- Click to see example in GitHub Actions terminal - -![GitHub Actions - Terminal](.github/assets/github-actions-termintal.png) - -
- ### Docker container Ensure you have Docker installed on your machine. @@ -307,6 +292,10 @@ description: | # Any description of the CSV file. Not u supporting a wide range of data validation rules from basic type checks to complex regex validations. This example serves as a comprehensive guide for creating robust CSV file validations. +includes: + parent-alias: ./readme_sample.yml # Include another schema and define an alias for it. + + # Regular expression to match the file name. If not set, then no pattern check. # This allows you to pre-validate the file name before processing its contents. # Feel free to check parent directories as well. @@ -513,9 +502,9 @@ columns: # Identifications phone: ALL # Validates if the input is a phone number. Specify the country code to validate the phone number for a specific country. Example: "ALL", "US", "BR".". + postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org is_iban: true # IBAN - International Bank Account Number. See: https://en.wikipedia.org/wiki/International_Bank_Account_Number is_bic: true # Validates a Bank Identifier Code (BIC) according to ISO 9362 standards. See: https://en.wikipedia.org/wiki/ISO_9362 - postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org is_imei: true # Validates an International Mobile Equipment Identity (IMEI). See: https://en.wikipedia.org/wiki/International_Mobile_Station_Equipment_Identity is_isbn: true # Validates an International Standard Book Number (ISBN). See: https://www.isbn-international.org/content/what-isbn @@ -1037,6 +1026,8 @@ The validation process culminates in a human-readable report detailing any error the default report format is a table, the tool supports various output formats, including text, GitHub, GitLab, TeamCity, JUnit, among others, to best suit your project's needs and your personal or team preferences. 
+### Table format + When using the `table` format (default), the output is organized in a clear, easily interpretable table that lists all discovered errors. This format is ideal for quick reviews and sharing with team members for further action. @@ -1088,12 +1079,30 @@ Summary: +### GitHub Action format + +To see user-friendly error outputs in your pull requests (PRs), specify `report: github`. This +utilizes [annotations](https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#setting-a-warning-message) +to highlight bugs directly within the GitHub interface at the PR level. This feature allows errors to be displayed in +the exact location within the CSV file, right in the diff of your Pull Requests. For a practical example, +view [this live demo PR](https://github.com/JBZoo/Csv-Blueprint-Demo/pull/1/files). + +![GitHub Actions - PR](.github/assets/github-actions-pr.png) + +
+ Click to see example in GitHub Actions terminal + +![GitHub Actions - Terminal](.github/assets/github-actions-termintal.png) + +
+ + +### Text format Optional format `text` with highlited keywords: ```sh ./csv-blueprint validate:csv --report=text ``` - ![Report - Text](.github/assets/output-text.png) @@ -1102,6 +1111,7 @@ Optional format `text` with highlited keywords: * Tools uses [JBZoo/CI-Report-Converter](https://github.com/JBZoo/CI-Report-Converter) as SDK to convert reports to different formats. So you can easily integrate it with any CI system. + ## Benchmarks Understanding the performance of this tool is crucial, but it's important to note that its efficiency is influenced by diff --git a/schema-examples/full.json b/schema-examples/full.json index 232d52be..097af6d1 100644 --- a/schema-examples/full.json +++ b/schema-examples/full.json @@ -2,6 +2,10 @@ "name" : "CSV Blueprint Schema Example", "description" : "This YAML file provides a detailed description and validation rules for CSV files\nto be processed by CSV Blueprint tool. It includes specifications for file name patterns,\nCSV formatting options, and extensive validation criteria for individual columns and their values,\nsupporting a wide range of data validation rules from basic type checks to complex regex validations.\nThis example serves as a comprehensive guide for creating robust CSV file validations.\n", + "includes" : { + "parent-alias" : ".\/readme_sample.yml" + }, + "filename_pattern" : "\/demo(-\\d+)?\\.csv$\/i", "csv" : { @@ -147,9 +151,9 @@ "is_luhn" : true, "phone" : "ALL", + "postal_code" : "US", "is_iban" : true, "is_bic" : true, - "postal_code" : "US", "is_imei" : true, "is_isbn" : true, diff --git a/schema-examples/full.php b/schema-examples/full.php index 78c6adce..7e0e66db 100644 --- a/schema-examples/full.php +++ b/schema-examples/full.php @@ -23,6 +23,10 @@ This example serves as a comprehensive guide for creating robust CSV file validations. 
', + 'includes' => [ + 'parent-alias' => './readme_sample.yml', + ], + 'filename_pattern' => '/demo(-\\d+)?\\.csv$/i', 'csv' => [ @@ -167,9 +171,9 @@ 'is_luhn' => true, 'phone' => 'ALL', + 'postal_code' => 'US', 'is_iban' => true, 'is_bic' => true, - 'postal_code' => 'US', 'is_imei' => true, 'is_isbn' => true, diff --git a/schema-examples/full.yml b/schema-examples/full.yml index 4ab1f58f..5d1dbfec 100644 --- a/schema-examples/full.yml +++ b/schema-examples/full.yml @@ -22,6 +22,10 @@ description: | # Any description of the CSV file. Not u supporting a wide range of data validation rules from basic type checks to complex regex validations. This example serves as a comprehensive guide for creating robust CSV file validations. +includes: + parent-alias: ./readme_sample.yml # Include another schema and define an alias for it. + + # Regular expression to match the file name. If not set, then no pattern check. # This allows you to pre-validate the file name before processing its contents. # Feel free to check parent directories as well. @@ -228,9 +232,9 @@ columns: # Identifications phone: ALL # Validates if the input is a phone number. Specify the country code to validate the phone number for a specific country. Example: "ALL", "US", "BR".". + postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org is_iban: true # IBAN - International Bank Account Number. See: https://en.wikipedia.org/wiki/International_Bank_Account_Number is_bic: true # Validates a Bank Identifier Code (BIC) according to ISO 9362 standards. See: https://en.wikipedia.org/wiki/ISO_9362 - postal_code: US # Validate postal code by country code (alpha-2). Example: "02179". Extracted from https://www.geonames.org is_imei: true # Validates an International Mobile Equipment Identity (IMEI). See: https://en.wikipedia.org/wiki/International_Mobile_Station_Equipment_Identity is_isbn: true # Validates an International Standard Book Number (ISBN). 
See: https://www.isbn-international.org/content/what-isbn diff --git a/schema-examples/full_clean.yml b/schema-examples/full_clean.yml index aba54a55..b40ad3d1 100644 --- a/schema-examples/full_clean.yml +++ b/schema-examples/full_clean.yml @@ -21,6 +21,9 @@ description: | supporting a wide range of data validation rules from basic type checks to complex regex validations. This example serves as a comprehensive guide for creating robust CSV file validations. +includes: + parent-alias: ./readme_sample.yml + filename_pattern: '/demo(-\d+)?\.csv$/i' csv: @@ -161,9 +164,9 @@ columns: is_luhn: true phone: ALL + postal_code: US is_iban: true is_bic: true - postal_code: US is_imei: true is_isbn: true diff --git a/src/Csv/Column.php b/src/Csv/Column.php index 32f6aacd..e3dfaf72 100644 --- a/src/Csv/Column.php +++ b/src/Csv/Column.php @@ -33,21 +33,21 @@ final class Column private ?int $csvOffset = null; private int $schemaId; - private Data $column; + private Data $data; private array $rules; private array $aggRules; public function __construct(int $schemaId, array $config) { $this->schemaId = $schemaId; - $this->column = new Data($config); + $this->data = new Data($config); $this->rules = $this->prepareRuleSet('rules'); $this->aggRules = $this->prepareRuleSet('aggregate_rules'); } public function getName(): string { - return $this->column->getString('name', self::FALLBACK_VALUES['name']); + return $this->data->getString('name', self::FALLBACK_VALUES['name']); } public function getCsvOffset(): ?int @@ -62,7 +62,7 @@ public function getSchemaId(): int public function getDescription(): string { - return $this->column->getString('description', self::FALLBACK_VALUES['description']); + return $this->data->getString('description', self::FALLBACK_VALUES['description']); } public function getHumanName(): string @@ -78,7 +78,7 @@ public function getHumanName(): string public function isRequired(): bool { - return $this->column->getBool('required', 
self::FALLBACK_VALUES['required']); + return $this->data->getBool('required', self::FALLBACK_VALUES['required']); } public function getRules(): array @@ -106,11 +106,16 @@ public function setCsvOffset(int $csvOffset): void $this->csvOffset = $csvOffset; } + public function getData(): Data + { + return clone $this->data; + } + private function prepareRuleSet(string $schemaKey): array { $rules = []; - $ruleSetConfig = $this->column->getSelf($schemaKey, [])->getArrayCopy(); + $ruleSetConfig = $this->data->getSelf($schemaKey, [])->getArrayCopy(); foreach ($ruleSetConfig as $ruleName => $ruleValue) { $rules[$ruleName] = $ruleValue; } diff --git a/src/Schema.php b/src/Schema.php index ed33deb4..450809e3 100644 --- a/src/Schema.php +++ b/src/Schema.php @@ -32,23 +32,6 @@ final class Schema public const ENCODING_UTF16 = 'utf-16'; public const ENCODING_UTF32 = 'utf-32'; - private const FALLBACK_VALUES = [ - 'csv' => [ - 'inherit' => null, - 'header' => true, - 'delimiter' => ',', - 'quote_char' => '\\', - 'enclosure' => '"', - 'encoding' => 'utf-8', - 'bom' => false, - ], - - 'structural_rules' => [ - 'strict_column_order' => true, - 'allow_extra_columns' => false, - ], - ]; - /** @var Column[] */ private array $columns; private string $basepath = '.'; @@ -59,22 +42,22 @@ public function __construct(null|array|string $csvSchemaFilenameOrArray = null) { if (\is_array($csvSchemaFilenameOrArray)) { $this->filename = '_custom_array_'; - $this->data = new Data($csvSchemaFilenameOrArray); + $data = new Data($csvSchemaFilenameOrArray); } elseif ( \is_string($csvSchemaFilenameOrArray) && $csvSchemaFilenameOrArray !== '' && \file_exists($csvSchemaFilenameOrArray) ) { $this->filename = $csvSchemaFilenameOrArray; - $this->data = new Data(); + $data = new Data(); $fileExtension = \pathinfo($csvSchemaFilenameOrArray, \PATHINFO_EXTENSION); if ($fileExtension === 'yml' || $fileExtension === 'yaml') { - $this->data = yml($csvSchemaFilenameOrArray); + $data = yml($csvSchemaFilenameOrArray); 
} elseif ($fileExtension === 'json') { - $this->data = json($csvSchemaFilenameOrArray); + $data = json($csvSchemaFilenameOrArray); } elseif ($fileExtension === 'php') { - $this->data = phpArray($csvSchemaFilenameOrArray); + $data = phpArray($csvSchemaFilenameOrArray); } else { throw new \InvalidArgumentException("Unsupported file extension: {$fileExtension}"); } @@ -82,13 +65,16 @@ public function __construct(null|array|string $csvSchemaFilenameOrArray = null) throw new \InvalidArgumentException("Invalid schema data: {$csvSchemaFilenameOrArray}"); } else { $this->filename = null; - $this->data = new Data(); + $data = new Data(); } - if ((string)$this->filename !== '') { - $this->basepath = \dirname((string)$this->filename); + $basepath = '.'; + if ((string)$this->filename !== '' && $this->filename !== '_custom_array_') { + $this->filename = realpath($this->filename); + $basepath = \dirname((string)$this->filename); } + $this->data = (new SchemaDataPrep($data, $basepath))->buildData(); $this->columns = $this->prepareColumns(); } @@ -105,15 +91,39 @@ public function getColumns(): array return $this->columns; } - public function getColumn(int|string $columNameOrId): ?Column + public function getColumn(int|string $columNameOrId, ?string $forceName = null): ?Column { - if (\is_int($columNameOrId)) { + // By "index" + if (\is_numeric($columNameOrId) || \is_int($columNameOrId)) { return \array_values($this->getColumns())[$columNameOrId] ?? 
null; } - foreach ($this->getColumns() as $schemaColumn) { - if ($schemaColumn->getName() === $columNameOrId) { - return $schemaColumn; + // by "index:" + if (\preg_match('/^(\d+):$/', $columNameOrId, $matches) !== 0) { + return $this->getColumn((int)$matches[1]); + } + + // by "index:name" + if (\preg_match('/^(\d+):(.*)$/', $columNameOrId, $matches) !== 0) { + return $this->getColumn((int)$matches[1], $matches[2]); + } + + if ($forceName !== null) { + // by "index:name" (real) + foreach ($this->getColumns() as $columnIndex => $schemaColumn) { + if ( + $columnIndex === (int)$columNameOrId + && $schemaColumn->getName() === $forceName + ) { + return $schemaColumn; + } + } + } else { + // by "name" + foreach ($this->getColumns() as $schemaColumn) { + if ($schemaColumn->getName() === $columNameOrId) { + return $schemaColumn; + } } } @@ -125,23 +135,6 @@ public function getFilenamePattern(): ?string return Utils::prepareRegex($this->data->getStringNull('filename_pattern')); } - public function getIncludes(): array - { - $result = []; - - foreach ($this->data->getArray('includes') as $alias => $includedPath) { - if (\file_exists($includedPath)) { - $path = $includedPath; - } else { - $path = $this->basepath . \DIRECTORY_SEPARATOR . 
$includedPath; - } - - $result[$alias] = new self($path); - } - - return $result; - } - public function validate(bool $quickStop = false): ErrorSuite { return (new ValidatorSchema($this))->validate($quickStop); @@ -173,12 +166,12 @@ public function isAllowExtraColumns(): bool public function csvHasBOM(): bool { - return $this->data->findBool('csv.bom', self::FALLBACK_VALUES['csv']['bom']); + return $this->data->findBool('csv.bom'); } public function getCsvDelimiter(): string { - $value = $this->data->findString('csv.delimiter', self::FALLBACK_VALUES['csv']['delimiter']); + $value = $this->data->findString('csv.delimiter'); if (\strlen($value) === 1) { return $value; } @@ -188,7 +181,7 @@ public function getCsvDelimiter(): string public function getCsvQuoteChar(): string { - $value = $this->data->findString('csv.quote_char', self::FALLBACK_VALUES['csv']['quote_char']); + $value = $this->data->findString('csv.quote_char'); if (\strlen($value) === 1) { return $value; } @@ -198,7 +191,7 @@ public function getCsvQuoteChar(): string public function getCsvEnclosure(): string { - $value = $this->data->findString('csv.enclosure', self::FALLBACK_VALUES['csv']['enclosure']); + $value = $this->data->findString('csv.enclosure'); if (\strlen($value) === 1) { return $value; @@ -210,7 +203,7 @@ public function getCsvEnclosure(): string public function getCsvEncoding(): string { $encoding = \strtolower( - \trim($this->data->findString('csv.encoding', self::FALLBACK_VALUES['csv']['encoding'])), + \trim($this->data->findString('csv.encoding')), ); $availableOptions = [ // TODO: add flexible handler for this @@ -229,7 +222,7 @@ public function getCsvEncoding(): string public function csvHasHeader(): bool { - return $this->data->findBool('csv.header', self::FALLBACK_VALUES['csv']['header']); + return $this->data->findBool('csv.header'); } public function getCsvParams(): array diff --git a/src/SchemaDataPrep.php b/src/SchemaDataPrep.php new file mode 100644 index 00000000..66c40d4f --- 
/dev/null +++ b/src/SchemaDataPrep.php @@ -0,0 +1,273 @@ + '', + 'description' => '', + 'filename_pattern' => '', + + 'inlcudes' => [], + + 'csv' => [ + 'inherit' => null, + 'header' => true, + 'delimiter' => ',', + 'quote_char' => '\\', + 'enclosure' => '"', + 'encoding' => Schema::ENCODING_UTF8, + 'bom' => false, + ], + + 'structural_rules' => [ + 'strict_column_order' => true, + 'allow_extra_columns' => false, + ], + + 'column' => [ + 'inherit' => '', + 'name' => '', + 'description' => '', + 'example' => null, + 'required' => true, + 'rules' => [], + 'aggregate_rules' => [], + ], + + 'rules' => ['inherit' => ''], + 'aggregate_rules' => ['inherit' => ''], + ]; + + private AbstractData $data; + private string $basepath; + + /** @var Schema[] */ + private array $aliases; + + public function __construct(AbstractData $data, string $basepath) + { + $this->data = $data; + $this->basepath = $basepath; + $this->aliases = $this->prepareAliases($data); + } + + public function buildData(): Data + { + $result = [ + 'name' => $this->buildName(), + 'description' => $this->buildDescription(), + 'includes' => $this->buildIncludes(), + 'filename_pattern' => $this->buildFilenamePattern(), + 'csv' => $this->buildByKey('csv'), + 'structural_rules' => $this->buildByKey('structural_rules'), + 'columns' => $this->buildColumns(), + ]; + + // Any extra keys to see schema validation errors + foreach ($this->data->getArrayCopy() as $key => $value) { + if (!isset($result[$key])) { + $result[$key] = $value; + } + } + + return new Data($result); + } + + public static function getAliasRegex(): string + { + return '/^' . self::ALIAS_REGEX . 
'$/i'; + } + + public static function validateAlias(string $alias): void + { + if ($alias === '') { + throw new \InvalidArgumentException('Empty alias'); + } + + if (!\preg_match(self::getAliasRegex(), $alias)) { + throw new \InvalidArgumentException("Invalid alias: \"{$alias}\""); + } + } + + private function parseAliasParts(string $inherit): array + { + $alias = null; + $keyword = null; + $columnName = null; + $rules = null; + + $parts = \explode('/', $inherit); + if (\count($parts) === 2) { + [$alias, $keyword] = $parts; + } elseif (\count($parts) === 3) { + [$alias, $keyword, $columnName] = $parts; + } elseif (\count($parts) === 4) { + [$alias, $keyword, $columnName, $rules] = $parts; + } + + return [ + 'alias' => $alias, + 'keyword' => $keyword, + 'column' => $columnName, + 'rules' => $rules, + ]; + } + + /** + * @return Schema[] + */ + private function prepareAliases(AbstractData $data): array + { + $includes = []; + + foreach ($data->getArray('includes') as $alias => $includedPathOrArray) { + self::validateAlias($alias); + + if (\is_array($includedPathOrArray)) { + $includes[$alias] = new Schema($includedPathOrArray); + } elseif (\file_exists($includedPathOrArray)) { + $includes[$alias] = (new Schema($includedPathOrArray)); + } elseif (\file_exists("{$this->basepath}/{$includedPathOrArray}")) { + $includes[$alias] = (new Schema("{$this->basepath}/{$includedPathOrArray}")); + } else { + throw new \InvalidArgumentException("Unknown included file: \"{$includedPathOrArray}\""); + } + } + + return $includes; + } + + private function getParentSchema(string $alias): Schema + { + if (isset($this->aliases[$alias])) { + return $this->aliases[$alias]; + } + + throw new \InvalidArgumentException("Unknown included alias: \"{$alias}\""); + } + + private function buildFilenamePattern(): string + { + $inherit = $this->data->findString('filename_pattern.inherit'); + + if (\str_ends_with($inherit, '/filename_pattern')) { + $inheritParts = $this->parseAliasParts($inherit); + 
$parent = $this->getParentSchema($inheritParts['alias']); + return $parent->getData()->get('filename_pattern'); + } + + return $this->data->getString('filename_pattern', self::DEFAULTS['filename_pattern']); + } + + private function buildByKey(string $key = 'structural_rules'): array + { + $inherit = $this->data->findString("{$key}.inherit"); + + $parentConfig = []; + if (\preg_match('/' . self::ALIAS_REGEX . '\/' . $key . '$/i', $inherit)) { + $inheritParts = $this->parseAliasParts($inherit); + $parent = $this->getParentSchema($inheritParts['alias']); + $parentConfig = $parent->getData()->getArray($key); + } + + $result = Utils::mergeConfigs(self::DEFAULTS[$key], $parentConfig, $this->data->getArray($key)); + unset($result['inherit']); + + return $result; + } + + private function buildColumns(): array + { + $columns = []; + + foreach ($this->data->getArray('columns') as $columnIndex => $column) { + $columnData = new Data($column); + $columnInherit = $columnData->getString('inherit'); + + $parentConfig = []; + if (\preg_match('/' . self::ALIAS_REGEX . 
'\/columns\/[^\/]+$/i', $columnInherit)) { + $inheritParts = $this->parseAliasParts($columnInherit); + $parent = $this->getParentSchema($inheritParts['alias']); + $parentColumn = $parent->getColumn($inheritParts['column']); + if ($parentColumn === null) { + throw new \InvalidArgumentException("Unknown column: \"{$inheritParts['column']}\""); + } + + $parentConfig = $parentColumn->getData()->getArrayCopy(); + } + + $actualColumn = Utils::mergeConfigs(self::DEFAULTS['column'], $parentConfig, $columnData->getArrayCopy()); + $actualColumn['rules'] = $this->buildRules($actualColumn['rules'], 'rules'); + $actualColumn['aggregate_rules'] = $this->buildRules($actualColumn['aggregate_rules'], 'aggregate_rules'); + + unset($actualColumn['inherit']); + + $columns[$columnIndex] = $actualColumn; + } + + return $columns; + } + + private function buildIncludes(): array + { + $result = []; + foreach ($this->aliases as $alias => $schema) { + $result[$alias] = $schema->getFilename(); + } + + return $result; + } + + private function buildName(): string + { + return $this->data->getString('name', self::DEFAULTS['name']); + } + + private function buildDescription(): string + { + return $this->data->getString('description', self::DEFAULTS['description']); + } + + private function buildRules(array $rules, string $typeOfRules): array + { + $inherit = $rules['inherit'] ?? ''; + + $parentConfig = []; + if (\preg_match('/' . self::ALIAS_REGEX . '\/columns\/[^\/]+\/' . $typeOfRules . 
'$/i', $inherit)) { + $inheritParts = $this->parseAliasParts($inherit); + $parent = $this->getParentSchema($inheritParts['alias']); + $parentColumn = $parent->getColumn($inheritParts['column']); + if ($parentColumn === null) { + throw new \InvalidArgumentException("Unknown column: \"{$inheritParts['column']}\""); + } + + $parentConfig = $parentColumn->getData()->getArray($typeOfRules); + } + + $actualRules = Utils::mergeConfigs(self::DEFAULTS[$typeOfRules], $parentConfig, $rules); + unset($actualRules['inherit']); + + return $actualRules; + } +} diff --git a/src/Utils.php b/src/Utils.php index 116f1be2..b59b764b 100644 --- a/src/Utils.php +++ b/src/Utils.php @@ -246,12 +246,12 @@ public static function matchTypes( $actualType = \gettype($actual); $mapOfValidConvertions = [ - 'NULL' => [], + 'NULL' => ['string', 'integer', 'double', 'boolean'], 'array' => [], 'boolean' => [], - 'double' => ['string', 'integer'], - 'integer' => [], - 'string' => ['double', 'integer'], + 'double' => ['NULL', 'string', 'integer'], + 'integer' => ['NULL'], + 'string' => ['NULL', 'double', 'integer'], ]; if ($expectedType === $actualType) { @@ -430,6 +430,33 @@ public static function fixArgv(array $originalArgs): array return $newArgumens; } + public static function mergeConfigs(array ...$configs): array + { + $merged = \array_shift($configs); // Start with the first array + + foreach ($configs as $config) { + foreach ($config as $key => $value) { + // If both values are arrays + if (isset($merged[$key]) && \is_array($merged[$key]) && \is_array($value)) { + // Check if arrays are associative (assuming keys are consistent across values for simplicity) + $isAssoc = \array_keys($value) !== \range(0, \count($value) - 1); + if ($isAssoc) { + // Merge associative arrays recursively + $merged[$key] = self::mergeConfigs($merged[$key], $value); + } else { + // Replace non-associative arrays entirely + $merged[$key] = $value; + } + } else { + // Replace the value entirely + $merged[$key] = $value; 
+ } + } + } + + return $merged; + } + /** * @param SplFileInfo[] $files */ diff --git a/src/Validators/ValidatorSchema.php b/src/Validators/ValidatorSchema.php index b4709a02..5fa42da8 100644 --- a/src/Validators/ValidatorSchema.php +++ b/src/Validators/ValidatorSchema.php @@ -129,6 +129,7 @@ private static function validateColumnExample(array $actualColumn, int $schemaCo { $exclude = [ 'Some example', // I.e. this value is taken from full.yml, then it will be invalid in advance. + null, ]; if (isset($actualColumn['example']) && !\in_array($actualColumn['example'], $exclude, true)) { @@ -144,7 +145,22 @@ private static function validateMeta( bool $quickStop = false, ): ErrorSuite { $errors = new ErrorSuite(); - $metaErrors = Utils::compareArray($expectedMeta, $actualMeta->getArrayCopy(), 'meta', '.'); + + $actualMetaAsArray = $actualMeta->getArrayCopy(); + $actualIncludes = $actualMetaAsArray['includes'] ?? []; + unset($expectedMeta['includes'], $actualMetaAsArray['includes']); + + $metaErrors = Utils::compareArray($expectedMeta, $actualMetaAsArray, 'meta', '.'); + + foreach($actualIncludes as $alias => $includedFile) { + if ($alias === '') { + $errors->addError(new Error('includes', 'Defined alias is empty')); + } + + if (!\is_string($includedFile)) { + $errors->addError(new Error('includes', 'Included filepath must be a string')); + } + } foreach ($metaErrors as $metaError) { $errors->addError(new Error('schema', $metaError[1], $metaError[0])); diff --git a/tests/Commands/ValidateCsvBasicTest.php b/tests/Commands/ValidateCsvBasicTest.php index f2c51045..f7eab1b8 100644 --- a/tests/Commands/ValidateCsvBasicTest.php +++ b/tests/Commands/ValidateCsvBasicTest.php @@ -173,8 +173,8 @@ public function testInvalidSchemaNotMatched(): void +-------+------------+--------+-------------------------------------------------------------------------+ | Line | id:Column | Rule | Message | 
+-------+------------+--------+-------------------------------------------------------------------------+ - | undef | meta | schema | Unknown key: .unknow_root_option | | undef | meta | schema | Unknown key: .csv.unknow_csv_param | + | undef | meta | schema | Unknown key: .unknow_root_option | | undef | 0:Name | schema | Unknown key: .columns.0.rules.unknow_rule | | undef | 1:City | schema | Unknown key: .columns.1.unknow_colum_option | | undef | 3:Birthday | schema | Expected type "string", actual "boolean" in .columns.3.rules.date_max | @@ -221,8 +221,8 @@ public function testInvalidSchemaAndNotFoundCSV(): void +-------+------------+--------+-------------------------------------------------------------------------+ | Line | id:Column | Rule | Message | +-------+------------+--------+-------------------------------------------------------------------------+ - | undef | meta | schema | Unknown key: .unknow_root_option | | undef | meta | schema | Unknown key: .csv.unknow_csv_param | + | undef | meta | schema | Unknown key: .unknow_root_option | | undef | 0:Name | schema | Unknown key: .columns.0.rules.unknow_rule | | undef | 1:City | schema | Unknown key: .columns.1.unknow_colum_option | | undef | 3:Birthday | schema | Expected type "string", actual "boolean" in .columns.3.rules.date_max | diff --git a/tests/Commands/ValidateCsvBatchSchemaTest.php b/tests/Commands/ValidateCsvBatchSchemaTest.php index e254eeef..829854b2 100644 --- a/tests/Commands/ValidateCsvBatchSchemaTest.php +++ b/tests/Commands/ValidateCsvBatchSchemaTest.php @@ -55,8 +55,8 @@ public function testMultiSchemaDiscovery(): void +-------+------------+--------+-------------------------------------------------------------------------+ | Line | id:Column | Rule | Message | +-------+------------+--------+-------------------------------------------------------------------------+ - | undef | meta | schema | Unknown key: .unknow_root_option | | undef | meta | schema | Unknown key: .csv.unknow_csv_param | + | 
undef | meta | schema | Unknown key: .unknow_root_option | | undef | 0:Name | schema | Unknown key: .columns.0.rules.unknow_rule | | undef | 1:City | schema | Unknown key: .columns.1.unknow_colum_option | | undef | 3:Birthday | schema | Expected type "string", actual "boolean" in .columns.3.rules.date_max | diff --git a/tests/Commands/ValidateCsvReportsTest.php b/tests/Commands/ValidateCsvReportsTest.php index e3409e48..cc3aef57 100644 --- a/tests/Commands/ValidateCsvReportsTest.php +++ b/tests/Commands/ValidateCsvReportsTest.php @@ -123,9 +123,9 @@ public function testGithub(): void Check schema syntax: 1 2 issues in ./tests/schemas/demo_invalid.yml - ::error file=./tests/schemas/demo_invalid.yml::is_float at column 2:Float%0A"is_float", column "2:Float". Value "Qwerty" is not a float number. + ::error file=/tests/schemas/demo_invalid.yml::is_float at column 2:Float%0A"is_float", column "2:Float". Value "Qwerty" is not a float number. - ::error file=./tests/schemas/demo_invalid.yml::allow_values at column 4:Favorite color%0A"allow_values", column "4:Favorite color". Value "123" is not allowed. Allowed values: ["red", "green", "Blue"]. + ::error file=/tests/schemas/demo_invalid.yml::allow_values at column 4:Favorite color%0A"allow_values", column "4:Favorite color". Value "123" is not allowed. Allowed values: ["red", "green", "Blue"]. CSV file validation: 1 @@ -171,11 +171,11 @@ public function testTeamcity(): void ##teamcity[testSuiteStarted name='tests/schemas/demo_invalid.yml' flowId='42'] - ##teamcity[testStarted name='is_float at column 2:Float' locationHint='php_qn://./tests/schemas/demo_invalid.yml' flowId='42'] + ##teamcity[testStarted name='is_float at column 2:Float' locationHint='php_qn:///tests/schemas/demo_invalid.yml' flowId='42'] "is_float", column "2:Float". Value "Qwerty" is not a float number. 
##teamcity[testFinished name='is_float at column 2:Float' flowId='42'] - ##teamcity[testStarted name='allow_values at column 4:Favorite color' locationHint='php_qn://./tests/schemas/demo_invalid.yml' flowId='42'] + ##teamcity[testStarted name='allow_values at column 4:Favorite color' locationHint='php_qn:///tests/schemas/demo_invalid.yml' flowId='42'] "allow_values", column "4:Favorite color". Value "123" is not allowed. Allowed values: ["red", "green", "Blue"]. ##teamcity[testFinished name='allow_values at column 4:Favorite color' flowId='42'] @@ -241,10 +241,10 @@ public function testJunit(): void - + "is_float", column "2:Float". Value "Qwerty" is not a float number. - + "allow_values", column "4:Favorite color". Value "123" is not allowed. Allowed values: ["red", "green", "Blue"]. @@ -302,7 +302,7 @@ public function testGitlab(): void "fingerprint": "_replaced_", "severity": "major", "location": { - "path": ".\/tests\/schemas\/demo_invalid.yml", + "path": "\/tests\/schemas\/demo_invalid.yml", "lines": { "begin": 0 } @@ -313,7 +313,7 @@ public function testGitlab(): void "fingerprint": "_replaced_", "severity": "major", "location": { - "path": ".\/tests\/schemas\/demo_invalid.yml", + "path": "\/tests\/schemas\/demo_invalid.yml", "lines": { "begin": 0 } diff --git a/tests/SchemaInheritTest.php b/tests/SchemaInheritTest.php new file mode 100644 index 00000000..01a5980b --- /dev/null +++ b/tests/SchemaInheritTest.php @@ -0,0 +1,748 @@ + '', + 'description' => '', + 'includes' => [], + 'filename_pattern' => '', + 'csv' => [ + 'header' => true, + 'delimiter' => ',', + 'quote_char' => '\\', + 'enclosure' => '"', + 'encoding' => 'utf-8', + 'bom' => false, + ], + 'structural_rules' => [ + 'strict_column_order' => true, + 'allow_extra_columns' => false, + ], + 'columns' => [], + ], $schema->getData()->getArrayCopy()); + + isSame('', (string)$schema->validate()); + } + + public function testOverideDefaults(): void + { + $schema = new Schema([ + 'name' => 'Qwerty', + 
'description' => 'Some description.', + 'includes' => [], + 'filename_pattern' => '/.*/i', + 'csv' => [ + 'header' => false, + 'delimiter' => 'd', + 'quote_char' => 'q', + 'enclosure' => 'e', + 'encoding' => 'utf-16', + 'bom' => true, + ], + 'structural_rules' => [ + 'strict_column_order' => false, + 'allow_extra_columns' => true, + ], + 'columns' => [ + ['name' => 'Name', 'required' => true], + ['name' => 'Second Column', 'required' => false], + ], + ]); + + isSame([ + 'name' => 'Qwerty', + 'description' => 'Some description.', + 'includes' => [], + 'filename_pattern' => '/.*/i', + 'csv' => [ + 'header' => false, + 'delimiter' => 'd', + 'quote_char' => 'q', + 'enclosure' => 'e', + 'encoding' => 'utf-16', + 'bom' => true, + ], + 'structural_rules' => [ + 'strict_column_order' => false, + 'allow_extra_columns' => true, + ], + 'columns' => [ + [ + 'name' => 'Name', + 'description' => '', + 'example' => null, + 'required' => true, + 'rules' => [], + 'aggregate_rules' => [], + ], + [ + 'name' => 'Second Column', + 'description' => '', + 'example' => null, + 'required' => false, + 'rules' => [], + 'aggregate_rules' => [], + ], + ], + ], $schema->getData()->getArrayCopy()); + + isSame('', (string)$schema->validate()); + } + + public function testOverideFilenamePattern(): void + { + $schema = new Schema([ + 'includes' => [ + 'parent' => ['filename_pattern' => '/.*/i'], + ], + 'filename_pattern' => [ + 'inherit' => 'parent/filename_pattern', + ], + ]); + + isSame('/.*/i', $schema->getData()->getString('filename_pattern')); + isSame('', (string)$schema->validate()); + } + + public function testOverideCsvFull(): void + { + $schema = new Schema([ + 'includes' => [ + 'parent' => [ + 'csv' => [ + 'header' => false, + 'delimiter' => 'd', + 'quote_char' => 'q', + 'enclosure' => 'e', + 'encoding' => 'utf-16', + 'bom' => true, + ], + ], + ], + 'csv' => ['inherit' => 'parent/csv'], + ]); + + isSame([ + 'header' => false, + 'delimiter' => 'd', + 'quote_char' => 'q', + 'enclosure' => 
'e', + 'encoding' => 'utf-16', + 'bom' => true, + ], $schema->getData()->getArray('csv')); + + isSame('', (string)$schema->validate()); + } + + public function testOverideCsvPartial(): void + { + $schema = new Schema([ + 'includes' => [ + 'parent' => [ + 'csv' => [ + 'header' => false, + 'delimiter' => 'd', + 'quote_char' => 'q', + 'bom' => true, + ], + ], + ], + 'csv' => [ + 'inherit' => 'parent/csv', + 'encoding' => 'utf-32', + ], + ]); + + isSame([ + 'header' => false, // parent value + 'delimiter' => 'd', // parent value + 'quote_char' => 'q', // parent value + 'enclosure' => '"', // default value + 'encoding' => 'utf-32', // child value + 'bom' => true, // parent value + ], $schema->getData()->getArray('csv')); + + isSame('', (string)$schema->validate()); + } + + public function testOverideStructuralRulesFull(): void + { + $schema = new Schema([ + 'includes' => [ + 'parent' => [ + 'structural_rules' => [ + 'strict_column_order' => false, + 'allow_extra_columns' => true, + ], + ], + ], + 'structural_rules' => [ + 'inherit' => 'parent/structural_rules', + ], + ]); + + isSame([ + 'strict_column_order' => false, + 'allow_extra_columns' => true, + ], $schema->getData()->getArray('structural_rules')); + + isSame('', (string)$schema->validate()); + } + + public function testOverideStructuralRulesPartial1(): void + { + $schema = new Schema([ + 'includes' => [ + 'parent' => [ + 'structural_rules' => [ + 'strict_column_order' => true, + 'allow_extra_columns' => false, + ], + ], + ], + 'structural_rules' => [ + 'inherit' => 'parent/structural_rules', + 'allow_extra_columns' => true, + ], + ]); + + isSame([ + 'strict_column_order' => true, // parent value + 'allow_extra_columns' => true, // child value + ], $schema->getData()->getArray('structural_rules')); + isSame('', (string)$schema->validate()); + } + + public function testOverideStructuralRulesPartial2(): void + { + $schema = new Schema([ + 'includes' => ['parent' => ['structural_rules' => []]], + 'structural_rules' 
=> [ + 'inherit' => 'parent/structural_rules', + 'allow_extra_columns' => true, + ], + ]); + + isSame([ + 'strict_column_order' => true, // default value + 'allow_extra_columns' => true, // parent value + ], $schema->getData()->getArray('structural_rules')); + isSame('', (string)$schema->validate()); + } + + public function testOverideColumnFull(): void + { + $parentColum0 = [ + 'name' => 'Name', + 'description' => 'Description', + 'example' => '123', + 'required' => false, + 'rules' => ['not_empty' => true], + 'aggregate_rules' => ['sum' => 10], + ]; + + $parentColum1 = [ + 'name' => 'Name', + 'description' => 'Another Description', + 'example' => '234', + 'required' => false, + 'rules' => ['is_int' => true], + 'aggregate_rules' => ['sum_max' => 100], + ]; + + $schema = new Schema([ + 'includes' => ['parent' => ['columns' => [$parentColum0, $parentColum1]]], + 'columns' => [ + ['inherit' => 'parent/columns/0'], + ['inherit' => 'parent/columns/1'], + ['inherit' => 'parent/columns/0:'], + ['inherit' => 'parent/columns/1:'], + ['inherit' => 'parent/columns/Name'], + ['inherit' => 'parent/columns/0:Name'], + ['inherit' => 'parent/columns/1:Name'], + ], + ]); + + isSame([ + $parentColum0, + $parentColum1, + $parentColum0, + $parentColum1, + $parentColum0, + $parentColum0, + $parentColum1, + ], $schema->getData()->getArray('columns')); + isSame('', (string)$schema->validate()); + } + + public function testOverideColumnPartial(): void + { + $parentColum = [ + 'name' => 'Name', + 'description' => 'Description', + 'rules' => [ + 'allow_values' => ['a', 'b', 'c'], + 'length_min' => 1, + 'length' => 5, + 'length_max' => 10, + ], + 'aggregate_rules' => ['sum_max' => 42], + ]; + + $schema = new Schema([ + 'includes' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ + [ + 'inherit' => 'parent/columns/Name', + 'name' => 'Child name', + 'rules' => [ + 'is_int' => true, + 'length_min' => 2, + 'length' => 5, + 'allow_values' => ['c'], + ], + ], + ], + ]); + + 
isSame([ + [ + 'name' => 'Child name', // Child + 'description' => 'Description', // Parent + 'example' => null, // Default + 'required' => true, // Default + 'rules' => [ + 'allow_values' => ['c'], // Child + 'length_min' => 2, // Child + 'length' => 5, // Parent + 'length_max' => 10, // Parent + 'is_int' => true, // Child + ], + 'aggregate_rules' => ['sum_max' => 42], // Parent + ], + ], $schema->getData()->getArray('columns')); + isSame('', (string)$schema->validate()); + } + + public function testOverideColumnRulesFull(): void + { + $parentColum = [ + 'rules' => [ + 'allow_values' => ['a', 'b', 'c'], + 'length_min' => 1, + 'length' => 5, + 'length_max' => 10, + ], + 'aggregate_rules' => [ + 'sum_max' => 42, + 'is_unique' => true, + ], + ]; + + $schema = new Schema([ + 'includes' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ + [ + 'name' => 'Child name', + 'rules' => ['inherit' => 'parent/columns/0:/rules'], + ], + ], + ]); + + isSame([ + [ + 'name' => 'Child name', // Child + 'description' => '', // Default + 'example' => null, // Default + 'required' => true, // Default + 'rules' => [ // Parent All + 'allow_values' => ['a', 'b', 'c'], + 'length_min' => 1, + 'length' => 5, + 'length_max' => 10, + ], + 'aggregate_rules' => [], // Default + ], + ], $schema->getData()->getArray('columns')); + isSame('', (string)$schema->validate()); + } + + public function testOverideColumnRulesPartial(): void + { + $parentColum = [ + 'rules' => [ + 'allow_values' => ['a', 'b', 'c'], + 'length_min' => 1, + 'length' => 5, + 'length_max' => 10, + ], + 'aggregate_rules' => [ + 'sum_max' => 42, + 'is_unique' => true, + ], + ]; + + $schema = new Schema([ + 'includes' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ + [ + 'name' => 'Child name', + 'rules' => [ + 'inherit' => 'parent/columns/0:/rules', + 'allow_values' => ['d', 'c'], + 'length_max' => 100, + ], + ], + ], + ]); + + isSame([ + [ + 'name' => 'Child name', // Child + 'description' => '', // 
Default + 'example' => null, // Default + 'required' => true, // Default + 'rules' => [ + 'allow_values' => ['d', 'c'], // Child + 'length_min' => 1, // Parent + 'length' => 5, // Parent + 'length_max' => 100, // Child + ], + 'aggregate_rules' => [], // Default + ], + ], $schema->getData()->getArray('columns')); + isSame('', (string)$schema->validate()); + } + + public function testOverideColumnAggregateRulesFull(): void + { + $parentColum = [ + 'rules' => [ + 'allow_values' => ['a', 'b', 'c'], + 'length_min' => 1, + 'length' => 5, + 'length_max' => 10, + ], + 'aggregate_rules' => [ + 'sum_max' => 42, + 'is_unique' => true, + ], + ]; + + $schema = new Schema([ + 'includes' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ + [ + 'name' => 'Child name', + 'aggregate_rules' => ['inherit' => 'parent/columns/0:/aggregate_rules'], + ], + ], + ]); + + isSame([ + [ + 'name' => 'Child name', // Child + 'description' => '', // Default + 'example' => null, // Default + 'required' => true, // Default + 'rules' => [], // default + 'aggregate_rules' => [ // Parent All + 'sum_max' => 42, + 'is_unique' => true, + ], + ], + ], $schema->getData()->getArray('columns')); + isSame('', (string)$schema->validate()); + } + + public function testOverideColumnAggregateRulesPartial(): void + { + $parentColum = [ + 'rules' => [ + 'allow_values' => ['a', 'b', 'c'], + 'length_min' => 1, + 'length' => 5, + 'length_max' => 10, + ], + 'aggregate_rules' => [ + 'sum_max' => 42, + 'is_unique' => true, + ], + ]; + + $schema = new Schema([ + 'includes' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ + [ + 'name' => 'Child name', + 'aggregate_rules' => [ + 'inherit' => 'parent/columns/0:/aggregate_rules', + 'sum_max' => 4200, + 'sum_min' => 1, + ], + ], + ], + ]); + + isSame([ + [ + 'name' => 'Child name', // Child + 'description' => '', // Default + 'example' => null, // Default + 'required' => true, // Default + 'rules' => [], // default + 'aggregate_rules' => [ + 
'sum_max' => 4200, // Child + 'is_unique' => true, // Parent + 'sum_min' => 1, // Child + ], + ], + ], $schema->getData()->getArray('columns')); + isSame('', (string)$schema->validate()); + } + + public function testRealParent(): void + { + $schema = new Schema('./tests/schemas/inherit/parent.yml'); + isSame([ + 'name' => 'Parent schema', + 'description' => 'Testing inheritance.', + 'includes' => [], + 'filename_pattern' => '/parent-\d.csv$/i', + 'csv' => [ + 'header' => false, + 'delimiter' => 'd', + 'quote_char' => 'q', + 'enclosure' => 'e', + 'encoding' => 'utf-16', + 'bom' => true, + ], + 'structural_rules' => [ + 'strict_column_order' => false, + 'allow_extra_columns' => true, + ], + 'columns' => [ + [ + 'name' => 'Name', + 'description' => 'Full name of the person.', + 'example' => 'John D', + 'required' => true, + 'rules' => [ + 'not_empty' => true, + 'length_min' => 5, + 'length_max' => 7, + ], + 'aggregate_rules' => [ + 'nth_num' => [4, 0.001], + ], + ], + [ + 'name' => 'Second Column', + 'description' => 'Some number.', + 'example' => 123, + 'required' => false, + 'rules' => [ + 'length_min' => 1, + 'length_max' => 4, + ], + 'aggregate_rules' => [ + 'sum' => 1000, + ], + ], + ], + ], $schema->getData()->getArrayCopy()); + isSame('', (string)$schema->validate()); + } + + public function testRealChild(): void + { + $schema = new Schema('./tests/schemas/inherit/child.yml'); + isSame([ + 'name' => 'Child schema', + 'description' => 'Testing inheritance from parent schema.', + 'includes' => [ + 'parent' => PROJECT_ROOT . 
'/tests/schemas/inherit/parent.yml', + ], + 'filename_pattern' => '/parent-\d.csv$/i', + 'csv' => [ + 'header' => true, + 'delimiter' => 'd', + 'quote_char' => 'q', + 'enclosure' => 'e', + 'encoding' => 'utf-16', + 'bom' => true, + ], + 'structural_rules' => [ + 'strict_column_order' => true, + 'allow_extra_columns' => true, + ], + 'columns' => [ + 0 => [ + 'name' => 'Name', + 'description' => 'Full name of the person.', + 'example' => 'John D', + 'required' => true, + 'rules' => [ + 'not_empty' => true, + 'length_min' => 5, + 'length_max' => 7, + ], + 'aggregate_rules' => ['nth_num' => [4, 0.001]], + ], + 1 => [ + 'name' => 'Overridden name by column name', + 'description' => 'Full name of the person.', + 'example' => 'John D', + 'required' => true, + 'rules' => [ + 'not_empty' => true, + 'length_min' => 5, + 'length_max' => 7, + ], + 'aggregate_rules' => ['nth_num' => [4, 0.001]], + ], + 2 => [ + 'name' => 'Overridden name by column index', + 'description' => 'Full name of the person.', + 'example' => 'John D', + 'required' => true, + 'rules' => [ + 'not_empty' => true, + 'length_min' => 5, + 'length_max' => 7, + ], + 'aggregate_rules' => ['nth_num' => [4, 0.001]], + ], + 3 => [ + 'name' => 'Overridden name by column index and column name', + 'description' => 'Full name of the person.', + 'example' => 'John D', + 'required' => true, + 'rules' => [ + 'not_empty' => true, + 'length_min' => 5, + 'length_max' => 7, + ], + 'aggregate_rules' => ['nth_num' => [4, 0.001]], + ], + 4 => [ + 'name' => 'Overridden name by column index and column name + added rules', + 'description' => 'Full name of the person.', + 'example' => 'John D', + 'required' => true, + 'rules' => [ + 'not_empty' => true, + 'length_min' => 1, + 'length_max' => 7, + ], + 'aggregate_rules' => ['nth_num' => [4, 0.001]], + ], + 5 => [ + 'name' => 'Overridden name by column index and column name + added aggregate rules', + 'description' => 'Full name of the person.', + 'example' => 'John D', + 'required' 
=> true, + 'rules' => [ + 'not_empty' => true, + 'length_min' => 5, + 'length_max' => 7, + ], + 'aggregate_rules' => ['nth_num' => [10, 0.05]], + ], + 6 => [ + 'name' => 'Overridden only rules', + 'description' => '', + 'example' => null, + 'required' => true, + 'rules' => [ + 'not_empty' => true, + 'length_min' => 5, + 'length_max' => 7, + ], + 'aggregate_rules' => [], + ], + 7 => [ + 'name' => 'Overridden only aggregation rules', + 'description' => '', + 'example' => null, + 'required' => true, + 'rules' => [], + 'aggregate_rules' => ['nth_num' => [4, 0.001]], + ], + 8 => [ + 'name' => 'Second Column', + 'description' => 'Some number.', + 'example' => 123, + 'required' => false, + 'rules' => [ + 'length_min' => 1, + 'length_max' => 4, + ], + 'aggregate_rules' => ['sum' => 1000], + ], + ], + ], $schema->getData()->getArrayCopy()); + isSame('', (string)$schema->validate()); + } + + public function testRealChildOfChild(): void + { + $schema = new Schema('./tests/schemas/inherit/child-of-child.yml'); + isSame([ + 'name' => 'Child of child schema', + 'description' => 'Testing inheritance from child schema.', + 'includes' => [ + 'parent-1_0' => PROJECT_ROOT . 
'/tests/schemas/inherit/child.yml', + ], + 'filename_pattern' => '/child-of-child-\d.csv$/i', + 'csv' => [ + 'header' => true, + 'delimiter' => 'dd', + 'quote_char' => 'qq', + 'enclosure' => 'ee', + 'encoding' => 'utf-32', + 'bom' => false, + ], + 'structural_rules' => [ + 'strict_column_order' => true, + 'allow_extra_columns' => false, + ], + 'columns' => [ + [ + 'name' => 'Second Column', + 'description' => 'Some number.', + 'example' => 123, + 'required' => false, + 'rules' => [ + 'length_min' => 1, + 'length_max' => 4, + ], + 'aggregate_rules' => ['sum' => 1000], + ], + ], + ], $schema->getData()->getArrayCopy()); + isSame('', (string)$schema->validate()); + } +} diff --git a/tests/SchemaTest.php b/tests/SchemaTest.php index 3878fb9d..19aebad6 100644 --- a/tests/SchemaTest.php +++ b/tests/SchemaTest.php @@ -195,6 +195,7 @@ public function testValidateValidSchemaFixtures(): void { $schemas = (new Finder()) ->in(PROJECT_ROOT . '/tests/schemas') + ->in(PROJECT_ROOT . '/tests/schemas/inherit') ->in(PROJECT_ROOT . '/tests/Benchmarks') ->in(PROJECT_ROOT . '/schema-examples') ->name('*.yml') @@ -220,8 +221,8 @@ public function testValidateInvalidSchema(): void +-------+------------+--------+-------------------------------------------------------------------------+ | Line | id:Column | Rule | Message | +-------+------------+--------+-------------------------------------------------------------------------+ - | undef | meta | schema | Unknown key: .unknow_root_option | | undef | meta | schema | Unknown key: .csv.unknow_csv_param | + | undef | meta | schema | Unknown key: .unknow_root_option | | undef | 0:Name | schema | Unknown key: .columns.0.rules.unknow_rule | | undef | 1:City | schema | Unknown key: .columns.1.unknow_colum_option | | undef | 3:Birthday | schema | Expected type "string", actual "boolean" in .columns.3.rules.date_max | @@ -235,8 +236,8 @@ public function testValidateInvalidSchema(): void isSame( <<<'TEXT' - "schema", column "meta". 
Unknown key: .unknow_root_option. "schema", column "meta". Unknown key: .csv.unknow_csv_param. + "schema", column "meta". Unknown key: .unknow_root_option. "schema", column "0:Name". Unknown key: .columns.0.rules.unknow_rule. "schema", column "1:City". Unknown key: .columns.1.unknow_colum_option. "schema", column "3:Birthday". Expected type "string", actual "boolean" in .columns.3.rules.date_max. diff --git a/tests/Tools.php b/tests/Tools.php index 876c21f6..f05d65b2 100644 --- a/tests/Tools.php +++ b/tests/Tools.php @@ -34,9 +34,9 @@ final class Tools public const SCHEMA_SIMPLE_NO_HEADER = './tests/schemas/simple_no_header.yml'; public const SCHEMA_SIMPLE_HEADER_PHP = './tests/schemas/simple_header.php'; public const SCHEMA_SIMPLE_HEADER_JSON = './tests/schemas/simple_header.json'; - public const SCHEMA_EXAMPLE_EMPTY = './tests/schemas/example_empty.yml'; + public const SCHEMA_EXAMPLE_EMPTY = PROJECT_ROOT . '/tests/schemas/example_empty.yml'; - public const SCHEMA_FULL_YML = './schema-examples/full.yml'; + public const SCHEMA_FULL_YML = PROJECT_ROOT . '/schema-examples/full.yml'; public const SCHEMA_FULL_YML_CLEAN = './schema-examples/full_clean.yml'; public const SCHEMA_FULL_JSON = './schema-examples/full.json'; public const SCHEMA_FULL_PHP = './schema-examples/full.php'; diff --git a/tests/schemas/inherit/child-of-child.yml b/tests/schemas/inherit/child-of-child.yml new file mode 100644 index 00000000..465cdc4e --- /dev/null +++ b/tests/schemas/inherit/child-of-child.yml @@ -0,0 +1,40 @@ +# +# JBZoo Toolbox - Csv-Blueprint. +# +# This file is part of the JBZoo Toolbox project. +# For the full copyright and license information, please view the LICENSE +# file that was distributed with this source code. +# +# @license MIT +# @copyright Copyright (C) JBZoo.com, All rights reserved. +# @see https://github.com/JBZoo/Csv-Blueprint +# + +# This schema is invalid because does not match the CSV file (tests/fixtures/demo.csv). 
+ + +name: Child of child schema +description: Testing inheritance from child schema. + +includes: + parent-1_0: child.yml + + +filename_pattern: /child-of-child-\d.csv$/i + + +csv: + inherit: 'parent-1_0/csv' + delimiter: 'dd' + quote_char: 'qq' + enclosure: 'ee' + encoding: utf-32 + bom: false + + +structural_rules: + inherit: 'parent-1_0/structural_rules' + allow_extra_columns: false + +columns: + - inherit: 'parent-1_0/columns/Second Column' diff --git a/tests/schemas/inherit/child.yml b/tests/schemas/inherit/child.yml new file mode 100644 index 00000000..64959153 --- /dev/null +++ b/tests/schemas/inherit/child.yml @@ -0,0 +1,76 @@ +# +# JBZoo Toolbox - Csv-Blueprint. +# +# This file is part of the JBZoo Toolbox project. +# For the full copyright and license information, please view the LICENSE +# file that was distributed with this source code. +# +# @license MIT +# @copyright Copyright (C) JBZoo.com, All rights reserved. +# @see https://github.com/JBZoo/Csv-Blueprint +# + +# This schema is invalid because does not match the CSV file (tests/fixtures/demo.csv). + + +name: Child schema +description: Testing inheritance from parent schema. 
+ +includes: + parent: ./../inherit/parent.yml + + +filename_pattern: + inherit: 'parent/filename_pattern' + + +csv: + inherit: 'parent/csv' + header: true + + +structural_rules: + inherit: 'parent/structural_rules' + strict_column_order: true + + +columns: + # 0 + - inherit: 'parent/columns/Name' + + # 1 + - inherit: 'parent/columns/Name' + name: Overridden name by column name + + # 2 + - inherit: 'parent/columns/0:' + name: Overridden name by column index + + # 3 + - inherit: 'parent/columns/0:Name' + name: Overridden name by column index and column name + + # 4 + - inherit: 'parent/columns/0:Name' + name: Overridden name by column index and column name + added rules + rules: + length_min: 1 + + # 5 + - inherit: 'parent/columns/0:Name' + name: Overridden name by column index and column name + added aggregate rules + aggregate_rules: + nth_num: [ 10, 0.05 ] + + # 6 + - name: Overridden only rules + rules: + inherit: 'parent/columns/0:Name/rules' + + # 7 + - name: Overridden only aggregation rules + aggregate_rules: + inherit: 'parent/columns/0:Name/aggregate_rules' + + # 8 + - inherit: 'parent/columns/Second Column' diff --git a/tests/schemas/inherit/parent.yml b/tests/schemas/inherit/parent.yml new file mode 100644 index 00000000..2f51732b --- /dev/null +++ b/tests/schemas/inherit/parent.yml @@ -0,0 +1,54 @@ +# +# JBZoo Toolbox - Csv-Blueprint. +# +# This file is part of the JBZoo Toolbox project. +# For the full copyright and license information, please view the LICENSE +# file that was distributed with this source code. +# +# @license MIT +# @copyright Copyright (C) JBZoo.com, All rights reserved. +# @see https://github.com/JBZoo/Csv-Blueprint +# + +# This schema is invalid because does not match the CSV file (tests/fixtures/demo.csv). + +name: Parent schema +description: Testing inheritance. 
+ +filename_pattern: /parent-\d.csv$/i + +csv: + header: false + delimiter: 'd' + quote_char: 'q' + enclosure: 'e' + encoding: utf-16 + bom: true + + +structural_rules: + strict_column_order: false + allow_extra_columns: true + + +columns: + - name: Name + required: true + example: John D + description: Full name of the person. + rules: + not_empty: true + length_min: 5 + length_max: 7 + aggregate_rules: + nth_num: [ 4, 0.001 ] + + - name: Second Column + required: false + example: 123 + description: Some number. + rules: + length_min: 1 + length_max: 4 + aggregate_rules: + sum: 1000 From 9dccfa1cadc814d0a4f7d908e988b1b3138a5c27 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 01:52:40 +0400 Subject: [PATCH 02/24] Refactor code for clearer architecture and alias validation This commit addresses significant changes in several classes. In Schema.php, the constructor was refactored and stricter validation was added to handle the filename correctly. ValidatorSchema and SchemaDataPrep saw improvements in the way aliases are matched and validated. In Utils.php, added explicit casting to ensure the first array in mergeConfigs function is indeed an array. These changes combined improve the implementation of schema inheritance and make the code more readable and reliable. 
--- src/Schema.php | 10 +++++----- src/SchemaDataPrep.php | 8 ++++---- src/Utils.php | 2 +- src/Validators/ValidatorSchema.php | 2 +- 4 files changed, 11 insertions(+), 11 deletions(-) diff --git a/src/Schema.php b/src/Schema.php index 450809e3..2b3fbf70 100644 --- a/src/Schema.php +++ b/src/Schema.php @@ -34,7 +34,6 @@ final class Schema /** @var Column[] */ private array $columns; - private string $basepath = '.'; private ?string $filename; private AbstractData $data; @@ -69,9 +68,10 @@ public function __construct(null|array|string $csvSchemaFilenameOrArray = null) } $basepath = '.'; - if ((string)$this->filename !== '' && $this->filename !== '_custom_array_') { - $this->filename = realpath($this->filename); - $basepath = \dirname((string)$this->filename); + $filename = (string)$this->filename; + if ($filename !== '' && \file_exists($filename)) { + $this->filename = (string)\realpath($filename); + $basepath = \dirname($filename); } $this->data = (new SchemaDataPrep($data, $basepath))->buildData(); @@ -94,7 +94,7 @@ public function getColumns(): array public function getColumn(int|string $columNameOrId, ?string $forceName = null): ?Column { // By "index" - if (\is_numeric($columNameOrId) || \is_int($columNameOrId)) { + if (\is_numeric($columNameOrId)) { return \array_values($this->getColumns())[$columNameOrId] ?? null; } diff --git a/src/SchemaDataPrep.php b/src/SchemaDataPrep.php index 66c40d4f..5eb174db 100644 --- a/src/SchemaDataPrep.php +++ b/src/SchemaDataPrep.php @@ -105,7 +105,7 @@ public static function validateAlias(string $alias): void throw new \InvalidArgumentException('Empty alias'); } - if (!\preg_match(self::getAliasRegex(), $alias)) { + if (\preg_match(self::getAliasRegex(), $alias) === 0) { throw new \InvalidArgumentException("Invalid alias: \"{$alias}\""); } } @@ -185,7 +185,7 @@ private function buildByKey(string $key = 'structural_rules'): array $inherit = $this->data->findString("{$key}.inherit"); $parentConfig = []; - if (\preg_match('/' . 
self::ALIAS_REGEX . '\/' . $key . '$/i', $inherit)) { + if (\preg_match('/' . self::ALIAS_REGEX . '\/' . $key . '$/i', $inherit) === 1) { $inheritParts = $this->parseAliasParts($inherit); $parent = $this->getParentSchema($inheritParts['alias']); $parentConfig = $parent->getData()->getArray($key); @@ -206,7 +206,7 @@ private function buildColumns(): array $columnInherit = $columnData->getString('inherit'); $parentConfig = []; - if (\preg_match('/' . self::ALIAS_REGEX . '\/columns\/[^\/]+$/i', $columnInherit)) { + if (\preg_match('/' . self::ALIAS_REGEX . '\/columns\/[^\/]+$/i', $columnInherit) === 1) { $inheritParts = $this->parseAliasParts($columnInherit); $parent = $this->getParentSchema($inheritParts['alias']); $parentColumn = $parent->getColumn($inheritParts['column']); @@ -254,7 +254,7 @@ private function buildRules(array $rules, string $typeOfRules): array $inherit = $rules['inherit'] ?? ''; $parentConfig = []; - if (\preg_match('/' . self::ALIAS_REGEX . '\/columns\/[^\/]+\/' . $typeOfRules . '$/i', $inherit)) { + if (\preg_match('/' . self::ALIAS_REGEX . '\/columns\/[^\/]+\/' . $typeOfRules . 
'$/i', $inherit) === 1) { $inheritParts = $this->parseAliasParts($inherit); $parent = $this->getParentSchema($inheritParts['alias']); $parentColumn = $parent->getColumn($inheritParts['column']); diff --git a/src/Utils.php b/src/Utils.php index b59b764b..4fa37c38 100644 --- a/src/Utils.php +++ b/src/Utils.php @@ -432,7 +432,7 @@ public static function fixArgv(array $originalArgs): array public static function mergeConfigs(array ...$configs): array { - $merged = \array_shift($configs); // Start with the first array + $merged = (array)\array_shift($configs); // Start with the first array foreach ($configs as $config) { foreach ($config as $key => $value) { diff --git a/src/Validators/ValidatorSchema.php b/src/Validators/ValidatorSchema.php index 5fa42da8..634b08c9 100644 --- a/src/Validators/ValidatorSchema.php +++ b/src/Validators/ValidatorSchema.php @@ -152,7 +152,7 @@ private static function validateMeta( $metaErrors = Utils::compareArray($expectedMeta, $actualMetaAsArray, 'meta', '.'); - foreach($actualIncludes as $alias => $includedFile) { + foreach ($actualIncludes as $alias => $includedFile) { if ($alias === '') { $errors->addError(new Error('includes', 'Defined alias is empty')); } From 894090f12ad3a619aa532a857a71d1dc02c14b00 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 02:26:15 +0400 Subject: [PATCH 03/24] Refactored and validated alias inheritance in schemas The changes in various classes improve reliability and readability of the code. In Schema.php, a refactored constructor and stricter filename validation ensures correct file handling. Enhanced match and validation handling in ValidatorSchema and SchemaDataPrep improves alias processing. Furthermore, explicit casting in Utils.php's mergeConfigs function ensures reliable array handling. 
--- src/Schema.php | 1 - src/SchemaDataPrep.php | 61 ++++++++++-------------- src/Utils.php | 3 ++ tests/SchemaInheritTest.php | 36 +++++++------- tests/schemas/inherit/child-of-child.yml | 12 ++--- tests/schemas/inherit/child.yml | 24 +++++----- tests/schemas/inherit/parent.yml | 6 +-- 7 files changed, 68 insertions(+), 75 deletions(-) diff --git a/src/Schema.php b/src/Schema.php index 2b3fbf70..8da2530a 100644 --- a/src/Schema.php +++ b/src/Schema.php @@ -48,7 +48,6 @@ public function __construct(null|array|string $csvSchemaFilenameOrArray = null) && \file_exists($csvSchemaFilenameOrArray) ) { $this->filename = $csvSchemaFilenameOrArray; - $data = new Data(); $fileExtension = \pathinfo($csvSchemaFilenameOrArray, \PATHINFO_EXTENSION); if ($fileExtension === 'yml' || $fileExtension === 'yaml') { diff --git a/src/SchemaDataPrep.php b/src/SchemaDataPrep.php index 5eb174db..357712da 100644 --- a/src/SchemaDataPrep.php +++ b/src/SchemaDataPrep.php @@ -105,35 +105,12 @@ public static function validateAlias(string $alias): void throw new \InvalidArgumentException('Empty alias'); } - if (\preg_match(self::getAliasRegex(), $alias) === 0) { + $regex = self::getAliasRegex(); + if ($regex !== '' && \preg_match($regex, $alias) === 0) { throw new \InvalidArgumentException("Invalid alias: \"{$alias}\""); } } - private function parseAliasParts(string $inherit): array - { - $alias = null; - $keyword = null; - $columnName = null; - $rules = null; - - $parts = \explode('/', $inherit); - if (\count($parts) === 2) { - [$alias, $keyword] = $parts; - } elseif (\count($parts) === 3) { - [$alias, $keyword, $columnName] = $parts; - } elseif (\count($parts) === 4) { - [$alias, $keyword, $columnName, $rules] = $parts; - } - - return [ - 'alias' => $alias, - 'keyword' => $keyword, - 'column' => $columnName, - 'rules' => $rules, - ]; - } - /** * @return Schema[] */ @@ -142,6 +119,8 @@ private function prepareAliases(AbstractData $data): array $includes = []; foreach 
($data->getArray('includes') as $alias => $includedPathOrArray) { + $alias = (string)$alias; + self::validateAlias($alias); if (\is_array($includedPathOrArray)) { @@ -171,8 +150,8 @@ private function buildFilenamePattern(): string { $inherit = $this->data->findString('filename_pattern.inherit'); - if (\str_ends_with($inherit, '/filename_pattern')) { - $inheritParts = $this->parseAliasParts($inherit); + if ($inherit !== '') { + $inheritParts = self::parseAliasParts($inherit); $parent = $this->getParentSchema($inheritParts['alias']); return $parent->getData()->get('filename_pattern'); } @@ -185,13 +164,13 @@ private function buildByKey(string $key = 'structural_rules'): array $inherit = $this->data->findString("{$key}.inherit"); $parentConfig = []; - if (\preg_match('/' . self::ALIAS_REGEX . '\/' . $key . '$/i', $inherit) === 1) { - $inheritParts = $this->parseAliasParts($inherit); + if ($inherit !== '') { + $inheritParts = self::parseAliasParts($inherit); $parent = $this->getParentSchema($inheritParts['alias']); $parentConfig = $parent->getData()->getArray($key); } - $result = Utils::mergeConfigs(self::DEFAULTS[$key], $parentConfig, $this->data->getArray($key)); + $result = Utils::mergeConfigs((array)self::DEFAULTS[$key], $parentConfig, $this->data->getArray($key)); unset($result['inherit']); return $result; @@ -206,8 +185,8 @@ private function buildColumns(): array $columnInherit = $columnData->getString('inherit'); $parentConfig = []; - if (\preg_match('/' . self::ALIAS_REGEX . '\/columns\/[^\/]+$/i', $columnInherit) === 1) { - $inheritParts = $this->parseAliasParts($columnInherit); + if ($columnInherit !== '') { + $inheritParts = self::parseAliasParts($columnInherit); $parent = $this->getParentSchema($inheritParts['alias']); $parentColumn = $parent->getColumn($inheritParts['column']); if ($parentColumn === null) { @@ -254,8 +233,8 @@ private function buildRules(array $rules, string $typeOfRules): array $inherit = $rules['inherit'] ?? 
''; $parentConfig = []; - if (\preg_match('/' . self::ALIAS_REGEX . '\/columns\/[^\/]+\/' . $typeOfRules . '$/i', $inherit) === 1) { - $inheritParts = $this->parseAliasParts($inherit); + if ($inherit !== '') { + $inheritParts = self::parseAliasParts($inherit); $parent = $this->getParentSchema($inheritParts['alias']); $parentColumn = $parent->getColumn($inheritParts['column']); if ($parentColumn === null) { @@ -265,9 +244,21 @@ private function buildRules(array $rules, string $typeOfRules): array $parentConfig = $parentColumn->getData()->getArray($typeOfRules); } - $actualRules = Utils::mergeConfigs(self::DEFAULTS[$typeOfRules], $parentConfig, $rules); + $actualRules = Utils::mergeConfigs((array)self::DEFAULTS[$typeOfRules], $parentConfig, $rules); unset($actualRules['inherit']); return $actualRules; } + + private static function parseAliasParts(string $inherit): array + { + $parts = \explode('/', $inherit); + self::validateAlias($parts[0]); + + if (\count($parts) === 1) { + return ['alias' => $parts[0]]; + } + + return ['alias' => $parts[0], 'column' => $parts[1]]; + } } diff --git a/src/Utils.php b/src/Utils.php index 4fa37c38..315691cb 100644 --- a/src/Utils.php +++ b/src/Utils.php @@ -430,6 +430,9 @@ public static function fixArgv(array $originalArgs): array return $newArgumens; } + /** + * @param array|int[]|string[] ...$configs + */ public static function mergeConfigs(array ...$configs): array { $merged = (array)\array_shift($configs); // Start with the first array diff --git a/tests/SchemaInheritTest.php b/tests/SchemaInheritTest.php index 01a5980b..ac5a2d96 100644 --- a/tests/SchemaInheritTest.php +++ b/tests/SchemaInheritTest.php @@ -118,7 +118,7 @@ public function testOverideFilenamePattern(): void 'parent' => ['filename_pattern' => '/.*/i'], ], 'filename_pattern' => [ - 'inherit' => 'parent/filename_pattern', + 'inherit' => 'parent', ], ]); @@ -141,7 +141,7 @@ public function testOverideCsvFull(): void ], ], ], - 'csv' => ['inherit' => 'parent/csv'], + 
'csv' => ['inherit' => 'parent'], ]); isSame([ @@ -170,7 +170,7 @@ public function testOverideCsvPartial(): void ], ], 'csv' => [ - 'inherit' => 'parent/csv', + 'inherit' => 'parent', 'encoding' => 'utf-32', ], ]); @@ -199,7 +199,7 @@ public function testOverideStructuralRulesFull(): void ], ], 'structural_rules' => [ - 'inherit' => 'parent/structural_rules', + 'inherit' => 'parent', ], ]); @@ -223,7 +223,7 @@ public function testOverideStructuralRulesPartial1(): void ], ], 'structural_rules' => [ - 'inherit' => 'parent/structural_rules', + 'inherit' => 'parent', 'allow_extra_columns' => true, ], ]); @@ -240,7 +240,7 @@ public function testOverideStructuralRulesPartial2(): void $schema = new Schema([ 'includes' => ['parent' => ['structural_rules' => []]], 'structural_rules' => [ - 'inherit' => 'parent/structural_rules', + 'inherit' => 'parent', 'allow_extra_columns' => true, ], ]); @@ -275,13 +275,13 @@ public function testOverideColumnFull(): void $schema = new Schema([ 'includes' => ['parent' => ['columns' => [$parentColum0, $parentColum1]]], 'columns' => [ - ['inherit' => 'parent/columns/0'], - ['inherit' => 'parent/columns/1'], - ['inherit' => 'parent/columns/0:'], - ['inherit' => 'parent/columns/1:'], - ['inherit' => 'parent/columns/Name'], - ['inherit' => 'parent/columns/0:Name'], - ['inherit' => 'parent/columns/1:Name'], + ['inherit' => 'parent/0'], + ['inherit' => 'parent/1'], + ['inherit' => 'parent/0:'], + ['inherit' => 'parent/1:'], + ['inherit' => 'parent/Name'], + ['inherit' => 'parent/0:Name'], + ['inherit' => 'parent/1:Name'], ], ]); @@ -315,7 +315,7 @@ public function testOverideColumnPartial(): void 'includes' => ['parent' => ['columns' => [$parentColum]]], 'columns' => [ [ - 'inherit' => 'parent/columns/Name', + 'inherit' => 'parent/Name', 'name' => 'Child name', 'rules' => [ 'is_int' => true, @@ -366,7 +366,7 @@ public function testOverideColumnRulesFull(): void 'columns' => [ [ 'name' => 'Child name', - 'rules' => ['inherit' => 
'parent/columns/0:/rules'], + 'rules' => ['inherit' => 'parent/0:'], ], ], ]); @@ -410,7 +410,7 @@ public function testOverideColumnRulesPartial(): void [ 'name' => 'Child name', 'rules' => [ - 'inherit' => 'parent/columns/0:/rules', + 'inherit' => 'parent/0:', 'allow_values' => ['d', 'c'], 'length_max' => 100, ], @@ -456,7 +456,7 @@ public function testOverideColumnAggregateRulesFull(): void 'columns' => [ [ 'name' => 'Child name', - 'aggregate_rules' => ['inherit' => 'parent/columns/0:/aggregate_rules'], + 'aggregate_rules' => ['inherit' => 'parent/0:'], ], ], ]); @@ -498,7 +498,7 @@ public function testOverideColumnAggregateRulesPartial(): void [ 'name' => 'Child name', 'aggregate_rules' => [ - 'inherit' => 'parent/columns/0:/aggregate_rules', + 'inherit' => 'parent/0:', 'sum_max' => 4200, 'sum_min' => 1, ], diff --git a/tests/schemas/inherit/child-of-child.yml b/tests/schemas/inherit/child-of-child.yml index 465cdc4e..c03e6ce5 100644 --- a/tests/schemas/inherit/child-of-child.yml +++ b/tests/schemas/inherit/child-of-child.yml @@ -24,17 +24,17 @@ filename_pattern: /child-of-child-\d.csv$/i csv: - inherit: 'parent-1_0/csv' - delimiter: 'dd' - quote_char: 'qq' - enclosure: 'ee' + inherit: parent-1_0 + delimiter: dd + quote_char: qq + enclosure: ee encoding: utf-32 bom: false structural_rules: - inherit: 'parent-1_0/structural_rules' + inherit: parent-1_0 allow_extra_columns: false columns: - - inherit: 'parent-1_0/columns/Second Column' + - inherit: parent-1_0/Second Column diff --git a/tests/schemas/inherit/child.yml b/tests/schemas/inherit/child.yml index 64959153..73d67dde 100644 --- a/tests/schemas/inherit/child.yml +++ b/tests/schemas/inherit/child.yml @@ -21,43 +21,43 @@ includes: filename_pattern: - inherit: 'parent/filename_pattern' + inherit: parent csv: - inherit: 'parent/csv' + inherit: parent header: true structural_rules: - inherit: 'parent/structural_rules' + inherit: parent strict_column_order: true columns: # 0 - - inherit: 'parent/columns/Name' + 
- inherit: parent/Name # 1 - - inherit: 'parent/columns/Name' + - inherit: parent/Name name: Overridden name by column name # 2 - - inherit: 'parent/columns/0:' + - inherit: 'parent/0:' name: Overridden name by column index # 3 - - inherit: 'parent/columns/0:Name' + - inherit: parent/0:Name name: Overridden name by column index and column name # 4 - - inherit: 'parent/columns/0:Name' + - inherit: parent/0:Name name: Overridden name by column index and column name + added rules rules: length_min: 1 # 5 - - inherit: 'parent/columns/0:Name' + - inherit: parent/0:Name name: Overridden name by column index and column name + added aggregate rules aggregate_rules: nth_num: [ 10, 0.05 ] @@ -65,12 +65,12 @@ columns: # 6 - name: Overridden only rules rules: - inherit: 'parent/columns/0:Name/rules' + inherit: parent/0:Name # 7 - name: Overridden only aggregation rules aggregate_rules: - inherit: 'parent/columns/0:Name/aggregate_rules' + inherit: parent/0:Name # 8 - - inherit: 'parent/columns/Second Column' + - inherit: parent/Second Column diff --git a/tests/schemas/inherit/parent.yml b/tests/schemas/inherit/parent.yml index 2f51732b..4c47f6a8 100644 --- a/tests/schemas/inherit/parent.yml +++ b/tests/schemas/inherit/parent.yml @@ -19,9 +19,9 @@ filename_pattern: /parent-\d.csv$/i csv: header: false - delimiter: 'd' - quote_char: 'q' - enclosure: 'e' + delimiter: d + quote_char: q + enclosure: e encoding: utf-16 bom: true From d2e941e96db77bc37d5181251cdfda3c647d4669 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 03:29:55 +0400 Subject: [PATCH 04/24] Remove inheritance functionality in src/SchemaDataPrep.php Removed the 'inherit' attribute from different functions in 'SchemaDataPrep.php'. Both tests and examples have been updated to reflect these changes. This increases code readability and reliability, while reducing complexity. 
--- README.md | 4 ++-- src/SchemaDataPrep.php | 51 ++++++++++++++++-------------------------- tests/ReadmeTest.php | 7 +----- tests/schemas/todo.yml | 13 ++--------- 4 files changed, 24 insertions(+), 51 deletions(-) diff --git a/README.md b/README.md index e1051432..0cdc1540 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,7 @@ [![Static Badge](https://img.shields.io/badge/Rules-118-green?label=Cell%20rules&labelColor=blue&color=gray)](src/Rules/Cell) [![Static Badge](https://img.shields.io/badge/Rules-206-green?label=Aggregate%20rules&labelColor=blue&color=gray)](src/Rules/Aggregate) [![Static Badge](https://img.shields.io/badge/Rules-8-green?label=Extra%20checks&labelColor=blue&color=gray)](#extra-checks) -[![Static Badge](https://img.shields.io/badge/Rules-17/11/25-green?label=Plan%20to%20add&labelColor=gray&color=gray)](tests/schemas/todo.yml) +[![Static Badge](https://img.shields.io/badge/Rules-17/11/20-green?label=Plan%20to%20add&labelColor=gray&color=gray)](tests/schemas/todo.yml) A console utility designed for validating CSV files against a strictly defined schema and validation rules outlined @@ -1364,12 +1364,12 @@ It's random ideas and plans. No promises and deadlines. Feel free to [help me!]( * Flag to ignore file name pattern. It's useful when you have a lot of files, and you don't want to validate the file name. * **Validation** + * Multi `filename_pattern`. Support list of regexs. * Multi values in one cell. * Custom cell rule as a callback. It's useful when you have a complex rule that can't be described in the schema file. * Custom agregate rule as a callback. It's useful when you have a complex rule that can't be described in the schema file. * Configurable keyword for null/empty values. By default, it's an empty string. But you will use `null`, `nil`, `none`, `empty`, etc. Overridable on the column level. * Handle empty files and files with only a header row, or only with one line of data. One column wthout header is also possible. 
- * Inheritance of schemas, rules and columns. Define parent schema and override some rules in the child schemas. Make it DRY and easy to maintain. * If option `--schema` is not specified, then validate only super base level things (like "is it a CSV file?"). * Complex rules (like "if field `A` is not empty, then field `B` should be not empty too"). * Extending with custom rules and custom report formats. Plugins? diff --git a/src/SchemaDataPrep.php b/src/SchemaDataPrep.php index 357712da..698adaae 100644 --- a/src/SchemaDataPrep.php +++ b/src/SchemaDataPrep.php @@ -31,7 +31,6 @@ final class SchemaDataPrep 'inlcudes' => [], 'csv' => [ - 'inherit' => null, 'header' => true, 'delimiter' => ',', 'quote_char' => '\\', @@ -46,7 +45,6 @@ final class SchemaDataPrep ], 'column' => [ - 'inherit' => '', 'name' => '', 'description' => '', 'example' => null, @@ -78,7 +76,7 @@ public function buildData(): Data 'name' => $this->buildName(), 'description' => $this->buildDescription(), 'includes' => $this->buildIncludes(), - 'filename_pattern' => $this->buildFilenamePattern(), + 'filename_pattern' => $this->buildByKey('filename_pattern')[0], 'csv' => $this->buildByKey('csv'), 'structural_rules' => $this->buildByKey('structural_rules'), 'columns' => $this->buildColumns(), @@ -146,17 +144,24 @@ private function getParentSchema(string $alias): Schema throw new \InvalidArgumentException("Unknown included alias: \"{$alias}\""); } - private function buildFilenamePattern(): string + private function buildIncludes(): array { - $inherit = $this->data->findString('filename_pattern.inherit'); - - if ($inherit !== '') { - $inheritParts = self::parseAliasParts($inherit); - $parent = $this->getParentSchema($inheritParts['alias']); - return $parent->getData()->get('filename_pattern'); + $result = []; + foreach ($this->aliases as $alias => $schema) { + $result[$alias] = $schema->getFilename(); } - return $this->data->getString('filename_pattern', self::DEFAULTS['filename_pattern']); + return 
$result; + } + + private function buildName(): string + { + return $this->data->getString('name', self::DEFAULTS['name']); + } + + private function buildDescription(): string + { + return $this->data->getString('description', self::DEFAULTS['description']); } private function buildByKey(string $key = 'structural_rules'): array @@ -190,7 +195,9 @@ private function buildColumns(): array $parent = $this->getParentSchema($inheritParts['alias']); $parentColumn = $parent->getColumn($inheritParts['column']); if ($parentColumn === null) { - throw new \InvalidArgumentException("Unknown column: \"{$inheritParts['column']}\""); + throw new \InvalidArgumentException( + "Unknown column: \"{$inheritParts['column']}\" by alias: \"{$inheritParts['alias']}\"", + ); } $parentConfig = $parentColumn->getData()->getArrayCopy(); @@ -208,26 +215,6 @@ private function buildColumns(): array return $columns; } - private function buildIncludes(): array - { - $result = []; - foreach ($this->aliases as $alias => $schema) { - $result[$alias] = $schema->getFilename(); - } - - return $result; - } - - private function buildName(): string - { - return $this->data->getString('name', self::DEFAULTS['name']); - } - - private function buildDescription(): string - { - return $this->data->getString('description', self::DEFAULTS['description']); - } - private function buildRules(array $rules, string $typeOfRules): array { $inherit = $rules['inherit'] ?? 
''; diff --git a/tests/ReadmeTest.php b/tests/ReadmeTest.php index a1080e44..a1b9505e 100644 --- a/tests/ReadmeTest.php +++ b/tests/ReadmeTest.php @@ -105,15 +105,10 @@ public function testBadgeOfRules(): void 'csv.auto_detect', 'csv.end_of_line', 'csv.null_values', + 'filename_pattern - multiple', 'column.faker', 'column.null_values', 'column.multiple + column.multiple_separator', - 'inherit.', - 'inherit.csv', - 'inherit.structural_rules', - 'inherit.rules', - 'inherit.aggregate_rules', - 'inherit.complex_rules', ]) + \count($todoYml->findArray('structural_rules')) + \count($todoYml->findArray('complex_rules')), ]); diff --git a/tests/schemas/todo.yml b/tests/schemas/todo.yml index 83312538..1ac62c24 100644 --- a/tests/schemas/todo.yml +++ b/tests/schemas/todo.yml @@ -12,17 +12,9 @@ # File contains just ideas. It's invalid! -# Include another schemas -includes: # Alias is always required - - /path/schema_1.yml as alias_1 # Full path to another schema. - - ./path/schema_2.yml as alias_2 # Relative path based on the current schema path. - - ../path/schema_3.yml as alias_3 # Relative path based on the current schema path. Go up one level. - csv: # How to parse file before validation - inherit: alias_1 # Inherited from another schema. Options above will overwrite inherited options. - auto_detect: false # If true, then the cintrol chars will be detected automatically. - end_of_line: LF # End of line character. LF => \n, CRLF => \r\n, CR => \r + auto_detect: false # If true, then the control chars will be detected automatically. empty_values: # List of values that will be treated as empty - "" # By default, only empty string is treated as empty (string length = 0). - null @@ -41,8 +33,7 @@ structural_rules: columns: - - inherit: alias_1\Column Name - empty_values: [''] # Override csv.empty_values. List of values that will be treated as empty. + - empty_values: [''] # Override csv.empty_values. List of values that will be treated as empty. 
# Multi prop multiple: true From ae3e85a2786816f859665dcc6b961272f182f064 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 03:58:14 +0400 Subject: [PATCH 05/24] Replace 'inherit' with 'preset' in SchemaDataPrep and update tests The 'inherit' attribute has been replaced with 'preset' in the SchemaDataPrep class to align with new design decisions. Corresponding tests are updated to reflect these changes, ensuring the new attribute works expectedly. This change simplifies the codebase and improves readability. --- src/SchemaDataPrep.php | 46 +++++++------- ...maInheritTest.php => SchemaPresetTest.php} | 62 +++++++++---------- tests/SchemaTest.php | 2 +- .../{inherit => preset}/child-of-child.yml | 9 ++- tests/schemas/{inherit => preset}/child.yml | 27 ++++---- tests/schemas/{inherit => preset}/parent.yml | 3 +- 6 files changed, 73 insertions(+), 76 deletions(-) rename tests/{SchemaInheritTest.php => SchemaPresetTest.php} (93%) rename tests/schemas/{inherit => preset}/child-of-child.yml (80%) rename tests/schemas/{inherit => preset}/child.yml (74%) rename tests/schemas/{inherit => preset}/parent.yml (93%) diff --git a/src/SchemaDataPrep.php b/src/SchemaDataPrep.php index 698adaae..c3b94dfe 100644 --- a/src/SchemaDataPrep.php +++ b/src/SchemaDataPrep.php @@ -53,8 +53,8 @@ final class SchemaDataPrep 'aggregate_rules' => [], ], - 'rules' => ['inherit' => ''], - 'aggregate_rules' => ['inherit' => ''], + 'rules' => ['preset' => ''], + 'aggregate_rules' => ['preset' => ''], ]; private AbstractData $data; @@ -166,17 +166,17 @@ private function buildDescription(): string private function buildByKey(string $key = 'structural_rules'): array { - $inherit = $this->data->findString("{$key}.inherit"); + $preset = $this->data->findString("{$key}.preset"); $parentConfig = []; - if ($inherit !== '') { - $inheritParts = self::parseAliasParts($inherit); - $parent = $this->getParentSchema($inheritParts['alias']); + if ($preset !== '') { + $presetParts = 
self::parseAliasParts($preset); + $parent = $this->getParentSchema($presetParts['alias']); $parentConfig = $parent->getData()->getArray($key); } $result = Utils::mergeConfigs((array)self::DEFAULTS[$key], $parentConfig, $this->data->getArray($key)); - unset($result['inherit']); + unset($result['preset']); return $result; } @@ -187,16 +187,16 @@ private function buildColumns(): array foreach ($this->data->getArray('columns') as $columnIndex => $column) { $columnData = new Data($column); - $columnInherit = $columnData->getString('inherit'); + $columnpreset = $columnData->getString('preset'); $parentConfig = []; - if ($columnInherit !== '') { - $inheritParts = self::parseAliasParts($columnInherit); - $parent = $this->getParentSchema($inheritParts['alias']); - $parentColumn = $parent->getColumn($inheritParts['column']); + if ($columnpreset !== '') { + $presetParts = self::parseAliasParts($columnpreset); + $parent = $this->getParentSchema($presetParts['alias']); + $parentColumn = $parent->getColumn($presetParts['column']); if ($parentColumn === null) { throw new \InvalidArgumentException( - "Unknown column: \"{$inheritParts['column']}\" by alias: \"{$inheritParts['alias']}\"", + "Unknown column: \"{$presetParts['column']}\" by alias: \"{$presetParts['alias']}\"", ); } @@ -207,7 +207,7 @@ private function buildColumns(): array $actualColumn['rules'] = $this->buildRules($actualColumn['rules'], 'rules'); $actualColumn['aggregate_rules'] = $this->buildRules($actualColumn['aggregate_rules'], 'aggregate_rules'); - unset($actualColumn['inherit']); + unset($actualColumn['preset']); $columns[$columnIndex] = $actualColumn; } @@ -217,29 +217,29 @@ private function buildColumns(): array private function buildRules(array $rules, string $typeOfRules): array { - $inherit = $rules['inherit'] ?? ''; + $preset = $rules['preset'] ?? 
''; $parentConfig = []; - if ($inherit !== '') { - $inheritParts = self::parseAliasParts($inherit); - $parent = $this->getParentSchema($inheritParts['alias']); - $parentColumn = $parent->getColumn($inheritParts['column']); + if ($preset !== '') { + $presetParts = self::parseAliasParts($preset); + $parent = $this->getParentSchema($presetParts['alias']); + $parentColumn = $parent->getColumn($presetParts['column']); if ($parentColumn === null) { - throw new \InvalidArgumentException("Unknown column: \"{$inheritParts['column']}\""); + throw new \InvalidArgumentException("Unknown column: \"{$presetParts['column']}\""); } $parentConfig = $parentColumn->getData()->getArray($typeOfRules); } $actualRules = Utils::mergeConfigs((array)self::DEFAULTS[$typeOfRules], $parentConfig, $rules); - unset($actualRules['inherit']); + unset($actualRules['preset']); return $actualRules; } - private static function parseAliasParts(string $inherit): array + private static function parseAliasParts(string $preset): array { - $parts = \explode('/', $inherit); + $parts = \explode('/', $preset); self::validateAlias($parts[0]); if (\count($parts) === 1) { diff --git a/tests/SchemaInheritTest.php b/tests/SchemaPresetTest.php similarity index 93% rename from tests/SchemaInheritTest.php rename to tests/SchemaPresetTest.php index ac5a2d96..1e287016 100644 --- a/tests/SchemaInheritTest.php +++ b/tests/SchemaPresetTest.php @@ -18,7 +18,7 @@ use JBZoo\CsvBlueprint\Schema; -final class SchemaInheritTest extends TestCase +final class SchemaPresetTest extends TestCase { public function testDefaults(): void { @@ -118,7 +118,7 @@ public function testOverideFilenamePattern(): void 'parent' => ['filename_pattern' => '/.*/i'], ], 'filename_pattern' => [ - 'inherit' => 'parent', + 'preset' => 'parent', ], ]); @@ -141,7 +141,7 @@ public function testOverideCsvFull(): void ], ], ], - 'csv' => ['inherit' => 'parent'], + 'csv' => ['preset' => 'parent'], ]); isSame([ @@ -170,7 +170,7 @@ public function 
testOverideCsvPartial(): void ], ], 'csv' => [ - 'inherit' => 'parent', + 'preset' => 'parent', 'encoding' => 'utf-32', ], ]); @@ -199,7 +199,7 @@ public function testOverideStructuralRulesFull(): void ], ], 'structural_rules' => [ - 'inherit' => 'parent', + 'preset' => 'parent', ], ]); @@ -223,7 +223,7 @@ public function testOverideStructuralRulesPartial1(): void ], ], 'structural_rules' => [ - 'inherit' => 'parent', + 'preset' => 'parent', 'allow_extra_columns' => true, ], ]); @@ -240,7 +240,7 @@ public function testOverideStructuralRulesPartial2(): void $schema = new Schema([ 'includes' => ['parent' => ['structural_rules' => []]], 'structural_rules' => [ - 'inherit' => 'parent', + 'preset' => 'parent', 'allow_extra_columns' => true, ], ]); @@ -275,13 +275,13 @@ public function testOverideColumnFull(): void $schema = new Schema([ 'includes' => ['parent' => ['columns' => [$parentColum0, $parentColum1]]], 'columns' => [ - ['inherit' => 'parent/0'], - ['inherit' => 'parent/1'], - ['inherit' => 'parent/0:'], - ['inherit' => 'parent/1:'], - ['inherit' => 'parent/Name'], - ['inherit' => 'parent/0:Name'], - ['inherit' => 'parent/1:Name'], + ['preset' => 'parent/0'], + ['preset' => 'parent/1'], + ['preset' => 'parent/0:'], + ['preset' => 'parent/1:'], + ['preset' => 'parent/Name'], + ['preset' => 'parent/0:Name'], + ['preset' => 'parent/1:Name'], ], ]); @@ -315,9 +315,9 @@ public function testOverideColumnPartial(): void 'includes' => ['parent' => ['columns' => [$parentColum]]], 'columns' => [ [ - 'inherit' => 'parent/Name', - 'name' => 'Child name', - 'rules' => [ + 'preset' => 'parent/Name', + 'name' => 'Child name', + 'rules' => [ 'is_int' => true, 'length_min' => 2, 'length' => 5, @@ -366,7 +366,7 @@ public function testOverideColumnRulesFull(): void 'columns' => [ [ 'name' => 'Child name', - 'rules' => ['inherit' => 'parent/0:'], + 'rules' => ['preset' => 'parent/0:'], ], ], ]); @@ -410,7 +410,7 @@ public function testOverideColumnRulesPartial(): void [ 'name' => 
'Child name', 'rules' => [ - 'inherit' => 'parent/0:', + 'preset' => 'parent/0:', 'allow_values' => ['d', 'c'], 'length_max' => 100, ], @@ -456,7 +456,7 @@ public function testOverideColumnAggregateRulesFull(): void 'columns' => [ [ 'name' => 'Child name', - 'aggregate_rules' => ['inherit' => 'parent/0:'], + 'aggregate_rules' => ['preset' => 'parent/0:'], ], ], ]); @@ -498,7 +498,7 @@ public function testOverideColumnAggregateRulesPartial(): void [ 'name' => 'Child name', 'aggregate_rules' => [ - 'inherit' => 'parent/0:', + 'preset' => 'parent/0:', 'sum_max' => 4200, 'sum_min' => 1, ], @@ -525,12 +525,12 @@ public function testOverideColumnAggregateRulesPartial(): void public function testRealParent(): void { - $schema = new Schema('./tests/schemas/inherit/parent.yml'); + $schema = new Schema('./tests/schemas/preset/parent.yml'); isSame([ 'name' => 'Parent schema', - 'description' => 'Testing inheritance.', + 'description' => '', 'includes' => [], - 'filename_pattern' => '/parent-\d.csv$/i', + 'filename_pattern' => '/preset-\d.csv$/i', 'csv' => [ 'header' => false, 'delimiter' => 'd', @@ -578,14 +578,14 @@ public function testRealParent(): void public function testRealChild(): void { - $schema = new Schema('./tests/schemas/inherit/child.yml'); + $schema = new Schema('./tests/schemas/preset/child.yml'); isSame([ 'name' => 'Child schema', - 'description' => 'Testing inheritance from parent schema.', + 'description' => '', 'includes' => [ - 'parent' => PROJECT_ROOT . '/tests/schemas/inherit/parent.yml', + 'preset' => PROJECT_ROOT . 
'/tests/schemas/preset/parent.yml', ], - 'filename_pattern' => '/parent-\d.csv$/i', + 'filename_pattern' => '/preset-\d.csv$/i', 'csv' => [ 'header' => true, 'delimiter' => 'd', @@ -709,12 +709,12 @@ public function testRealChild(): void public function testRealChildOfChild(): void { - $schema = new Schema('./tests/schemas/inherit/child-of-child.yml'); + $schema = new Schema('./tests/schemas/preset/child-of-child.yml'); isSame([ 'name' => 'Child of child schema', - 'description' => 'Testing inheritance from child schema.', + 'description' => '', 'includes' => [ - 'parent-1_0' => PROJECT_ROOT . '/tests/schemas/inherit/child.yml', + 'preset-1' => PROJECT_ROOT . '/tests/schemas/preset/child.yml', ], 'filename_pattern' => '/child-of-child-\d.csv$/i', 'csv' => [ diff --git a/tests/SchemaTest.php b/tests/SchemaTest.php index 19aebad6..c5529888 100644 --- a/tests/SchemaTest.php +++ b/tests/SchemaTest.php @@ -195,7 +195,7 @@ public function testValidateValidSchemaFixtures(): void { $schemas = (new Finder()) ->in(PROJECT_ROOT . '/tests/schemas') - ->in(PROJECT_ROOT . '/tests/schemas/inherit') + ->in(PROJECT_ROOT . '/tests/schemas/preset') ->in(PROJECT_ROOT . '/tests/Benchmarks') ->in(PROJECT_ROOT . '/schema-examples') ->name('*.yml') diff --git a/tests/schemas/inherit/child-of-child.yml b/tests/schemas/preset/child-of-child.yml similarity index 80% rename from tests/schemas/inherit/child-of-child.yml rename to tests/schemas/preset/child-of-child.yml index c03e6ce5..8285e285 100644 --- a/tests/schemas/inherit/child-of-child.yml +++ b/tests/schemas/preset/child-of-child.yml @@ -14,17 +14,16 @@ name: Child of child schema -description: Testing inheritance from child schema. 
includes: - parent-1_0: child.yml + preset-1: child.yml filename_pattern: /child-of-child-\d.csv$/i csv: - inherit: parent-1_0 + preset: preset-1 delimiter: dd quote_char: qq enclosure: ee @@ -33,8 +32,8 @@ csv: structural_rules: - inherit: parent-1_0 + preset: preset-1 allow_extra_columns: false columns: - - inherit: parent-1_0/Second Column + - preset: preset-1/Second Column diff --git a/tests/schemas/inherit/child.yml b/tests/schemas/preset/child.yml similarity index 74% rename from tests/schemas/inherit/child.yml rename to tests/schemas/preset/child.yml index 73d67dde..0d5e2069 100644 --- a/tests/schemas/inherit/child.yml +++ b/tests/schemas/preset/child.yml @@ -14,50 +14,49 @@ name: Child schema -description: Testing inheritance from parent schema. includes: - parent: ./../inherit/parent.yml + preset: ./../preset/parent.yml filename_pattern: - inherit: parent + preset: preset csv: - inherit: parent + preset: preset header: true structural_rules: - inherit: parent + preset: preset strict_column_order: true columns: # 0 - - inherit: parent/Name + - preset: preset/Name # 1 - - inherit: parent/Name + - preset: preset/Name name: Overridden name by column name # 2 - - inherit: 'parent/0:' + - preset: 'preset/0:' name: Overridden name by column index # 3 - - inherit: parent/0:Name + - preset: preset/0:Name name: Overridden name by column index and column name # 4 - - inherit: parent/0:Name + - preset: preset/0:Name name: Overridden name by column index and column name + added rules rules: length_min: 1 # 5 - - inherit: parent/0:Name + - preset: preset/0:Name name: Overridden name by column index and column name + added aggregate rules aggregate_rules: nth_num: [ 10, 0.05 ] @@ -65,12 +64,12 @@ columns: # 6 - name: Overridden only rules rules: - inherit: parent/0:Name + preset: preset/0:Name # 7 - name: Overridden only aggregation rules aggregate_rules: - inherit: parent/0:Name + preset: preset/0:Name # 8 - - inherit: parent/Second Column + - preset: preset/Second 
Column diff --git a/tests/schemas/inherit/parent.yml b/tests/schemas/preset/parent.yml similarity index 93% rename from tests/schemas/inherit/parent.yml rename to tests/schemas/preset/parent.yml index 4c47f6a8..37bca64f 100644 --- a/tests/schemas/inherit/parent.yml +++ b/tests/schemas/preset/parent.yml @@ -13,9 +13,8 @@ # This schema is invalid because does not match the CSV file (tests/fixtures/demo.csv). name: Parent schema -description: Testing inheritance. -filename_pattern: /parent-\d.csv$/i +filename_pattern: /preset-\d.csv$/i csv: header: false From eb877dc0f5fd50e49aa0d5a624cf7760f1345807 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 04:01:39 +0400 Subject: [PATCH 06/24] Refactor 'includes' to 'presets' across files The term 'includes' was replaced with 'presets' in multiple files including the SchemaDataPrep class, test files and schemas. This change was made to improve code readability and adherence to design decisions. The updates also ensure that tests properly validate the usage of the new terminology. 
--- src/SchemaDataPrep.php | 16 ++++----- src/Validators/ValidatorSchema.php | 10 +++--- tests/SchemaPresetTest.php | 48 ++++++++++++------------- tests/schemas/preset/child-of-child.yml | 5 +-- tests/schemas/preset/child.yml | 6 +--- 5 files changed, 39 insertions(+), 46 deletions(-) diff --git a/src/SchemaDataPrep.php b/src/SchemaDataPrep.php index c3b94dfe..ffa72d31 100644 --- a/src/SchemaDataPrep.php +++ b/src/SchemaDataPrep.php @@ -75,7 +75,7 @@ public function buildData(): Data $result = [ 'name' => $this->buildName(), 'description' => $this->buildDescription(), - 'includes' => $this->buildIncludes(), + 'presets' => $this->buildPresets(), 'filename_pattern' => $this->buildByKey('filename_pattern')[0], 'csv' => $this->buildByKey('csv'), 'structural_rules' => $this->buildByKey('structural_rules'), @@ -114,25 +114,25 @@ public static function validateAlias(string $alias): void */ private function prepareAliases(AbstractData $data): array { - $includes = []; + $presets = []; - foreach ($data->getArray('includes') as $alias => $includedPathOrArray) { + foreach ($data->getArray('presets') as $alias => $includedPathOrArray) { $alias = (string)$alias; self::validateAlias($alias); if (\is_array($includedPathOrArray)) { - $includes[$alias] = new Schema($includedPathOrArray); + $presets[$alias] = new Schema($includedPathOrArray); } elseif (\file_exists($includedPathOrArray)) { - $includes[$alias] = (new Schema($includedPathOrArray)); + $presets[$alias] = (new Schema($includedPathOrArray)); } elseif (\file_exists("{$this->basepath}/{$includedPathOrArray}")) { - $includes[$alias] = (new Schema("{$this->basepath}/{$includedPathOrArray}")); + $presets[$alias] = (new Schema("{$this->basepath}/{$includedPathOrArray}")); } else { throw new \InvalidArgumentException("Unknown included file: \"{$includedPathOrArray}\""); } } - return $includes; + return $presets; } private function getParentSchema(string $alias): Schema @@ -144,7 +144,7 @@ private function getParentSchema(string 
$alias): Schema throw new \InvalidArgumentException("Unknown included alias: \"{$alias}\""); } - private function buildIncludes(): array + private function buildPresets(): array { $result = []; foreach ($this->aliases as $alias => $schema) { diff --git a/src/Validators/ValidatorSchema.php b/src/Validators/ValidatorSchema.php index 634b08c9..3b551e5a 100644 --- a/src/Validators/ValidatorSchema.php +++ b/src/Validators/ValidatorSchema.php @@ -147,18 +147,18 @@ private static function validateMeta( $errors = new ErrorSuite(); $actualMetaAsArray = $actualMeta->getArrayCopy(); - $actualIncludes = $actualMetaAsArray['includes'] ?? []; - unset($expectedMeta['includes'], $actualMetaAsArray['includes']); + $actualPresets = $actualMetaAsArray['presets'] ?? []; + unset($expectedMeta['presets'], $actualMetaAsArray['presets']); $metaErrors = Utils::compareArray($expectedMeta, $actualMetaAsArray, 'meta', '.'); - foreach ($actualIncludes as $alias => $includedFile) { + foreach ($actualPresets as $alias => $includedFile) { if ($alias === '') { - $errors->addError(new Error('includes', 'Defined alias is empty')); + $errors->addError(new Error('presets', 'Defined alias is empty')); } if (!\is_string($includedFile)) { - $errors->addError(new Error('includes', 'Included filepath must be a string')); + $errors->addError(new Error('presets', 'Included filepath must be a string')); } } diff --git a/tests/SchemaPresetTest.php b/tests/SchemaPresetTest.php index 1e287016..77b01f7c 100644 --- a/tests/SchemaPresetTest.php +++ b/tests/SchemaPresetTest.php @@ -26,7 +26,7 @@ public function testDefaults(): void isSame([ 'name' => '', 'description' => '', - 'includes' => [], + 'presets' => [], 'filename_pattern' => '', 'csv' => [ 'header' => true, @@ -51,7 +51,7 @@ public function testOverideDefaults(): void $schema = new Schema([ 'name' => 'Qwerty', 'description' => 'Some description.', - 'includes' => [], + 'presets' => [], 'filename_pattern' => '/.*/i', 'csv' => [ 'header' => false, @@ -74,7 
+74,7 @@ public function testOverideDefaults(): void isSame([ 'name' => 'Qwerty', 'description' => 'Some description.', - 'includes' => [], + 'presets' => [], 'filename_pattern' => '/.*/i', 'csv' => [ 'header' => false, @@ -114,7 +114,7 @@ public function testOverideDefaults(): void public function testOverideFilenamePattern(): void { $schema = new Schema([ - 'includes' => [ + 'presets' => [ 'parent' => ['filename_pattern' => '/.*/i'], ], 'filename_pattern' => [ @@ -129,7 +129,7 @@ public function testOverideFilenamePattern(): void public function testOverideCsvFull(): void { $schema = new Schema([ - 'includes' => [ + 'presets' => [ 'parent' => [ 'csv' => [ 'header' => false, @@ -159,7 +159,7 @@ public function testOverideCsvFull(): void public function testOverideCsvPartial(): void { $schema = new Schema([ - 'includes' => [ + 'presets' => [ 'parent' => [ 'csv' => [ 'header' => false, @@ -190,7 +190,7 @@ public function testOverideCsvPartial(): void public function testOverideStructuralRulesFull(): void { $schema = new Schema([ - 'includes' => [ + 'presets' => [ 'parent' => [ 'structural_rules' => [ 'strict_column_order' => false, @@ -214,7 +214,7 @@ public function testOverideStructuralRulesFull(): void public function testOverideStructuralRulesPartial1(): void { $schema = new Schema([ - 'includes' => [ + 'presets' => [ 'parent' => [ 'structural_rules' => [ 'strict_column_order' => true, @@ -238,7 +238,7 @@ public function testOverideStructuralRulesPartial1(): void public function testOverideStructuralRulesPartial2(): void { $schema = new Schema([ - 'includes' => ['parent' => ['structural_rules' => []]], + 'presets' => ['parent' => ['structural_rules' => []]], 'structural_rules' => [ 'preset' => 'parent', 'allow_extra_columns' => true, @@ -273,8 +273,8 @@ public function testOverideColumnFull(): void ]; $schema = new Schema([ - 'includes' => ['parent' => ['columns' => [$parentColum0, $parentColum1]]], - 'columns' => [ + 'presets' => ['parent' => ['columns' => 
[$parentColum0, $parentColum1]]], + 'columns' => [ ['preset' => 'parent/0'], ['preset' => 'parent/1'], ['preset' => 'parent/0:'], @@ -312,8 +312,8 @@ public function testOverideColumnPartial(): void ]; $schema = new Schema([ - 'includes' => ['parent' => ['columns' => [$parentColum]]], - 'columns' => [ + 'presets' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ [ 'preset' => 'parent/Name', 'name' => 'Child name', @@ -362,8 +362,8 @@ public function testOverideColumnRulesFull(): void ]; $schema = new Schema([ - 'includes' => ['parent' => ['columns' => [$parentColum]]], - 'columns' => [ + 'presets' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ [ 'name' => 'Child name', 'rules' => ['preset' => 'parent/0:'], @@ -405,8 +405,8 @@ public function testOverideColumnRulesPartial(): void ]; $schema = new Schema([ - 'includes' => ['parent' => ['columns' => [$parentColum]]], - 'columns' => [ + 'presets' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ [ 'name' => 'Child name', 'rules' => [ @@ -452,8 +452,8 @@ public function testOverideColumnAggregateRulesFull(): void ]; $schema = new Schema([ - 'includes' => ['parent' => ['columns' => [$parentColum]]], - 'columns' => [ + 'presets' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ [ 'name' => 'Child name', 'aggregate_rules' => ['preset' => 'parent/0:'], @@ -493,8 +493,8 @@ public function testOverideColumnAggregateRulesPartial(): void ]; $schema = new Schema([ - 'includes' => ['parent' => ['columns' => [$parentColum]]], - 'columns' => [ + 'presets' => ['parent' => ['columns' => [$parentColum]]], + 'columns' => [ [ 'name' => 'Child name', 'aggregate_rules' => [ @@ -529,7 +529,7 @@ public function testRealParent(): void isSame([ 'name' => 'Parent schema', 'description' => '', - 'includes' => [], + 'presets' => [], 'filename_pattern' => '/preset-\d.csv$/i', 'csv' => [ 'header' => false, @@ -582,7 +582,7 @@ public function testRealChild(): void isSame([ 'name' => 
'Child schema', 'description' => '', - 'includes' => [ + 'presets' => [ 'preset' => PROJECT_ROOT . '/tests/schemas/preset/parent.yml', ], 'filename_pattern' => '/preset-\d.csv$/i', @@ -713,7 +713,7 @@ public function testRealChildOfChild(): void isSame([ 'name' => 'Child of child schema', 'description' => '', - 'includes' => [ + 'presets' => [ 'preset-1' => PROJECT_ROOT . '/tests/schemas/preset/child.yml', ], 'filename_pattern' => '/child-of-child-\d.csv$/i', diff --git a/tests/schemas/preset/child-of-child.yml b/tests/schemas/preset/child-of-child.yml index 8285e285..eb816933 100644 --- a/tests/schemas/preset/child-of-child.yml +++ b/tests/schemas/preset/child-of-child.yml @@ -15,13 +15,11 @@ name: Child of child schema -includes: +presets: preset-1: child.yml - filename_pattern: /child-of-child-\d.csv$/i - csv: preset: preset-1 delimiter: dd @@ -30,7 +28,6 @@ csv: encoding: utf-32 bom: false - structural_rules: preset: preset-1 allow_extra_columns: false diff --git a/tests/schemas/preset/child.yml b/tests/schemas/preset/child.yml index 0d5e2069..91553e0d 100644 --- a/tests/schemas/preset/child.yml +++ b/tests/schemas/preset/child.yml @@ -15,24 +15,20 @@ name: Child schema -includes: +presets: preset: ./../preset/parent.yml - filename_pattern: preset: preset - csv: preset: preset header: true - structural_rules: preset: preset strict_column_order: true - columns: # 0 - preset: preset/Name From aadc22cf33baf7b1900304a960b147667f525c21 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 04:03:00 +0400 Subject: [PATCH 07/24] Refactor 'includes' to 'presets' across files The term 'includes' was replaced with 'presets' in multiple files including the SchemaDataPrep class, test files and schemas. This change was made to improve code readability and adherence to design decisions. The updates also ensure that tests properly validate the usage of the new terminology. 
--- README.md | 4 ++-- schema-examples/full.json | 4 ++-- schema-examples/full.php | 4 ++-- schema-examples/full.yml | 4 ++-- schema-examples/full_clean.yml | 4 ++-- 5 files changed, 10 insertions(+), 10 deletions(-) diff --git a/README.md b/README.md index 0cdc1540..73b860e8 100644 --- a/README.md +++ b/README.md @@ -292,8 +292,8 @@ description: | # Any description of the CSV file. Not u supporting a wide range of data validation rules from basic type checks to complex regex validations. This example serves as a comprehensive guide for creating robust CSV file validations. -includes: - parent-alias: ./readme_sample.yml # Include another schema and define an alias for it. +presets: + preset-alias: ./readme_sample.yml # Include another schema and define an alias for it. # Regular expression to match the file name. If not set, then no pattern check. diff --git a/schema-examples/full.json b/schema-examples/full.json index 097af6d1..19c49468 100644 --- a/schema-examples/full.json +++ b/schema-examples/full.json @@ -2,8 +2,8 @@ "name" : "CSV Blueprint Schema Example", "description" : "This YAML file provides a detailed description and validation rules for CSV files\nto be processed by CSV Blueprint tool. It includes specifications for file name patterns,\nCSV formatting options, and extensive validation criteria for individual columns and their values,\nsupporting a wide range of data validation rules from basic type checks to complex regex validations.\nThis example serves as a comprehensive guide for creating robust CSV file validations.\n", - "includes" : { - "parent-alias" : ".\/readme_sample.yml" + "presets" : { + "preset-alias" : ".\/readme_sample.yml" }, "filename_pattern" : "\/demo(-\\d+)?\\.csv$\/i", diff --git a/schema-examples/full.php b/schema-examples/full.php index 7e0e66db..4d68c6fd 100644 --- a/schema-examples/full.php +++ b/schema-examples/full.php @@ -23,8 +23,8 @@ This example serves as a comprehensive guide for creating robust CSV file validations. 
', - 'includes' => [ - 'parent-alias' => './readme_sample.yml', + 'presets' => [ + 'preset-alias' => './readme_sample.yml', ], 'filename_pattern' => '/demo(-\\d+)?\\.csv$/i', diff --git a/schema-examples/full.yml b/schema-examples/full.yml index 5d1dbfec..600af5cd 100644 --- a/schema-examples/full.yml +++ b/schema-examples/full.yml @@ -22,8 +22,8 @@ description: | # Any description of the CSV file. Not u supporting a wide range of data validation rules from basic type checks to complex regex validations. This example serves as a comprehensive guide for creating robust CSV file validations. -includes: - parent-alias: ./readme_sample.yml # Include another schema and define an alias for it. +presets: + preset-alias: ./readme_sample.yml # Include another schema and define an alias for it. # Regular expression to match the file name. If not set, then no pattern check. diff --git a/schema-examples/full_clean.yml b/schema-examples/full_clean.yml index b40ad3d1..51f66445 100644 --- a/schema-examples/full_clean.yml +++ b/schema-examples/full_clean.yml @@ -21,8 +21,8 @@ description: | supporting a wide range of data validation rules from basic type checks to complex regex validations. This example serves as a comprehensive guide for creating robust CSV file validations. -includes: - parent-alias: ./readme_sample.yml +presets: + preset-alias: ./readme_sample.yml filename_pattern: '/demo(-\d+)?\.csv$/i' From 367d9c0d18b2056c2debf041997750f2c6a4b041 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 14:39:49 +0400 Subject: [PATCH 08/24] Add new sample and usage schemas and update ContainsNone rule Two new schemas for sample and usage presets were added under the 'schema-examples' directory. These presets contain common validation rules for user data. In the ContainsNone rule, the error message was updated for better clarity by changing the phrase from containing "any of the following" to "the string". 
This makes it clear that a particular string is being contained, which improves readability and understandability of error messages. --- README.md | 2 +- schema-examples/preset_samples.yml | 134 ++++++++++++++++++++++++++ schema-examples/preset_usage.yml | 49 ++++++++++ src/Rules/Cell/ContainsNone.php | 4 +- src/Utils.php | 24 +++++ tests/Rules/Cell/ContainsNoneTest.php | 4 +- tests/SchemaTest.php | 3 +- tests/UtilsTest.php | 2 +- tests/schemas/todo.yml | 3 + 9 files changed, 218 insertions(+), 7 deletions(-) create mode 100644 schema-examples/preset_samples.yml create mode 100644 schema-examples/preset_usage.yml diff --git a/README.md b/README.md index 73b860e8..01182edd 100644 --- a/README.md +++ b/README.md @@ -15,7 +15,7 @@ [![Static Badge](https://img.shields.io/badge/Rules-118-green?label=Cell%20rules&labelColor=blue&color=gray)](src/Rules/Cell) [![Static Badge](https://img.shields.io/badge/Rules-206-green?label=Aggregate%20rules&labelColor=blue&color=gray)](src/Rules/Aggregate) [![Static Badge](https://img.shields.io/badge/Rules-8-green?label=Extra%20checks&labelColor=blue&color=gray)](#extra-checks) -[![Static Badge](https://img.shields.io/badge/Rules-17/11/20-green?label=Plan%20to%20add&labelColor=gray&color=gray)](tests/schemas/todo.yml) +[![Static Badge](https://img.shields.io/badge/Rules-20/11/20-green?label=Plan%20to%20add&labelColor=gray&color=gray)](tests/schemas/todo.yml) A console utility designed for validating CSV files against a strictly defined schema and validation rules outlined diff --git a/schema-examples/preset_samples.yml b/schema-examples/preset_samples.yml new file mode 100644 index 00000000..e7f1ad7a --- /dev/null +++ b/schema-examples/preset_samples.yml @@ -0,0 +1,134 @@ +# +# JBZoo Toolbox - Csv-Blueprint. +# +# This file is part of the JBZoo Toolbox project. +# For the full copyright and license information, please view the LICENSE +# file that was distributed with this source code. 
+# +# @license MIT +# @copyright Copyright (C) JBZoo.com, All rights reserved. +# @see https://github.com/JBZoo/Csv-Blueprint +# + +name: Common presets for user data +description: | + This schema contains common presets for user data. + It can be used as a base for other schemas. + +filename_pattern: /users-.*\.csv$/i + +csv: + delimiter: ';' + +columns: + - name: id + description: A unique identifier, usually used to denote a primary key in databases + example: 12345 + rules: + not_empty: true + is_trimmed: true + is_int: true + num_min: 1 + aggregate_rules: + is_unique: true + sorted: [ asc, numeric ] + + - name: profile_status + description: User's profile status in database. Enum + example: active + rules: + not_empty: true + allow_values: [ active, inactive, pending, deleted ] + + - name: login + description: User's login name + example: johndoe + rules: + not_empty: true + is_trimmed: true + is_lowercase: true + is_slug: true + length_min: 3 + length_max: 20 + is_alnum: true + aggregate_rules: + is_unique: true + + - name: password + description: User's password + example: '9RfzE$8NKD' + rules: + not_empty: true + is_trimmed: true + regex: /^[a-zA-Z\d!@#$%^&*()_+\-=\[\]{};':"\\|,.<>\/?~]{6,}$/ # Safe list of special characters for passwords + contains_none: [ "password", "123456", "qwerty", " " ] + charset: UTF-8 + length_min: 6 + length_max: 20 + + - name: full_name + description: User's full name + example: 'John Doe Smith' + rules: + not_empty: true + is_trimmed: true + charset: UTF-8 + contains: " " + word_count_min: 2 + word_count_max: 8 + is_capitalize: true + aggregate_rules: + is_unique: true + + - name: email + description: User's email address + example: user@example.com + rules: + not_empty: true + is_trimmed: true + is_email: true + is_lowercase: true + aggregate_rules: + is_unique: true + + - name: birthday + description: Validates the user's birthday. + example: '1990-01-01' + rules: + not_empty: true # The birthday field must not be empty. 
+ is_trimmed: true # Trims the value before validation. + date_format: Y-m-d # Checks if the date matches the YYYY-MM-DD format. + is_date: true # Validates if the value is a valid date. + date_age_greater: 0 # Ensures the date is in the past. + date_age_less: 150 # Ensures the user is not older than 150 years. + date_max: now # Ensures the date is not in the future. + + - name: phone_number + description: User's phone number in US + example: '+1 650 253 00 00' + rules: + not_empty: true + is_trimmed: true + starts_with: '+1' + phone: US + + - name: balance + description: User's balance in USD + example: '100.00' + rules: + not_empty: true + is_trimmed: true + is_float: true + num_min: 0.00 + num_max: 1000000000.00 # 1 billion + precision: 2 + + + - name: short_description + description: A brief description of the item + example: 'Lorem ipsum dolor sit amet' + rules: + not_empty: true + contains: " " + length_max: 255 + is_trimmed: true diff --git a/schema-examples/preset_usage.yml b/schema-examples/preset_usage.yml new file mode 100644 index 00000000..120e63c7 --- /dev/null +++ b/schema-examples/preset_usage.yml @@ -0,0 +1,49 @@ +# +# JBZoo Toolbox - Csv-Blueprint. +# +# This file is part of the JBZoo Toolbox project. +# For the full copyright and license information, please view the LICENSE +# file that was distributed with this source code. +# +# @license MIT +# @copyright Copyright (C) JBZoo.com, All rights reserved. +# @see https://github.com/JBZoo/Csv-Blueprint +# + +name: +description: | + This schema contains common presets for user data. + It can be used as a base for other schemas. 
+ +presets: + users: ./preset_samples.yml + +filename_pattern: + preset: users + +csv: + preset: users + enclosure: '|' # Overridden value + +columns: + - preset: users/id + - preset: users/login + - preset: users/password + - preset: users/email + - preset: users/full_name + - preset: users/birthday + - preset: users/balance + + - preset: users/short_description + rules: + length_max: 255 # Overridden value + + - preset: users/phone_number + name: phone + + - name: admin_note + description: Admin note + rules: + not_empty: true + length_min: 1 + length_max: 10 diff --git a/src/Rules/Cell/ContainsNone.php b/src/Rules/Cell/ContainsNone.php index 43a50af6..b8c9cdd5 100644 --- a/src/Rules/Cell/ContainsNone.php +++ b/src/Rules/Cell/ContainsNone.php @@ -41,8 +41,8 @@ public function validateRule(string $cellValue): ?string foreach ($exclusions as $exclusion) { if (\strpos($cellValue, $exclusion) !== false) { - return "Value \"{$cellValue}\" must not contain any of the following: " . - Utils::printList($exclusions, 'green'); + return "Value \"{$cellValue}\" must not contain the string: " . + Utils::printList($exclusion, 'green'); } } diff --git a/src/Utils.php b/src/Utils.php index 315691cb..b3ee0d77 100644 --- a/src/Utils.php +++ b/src/Utils.php @@ -204,9 +204,33 @@ public static function compareArray( ): array { $differences = []; + // Exclude array params for some rules because it's not necessary to compare them. + // They have random values, and it's hard to predict them. + $excludeArrayParamsFor = [ + 'rules.contains_none', + 'rules.allow_values', + 'rules.not_allow_values', + 'rules.contains_none', + 'rules.contains_one', + 'rules.contains_any', + 'rules.contains_all', + 'rules.ip_v4_range', + ]; + foreach ($actualSchema as $key => $value) { $curPath = $path === '' ? (string)$key : "{$path}.{$key}"; + if (\in_array($curPath, $excludeArrayParamsFor, true)) { + if (!\is_array($value)) { + $differences[$columnId . '/' . 
$curPath] = [ + $columnId, + 'Expected type "array", actual "' . \gettype($value) . '" in ' . + ".{$keyPrefix}.{$curPath}", + ]; + } + continue; + } + if (!\array_key_exists($key, $expectedSchema)) { if (\strlen($keyPrefix) <= 1) { $message = "Unknown key: .{$curPath}"; diff --git a/tests/Rules/Cell/ContainsNoneTest.php b/tests/Rules/Cell/ContainsNoneTest.php index 6e83d742..b34f011c 100644 --- a/tests/Rules/Cell/ContainsNoneTest.php +++ b/tests/Rules/Cell/ContainsNoneTest.php @@ -45,13 +45,13 @@ public function testNegative(): void $rule = $this->create(['a', 'b', 'c']); isSame( - 'Value "a" must not contain any of the following: ["a", "b", "c"]', + 'Value "a" must not contain the string: "a"', $rule->test('a'), ); $rule = $this->create(['a', 'b', 'c']); isSame( - 'Value "ddddb" must not contain any of the following: ["a", "b", "c"]', + 'Value "ddddb" must not contain the string: "b"', $rule->test('ddddb'), ); } diff --git a/tests/SchemaTest.php b/tests/SchemaTest.php index c5529888..308102a5 100644 --- a/tests/SchemaTest.php +++ b/tests/SchemaTest.php @@ -209,7 +209,8 @@ public function testValidateValidSchemaFixtures(): void foreach ($schemas as $schemaFile) { $filepath = $schemaFile->getPathname(); - isSame('', (string)(new Schema($filepath))->validate(), $filepath); + $validated = (new Schema($filepath))->validate()->render(ErrorSuite::RENDER_TABLE); + isSame('', (string)$validated, "{$filepath}\n----------\n{$validated}"); } } diff --git a/tests/UtilsTest.php b/tests/UtilsTest.php index 88c364aa..8c3cadbc 100644 --- a/tests/UtilsTest.php +++ b/tests/UtilsTest.php @@ -125,7 +125,7 @@ public function testColorsTags(): void $tags = \explode( '|', - '|i|c|q|e' . + 'i|c|q|e' . '|comment|info|error|question' . '|black|red|green|yellow|blue|magenta|cyan|white|default' . 
'|bl|b|u|r|bg', diff --git a/tests/schemas/todo.yml b/tests/schemas/todo.yml index 1ac62c24..1a4f68fc 100644 --- a/tests/schemas/todo.yml +++ b/tests/schemas/todo.yml @@ -42,6 +42,9 @@ columns: rules: is_null: true # see csv.empty_values and column.empty_values + password_strength: 3 # 0-4 + is_password: true # /^[a-zA-Z\d!@#$%^&*()_+\-=\[\]{};':"\\|,.<>\/?~]{8,}$/ + _list: true # Example: starts_with_list: [ 'a', 'b', 'c' ] # identifier is_bsn: true # Validates a Dutch citizen service number (BSN). From 733d78d14512e2f1d7d833b705e61c56a0a5543c Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 14:40:59 +0400 Subject: [PATCH 09/24] Update preset_usage.yml with more descriptive names and explanations The 'name' and 'description' fields in preset_usage.yml have been updated for better clarity. The 'name' field now specifies that it's a "Real schema with presets" and the 'description' field explains how the schema uses and overrides presets. These changes will aid in understanding the purpose and functionality of the presets. --- schema-examples/preset_usage.yml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/schema-examples/preset_usage.yml b/schema-examples/preset_usage.yml index 120e63c7..691779a4 100644 --- a/schema-examples/preset_usage.yml +++ b/schema-examples/preset_usage.yml @@ -10,10 +10,10 @@ # @see https://github.com/JBZoo/Csv-Blueprint # -name: +name: Real schema with presets for user profiles. description: | - This schema contains common presets for user data. - It can be used as a base for other schemas. + This schema uses presets for user data. + Also, it demonstrates how to override preset values. presets: users: ./preset_samples.yml From 650ce569049b894510763035db7482bd8c812b81 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 15:54:14 +0400 Subject: [PATCH 10/24] Refactor scripts to increase CSV schemas manageability The codebase has been refactored to allow the reuse of schemas through presets. 
This introduces flexibility, reduces redundancy, and enhances maintainability. The updated approach simplifies schema setup, propagates centralized updates to all dependent schemas, and maintains consistency across various CSV files and schemas. A comprehensive example demonstrating all available preset features has been included. --- README.md | 309 ++++++++++++++++++ schema-examples/preset_database.yml | 34 ++ schema-examples/preset_features.yml | 65 ++++ schema-examples/preset_usage.yml | 37 ++- .../{preset_samples.yml => preset_users.yml} | 24 +- tests/ReadmeTest.php | 61 ++++ tests/Tools.php | 20 +- 7 files changed, 507 insertions(+), 43 deletions(-) create mode 100644 schema-examples/preset_database.yml create mode 100644 schema-examples/preset_features.yml rename schema-examples/{preset_samples.yml => preset_users.yml} (83%) diff --git a/README.md b/README.md index 01182edd..98b701a3 100644 --- a/README.md +++ b/README.md @@ -28,6 +28,7 @@ specifications, making it invaluable in scenarios where data quality and consist - [Introduction](#introduction) - [Usage](#usage) - [Schema definition](#schema-definition) +- [Presets and reusable schemas](#presets-and-reusable-schemas) - [Complete CLI help message](#complete-cli-help-message) - [Report examples](#report-examples) - [Benchmarks](#benchmarks) @@ -894,6 +895,314 @@ ensure thorough validation of your CSV files. These additional checks further secure the integrity and consistency of your CSV data against the defined validation schema. +## Presets and reusable schemas + +Presets enhance the efficiency and reusability of schema definitions for CSV file validation, streamlining the +validation process across various files and schemas. Their benefits include: + +- **Consistency Across Schemas**: Presets guarantee uniform validation rules for common fields like user IDs, email + addresses, and phone numbers across different CSV files. This consistency is crucial for maintaining data integrity + and reliability. 
+ +- **Ease of Maintenance**: Centralized updates to presets automatically propagate changes to all schemas using them. + This approach eliminates the need to manually update each schema, significantly reducing maintenance efforts. + +- **Flexibility and Customization**: While offering a foundational set of validation rules, presets also allow for + field-specific rule overrides to meet the unique requirements of individual schemas. This ensures a balance between + consistency and customization. + +- **Rapid Development**: Presets facilitate quick schema setup for new CSV files by reusing established + validation rules. This allows for a faster development cycle, focusing on unique fields without redefining common + rules. + +- **Error Reduction**: Utilizing consistent and tested presets reduces the likelihood of errors in manual schema + definitions, leading to improved data quality and reliability. + +- **Efficiency in Large-scale Projects**: In large projects with extensive data volumes, presets provide a standardized + approach to applying common validation logic, simplifying data management and validation tasks. + +Overall, presets offer a compelling solution for anyone involved in CSV file validation, enhancing consistency, maintenance, flexibility, development speed, error minimization, and project efficiency. + + +### Example with presets + +Let's look at a real-life example. Suppose you have a "library" of different user profile validation rules that can be +used in a wide variety of CSV files. + +So that you don't have to worry about integrity or suffer from copy and paste, you can reuse any existing schema. +In fact, this can be considered partial inheritance. + +**Important notes** + - You can make the chain of inheritance infinitely long (if you like to take risks, of course). + - Any of the files can be used alone or as a library. The syntax is the same.
+ - Schemas with presets validate themselves, so any obvious issues will surface when you try to use + the schema. + - Aliases in presets must match the regex pattern + `^[a-z0-9_-]+$`. + + +Let's take a look at what this looks like in code. +- Let's define a couple of basic rules for [database columns](schema-examples/preset_database.yml). +- One of the files will also contain rules specific to [user profiles](schema-examples/preset_users.yml). +- And of course let's [make a schema](schema-examples/preset_usage.yml) that will simultaneously reuse the rules from these two files. + +As a result, you don't just get a bunch of schemas for validation that is difficult to manage, but something like a +framework(!) that will be targeted to the specifics of your project, especially when there are dozens or even hundreds +of CSV files and rules. It will be much easier to achieve consistency, which is often quite important. + +[preset_database.yml](schema-examples/preset_database.yml) + +```yml +name: Common presets for common database columns +description: This schema contains basic rules for database user data. + +columns: + - name: id + description: A unique identifier, usually used to denote a primary key in databases. + example: 12345 + rules: + not_empty: true + is_trimmed: true + is_int: true + num_min: 1 + aggregate_rules: + is_unique: true + sorted: [ asc, numeric ] + + - name: status + description: Status in database + example: active + rules: + not_empty: true + allow_values: [ active, inactive, pending, deleted ] +``` + + +[preset_users.yml](schema-examples/preset_users.yml) + +```yml +name: Common presets for user data +description: This schema contains common presets for user data. It can be used as a base for other schemas.
+ +filename_pattern: /users-.*\.csv$/i + +csv: + delimiter: ';' + +columns: + - name: login + description: User's login name + example: johndoe + rules: + not_empty: true + is_trimmed: true + is_lowercase: true + is_slug: true + length_min: 3 + length_max: 20 + is_alnum: true + aggregate_rules: + is_unique: true + + - name: password + description: User's password + example: '9RfzENKD' + rules: + not_empty: true + is_trimmed: true + regex: /^[a-zA-Z\d!@#$%^&*()_+\-=\[\]{};':"\|,.<>\/?~]{6,}$/ # Safe list of special characters for passwords + contains_none: [ "password", "123456", "qwerty", " " ] + charset: UTF-8 + length_min: 6 + length_max: 20 + + - name: full_name + description: User's full name + example: 'John Doe Smith' + rules: + not_empty: true + is_trimmed: true + charset: UTF-8 + contains: " " + word_count_min: 2 + word_count_max: 8 + is_capitalize: true + aggregate_rules: + is_unique: true + + - name: email + description: User's email address + example: user@example.com + rules: + not_empty: true + is_trimmed: true + is_email: true + is_lowercase: true + aggregate_rules: + is_unique: true + + - name: birthday + description: Validates the user's birthday. + example: '1990-01-01' + rules: + not_empty: true # The birthday field must not be empty. + is_trimmed: true # Trims the value before validation. + date_format: Y-m-d # Checks if the date matches the YYYY-MM-DD format. + is_date: true # Validates if the value is a valid date. + date_age_greater: 0 # Ensures the date is in the past. + date_age_less: 150 # Ensures the user is not older than 150 years. + date_max: now # Ensures the date is not in the future. 
+ + - name: phone_number + description: User's phone number in US + example: '+1 650 253 00 00' + rules: + not_empty: true + is_trimmed: true + starts_with: '+1' + phone: US + + - name: balance + description: User's balance in USD + example: '100.00' + rules: + not_empty: true + is_trimmed: true + is_float: true + num_min: 0.00 + num_max: 1000000000.00 # 1 billion + precision: 2 + + - name: short_description + description: A brief description of the item + example: 'Lorem ipsum dolor sit amet' + rules: + not_empty: true + contains: " " + length_max: 255 + is_trimmed: true +``` + + + +[preset_usage.yml](schema-examples/preset_usage.yml) + +```yml +name: Schema uses presets and add new columns + specific rules. +description: This schema uses presets. Also, it demonstrates how to override preset values. + +presets: # Include any other schemas and defined for each alias + users: ./preset_users.yml # Include the schema with common user data + db: ./preset_database.yml # Include the schema with basic database columns + +filename_pattern: + preset: users # Take the filename pattern from the preset + +csv: + preset: users # Take the CSV settings from the preset + enclosure: '|' # Overridden value + +columns: + - preset: db/id + - preset: db/status + - preset: users/login + - preset: users/email + - preset: users/full_name + - preset: users/birthday + + - preset: users/password + rules: + length_min: 10 # Overridden value to force a strong password + + - name: user_balance + rules: + preset: users/balance # Take only rules from the preset + + - preset: users/short_description + rules: + length_max: 255 # Overridden value + + - name: phone # Overridden value + preset: users/phone_number + + - name: admin_note + description: Admin note + rules: + not_empty: true + length_min: 1 + length_max: 10 + aggregate_rules: # In practice this will be a rare case, but the opportunity is there. + preset: db/id # Take only aggregate rules from the preset. 
+ is_unique: true # Added new sprcific rule +``` + + +As a result, readability and maintainability became dramatically easier. +You can easily add new rules, change existing, etc. + + +### Complete example with all available syntax of presets + + +```yml +name: Complite list of preset features +description: This schema contains all the features of the presets. + +presets: + # The basepath for the preset is `.` (current directory) + # Define alias "db" for schema in `./preset_database.yml` + db: preset_database.yml # Or just `db: preset_database.yml`. It's up to you. + + # For example, you can use a relative path + users: ./../schema-examples/preset_users.yml + + # Or you can use an absolute path + # db-3: /full/path/preset_database.yml + + # Or you can use an absolute path + # db: /full/path/preset_database.yml + +filename_pattern: + preset: users # Take the filename pattern from the preset + +csv: + preset: users # Take the CSV settings from the preset + +columns: + # Use name of column from the preset. "db" is alias. "id" is column `name` in `preset_database.yml` + - preset: 'db/id' + + # Use column index. "db" is alias. "0" is column index in `preset_database.yml` + - preset: 'db/0' + - preset: 'db/0:' + + # Use column index and column name. It useful if column name is not unique. + - preset: 'db/0:id' + + # Override only `rules` from the preset + - name: My column + rules: + preset: 'db/status' + + # Override only `aggregate_rules` from the preset + - name: My column + aggregate_rules: + preset: 'db/0:id' + + # Combo. If you're a risk-taker or have a high level of inner zen. :) + # Creating a column from three other columns. In fact, it will merge all three at once with key replacement. + - name: Crazy combo! 
+ example: ~ + preset: 'users/login' + rules: + preset: 'users/email' + aggregate_rules: + preset: 'db/0' +``` + + + + ## Complete CLI help message This section outlines all available options and commands provided by the tool, leveraging the JBZoo/Cli package for its diff --git a/schema-examples/preset_database.yml b/schema-examples/preset_database.yml new file mode 100644 index 00000000..13d2723f --- /dev/null +++ b/schema-examples/preset_database.yml @@ -0,0 +1,34 @@ +# +# JBZoo Toolbox - Csv-Blueprint. +# +# This file is part of the JBZoo Toolbox project. +# For the full copyright and license information, please view the LICENSE +# file that was distributed with this source code. +# +# @license MIT +# @copyright Copyright (C) JBZoo.com, All rights reserved. +# @see https://github.com/JBZoo/Csv-Blueprint +# + +name: Presets for database columns +description: This schema contains basic rules for database user data. + +columns: + - name: id + description: A unique identifier, usually used to denote a primary key in databases. + example: 12345 + rules: + not_empty: true + is_trimmed: true + is_int: true + num_min: 1 + aggregate_rules: + is_unique: true + sorted: [ asc, numeric ] + + - name: status + description: Status in database + example: active + rules: + not_empty: true + allow_values: [ active, inactive, pending, deleted ] diff --git a/schema-examples/preset_features.yml b/schema-examples/preset_features.yml new file mode 100644 index 00000000..34e22436 --- /dev/null +++ b/schema-examples/preset_features.yml @@ -0,0 +1,65 @@ +# +# JBZoo Toolbox - Csv-Blueprint. +# +# This file is part of the JBZoo Toolbox project. +# For the full copyright and license information, please view the LICENSE +# file that was distributed with this source code. +# +# @license MIT +# @copyright Copyright (C) JBZoo.com, All rights reserved. 
+# @see https://github.com/JBZoo/Csv-Blueprint +# + +name: Complite list of preset features +description: This schema contains all the features of the presets. + +presets: + # The basepath for the preset is `.` (current directory) + # Define alias "db" for schema in `./preset_database.yml` + db: preset_database.yml # Or just `db: preset_database.yml`. It's up to you. + + # For example, you can use a relative path + users: ./../schema-examples/preset_users.yml + + # Or you can use an absolute path + # db-3: /full/path/preset_database.yml + + # Or you can use an absolute path + # db: /full/path/preset_database.yml + +filename_pattern: + preset: users # Take the filename pattern from the preset + +csv: + preset: users # Take the CSV settings from the preset + +columns: + # Use name of column from the preset. "db" is alias. "id" is column `name` in `preset_database.yml` + - preset: 'db/id' + + # Use column index. "db" is alias. "0" is column index in `preset_database.yml` + - preset: 'db/0' + - preset: 'db/0:' + + # Use column index and column name. It useful if column name is not unique. + - preset: 'db/0:id' + + # Override only `rules` from the preset + - name: My column + rules: + preset: 'db/status' + + # Override only `aggregate_rules` from the preset + - name: My column + aggregate_rules: + preset: 'db/0:id' + + # Combo. If you're a risk-taker or have a high level of inner zen. :) + # Creating a column from three other columns. In fact, it will merge all three at once with key replacement. + - name: Crazy combo! + example: ~ + preset: 'users/login' + rules: + preset: 'users/email' + aggregate_rules: + preset: 'db/0' diff --git a/schema-examples/preset_usage.yml b/schema-examples/preset_usage.yml index 691779a4..4bdae994 100644 --- a/schema-examples/preset_usage.yml +++ b/schema-examples/preset_usage.yml @@ -10,36 +10,42 @@ # @see https://github.com/JBZoo/Csv-Blueprint # -name: Real schema with presets for user profiles. 
-description: | - This schema uses presets for user data. - Also, it demonstrates how to override preset values. +name: Schema uses presets and add new columns + specific rules. +description: This schema uses presets. Also, it demonstrates how to override preset values. -presets: - users: ./preset_samples.yml +presets: # Include any other schemas and defined for each alias + users: ./preset_users.yml # Include the schema with common user data + db: ./preset_database.yml # Include the schema with basic database columns filename_pattern: - preset: users + preset: users # Take the filename pattern from the preset csv: - preset: users - enclosure: '|' # Overridden value + preset: users # Take the CSV settings from the preset + enclosure: '|' # Overridden value columns: - - preset: users/id + - preset: db/id + - preset: db/status - preset: users/login - - preset: users/password - preset: users/email - preset: users/full_name - preset: users/birthday - - preset: users/balance + + - preset: users/password + rules: + length_min: 10 # Overridden value to force a strong password + + - name: user_balance + rules: + preset: users/balance # Take only rules from the preset - preset: users/short_description rules: length_max: 255 # Overridden value - - preset: users/phone_number - name: phone + - name: phone # Overridden value + preset: users/phone_number - name: admin_note description: Admin note @@ -47,3 +53,6 @@ columns: not_empty: true length_min: 1 length_max: 10 + aggregate_rules: # In practice this will be a rare case, but the opportunity is there. + preset: db/id # Take only aggregate rules from the preset. 
+ is_unique: true # Added new sprcific rule diff --git a/schema-examples/preset_samples.yml b/schema-examples/preset_users.yml similarity index 83% rename from schema-examples/preset_samples.yml rename to schema-examples/preset_users.yml index e7f1ad7a..75782854 100644 --- a/schema-examples/preset_samples.yml +++ b/schema-examples/preset_users.yml @@ -11,9 +11,7 @@ # name: Common presets for user data -description: | - This schema contains common presets for user data. - It can be used as a base for other schemas. +description: This schema contains common presets for user data. It can be used as a base for other schemas. filename_pattern: /users-.*\.csv$/i @@ -21,25 +19,6 @@ csv: delimiter: ';' columns: - - name: id - description: A unique identifier, usually used to denote a primary key in databases - example: 12345 - rules: - not_empty: true - is_trimmed: true - is_int: true - num_min: 1 - aggregate_rules: - is_unique: true - sorted: [ asc, numeric ] - - - name: profile_status - description: User's profile status in database. 
Enum - example: active - rules: - not_empty: true - allow_values: [ active, inactive, pending, deleted ] - - name: login description: User's login name example: johndoe @@ -123,7 +102,6 @@ columns: num_max: 1000000000.00 # 1 billion precision: 2 - - name: short_description description: A brief description of the item example: 'Lorem ipsum dolor sit amet' diff --git a/tests/ReadmeTest.php b/tests/ReadmeTest.php index a1b9505e..6d26b334 100644 --- a/tests/ReadmeTest.php +++ b/tests/ReadmeTest.php @@ -16,6 +16,7 @@ namespace JBZoo\PHPUnit; +use JBZoo\CsvBlueprint\SchemaDataPrep; use JBZoo\Utils\Cli; use JBZoo\Utils\Str; use Symfony\Component\Console\Input\StringInput; @@ -168,6 +169,66 @@ public function testCheckSimpleYmlSchemaExampleInReadme(): void Tools::insertInReadme('readme-sample-yml', $text); } + public function testCheckPresetUsersExampleInReadme(): void + { + $ymlContent = \implode( + "\n", + \array_slice(\explode("\n", \file_get_contents('./schema-examples/preset_users.yml')), 12), + ); + + $text = \implode("\n", ['```yml', \trim($ymlContent), '```']); + + Tools::insertInReadme('preset-users-yml', $text); + } + + public function testCheckPresetFeaturesExampleInReadme(): void + { + $ymlContent = \implode( + "\n", + \array_slice(\explode("\n", \file_get_contents('./schema-examples/preset_features.yml')), 12), + ); + + $text = \implode("\n", ['```yml', \trim($ymlContent), '```']); + + Tools::insertInReadme('preset-features-yml', $text); + } + + public function testCheckPresetRegexInReadme(): void + { + $ymlContent = \implode( + "\n", + \array_slice(\explode("\n", \file_get_contents('./schema-examples/preset_features.yml')), 12), + ); + + $text = SchemaDataPrep::getAliasRegex(); + + Tools::insertInReadme('preset-regex', "`{$text}`", true); + } + + public function testCheckPresetDatabaseExampleInReadme(): void + { + $ymlContent = \implode( + "\n", + \array_slice(\explode("\n", \file_get_contents('./schema-examples/preset_database.yml')), 12), + ); + + $text = 
\implode("\n", ['```yml', \trim($ymlContent), '```']); + + Tools::insertInReadme('preset-database-yml', $text); + } + + public function testCheckPresetUsageExampleInReadme(): void + { + $ymlContent = \implode( + "\n", + \array_slice(\explode("\n", \file_get_contents('./schema-examples/preset_usage.yml')), 12), + ); + + $text = \implode("\n", ['```yml', \trim($ymlContent), '```']); + + Tools::insertInReadme('preset-usage-yml', $text); + } + public function testAdditionalValidationRules(): void { $list[] = ''; diff --git a/tests/Tools.php b/tests/Tools.php index f05d65b2..f6c990d3 100644 --- a/tests/Tools.php +++ b/tests/Tools.php @@ -108,18 +108,26 @@ public static function getAggregateRule( return ['columns' => [['name' => $columnName, 'aggregate_rules' => [$ruleName => $options]]]]; } - public static function insertInReadme(string $code, string $content): void + public static function insertInReadme(string $code, string $content, bool $isInline = false): void { isFile(self::README); $prefix = 'auto-update:'; isFileContains("<!-- {$prefix}{$code} -->", self::README); isFileContains("<!-- {$prefix}/{$code} -->", self::README); - $replacement = \implode("\n", [ - "<!-- {$prefix}{$code} -->", - \trim($content), - "<!-- {$prefix}/{$code} -->", - ]); + if ($isInline) { + $replacement = \implode('', [ + "<!-- {$prefix}{$code} -->", + \trim($content), + "<!-- {$prefix}/{$code} -->", + ]); + } else { + $replacement = \implode("\n", [ + "<!-- {$prefix}{$code} -->", + \trim($content), + "<!-- {$prefix}/{$code} -->", + ]); + } $result = \preg_replace( "/<\\!-- {$prefix}{$code} -->(.*?)<\\!-- {$prefix}\\/{$code} -->/s", From dcb15c5b122b5c765bedbbdaf64a88fd7ad3ac2d Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 15:55:28 +0400 Subject: [PATCH 11/24] Update preset regex and simplify Readme test Removed the unnecessary YML content manipulation from the Readme test in ReadmeTest.php. Furthermore, updated the regex pattern for alias in presets in the README.md file to case insensitive, easing the matching process. Improved the readability of the description for the preset_database.yml.
--- README.md | 4 ++-- tests/ReadmeTest.php | 6 ------ 2 files changed, 2 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index 98b701a3..9b8b2463 100644 --- a/README.md +++ b/README.md @@ -938,7 +938,7 @@ In fact, this can be considered as partial inheritance. - Schemas with presets validate themselves and if there are any obvious issues, you will see them when you try to use the schema. - Alias in presets must match the regex pattern - `^[a-z0-9_-]+$`. + `/^[a-z0-9-_]+$/i`. Let's take a look at what this looks like in code. @@ -953,7 +953,7 @@ of CSV files and rules. It will be much easier to achieve consistency. Very ofte [preset_database.yml](schema-examples/preset_database.yml) ```yml -name: Common presets for common database columns +name: Presets for database columns description: This schema contains basic rules for database user data. columns: diff --git a/tests/ReadmeTest.php b/tests/ReadmeTest.php index 6d26b334..16553d7f 100644 --- a/tests/ReadmeTest.php +++ b/tests/ReadmeTest.php @@ -195,13 +195,7 @@ public function testCheckPresetFeaturesExampleInReadme(): void public function testCheckPresetRegexInReadme(): void { - $ymlContent = \implode( - "\n", - \array_slice(\explode("\n", \file_get_contents('./schema-examples/preset_features.yml')), 12), - ); - $text = SchemaDataPrep::getAliasRegex(); - Tools::insertInReadme('preset-regex', "`{$text}`", true); } From eaa839ba57c43db2e73e4fb456cf163a2df85f8d Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 15:58:37 +0400 Subject: [PATCH 12/24] Update preset regex and simplify Readme test Removed the unnecessary YML content manipulation from the Readme test in ReadmeTest.php. Furthermore, updated the regex pattern for alias in presets in the README.md file to case insensitive, easing the matching process. Improved the readability of the description for the preset_database.yml. 
--- README.md | 3 ++- tests/ReadmeTest.php | 2 +- tests/Tools.php | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 9b8b2463..5dc0b2f1 100644 --- a/README.md +++ b/README.md @@ -938,7 +938,8 @@ In fact, this can be considered as partial inheritance. - Schemas with presets validate themselves and if there are any obvious issues, you will see them when you try to use the schema. - Alias in presets must match the regex pattern - `/^[a-z0-9-_]+$/i`. + `"/^[a-z0-9-_]+$/i"` . + Otherwise, it might break the syntax. Let's take a look at what this looks like in code. diff --git a/tests/ReadmeTest.php b/tests/ReadmeTest.php index 16553d7f..bf88009f 100644 --- a/tests/ReadmeTest.php +++ b/tests/ReadmeTest.php @@ -196,7 +196,7 @@ public function testCheckPresetFeaturesExampleInReadme(): void public function testCheckPresetRegexInReadme(): void { $text = SchemaDataPrep::getAliasRegex(); - Tools::insertInReadme('preset-regex', "`{$text}`", true); + Tools::insertInReadme('preset-regex', " `\"{$text}\"` ", true); } public function testCheckPresetDatabaseExampleInReadme(): void diff --git a/tests/Tools.php b/tests/Tools.php index f6c990d3..06a07dfb 100644 --- a/tests/Tools.php +++ b/tests/Tools.php @@ -118,7 +118,7 @@ public static function insertInReadme(string $code, string $content, bool $isInl if ($isInline) { $replacement = \implode('', [ "<!-- {$prefix}{$code} -->", - \trim($content), + $content, "<!-- {$prefix}/{$code} -->", ]); } else { From 18ed6983a6fcb62bf3c0b6e97c537946d0e37ea5 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 16:24:23 +0400 Subject: [PATCH 13/24] Refine schema examples, README, and readability of tests This commit streamlines the instructions and readability in the preset schema examples and README.md for improved comprehensibility. Unnecessary manipulations are removed from the Readme test, reducing visual clutter.
Additionally, minor modifications have been made to the alias regex pattern in presets for better matching, enhancing the overall user experience. This update aims to reduce complexities and support better user understanding of pattern usage. --- README.md | 82 ++++++++++++++++------------- schema-examples/preset_features.yml | 36 +++++++------ schema-examples/preset_usage.yml | 2 +- schema-examples/preset_users.yml | 4 +- tests/ReadmeTest.php | 2 +- 5 files changed, 69 insertions(+), 57 deletions(-) diff --git a/README.md b/README.md index 5dc0b2f1..3743a932 100644 --- a/README.md +++ b/README.md @@ -900,25 +900,20 @@ These additional checks further secure the integrity and consistency of your CSV Presets enhance the efficiency and reusability of schema definitions for CSV file validation, streamlining the validation process across various files and schemas. Their benefits include: -- **Consistency Across Schemas**: Presets guarantee uniform validation rules for common fields like user IDs, email +- **Consistency Across Schemas:** Presets guarantee uniform validation rules for common fields like user IDs, email addresses, and phone numbers across different CSV files. This consistency is crucial for maintaining data integrity and reliability. - -- **Ease of Maintenance**: Centralized updates to presets automatically propagate changes to all schemas using them. +- **Ease of Maintenance:** Centralized updates to presets automatically propagate changes to all schemas using them. This approach eliminates the need to manually update each schema, significantly reducing maintenance efforts. - -- **Flexibility and Customization**: While offering a foundational set of validation rules, presets also allow for +- **Flexibility and Customization:** While offering a foundational set of validation rules, presets also allow for field-specific rule overrides to meet the unique requirements of individual schemas. This ensures a balance between consistency and customization. 
- -- **Rapid Development**: Presets facilitate quick schema setup for new CSV files by reusing established +- **Rapid Development:** Presets facilitate quick schema setup for new CSV files by reusing established validation rules. This allows for a faster development cycle, focusing on unique fields without redefining common rules. - -- **Error Reduction**: Utilizing consistent and tested presets reduces the likelihood of errors in manual schema +- **Error Reduction:** Utilizing consistent and tested presets reduces the likelihood of errors in manual schema definitions, leading to improved data quality and reliability. - -- **Efficiency in Large-scale Projects**: In large projects with extensive data volumes, presets provide a standardized +- **Efficiency in Large-scale Projects:** In large projects with extensive data volumes, presets provide a standardized approach to applying common validation logic, simplifying data management and validation tasks. Overall, presets offer a compelling solution for anyone involved in CSV file validation, enhancing consistency, maintenance, flexibility, development speed, error minimization, and project efficiency. @@ -929,16 +924,19 @@ Overall, presets offer a compelling solution for anyone involved in CSV file val Let's look at a real life example. Suppose you have a "library" of different user profile validation rules that can be used in a wide variety of CSV files. -In order not to care about integrity and not to suffer from copy and paste, you can reuse any existing schema. +In order not to care about integrity and not to suffer from copy and paste, you can reuse ANY(!) existing schema. In fact, this can be considered as partial inheritance. **Important notes** - - You can make the chain of inheritance infinitely long (of course if you like to take risks). - - Any of the files can be used alone or as a library. The syntax is the same. + - You can make the chain of inheritance infinitely long. + I.e. 
make chains of the form `grant-parent.yml` -> `parent.yml` -> `child.yml` -> `grandchild.yml` -> `great-grandchild.yml` -> etc. + Of course if you like to take risks ;). + - Any(!) of the schema files can be used alone or as a library. The syntax is the same. - Schemas with presets validate themselves and if there are any obvious issues, you will see them when you try to use - the schema. + the schema. But logical conflicts between rules are not checked (It's almost impossible from a code perspective). + As mentioned above, rules work in isolation and are not aware of each other. So the set of rules is your responsibility as always. - Alias in presets must match the regex pattern - `"/^[a-z0-9-_]+$/i"` . + "/^[a-z0-9-_]+$/i". Otherwise, it might break the syntax. @@ -951,7 +949,7 @@ As a result, you don't just get a bunch of schemas for validation, which is diff framework(!) that will be targeted to the specifics of your project, especially when there are dozens or even hundreds of CSV files and rules. It will be much easier to achieve consistency. Very often it's quite important. -[preset_database.yml](schema-examples/preset_database.yml) +[Database preset](schema-examples/preset_database.yml) ```yml name: Presets for database columns @@ -979,11 +977,13 @@ columns: ``` -[preset_users.yml](schema-examples/preset_users.yml) +[User data preset](schema-examples/preset_users.yml) ```yml name: Common presets for user data -description: This schema contains common presets for user data. It can be used as a base for other schemas. +description: > + This schema contains common presets for user data. + It can be used as a base for other schemas. filename_pattern: /users-.*\.csv$/i @@ -1086,7 +1086,7 @@ columns: -[preset_usage.yml](schema-examples/preset_usage.yml) +[Usage of presets](schema-examples/preset_usage.yml) ```yml name: Schema uses presets and add new columns + specific rules. 
@@ -1126,7 +1126,7 @@ columns: - name: phone # Overridden value preset: users/phone_number - - name: admin_note + - name: admin_note # New column specific only this schema description: Admin note rules: not_empty: true @@ -1142,7 +1142,7 @@ As a result, readability and maintainability became dramatically easier. You can easily add new rules, change existing, etc. -### Complete example with all available syntax of presets +### Complete example with all available syntax ```yml @@ -1150,58 +1150,64 @@ name: Complite list of preset features description: This schema contains all the features of the presets. presets: - # The basepath for the preset is `.` (current directory) - # Define alias "db" for schema in `./preset_database.yml` - db: preset_database.yml # Or just `db: preset_database.yml`. It's up to you. + # The basepath for the preset is `.` (current directory of the current schema file). + # Define alias "db" for schema in `./preset_database.yml`. + db: preset_database.yml # Or `db: ./preset_database.yml`. It's up to you. - # For example, you can use a relative path + # For example, you can use a relative path. users: ./../schema-examples/preset_users.yml - # Or you can use an absolute path - # db-3: /full/path/preset_database.yml - - # Or you can use an absolute path + # Or you can use an absolute path. # db: /full/path/preset_database.yml filename_pattern: - preset: users # Take the filename pattern from the preset + preset: users # Take the filename pattern from the preset. csv: - preset: users # Take the CSV settings from the preset + preset: users # Take the CSV settings from the preset. columns: - # Use name of column from the preset. "db" is alias. "id" is column `name` in `preset_database.yml` + # Use name of column from the preset. + # "db" is alias. "id" is column `name` in `preset_database.yml`. - preset: 'db/id' - # Use column index. "db" is alias. "0" is column index in `preset_database.yml` + # Use column index. "db" is alias. 
"0" is column index in `preset_database.yml`. - preset: 'db/0' - preset: 'db/0:' # Use column index and column name. It useful if column name is not unique. - preset: 'db/0:id' - # Override only `rules` from the preset + # Use only `rules` of "status" column from the preset. - name: My column rules: preset: 'db/status' - # Override only `aggregate_rules` from the preset + # Override only `aggregate_rules` from the preset. + # Use only `aggregate_rules` of "id" column from the preset. + # We strictly take only the very first column (index = 0). - name: My column aggregate_rules: preset: 'db/0:id' - # Combo. If you're a risk-taker or have a high level of inner zen. :) + # Combo!!! If you're a risk-taker or have a high level of inner zen. :) # Creating a column from three other columns. In fact, it will merge all three at once with key replacement. - name: Crazy combo! - example: ~ + description: > # Just a great advice. + I like to take risks, too. + Be careful. Use your power wisely. + example: ~ # Ignore inherited "example" value. Set it `null`. preset: 'users/login' rules: preset: 'users/email' + not_empty: true # Disable the rule from the preset. aggregate_rules: preset: 'db/0' ``` +**Note:** All provided YAML examples pass built-in validation, yet they may not make practical sense. +These are intended solely for demonstration and to illustrate potential configurations and features. ## Complete CLI help message diff --git a/schema-examples/preset_features.yml b/schema-examples/preset_features.yml index 34e22436..37f33b7b 100644 --- a/schema-examples/preset_features.yml +++ b/schema-examples/preset_features.yml @@ -14,52 +14,56 @@ name: Complite list of preset features description: This schema contains all the features of the presets. presets: - # The basepath for the preset is `.` (current directory) - # Define alias "db" for schema in `./preset_database.yml` - db: preset_database.yml # Or just `db: preset_database.yml`. It's up to you. 
+ # The basepath for the preset is `.` (current directory of the current schema file). + # Define alias "db" for schema in `./preset_database.yml`. + db: preset_database.yml # Or `db: ./preset_database.yml`. It's up to you. - # For example, you can use a relative path + # For example, you can use a relative path. users: ./../schema-examples/preset_users.yml - # Or you can use an absolute path - # db-3: /full/path/preset_database.yml - - # Or you can use an absolute path + # Or you can use an absolute path. # db: /full/path/preset_database.yml filename_pattern: - preset: users # Take the filename pattern from the preset + preset: users # Take the filename pattern from the preset. csv: - preset: users # Take the CSV settings from the preset + preset: users # Take the CSV settings from the preset. columns: - # Use name of column from the preset. "db" is alias. "id" is column `name` in `preset_database.yml` + # Use name of column from the preset. + # "db" is alias. "id" is column `name` in `preset_database.yml`. - preset: 'db/id' - # Use column index. "db" is alias. "0" is column index in `preset_database.yml` + # Use column index. "db" is alias. "0" is column index in `preset_database.yml`. - preset: 'db/0' - preset: 'db/0:' # Use column index and column name. It useful if column name is not unique. - preset: 'db/0:id' - # Override only `rules` from the preset + # Use only `rules` of "status" column from the preset. - name: My column rules: preset: 'db/status' - # Override only `aggregate_rules` from the preset + # Override only `aggregate_rules` from the preset. + # Use only `aggregate_rules` of "id" column from the preset. + # We strictly take only the very first column (index = 0). - name: My column aggregate_rules: preset: 'db/0:id' - # Combo. If you're a risk-taker or have a high level of inner zen. :) + # Combo!!! If you're a risk-taker or have a high level of inner zen. :) # Creating a column from three other columns. 
In fact, it will merge all three at once with key replacement. - name: Crazy combo! - example: ~ + description: > # Just a great advice. + I like to take risks, too. + Be careful. Use your power wisely. + example: ~ # Ignore inherited "example" value. Set it `null`. preset: 'users/login' rules: preset: 'users/email' + not_empty: true # Disable the rule from the preset. aggregate_rules: preset: 'db/0' diff --git a/schema-examples/preset_usage.yml b/schema-examples/preset_usage.yml index 4bdae994..f0f22e34 100644 --- a/schema-examples/preset_usage.yml +++ b/schema-examples/preset_usage.yml @@ -47,7 +47,7 @@ columns: - name: phone # Overridden value preset: users/phone_number - - name: admin_note + - name: admin_note # New column specific only this schema description: Admin note rules: not_empty: true diff --git a/schema-examples/preset_users.yml b/schema-examples/preset_users.yml index 75782854..1d7a263e 100644 --- a/schema-examples/preset_users.yml +++ b/schema-examples/preset_users.yml @@ -11,7 +11,9 @@ # name: Common presets for user data -description: This schema contains common presets for user data. It can be used as a base for other schemas. +description: > + This schema contains common presets for user data. + It can be used as a base for other schemas. 
filename_pattern: /users-.*\.csv$/i diff --git a/tests/ReadmeTest.php b/tests/ReadmeTest.php index bf88009f..8fe79622 100644 --- a/tests/ReadmeTest.php +++ b/tests/ReadmeTest.php @@ -196,7 +196,7 @@ public function testCheckPresetFeaturesExampleInReadme(): void public function testCheckPresetRegexInReadme(): void { $text = SchemaDataPrep::getAliasRegex(); - Tools::insertInReadme('preset-regex', " `\"{$text}\"` ", true); + Tools::insertInReadme('preset-regex', "\"{$text}\"", true); } public function testCheckPresetDatabaseExampleInReadme(): void From d98fd8465287bc8411e478c1864575f8a6ecf843 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 16:28:49 +0400 Subject: [PATCH 14/24] Improve README and schema example clarity This commit refines text and instructions in README.md and the preset schema examples to enhance understanding and reduce complexity. Changes include removing unnecessary elements from the README text, and altering the regex pattern in the schema examples for improved match accuracy. --- README.md | 14 +++++++------- schema-examples/preset_usage.yml | 2 +- schema-examples/preset_users.yml | 4 ++-- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/README.md b/README.md index 3743a932..05a4709a 100644 --- a/README.md +++ b/README.md @@ -927,7 +927,7 @@ used in a wide variety of CSV files. In order not to care about integrity and not to suffer from copy and paste, you can reuse ANY(!) existing schema. In fact, this can be considered as partial inheritance. -**Important notes** +Important notes - You can make the chain of inheritance infinitely long. I.e. make chains of the form `grant-parent.yml` -> `parent.yml` -> `child.yml` -> `grandchild.yml` -> `great-grandchild.yml` -> etc. Of course if you like to take risks ;). @@ -941,9 +941,9 @@ In fact, this can be considered as partial inheritance. Let's take a look at what this looks like in code. 
-- Let's define a couple of basic rules for [database columns](schema-examples/preset_database.yml). -- And also one of the files will contain rules specific only to the [users profile](schema-examples/preset_users.yml). -- And of course let's [make a schema](schema-examples/preset_usage.yml) that will simultaneously reuse the rules from these two files. +- Define a couple of basic rules for [database columns](schema-examples/preset_database.yml). +- Also, one of the files will contain rules specific only to the [users profile](schema-examples/preset_users.yml). +- And of course, let's [make a schema](schema-examples/preset_usage.yml) that will simultaneously reuse the rules from these two files. As a result, you don't just get a bunch of schemas for validation, which is difficult to manage, but something like a framework(!) that will be targeted to the specifics of your project, especially when there are dozens or even hundreds @@ -1011,7 +1011,7 @@ columns: rules: not_empty: true is_trimmed: true - regex: /^[a-zA-Z\d!@#$%^&*()_+\-=\[\]{};':"\|,.<>\/?~]{6,}$/ # Safe list of special characters for passwords + regex: /^[a-zA-Z\d!@#$%^&*()_+\-=\[\]{};':"\|,.<>\/?~]{6,}$/ # Safe list of special characters for passwords. contains_none: [ "password", "123456", "qwerty", " " ] charset: UTF-8 length_min: 6 @@ -1071,7 +1071,7 @@ columns: is_trimmed: true is_float: true num_min: 0.00 - num_max: 1000000000.00 # 1 billion + num_max: 1000000000.00 # 1 billion is max amount in our system. 
precision: 2 - name: short_description @@ -1104,13 +1104,13 @@ csv: enclosure: '|' # Overridden value columns: + # Grap only needed columns from the preset in specific order - preset: db/id - preset: db/status - preset: users/login - preset: users/email - preset: users/full_name - preset: users/birthday - - preset: users/password rules: length_min: 10 # Overridden value to force a strong password diff --git a/schema-examples/preset_usage.yml b/schema-examples/preset_usage.yml index f0f22e34..b33628b8 100644 --- a/schema-examples/preset_usage.yml +++ b/schema-examples/preset_usage.yml @@ -25,13 +25,13 @@ csv: enclosure: '|' # Overridden value columns: + # Grap only needed columns from the preset in specific order - preset: db/id - preset: db/status - preset: users/login - preset: users/email - preset: users/full_name - preset: users/birthday - - preset: users/password rules: length_min: 10 # Overridden value to force a strong password diff --git a/schema-examples/preset_users.yml b/schema-examples/preset_users.yml index 1d7a263e..87227d74 100644 --- a/schema-examples/preset_users.yml +++ b/schema-examples/preset_users.yml @@ -41,7 +41,7 @@ columns: rules: not_empty: true is_trimmed: true - regex: /^[a-zA-Z\d!@#$%^&*()_+\-=\[\]{};':"\\|,.<>\/?~]{6,}$/ # Safe list of special characters for passwords + regex: /^[a-zA-Z\d!@#$%^&*()_+\-=\[\]{};':"\\|,.<>\/?~]{6,}$/ # Safe list of special characters for passwords. contains_none: [ "password", "123456", "qwerty", " " ] charset: UTF-8 length_min: 6 @@ -101,7 +101,7 @@ columns: is_trimmed: true is_float: true num_min: 0.00 - num_max: 1000000000.00 # 1 billion + num_max: 1000000000.00 # 1 billion is max amount in our system. 
precision: 2 - name: short_description From 95fb68890ccb920b74e4cdffe53997d83364a110 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 16:36:56 +0400 Subject: [PATCH 15/24] Improve README and schema example clarity This commit refines text and instructions in README.md and the preset schema examples to enhance understanding and reduce complexity. Changes include removing unnecessary elements from the README text, and altering the regex pattern in the schema examples for improved match accuracy. --- README.md | 18 +++++++----------- schema-examples/preset_features.yml | 3 ++- schema-examples/preset_usage.yml | 15 +++++---------- 3 files changed, 14 insertions(+), 22 deletions(-) diff --git a/README.md b/README.md index 05a4709a..2f3e8089 100644 --- a/README.md +++ b/README.md @@ -1111,22 +1111,17 @@ columns: - preset: users/email - preset: users/full_name - preset: users/birthday + + # Just a bit changed column from the preset - preset: users/password rules: length_min: 10 # Overridden value to force a strong password - - name: user_balance - rules: - preset: users/balance # Take only rules from the preset - - - preset: users/short_description - rules: - length_max: 255 # Overridden value - - - name: phone # Overridden value + - name: phone # Overridden name of the column preset: users/phone_number - - name: admin_note # New column specific only this schema + # New column specific only this schema + - name: admin_note description: Admin note rules: not_empty: true @@ -1191,7 +1186,8 @@ columns: preset: 'db/0:id' # Combo!!! If you're a risk-taker or have a high level of inner zen. :) - # Creating a column from three other columns. In fact, it will merge all three at once with key replacement. + # Creating a column from three other columns. + # In fact, it will merge all three at once with key replacement. - name: Crazy combo! description: > # Just a great advice. I like to take risks, too. 
diff --git a/schema-examples/preset_features.yml b/schema-examples/preset_features.yml index 37f33b7b..25495186 100644 --- a/schema-examples/preset_features.yml +++ b/schema-examples/preset_features.yml @@ -55,7 +55,8 @@ columns: preset: 'db/0:id' # Combo!!! If you're a risk-taker or have a high level of inner zen. :) - # Creating a column from three other columns. In fact, it will merge all three at once with key replacement. + # Creating a column from three other columns. + # In fact, it will merge all three at once with key replacement. - name: Crazy combo! description: > # Just a great advice. I like to take risks, too. diff --git a/schema-examples/preset_usage.yml b/schema-examples/preset_usage.yml index b33628b8..7ac9b55b 100644 --- a/schema-examples/preset_usage.yml +++ b/schema-examples/preset_usage.yml @@ -32,22 +32,17 @@ columns: - preset: users/email - preset: users/full_name - preset: users/birthday + + # Just a bit changed column from the preset - preset: users/password rules: length_min: 10 # Overridden value to force a strong password - - name: user_balance - rules: - preset: users/balance # Take only rules from the preset - - - preset: users/short_description - rules: - length_max: 255 # Overridden value - - - name: phone # Overridden value + - name: phone # Overridden name of the column preset: users/phone_number - - name: admin_note # New column specific only this schema + # New column specific only this schema + - name: admin_note description: Admin note rules: not_empty: true From 41bfee7499072bbb1fa14ce9c3d34fc16a56f475 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 21:44:36 +0400 Subject: [PATCH 16/24] Refine preset usage in schema examples and update related instructions The preset usage in schema examples has been improved across json, php, and yml files. The corresponding documentation in README.md has been updated for better clarity and understanding. 
This change also fixes an issue with regex pattern matching while reducing complexity in schema preset utilization. --- README.md | 47 +++++++++-------- schema-examples/full.json | 9 +++- schema-examples/full.php | 8 ++- schema-examples/full.yml | 15 ++++-- schema-examples/full_clean.yml | 9 +++- schema-examples/preset_database.yml | 2 +- schema-examples/preset_usage.yml | 32 ++++++------ tests/ExampleSchemasTest.php | 79 ++++++++++++++++++++++++++--- tests/ReadmeTest.php | 4 +- 9 files changed, 148 insertions(+), 57 deletions(-) diff --git a/README.md b/README.md index 2f3e8089..ce28b23a 100644 --- a/README.md +++ b/README.md @@ -293,19 +293,20 @@ description: | # Any description of the CSV file. Not u supporting a wide range of data validation rules from basic type checks to complex regex validations. This example serves as a comprehensive guide for creating robust CSV file validations. -presets: - preset-alias: ./readme_sample.yml # Include another schema and define an alias for it. - +presets: # Include another schema and define an alias for it. + my-preset: ./preset_users.yml # Define preset alias "my-preset". See README.md for details. # Regular expression to match the file name. If not set, then no pattern check. # This allows you to pre-validate the file name before processing its contents. # Feel free to check parent directories as well. # See: https://www.php.net/manual/en/reference.pcre.pattern.syntax.php filename_pattern: /demo(-\d+)?\.csv$/i +# preset: my-preset # See README.md for details. # Here are default values to parse CSV file. # You can skip this section if you don't need to override the default values. csv: + preset: my-preset # See README.md for details. header: true # If the first row is a header. If true, name of each column is required. delimiter: , # Delimiter character in CSV file. quote_char: \ # Quote character in CSV file. @@ -317,6 +318,7 @@ csv: # They are not(!) related to the data in the columns. 
# You can skip this section if you don't need to override the default values. structural_rules: # Here are default values. + preset: my-preset # See README.md for details. strict_column_order: true # Ensure columns in CSV follow the same order as defined in this YML schema. It works only if "csv.header" is true. allow_extra_columns: false # Allow CSV files to have more columns than specified in this YML schema. @@ -325,7 +327,8 @@ structural_rules: # Here are default values. # This will not affect the validator, but will make it easier for you to navigate. # For convenience, use the first line as a header (if possible). columns: - - name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv.header" is true. + - preset: my-preset/login # Add preset rules for the column. See README.md for details. + name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv.header" is true. description: Lorem ipsum # Description of the column. Not used in the validation process. example: Some example # Example of the column value. Schema will also check this value on its own. @@ -338,6 +341,8 @@ columns: # Data validation for each(!) value in the column. Please, see notes in README.md # Every rule is optional. rules: + preset: my-preset/login # Add preset rules for the column. See README.md for details. + # General rules not_empty: true # Value is not an empty string. Actually checks if the string length is not 0. exact_value: Some string # Exact value for string in the column. @@ -546,6 +551,8 @@ columns: # Data validation for the entire(!) column using different data aggregation methods. # Every rule is optional. aggregate_rules: + preset: my-preset/login # Add preset aggregate rules for the column. See README.md for details. + is_unique: true # All values in the column are unique. # Check if the column is sorted in a specific order. 
@@ -1092,36 +1099,34 @@ columns: name: Schema uses presets and add new columns + specific rules. description: This schema uses presets. Also, it demonstrates how to override preset values. -presets: # Include any other schemas and defined for each alias - users: ./preset_users.yml # Include the schema with common user data - db: ./preset_database.yml # Include the schema with basic database columns +presets: # Include any other schemas and defined for each alias. + users: ./preset_users.yml # Include the schema with common user data. + db: ./preset_database.yml # Include the schema with basic database columns. filename_pattern: - preset: users # Take the filename pattern from the preset + preset: users # Take the filename pattern from the preset. + +structural_rules: # Take the global rules from the preset. + preset: users csv: - preset: users # Take the CSV settings from the preset - enclosure: '|' # Overridden value + preset: users # Take the CSV settings from the preset. + enclosure: '|' # Overridden enclosure only for this schema. columns: - # Grap only needed columns from the preset in specific order + # Grap only needed columns from the preset in specific order. - preset: db/id - preset: db/status - preset: users/login - preset: users/email - preset: users/full_name - preset: users/birthday - - # Just a bit changed column from the preset + - name: phone # Rename the column. "phone_number" => "phone". + preset: users/phone_number - preset: users/password rules: - length_min: 10 # Overridden value to force a strong password - - - name: phone # Overridden name of the column - preset: users/phone_number - - # New column specific only this schema - - name: admin_note + length_min: 10 # Overridden value to force a strong password. + - name: admin_note # New column specific only this schema. 
description: Admin note rules: not_empty: true @@ -1129,7 +1134,7 @@ columns: length_max: 10 aggregate_rules: # In practice this will be a rare case, but the opportunity is there. preset: db/id # Take only aggregate rules from the preset. - is_unique: true # Added new sprcific rule + is_unique: true # Added new specific aggregate rule. ``` diff --git a/schema-examples/full.json b/schema-examples/full.json index 19c49468..8ea8fc5f 100644 --- a/schema-examples/full.json +++ b/schema-examples/full.json @@ -2,13 +2,14 @@ "name" : "CSV Blueprint Schema Example", "description" : "This YAML file provides a detailed description and validation rules for CSV files\nto be processed by CSV Blueprint tool. It includes specifications for file name patterns,\nCSV formatting options, and extensive validation criteria for individual columns and their values,\nsupporting a wide range of data validation rules from basic type checks to complex regex validations.\nThis example serves as a comprehensive guide for creating robust CSV file validations.\n", - "presets" : { - "preset-alias" : ".\/readme_sample.yml" + "presets" : { + "my-preset" : ".\/preset_users.yml" }, "filename_pattern" : "\/demo(-\\d+)?\\.csv$\/i", "csv" : { + "preset" : "my-preset", "header" : true, "delimiter" : ",", "quote_char" : "\\", @@ -18,18 +19,21 @@ }, "structural_rules" : { + "preset" : "my-preset", "strict_column_order" : true, "allow_extra_columns" : false }, "columns" : [ { + "preset" : "my-preset/login", "name" : "Column Name (header)", "description" : "Lorem ipsum", "example" : "Some example", "required" : true, "rules" : { + "preset" : "my-preset/login", "not_empty" : true, "exact_value" : "Some string", "allow_values" : ["y", "n", ""], @@ -170,6 +174,7 @@ "credit_card" : "Any" }, "aggregate_rules" : { + "preset" : "my-preset/login", "is_unique" : true, "sorted" : ["asc", "natural"], diff --git a/schema-examples/full.php b/schema-examples/full.php index 4d68c6fd..95703d45 100644 --- 
a/schema-examples/full.php +++ b/schema-examples/full.php @@ -24,12 +24,13 @@ ', 'presets' => [ - 'preset-alias' => './readme_sample.yml', + 'my-preset' => './preset_users.yml', ], 'filename_pattern' => '/demo(-\\d+)?\\.csv$/i', 'csv' => [ + 'preset' => 'my-preset', 'header' => true, 'delimiter' => ',', 'quote_char' => '\\', @@ -39,18 +40,21 @@ ], 'structural_rules' => [ + 'preset' => 'my-preset', 'strict_column_order' => true, 'allow_extra_columns' => false, ], 'columns' => [ [ + 'preset' => 'my-preset/login', 'name' => 'Column Name (header)', 'description' => 'Lorem ipsum', 'example' => 'Some example', 'required' => true, 'rules' => [ + 'preset' => 'my-preset/login', 'not_empty' => true, 'exact_value' => 'Some string', 'allow_values' => ['y', 'n', ''], @@ -192,6 +196,8 @@ ], 'aggregate_rules' => [ + 'preset' => 'my-preset/login', + 'is_unique' => true, 'sorted' => ['asc', 'natural'], diff --git a/schema-examples/full.yml b/schema-examples/full.yml index 600af5cd..b3c29bd4 100644 --- a/schema-examples/full.yml +++ b/schema-examples/full.yml @@ -22,19 +22,20 @@ description: | # Any description of the CSV file. Not u supporting a wide range of data validation rules from basic type checks to complex regex validations. This example serves as a comprehensive guide for creating robust CSV file validations. -presets: - preset-alias: ./readme_sample.yml # Include another schema and define an alias for it. - +presets: # Include another schema and define an alias for it. + my-preset: ./preset_users.yml # Define preset alias "my-preset". See README.md for details. # Regular expression to match the file name. If not set, then no pattern check. # This allows you to pre-validate the file name before processing its contents. # Feel free to check parent directories as well. # See: https://www.php.net/manual/en/reference.pcre.pattern.syntax.php filename_pattern: /demo(-\d+)?\.csv$/i +# preset: my-preset # See README.md for details. # Here are default values to parse CSV file. 
# You can skip this section if you don't need to override the default values. csv: + preset: my-preset # See README.md for details. header: true # If the first row is a header. If true, name of each column is required. delimiter: , # Delimiter character in CSV file. quote_char: \ # Quote character in CSV file. @@ -46,6 +47,7 @@ csv: # They are not(!) related to the data in the columns. # You can skip this section if you don't need to override the default values. structural_rules: # Here are default values. + preset: my-preset # See README.md for details. strict_column_order: true # Ensure columns in CSV follow the same order as defined in this YML schema. It works only if "csv.header" is true. allow_extra_columns: false # Allow CSV files to have more columns than specified in this YML schema. @@ -54,7 +56,8 @@ structural_rules: # Here are default values. # This will not affect the validator, but will make it easier for you to navigate. # For convenience, use the first line as a header (if possible). columns: - - name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv.header" is true. + - preset: my-preset/login # Add preset rules for the column. See README.md for details. + name: Column Name (header) # Any custom name of the column in the CSV file (first row). Required if "csv.header" is true. description: Lorem ipsum # Description of the column. Not used in the validation process. example: Some example # Example of the column value. Schema will also check this value on its own. @@ -67,6 +70,8 @@ columns: # Data validation for each(!) value in the column. Please, see notes in README.md # Every rule is optional. rules: + preset: my-preset/login # Add preset rules for the column. See README.md for details. + # General rules not_empty: true # Value is not an empty string. Actually checks if the string length is not 0. exact_value: Some string # Exact value for string in the column. 
@@ -275,6 +280,8 @@ columns: # Data validation for the entire(!) column using different data aggregation methods. # Every rule is optional. aggregate_rules: + preset: my-preset/login # Add preset aggregate rules for the column. See README.md for details. + is_unique: true # All values in the column are unique. # Check if the column is sorted in a specific order. diff --git a/schema-examples/full_clean.yml b/schema-examples/full_clean.yml index 51f66445..fc278628 100644 --- a/schema-examples/full_clean.yml +++ b/schema-examples/full_clean.yml @@ -22,11 +22,12 @@ description: | This example serves as a comprehensive guide for creating robust CSV file validations. presets: - preset-alias: ./readme_sample.yml + my-preset: ./preset_users.yml filename_pattern: '/demo(-\d+)?\.csv$/i' csv: + preset: my-preset header: true delimiter: ',' quote_char: \ @@ -35,16 +36,19 @@ csv: bom: false structural_rules: + preset: my-preset strict_column_order: true allow_extra_columns: false columns: - - name: 'Column Name (header)' + - preset: my-preset/login + name: 'Column Name (header)' description: 'Lorem ipsum' example: 'Some example' required: true rules: + preset: my-preset/login not_empty: true exact_value: 'Some string' allow_values: [ 'y', 'n', '' ] @@ -185,6 +189,7 @@ columns: credit_card: Any aggregate_rules: + preset: my-preset/login is_unique: true sorted: [ asc, natural ] first_num_min: 1.0 diff --git a/schema-examples/preset_database.yml b/schema-examples/preset_database.yml index 13d2723f..ec7c8555 100644 --- a/schema-examples/preset_database.yml +++ b/schema-examples/preset_database.yml @@ -15,7 +15,7 @@ description: This schema contains basic rules for database user data. columns: - name: id - description: A unique identifier, usually used to denote a primary key in databases. + description: Unique identifier, usually used to denote a primary key in databases. 
example: 12345 rules: not_empty: true diff --git a/schema-examples/preset_usage.yml b/schema-examples/preset_usage.yml index 7ac9b55b..91d2c918 100644 --- a/schema-examples/preset_usage.yml +++ b/schema-examples/preset_usage.yml @@ -13,36 +13,34 @@ name: Schema uses presets and add new columns + specific rules. description: This schema uses presets. Also, it demonstrates how to override preset values. -presets: # Include any other schemas and defined for each alias - users: ./preset_users.yml # Include the schema with common user data - db: ./preset_database.yml # Include the schema with basic database columns +presets: # Include any other schemas and defined for each alias. + users: ./preset_users.yml # Include the schema with common user data. + db: ./preset_database.yml # Include the schema with basic database columns. filename_pattern: - preset: users # Take the filename pattern from the preset + preset: users # Take the filename pattern from the preset. + +structural_rules: # Take the global rules from the preset. + preset: users csv: - preset: users # Take the CSV settings from the preset - enclosure: '|' # Overridden value + preset: users # Take the CSV settings from the preset. + enclosure: '|' # Overridden enclosure only for this schema. columns: - # Grap only needed columns from the preset in specific order + # Grap only needed columns from the preset in specific order. - preset: db/id - preset: db/status - preset: users/login - preset: users/email - preset: users/full_name - preset: users/birthday - - # Just a bit changed column from the preset + - name: phone # Rename the column. "phone_number" => "phone". + preset: users/phone_number - preset: users/password rules: - length_min: 10 # Overridden value to force a strong password - - - name: phone # Overridden name of the column - preset: users/phone_number - - # New column specific only this schema - - name: admin_note + length_min: 10 # Overridden value to force a strong password. 
+ - name: admin_note # New column specific only this schema. description: Admin note rules: not_empty: true @@ -50,4 +48,4 @@ columns: length_max: 10 aggregate_rules: # In practice this will be a rare case, but the opportunity is there. preset: db/id # Take only aggregate rules from the preset. - is_unique: true # Added new sprcific rule + is_unique: true # Added new specific aggregate rule. diff --git a/tests/ExampleSchemasTest.php b/tests/ExampleSchemasTest.php index 26720b19..82b951f7 100644 --- a/tests/ExampleSchemasTest.php +++ b/tests/ExampleSchemasTest.php @@ -29,6 +29,7 @@ final class ExampleSchemasTest extends TestCase public function testFullListOfRules(): void { $rulesInConfig = yml(Tools::SCHEMA_FULL_YML)->findArray('columns.0.rules'); + unset($rulesInConfig['preset']); $rulesInConfig = \array_keys($rulesInConfig); \sort($rulesInConfig, \SORT_NATURAL); @@ -82,17 +83,75 @@ public function testFullListOfRules(): void ); } + public function testFullListOfAggregateRules(): void + { + $rulesInConfig = yml(Tools::SCHEMA_FULL_YML)->findArray('columns.0.aggregate_rules'); + unset($rulesInConfig['preset']); + $rulesInConfig = \array_keys($rulesInConfig); + \sort($rulesInConfig, \SORT_NATURAL); + + $finder = (new Finder()) + ->files() + ->in(PROJECT_ROOT . 
'/src/Rules/Aggregate') + ->ignoreDotFiles(false) + ->ignoreVCS(true) + ->name('/\\.php$/') + ->sortByName(true); + + foreach ($finder as $file) { + $ruleName = Utils::camelToKebabCase($file->getFilenameWithoutExtension()); + + if (\str_contains($ruleName, 'abstract')) { + continue; + } + + if (\str_contains($ruleName, 'combo_')) { + $ruleName = \str_replace('combo_', '', $ruleName); + $rulesInCode[] = $ruleName; + $rulesInCode[] = "{$ruleName}_min"; + $rulesInCode[] = "{$ruleName}_greater"; + $rulesInCode[] = "{$ruleName}_not"; + $rulesInCode[] = "{$ruleName}_less"; + $rulesInCode[] = "{$ruleName}_max"; + } else { + $rulesInCode[] = $ruleName; + } + } + \sort($rulesInCode, \SORT_NATURAL); + + isSame( + $rulesInCode, + $rulesInConfig, + "New: \n" . \array_reduce( + \array_diff($rulesInConfig, $rulesInCode), + static fn (string $carry, string $item) => $carry . "{$item}: NEW\n", + '', + ), + ); + + isSame( + $rulesInCode, + $rulesInConfig, + "Not exists: \n" . \array_reduce( + \array_diff($rulesInCode, $rulesInConfig), + static fn (string $carry, string $item) => $carry . 
"{$item}: FIXME\n", + '', + ), + ); + } + public function testCsvDefaultValues(): void { - isSame(yml(Tools::SCHEMA_FULL_YML)->findArray('csv'), (new Schema([]))->getCsvParams()); + $full = yml(Tools::SCHEMA_FULL_YML)->findArray('csv'); + unset($full['preset']); + isSame($full, (new Schema([]))->getCsvParams()); } public function testStructuralRules(): void { - isSame( - yml(Tools::SCHEMA_FULL_YML)->findArray('structural_rules'), - (new Schema([]))->getStructuralRulesParams(), - ); + $full = yml(Tools::SCHEMA_FULL_YML)->findArray('structural_rules'); + unset($full['preset']); + isSame($full, (new Schema([]))->getStructuralRulesParams()); } public function testCheckPhpExample(): void @@ -123,8 +182,14 @@ public function testUniqueNameOfRules(): void { $yml = yml(Tools::SCHEMA_FULL_YML); - $rules = \array_keys($yml->findArray('columns.0.rules')); - $agRules = \array_keys($yml->findArray('columns.0.aggregate_rules')); + $rules = $yml->findArray('columns.0.rules'); + unset($rules['preset']); + $rules = \array_keys($rules); + + $agRules = $yml->findArray('columns.0.aggregate_rules'); + unset($agRules['preset']); + $agRules = \array_keys($agRules); + $notUnique = \array_intersect($rules, $agRules); isSame([], $notUnique, 'Rules names should be unique: ' . 
\implode(', ', $notUnique)); diff --git a/tests/ReadmeTest.php b/tests/ReadmeTest.php index 8fe79622..27d6e515 100644 --- a/tests/ReadmeTest.php +++ b/tests/ReadmeTest.php @@ -93,8 +93,8 @@ public function testTableOutputExample(): void public function testBadgeOfRules(): void { - $cellRules = \count(yml(Tools::SCHEMA_FULL_YML)->findArray('columns.0.rules')); - $aggRules = \count(yml(Tools::SCHEMA_FULL_YML)->findArray('columns.0.aggregate_rules')); + $cellRules = \count(yml(Tools::SCHEMA_FULL_YML)->findArray('columns.0.rules')) - 1; + $aggRules = \count(yml(Tools::SCHEMA_FULL_YML)->findArray('columns.0.aggregate_rules')) - 1; $extraRules = \count(self::EXTRA_RULES); $totalRules = $cellRules + $aggRules + $extraRules; From 1f425fcf5a47594609af8f467df3bdd9335ac3b4 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 21:59:17 +0400 Subject: [PATCH 17/24] Handle schema data preparation exceptions Modified Schema.php to handle potential exceptions during schema data preparation. This improves error handling by throwing an InvalidArgumentException with detailed information. Corresponding unit test for handling invalid preset file data has also been added in SchemaPresetTest.php. Minor text refinement was made in README.md. --- README.md | 2 +- src/Schema.php | 9 ++++++++- tests/SchemaPresetTest.php | 10 ++++++++++ 3 files changed, 19 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index ce28b23a..aaa1ac54 100644 --- a/README.md +++ b/README.md @@ -964,7 +964,7 @@ description: This schema contains basic rules for database user data. columns: - name: id - description: A unique identifier, usually used to denote a primary key in databases. + description: Unique identifier, usually used to denote a primary key in databases. 
example: 12345 rules: not_empty: true diff --git a/src/Schema.php b/src/Schema.php index 8da2530a..0592f215 100644 --- a/src/Schema.php +++ b/src/Schema.php @@ -73,7 +73,14 @@ public function __construct(null|array|string $csvSchemaFilenameOrArray = null) $basepath = \dirname($filename); } - $this->data = (new SchemaDataPrep($data, $basepath))->buildData(); + try { + $this->data = (new SchemaDataPrep($data, $basepath))->buildData(); + } catch (\Exception $e) { + throw new \InvalidArgumentException( + "Invalid schema \"{$this->filename}\" data.\nUnexpected error: \"{$e->getMessage()}\"", + ); + } + $this->columns = $this->prepareColumns(); } diff --git a/tests/SchemaPresetTest.php b/tests/SchemaPresetTest.php index 77b01f7c..0bd36deb 100644 --- a/tests/SchemaPresetTest.php +++ b/tests/SchemaPresetTest.php @@ -745,4 +745,14 @@ public function testRealChildOfChild(): void ], $schema->getData()->getArrayCopy()); isSame('', (string)$schema->validate()); } + + public function testInvalidPresetFile(): void + { + $this->expectExceptionMessage( + "Invalid schema \"_custom_array_\" data.\n" . + 'Unexpected error: "Unknown included file: "invalid.yml""', + ); + + $schema = new Schema(['presets' => ['alias' => 'invalid.yml']]); + } } From c21112b24384e1ebff16999997e63d89807da08c Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 22:49:09 +0400 Subject: [PATCH 18/24] Add ability to dump final schema Added an option to dump the final schema of the CSV file for inspection after all includes and inheritance. Implementation involves modification in Commands classes to use this new option, and within the Schema class to convert the schema data to a string in YAML format. Documentation and relevant test files have been updated to reflect this change. 
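Note: a minimal, dependency-free sketch of what "the final schema after all includes and inheritance" could look like for a single column. The column data, the `array_replace_recursive()` merge, and the JSON output are assumptions made for illustration only. The actual merge is performed by `SchemaDataPrep` and may differ in details, and the patch itself dumps YAML through `Schema::dumpAsYamlString()`.

```php
<?php

declare(strict_types=1);

// Hypothetical preset column, standing in for "users/password" from preset_users.yml.
$presetColumn = [
    'name'  => 'password',
    'rules' => ['not_empty' => true, 'is_trimmed' => true, 'length_min' => 6],
];

// Override from the child schema: only the keys that differ from the preset.
$override = ['rules' => ['length_min' => 10]];

// Assumption: child values replace preset values key by key.
// This is not the project's merge code, only an approximation of the idea.
$finalColumn = array_replace_recursive($presetColumn, $override);

// JSON keeps the sketch free of external packages;
// the real dump uses Symfony's Yaml::dump() instead.
echo json_encode($finalColumn, JSON_PRETTY_PRINT), "\n";
```

The dumped result keeps every preset rule and shows only `length_min` changed, which is the kind of merged view the `--dump-schema` flag is meant to expose for a whole schema.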
--- README.md | 41 ++++++++++++++++++++++--------- action.yml | 26 ++++++++++++-------- src/Commands/AbstractValidate.php | 16 ++++++++++++ src/Commands/ValidateCsv.php | 5 +++- src/Commands/ValidateSchema.php | 4 ++- src/Schema.php | 15 +++++++++++ src/SchemaDataPrep.php | 4 +-- tests/GithubActionsTest.php | 21 ++++++++-------- tests/Tools.php | 6 ++++- 9 files changed, 100 insertions(+), 38 deletions(-) diff --git a/README.md b/README.md index aaa1ac54..bf15ebd3 100644 --- a/README.md +++ b/README.md @@ -126,30 +126,45 @@ You can find launch examples in the [workflow demo](https://github.com/JBZoo/Csv ```yml -- uses: jbzoo/csv-blueprint@master # See the specific version on releases page +- uses: jbzoo/csv-blueprint@master # See the specific version on releases page. `@master` is latest. with: - # Path(s) to validate. You can specify path in which CSV files will be searched. Feel free to use glob pattrens. Usage examples: /full/path/file.csv, p/file.csv, p/*.csv, p/**/*.csv, p/**/name-*.csv, **/*.csv, etc. + # Specify the path(s) to the CSV files you want to validate. + # This can include a direct path to a file or a directory to search with a maximum depth of 10 levels. + # Examples: /full/path/name.csv; p/file.csv; p/*.csv; p/**/*.csv; p/**/name-*.csv; **/*.csv # Required: true csv: './tests/**/*.csv' - # Schema filepath. It can be a YAML, JSON or PHP. See examples on GitHub. + # Specify the path(s) to the schema file(s), supporting YAML, JSON, or PHP formats. + # Similar to CSV paths, you can direct to specific files or search directories with glob patterns. + # Examples: /full/path/name.yml; p/file.yml; p/*.yml; p/**/*.yml; p/**/name-*.yml; **/*.yml # Required: true schema: './tests/**/*.yml' # Report format. Available options: text, table, github, gitlab, teamcity, junit. - # Default value: table - # You can skip it - report: table + # Default value: 'table' + # Required: true + report: 'table' # Quick mode. It will not validate all rows. 
It will stop after the first error. - # Default value: no - # You can skip it - quick: no + # Default value: 'no' + # Required: true + quick: 'no' # Skip schema validation. If you are sure that the schema is correct, you can skip this check. - # Default value: no - # You can skip it - skip-schema: no + # Default value: 'no' + # Required: true + skip-schema: 'no' + + # Extra options for the CSV Blueprint. Only for debbuging and profiling. + # Available options: + # ANSI output. You can disable ANSI colors if you want with `--no-ansi`. + # Verbosity level: Available options: `-v`, `-vv`, `-vvv` + # Add flag `--profile` if you want to see profiling info. Add details with `-vvv`. + # Add flag `--debug` if you want to see more really deep details. + # Add flag `--dump-schema` if you want to see the final schema after all includes and inheritance. + # Default value: 'extra: --ansi' + # You can skip it. + extra: 'extra: --ansi' ``` @@ -1254,6 +1269,7 @@ Options: Returns a non-zero exit code if any error is detected. Enable by setting to any non-empty value or "yes". [default: "no"] + --dump-schema Dumps the schema of the CSV file if you want to see the final schema after inheritance. --debug Intended solely for debugging and advanced profiling purposes. Activating this option provides detailed process insights, useful for troubleshooting and performance analysis. @@ -1309,6 +1325,7 @@ Options: Returns a non-zero exit code if any error is detected. Enable by setting to any non-empty value or "yes". [default: "no"] + --dump-schema Dumps the schema of the CSV file if you want to see the final schema after inheritance. --debug Intended solely for debugging and advanced profiling purposes. Activating this option provides detailed process insights, useful for troubleshooting and performance analysis. 
diff --git a/action.yml b/action.yml index 61b89c99..03e982bf 100644 --- a/action.yml +++ b/action.yml @@ -20,13 +20,16 @@ branding: inputs: csv: - description: > - Path(s) to validate. You can specify path in which CSV files will be searched. - Feel free to use glob pattrens. Usage examples: - /full/path/file.csv, p/file.csv, p/*.csv, p/**/*.csv, p/**/name-*.csv, **/*.csv, etc. + description: | + Specify the path(s) to the CSV files you want to validate. + This can include a direct path to a file or a directory to search with a maximum depth of 10 levels. + Examples: /full/path/name.csv; p/file.csv; p/*.csv; p/**/*.csv; p/**/name-*.csv; **/*.csv required: true schema: - description: 'Schema filepath. It can be a YAML, JSON or PHP. See examples on GitHub.' + description: | + Specify the path(s) to the schema file(s), supporting YAML, JSON, or PHP formats. + Similar to CSV paths, you can direct to specific files or search directories with glob patterns. + Examples: /full/path/name.yml; p/file.yml; p/*.yml; p/**/*.yml; p/**/name-*.yml; **/*.yml required: true report: description: 'Report format. Available options: text, table, github, gitlab, teamcity, junit.' @@ -43,11 +46,14 @@ inputs: # Only for debbuging and profiling extra: - description: > - ANSI output. You can disable ANSI colors if you want with `--no-ansi`. - Verbosity level: Available options: `-v`, `-vv`, `-vvv` - Add flag `--profile` if you want to see profiling info. Add details with `-vvv`. - Add flag `--debug` if you want to see more really deep details. + description: | + Extra options for the CSV Blueprint. Only for debbuging and profiling. + Available options: + ANSI output. You can disable ANSI colors if you want with `--no-ansi`. + Verbosity level: Available options: `-v`, `-vv`, `-vvv` + Add flag `--profile` if you want to see profiling info. Add details with `-vvv`. + Add flag `--debug` if you want to see more really deep details. 
+ Add flag `--dump-schema` if you want to see the final schema after all includes and inheritance. default: 'extra: --ansi' runs: diff --git a/src/Commands/AbstractValidate.php b/src/Commands/AbstractValidate.php index 79c7f94f..41879feb 100644 --- a/src/Commands/AbstractValidate.php +++ b/src/Commands/AbstractValidate.php @@ -18,6 +18,7 @@ use JBZoo\Cli\CliCommand; use JBZoo\CsvBlueprint\Exception; +use JBZoo\CsvBlueprint\Schema; use JBZoo\CsvBlueprint\Utils; use JBZoo\CsvBlueprint\Validators\ErrorSuite; use Symfony\Component\Console\Input\InputOption; @@ -60,6 +61,12 @@ protected function configure(): void ]), 'no', ) + ->addOption( + 'dump-schema', + null, + InputOption::VALUE_NONE, + 'Dumps the schema of the CSV file if you want to see the final schema after inheritance.', + ) ->addOption( 'debug', null, @@ -153,6 +160,15 @@ protected function renderIssues(string $prefix, int $number, string $filepath, i $this->out("{$prefix}{$number} {$issues} in {$filepath}", $indent); } + protected function printDumpOfSchema(Schema $schema): void + { + if ($this->getOptBool('dump-schema')) { + $this->_('```yaml'); + $this->_($schema->dumpAsYamlString()); + $this->_('```'); + } + } + protected static function renderPrefix(int $index, int $totalFiles): string { if ($totalFiles <= 1) { diff --git a/src/Commands/ValidateCsv.php b/src/Commands/ValidateCsv.php index 62bb7e7a..ebc6ca10 100644 --- a/src/Commands/ValidateCsv.php +++ b/src/Commands/ValidateCsv.php @@ -143,7 +143,10 @@ private function validateSchemas(array $schemaFilenames): int } try { - $schemaErrors = (new Schema($schemaFilename->getPathname()))->validate($quickCheck); + $schema = new Schema($schemaFilename->getPathname()); + $this->printDumpOfSchema($schema); + + $schemaErrors = $schema->validate($quickCheck); if ($schemaErrors->count() > 0) { $this->renderIssues($prefix, $schemaErrors->count(), $schemaPath, 2); $this->outReport($schemaErrors, 4); diff --git a/src/Commands/ValidateSchema.php 
b/src/Commands/ValidateSchema.php index 80255dc0..f3534b98 100644 --- a/src/Commands/ValidateSchema.php +++ b/src/Commands/ValidateSchema.php @@ -75,7 +75,9 @@ protected function executeAction(): int $schemaErrors = new ErrorSuite($filename); try { - $schemaErrors = (new Schema($filename))->validate($this->isQuickMode()); + $schema = new Schema($filename); + $schemaErrors = $schema->validate($this->isQuickMode()); + $this->printDumpOfSchema($schema); } catch (ParseException $e) { $schemaErrors->addError(new Error('schema.syntax', $e->getMessage(), '', $e->getParsedLine())); } catch (\Throwable $e) { diff --git a/src/Schema.php b/src/Schema.php index 0592f215..501fdcd1 100644 --- a/src/Schema.php +++ b/src/Schema.php @@ -21,6 +21,7 @@ use JBZoo\CsvBlueprint\Validators\ValidatorSchema; use JBZoo\Data\AbstractData; use JBZoo\Data\Data; +use Symfony\Component\Yaml\Yaml; use function JBZoo\Data\json; use function JBZoo\Data\phpArray; @@ -251,6 +252,20 @@ public function getStructuralRulesParams(): array ]; } + public function dumpAsYamlString(): string + { + return Yaml::dump( + $this->getData()->getArrayCopy(), + 10, + 2, + Yaml::DUMP_NULL_AS_TILDE + | Yaml::DUMP_NUMERIC_KEY_AS_STRING + | Yaml::DUMP_MULTI_LINE_LITERAL_BLOCK + | Yaml::DUMP_EMPTY_ARRAY_AS_SEQUENCE + | Yaml::DUMP_EXCEPTION_ON_INVALID_TYPE, + ); + } + /** * @return Column[] */ diff --git a/src/SchemaDataPrep.php b/src/SchemaDataPrep.php index ffa72d31..62827bd5 100644 --- a/src/SchemaDataPrep.php +++ b/src/SchemaDataPrep.php @@ -53,8 +53,8 @@ final class SchemaDataPrep 'aggregate_rules' => [], ], - 'rules' => ['preset' => ''], - 'aggregate_rules' => ['preset' => ''], + 'rules' => [], + 'aggregate_rules' => [], ]; private AbstractData $data; diff --git a/tests/GithubActionsTest.php b/tests/GithubActionsTest.php index 960113ed..7817ef3e 100644 --- a/tests/GithubActionsTest.php +++ b/tests/GithubActionsTest.php @@ -51,32 +51,31 @@ public function testGitHubActionsReadMe(): void $examples = [ 'csv' => 
'./tests/**/*.csv', 'schema' => './tests/**/*.yml', - 'report' => ErrorSuite::REPORT_DEFAULT, - 'quick' => 'no', - 'skip-schema' => 'no', + 'report' => "'" . ErrorSuite::REPORT_DEFAULT . "'", + 'quick' => "'no'", + 'skip-schema' => "'no'", + 'extra' => "'extra: --ansi'", ]; $expectedMessage = [ '```yml', - '- uses: jbzoo/csv-blueprint@master # See the specific version on releases page', + '- uses: jbzoo/csv-blueprint@master # See the specific version on releases page. `@master` is latest.', ' with:', ]; foreach ($inputs as $key => $input) { - if ($key === 'extra') { - continue; - } - - $expectedMessage[] = ' # ' . \trim($input['description']); + $expectedMessage[] = ' # ' . \trim(\str_replace("\n", "\n # ", \trim($input['description']))); if (isset($input['default'])) { - $expectedMessage[] = " # Default value: {$input['default']}"; + $expectedMessage[] = " # Default value: '{$input['default']}'"; } if (isset($input['default']) && $examples[$key] === $input['default']) { - $expectedMessage[] = ' # You can skip it'; + $expectedMessage[] = ' # You can skip it.'; } elseif (isset($input['required']) && $input['required']) { $expectedMessage[] = ' # Required: true'; + } elseif ($key === 'extra') { + $expectedMessage[] = ' # You can skip it.'; } if ($key === 'csv' || $key === 'schema') { diff --git a/tests/Tools.php b/tests/Tools.php index 06a07dfb..8fa18d28 100644 --- a/tests/Tools.php +++ b/tests/Tools.php @@ -140,7 +140,11 @@ public static function insertInReadme(string $code, string $content, bool $isInl isTrue(\file_put_contents(self::README, $result) > 0); $hashAfter = \hash_file('md5', self::README); - isSame($hashAfter, $hashBefore, "README.md was not updated. Code: {$code}"); + isSame( + $hashAfter, + $hashBefore, + "README.md was not updated. 
Code: {$code}\n\n---------\n{$replacement}\n---------", + ); isFileContains($result, self::README); } From cfd0b50ef949630517556e1f4e2fb9ad3d2bcf4f Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 22:58:19 +0400 Subject: [PATCH 19/24] Highlight "Important notes," add troubleshooting section in README The 'Important notes' header in the README file has been bolded to improve readability. A new troubleshooting section, 'If something went wrong,' has been added with instructions on how to dump and validate the schema. --- README.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index bf15ebd3..044a3228 100644 --- a/README.md +++ b/README.md @@ -949,7 +949,7 @@ used in a wide variety of CSV files. In order not to care about integrity and not to suffer from copy and paste, you can reuse ANY(!) existing schema. In fact, this can be considered as partial inheritance. -Important notes +**Important notes** - You can make the chain of inheritance infinitely long. I.e. make chains of the form `grant-parent.yml` -> `parent.yml` -> `child.yml` -> `grandchild.yml` -> `great-grandchild.yml` -> etc. Of course if you like to take risks ;). @@ -961,6 +961,14 @@ Important notes "/^[a-z0-9-_]+$/i". Otherwise, it might break the syntax. +**If something went wrong** +If you're having trouble working with presets and don't understand how the CSV Blueprint under the hood understands +it, just add `--dump-schema` to see it. Also, there is a separate CLI command for validating schema: + +```shell +./csv-blueprint validate:schema --dump-schema --schema=./your/schema.yml +``` + Let's take a look at what this looks like in code. - Define a couple of basic rules for [database columns](schema-examples/preset_database.yml). 
From 0460bb79ac08a768034dce10949c4716f7609771 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sat, 6 Apr 2024 23:56:22 +0400 Subject: [PATCH 20/24] Refactor schema validation and dump, update README Adjusted printDumpOfSchema to handle null schema and improve display of dumped schema in XML format. Moved calling of printDumpOfSchema method after possible exceptions in ValidateCsv. Also, clarified instructions for troubleshooting faulty presets in the README documentation. --- README.md | 5 +++-- src/Commands/AbstractValidate.php | 15 +++++++++++---- src/Commands/ValidateCsv.php | 5 +++-- src/Commands/ValidateSchema.php | 2 +- 4 files changed, 18 insertions(+), 9 deletions(-) diff --git a/README.md b/README.md index 044a3228..d0c81596 100644 --- a/README.md +++ b/README.md @@ -962,8 +962,9 @@ In fact, this can be considered as partial inheritance. Otherwise, it might break the syntax. **If something went wrong** -If you're having trouble working with presets and don't understand how the CSV Blueprint under the hood understands -it, just add `--dump-schema` to see it. Also, there is a separate CLI command for validating schema: + +If you're having trouble working with presets and don't understand how the CSV Blueprint under the hood understands it, +just add `--dump-schema` to see it. 
Also, there is a separate CLI command for validating schema: ```shell ./csv-blueprint validate:schema --dump-schema --schema=./your/schema.yml diff --git a/src/Commands/AbstractValidate.php b/src/Commands/AbstractValidate.php index 41879feb..7b5a8bb7 100644 --- a/src/Commands/AbstractValidate.php +++ b/src/Commands/AbstractValidate.php @@ -160,12 +160,19 @@ protected function renderIssues(string $prefix, int $number, string $filepath, i $this->out("{$prefix}{$number} {$issues} in {$filepath}", $indent); } - protected function printDumpOfSchema(Schema $schema): void + protected function printDumpOfSchema(?Schema $schema): void { + if ($schema === null) { + return; + } + $dump = $schema->dumpAsYamlString(); + $dump = \preg_replace('/^([ \t]*)([^:\n]+:)/m', '$1$2', $dump); + if ($this->getOptBool('dump-schema')) { - $this->_('```yaml'); - $this->_($schema->dumpAsYamlString()); - $this->_('```'); + $this->_('```yaml'); + $this->_("# File: {$schema->getFilename()}"); + $this->_($dump); + $this->_('```'); } } diff --git a/src/Commands/ValidateCsv.php b/src/Commands/ValidateCsv.php index ebc6ca10..538bf7d2 100644 --- a/src/Commands/ValidateCsv.php +++ b/src/Commands/ValidateCsv.php @@ -142,10 +142,10 @@ private function validateSchemas(array $schemaFilenames): int continue; } + $schema = null; + try { $schema = new Schema($schemaFilename->getPathname()); - $this->printDumpOfSchema($schema); - $schemaErrors = $schema->validate($quickCheck); if ($schemaErrors->count() > 0) { $this->renderIssues($prefix, $schemaErrors->count(), $schemaPath, 2); @@ -161,6 +161,7 @@ private function validateSchemas(array $schemaFilenames): int "{$prefix}Exception: {$e->getMessage()}", ], 2); } + $this->printDumpOfSchema($schema); } $this->out(''); diff --git a/src/Commands/ValidateSchema.php b/src/Commands/ValidateSchema.php index f3534b98..834c4458 100644 --- a/src/Commands/ValidateSchema.php +++ b/src/Commands/ValidateSchema.php @@ -77,7 +77,7 @@ protected function executeAction(): int try { 
$schema = new Schema($filename); $schemaErrors = $schema->validate($this->isQuickMode()); - $this->printDumpOfSchema($schema); + $this->printDumpOfSchema(new Schema($filename)); } catch (ParseException $e) { $schemaErrors->addError(new Error('schema.syntax', $e->getMessage(), '', $e->getParsedLine())); } catch (\Throwable $e) { From 9a0b46ac2f3f624e31e2de2d287ba72ca8fc3f4f Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sun, 7 Apr 2024 00:00:50 +0400 Subject: [PATCH 21/24] Update preset feature settings in yml files and README Simplified preset settings in 'preset_features.yml' and 'preset_usage.yml' files by combining several lines into one. These changes enhance the readability and maintainability of the code. Additionally, alignments and comments in 'preset_features.yml' and README file were adjusted for consistency and better readability. --- README.md | 22 +++++++--------------- schema-examples/preset_features.yml | 16 +++++++--------- schema-examples/preset_usage.yml | 6 ------ 3 files changed, 14 insertions(+), 30 deletions(-) diff --git a/README.md b/README.md index d0c81596..788f7c2c 100644 --- a/README.md +++ b/README.md @@ -1127,12 +1127,6 @@ presets: # Include any other schemas and defined for each alias. users: ./preset_users.yml # Include the schema with common user data. db: ./preset_database.yml # Include the schema with basic database columns. -filename_pattern: - preset: users # Take the filename pattern from the preset. - -structural_rules: # Take the global rules from the preset. - preset: users - csv: preset: users # Take the CSV settings from the preset. enclosure: '|' # Overridden enclosure only for this schema. @@ -1176,7 +1170,7 @@ description: This schema contains all the features of the presets. presets: # The basepath for the preset is `.` (current directory of the current schema file). # Define alias "db" for schema in `./preset_database.yml`. - db: preset_database.yml # Or `db: ./preset_database.yml`. It's up to you. 
+ db: preset_database.yml # Or `db: ./preset_database.yml`. It's up to you. # For example, you can use a relative path. users: ./../schema-examples/preset_users.yml @@ -1184,11 +1178,9 @@ presets: # Or you can use an absolute path. # db: /full/path/preset_database.yml -filename_pattern: - preset: users # Take the filename pattern from the preset. - -csv: - preset: users # Take the CSV settings from the preset. +filename_pattern: { preset: users } # Take the filename pattern from the preset. +structural_rules: { preset: users } # Take the global rules from the preset. +csv: { preset: users } # Take the CSV settings from the preset. columns: # Use name of column from the preset. @@ -1218,14 +1210,14 @@ columns: # Creating a column from three other columns. # In fact, it will merge all three at once with key replacement. - name: Crazy combo! - description: > # Just a great advice. + description: > # Just a great advice. I like to take risks, too. Be careful. Use your power wisely. - example: ~ # Ignore inherited "example" value. Set it `null`. + example: ~ # Ignore inherited "example" value. Set it `null`. preset: 'users/login' rules: preset: 'users/email' - not_empty: true # Disable the rule from the preset. + not_empty: true # Disable the rule from the preset. aggregate_rules: preset: 'db/0' ``` diff --git a/schema-examples/preset_features.yml b/schema-examples/preset_features.yml index 25495186..baedf9aa 100644 --- a/schema-examples/preset_features.yml +++ b/schema-examples/preset_features.yml @@ -16,7 +16,7 @@ description: This schema contains all the features of the presets. presets: # The basepath for the preset is `.` (current directory of the current schema file). # Define alias "db" for schema in `./preset_database.yml`. - db: preset_database.yml # Or `db: ./preset_database.yml`. It's up to you. + db: preset_database.yml # Or `db: ./preset_database.yml`. It's up to you. # For example, you can use a relative path. 
users: ./../schema-examples/preset_users.yml @@ -24,11 +24,9 @@ presets: # Or you can use an absolute path. # db: /full/path/preset_database.yml -filename_pattern: - preset: users # Take the filename pattern from the preset. - -csv: - preset: users # Take the CSV settings from the preset. +filename_pattern: { preset: users } # Take the filename pattern from the preset. +structural_rules: { preset: users } # Take the global rules from the preset. +csv: { preset: users } # Take the CSV settings from the preset. columns: # Use name of column from the preset. @@ -58,13 +56,13 @@ columns: # Creating a column from three other columns. # In fact, it will merge all three at once with key replacement. - name: Crazy combo! - description: > # Just a great advice. + description: > # Just a great advice. I like to take risks, too. Be careful. Use your power wisely. - example: ~ # Ignore inherited "example" value. Set it `null`. + example: ~ # Ignore inherited "example" value. Set it `null`. preset: 'users/login' rules: preset: 'users/email' - not_empty: true # Disable the rule from the preset. + not_empty: true # Disable the rule from the preset. aggregate_rules: preset: 'db/0' diff --git a/schema-examples/preset_usage.yml b/schema-examples/preset_usage.yml index 91d2c918..c00298a0 100644 --- a/schema-examples/preset_usage.yml +++ b/schema-examples/preset_usage.yml @@ -17,12 +17,6 @@ presets: # Include any other schemas and defined for each alias. users: ./preset_users.yml # Include the schema with common user data. db: ./preset_database.yml # Include the schema with basic database columns. -filename_pattern: - preset: users # Take the filename pattern from the preset. - -structural_rules: # Take the global rules from the preset. - preset: users - csv: preset: users # Take the CSV settings from the preset. enclosure: '|' # Overridden enclosure only for this schema. 
From 60bfd1d75f9073f7b214efe780c8e989c4930436 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sun, 7 Apr 2024 00:02:37 +0400 Subject: [PATCH 22/24] Update preset feature settings in yml files and README Simplified preset settings in 'preset_features.yml' and 'preset_usage.yml' files by combining several lines into one. These changes enhance the readability and maintainability of the code. Additionally, alignments and comments in 'preset_features.yml' and README file were adjusted for consistency and better readability. --- README.md | 11 +++++------ schema-examples/preset_usage.yml | 11 +++++------ 2 files changed, 10 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index 788f7c2c..41ccf44b 100644 --- a/README.md +++ b/README.md @@ -1139,18 +1139,17 @@ columns: - preset: users/email - preset: users/full_name - preset: users/birthday - - name: phone # Rename the column. "phone_number" => "phone". - preset: users/phone_number - - preset: users/password - rules: - length_min: 10 # Overridden value to force a strong password. + - preset: users/phone_number # Rename the column. "phone_number" => "phone". + name: phone + - preset: users/password # Overridden value to force a strong password. + rules: { length_min: 10 } - name: admin_note # New column specific only this schema. description: Admin note rules: not_empty: true length_min: 1 length_max: 10 - aggregate_rules: # In practice this will be a rare case, but the opportunity is there. + aggregate_rules: # In practice this will be a rare case, but the opportunity is there. preset: db/id # Take only aggregate rules from the preset. is_unique: true # Added new specific aggregate rule. 
``` diff --git a/schema-examples/preset_usage.yml b/schema-examples/preset_usage.yml index c00298a0..258f2094 100644 --- a/schema-examples/preset_usage.yml +++ b/schema-examples/preset_usage.yml @@ -29,17 +29,16 @@ columns: - preset: users/email - preset: users/full_name - preset: users/birthday - - name: phone # Rename the column. "phone_number" => "phone". - preset: users/phone_number - - preset: users/password - rules: - length_min: 10 # Overridden value to force a strong password. + - preset: users/phone_number # Rename the column. "phone_number" => "phone". + name: phone + - preset: users/password # Overridden value to force a strong password. + rules: { length_min: 10 } - name: admin_note # New column specific only this schema. description: Admin note rules: not_empty: true length_min: 1 length_max: 10 - aggregate_rules: # In practice this will be a rare case, but the opportunity is there. + aggregate_rules: # In practice this will be a rare case, but the opportunity is there. preset: db/id # Take only aggregate rules from the preset. is_unique: true # Added new specific aggregate rule. From 7c3180d183f9bafce477528f7b51931810727cf5 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sun, 7 Apr 2024 00:14:01 +0400 Subject: [PATCH 23/24] Add real example test and update README for preset usage Added a new test in ReadmeTest to check if real examples of preset usage are correctly inserted in README. Updated README to offer users more detailed real-life example showcasing the benefits and scalability of using presets. The example is presented in an expandable details section for improved readability. --- README.md | 168 +++++++++++++++++++++++++++++++++++++++++++ tests/ReadmeTest.php | 11 +++ 2 files changed, 179 insertions(+) diff --git a/README.md b/README.md index 41ccf44b..aee3f918 100644 --- a/README.md +++ b/README.md @@ -981,6 +981,9 @@ framework(!) that will be targeted to the specifics of your project, especially of CSV files and rules. 
It will be much easier to achieve consistency. Very often it's quite important. [Database preset](schema-examples/preset_database.yml) +
<details>
+  <summary>Click to see source code</summary>
+
 ```yml
 name: Presets for database columns
@@ -1008,7 +1011,14 @@ columns:
 ```
+</details>
+
+
 [User data preset](schema-examples/preset_users.yml)
+<details>
+  <summary>Click to see source code</summary>
+
 ```yml
 name: Common presets for user data
@@ -1116,6 +1126,161 @@ columns:
 ```
+This short and clear Yaml under the hood as roughly as follows. As you can see it simplifies your work a lot.
+<details>
+ Click to see source code + + +```yml +name: 'Schema uses presets and add new columns + specific rules.' +description: 'This schema uses presets. Also, it demonstrates how to override preset values.' +presets: + users: ./schema-examples/preset_users.yml + db: ./schema-examples/preset_database.yml +filename_pattern: '' +csv: + header: true + delimiter: ; + quote_char: \ + enclosure: '|' + encoding: utf-8 + bom: false +structural_rules: + strict_column_order: true + allow_extra_columns: false +columns: + - + name: id + description: 'Unique identifier, usually used to denote a primary key in databases.' + example: 12345 + required: true + rules: + not_empty: true + is_trimmed: true + is_int: true + num_min: 1 + aggregate_rules: + is_unique: true + sorted: + - asc + - numeric + - + name: status + description: 'Status in database' + example: active + required: true + rules: + not_empty: true + allow_values: + - active + - inactive + - pending + - deleted + aggregate_rules: [] + - + name: login + description: "User's login name" + example: johndoe + required: true + rules: + not_empty: true + is_trimmed: true + is_lowercase: true + is_slug: true + length_min: 3 + length_max: 20 + is_alnum: true + aggregate_rules: + is_unique: true + - + name: email + description: "User's email address" + example: user@example.com + required: true + rules: + not_empty: true + is_trimmed: true + is_email: true + is_lowercase: true + aggregate_rules: + is_unique: true + - + name: full_name + description: "User's full name" + example: 'John Doe Smith' + required: true + rules: + not_empty: true + is_trimmed: true + charset: UTF-8 + contains: ' ' + word_count_min: 2 + word_count_max: 8 + is_capitalize: true + aggregate_rules: + is_unique: true + - + name: birthday + description: "Validates the user's birthday." 
+ example: '1990-01-01' + required: true + rules: + not_empty: true + is_trimmed: true + date_format: Y-m-d + is_date: true + date_age_greater: 0 + date_age_less: 150 + date_max: now + aggregate_rules: [] + - + name: phone + description: "User's phone number in US" + example: '+1 650 253 00 00' + required: true + rules: + not_empty: true + is_trimmed: true + starts_with: '+1' + phone: US + aggregate_rules: [] + - + name: password + description: "User's password" + example: 9RfzENKD + required: true + rules: + not_empty: true + is_trimmed: true + regex: '/^[a-zA-Z\d!@#$%^&*()_+\-=\[\]{};'':"\|,.<>\/?~]{6,}$/' + contains_none: + - password + - '123456' + - qwerty + - ' ' + charset: UTF-8 + length_min: 10 + length_max: 20 + aggregate_rules: [] + - + name: admin_note + description: 'Admin note' + example: ~ + required: true + rules: + not_empty: true + length_min: 1 + length_max: 10 + aggregate_rules: + is_unique: true + sorted: + - asc + - numeric +``` + + +
</details>
 [Usage of presets](schema-examples/preset_usage.yml)
@@ -1155,6 +1320,9 @@ columns:
 ```
+</details>
+ + As a result, readability and maintainability became dramatically easier. You can easily add new rules, change existing, etc. diff --git a/tests/ReadmeTest.php b/tests/ReadmeTest.php index 27d6e515..545cc919 100644 --- a/tests/ReadmeTest.php +++ b/tests/ReadmeTest.php @@ -16,6 +16,7 @@ namespace JBZoo\PHPUnit; +use JBZoo\CsvBlueprint\Schema; use JBZoo\CsvBlueprint\SchemaDataPrep; use JBZoo\Utils\Cli; use JBZoo\Utils\Str; @@ -223,6 +224,16 @@ public function testCheckPresetUsageExampleInReadme(): void Tools::insertInReadme('preset-usage-yml', $text); } + public function testCheckPresetUsageRealExampleInReadme(): void + { + $schema = new Schema('./schema-examples/preset_usage.yml'); + + $text = \implode("\n", ['```yml', \trim($schema->dumpAsYamlString()), '```']); + $text = \str_replace(PROJECT_ROOT, '.', $text); + + Tools::insertInReadme('preset-usage-real-yml', $text); + } + public function testAdditionalValidationRules(): void { $list[] = ''; From dcbaa7db55f7872dec0d49cc5774efaabe1bc5b8 Mon Sep 17 00:00:00 2001 From: SmetDenis Date: Sun, 7 Apr 2024 00:15:15 +0400 Subject: [PATCH 24/24] Add real example test and update README for preset usage Added a new test in ReadmeTest to check if real examples of preset usage are correctly inserted in README. Updated README to offer users more detailed real-life example showcasing the benefits and scalability of using presets. The example is presented in an expandable details section for improved readability. --- README.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index aee3f918..7b94bd70 100644 --- a/README.md +++ b/README.md @@ -195,7 +195,7 @@ make docker-build # local tag is "jbzoo/csv-blueprint:local" ### Phar binary
 <details>
-  <summary>Click to see using PHAR file</summary>
+  <summary>CLICK to see using PHAR file</summary>
 Ensure you have PHP installed on your machine.
@@ -982,7 +982,7 @@ of CSV files and rules. It will be much easier to achieve consistency. Very ofte
 [Database preset](schema-examples/preset_database.yml)
 <details>
-  <summary>Click to see source code</summary>
+  <summary>CLICK to see source code</summary>
 ```yml
@@ -1017,7 +1017,7 @@ columns:
 [User data preset](schema-examples/preset_users.yml)
 <details>
-  <summary>Click to see source code</summary>
+  <summary>CLICK to see source code</summary>
 ```yml
@@ -1129,7 +1129,7 @@ columns:
 This short and clear Yaml under the hood as roughly as follows. As you can see it simplifies your work a lot.
 <details>
-  <summary>Click to see source code</summary>
+  <summary>CLICK to see what it looks like in memory.</summary>
 ```yml
@@ -1592,7 +1592,7 @@ view [this live demo PR](https://github.com/JBZoo/Csv-Blueprint-Demo/pull/1/file
 ![GitHub Actions - PR](.github/assets/github-actions-pr.png)
 <details>
-  <summary>Click to see example in GitHub Actions terminal</summary>
+  <summary>CLICK to see example in GitHub Actions terminal</summary>
 ![GitHub Actions - Terminal](.github/assets/github-actions-termintal.png)
@@ -1859,7 +1859,7 @@ In summary, the tool is developed with the highest standards of modern PHP pract
 It's random ideas and plans. No promises and deadlines. Feel free to [help me!](#contributing).
 <details>
-  <summary>Click to see the roadmap</summary>
+  <summary>CLICK to see the roadmap</summary>
 * **Batch processing**
   * If option `--csv` is not specified, then the STDIN is used. To build a pipeline in Unix-like systems.
@@ -1954,7 +1954,7 @@ make codestyle
 - [Retry](https://github.com/JBZoo/Retry) - Tiny PHP library providing retry/backoff functionality with strategies and jitter.
 <details>
-  <summary>Click to see interesting fact</summary>
+  <summary>CLICK to see interesting fact</summary>
 I've achieved a personal milestone. The [initial release](https://github.com/JBZoo/Csv-Blueprint/releases/tag/0.1) of the project was crafted from the ground up in approximately 3 days, interspersed with regular breaks to care for a