Refactor regex validation checks to a dedicated method (#64)
Regex validation checks now go through a shared utility method,
Utils::testRegex, which makes them reusable. IsDomain, IsFloat,
IsGeohash, IsInt, IsUsaMarketName, Regex, and CsvValidator all call it.
This gives the application one place to handle regex validation and
keeps error handling uniform. Related comments and tests received
minor updates.
SmetDenis committed Mar 19, 2024
1 parent fc08765 commit 14bd208
Showing 17 changed files with 182 additions and 92 deletions.
33 changes: 17 additions & 16 deletions README.md
@@ -159,6 +159,7 @@ columns:
 # Of course it's an ultimatum to verify any sort of string data.
 # Please, be careful. Regex is a powerful tool, but it can be very dangerous if used incorrectly.
 # Remember that if you want to solve a problem with regex, you now have two problems.
+# But have it your way, then happy debugging! https://regex101.com.
 regex: /^[\d]{2}$/

 # Checks length of a string including spaces (multibyte safe).
@@ -525,22 +526,22 @@ Schema is invalid: ./tests/schemas/demo_invalid.yml
 +-------+------------------+--------------+----- demo_invalid.yml -----------------------------------------------+
 (1/1) Invalid file: ./tests/fixtures/demo.csv
-+------+------------------+------------------+----------------------- demo.csv ---------------------------------------------------------------------+
-| Line | id:Column | Rule | Message |
-+------+------------------+------------------+------------------------------------------------------------------------------------------------------+
-| 1 | | filename_pattern | Filename "./tests/fixtures/demo.csv" does not match pattern: "/demo-[12].csv$/i" |
-| 6 | 0:Name | length_min | The length of the value "Carl" is 4, which is less than the expected "5" |
-| 11 | 0:Name | length_min | The length of the value "Lois" is 4, which is less than the expected "5" |
-| 1 | 1:City | ag:is_unique | Column has non-unique values. Unique: 9, total: 10 |
-| 2 | 2:Float | num_max | The number of the value "4825.185", which is greater than the expected "4825.184" |
-| 6 | 3:Birthday | date_min | The date of the value "1955-05-14" is parsed as "1955-05-14 00:00:00 +00:00", which is less than the |
-| | | | expected "1955-05-15 00:00:00 +00:00 (1955-05-15)" |
-| 8 | 3:Birthday | date_min | The date of the value "1955-05-14" is parsed as "1955-05-14 00:00:00 +00:00", which is less than the |
-| | | | expected "1955-05-15 00:00:00 +00:00 (1955-05-15)" |
-| 9 | 3:Birthday | date_max | The date of the value "2010-07-20" is parsed as "2010-07-20 00:00:00 +00:00", which is greater than |
-| | | | the expected "2009-01-01 00:00:00 +00:00 (2009-01-01)" |
-| 5 | 4:Favorite color | allow_values | Value "blue" is not allowed. Allowed values: ["red", "green", "Blue"] |
-+------+------------------+------------------+----------------------- demo.csv ---------------------------------------------------------------------+
++-------+------------------+------------------+----------------------- demo.csv ---------------------------------------------------------------------+
+| Line | id:Column | Rule | Message |
++-------+------------------+------------------+------------------------------------------------------------------------------------------------------+
+| undef | | filename_pattern | Filename "./tests/fixtures/demo.csv" does not match pattern: "/demo-[12].csv$/i" |
+| 6 | 0:Name | length_min | The length of the value "Carl" is 4, which is less than the expected "5" |
+| 11 | 0:Name | length_min | The length of the value "Lois" is 4, which is less than the expected "5" |
+| 1 | 1:City | ag:is_unique | Column has non-unique values. Unique: 9, total: 10 |
+| 2 | 2:Float | num_max | The number of the value "4825.185", which is greater than the expected "4825.184" |
+| 6 | 3:Birthday | date_min | The date of the value "1955-05-14" is parsed as "1955-05-14 00:00:00 +00:00", which is less than the |
+| | | | expected "1955-05-15 00:00:00 +00:00 (1955-05-15)" |
+| 8 | 3:Birthday | date_min | The date of the value "1955-05-14" is parsed as "1955-05-14 00:00:00 +00:00", which is less than the |
+| | | | expected "1955-05-15 00:00:00 +00:00 (1955-05-15)" |
+| 9 | 3:Birthday | date_max | The date of the value "2010-07-20" is parsed as "2010-07-20 00:00:00 +00:00", which is greater than |
+| | | | the expected "2009-01-01 00:00:00 +00:00 (2009-01-01)" |
+| 5 | 4:Favorite color | allow_values | Value "blue" is not allowed. Allowed values: ["red", "green", "Blue"] |
++-------+------------------+------------------+----------------------- demo.csv ---------------------------------------------------------------------+
 Found 9 issues in CSV file.
4 changes: 4 additions & 0 deletions csv-blueprint.php
@@ -37,6 +37,10 @@

 \date_default_timezone_set('UTC');

+\set_error_handler(static function ($severity, $message, $file, $line): void {
+    throw new \ErrorException($message, 0, $severity, $file, $line);
+});
+
 (new CliApplication('CSV Blueprint', '@git-version@'))
     ->registerCommandsByPath(PATH_ROOT . '/src/Commands', __NAMESPACE__)
     ->setLogo(
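
The handler added in this hunk turns PHP warnings and notices into `ErrorException`s. A minimal sketch of why that matters for regex checks, assuming a deliberately broken pattern; the helper name `describePattern` is hypothetical and not part of the project:

```php
<?php

declare(strict_types=1);

// Same idea as the handler in the hunk above: escalate PHP errors to exceptions.
\set_error_handler(static function (int $severity, string $message, string $file, int $line): void {
    throw new \ErrorException($message, 0, $severity, $file, $line);
});

// Hypothetical helper (not in the project): reports whether a pattern compiles.
function describePattern(string $pattern): string
{
    try {
        \preg_match($pattern, 'demo');

        return 'ok';
    } catch (\ErrorException $exception) {
        // Without the handler, preg_match() would only emit a warning and return false.
        return 'invalid';
    }
}

echo describePattern('/^[a-z]+$/'), PHP_EOL; // pattern compiles
echo describePattern('/[a-z/'), PHP_EOL;     // unterminated character class
```

With the handler in place, a pattern that fails to compile becomes something a `try`/`catch` can intercept, which is what the new utility method relies on.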
1 change: 1 addition & 0 deletions schema-examples/full.yml
@@ -80,6 +80,7 @@ columns:
 # Of course it's an ultimatum to verify any sort of string data.
 # Please, be careful. Regex is a powerful tool, but it can be very dangerous if used incorrectly.
 # Remember that if you want to solve a problem with regex, you now have two problems.
+# But have it your way, then happy debugging! https://regex101.com.
 regex: /^[\d]{2}$/

 # Checks length of a string including spaces (multibyte safe).
4 changes: 3 additions & 1 deletion src/Rules/Cell/IsDomain.php
@@ -16,6 +16,8 @@

 namespace JBZoo\CsvBlueprint\Rules\Cell;

+use JBZoo\CsvBlueprint\Utils;
+
 final class IsDomain extends AbstractCellRule
 {
     protected const HELP_OPTIONS = [
@@ -26,7 +28,7 @@ public function validateRule(string $cellValue): ?string
     {
         $domainPattern = '/^(?!-)[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*(\.[A-Za-z]{2,})$/';

-        if (\preg_match($domainPattern, $cellValue) === 0) {
+        if (Utils::testRegex($domainPattern, $cellValue)) {
             return "Value \"<c>{$cellValue}</c>\" is not a valid domain";
         }

4 changes: 3 additions & 1 deletion src/Rules/Cell/IsFloat.php
@@ -16,6 +16,8 @@

 namespace JBZoo\CsvBlueprint\Rules\Cell;

+use JBZoo\CsvBlueprint\Utils;
+
 class IsFloat extends AbstractCellRule
 {
     protected const HELP_OPTIONS = [
@@ -24,7 +26,7 @@ class IsFloat extends AbstractCellRule

     public function validateRule(string $cellValue): ?string
     {
-        if (\preg_match('/^-?\d+(\.\d+)?$/', $cellValue) === 0) {
+        if (Utils::testRegex('/^-?\d+(\.\d+)?$/', $cellValue)) {
             return "Value \"<c>{$cellValue}</c>\" is not a float number";
         }

4 changes: 3 additions & 1 deletion src/Rules/Cell/IsGeohash.php
@@ -16,6 +16,8 @@

 namespace JBZoo\CsvBlueprint\Rules\Cell;

+use JBZoo\CsvBlueprint\Utils;
+
 class IsGeohash extends AbstractCellRule
 {
     protected const HELP_OPTIONS = [
@@ -24,7 +26,7 @@ class IsGeohash extends AbstractCellRule

     public function validateRule(string $cellValue): ?string
     {
-        if (\preg_match('/^[0-9b-hj-km-np-z]{1,}$/', $cellValue) === 0) {
+        if (Utils::testRegex('/^[0-9b-hj-km-np-z]{1,}$/', $cellValue)) {
             return "Value \"<c>{$cellValue}</c>\" is not a valid Geohash";
         }

4 changes: 3 additions & 1 deletion src/Rules/Cell/IsInt.php
@@ -16,6 +16,8 @@

 namespace JBZoo\CsvBlueprint\Rules\Cell;

+use JBZoo\CsvBlueprint\Utils;
+
 final class IsInt extends AbstractCellRule
 {
     protected const HELP_OPTIONS = [
@@ -24,7 +26,7 @@ final class IsInt extends AbstractCellRule

     public function validateRule(string $cellValue): ?string
     {
-        if (\preg_match('/^-?\d+$/', $cellValue) === 0) {
+        if (Utils::testRegex('/^-?\d+$/', $cellValue)) {
             return "Value \"<c>{$cellValue}</c>\" is not an integer";
         }

4 changes: 3 additions & 1 deletion src/Rules/Cell/IsUsaMarketName.php
@@ -16,6 +16,8 @@

 namespace JBZoo\CsvBlueprint\Rules\Cell;

+use JBZoo\CsvBlueprint\Utils;
+
 final class IsUsaMarketName extends AllowValues
 {
     protected const HELP_OPTIONS = [
@@ -24,7 +26,7 @@ final class IsUsaMarketName extends AllowValues

     public function validateRule(string $cellValue): ?string
     {
-        if (\preg_match('/^[A-Za-z\s\'\-\.,]+, [A-Z]{2}$/u', $cellValue) === 0) {
+        if (Utils::testRegex('/^[A-Za-z\s\'\-\.,\(\)]+, [A-Z-]{2,6}$/u', $cellValue)) {
             return "Invalid market name format for value \"<c>{$cellValue}</c>\". " .
                 'Market name must have format "<green>New York, NY</green>"';
         }
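
Besides switching to the utility method, this hunk widens the pattern: parentheses are now allowed in the market name, and the region code may be two to six uppercase letters or hyphens. A small sketch of the difference, with both patterns copied from the hunk; the sample values are made up for illustration and are not taken from the project's fixtures:

```php
<?php

declare(strict_types=1);

// Patterns copied from the removed and added lines of this hunk.
$oldPattern = '/^[A-Za-z\s\'\-\.,]+, [A-Z]{2}$/u';
$newPattern = '/^[A-Za-z\s\'\-\.,\(\)]+, [A-Z-]{2,6}$/u';

// Hypothetical sample values.
$samples = ['New York, NY', 'Washington (Hagerstown), DC-MD', 'new york, ny'];

$results = [];
foreach ($samples as $sample) {
    $results[$sample] = [
        'old' => \preg_match($oldPattern, $sample) === 1,
        'new' => \preg_match($newPattern, $sample) === 1,
    ];
}

\var_export($results);
```

Under these assumptions, only the second sample changes outcome: the old pattern rejects it and the new one accepts it. The lowercase sample fails both patterns because neither uses the `i` flag.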
3 changes: 2 additions & 1 deletion src/Rules/Cell/Regex.php
@@ -25,6 +25,7 @@ final class Regex extends AbstractCellRule
         "Of course it's an ultimatum to verify any sort of string data.",
         'Please, be careful. Regex is a powerful tool, but it can be very dangerous if used incorrectly.',
         'Remember that if you want to solve a problem with regex, you now have two problems.',
+        'But have it your way, then happy debugging! https://regex101.com',
     ];

     protected const HELP_OPTIONS = [
@@ -42,7 +43,7 @@ public function validateRule(string $cellValue): ?string
             return 'Regex pattern is not defined';
         }

-        if (\preg_match($regex, $cellValue) === 0) {
+        if (Utils::testRegex($regex, $cellValue)) {
             return "Value \"<c>{$cellValue}</c>\" does not match the pattern \"<green>{$regex}</green>\"";
         }

17 changes: 17 additions & 0 deletions src/Utils.php
@@ -201,4 +201,21 @@ public static function matchTypes(
         return isset($mapOfValidConvertions[$expectedType])
             && \in_array($actualType, $mapOfValidConvertions[$expectedType], true);
     }
+
+    public static function testRegex(string $regex, string $cellValue): bool
+    {
+        if ($regex === '' || $cellValue === '') {
+            return false;
+        }
+
+        try {
+            if (\preg_match($regex, $cellValue) === 0) {
+                return true;
+            }
+        } catch (\Throwable) {
+            return false;
+        }
+
+        return false;
+    }
 }
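
Note the return convention: `testRegex` returns `true` when the value does not match, so callers can use it directly as the failure condition. A standalone sketch of the same logic as a plain function (the body is copied from the method added in this hunk, outside the `Utils` class) with a few calls:

```php
<?php

declare(strict_types=1);

// Escalate warnings to exceptions, as csv-blueprint.php does in this commit.
\set_error_handler(static function (int $severity, string $message, string $file, int $line): void {
    throw new \ErrorException($message, 0, $severity, $file, $line);
});

// Standalone copy of the method body, for illustration only.
function testRegex(string $regex, string $cellValue): bool
{
    if ($regex === '' || $cellValue === '') {
        return false;
    }

    try {
        if (\preg_match($regex, $cellValue) === 0) {
            return true;
        }
    } catch (\Throwable) {
        return false;
    }

    return false;
}

\var_export([
    'mismatch' => testRegex('/^-?\d+$/', '4x2'),    // value does not match the pattern
    'match' => testRegex('/^-?\d+$/', '42'),        // value matches
    'empty value' => testRegex('/^-?\d+$/', ''),    // empty input short-circuits
    'broken pattern' => testRegex('/[a-z/', 'abc'), // compile error is swallowed
]);
```

In this sketch only the first call returns `true`. Empty values and patterns that fail to compile are treated the same as a match, so they never produce a validation error on their own.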
4 changes: 2 additions & 2 deletions src/Validators/CsvValidator.php
@@ -103,14 +103,14 @@ private function validateFile(bool $quickStop = false): ErrorSuite
         if (
             $filenamePattern !== null
             && $filenamePattern !== ''
-            && \preg_match($filenamePattern, $this->csv->getCsvFilename()) === 0
+            && Utils::testRegex($filenamePattern, $this->csv->getCsvFilename())
         ) {
             $error = new Error(
                 'filename_pattern',
                 'Filename "<c>' . Utils::cutPath($this->csv->getCsvFilename()) .
                 "</c>\" does not match pattern: \"<c>{$filenamePattern}</c>\"",
                 '',
-                ColumnValidator::FALLBACK_LINE,
+                Error::UNDEFINED_LINE,
             );

             $errors->addError($error);
7 changes: 4 additions & 3 deletions src/Validators/Error.php
@@ -30,12 +30,13 @@ public function __construct(

     public function __toString(): string
     {
+        $columnStr = $this->getColumnName() === '' ? '' : ", column \"{$this->getColumnName()}\"";
+
         if ($this->line === self::UNDEFINED_LINE) {
-            return "\"{$this->getRuleCode()}\", column \"{$this->getColumnName()}\". {$this->getMessage()}.";
+            return "\"{$this->getRuleCode()}\"{$columnStr}. {$this->getMessage()}.";
         }

-        return "\"{$this->getRuleCode()}\" at line <red>{$this->getLine()}</red>, " .
-            "column \"{$this->getColumnName()}\". {$this->getMessage()}.";
+        return "\"{$this->getRuleCode()}\" at line <red>{$this->getLine()}</red>{$columnStr}. {$this->getMessage()}.";
     }

     public function getRuleCode(): string
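
The effect of the optional column fragment in `__toString()` can be sketched with a plain function. This is an illustration only: `formatError` is a hypothetical name, `null` stands in for `Error::UNDEFINED_LINE` (whose actual value is not shown in this diff), and the color tags are omitted.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for Error::__toString(); null plays the role of Error::UNDEFINED_LINE.
function formatError(string $ruleCode, string $message, string $columnName = '', ?int $line = null): string
{
    // The column fragment is dropped entirely when no column name is set.
    $columnStr = $columnName === '' ? '' : ", column \"{$columnName}\"";

    if ($line === null) {
        return "\"{$ruleCode}\"{$columnStr}. {$message}.";
    }

    return "\"{$ruleCode}\" at line {$line}{$columnStr}. {$message}.";
}

echo formatError('filename_pattern', 'Filename does not match pattern'), PHP_EOL;
echo formatError('length_min', 'Value is too short', '0:Name', 6), PHP_EOL;
```

File-level errors such as `filename_pattern` have neither a line nor a column, so under this sketch their message no longer carries an empty `column ""` fragment.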
