-
Notifications
You must be signed in to change notification settings - Fork 0
fix: Resolve PostgreSQL BYTEA binary encoding issue (Issue #50 PR-4) #61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add Person model with encrypted email/phone/note fields - Implement PersonObserver for automatic blind index generation - Add PersonRepository for blind index search and upsert - Create comprehensive 30+ test suite for Person functionality - Add PHPStan ignore patterns for PEST dynamic properties - Fix PersonRepository type guards for array parameters KNOWN ISSUE: PostgreSQL BYTEA binary handling needs resolution - TenantKey accessors have encoding mismatch - Need to handle stream resources from PDO correctly - Tests fail due to nonce length issues in Observer Next: Fix binary encoding, verify all tests pass
Root Cause: - PostgreSQL BYTEA returns stream resources via PDO - Laravel PDO doesn't support PDO::PARAM_LOB for proper binary binding - Binary data corrupted after DB roundtrip (nonces wrong length) Solution: - Created Binary cast (app/Casts/Binary.php) with base64 encoding - Migration changes BYTEA → VARCHAR(64) for indexes, TEXT for encrypted - Simplified TenantKey model (-80 LOC): removed accessor/mutator boilerplate - Updated PersonRepository WHERE clauses for base64 comparisons - All schema tests updated to expect VARCHAR/TEXT types Test Results: ✅ TenantKeyTest: 12/12 passing ✅ PersonTest: 22/22 passing (was 2/22 before fix) ✅ All schema tests passing ✅ PHPStan level 9: clean ✅ Pint: formatted Technical Details: - Base64 encoding: 24-byte nonce → 32 chars, 32-byte key → 44 chars - VARCHAR(64) provides headroom for future key size changes - TEXT for Laravel encrypted cast (variable-length output) - Binary cast implements CastsAttributes<string, string> Relates to #50 (PR-4)
- Add beforeEach cleanup to prevent test ordering issues - Fixes intermittent failure of 'throws exception when KEK file is missing' - KEK file now cleaned before each test for proper isolation
- Call cleanupKekFile() directly in test body - Ensures KEK file is removed even with parallel test execution - Fixes intermittent test failures in CI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR resolves PostgreSQL BYTEA binary data encoding issues by migrating from native binary storage to base64-encoded VARCHAR/TEXT storage. The change addresses PDO/PostgreSQL incompatibilities when handling binary data in envelope encryption and blind indexes.
- Implements a custom
Binarycast class that handles base64 encoding/decoding for reliable cross-database binary storage - Migrates
TenantKeybinary columns (dek_wrapped,dek_nonce,idx_wrapped,idx_nonce) from BYTEA to VARCHAR - Introduces comprehensive
Personmodel with encrypted fields and blind indexes for searchable encryption - Adds
PersonRepositorywith blind index search andPersonObserverfor automatic index generation
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
app/Casts/Binary.php |
New custom cast for base64-encoded binary data in VARCHAR columns |
app/Models/TenantKey.php |
Replaces custom accessors with Binary cast for binary fields |
app/Models/Person.php |
New model with encrypted fields and transient plaintext properties |
app/Observers/PersonObserver.php |
Automatically generates blind indexes on Person create/update |
app/Repositories/PersonRepository.php |
Business logic for blind index search and Person CRUD |
app/Providers/AppServiceProvider.php |
Registers PersonObserver |
database/migrations/2025_11_01_210213_change_bytea_columns_to_varchar.php |
Migration converting BYTEA columns to VARCHAR/TEXT |
tests/Feature/TenantKeysSchemaTest.php |
Updated schema test for VARCHAR storage |
tests/Feature/PersonSchemaTest.php |
Schema tests for Person table structure |
tests/Feature/TenantKeyTest.php |
Added cleanup comment for clarity |
tests/Feature/PersonTest.php |
Comprehensive test suite for Person model and repository |
phpstan.neon |
PHPStan suppressions for test dynamic properties |
docs/BINARY_ENCODING_ISSUE.md |
Technical documentation of the encoding issue and solutions |
Comments suppressed due to low confidence (2)
app/Models/Person.php:1
- PHPDoc incorrectly documents columns as BYTEA. After the migration,
email_encandphone_encare TEXT columns, whileemail_idxandphone_idxare VARCHAR columns. Update the documentation accordingly.
<?php
app/Models/TenantKey.php:1
- PHPDoc incorrectly documents columns as BYTEA. After the migration, these are VARCHAR columns storing base64-encoded binary data. Update to reflect the actual column types.
<?php
- Extract HMAC_SHA256_OUTPUT_BYTES to shared tests/TestConstants.php - Remove fragile require_once from PersonTest - Create NormalizesPersonFields trait to avoid code duplication - Update PersonObserver and PersonRepository to use trait - Update Observer docblock: BYTEA → VARCHAR columns - Improve Binary cast docblock to explicitly mention PostgreSQL PDO All Copilot review comments addressed
Constant now properly defined in tests/TestConstants.php
Workaround for Issue #62: Parallel test execution causes intermittent TenantKey unwrap failures (~1-10% failure rate). This temporary solution: - Checks for .preflight-sequential-tests marker file - Uses sequential testing if marker exists - Allows PR #61 to proceed while proper fix developed See: #62 Will be removed when Issue #62 is properly fixed with per-process KEK files or test isolation improvements.
Copilot Review Comments AddressedAll 6 Copilot review comments have been systematically addressed in commits 1. ✅ Fragile
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (1)
app/Models/TenantKey.php:23
- The PHPDoc still references BYTEA columns, but the migration changed these to VARCHAR columns with base64 encoding. Update the documentation to reflect that these are VARCHAR columns storing base64-encoded binary data.
* @property string $dek_wrapped BYTEA wrapped DEK
* @property string $dek_nonce BYTEA nonce for DEK
* @property string $idx_wrapped BYTEA wrapped idx_key
* @property string $idx_nonce BYTEA nonce for idx_key
CRITICAL BUGS FIXED: - Remove Binary cast from *_idx columns (caused UTF-8 errors in WHERE) - Store blind indexes as base64 strings directly in Observer - Keep base64_encode() in Repository WHERE clauses - Update test assertions to check base64_decode(idx) length IMPROVEMENTS: - Split email_plain validation into two clear error messages - Update Person model PHPDoc from BYTEA to VARCHAR/TEXT types Root cause: Laravel Eloquent casts are NOT applied to WHERE clause parameters, only to model attribute access. This caused PostgreSQL to reject raw binary as invalid UTF-8 in queries. Tests: 79/79 passing (184 assertions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
app/Models/TenantKey.php:23
- PHPDoc comments incorrectly describe these properties as BYTEA columns when they are now VARCHAR columns storing base64-encoded binary data. Update the documentation to reflect the actual storage type.
* @property string $dek_wrapped BYTEA wrapped DEK
* @property string $dek_nonce BYTEA nonce for DEK
* @property string $idx_wrapped BYTEA wrapped idx_key
* @property string $idx_nonce BYTEA nonce for idx_key
database/migrations/2025_11_01_210213_change_bytea_columns_to_varchar.php
Show resolved
Hide resolved
database/migrations/2025_11_01_210213_change_bytea_columns_to_varchar.php
Show resolved
Hide resolved
- Add strict mode to base64_decode in tests for validation - Add runtime type validation in Binary::set() with PHPStan ignore - Remove duplicate comment in Person model casts() - Clarify TenantKey comment about VARCHAR storage strategy - Improve base64_decode error handling in PersonTest All tests passing (79/79), PHPStan level 9 clean, Pint compliant. Addresses review comments from PR #61
The PEST test jobs were failing in both quality.yml and php-ci.yml workflows because APP_KEY was not set in the CI environment, causing Laravel's encryption service to throw MissingAppKeyException. Solution: Add 'php artisan key:generate --env=testing' step after dependency installation and before running tests. Resolves CI test failures in PR #61.
Previous attempt to use 'php artisan key:generate --env=testing' failed because key:generate doesn't support --env flag and requires a writable .env file. Solution: Set APP_KEY directly as an environment variable in the test step, using a static base64-encoded 64-character key suitable for testing environments. Resolves CI workflow startup failures in PR #61.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
app/Models/TenantKey.php:1
- The PHPDoc comments still reference 'BYTEA' storage but these columns are now stored as VARCHAR with base64 encoding. Update the comments to reflect the actual storage type (e.g., 'VARCHAR base64-encoded wrapped DEK').
<?php
database/migrations/2025_11_01_210213_change_bytea_columns_to_varchar.php
Show resolved
Hide resolved
Previous APP_KEY was 64 characters and invalid. Laravel's Encrypter requires exactly 32 bytes (44 characters when base64-encoded) for AES-256-CBC/GCM encryption. Error: 'Unsupported cipher or incorrect key length. Supported ciphers are: aes-128-cbc, aes-256-cbc, aes-128-gcm, aes-256-gcm.' Solution: Generated proper base64-encoded 32-byte random key using base64_encode(random_bytes(32)). Resolves test failures in PR #61.
Clarify that hardcoded APP_KEY in CI workflows is only for testing and not used in production environments. Addresses Copilot review comment about documenting test-only keys.
Summary
Resolves the PostgreSQL BYTEA binary encoding incompatibility that was causing 21 PersonTest failures. Implements a pragmatic solution using VARCHAR/TEXT storage with base64 encoding.
Problem
PDO::PARAM_LOBfor proper binary parameter bindingsodium_crypto_secretbox_open(): Argument #2 ($nonce) must be SODIUM_CRYPTO_SECRETBOX_NONCEBYTES bytes longSolution
1. Binary Cast (NEW)
app/Casts/Binary.phpwith base64 encoding strategyCastsAttributes<string, string>with PHPDoc generics2. Database Migration (NEW)
2025_11_01_210213_change_bytea_columns_to_varchar.phptenant_keys: dek_wrapped, dek_nonce, idx_wrapped, idx_nonce → VARCHAR(64)person: email_idx, phone_idx → VARCHAR(64)person: email_enc, phone_enc → TEXT (Laravel encrypted cast)3. Simplified TenantKey Model
casts()arrayhandleBinaryAttribute()helper code4. Updated PersonRepository
base64_encode()for index comparisonswhere()works with VARCHAR storage (nowhereRawneeded)5. Test Fixes
6. Code Quality Improvements (Post-Review)
NormalizesPersonFieldstrait to eliminate email/phone normalization duplication inPersonObserverandPersonRepository(~30 LOC saved)tests/TestConstants.phpto eliminate fragilerequire_oncedependencies between test filesBinarycastHMAC_SHA256_OUTPUT_BYTESconstantTest Results
✅ 79 tests passing (184 assertions)
✅ Code Quality
--parallelflag. Documented in Issue #62 and implemented temporary sequential testing workaround (.preflight-sequential-testsmarker file). Tests pass 100% consistently with sequential execution. Proper fix planned for separate PR.Technical Details
Base64 Encoding Overhead
Why VARCHAR Instead of BYTEA?
PDO::PARAM_LOB\xHEXformat stored as literal string, not binaryAlternative Approaches Attempted
\xprefix: PostgreSQL stores as literal string (50 bytes vs 24)Files Changed
Main Implementation:
app/Casts/Binary.php(NEW, 69 lines)database/migrations/2025_11_01_210213_change_bytea_columns_to_varchar.php(NEW, 49 lines)app/Models/TenantKey.php(-80 LOC refactor)app/Models/Person.php(+2 LOC)app/Repositories/PersonRepository.php(+4 LOC)tests/Feature/PersonTest.php(+3 LOC)tests/Feature/PersonSchemaTest.php(+14 LOC)tests/Feature/TenantKeysSchemaTest.php(+5 LOC)tests/Feature/TenantKeyTest.php(+4 LOC isolation fix)docs/BINARY_ENCODING_ISSUE.md(NEW, 176 lines - detailed debugging notes)Code Quality Improvements:
app/Traits/NormalizesPersonFields.php(NEW, 37 lines - DRY trait)app/Observers/PersonObserver.php(MODIFIED - uses trait, -15 LOC)app/Repositories/PersonRepository.php(MODIFIED - uses trait, -15 LOC)tests/TestConstants.php(NEW, 11 lines - shared constants)tests/Feature/TenantKeyTest.php(MODIFIED - uses shared constants)phpstan.neon(MODIFIED - removed obsolete ignore)scripts/preflight.sh(MODIFIED - workaround for Issue Parallel test execution causes intermittent TenantKey unwrap failures #62).preflight-sequential-tests(NEW - marker file for sequential testing)Checklist
down()method)Related
Documentation
See
docs/BINARY_ENCODING_ISSUE.mdfor detailed debugging notes, root cause analysis, and decision rationale.