- Version: 1.0
- Date: 2025-12-27
- Author: [Signalforger] signalforger@signalforge.eu
- Status: Draft
- First Published: N/A
- Target Version: PHP 8.5
- Implementation: N/A
- Introduction
- Proposal
- Examples
- Backward Compatibility
- Performance Impact
- Implementation
- Reflection
- Future Scope
- Comparison with Other Languages
- Security Implications
- Real-World Framework Examples
- Migration Guide
- Common Objections / FAQ
- Open Questions
- Voting
- Patches and Tests
- Changelog
- References
PHP's type system currently supports return type declarations for scalar types, classes, and the generic array type. However, the array type provides no structural information, forcing developers to rely on documentation and static analysis tools to understand array contents.
This RFC proposes adding array shape types and homogeneous array types to return and parameter type declarations, enabling runtime validation of array structure at function boundaries.
function getUser(int $id): array {
return $db->fetchAssoc("SELECT * FROM users WHERE id = ?", [$id]);
}
// What keys does this array have?
// What types are the values?
// No way to know without reading docs or using static analysis
$user = getUser(1);
echo $user['name']; // Hope this key exists!

function getUser(int $id): array{id: int, name: string, email: string} {
return $db->fetchAssoc("SELECT * FROM users WHERE id = ?", [$id]);
// ✓ Validated at return: ensures id, name, email exist with correct types
}
$user = getUser(1);
echo $user['name']; // Guaranteed to exist (at time of return)

Tools like PHPStan and Psalm already support array shapes via docblocks. Why add native syntax?
Static analyzers only work on code you control. They cannot validate:
- Data from databases
- External API responses
- User input
- Deserialized data (JSON, sessions)
// PHPStan can't help here - it doesn't know what the DB returns
$user = $db->fetchAssoc("SELECT * FROM users WHERE id = ?", [$id]);
echo $user['name']; // Hope it exists!
// Native types catch bad data at runtime
function getUser($id): array{id: int, name: string} {
return $db->fetchAssoc(...); // TypeError if DB schema changed
}

Static analysis is optional and bypassable:
/** @return array{id: int} */
function getUser() {
// @phpstan-ignore-next-line
return ['id' => 'not-an-int']; // PHPStan silenced, bug ships
}

Native types cannot be silenced with an annotation - once declare(strict_arrays=1) is active, the engine enforces them.
Docblocks routinely get out of sync with actual code:
/**
* @return array{id: int, name: string} // Outdated - email was added!
*/
function getUser(): array {
return ['id' => 1, 'name' => 'alice', 'email' => 'a@b.com'];
}

Native types ARE the contract - they cannot drift from implementation.
Native types enable runtime introspection for frameworks:
// Framework can auto-generate OpenAPI docs, validation, serialization
$returnType = (new ReflectionFunction('getUser'))->getReturnType();
// Returns ReflectionArrayShapeType with key/type info
// With docblocks, you must parse comments and handle PHPStan/Psalm/PhpStorm format differences

This enables automatic API documentation, runtime validation frameworks, serialization libraries, and dependency injection containers to work with array types.
Native validation is engine-optimized with escape analysis, type caching, and loop unrolling:
// Native: ~1-20% overhead with optimizations
function getIds(): array<int> { return [1,2,3]; }
// Userland validation: always slower
function getIds(): array {
$arr = [1,2,3];
foreach ($arr as $v) {
if (!is_int($v)) throw new TypeError(...);
}
return $arr;
}

Current fragmentation across tools:
/** @return array{id: int} */ // PHPStan
/** @psalm-return array{id: int} */ // Psalm
/** @return array{'id': int} */ // PhpStorm (quoted keys)
#[ArrayShape(['id' => 'int'])] // PhpStorm attribute

Native syntax provides one standard for the entire ecosystem.
Native errors point to the exact problem:
TypeError: getUser(): Return value must be of type array{id: int, name: string},
key 'name' is missing in returned array
vs. silent corruption or vague errors later:
Warning: Undefined array key "name" in /app/src/View.php on line 847
Native types work in any PHP-aware editor immediately. Docblock-based types require installing PHPStan/Psalm, IDE plugins, configuration, and running analysis separately.
| Aspect | Native Types | Static Analysis |
|---|---|---|
| Runtime validation | Yes | No |
| Can be bypassed | No | Yes |
| Comment drift | Impossible | Common |
| Reflection access | Built-in | Parse comments |
| Performance | Engine-optimized | Userland |
| Setup required | None | Tools + config |
| External data | Validated | Trusted blindly |
Native types and static analysis are complementary - static analysis catches bugs before runtime, native types catch bugs that slip through (especially from external data sources).
This RFC introduces three new type declaration syntaxes for array types (both return types and parameter types):
An array where every element is of the same type:
function getIds(): array<int> {
return [1, 2, 3, 4, 5];
}
function getTags(): array<string> {
return ['php', 'web', 'backend'];
}

An array with typed keys and typed values:
function getScores(): array<string, int> {
return ['alice' => 100, 'bob' => 85, 'charlie' => 92];
}
function getIndexedNames(): array<int, string> {
return [0 => 'first', 1 => 'second', 2 => 'third'];
}
// Key type must be int, string, or int|string
function getMixed(): array<int|string, float> {
return [0 => 1.5, 'pi' => 3.14];
}

An array with a defined structure:
function getUser(): array{id: int, name: string, email: string} {
return [
'id' => 1,
'name' => 'Alice',
'email' => 'alice@example.com'
];
}
function getCoordinates(): array{0: float, 1: float} {
return [51.5074, -0.1278]; // Numeric keys
}

Keys marked with ? are optional and don't need to be present:
function getConfig(): array{host: string, port?: int, ssl?: bool} {
// Only 'host' is required
return ['host' => 'localhost'];
}
function getFullConfig(): array{host: string, port?: int, ssl?: bool} {
// All keys provided - also valid
return ['host' => 'secure.example.com', 'port' => 443, 'ssl' => true];
}

Array shapes and typed arrays work as parameter types too:
// Shape as parameter type
function processOrder(array{product: string, quantity: int, price: float} $order): float {
return $order['quantity'] * $order['price'];
}
// Typed array as parameter
function sumNumbers(array<int> $numbers): int {
return array_sum($numbers);
}
// Both parameter and return types
function transformUser(
array{id: int, name: string} $input
): array{id: int, name: string, processed: bool} {
return [...$input, 'processed' => true];
}

Both syntaxes can be nested:
// Array of arrays
function getMatrix(): array<array<int>> {
return [[1, 2, 3], [4, 5, 6], [7, 8, 9]];
}
// Array of shapes
function getUsers(): array<array{id: int, name: string}> {
return [
['id' => 1, 'name' => 'Alice'],
['id' => 2, 'name' => 'Bob'],
];
}
// Shape containing arrays
function getApiResponse(): array{
success: bool,
data: array<array{id: int, title: string}>,
total: int
} {
return [
'success' => true,
'data' => [
['id' => 1, 'title' => 'First'],
['id' => 2, 'title' => 'Second'],
],
'total' => 2,
];
}

Define reusable type aliases for array structures using the shape keyword:
declare(strict_arrays=1);
// Define shape type aliases at file scope
shape User = array{id: int, name: string, email: string};
shape Point = array{x: int, y: int};
shape Config = array{debug: bool, env: string, cache_ttl?: int};
// Use shapes as return types
function getUser(int $id): User {
return ['id' => $id, 'name' => 'Alice', 'email' => 'alice@example.com'];
}
// Use shapes as parameter types
function processUser(User $user): void {
echo "Processing: {$user['name']}";
}
// Calculate distance between two points
function distance(Point $a, Point $b): float {
return sqrt(($b['x'] - $a['x']) ** 2 + ($b['y'] - $a['y']) ** 2);
}

Nested Shapes:
Shapes can reference other shapes for complex hierarchical structures:
shape Address = array{street: string, city: string, zip: string};
shape Person = array{name: string, age: int, address: Address};
function getPerson(): Person {
return [
'name' => 'Alice',
'age' => 30,
'address' => [
'street' => '123 Main St',
'city' => 'Springfield',
'zip' => '12345'
]
];
}

Shapes with Typed Arrays:
shape Team = array{
name: string,
members: array<string>,
scores: array<int>
};
shape ApiResponse = array{
success: bool,
data: mixed,
errors: array<array{code: int, message: string}>
};

Shape Autoloading:
Shapes can be autoloaded using standard spl_autoload_register():
spl_autoload_register(function($name) {
$file = __DIR__ . "/shapes/$name.php";
if (file_exists($file)) {
require_once $file;
}
});
// UserShape will be autoloaded from shapes/UserShape.php when first used
function getUser(): UserShape {
return ['id' => 1, 'name' => 'Alice', 'email' => 'alice@example.com'];
}

shape_exists() Function:
Check if a shape type alias is defined:
// Check without triggering autoload
if (shape_exists('User', false)) {
echo "User shape is already defined";
}
// Check with autoloading (default)
if (shape_exists('User')) {
echo "User shape exists (was autoloaded if needed)";
}

Benefits of Shape Aliases:
- Reusability: Define once, use in multiple functions
- Readability: Clean function signatures without inline type definitions
- Maintainability: Change the shape definition in one place
- Documentation: Self-documenting data structures
- IDE Support: Enhanced autocomplete and refactoring
Similar to declare(strict_types=1), array element validation is controlled by a declare directive:
<?php
declare(strict_arrays=1);
function getIds(): array<int> {
return [1, 2, "three"]; // TypeError: element at index 2 is string
}

Without the declare directive, array<T> provides syntax support only:
<?php
// No declare(strict_arrays=1)
function getIds(): array<int> {
return [1, 2, "three"]; // No error - validation is disabled
}

This design provides:
- Zero overhead by default - existing code is unaffected
- Opt-in runtime validation - enable only where needed
- Gradual adoption - add runtime checks file by file
- Static analysis compatibility - tools can enforce types regardless of runtime mode
For array<T>:
- Every element in the array must be of type T
- Mixed keys (int/string) are allowed
- Empty arrays are valid
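The array<T> rules above can be approximated in userland on current PHP. The sketch below is only a model of the rule set, not the engine's implementation; validateArrayOf() and the idea of passing a checker callable are assumptions made for illustration.

```php
<?php
// Userland model of the array<T> rules. validateArrayOf() is a hypothetical
// helper; the native check runs inside the engine, not in PHP code.

function validateArrayOf(array $value, callable $check): ?string
{
    foreach ($value as $key => $element) {
        if (!$check($element)) {
            // Report the first offending key, in the style of the RFC's messages.
            return sprintf(
                'array element at index %s is %s',
                var_export($key, true),
                get_debug_type($element)
            );
        }
    }
    return null; // Empty arrays never enter the loop, so they are valid.
}

var_dump(validateArrayOf([1, 2, 3], 'is_int'));          // NULL
var_dump(validateArrayOf(['a' => 1, 5 => 2], 'is_int')); // NULL (mixed keys allowed)
var_dump(validateArrayOf([], 'is_int'));                 // NULL
echo validateArrayOf([1, 2, 'three'], 'is_int'), "\n";   // array element at index 2 is string
```

Only values are inspected; keys play no role unless the array<K, V> form is used.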
For array{key: type}:
- All specified keys must exist in the returned array
- Values for specified keys must match declared types
- Extra keys are allowed (open shapes)
- Keys can be string identifiers or integer literals
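As a rough userland model of the shape rules listed above (required keys, typed values, open shapes), plus the optional-key marker described earlier, consider the following sketch. The validateShape() helper and its $shape description format are hypothetical; the proposal performs this check natively.

```php
<?php
// Hypothetical userland model of array{key: type}.
// $shape maps each key to [checker callable, isOptional]. Not engine API.

function validateShape(array $value, array $shape): ?string
{
    foreach ($shape as $key => [$check, $optional]) {
        if (!array_key_exists($key, $value)) {
            if ($optional) {
                continue; // "key?: type" may be absent
            }
            return "key '$key' is missing";
        }
        if (!$check($value[$key])) {
            return "key '$key' has type " . get_debug_type($value[$key]);
        }
    }
    return null; // Keys not named in the shape are ignored (open shape).
}

$userShape = [
    'id'   => ['is_int', false],
    'name' => ['is_string', false],
    'age'  => ['is_int', true],
];

var_dump(validateShape(['id' => 1, 'name' => 'Alice', 'extra' => true], $userShape)); // NULL
echo validateShape(['id' => 1], $userShape), "\n";                    // key 'name' is missing
echo validateShape(['id' => '1', 'name' => 'Alice'], $userShape), "\n"; // key 'id' has type string
```

The loop walks the declared shape rather than the array, which is why extra keys cost nothing to accept.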
Validation happens once at function return in the ZEND_VERIFY_RETURN_TYPE opcode handler:
function getData(): array{id: int} {
$data = ['id' => 1, 'name' => 'Extra']; // Extra keys OK
return $data; // ✓ Validated here
}
$result = getData();
$result['id'] = 'string'; // No runtime error
unset($result['id']); // No runtime error

Rationale: Type information is not maintained after the function returns. This design choice:
- Matches PHP's existing pattern (validate at boundaries)
- Avoids runtime overhead on array operations
- Keeps mental model simple
- Allows static analyzers to enforce post-return correctness
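The boundary-only behaviour can be imitated on current PHP with an explicit check at the point of return. This is a sketch under the assumption that a hand-written check stands in for the native one; returnChecked() is a hypothetical helper.

```php
<?php
// Hypothetical helper: checks the value once, then hands back a plain array.
function returnChecked(array $value): array
{
    if (!isset($value['id']) || !is_int($value['id'])) {
        throw new TypeError("key 'id' must be of type int");
    }
    return $value; // The caller receives an ordinary array; no type travels with it.
}

$result = returnChecked(['id' => 1, 'name' => 'Extra']);
$result['id'] = 'string'; // Not re-checked: the array is untyped after the boundary
unset($result['id']);     // Also accepted at runtime

// Only crossing a typed boundary again triggers another check.
try {
    returnChecked($result);
} catch (TypeError $e) {
    echo $e->getMessage(), "\n"; // key 'id' must be of type int
}
```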
Nested structures are validated recursively:
function getNestedData(): array<array{id: int, value: string}> {
return [
['id' => 1, 'value' => 'test'], // Each element validated
['id' => 2, 'value' => 'data'], // as array{id: int, value: string}
];
}

Array shape return types follow PHP's standard covariance rules for return types:
Covariance (More Specific Return Types Allowed):
class Repository {
function getUser(): array{id: int} {
return ['id' => 1];
}
}
class ExtendedRepository extends Repository {
// ✓ Valid: Child returns MORE keys (covariant)
function getUser(): array{id: int, name: string, email: string} {
return ['id' => 1, 'name' => 'Alice', 'email' => 'a@b.com'];
}
}

Contravariance (Less Specific Not Allowed):
class Repository {
function getUser(): array{id: int, name: string} {
return ['id' => 1, 'name' => 'Alice'];
}
}
class BrokenRepository extends Repository {
// ✗ Fatal error: Cannot return less specific type
function getUser(): array{id: int} {
return ['id' => 1];
}
}

For array<T>, standard type variance applies:
class NumberProvider {
function getNumbers(): array<int|float> {
return [1, 2.5, 3];
}
}
class IntProvider extends NumberProvider {
// ✓ Valid: array<int> is more specific than array<int|float>
function getNumbers(): array<int> {
return [1, 2, 3];
}
}

Array shape validation interacts with PHP's type coercion system:
With declare(strict_types=1) (Strict Mode):
declare(strict_types=1);
declare(strict_arrays=1);
function getInts(): array<int> {
return [1, 2, "3"]; // TypeError: element 2 must be int, string given
}
function getFloats(): array<float> {
return [1, 2, 3]; // TypeError: elements must be float, int given
}

Without declare(strict_types=1) (Weak Mode):
declare(strict_arrays=1);
// No strict_types - coercion applies
function getInts(): array<int> {
return [1, 2, "3"]; // ✓ OK: "3" coerced to 3
}
function getFloats(): array<float> {
return [1, 2, 3]; // ✓ OK: integers coerced to floats
}
function getStrings(): array<string> {
return [1, 2, 3]; // ✓ OK: integers coerced to "1", "2", "3"
}

Coercion follows standard PHP rules:
| From | To | Behavior |
|---|---|---|
| int | float | Allowed (widening) |
| float | int | Allowed in weak mode (truncates; fractional values are deprecated since PHP 8.1) |
| string | int | Allowed if numeric string |
| int/float | string | Allowed (converts to string) |
| bool | int | Allowed (true→1, false→0) |
| null | any | TypeError (null is not coercible) |
| array/object | scalar | TypeError |
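Several rows of this table can be observed today through PHP's existing weak-mode coercion of scalar parameters. The sketch below applies that coercion element by element; coerceInt(), coerceFloat(), coerceString() and coerceElements() are hypothetical helpers, and whether the native array<T> check coerces in exactly this way is an assumption based on the statement that standard rules apply.

```php
<?php
// No declare(strict_types=1): calls from this file use weak-mode coercion.
// All helper names here are hypothetical and not part of the RFC.

function coerceInt(int $v): int { return $v; }
function coerceFloat(float $v): float { return $v; }
function coerceString(string $v): string { return $v; }

/** Apply a scalar coercion to every element, keeping the keys. */
function coerceElements(array $value, callable $coerce): array
{
    return array_map($coerce, $value);
}

var_dump(coerceElements([1, 2, '3'], 'coerceInt'));  // numeric string "3" becomes int 3
var_dump(coerceElements([1, 2, 3], 'coerceFloat'));  // ints widen to floats
var_dump(coerceElements([1, 2], 'coerceString'));    // ints become "1" and "2"
var_dump(coerceInt(true));                           // int(1)

// null and non-numeric strings are rejected even in weak mode.
foreach ([null, 'abc'] as $bad) {
    try {
        coerceInt($bad);
    } catch (TypeError $e) {
        echo get_debug_type($bad), " rejected\n";
    }
}
```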
Empty Arrays:
function getIds(): array<int> {
return []; // ✓ Valid - empty array satisfies any array<T>
}
function getUser(): array{id: int, name: string} {
return []; // ✗ TypeError - missing required keys
}

Nullable Types:
function maybeGetIds(): ?array<int> {
return null; // ✓ Valid
}
function getIds(): array<?int> {
return [1, null, 3]; // ✓ Valid - elements can be null
}
function maybeGetUser(): ?array{id: int} {
return null; // ✓ Valid
}

Union Types:
function getData(): array<int>|false {
return false; // ✓ Valid
}
function getMixed(): array<int|string> {
return [1, "two", 3, "four"]; // ✓ Valid
}

References in Arrays:
function getInts(): array<int> {
$x = 1;
$arr = [&$x, 2, 3]; // Contains reference
return $arr; // ✓ Valid - reference target is validated
}

Deeply Nested Types:
// Arbitrary nesting depth supported
function getMatrix(): array<array<array<int>>> {
return [[[1, 2], [3, 4]], [[5, 6], [7, 8]]];
}
// Practical limit: validation time scales with depth × size

Numeric String Keys:
function getData(): array{0: int, 1: string} {
return ['0' => 1, '1' => 'hello']; // ✓ Valid - PHP normalizes keys
// Internally stored as [0 => 1, 1 => 'hello']
}

// Simple homogeneous array
function getPrimes(): array<int> {
return [2, 3, 5, 7, 11];
}
// Simple shape
function getConfig(): array{host: string, port: int, ssl: bool} {
return [
'host' => 'localhost',
'port' => 3306,
'ssl' => false,
];
}

<?php
declare(strict_arrays=1); // Required for runtime validation
// Missing required key
function getUser(): array{id: int, name: string} {
return ['id' => 1];
// Fatal error: Uncaught TypeError: Return value missing key 'name'
}
// Wrong type for key
function getUser(): array{id: int, name: string} {
return ['id' => 'string', 'name' => 'Alice'];
// Fatal error: Return value key 'id' must be of type int, string given
}
// Wrong element type in array<T>
function getIds(): array<int> {
return [1, 2, 'three'];
// Fatal error: Uncaught TypeError: getIds(): Return value must be
// of type array<int>, array element at index 2 is string
}

<?php
declare(strict_arrays=1); // Enable runtime validation for this file
class UserRepository {
/**
* Fetch all users from database
*/
public function findAll(): array<array{
id: int,
username: string,
email: string,
created_at: string,
is_active: bool
}> {
$rows = $this->db->query("SELECT * FROM users")->fetchAll();
return $rows; // Validates each row has correct structure
}
/**
* Get user statistics
*/
public function getStatistics(): array{
total_users: int,
active_users: int,
inactive_users: int,
new_today: int
} {
return [
'total_users' => $this->db->count('users'),
'active_users' => $this->db->count('users', ['is_active' => 1]),
'inactive_users' => $this->db->count('users', ['is_active' => 0]),
'new_today' => $this->db->count('users', [
'created_at' => date('Y-m-d')
]),
];
}
}

/**
* @return array{id: int, name: string}
*/
function getUser(): array{id: int, name: string} {
// PHPStan/Psalm: validates function body
// PHP Runtime: validates return value
return ['id' => 1, 'name' => 'Alice'];
}
$user = getUser();
// Static analyzer knows $user['id'] is int
// Static analyzer knows $user['name'] is string
// Static analyzer warns if you try $user['missing_key']

- Existing code without array shapes continues to work unchanged
- Plain array return types remain unaffected
- No changes to array behavior after assignment
- No impact on existing extensions (unless they want to use the feature)
// Before (PHP 8.4 and earlier)
function getUser(): array {
return ['id' => 1, 'name' => 'Alice'];
}
// After (PHP 8.5+) - opt-in
function getUser(): array{id: int, name: string} {
return ['id' => 1, 'name' => 'Alice'];
}

- zend_type structure uses existing union mechanism
- No ABI break - extensions don't need recompilation
- New type masks use reserved bits in type_mask field
Benchmarks run on PHP 8.5.0-dev with declare(strict_arrays=1), 50,000 iterations, various array sizes.
The implementation includes several aggressive optimizations that dramatically reduce overhead:
| Scenario | Plain array | array<int> | Overhead | Notes |
|---|---|---|---|---|
| Constant literals (10 elem) | 0.62 ms | 0.63 ms | ~1% | Escape analysis |
| Cached array (100 elem) | 1.78 ms | 1.97 ms | ~11% | Type tagging cache |
| Fresh arrays (100 elem) | 188 ms | 224 ms | ~19% | Loop unrolling + prefetch |
| Object arrays (20 elem) | 7.51 ms | 24.7 ms | ~229% | No class caching |
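The RFC text does not include its benchmark script. Purely as an illustration of how such a comparison can be set up, the following hypothetical harness times a plain function against a userland-validated one over 50,000 iterations. It measures userland validation on current PHP, so its numbers are not comparable to the native figures in the table; bench(), plainIds() and checkedIds() are made-up names.

```php
<?php
// Hypothetical micro-benchmark harness (not the RFC author's script).
function bench(callable $fn, int $iterations): float
{
    $start = hrtime(true); // monotonic clock, nanoseconds
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e6; // nanoseconds -> milliseconds
}

function plainIds(): array
{
    return range(1, 100); // fresh 100-element array on every call
}

function checkedIds(): array
{
    $ids = range(1, 100);
    foreach ($ids as $id) {
        if (!is_int($id)) {
            throw new TypeError('element is not int');
        }
    }
    return $ids;
}

$plain   = bench('plainIds', 50_000);
$checked = bench('checkedIds', 50_000);
printf(
    "plain: %.2f ms, checked: %.2f ms, overhead: %.0f%%\n",
    $plain,
    $checked,
    ($checked - $plain) / $plain * 100
);
```

The overhead column in the table follows the same formula: (typed - plain) / plain.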
The implementation uses four key optimization strategies borrowed from Java, C#, and JavaScript engines:
For constant array literals, the compiler verifies element types at compile time, completely eliminating runtime validation:
function getConstants(): array<int> {
return [1, 2, 3, 4, 5]; // Verified at compile time - ZERO runtime cost
}

How it works:
- During compilation, zend_const_array_elements_match_type() analyzes literal array values
- If all elements match the expected type, the ZEND_VERIFY_RETURN_TYPE opcode is skipped entirely
- The array is returned directly with no type checking overhead
Overhead: ~0-1% (noise level)
Arrays that pass validation are "tagged" with their validated type, allowing subsequent validations to be skipped:
// In zend_types.h - HashTable structure
struct _zend_array {
union {
struct {
uint8_t flags;
uint8_t nValidatedElemType; // Cached element type (was _unused)
// ...
} v;
} u;
};

How it works:
- When an array is validated for array<int>, the type code (IS_LONG) is stored in nValidatedElemType
- A flag bit (HASH_FLAG_ELEM_TYPE_VALID) marks the cache as valid
- On subsequent returns of the same array, validation is a single flag check
- Any modification to the array invalidates the cache automatically

Cache invalidation triggers:
- zend_hash_add/update - element added or modified
- zend_hash_del - element removed
- zend_hash_clean - array cleared
Overhead: ~10-11% (first validation) → ~0% (cached hits)
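The tag-and-invalidate idea can be modelled in userland to make the control flow concrete. In the sketch below a boolean property stands in for the flag bit and every write clears it; the TaggedIntList class is hypothetical, and the real cache lives in the engine's HashTable rather than in a PHP object.

```php
<?php
// Hypothetical userland model of the validated-type tag.
final class TaggedIntList
{
    private array $items = [];
    private bool $validated = false; // plays the role of HASH_FLAG_ELEM_TYPE_VALID
    public int $fullScans = 0;       // counts how often the slow path ran

    public function set(int|string $key, mixed $value): void
    {
        $this->items[$key] = $value;
        $this->validated = false;    // any write invalidates the tag
    }

    public function remove(int|string $key): void
    {
        unset($this->items[$key]);
        $this->validated = false;    // mirrors the zend_hash_del trigger
    }

    public function isAllInts(): bool
    {
        if ($this->validated) {
            return true;             // fast path: a single flag check
        }
        $this->fullScans++;
        foreach ($this->items as $item) {
            if (!is_int($item)) {
                return false;        // failures are not cached
            }
        }
        return $this->validated = true;
    }
}

$list = new TaggedIntList();
$list->set(0, 1);
$list->set(1, 2);
$list->isAllInts();          // full scan, tag set
$list->isAllInts();          // served from the tag
echo $list->fullScans, "\n"; // 1
$list->set(2, 3);            // write clears the tag
$list->isAllInts();
echo $list->fullScans, "\n"; // 2
```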
For arrays that must be validated (first-time or cache miss), the validator uses CPU-optimized iteration:
// 4x loop unrolling with prefetch hints
while (data + 4 <= end) {
// Prefetch 8 zvals ahead (128 bytes with 16-byte zvals)
__builtin_prefetch(data + 8, 0, 1);
// Unrolled type checks - CPU can pipeline these
if (Z_TYPE_P(data) != IS_LONG) return false;
if (Z_TYPE_P(data + 1) != IS_LONG) return false;
if (Z_TYPE_P(data + 2) != IS_LONG) return false;
if (Z_TYPE_P(data + 3) != IS_LONG) return false;
data += 4;
}

How it works:
- Processes 4 elements per iteration, reducing loop overhead by 75%
- __builtin_prefetch() hints the CPU to load the next cache line before it's needed
- Packed arrays (sequential integer keys) use a contiguous memory fast path
- Branch prediction is optimized with UNEXPECTED() macros for error paths
Overhead: ~17-22% (varies by array size)
Packed arrays (sequential 0-indexed) benefit from optimized memory access:
static zend_always_inline bool zend_verify_packed_array_elements_long(zval *data, uint32_t count)
{
// Direct pointer arithmetic on contiguous memory
// Much faster than hash table iteration
}

How it works:
- Packed arrays store elements in contiguous memory
- The validator skips hash table overhead and iterates directly over the data pointer
- Combined with loop unrolling, this maximizes cache efficiency
| Use Case | Optimization | Expected Overhead |
|---|---|---|
| Return literal arrays | Escape analysis | 0% |
| Return same array repeatedly | Type tagging cache | ~1% after first call |
| Return fresh arrays (small) | Loop unrolling | ~15-20% |
| Return fresh arrays (large) | Loop unrolling + prefetch | ~15-20% |
| Return object arrays | Full validation | ~200-250% |
Array shapes can also be used as parameter types. This section benchmarks the overhead of validating shaped parameters compared to plain array type hints.
Testing with 1,000,000 iterations per test case:
| Shape Complexity | Plain (ms) | Shaped (ms) | Overhead | Per-call |
|---|---|---|---|---|
| 2 keys (Point) | ~16 ms | ~53 ms | ~230% | ~37 ns |
| 4 keys (User) | ~16 ms | ~68 ms | ~325% | ~52 ns |
| 5 keys (Config) | ~16 ms | ~78 ms | ~388% | ~62 ns |
| 6 keys nested (2+2+2) | ~18 ms | ~89 ms | ~395% | ~71 ns |
Note: These percentages appear high because the function bodies are empty - we're measuring only the validation overhead. The actual impact depends on how much work the function performs.
Testing a function that loops through items and calculates order totals (500,000 iterations):
| Scenario | Time | Notes |
|---|---|---|
| Plain array (no validation) | ~202 ms | No shape checking |
| Array shape (5 keys in + 4 keys out) | ~250 ms | Full shape validation |
| Absolute overhead | ~48 ms | |
| Relative overhead | ~24% | In context of real work |
| Per-call overhead | ~96 ns | |
Key insight: When functions perform actual work (database queries, calculations, I/O), the validation overhead becomes a much smaller percentage of total execution time.
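The derived rows in the table above follow from the two measured totals and the iteration count. A small sketch of that arithmetic (the overhead() helper is a hypothetical name):

```php
<?php
// Recomputes absolute, relative, and per-call overhead from two totals.
function overhead(float $plainMs, float $shapedMs, int $calls): array
{
    $absoluteMs = $shapedMs - $plainMs;
    return [
        'absolute_ms'  => $absoluteMs,
        'relative_pct' => $absoluteMs / $plainMs * 100,
        'per_call_ns'  => $absoluteMs * 1e6 / $calls, // 1 ms = 1,000,000 ns
    ];
}

$r = overhead(202.0, 250.0, 500_000);
printf(
    "%.0f ms, %.0f%%, %.0f ns per call\n",
    $r['absolute_ms'],
    $r['relative_pct'],
    $r['per_call_ns']
);
// 48 ms, 24%, 96 ns per call
```

The same formula reproduces the empty-function figures: overhead(16.0, 53.0, 1_000_000) gives 37 ms, roughly 231%, and 37 ns per call.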
| Scenario | Expected Overhead |
|---|---|
| Empty functions (micro-benchmark) | ~200-400% |
| Light work (simple calculations) | ~20-50% |
| Medium work (loops, string ops) | ~10-25% |
| Heavy work (I/O, database) | <5% |
- Development/testing: Enable validation to catch type errors early
- API boundaries: Validate data entering/leaving your application
- Cached data: Near-zero overhead for repeatedly returned arrays
- Performance-critical loops: Acceptable for most cases; profile if returning new large arrays in tight loops
- Zero per-array overhead - type descriptors stored in function metadata
- 1 byte per HashTable - reused from previously unused padding byte for type cache
- Element type info allocated once at compile time (persistent memory)
- Typical type descriptor: 16-32 bytes per function (one-time cost)
- Lines of code: ~1,500 LOC (including optimizations)
- Files modified: 9 core files
- Test coverage: 10 test files covering all features
- Status: Complete working implementation
Zend/zend_language_parser.y (~100 LOC) - Grammar rules
Zend/zend_language_scanner.l (~50 LOC) - Lexer state machine
Zend/zend_compile.h (~150 LOC) - Type structures
Zend/zend_compile.c (~350 LOC) - Type compilation + escape analysis
Zend/zend_execute.c (~400 LOC) - Runtime validation + optimized loops
Zend/zend_types.h (~10 LOC) - HashTable type cache field
Zend/zend_hash.h (~20 LOC) - Type cache macros
Zend/zend_hash.c (~30 LOC) - Cache invalidation hooks
ext/reflection/php_reflection.c (~150 LOC) - Reflection support
// Shape element (key: type pair)
typedef struct _zend_shape_element {
zend_string *key; // String key (or NULL for numeric)
zend_ulong key_num; // Numeric key (if key == NULL)
zend_type type; // Element type (can be nested)
} zend_shape_element;
// Shape descriptor
typedef struct _zend_array_shape {
uint32_t num_elements;
zend_shape_element elements[]; // Flexible array member
} zend_array_shape;
// Homogeneous array descriptor
typedef struct _zend_array_of {
zend_type element_type;
uint8_t depth; // For array<array<T>>
} zend_array_of;

The complete implementation is available at:
- Fork: https://github.com/signalforger/php-src/tree/feature/array-shapes
- Patch: https://github.com/signalforger/php-array-shapes/blob/main/patches/array-shapes.patch
- Build script: https://github.com/signalforger/php-array-shapes/blob/main/build-php-array-shapes.sh
The implementation includes:
- Complete parser integration with lexer-level >> splitting for nested generics
- Type compilation and storage in arena memory
- Runtime validation with optimized loops and type caching
- Full test coverage (10 test files, all 5142 Zend tests pass)
$reflFunc = new ReflectionFunction('getUser');
$returnType = $reflFunc->getReturnType();
// ReflectionNamedType methods
echo $returnType->getName(); // "array{id: int, name: string}"
echo $returnType->__toString(); // "array{id: int, name: string}"
// For array<T>
if ($returnType instanceof ReflectionArrayType) {
$elementType = $returnType->getElementType(); // ReflectionType
}
// For array{k: T}
if ($returnType instanceof ReflectionArrayShapeType) {
$elements = $returnType->getElements(); // array<ReflectionArrayShapeElement>
$count = $returnType->getElementCount();
$requiredCount = $returnType->getRequiredElementCount();
foreach ($elements as $element) {
echo $element->getName(); // Key name
echo $element->getType(); // ReflectionType for value
echo $element->isOptional(); // bool - true if key?: type
}
}

For homogeneous arrays (array<T>):
class ReflectionArrayType extends ReflectionType {
/** Get the element type */
public function getElementType(): ReflectionType;
}

For array shapes (array{key: type}):
class ReflectionArrayShapeType extends ReflectionType {
/** Get all elements defined in the shape */
public function getElements(): array; // array<ReflectionArrayShapeElement>
/** Get total number of elements */
public function getElementCount(): int;
/** Get number of required (non-optional) elements */
public function getRequiredElementCount(): int;
}

Represents a single element in an array shape:
class ReflectionArrayShapeElement {
/** Get the key name */
public function getName(): string;
/** Get the type of this element's value */
public function getType(): ReflectionType;
/** Check if this element is optional (key?: type) */
public function isOptional(): bool;
}

<?php
declare(strict_arrays=1);
function getUserProfile(): array{
id: int,
name: string,
email: string,
age?: int,
verified?: bool
} {
return ['id' => 1, 'name' => 'Alice', 'email' => 'alice@example.com'];
}
$reflection = new ReflectionFunction('getUserProfile');
$returnType = $reflection->getReturnType();
if ($returnType instanceof ReflectionArrayShapeType) {
echo "Total elements: " . $returnType->getElementCount() . "\n"; // 5
echo "Required elements: " . $returnType->getRequiredElementCount() . "\n"; // 3
foreach ($returnType->getElements() as $element) {
$optional = $element->isOptional() ? ' (optional)' : '';
echo " {$element->getName()}: {$element->getType()}{$optional}\n";
}
}
// Output:
// Total elements: 5
// Required elements: 3
// id: int
// name: string
// email: string
// age: int (optional)
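// verified: bool (optional)

The following features were considered alongside this RFC. Items marked "Implemented in this version" are part of the proposal; the others are intentionally excluded but could be proposed separately: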
function process(array{id: int, name: string} $data): void {
// ✓ Works - validates parameter at call time
}

Status: Implemented in this version. Parameter shapes validate when the function is called.
class User {
public array<string> $tags; // Not in this RFC
}

Rationale: Properties require validation on every write operation, significantly different from boundary validation.
function getData(): array{id: int}! { // Exact shape, no extra keys
// Not in this RFC
}

Rationale: Can be added later if needed. Open shapes (allowing extra keys) are more PHP-like.
function getData(): array{id: int, name?: string} {
return ['id' => 1]; // ✓ Works - 'name' is optional
}

Status: Implemented in this version. Optional keys are marked with ? after the key name.
function getData(): list<int> { // Sequential array (0, 1, 2, ...)
// Not in this RFC
}

Rationale: array<T> handles this case. Explicit list<T> can be added later.
function getConfig(): readonly array{host: string, port: int} {
// Not in this RFC
}

Rationale: Immutability is a separate concern that could be addressed in a future RFC. Considerations:
- Syntax options: readonly array<T>, immutable array<T>, or const array<T>
- Copy-on-write: PHP arrays already use COW; true immutability would prevent modification after return
- Nested immutability: Should readonly array<array<int>> make nested arrays immutable too?
- Performance: Immutable arrays could skip defensive copying and enable additional optimizations
- Use cases: Configuration, constants, thread-safe data sharing (with future async)
This RFC focuses on type validation; immutability can build on top of array shapes in a follow-up proposal.
function getScores(): array<string, int> {
return ['alice' => 100, 'bob' => 85]; // ✓ Works - validates keys are strings, values are ints
}

Status: Implemented in this version. The array<K, V> syntax validates both key types (int, string, or int|string only) and value types.
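A userland model of the key check helps show one subtlety: PHP normalizes numeric string keys to integers before any validator sees them. The validateKeyed() helper below is hypothetical, and how the native implementation treats such keys is an assumption of this sketch, not something the RFC text states.

```php
<?php
// Hypothetical model of array<K, V>; $keyType is 'int', 'string' or 'int|string'.
function validateKeyed(array $value, string $keyType, callable $checkValue): bool
{
    foreach ($value as $key => $element) {
        // PHP array keys are always int or string, so 'int|string' accepts every key.
        $keyOk = match ($keyType) {
            'int'        => is_int($key),
            'string'     => is_string($key),
            'int|string' => true,
        };
        if (!$keyOk || !$checkValue($element)) {
            return false;
        }
    }
    return true;
}

var_dump(validateKeyed(['alice' => 100, 'bob' => 85], 'string', 'is_int'));   // bool(true)
var_dump(validateKeyed([0 => 1.5, 'pi' => 3.14], 'int|string', 'is_float')); // bool(true)
var_dump(validateKeyed(['1' => 100], 'string', 'is_int')); // bool(false): '1' is stored as int 1
```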
shape User = array{id: int, name: string, email: string};
function getUser(): User {
return ['id' => 1, 'name' => 'Alice', 'email' => 'alice@example.com'];
}

Status: Implemented in this version. The shape keyword allows defining reusable type aliases for array shapes. Shapes support autoloading via spl_autoload_register() and can be checked with shape_exists().
type UserShape = shape(
'id' => int,
'name' => string,
);
function getUser(): UserShape { }

Differences:
- Hack uses type keyword; PHP uses shape keyword for aliases
- Hack has closed/open shape distinction
- PHP supports both inline definitions and type aliases
- Both support autoloading of type aliases
function getUser(): { id: number, name: string } {
return { id: 1, name: 'Alice' };
}

Differences:
- TypeScript is structurally typed (compile-time only)
- PHP validates at runtime
- TypeScript rejects extra properties only on fresh object literals (excess property checks); otherwise extra properties are accepted
from typing import TypedDict
class User(TypedDict):
id: int
name: str
def get_user() -> User:
return {"id": 1, "name": "Alice"}

Differences:
- Python requires class definition
- PHP allows inline definitions
- Python typing is optional (mypy enforces)
Array shape types provide security benefits at trust boundaries:
declare(strict_arrays=1);
class ApiController {
// Ensures response structure is always correct
public function getUser(int $id): array{id: int, name: string, role: string} {
$user = $this->db->fetchUser($id);
return $user; // TypeError if DB returns unexpected structure
}
}

// Without shapes: accidentally exposing sensitive data
function getPublicUser(): array {
$user = $db->fetch("SELECT * FROM users WHERE id = ?", [$id]);
return $user; // Might include password_hash, api_token, etc.
}
// With shapes: explicit contract prevents accidental exposure
function getPublicUser(): array{id: int, name: string, avatar: string} {
$user = $db->fetch("SELECT * FROM users WHERE id = ?", [$id]);
return $user; // Documents the intended keys, but open shapes still let password_hash through
// (rejecting extra keys needs closed shapes in a future RFC, or static analysis)
}

declare(strict_arrays=1);
function processPayment(array{amount: int, currency: string} $payment): void {
// Cannot pass ['amount' => '100; DROP TABLE users', 'currency' => 'USD']
// because amount must be int, not string
$this->gateway->charge($payment['amount'], $payment['currency']);
}
declare(strict_arrays=1);
function loadConfig(): array{debug: bool, db_host: string, db_port: int} {
$config = json_decode(file_get_contents('config.json'), true);
return $config; // TypeError if JSON was tampered with unexpected types
}
declare(strict_arrays=1);
class UserRepository {
// Clear contract for what the repository returns
public function findWithPosts(int $id): array{
user: array{id: int, name: string, email: string},
posts: array<array{id: int, title: string, created_at: string}>
} {
$user = User::with('posts')->findOrFail($id);
return [
'user' => $user->only(['id', 'name', 'email']),
'posts' => $user->posts->map->only(['id', 'title', 'created_at'])->all(),
];
}
}
declare(strict_arrays=1);
class ProductController extends AbstractController {
#[Route('/api/products/{id}')]
public function show(int $id): JsonResponse {
return $this->json($this->getProductData($id));
}
private function getProductData(int $id): array{
id: int,
name: string,
price: float,
stock: int,
categories: array<string>
} {
$product = $this->repository->find($id);
return [
'id' => $product->getId(),
'name' => $product->getName(),
'price' => $product->getPrice(),
'stock' => $product->getStock(),
'categories' => $product->getCategoryNames(),
];
}
}
declare(strict_arrays=1);
class ReportRepository {
/**
* Complex query returning specific structure
*/
public function getSalesReport(DateRange $range): array<array{
date: string,
total_sales: float,
order_count: int,
top_product: string
}> {
return $this->em->createQuery('...')
->setParameters(['start' => $range->start, 'end' => $range->end])
->getResult();
}
}
declare(strict_arrays=1);
class OrderNormalizer implements NormalizerInterface {
public function normalize($order): array{
id: int,
status: string,
items: array<array{product_id: int, quantity: int, price: float}>,
total: float,
created_at: string
} {
return [
'id' => $order->getId(),
'status' => $order->getStatus(),
'items' => array_map(fn($item) => [
'product_id' => $item->getProductId(),
'quantity' => $item->getQuantity(),
'price' => $item->getPrice(),
], $order->getItems()),
'total' => $order->getTotal(),
'created_at' => $order->getCreatedAt()->format('c'),
];
}
}
declare(strict_arrays=1);
class DatabaseConfig {
public static function load(): array{
driver: string,
host: string,
port: int,
database: string,
username: string,
password: string,
options: array<string>
} {
return [
'driver' => $_ENV['DB_DRIVER'] ?? 'mysql',
'host' => $_ENV['DB_HOST'] ?? 'localhost',
'port' => (int) ($_ENV['DB_PORT'] ?? 3306),
'database' => $_ENV['DB_NAME'],
'username' => $_ENV['DB_USER'],
'password' => $_ENV['DB_PASS'],
'options' => explode(',', $_ENV['DB_OPTIONS'] ?? ''),
];
}
}
Start with functions that return structured data:
- Repository methods returning database results
- API endpoints returning JSON structures
- Configuration loaders
- Data transformation functions
# Find functions returning arrays (candidates for typing)
# The pattern matches "): array" so functions with parameters are found too
grep -r "function.*): array" src/
Phase 1: Add types without validation (syntax only)
// No declare - types documented but not enforced
function getUser(): array{id: int, name: string} {
return $this->db->fetch(...);
}
Phase 2: Enable validation in test environment
// tests/bootstrap.php
declare(strict_arrays=1);
Phase 3: Enable validation in specific files
<?php
declare(strict_arrays=1); // Enable for this file
function getUser(): array{id: int, name: string} {
return $this->db->fetch(...); // Now validated
}
When validation catches type mismatches:
// Before: Wrong type from database
function getUser(): array{id: int, age: int} {
return $this->db->fetch(...); // age might be string "25"
}
// After: Cast at boundary
function getUser(): array{id: int, age: int} {
$row = $this->db->fetch(...);
$row['age'] = (int) $row['age']; // Explicit cast
return $row;
}
// Before: PHPStan/Psalm docblock
/**
* @return array{id: int, name: string}
*/
function getUser(): array {
return ['id' => 1, 'name' => 'Alice'];
}
// After: Native type (keep docblock for BC with older tools)
/**
* @return array{id: int, name: string}
*/
function getUser(): array{id: int, name: string} {
return ['id' => 1, 'name' => 'Alice'];
}
Once adopted, ecosystem tools could provide automated migration:
# Potential php-cs-fixer rule
php-cs-fixer fix --rules=array_shape_from_docblock src/
# Potential Rector rule
vendor/bin/rector process src/ --rules=DocblockToArrayShape
These tools don't exist yet but would be straightforward to build once the syntax is standardized.
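As a rough illustration of the first step such a migration tool would need, the sketch below reads an existing @return array{...} docblock through Reflection, using only current PHP. The helper name docblockArrayShape() is hypothetical, the regular expression handles flat shapes only (no nested braces), and it is not how php-cs-fixer or Rector would necessarily implement a rule.

```php
<?php
declare(strict_types=1);

/**
 * @return array{id: int, name: string}
 */
function getUser(): array
{
    return ['id' => 1, 'name' => 'Alice'];
}

// Hypothetical helper: returns the array shape documented in a function's
// @return tag, or null when there is no docblock or no shape in it.
function docblockArrayShape(string $function): ?string
{
    $doc = (new ReflectionFunction($function))->getDocComment();
    if ($doc === false) {
        return null;
    }
    // Flat shapes only: stops at the first closing brace.
    if (preg_match('/@return\s+(array\{[^}]*\})/', $doc, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

echo docblockArrayShape('getUser'), PHP_EOL; // array{id: int, name: string}
```

A real tool would work on the parsed source rather than on loaded functions and would need a proper type parser for nested shapes and array<...> types.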
Response: Classes are often better for complex domain objects, but arrays remain the pragmatic choice for:
- Database results: database drivers and query builders return arrays; wrapping every query in DTOs adds boilerplate
- API responses: JSON naturally maps to arrays; DTOs require serialization setup
- Configuration: Arrays are readable and don't need class files
- Interoperability: Many libraries expect/return arrays
- Performance: Arrays have less overhead than object instantiation
Array shapes complement DTOs - use DTOs for complex behavior, shapes for simple data structures.
Response: PHP already has:
- Typed properties (public int $id)
- Typed parameters (function foo(int $x))
- Typed returns (function foo(): int)
- Union types (int|string)
- Intersection types (Foo&Bar)
Array shapes are the logical completion of PHP's type system. The syntax mirrors existing static analysis tools, so it's already familiar to many developers.
Response: With optimizations implemented:
- Escape analysis: Constant literals have ~0% overhead
- Type tagging cache: Repeated validations have ~1% overhead
- Opt-in validation: without declare(strict_arrays=1) no validation runs, so there is zero overhead by default
For comparison, declare(strict_types=1) also has overhead but is widely used.
Response: Static analysis and native types are complementary:
| Static Analysis | Native Types |
|---|---|
| Catches bugs at analysis time | Catches bugs at runtime |
| Can be bypassed/ignored | Cannot be bypassed |
| Requires tool setup | Works out of the box |
| Can't validate external data | Validates all data |
| Comments can drift | Types are the contract |
Both together provide defense in depth.
Response:
- Plain array return types continue to work unchanged
- New syntax is opt-in and additive
- No declare(strict_arrays=1) = no validation overhead
- Static analyzers can read native types (better than parsing docblocks)
Response: This RFC supports both return types AND parameter types, but intentionally excludes property types:
- Different validation timing: Properties require validation on every write, not just at function boundaries
- Performance implications: Write-time validation would affect all array assignments
- Complexity: Property variance rules differ from parameter/return variance
- Future extensibility: Property types can be a follow-up RFC after gathering feedback on boundary validation
Response: Native types provide clear, actionable errors:
TypeError: getUser(): Return value must be of type array{id: int, name: string},
key 'name' must be of type string, int given
// vs. current state:
Warning: Undefined array key "name" in /app/View.php on line 847
The error tells you exactly what's wrong and where.
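To make the message format concrete, here is a minimal userland sketch in current PHP that produces an error of the same form. The helper assertShape() and its array-of-type-names argument are assumptions made for illustration; the RFC performs this check in the engine, and this sketch covers only flat shapes with exact type names.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: $shape maps each required key to a type name as
// reported by get_debug_type() ("int", "string", "float", "bool", ...).
function assertShape(array $value, array $shape, string $context): void
{
    $parts = [];
    foreach ($shape as $key => $type) {
        $parts[] = "$key: $type";
    }
    $signature = 'array{' . implode(', ', $parts) . '}';

    foreach ($shape as $key => $type) {
        if (!array_key_exists($key, $value)) {
            throw new TypeError(
                "$context: Return value must be of type $signature, key '$key' is missing"
            );
        }
        $actual = get_debug_type($value[$key]);
        if ($actual !== $type) {
            throw new TypeError(
                "$context: Return value must be of type $signature, "
                . "key '$key' must be of type $type, $actual given"
            );
        }
    }
}

try {
    assertShape(['id' => 1, 'name' => 42], ['id' => 'int', 'name' => 'string'], 'getUser()');
} catch (TypeError $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

The failure is reported at the function boundary that produced the bad value, rather than at the later line that happens to read the key.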
function getData(): array{value: int|string} {
return ['value' => 'string']; // Or int
}
Proposal: Yes, leverage existing union type support in zend_type.
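The intended semantics of a union-typed shape value can be approximated in current PHP as follows. This is a sketch only: matchesUnion() is a hypothetical userland helper that compares get_debug_type() against the members of the union, whereas the proposal would reuse the engine's zend_type machinery.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of the RFC): true when $value matches
// any member of a union written as "int|string".
function matchesUnion(mixed $value, string $union): bool
{
    return in_array(get_debug_type($value), explode('|', $union), true);
}

var_dump(matchesUnion('text', 'int|string')); // bool(true)
var_dump(matchesUnion(42, 'int|string'));     // bool(true)
var_dump(matchesUnion(1.5, 'int|string'));    // bool(false)
```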
function getData(): array{0: int, 1: int} {
return [0 => 1, 1 => 2]; // ✓ OK
return ['0' => 1, '1' => 2]; // ✓ OK? (string keys "0", "1")
}
Proposal: Follow PHP's standard array key coercion rules.
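The standard coercion rule referred to here can be observed in current PHP without the proposed syntax. The snippet only demonstrates existing array behaviour; it makes no claim about how the patch matches keys internally.

```php
<?php
// Standard PHP behaviour: string keys that are canonical decimal integers
// are cast to int when the array is built.
$fromInts    = [0 => 1, 1 => 2];
$fromStrings = ['0' => 1, '1' => 2];

var_dump($fromInts === $fromStrings);            // bool(true)
var_dump(array_keys($fromStrings) === [0, 1]);   // bool(true)

// Non-canonical numeric strings stay strings, so they are different keys.
$padded = ['00' => 1, '01' => 2];
var_dump(array_keys($padded) === ['00', '01']);  // bool(true)
```

Under this rule, a shape declared with keys 0 and 1 would accept an array written with '0' and '1', because both spellings produce the same array.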
var_export(getData());
// Current: array ( 'id' => 1, ... )
// Option: array{id: int}( 'id' => 1, ... )
Proposal: No, keep var_export() output unchanged for BC. Type info is in function signature.
Primary vote: Accept "Array Shape Return Types" as proposed?
- Yes
- No
Voting started: TBD
Voting ends: TBD
Required majority: 2/3
- Implementation: https://github.com/signalforger/php-src/tree/feature/array-shapes
- Patch file: https://github.com/signalforger/php-array-shapes/blob/main/patches/array-shapes.patch
- Test Suite: https://github.com/signalforger/php-src/tree/feature/array-shapes/Zend/tests/type_declarations/array_shapes
- v1.0 (2024-12-25): Initial draft
- v1.1 (2025-12-27): Added performance optimizations (type tagging cache, loop unrolling, escape analysis)
- v1.2 (2025-12-28): Added parameter type validation, array<K, V> map types, optional keys support
- v1.3 (2025-12-30): Code cleanup, fixed nested array<array<T>> parsing with lexer-level >> splitting
- v1.4 (2025-12-31): Added shape keyword for reusable type aliases, shape autoloading via spl_autoload_register(), shape_exists() function
This document is placed in the public domain or under CC0-1.0-Universal license, whichever is more permissive.