Recursive iteration #36

Open
wants to merge 18 commits into
base: master
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -8,7 +8,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
<br>

## master
Nothing yet
### Added
- Recursive iteration via `recursive` option.

<br>

60 changes: 39 additions & 21 deletions README.md
@@ -18,6 +18,7 @@ for PHP >=7.0. See [TL;DR](#tl-dr). No dependencies in production except optiona
+ [Parsing nested values in arrays](#parsing-nested-values)
+ [Parsing a single scalar value](#getting-scalar-values)
+ [Parsing multiple subtrees](#parsing-multiple-subtrees)
+ [Recursive iteration](#recursive)
+ [What is JSON Pointer anyway?](#json-pointer)
* [Options](#options)
* [Parsing streaming responses from a JSON API](#parsing-json-stream-api-responses)
@@ -318,6 +319,38 @@ foreach ($fruits as $key => $value) {
}
```

<a name="recursive"></a>
### Recursive iteration (BETA)
Recursive iteration can be enabled by setting the `recursive` option to `true`.
Every JSON iterable (array or object) that JSON Machine encounters will then be yielded as an instance of `NestedIterator`.
No JSON array or object will be materialized and kept in memory.
The only PHP values that get materialized are scalar values.
Let's see an example with many, many users, each with many, many friends:

```php
<?php

use JsonMachine\Items;

$users = Items::fromFile('users.json', ['recursive' => true]);
foreach ($users as $user) { // $user instanceof Traversable, not an array/object
    foreach ($user as $userField => $userValue) {
        if ($userField === 'friends') {
            foreach ($userValue as $friend) { // $userValue instanceof Traversable, not an array/object
                foreach ($friend as $friendField => $friendValue) { // $friend instanceof Traversable, not an array/object
                    // do whatever you want here
                }
            }
        }
    }
}
```
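
If one small nested value is easier to handle as a plain array, it can be materialized selectively while everything else stays lazy. The following is a minimal sketch, not part of JSON Machine: `materialize()` and `lazyUser()` are hypothetical names, and plain generators stand in for the lazily served nodes so the snippet runs without the library.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: recursively turns a lazy node into plain PHP arrays.
 * Scalars and null pass through unchanged.
 */
function materialize($value)
{
    if (!$value instanceof \Traversable) {
        return $value;
    }

    $result = [];
    foreach ($value as $key => $item) {
        // Each nested level is fully consumed before the outer one advances.
        $result[$key] = materialize($item);
    }

    return $result;
}

/**
 * Hypothetical stand-in for one lazily served user (assumption: nested
 * levels behave like generators yielding key => value pairs).
 */
function lazyUser(): \Generator
{
    yield 'name' => 'Ann';
    yield 'friends' => (function () {
        yield 0 => (function () { yield 'name' => 'Bob'; })();
        yield 1 => (function () { yield 'name' => 'Eve'; })();
    })();
}

$user = materialize(lazyUser());
```

JSON objects end up as associative arrays in this sketch, so it is best applied only to subtrees that are known to fit in memory.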

> If you break out of the iteration of such a lazy deeper level (e.g. you skip some `"friends"` via `break`)
> and advance to the next value (e.g. the next `user`), you will not be able to iterate the skipped level later.
> JSON Machine must iterate it in the background to be able to read the next value.
> Any later attempt to iterate it will result in a closed generator exception.

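The caveat above can be reproduced with plain PHP generators. This is a sketch under assumptions: `lazyFriends()` is a hypothetical stand-in for a nested level, and the `while` loop only imitates the parser reading ahead in the background. It does not use JSON Machine itself.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for a lazily served nested JSON array.
 */
function lazyFriends(array $names): \Generator
{
    foreach ($names as $name) {
        yield $name;
    }
}

$friends = lazyFriends(['alice', 'bob', 'carol']);

// Read only the first friend, then leave the loop early.
foreach ($friends as $friend) {
    break;
}

// Imitate the parser draining the rest of the level so it can reach the next value.
while ($friends->valid()) {
    $friends->next();
}

// Coming back to the skipped level later fails, because the generator has finished.
$message = null;
try {
    foreach ($friends as $friend) {
        // never reached
    }
} catch (\Exception $e) {
    $message = $e->getMessage();
}
```

If the skipped values are needed, the safer pattern is to consume them, or copy what is required, before moving on to the next outer value.
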
<a name="json-pointer"></a>
### What is JSON Pointer anyway?
It's a way of addressing one item in a JSON document. See the [JSON Pointer RFC 6901](https://tools.ietf.org/html/rfc6901).
@@ -345,6 +378,7 @@ Some examples:
Options may change how a JSON document is parsed. The array of options is the second parameter of all `Items::from*` functions.
Available options are:
- `pointer` - A JSON Pointer string that tells which part of the document you want to iterate.
- `recursive` - Bool. Any JSON array/object the parser hits will not be decoded but served lazily as a `Traversable`. Default `false`.
- `decoder` - An instance of `ItemDecoder` interface.
- `debug` - `true` or `false` to enable or disable the debug mode. When the debug mode is enabled, data such as line,
column and position in the document are available during parsing or in exceptions. Keeping debug disabled adds slight
@@ -516,30 +550,14 @@ but you forgot to specify a JSON Pointer. See [Parsing a subtree](#parsing-a-sub
### "That didn't help"
The other reason may be that one of the items you iterate is itself so huge that it cannot be decoded at once.
For example, you iterate over users and one of them has thousands of "friend" objects in it.
Use `PassThruDecoder` which does not decode an item, get the json string of the user
and parse it iteratively yourself using `Items::fromString()`.

```php
<?php

use JsonMachine\Items;
use JsonMachine\JsonDecoder\PassThruDecoder;

$users = Items::fromFile('users.json', ['decoder' => new PassThruDecoder]);
foreach ($users as $user) {
foreach (Items::fromString($user, ['pointer' => "/friends"]) as $friend) {
// process friends one by one
}
}
```
The most efficient solution is to set the `recursive` option to `true`.
See [Recursive iteration](#recursive).

<a name="step3"></a>
### "I am still out of luck"
It probably means that the JSON string `$user` itself or one of the friends are too big and do not fit in memory.
However, you can try this approach recursively. Parse `"/friends"` with `PassThruDecoder` getting one `$friend`
json string at a time and then parse that using `Items::fromString()`... If even that does not help,
there's probably no solution yet via JSON Machine. A feature is planned which will enable you to iterate
any structure fully recursively and strings will be served as streams.
It probably means that a single JSON scalar string itself is too big to fit in memory.
For example, a very big base64-encoded file.
In that case, you will probably still be out of luck until JSON Machine supports yielding scalar values as PHP streams.

<a name="installation"></a>
## Installation
100 changes: 100 additions & 0 deletions src/FacadeTrait.php
@@ -0,0 +1,100 @@
<?php

declare(strict_types=1);

namespace JsonMachine;

use JsonMachine\Exception\InvalidArgumentException;
use JsonMachine\JsonDecoder\ExtJsonDecoder;
use JsonMachine\JsonDecoder\ItemDecoder;

trait FacadeTrait
{
    /**
     * @var Parser
     */
    private $parser;

    /**
     * @var bool
     */
    private $debugEnabled;

    public function isDebugEnabled(): bool
    {
        return $this->debugEnabled;
    }

    /**
     * @param iterable $bytesIterator
     *
     * @throws InvalidArgumentException
     */
    private static function createParser($bytesIterator, ItemsOptions $options, bool $recursive): Parser
    {
        if ($options['debug']) {
            $tokensClass = TokensWithDebugging::class;
        } else {
            $tokensClass = Tokens::class;
        }

        return new Parser(
            new $tokensClass(
                $bytesIterator
            ),
            $options['pointer'],
            $options['decoder'] ?: new ExtJsonDecoder(),
            $recursive
        );
    }

    /**
     * @throws Exception\JsonMachineException
     */
    public function getPosition()
    {
        return $this->parser->getPosition();
    }

    public function getJsonPointers(): array
    {
        return $this->parser->getJsonPointers();
    }

    /**
     * @throws Exception\JsonMachineException
     */
    public function getCurrentJsonPointer(): string
    {
        return $this->parser->getCurrentJsonPointer();
    }

    /**
     * @throws Exception\JsonMachineException
     */
    public function getMatchedJsonPointer(): string
    {
        return $this->parser->getMatchedJsonPointer();
    }

    /**
     * @param string $string
     */
    abstract public static function fromString($string, array $options = []): self;

    /**
     * @param string $file
     */
    abstract public static function fromFile($file, array $options = []): self;

    /**
     * @param resource $stream
     */
    abstract public static function fromStream($stream, array $options = []): self;

    /**
     * @param iterable $iterable
     */
    abstract public static function fromIterable($iterable, array $options = []): self;

}
98 changes: 6 additions & 92 deletions src/Items.php
@@ -5,38 +5,13 @@
namespace JsonMachine;

use JsonMachine\Exception\InvalidArgumentException;
use JsonMachine\JsonDecoder\ExtJsonDecoder;
use JsonMachine\JsonDecoder\ItemDecoder;

/**
 * Entry-point facade for JSON Machine.
 */
final class Items implements \IteratorAggregate, PositionAware
{
    /**
     * @var iterable
     */
    private $chunks;

    /**
     * @var string
     */
    private $jsonPointer;

    /**
     * @var ItemDecoder|null
     */
    private $jsonDecoder;

    /**
     * @var Parser
     */
    private $parser;

    /**
     * @var bool
     */
    private $debugEnabled;
    use FacadeTrait;

    /**
     * @param iterable $bytesIterator
@@ -46,71 +21,47 @@ final class Items implements \IteratorAggregate, PositionAware
    public function __construct($bytesIterator, array $options = [])
    {
        $options = new ItemsOptions($options);

        $this->chunks = $bytesIterator;
        $this->jsonPointer = $options['pointer'];
        $this->jsonDecoder = $options['decoder'];
        $this->debugEnabled = $options['debug'];

        if ($this->debugEnabled) {
            $tokensClass = TokensWithDebugging::class;
        } else {
            $tokensClass = Tokens::class;
        }

        $this->parser = new Parser(
            new $tokensClass(
                $this->chunks
            ),
            $this->jsonPointer,
            $this->jsonDecoder ?: new ExtJsonDecoder()
        );
        $this->parser = $this->createParser($bytesIterator, $options, false);
    }

    /**
     * @param string $string
     *
     * @return self
     *
     * @throws InvalidArgumentException
     */
    public static function fromString($string, array $options = [])
    public static function fromString($string, array $options = []): self
    {
        return new self(new StringChunks($string), $options);
    }

    /**
     * @param string $file
     *
     * @return self
     *
     * @throws Exception\InvalidArgumentException
     */
    public static function fromFile($file, array $options = [])
    public static function fromFile($file, array $options = []): self
    {
        return new self(new FileChunks($file), $options);
    }

    /**
     * @param resource $stream
     *
     * @return self
     *
     * @throws Exception\InvalidArgumentException
     */
    public static function fromStream($stream, array $options = [])
    public static function fromStream($stream, array $options = []): self
    {
        return new self(new StreamChunks($stream), $options);
    }

    /**
     * @param iterable $iterable
     *
     * @return self
     *
     * @throws Exception\InvalidArgumentException
     */
    public static function fromIterable($iterable, array $options = [])
    public static function fromIterable($iterable, array $options = []): self
    {
        return new self($iterable, $options);
    }
@@ -125,41 +76,4 @@ public function getIterator()
    {
        return $this->parser->getIterator();
    }

    /**
     * @throws Exception\JsonMachineException
     */
    public function getPosition()
    {
        return $this->parser->getPosition();
    }

    public function getJsonPointers(): array
    {
        return $this->parser->getJsonPointers();
    }

    /**
     * @throws Exception\JsonMachineException
     */
    public function getCurrentJsonPointer(): string
    {
        return $this->parser->getCurrentJsonPointer();
    }

    /**
     * @throws Exception\JsonMachineException
     */
    public function getMatchedJsonPointer(): string
    {
        return $this->parser->getMatchedJsonPointer();
    }

    /**
     * @return bool
     */
    public function isDebugEnabled()
    {
        return $this->debugEnabled;
    }
}
6 changes: 6 additions & 0 deletions src/ItemsOptions.php
@@ -69,12 +69,18 @@ private function opt_debug(bool $debug)
        return $debug;
    }

    private function opt_recursive(bool $recursive)
    {
        return $recursive;
    }

    public static function defaultOptions(): array
    {
        return [
            'pointer' => '',
            'decoder' => new ExtJsonDecoder(),
            'debug' => false,
            'recursive' => false,
        ];
    }
}
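
How the `opt_*` methods are invoked is not visible in this hunk. One plausible arrangement, shown purely as an assumption, is to merge user options over the defaults and dispatch each one to a validator method named after it. `SketchOptions` and `get()` are hypothetical names, and only the two boolean options are modelled.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch of per-option validation; not the ItemsOptions class.
 */
final class SketchOptions
{
    /** @var array */
    private $options = [];

    public function __construct(array $userOptions = [])
    {
        $defaults = self::defaultOptions();

        // Reject keys that have no default, i.e. unknown options.
        $unknown = array_diff_key($userOptions, $defaults);
        if ($unknown) {
            throw new \InvalidArgumentException(
                'Unknown option(s): ' . implode(', ', array_keys($unknown))
            );
        }

        foreach (array_merge($defaults, $userOptions) as $name => $value) {
            // Dispatch to opt_<name>; the bool type hint rejects other types
            // because strict_types is enabled in this file.
            $this->options[$name] = $this->{'opt_' . $name}($value);
        }
    }

    public function get(string $name)
    {
        return $this->options[$name];
    }

    private function opt_debug(bool $debug): bool
    {
        return $debug;
    }

    private function opt_recursive(bool $recursive): bool
    {
        return $recursive;
    }

    public static function defaultOptions(): array
    {
        return [
            'debug' => false,
            'recursive' => false,
        ];
    }
}

$options = new SketchOptions(['recursive' => true]);
```

With this arrangement, adding an option such as `recursive` only takes a new `opt_*` method and a default entry, which matches the shape of the change above.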