
Wildcards give strange results when listing directories #582

Closed
judgej opened this issue Dec 15, 2015 · 6 comments

@judgej
Member

judgej commented Dec 15, 2015

Wildcards kind of work, but kind of don't.

  • I'm using the built-in FTP driver with FlySystem 1.x on Laravel 5.2.
  • The base directory is 'testing'.
  • I'm scanning files in testing/Outgoing/

Files have the form NNNN_ACK.xml so I'm using the wildcard Outgoing/*_ACK.xml to find them.

        $filesystem = new Filesystem(new FtpAdapter([
            'host' => $host,
            'username' => $username,
            'password' => $password,

            'port' => 21,
            'root' => '/testing',
            'passive' => true,
            'ssl' => false,
            'timeout' => 30,
        ]));

        $contents = $filesystem->listContents('Outgoing/*_ACK.xml');

I get results like this:

0 => array:9 [▼
    "type" => "file"
    "path" => "Outgoing/*_ACK.xml/1412201504360801_ACK.xml"
    "visibility" => "public"
    "size" => 525
    "timestamp" => 1450110960
    "dirname" => "Outgoing/*_ACK.xml"
    "basename" => "1412201504360801_ACK.xml"
    "extension" => "xml"
    "filename" => "1412201504360801_ACK"
  ]

The problem is the extra /*_ACK.xml in the returned path of the file. The remote OS won't be adding this in, so I guess Flysystem is making an assumption that there are no wildcards, and what has been provided is an exact directory.

Similarly the dirname entry should be Outgoing and not Outgoing/*_ACK.xml

@frankdejonge
Member

Flysystem is not designed to handle wildcard searches. So the error here is that the * is not escaped.

@judgej
Member Author

judgej commented Dec 15, 2015

Are wildcard searches something that could be supported, or is normalising that across the various providers impractical? If so, then I guess I could extend the FTP driver to handle wildcards separately from the directory.

In my current project I have to sift through some very large directories to find unprocessed files, and not pulling the file details I don't need down to the application would help to speed things up and keep data transfers and memory usage down.
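
Handling the wildcard separately from the directory could look roughly like the sketch below. It is not part of Flysystem: `splitWildcardPath()` and `filterByGlob()` are hypothetical helper names, the `$listing` array is sample data standing in for the result of `$filesystem->listContents('Outgoing')`, and the matching relies on PHP's built-in `fnmatch()`. The filtering still happens in the application, so the full listing is transferred either way.

```php
<?php
// Hypothetical helper (not a Flysystem API): split "Outgoing/*_ACK.xml" into
// the directory to list and the glob to match basenames against.
function splitWildcardPath(string $path): array
{
    $pos = strrpos($path, '/');
    if ($pos === false) {
        // No directory part: the whole string is the pattern.
        return ['', $path];
    }

    return [substr($path, 0, $pos), substr($path, $pos + 1)];
}

// Hypothetical helper: keep only entries whose basename matches the glob.
// $contents is assumed to have the shape returned by Flysystem 1.x
// listContents(), i.e. each entry carries a 'basename' key.
function filterByGlob(array $contents, string $glob): array
{
    return array_values(array_filter($contents, function (array $entry) use ($glob) {
        return fnmatch($glob, $entry['basename']);
    }));
}

// Sample data standing in for $filesystem->listContents('Outgoing').
$listing = [
    ['type' => 'file', 'path' => 'Outgoing/1412201504360801_ACK.xml', 'basename' => '1412201504360801_ACK.xml'],
    ['type' => 'file', 'path' => 'Outgoing/readme.txt', 'basename' => 'readme.txt'],
];

[$directory, $glob] = splitWildcardPath('Outgoing/*_ACK.xml');
$matches = filterByGlob($listing, $glob);
```

With a real filesystem, `$directory` would be passed to `listContents()` and the result to `filterByGlob()`, so the returned `path` and `dirname` values come from the plain directory listing and never contain the wildcard.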

@frankdejonge
Member

@judgej normalising it across the different adapters is virtually impossible in the current listContents contract without bloating it. For instance, some adapters would have to polyfill the behaviour heavily. I'd rather introduce a listMatching and really get into the how and why of filtering than pollute the current implementation. But that'd be a 2.0 exercise.

@judgej
Member Author

judgej commented Dec 16, 2015

listMatching() - sounds like a plan :-) In the meantime I'm just doing a list of all files, then throwing the result through array_filter() to discard those that don't match a RE. It will work for now, while the file quantities are low, and I can revisit it later when 2.0 is further down the line. Thanks.

In case it is useful to others:

    // The Flysystem Filesystem object.
    protected $filesystem;

    /**
     * Scan for files matching a given RE at a given path.
     * Returns a Flysystem normalised array.
     */
    protected function scanFiles($path, $match = null)
    {
        // There is no reliable wildcard matching at this level, so get a listing
        // of all files in the directory.
        $contents = $this->filesystem->listContents($path);

        // Filter out files that don't match the RE.
        if ( ! empty($match)) {
            $contents = array_filter($contents, function ($file) use ($match) {
                return preg_match($match, $file['basename']);
            });
        }

        return $contents;
    }

Further checks could be added to filter out directories, check if the path really is a directory and not a file, and that it exists etc. but it's a start. Could be extended to wildcards in directories too, with a recursive listContents().

Usage is something like this:

$files = $this->scanFiles('/uploads', '/.+_ACK\.xml$/');
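
The further checks mentioned above (skipping directories, matching across a recursive listing) might be sketched as follows. This is an illustration under assumptions, not the code from the project: `filterFilesByRegex()` is a hypothetical name, the `$contents` array is sample data standing in for `$this->filesystem->listContents('/uploads', true)` (the second argument requests a recursive listing in Flysystem 1.x), and directory entries are assumed to carry `'type' => 'dir'`.

```php
<?php
// Hypothetical variant of the filter step in scanFiles(): drops anything that
// is not a file, applies the optional RE to the basename, and reindexes the
// result so the keys are sequential.
function filterFilesByRegex(array $contents, ?string $match = null): array
{
    $files = array_filter($contents, function (array $entry) use ($match) {
        // Skip directories and any entry without a 'file' type.
        if (($entry['type'] ?? null) !== 'file') {
            return false;
        }

        // No pattern means every file is kept.
        return $match === null || preg_match($match, $entry['basename']) === 1;
    });

    return array_values($files);
}

// Sample data standing in for a recursive listContents() result.
$contents = [
    ['type' => 'dir',  'path' => 'uploads/old_ACK.xml',      'basename' => 'old_ACK.xml'],
    ['type' => 'file', 'path' => 'uploads/0001_ACK.xml',     'basename' => '0001_ACK.xml'],
    ['type' => 'file', 'path' => 'uploads/sub/0002_ACK.xml', 'basename' => '0002_ACK.xml'],
    ['type' => 'file', 'path' => 'uploads/notes.txt',        'basename' => 'notes.txt'],
];

$files = filterFilesByRegex($contents, '/.+_ACK\.xml$/');
```

Because the RE is applied to `basename` only, a directory whose name happens to match is still excluded by the type check, and files in subdirectories are matched on their own name rather than their full path.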

@frankdejonge
Member

@judgej that's what I do. I almost always try to solve this through structure. It seems that all the *_ACK.xml files could have been placed in ACK/{name}.xml, for which only a simple listing is needed.

@judgej
Member Author

judgej commented Dec 16, 2015

Yep - I'm connecting to a third-party API, which consists of an FTP account and file structure I can't change. At the very least, I might try to get them to enable SFTP, and Flysystem will make switching to that a cinch :-)


2 participants