Strange 'find' tool behaviour when scanning mounted .zip file #104
Comments
Could it be that your Does it happen for other FUSE mounts, e.g., archivemount, sshfs, etc.? Does it happen for mounted tar archives as opposed to zip? |
I cannot reproduce the problem. Even when using sudo, the AppImage and your specified options. Additionally to my previous suggestions: what do the permissions on |
|
I'm attaching my .zip file. ATTENTION: some of the files inside it contain Windows viruses. |
Hm. It might be better if you deleted the attachment again 😅 . But I can indeed reproduce the problem. An idea that comes to mind is the I tried setting the size to 1 but it doesn't help ... I think this might be more helpful to report for the |
find -L /tmp/mnt does go into subdirs |
This small Qt program does not recurse into subdirs either...
#include <QCoreApplication>
#include <QDirIterator>
#include <QDebug>
int main(int argc, char *argv[])
{
QDirIterator it(argv[1], QDir::Files|QDir::Hidden, QDirIterator::Subdirectories);
QString path;
for( path = it.next(); !path.isEmpty(); path = it.next()) {
QFileInfo fileInfo(path);
if(!fileInfo.isFile() || fileInfo.isSymLink()) {
qDebug() << "Maybe Ignoring" << path <<
"fileInfo:" << fileInfo.isFile() << "isSymlink:" << fileInfo.isSymLink() <<
"isDir:" << fileInfo.isDir();
if (fileInfo.isDir()) {
qDebug() << path << "is a directory. Will process it.";
}
continue;
}
qDebug() << path;
}
return 0;
} |
Thank you for the minimal reproducer in Qt! I can reproduce it with that.
Everything looks fine here and it still does not show the contained file. Test archive with dummy data (probably any archive with a subfolder is fine; I might have only tested an archive with a single file initially):
mkdir subdir
echo foo > subdir/mimi.exe
zip subdir.zip subdir/mimi.exe
One possible problem that comes to mind is the readdir interface. There are two possible FUSE implementations for that: one that only returns the names and one that returns the names and all stats for each entry. Maybe something is wrong with that. |
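For anyone without the zip command at hand, a comparable test archive can be built with Python's standard zipfile module. This is only a sketch: the helper names `build_test_archive` and `list_entries` are made up for illustration, and it is an assumption that the result is close enough to the output of the shell commands above to trigger the same behaviour. Note that this writes only the file entry, so the parent directory exists in the archive only implicitly.

```python
import io
import zipfile


def build_test_archive(target):
    """Write a zip holding a single member, subdir/mimi.exe.

    `target` may be a file path or a writable binary file object.
    """
    with zipfile.ZipFile(target, "w") as archive:
        # Only the file entry is written; no explicit "subdir/" entry is added.
        archive.writestr("subdir/mimi.exe", "foo\n")


def list_entries(source):
    """Return the member names stored in the archive."""
    with zipfile.ZipFile(source) as archive:
        return archive.namelist()


if __name__ == "__main__":
    buffer = io.BytesIO()
    build_test_archive(buffer)
    buffer.seek(0)
    print(list_entries(buffer))
```

Passing a path such as "subdir.zip" instead of the in-memory buffer writes the archive to disk so it can be mounted.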
The following program (C++17) shows the same behaviour:
#include <filesystem>
#include <iostream>
void ls_recursive(const std::filesystem::path& path) {
for(const auto& p: std::filesystem::recursive_directory_iterator(path)) {
std::cout << "seeing: " << p.path() << '\n';
if (!std::filesystem::is_directory(p)) {
std::cout << p.path() << '\n';
}
}
}
int main(int argc, char *argv[])
{
ls_recursive(argv[1]);
return 0;
}
|
Even this one does not work correctly:
#include <stdio.h>
#include <dirent.h>
#include <sys/stat.h>
#include <string.h>
void list_dir(const char *path) {
DIR *dir = opendir(path);
if (dir == NULL) {
perror("opendir");
return;
}
struct dirent *entry;
while ((entry = readdir(dir)) != NULL) {
if (entry->d_type == DT_DIR) {
// Ignore the "." and ".." directories
if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
continue;
}
// Recurse into the subdirectory
char new_path[1024];
snprintf(new_path, sizeof(new_path), "%s/%s", path, entry->d_name);
list_dir(new_path);
} else {
// Print the file name
printf("%s/%s\n", path, entry->d_name);
}
}
closedir(dir);
}
int main(int argc, char *argv[]) {
if (argc < 2) {
fprintf(stderr, "Usage: %s <directory>\n", argv[0]);
return 1;
}
list_dir(argv[1]);
return 0;
}
Apparently entry->d_type == DT_DIR is FALSE !!!! |
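To narrow this down, it may help to compare what readdir reports against what lstat reports for every entry. The sketch below does that in Python; `find_type_mismatches` is a hypothetical helper name. It relies on the documented behaviour that `os.DirEntry.is_dir()` usually answers from the directory listing and only needs a stat call for symlinks or when the type is unknown, while `os.lstat()` always queries the full mode. That would also fit the observation that tools which stat every entry still recurse, though the thread does not confirm this as the cause.

```python
import os
import stat
import sys


def find_type_mismatches(root):
    """Walk `root` using lstat() and report entries where readdir disagrees.

    Returns a list of (path, readdir_says_dir, lstat_says_dir) tuples.
    """
    mismatches = []
    pending = [root]
    while pending:
        current = pending.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # Usually answered from the d_type that readdir returned.
                readdir_says_dir = entry.is_dir(follow_symlinks=False)
                # Always asks the filesystem for the full mode.
                lstat_says_dir = stat.S_ISDIR(os.lstat(entry.path).st_mode)
                if readdir_says_dir != lstat_says_dir:
                    mismatches.append(
                        (entry.path, readdir_says_dir, lstat_says_dir)
                    )
                # Recurse based on lstat so that mistyped subdirs are still visited.
                if lstat_says_dir:
                    pending.append(entry.path)
    return mismatches


if __name__ == "__main__" and len(sys.argv) > 1:
    for path, from_readdir, from_lstat in find_type_mismatches(sys.argv[1]):
        print(f"{path}: readdir dir={from_readdir}, lstat dir={from_lstat}")
```

On an ordinary local filesystem this should print nothing; any output for the mount point lists the entries whose reported type differs between the two calls.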
This is probably related: |
Well, I see that the d_type for the offending subdirs is 8, which is DT_REG (regular file). |
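For reference, the Linux d_type values correspond to the file type bits of st_mode shifted right by 12 (glibc's IFTODT macro). The sketch below reproduces that mapping in Python; `mode_to_d_type` is a hypothetical helper name. One possible reading, which is an assumption and not something established here, is that a directory reported with regular-file type bits in its mode would end up as DT_REG in the listing.

```python
import stat

# d_type values from <dirent.h> on Linux.
DT_UNKNOWN = 0
DT_DIR = 4
DT_REG = 8
DT_LNK = 10


def mode_to_d_type(st_mode):
    """Mirror glibc's IFTODT macro: keep the file type bits, shift right by 12."""
    return stat.S_IFMT(st_mode) >> 12


if __name__ == "__main__":
    print(mode_to_d_type(stat.S_IFDIR | 0o755))  # 4, i.e. DT_DIR
    print(mode_to_d_type(stat.S_IFREG | 0o644))  # 8, i.e. DT_REG
    print(mode_to_d_type(0o755))                 # 0, i.e. DT_UNKNOWN (no type bits)
```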
I don't get it!? I added debug output for the returned mode and I also tried modifying the mode to match that of a mounted identical tar file. The tar file works but not the zip file ... The output of Fortunately, the fact that the TAR backend works, but not the zip backend, gave me an idea: Please try the development version, which works for me:
python3 -m pip install --user --force-reinstall 'git+https://github.com/mxmlnkn/ratarmount.git@develop#egginfo=ratarmountcore&subdirectory=core'
python3 -m pip install --user --force-reinstall 'git+https://github.com/mxmlnkn/ratarmount.git@develop#egginfo=ratarmount'
The development version refactors the zip backend to use the same index backend as the tar backend. And for some reason it works there. |
It still does not work here:
|
Btw another small bug in exception message:
|
Unfortunately, the With the develop branch version I cannot reproduce the issue, not even with your uploaded zip file even though I could reproduce that issue with ratarmountcore 0.4.0. |
Yes, you were right, somehow I had an old version of ratarmountcore... Now it works correctly. |
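Since a stale ratarmountcore was the culprit here, a quick way to see which versions Python actually picks up is to query the installed package metadata. A minimal sketch using the standard importlib.metadata module (Python 3.8+); `installed_version` is a hypothetical helper name, and the reported versions depend on which interpreter and environment runs the script.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("ratarmount", "ratarmountcore"):
        print(name, installed_version(name) or "not installed")
```

Run it with the same python3 that was used for the pip install commands, otherwise a different site-packages directory may be inspected.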
Can you please build an AppImage for this version? |
I've sent a link in the Telegram group because GitHub does not allow attachments over 25 MB. It seems like the addition of pragzip or the update to Python 3.11 pushed me over that limit. I'll try to do a new official release this weekend. I think I wanted to hear back from the original issue reporter who wanted the index for zip archives, but he did not write back and I forgot about this because of pragzip. |
I have not done any formal testing yet, but I have the impression that access to a big .zip file is muuuuch slower. |
:( That is frustrating. I did some benchmarks for #98 and it was vastly faster there. But maybe I did the wrong kind of benchmarks. Please specify your exact conditions. Does "big" mean size-wise or number of files? Is it slow for |
After some testing: my bad, I was wrong, the new version is actually faster...
I'm running this command in 3 windows:
With only one window the copy is doing 115-120 MB/s |
What kind of compression does this zip use? Could you send the output of |
I did some more testing, this time on .tar.gz files
I see that it does not use the pragzip backend when generating the sqlite index file... |
It is compressed: |
Sorry about that. This is intended as of now. The reason is this singular issue. :/ Without implementing that, memory usage could grow up to |
Ok, the |
I've personally rarely seen a .zip file achieving more than 2x compression. |
I've multiple cases with a compression ratio of more than 4 but still under 10. Around 8 maybe. E.g., (build) logs and notably the Chrome Trace Event Format and in general, similar JSON files with lots of redundancy. Pragzip does not yet compute the CRC32 in parallel decompression mode. Using SIMD adds complexities with dynamic dispatch based on supported CPU instruction sets... Then again, SSE4 might be old enough that almost any x86 CPU of the last 10 years supports it. |
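To put numbers on the compression method and ratio questions for a given archive, the zip central directory already stores both sizes per member. The following is a rough sketch with Python's standard zipfile module; `compression_report` and `METHOD_NAMES` are names invented for this example, and methods outside the four listed are shown by their numeric ID.

```python
import io
import zipfile

# Human-readable names for the compression methods zipfile knows about.
METHOD_NAMES = {
    zipfile.ZIP_STORED: "stored",
    zipfile.ZIP_DEFLATED: "deflate",
    zipfile.ZIP_BZIP2: "bzip2",
    zipfile.ZIP_LZMA: "lzma",
}


def compression_report(source):
    """Return (name, method, ratio) for each file member of the archive.

    ratio is uncompressed size / compressed size, or None when the
    compressed size is zero (for example, empty files).
    """
    report = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            method = METHOD_NAMES.get(info.compress_type, str(info.compress_type))
            if info.compress_size:
                ratio = info.file_size / info.compress_size
            else:
                ratio = None
            report.append((info.filename, method, ratio))
    return report


if __name__ == "__main__":
    # Demo on an in-memory archive holding highly redundant, log-like text.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("build.log", "same line again\n" * 10000)
    buffer.seek(0)
    for name, method, ratio in compression_report(buffer):
        print(name, method, ratio)
```

Calling `compression_report` with the path of a real archive gives the per-file picture without extracting anything.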
I'm mounting a .zip file as follows:
sudo -b ./ratarmount-x86_64.AppImage -o ro,allow_other ~/tmp/viruses/000-test-001-003-302.zip /tmp/mnt
Then I'm doing:
And
For some reason find does not recurse into the mount point.
ls -lR, however, has no problem recursing into it.