Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Buggy" initial scan? #1

Closed
ExSport opened this issue Apr 17, 2019 · 15 comments
Closed

"Buggy" initial scan? #1

ExSport opened this issue Apr 17, 2019 · 15 comments

Comments

@ExSport
Copy link

ExSport commented Apr 17, 2019

Hello
Tried your tool and spotted strange behavior:

C:\Progs\Everything>EFUtool.exe temp.efu \\smbserver\temp
EFUtool v1.0.5 (c) 2019 Pedro Fonseca [pbfonseca@gmail.com]

Creating EFU file: temp.efu

Scanning and indexing folders
Finalizing EFU index
Replacing EFU file

Contents:  9,61 GB in 30.247 files, 1.117 folders [18 depth]

EFUtool finished in 0:02:10
C:\Progs\Everything>EFUtool.exe temp.efu \\smbserver\temp
EFUtool v1.0.5 (c) 2019 Pedro Fonseca [pbfonseca@gmail.com]

Updating EFU file: temp.efu

Scanning current EFU index
  Contents:  4,78 GB in 30.247 files, 1.117 folders [18 depth]
Finding changed folders
  1117 folders checked, 1 changed and 0 missing
Reindexing unchanged folders
Scanning new/modified folders
Updating directory sizes
Finalizing EFU index
Replacing EFU file

Contents:  9,61 GB in 30.247 files, 1.117 folders [18 depth]

EFUtool finished in 12,9 seconds
C:\Progs\Everything>EFUtool.exe temp.efu \\smbserver\temp
EFUtool v1.0.5 (c) 2019 Pedro Fonseca [pbfonseca@gmail.com]

Updating EFU file: temp.efu

Scanning current EFU index
  Contents:  9,61 GB in 30.247 files, 1.117 folders [18 depth]
Finding changed folders
  1117 folders checked, 1 changed and 0 missing
Reindexing unchanged folders
Scanning new/modified folders
Updating directory sizes
Finalizing EFU index
Replacing EFU file

Contents:  9,61 GB in 30.247 files, 1.117 folders [18 depth]

EFUtool finished in 0:02:51
C:\Progs\Everything>EFUtool.exe temp.efu \\smbserver\temp
EFUtool v1.0.5 (c) 2019 Pedro Fonseca [pbfonseca@gmail.com]

Updating EFU file: temp.efu

Scanning current EFU index
  Contents:  9,61 GB in 30.247 files, 1.117 folders [18 depth]
Finding changed folders
  1117 folders checked, 1 changed and 0 missing
Reindexing unchanged folders
Scanning new/modified folders
Updating directory sizes
Finalizing EFU index
Replacing EFU file

Contents:  9,61 GB in 30.247 files, 1.117 folders [18 depth]

EFUtool finished in 0:02:53
C:\Progs\Everything>EFUtool.exe temp.efu \\smbserver\temp
EFUtool v1.0.5 (c) 2019 Pedro Fonseca [pbfonseca@gmail.com]

Updating EFU file: temp.efu

Scanning current EFU index
  Contents:  9,61 GB in 30.247 files, 1.117 folders [18 depth]
Finding changed folders
  1117 folders checked, 1 changed and 0 missing
Reindexing unchanged folders
Scanning new/modified folders
Updating directory sizes
Finalizing EFU index
Replacing EFU file

Contents:  9,61 GB in 30.247 files, 1.117 folders [18 depth]

EFUtool finished in 0:02:39

First RUN it is initial SCAN, all others are RESCAN.

The first oddity is that all rescans shows one difference but EFU file is totally same all the time:

  1117 folders checked, 1 changed and 0 missing

Second oddity is that first rescan for some reason was very fast (12 sec compared to 2-3mins) with weird statistics.
It seems initial scan created smaller EFU file with not all files.

Contents:  4,78 GB in 30.247 files, 1.117 folders [18 depth]

When doing other rescans, the CONTENTS is consistent:

Contents:  9,61 GB in 30.247 files, 1.117 folders [18 depth]

So I tried whole procedure again and confirmed that initial scan creates file with same number of lines like rescan but not all folder attributes are populated or fully counted.
Now, the first rescan took about 2.5 minutes. Unfortunately I didn't checked EFU file at time it took 12 secs only and all other retries of initial scan+rescan took standard 2-3mins.
EFU initial files are exactly same during retries. Same for rescanned/updated files (but little different from initial scan).

Initial Scan (example)

Filename Size
\smbserver\temp_Migrate2AWS 0
\smbserver\temp_Migrate2AWS\Install 14960

Rescan (example)

Filename Size
\smbserver\temp_Migrate2AWS 135740893
\smbserver\temp_Migrate2AWS\Install 135740893

Thanks for looking into it. 👍

@zybexXL
Copy link
Owner

zybexXL commented Jun 9, 2019

Hi @ExSport,
Apologies for not replying sooner - looks like I missed your post, didn't get any notification for it.
Thank for reporting these issues. Version 1.0.6 fixes the wrong folder sizes on first scan - they now match the updated folder sizes.

Regarding the "1 changed" - this is the root share folder itself. EFUTool uses the folder timestamp to determine if the contents are changed or not. However, a share has no timestamp, so EFUTool always considers the root folder to be changed to force a rescan/refresh of the root contents. That's why it will always say "1 changed". If your root folder is not a share (like "\\server\share\folder") then a valid timestamp will exist and you will not see this behavior.

Your rescan times are also a bit strange: the 12s one seems to indicate some caching going on (on the share server). I would also expect subsequent scans to be faster than the first scan, but they're not (except that 12s one). EFUTool is really great for huge volumes (I test it on shares with millions of files and about 100K folders) - perhaps it's not so fast for your particular setup.

Let me know if you find any other issues - Github notifications are enabled, but add a mention to @zybexXL just in case.

@ExSport
Copy link
Author

ExSport commented Jun 9, 2019

Hello @zybexXL
Thanks for your fix.
Still some weirdness happens :)

  1. remote server UNC path has about 65 ms RTT.
  2. With such RTT the initial scan takes 2:30-2:50 min.
  3. rescan takes 3:10-3:30 min.
  4. One rescan took 20sec only. After byte to byte comparison of the old and new EFU files, there was no file change (filelist order starts with files from root folder as any other rescan changes this order so root files are moved from start to the different position)
  5. As stated above, initial scan has filelist starting with files from root folder but it is not true for rescans where the order is changed - always! Only once no change was done - at time rescan took 20sec only. Unfortunately after maybe 20 rescans I spotted such fast rescan only once. All the other time rescan was slower then initial scan and filelist order was changed. But not between additional scans!
  6. It seems when there is no change during rescan in fileorder in EFU file, rescan was very fast. Initial filelist starts with files from root but don't know why, next rescan changed this order and makes rescans took longer.
  7. share is static, no changes done there during testing
  8. Spotted that some folders have different size when initial scan and rescans compared. When I did few other new initial scans and rescans, it doesn't happen again. In this example, the initial scan shows incorrect (1774383598) folder size. Btw. all folders have "compression bit" set.
    initial scan:
    "\\smbserver\temp\Server_scans",1774383598,132045312240733675,131763931934786691,10256
    rescans:
    "\\smbserver\temp\Server_scans",1769402862,132045312240733675,131763931934786691,10256
  9. Tested \smbserver\h$\temp with using admin share and tool still shows 1 folder changed (TEMP timestamps are visible). Adding additional trailing slash: \smbserver\h$\temp\ shows 2 folders changed. It generates different filelists as it shows root files in TEMP folder twice when trailing "" is used after "TEMP". It seems it takes TEMP and TEMP\ as different folders.

Thanks for looking into it 👍

@zybexXL
Copy link
Owner

zybexXL commented Jun 10, 2019

Hi @ExSport,
The first scan always writes the root folder contents first, while the re-scan will write the non-modified folders first, and then the modified ones (re-scanned). Since Roots are always considered to be modified, the root contents are written last in re-scans. This is by design and doesn't affect functionality.

You are right, the root folder is ALWAYS marked as new/changed. I did this because of DFS links (mountpoints) which are common in server environments. DFS links have a static timestamp which does not change even when contents of the root folder change, and it's not trivial to detect if a given folder is a DFS link or not (there's no attribute indicating it). To work around it and since DFS links are mostly used as root folders/shares, I've decided to always consider the root folders as "changed".

I've published v1.07 which fixes the trailing slash issue. You would see "2 changed" because the app would count both the root folder you indicated on the command line, plus the root folder already present on the existing EFU file - in the end they would be found to be the same folder (so scan results are correct), but still counted 2 changes. In practice, when doing re-scans, you don't need to specify which folder you are scanning - the app gets that from the existing EFU file already.

Regarding scan times - a compressed volume forces the OS to extract an entire directory even when requesting only a single entry, so when EFUTool scans for changes, all directories have to be de-compressed (by the OS), so I'm guessing that negates any speed advantage of the change-detection algorithm.

The filesize difference you saw... are you sure that isn't real? Maybe some app added/deleted a file in one of the subfolders of that path between your scans. Sometimes just going to a folder with images causes Windows to create/update a hidden thumbs.db file - you should see this change in your EFU comparison too though.

Br,
ZybexXL

@ExSport
Copy link
Author

ExSport commented Jun 10, 2019

Hi @zybexXL
What I remember from past, in Powershell I used for DFS checks properties/attributes like Link/Junction/ReparsePoint or to get DFS link properties I used something like:

$signature = @'
using System;
using System.Collections.Generic;
using System.Management.Automation;
using System.Runtime.InteropServices;

public class Win32Api
{
    [DllImport("netapi32.dll", SetLastError = true)]
    private static extern int NetApiBufferFree(IntPtr buffer);

    [DllImport("Netapi32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern int NetDfsGetClientInfo
    (
    [MarshalAs(UnmanagedType.LPWStr)] string EntryPath,
    [MarshalAs(UnmanagedType.LPWStr)] string ServerName,
    [MarshalAs(UnmanagedType.LPWStr)] string ShareName,
    int Level,
    ref IntPtr Buffer
    );

    public struct DFS_INFO_3
    {
        [MarshalAs(UnmanagedType.LPWStr)]
        public string EntryPath;
        [MarshalAs(UnmanagedType.LPWStr)]
        public string Comment;
        public UInt32 State;
        public UInt32 NumberOfStorages;
        public IntPtr Storages;
    }
    public struct DFS_STORAGE_INFO
    {
        public Int32 State;
        [MarshalAs(UnmanagedType.LPWStr)]
        public string ServerName;
        [MarshalAs(UnmanagedType.LPWStr)]
        public string ShareName;
    }

    public static List<PSObject> NetDfsGetClientInfo(string DfsPath)
    {
        IntPtr buffer = new IntPtr();
        List<PSObject> returnList = new List<PSObject>();

        try
        {
            int result = NetDfsGetClientInfo(DfsPath, null, null, 3, ref buffer);

            if (result != 0)
            {
                throw (new SystemException("Error getting DFS information"));
            }
            else
            {
                DFS_INFO_3 dfsInfo = (DFS_INFO_3)Marshal.PtrToStructure(buffer, typeof(DFS_INFO_3));

                for (int i = 0; i < dfsInfo.NumberOfStorages; i++)
                {
                    IntPtr storage = new IntPtr(dfsInfo.Storages.ToInt64() + i * Marshal.SizeOf(typeof(DFS_STORAGE_INFO)));

                    DFS_STORAGE_INFO storageInfo = (DFS_STORAGE_INFO)Marshal.PtrToStructure(storage, typeof(DFS_STORAGE_INFO));

                    PSObject psObject = new PSObject();

                    psObject.Properties.Add(new PSNoteProperty("State", storageInfo.State));
                    psObject.Properties.Add(new PSNoteProperty("ServerName", storageInfo.ServerName));
                    psObject.Properties.Add(new PSNoteProperty("ShareName", storageInfo.ShareName));

                    returnList.Add(psObject);
                }
            }
        }
        catch (Exception e)
        {
            throw(e);
        }
        finally
        {
            NetApiBufferFree(buffer);
        }
        return returnList;
    }
}
'@

        if (-not ('Win32Api' -as [Type])) {
            Add-Type -TypeDefinition $signature
        }
$DFS = [Win32Api]::NetDfsGetClientInfo($CheckPath)

About the speed. Ok, I hoped there will be some glitch so 20seconds after the fix will be the default behavior but it seems due to some weird glitch user/I got 20sec rescan times what is not good, like in case intermittent network problem during EFUTool rescan what led to skipping "something" what created so short scans.
About the compression bit, if the full traverse is forced locally, still it is better that send such full traverse list from remote to the client as SMB overhead on such slow networks is killing any dirtree queries as user can see in Wireshark. Definitely SMB is not good for queries via WAN networks.
I will try to clear compression bit if anything change in speeds of rescan.
In IT you can't be sure by anything but as I said, the share is "dead backup folder" where no changes are made for ages, by anything (Antivirus, backups, manually browsed by users, etc...) so yes, it should be totally static and as you can see, only some initial scans shown wrong size but any other rescan was ok. Other tries I did, also the initial scan was ok, so there must be some glitch somewhere:-) Anyway folder timestamp stays intact also in case folder size differs so it doesn't seem there is some temp file created and deleted afterwards so during the scans the size changes. I can understand it in case folder timestamp will also change but here there is no change in time but in size only. Weird.
Btw. my tests were done on Wifi via VPN so maybe some network glitch can do this weird behavior. But I should understand it in case wrong folder size will be smaller (some date were not queried and summed up) but not bigger like in our example. Totally weird.
Will try to do more tests later to dig this annoying behavior :)
Thx

@zybexXL
Copy link
Owner

zybexXL commented Jun 10, 2019

@ExSport, thanks for all the feedback!

You do have an environment which is not ideal for performance... Wifi+VPN+Compression... :) We can't easily know what's going on. The 20sec rescan time is what I would expect on normal circumstances (7x faster than first scan), but there's nothing in the code that would make a 3rd scan different.

Getting DFS info for each folder would slow down things too much. NetDfsGetClientInfo is fast, but only returns info already stored on the local client cache - if a given server folder is not in cache yet, then there's no info.
The proper solution would perhaps be to use NetDfsEnum and NetDfsGetInfo to get a list of DFS links on the target volume, but that's outside of scope for now. I may add it some point in the future, but the current "assume it's changed" behavior solves the immediate problem.

@ExSport
Copy link
Author

ExSport commented Jun 13, 2019

@zybexXL
Hi again
So I created script to repeat thousand times creation of initial EFU files. Some hundreds/thousands runs there were no difference, other runs the script logged many anomalies in folder sizes. Checked few of them and....
Btw. I used New-NetQosPolicy to shape in/out data to 1Mbit/s to simulate slower network as I tested it between servers with RTT <1ms. I also run it between two locations connected by DirectConnect, so very reliable connection, RTT about 40ms due to the distance, 500Mbit/s and spotted same issues.
So what I spotted:
Comparing different EFU files between each other (good vs. bad), there is a difference in folder sizes. The difference is in 99% exactly 4096 bytes (cluster size/allocation unit size of the volume). Only in one example I spotted the difference 8192 bytes.
So I summed files from different/incorrect folders defined in EFU file and the sum is correct but not the counted size defined in EFU.
It seems in some circumstances EFU tool incorrectly "rounds" folder sizes which it gets from indexed files. File sizes are correct, but not the final size of its parents folder what EFU tool counts.
It was very unpredictable when EFU tool started to generate incorrect EFU files as sometimes everything was ok after many cycles so I restarted job and in some cases it started to generate invalid EFU files, other time everything was OK, for many hours....so very very hardly reproducible.
Hopefully you will get idea what is "wrong" with info where the difference lies and that it is exactly 4096 bytes 👍
Thx

@zybexXL
Copy link
Owner

zybexXL commented Jun 13, 2019

@ExSport
The code is deterministic, it cannot "sometimes" forget about 4 or 8KB. The code just calls into the OS to get a list of folder contents and adds it up. Any change as you describe must come from the OS API call results.

It sounds like there's some TMP file being created on the target volume (taking 1 or 2 clusters), which sometimes happens during scanning. You should see this file on your EFU compares though.

I suggest you run the same stress test with a simple "dir /s \target\share > someLogFile.txt" and see if you spot the same difference in folder sizes/contents. This is basically what EFUTool is doing :)

@ExSport
Copy link
Author

ExSport commented Jun 13, 2019

@zybexXL
I understand you got file sizes from OS API (NtQueryInformationFile, GetFileSizeEx,...) but I suppose there is no option how to ask OS the size of folder size property so you must sum them up manually or am I wrong? E.g. in EFU file there is a folder with incorrect TOTAL size but its files have correct filesize indexed. If I manually count them, SUM is different with the TOTAL in EFU file for root folder. Another run file sizes are same=correct and also TOTAL of parent folder is OK. When TOTAL is incorrect, there is no change for any of files in subfolders, parent of subfolders itself. It means there was no temporary file added/removed. Strange, right?

As stated above, there is no new TMP file indexed in EFU file, no timestamp change, only incorrect TOTAL (sometimes EFU file have 2 incorrect TOTALS, other time zero or ten. It is unpredictable).

I already did this dir traversing tests during stress tests I described in the morning. There was no difference. I run it on two different servers, at time when EFU file generated incorrect TOTALS on one server but on the other server it was OK, also in such case when EFU files differ between servers, DIR log was exactly same between servers. No size change, file change, timestamp, anything. Totally weird :)
Unfortunately creating EFU export can take only 40 secs or 2mins per server, DIR log export took 3-4 hours.

Powershell or cmd is my daily job so I did a lot of stress tests to see what is happening but unfortunately some tests are so slow so they are useless and when I reduced the scope, it was hard to reproduce it. Tried it locally where speed is ok, but I was not able to reproduce it.
And I don't expect there is some TEMP file generated but deleted immediately so it is not captured by EFU Tool or my scripts and without modifying any TIMESTAMP :)

Btw. how you get TOTALS for folders? Nevermind, we are on GitHub...C#...as I said you count files to get folder size :) So again, if you understood me bad. It is not problem the size of files you get via OS API, problem sometimes arise for TOTALS which are calculated for folders. Unfortunately to reproduce it is a nightmage :))

Thx for support.... Btw. don't lose much time with it, it is nothing critical. I reported it only as I spotted such behavior during few tests (I don't use your tool right now for anything, only wanted to test speed difference between your tool and Everything 👍)
Anyway great tool, thanks for it!

@zybexXL
Copy link
Owner

zybexXL commented Jun 13, 2019

@ExSport
Don't worry, I too like a good discussion and a puzzle :)
There's no "total folder size" property than can be queried. A given folder size is the sum of all files in it, plus the sizes of all subfolders in it (recursively). That is what the code calculates.

If the inputs are always the same, then the outputs MUST always be the same. This is what I mean by "deterministic". There are only 3 ways to get different results on different scans:

  • there is some change to the filesystem (most likely)
  • there's a random element of calculation in the code (there is not, in this case - it's deterministic, 2+2=4 each and every time)
  • you have some hardware/RAM issue that is causing some bit flips! (veeeery unlikely)

Perhaps you can send me a couple of EFU files with those differences?

@ExSport
Copy link
Author

ExSport commented Jun 13, 2019

@zybexXL
I thought it is clear that I know there is no size property for dir, that it is counted in your code. ;)
It was a reaction on your response that it can't be different as the sizes are what OS API returns, no special code on your side.

  • There should not be any and EFU file with indexed files confirms it there is no new one or even there is no change in file sizes or file/folder timestamps.
  • I agree, but some unknown aspect sometimes change folder size for 4096 bytes. Who knows if it is some caching, timeout or compression bit weirdness which creates such inconsistency. I know, 2+2 is 4 but...
  • No as I can reproduce it on VMware VM, Amanon VM or Azure VM. Very unlikely all have memory corruption :)

I will try to reproduce it again on some general data and upload it.

@ExSport
Copy link
Author

ExSport commented Jun 14, 2019

@zybexXL
Here you go: https://github.com/ExSport/UniversalMediaServer/blob/2cd3194afaf1ccb0851c4a815b352077bf09f779/stresstest-short-efutool.zip
Further info is provided in hash-short.txt file.

  • Generally I slowed down network so UNC path is not so responsive (servers have <1ms RTT)
  • iterate efutool thousand times to get wrong efu file. For validity I checked hash for such EFU file to see if there is something unusual or not.
  • forgot what else I wanted to say :)

@zybexXL
Copy link
Owner

zybexXL commented Jun 14, 2019

@ExSport
Thank you for the logs. I think I may have found the cause, please try latest v1.0.8 :)

When asking windows for a folder info (FindFirstFileW/FindNextFileW) the returned size for directories SHOULD be zero. I was assuming that is the case, so I was using the returned value as the initial folder size. However, it is possible that Windows API is not always resetting that field to zero, and sometimes returning 4K/8K/whatever. In v8, I'm now ignoring the returned folder size and setting it manually to zero.

Let me know if you can still reproduce it.

@ExSport
Copy link
Author

ExSport commented Jun 14, 2019

@zybexXL that is exactly what I expected it is happening. That sometimes API will return nonzero value for folders. I even wrote post about such possible behaviour but then deleted it as meanwhile I tested it in powershell and saw the size property is not initialized at all :)
Great you found the culprit. Will test it hopefully soon 👍
Thanks

@ExSport
Copy link
Author

ExSport commented Jun 15, 2019

@zybexXL
After about 2000 and 6000 iterations from two different servers, it seems generated EFU files are finally consistent 👍
Thanks. You can close it. I will reopen it in case I will spot another issues with another test folders.
👍 🥇 👍

@zybexXL
Copy link
Owner

zybexXL commented Jun 15, 2019

Great! Thanks for finding the issue and sticking with it until we solved it.

@zybexXL zybexXL closed this as completed Jun 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants