-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CHERRY-PICK from dev to point-release/23103 PR #17526 #17811
CHERRY-PICK from dev to point-release/23103 PR #17526 #17811
Conversation
Improve ap startup on linux (o3de#17526) * Improves the startup time of AP on linux. Note that this could also potentially improve it on mac and windows, as well as metafile processing in general. Most of this change is just moving the initialization of the catalog until the file cache is warmed up and then using the file cache to fix path cases on case sensitive file systems. The rest of it is fixing the unit tests to account for this - the mock for the file system wasn't sufficient to actually use this new code as it would do things like always answer false for "is directory". With this change, all unit tests pass in debug on linux, with no unexpected assertions and no failures. In addition, AP starts and idles very quickly without any long freezes during startup. Signed-off-by: Nicholas Lawson <70027408+nick-l-o3de@users.noreply.github.com> * Fixes a warning I'm not sure I created During Createjobs, Asset Processor Manager does this: ```cpp Q_EMIT CreateJobsDurationChanged(sourceAsset.RelativePath().c_str(), sourceAsset.ScanFolderId()); ``` However, scan folder ID of type AZ::s64 which hasn't been registered to Qt serializer's meta object system, so it does not know how to convert it to a byte array to emit messages across threads. This causes warnings and probably causes that message not to be emitted. * Fixes for windows Most of the problems were in oversights in existing tests. For example the asset catalog tests use a virtual file system but were using CreateDummyFile which makes real files. Likewise, the asset catalog was using SystemFile instead of FileIO which was the mock file IO. So it would be making mock files in memory but then checking real files on disk, etc. Note that the Pass-thru file cache is only used in unit tests, or when specifically asked for by a command line parameter, which is not used in any of the test scripts. Signed-off-by: Nicholas Lawson <70027408+nick-l-o3de@users.noreply.github.com>
Code/Tools/AssetProcessor/native/AssetManager/FileStateCache.cpp
Outdated
Show resolved
Hide resolved
// The reason this is so expensive is that it could also be the case that the DIRECTORY path is the wrong case. | ||
// For example, the actual path might be /SomeFolder/textures/whatever/texture.dat | ||
// but the real file on disk is actually /SomeFolder/Textures/Whatever/texture.dat | ||
// Note the case difference of the directories. If you were to ask the operating system just to list all of | ||
// the files in the input folder (/SomeFolder/textures/whatever/*) it would not find any since the directory | ||
// does not itself exist. Not only that, but many operating system calls on case-insensitive file systems | ||
// actually use the input given in their return - that is, if you ask a WINAPI call to construct an absolute path | ||
// to a file and give it the wrong case as input, the output absolute path will also be the wrong case. | ||
// Thus, to actually correct a path it has to start at the bottom and whenever it encounters | ||
// a path segment, it has to do a read of the actual directory information to see which files exist in that | ||
// directory, and whether a file or directory exists with the correct name but with different case. | ||
|
||
// if checkEntirePath is false, we only check the file name, and nothing else. This is for the case where | ||
// the path is known to be good except for the file name, such as when the relPathFromRoot comes from just | ||
// taking an existing, known-good file and replacing its extension only. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically if this function was part of a class or struct with state. Or if we had a caching structure, we could "index" any file paths that were found with the correct case in a set<AZ::IO::Path>
or unordered_set<AZ::IO::Path>
where each AZ::IO::Path
uses the Windows path separator.
Alternatively even an in-memory only sqlite table could be populated as the application is running to store of the correct cased path and it's case-insensitive hash value
When an AZ::IO::Path
uses the Windows path separator it hashes it's contents case-insensitively.
Also comparing two paths that uses Windows path separators would compare each path segment case-insensitively as well.
The HashPathCompareValidation parameterized unit-test verifies that paths that compare equal case-insensitively also hashes to the same value.
Technically a index of file paths with the correct case can be built up lazily (as this function is called) or it can be built as paths are added to the Asset Catalog.
As this function is calling FileIOBase::FindFiles
and visiting every file within a directory recursively, the index of file paths can be built for each file in a directory the first time it is visited and then every time afterwards, the file index can be checked to see if the correct case exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting - in this case, I'm not sure its quite necessary until we rearchitect the thing, since the system calling into this function does have state, and does have its own case insensitive hash - its only calling this function when it encounters a case insensitive hash miss to make sure that what it then adds into the cache is authoritatively the correct case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is not something we need to fix for point release before this goes in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well there is one low-hanging fruit, which we could tackle if we need to improve the performance in the future even more.
That is the "potential" repeated calls to FileIOBase::FindFiles
. Since that could be called multiple times for the same directory, that directory information with the correct case and case-insensitive directory hash could be a quick win without storing too much data in-memory.
However we can wait on that.
Signed-off-by: Nicholas Lawson <70027408+nick-l-o3de@users.noreply.github.com>
Improve ap startup on linux (#17526)
Note that this could also potentially improve it on mac and windows, as well as metafile processing in general.
Most of this change is just moving the initialization of the catalog until the file cache is warmed up and then using the file cache to fix path cases on case sensitive file systems.
The rest of it is fixing the unit tests to account for this - the mock for the file system wasn't sufficient to actually use this new code as it would do things like always answer false for "is directory".
With this change, all unit tests pass in debug on linux, with no unexpected assertions and no failures. In addition, AP starts and idles very quickly without any long freezes during startup.
During Createjobs, Asset Processor Manager does this:
Q_EMIT CreateJobsDurationChanged(sourceAsset.RelativePath().c_str(), sourceAsset.ScanFolderId());
However, scan folder ID of type AZ::s64 which hasn't been registered to Qt serializer's meta object system, so it does not know how to convert it to a byte array to emit messages across threads. This causes warnings and probably causes that message not to be emitted.
Most of the problems were in oversights in existing tests. For example the asset catalog tests use a virtual file system but were using CreateDummyFile which makes real files.
Likewise, the asset catalog was using SystemFile instead of FileIO which was the mock file IO. So it would be making mock files in memory but then checking real files on disk, etc.
Note that the Pass-thru file cache is only used in unit tests, or when specifically asked for by a command line parameter, which is not used in any of the test scripts.