Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filesystem redesign and performance improvements #2820

Merged
merged 15 commits into from Jun 15, 2017

Conversation

JosJuice
Copy link
Member

@JosJuice JosJuice commented Aug 9, 2015

Scanning the game list without a cache can take a while when there are many games. As evidenced by games with large file systems taking significantly longer than others, a big part of this time is spent dealing with file systems. To improve the speed of scanning the game list, I made this PR. Initializing a file system now only involves three tiny disc reads and one big disc read to a vector<u8>, as opposed to the old code which made lots of tiny reads (many of them even going backwards!) and then constructed a bunch of objects from the values it obtained. I have also implemented better search algorithms for the two variants of FindFileInfo. With these changes, building the game list cache now takes about the same time for each game – the only real speed limit that's left is the limitations of the hard drive. I had hoped that these changes also would make opening the game properties fast, but it can still take a second or so with large file systems because of the time it takes for the GUI to create all the tree items. (EDIT: After updating to VS2015, this seems to be reasonably fast even in master.)

Another goal with this PR was to avoid implementation details of file systems being exposed to code that uses them. In the past, getting all file infos in a directory involved looping through the file system's vector of file infos and knowing that getting the size of a directory in a GC/Wii disc file system happens to produce the index of the first file info that comes after the directory. This is now taken care of inside the file system code, and code that uses file systems simply needs to use a range-based for loop on a directory to get its children. This should make it easier to both write new code that interacts with file systems and to implement new types of file systems that can be used in the same way as GC/Wii disc file systems. However, working with iterators and unique_ptrs and whatnot is still a bit new to me, so please comment if any parts of the design are bad.

While working on the changes needed for these two goals, I also tried to do as much error checking as possible. Invalid file systems should never make Dolphin crash or hang, because this code gets run when Dolphin boots. The only remaining issue that I know of is that Dolphin can run out of memory if a file system is too large, which is handled by PR #2610. (EDIT: That fix has been merged.)

Now, the question is... @JMC47, how much faster can your game list be scanned with this PR?

EDIT: This description was originally written when using VS2013. Now that Dolphin uses VS2015, the speed of scanning the game list and opening game properties windows has been significantly improved in master, at least with local drives. That means that the speed difference between master and this PR isn't as large as on VS2013, but it still does improve the speed, especially in situations like FileMonitor being called frequently. I'm not sure what the situation is like on other compilers/platforms.

Review on Reviewable

@delroth
Copy link
Member

delroth commented Aug 9, 2015

@shuffle2 I seem to remember you use Dolphin with games on a networked file system. If I'm right, care to test this to see how much faster gamelist loading is?


public:
// Set everything manually
CFileInfoGCWii(const u8* fst, bool wii, u32 fst_size, u32 index, u32 total_file_infos);

This comment was marked as off-topic.

for (const DiscIO::IFileInfo& file_info : directory)
{
const std::string path = file_info.GetPath();
DEBUG_LOG(DISCIO, "%s", path.empty() ? "/" : path.c_str());

This comment was marked as off-topic.

// Each entry contains 3 u32s (0xC bytes).
u32 CFileInfoGCWii::Get(int n) const { return Common::swap32(*((u32*)(m_fst + m_index * 0xC) + n)); }
u32 CFileInfoGCWii::GetNameOffset() const { return Get(0); }
u32 CFileInfoGCWii::GetRawOffset() const { return Get(1); }

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

@JosJuice JosJuice force-pushed the filesystem branch 5 times, most recently from a9bb61c to 08edde4 Compare August 9, 2015 16:18
@JMC47
Copy link
Contributor

JMC47 commented Aug 9, 2015

The biggest delay now is actually after it finishes scanning; there's a 5+ second delay before the gamelist actually loads. Everything else is lightning fast for thousands of games (roughly 3 seconds?)

@shuffle2
Copy link
Contributor

@delroth I haven't done that for a while (I was doing that when I was doing dev work on a laptop - don't really do that anymore...)

JosJuice added a commit to JosJuice/dolphin that referenced this pull request Aug 10, 2015
04fcb72 fixed an issue with reading the Wii FST size, but I found a second
issue when working on PR dolphin-emu#2820 - the size must be shifted left by 2.
DiscScrubber and Boot already do this correctly using separate code.
@JosJuice JosJuice force-pushed the filesystem branch 3 times, most recently from 9155934 to cd86e51 Compare August 16, 2015 09:38
@leoetlino
Copy link
Member

Review status: all files reviewed at latest revision, 15 unresolved discussions.


Source/Core/DiscIO/FileSystemGCWii.cpp, line 201 at r23 (raw file):

  const std::optional<u32> fst_offset_unshifted = m_volume->ReadSwapped<u32>(0x424, m_partition);
  const std::optional<u32> fst_size_unshifted = m_volume->ReadSwapped<u32>(0x428, m_partition);
  if (!fst_offset_unshifted || fst_size_unshifted)

Bug -- this should be: if (!fst_offset_unshifted || !fst_size_unshifted)


Comments from Reviewable

@JosJuice
Copy link
Member Author

Review status: all files reviewed at latest revision, 15 unresolved discussions, some commit checks failed.


Source/Core/DiscIO/FileSystemGCWii.cpp, line 27 at r23 (raw file):

Previously, leoetlino (Leo Lam) wrote…

There is no need for static here.

Done.


Source/Core/DiscIO/FileSystemGCWii.cpp, line 48 at r23 (raw file):

Previously, leoetlino (Leo Lam) wrote…

= default

Should I change the Filesystem.h destructors too to be consistent? And should the = defaults be in the .h file or .cpp file?


Source/Core/DiscIO/FileSystemGCWii.cpp, line 201 at r23 (raw file):

Previously, leoetlino (Leo Lam) wrote…

Bug -- this should be: if (!fst_offset_unshifted || !fst_size_unshifted)

Done.


Source/Core/DiscIO/FileSystemGCWii.cpp, line 351 at r23 (raw file):

Previously, leoetlino (Leo Lam) wrote…

GetSize() returns a u32, not a u64.

Done.


Source/Core/DiscIO/FileSystemGCWii.h, line 96 at r23 (raw file):

Previously, leoetlino (Leo Lam) wrote…

Is there a reason for using a pointer here instead of a const ref?

It can let callers skip checking whether the FileInfo returned by FindFileInfo is valid. See VolumeWii.cpp for an example.


Source/Core/DolphinWX/ISOProperties/FilesystemPanel.cpp, line 434 at r23 (raw file):

Previously, leoetlino (Leo Lam) wrote…

This is potentially unsafe. You should use "%s" for the format string.

Done.


Comments from Reviewable

@leoetlino
Copy link
Member

Reviewed 1 of 6 files at r24, 2 of 4 files at r25.
Review status: 9 of 12 files reviewed at latest revision, 11 unresolved discussions.


Source/Core/DiscIO/FileSystemGCWii.cpp, line 48 at r23 (raw file):

Previously, JosJuice wrote…

Should I change the Filesystem.h destructors too to be consistent? And should the = defaults be in the .h file or .cpp file?

Yes, please change the other destructors too. I think it'd be better for them to be in the .cpp file, so if you need to change the destructor, you don't need to recompile everything using the header.


Source/Core/DiscIO/FileSystemGCWii.h, line 96 at r23 (raw file):

Previously, JosJuice wrote…

It can let callers skip checking whether the FileInfo returned by FindFileInfo is valid. See VolumeWii.cpp for an example.

Ah, okay.


Comments from Reviewable

GC/Wii filesystem internals shouldn't be exposed to other classes.
This change isn't especially useful by itself, but it opens up the
way for some neat stuff in the following commits.
Some callers (i.e. ISOProperties) don't want the full path, so giving them
it is unnecessary. Those that do want it can use GetPathFromFSTOffset.
Not storing full paths everywhere also saves a small bit of RAM and is
necessary for a later commit. The code isn't especially pretty right now
(callers need to use FST offsets...) but it'll become better later.
Some callers already have the file info, making the relatively slow
FindFileInfo calls unnecessary. Callers that didn't have the file info
will now need to call FindFileInfo on their own.
Instead of calling GetPathFromFSTOffset for every file info, FindFileInfo
now only looks at names in directories that are included in the path.
For the common case of searching for "opening.bnr", this means that
only root-level files and directories have to be searched through.
Instead of using lots of small scattered reads to read the FST,
only one big read is used, which is more efficient.

This also means that the FST only allocates memory once and stores all
strings close to each other - good for the CPU cache. The file info
objects use pointers to this FST memory of containing data themselves.
Keeping around the big m_FileInfoVector containing objects with only
pointers is a bit unnecessary, but that will be fixed soon.
Not initializing until the filesystem is used is good when
a filesystem is constructed and then never used, but nobody does that.
This simplifies the code a little and lets all methods be const.
Now that the FST in read in the constructor, m_Valid
can be set to false when there are errors in the FST.
@JosJuice
Copy link
Member Author

Review status: 9 of 12 files reviewed at latest revision, 10 unresolved discussions.


Source/Core/DiscIO/FileSystemGCWii.cpp, line 48 at r23 (raw file):

Previously, leoetlino (Leo Lam) wrote…

Yes, please change the other destructors too. I think it'd be better for them to be in the .cpp file, so if you need to change the destructor, you don't need to recompile everything using the header.

Done.


Source/Core/DiscIO/FileSystemGCWii.cpp, line 255 at r23 (raw file):

Previously, leoetlino (Leo Lam) wrote…

= default

Done.


Comments from Reviewable

Copy link
Member

@leoetlino leoetlino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went through the PR again, can't spot any issue. And having iterators does make traversing directories much nicer.

@leoetlino
Copy link
Member

Reviewed 2 of 3 files at r26.
Review status: 10 of 12 files reviewed at latest revision, 10 unresolved discussions.


Source/Core/DiscIO/Filesystem.h, line 27 at r19 (raw file):

Previously, JosJuice wrote…

Should I make this a forward iterator? The only thing that's missing as far as I know is a default constructor, which isn't especially hard to write (just set m_file_info to nullptr), but then I would need to add nullptr checks to functions like operator==.

Looks like I had missed this comment. I don't see why not, since this iterator satisfies nearly all requirements (including multipass) and you really just have to make it default constructible (+ the checks you mentioned).


Comments from Reviewable

Instead of expecting callers to know how the size of directory file infos
relates to which files are in which directories, filesystems now offer a
GetRoot() method, and file infos offer a way to get their children. As
a bonus, m_FileInfoVector no longer has to be created and kept around
in RAM. Only the file info objects that actually are used are created.
FileMonitor calls this every time a read happens, and there's no code that
only calls this a small number of times, so having a cache is worthwhile.
This makes it possible to catch errors earlier so that file systems
simply fail to load instead of behaving oddly. It also makes it possible
to check for errors that weren't checkable before, like the end of a
directory being after the end of the parent directory.
@JosJuice
Copy link
Member Author

Review status: 10 of 12 files reviewed at latest revision, 10 unresolved discussions.


Source/Core/DiscIO/Filesystem.h, line 27 at r19 (raw file):

Previously, leoetlino (Leo Lam) wrote…

Looks like I had missed this comment. I don't see why not, since this iterator satisfies nearly all requirements (including multipass) and you really just have to make it default constructible (+ the checks you mentioned).

Done.


Comments from Reviewable

@leoetlino
Copy link
Member

Reviewed 1 of 6 files at r24, 1 of 3 files at r26, 1 of 1 files at r27.
Review status: all files reviewed at latest revision, 2 unresolved discussions.


Comments from Reviewable

@leoetlino leoetlino merged commit 09c0a3c into dolphin-emu:master Jun 15, 2017
@JosJuice JosJuice deleted the filesystem branch June 15, 2017 19:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet