Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'Mass file loader' for use in skin and asset loading #6727

Closed
wants to merge 28 commits into from

Conversation

ewancg
Copy link
Contributor

@ewancg ewancg commented Jun 9, 2023

Summary:

Currently, asset and skin loading are quite slow if you have a decent amount of them, even on a fast disk. I would like to create a generic, multi-threaded file loader system that can lessen the impact of some necessary disk blocks, and implement it into the skin and asset systems so the brunt of this wait time can be avoided (most or all skins and assets loaded by the time a user finds and joins a server from the browser). This file loader will also be available (and hopefully encouraged) for similar future tasks requiring a lot of files to be loaded & used in a similar way.

This is a relatively big-picture feature, if my goal is at all unclear to you, I am happy to clarify.

Goals:

  • Flexibility; allowing you to choose which files to or not to load without ever interacting with any io_*/fs_*/IStorage::* - functions.
  • Rich error reporting; not only reporting that the operation failed but also why, as well as granting the integrator choice to continue in each instance.
  • Simplicity; requiring only a few steps to use, it should be no harder to grasp than a manual/tailored implementation.

At the moment, I have gotten the interface and reference implementation (single-threaded) to a stage where they are ready for review, after which I plan to create the multi-threaded implementation and then add to skin and asset loading.

Just for the sake of it, here is a checklist.

  • Interface
  • Reference Implementation
  • Asynchronous Implementation
  • Integration to skin loading
  • Integration to asset loading

Misc. stuff:

At the moment, the API is described in src/game/client/file_loader.h. It is not formatted as proper documentation, I could use some help with that. Maybe it's not needed?

2 functions have been added to src/base/system.cpp/.h in order to allow for enhanced error checking. It is not strictly needed, but I would like for it to be kept. They have been tested and seen as working on Windows and Linux

There is a simple temporary test case in src/engine/client.cpp. Up to this point I have been using a test file/directory schema that looks like this

.
├── limits
│   ├── 1GiB_file.test.txt 				# 1GiB, expected to succeed in loading
│   ├── 2GiB_file.test.txt 				# 2GiB, expected to fail because it's too big
│   ├── 4GiB_file.test.txt 				# 4GiB, same as above
│   ├── empty_file.test.txt 				# 0 bytes, expected to succeed in loading
│   ├── unreadable_directory 				# expected to fail because it's unreadable
│   └── unreadable_file.test.txt 			# expected to fail because it's unreadable
├── quantity 						# all 250 test files expected to show up
│   ├── 000.test.txt
...
│   └── 250.test.txt
├── recursion_1 					# all recursion tests expected to show up independently
│   ├── recursion_2
│   │   ├── recursion_3
│   │   │   └── status.txt
│   │   └── status.txt
│   └── status.txt
└── status.txt

I am trying my best to stick to DDNet code convention. Please let me know if there's something you think doesn't fit.

Checklist

  • Tested the change ingame
  • Provided screenshots if it is a visual change
  • Tested in combination with possibly related configuration options
  • Written a unit test (especially base/) or added coverage to integration test
  • Considered possible null pointers and out of bounds array indexing
  • Changed no physics that affect existing maps
  • Tested the change with ASan+UBSan or valgrind's memcheck (optional)

@ewancg
Copy link
Contributor Author

ewancg commented Jun 9, 2023

I would like to know if it's wanted in the DDNet code & if the way I'm doing it is considered to be the 'right approach' before I go on making the multi-threaded implementation & replacing the asset and skin loading systems with it. I will wait for feedback before making major changes.

@def-
Copy link
Member

def- commented Jun 9, 2023

Would it make sense to use aio for that?

@ewancg
Copy link
Contributor Author

ewancg commented Jun 9, 2023

Would it make sense to use aio for that?

I don't know aio. I hear it doesn't work on Windows. If it only works file-by-file, this still has the benefit of being batch-oriented

src/base/system.cpp Outdated Show resolved Hide resolved
src/base/system.cpp Outdated Show resolved Hide resolved
@heinrich5991
Copy link
Member

heinrich5991 commented Jun 10, 2023

Sounds like a good idea in general.

High-level: I think stuff like "is_readable" shouldn't be used, rather, the file can be opened in the background thread and the resulting error can be reported. std::regex is known to be slow and should probably not be used. EDIT: I guess looking for file suffixes should be enough.

Is there a reason for denying symlinks by default or at all?

@ewancg
Copy link
Contributor Author

ewancg commented Jun 10, 2023

Recursing through symlinks presents the risk of looping forever.

@ewancg
Copy link
Contributor Author

ewancg commented Jun 10, 2023

In this commit I've changed it so that error checking is only done in the load step & removed fs_is_readable. I'm now unable to check for unreadable directories at all and unreadable files if the flags contain LOAD_FLAGS_DONT_READ_FILE.
I have also replaced the regex with a file extension parameter.

@Kaffeine
Copy link
Contributor

Kaffeine commented Jun 12, 2023

heinrich5991: Is there a reason for denying symlinks by default or at all?

ewancg: Recursing through symlinks presents the risk of looping forever.

I have to say that symlinks is what I use typically for own stuff.
I have a directory with custom resources I work on, and when I save the files in editors (GIMP, or Inkscape, or something else). I use symlinks to have the result available in the game immediately, without extra coping or whatever manual operations.

It seems to be a bad idea to work on graphics directly in the game folder (I don't want to have extra (editor, e.g. .xcf) files there.

I'd say: "please keep the current behavior as it have nothing to do with how the game opens files (one-by-one or 'massively')".

Disabled symlinks would be a regression for my personal UX.

@ewancg
Copy link
Contributor Author

ewancg commented Jun 12, 2023

I planned to have them on for skins. What's the big deal

@Kaffeine
Copy link
Contributor

Kaffeine commented Jun 12, 2023

I just don't see the need in implementing the loader and special symlinks processing in one MR.

Recursing through symlinks presents the risk of looping forever.

It would be a 100% user fault, and it does not look like a new problem. Why we'd mix it into a new feature.


in order to allow for enhanced error checking

Oh, now I see. The current implementation is single-threaded and you want to add the proper API first.

@ewancg
Copy link
Contributor Author

ewancg commented Jun 14, 2023

I'll start the async implementation soon if there are no more complaints.

@heinrich5991
Copy link
Member

Status: Draft.

UserData->m_pThis->m_PathCollection.insert({AbsolutePath, new std::vector<std::string>});
// Note that adding data to a SORTED container that is currently being iterated on higher in scope would invalidate the iterator. This is not sorted
SListDirectoryCallbackUserInfo Data{&AbsolutePath, UserData->m_pThis, UserData->m_pContinue};
UserData->m_pThis->m_pStorage->ListDirectory(IStorage::TYPE_ALL, AbsolutePath.c_str(), ListDirectoryCallback, &Data); // Directory item is a directory, must be recursed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The storage type has to be the one used initially instead of TYPE_ALL every time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean. It's TYPE_ALL initially as well, in the direct ListDirectory invocation in the load function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is mean is that the loader should have a parameter int StorageType so the storage type that is listed can be adjusted, same as the existing ListDirectory.

Copy link
Contributor Author

@ewancg ewancg Aug 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why the ":" exists on a per-directory basis. That is the solution I came up with and thought everyone was happy with when I realized it would be tricky to provide the storage type for each path.

…out std strings in favor of char ptrs when not justified by a container or other major convenience
@ewancg
Copy link
Contributor Author

ewancg commented Aug 26, 2023

File loader thread has to wait some ms for the main thread to get a chance to access the textures queued for upload to the GPU, otherwise both threads just take turns hogging the mutex, making the game unplayable. It is also not ideal to have a hard-baked time constant, and I think this is generally the wrong approach; I want to find a lockfree way to do this.

Advice?

@ewancg
Copy link
Contributor Author

ewancg commented Aug 26, 2023

I can see how regex could be considered overkill, but maybe I should add the ability to filter by more than one extension. Sort of like how most file dialog APIs work. If so, multiple strings or one string separated by semicolons? Thoughts?

@Robyt3
Copy link
Member

Robyt3 commented Aug 26, 2023

I can see how regex could be considered overkill, but maybe I should add the ability to filter by more than one extension. Sort of like how most file dialog APIs work. If so, multiple strings or one string separated by semicolons? Thoughts?

Not sure it's useful anywhere to list files of multiple extensions at the same time. Unless we can identify a clear use case, I think it's better to keep it simple with only one file extension being filtered.

@ewancg
Copy link
Contributor Author

ewancg commented Aug 27, 2023

Async implementation pretty much finished. Changed the mutex tug-of-war to a mostly lockfree system else using an IJob. Specifically, instead of both threads waiting to access a list of textures (main thread to read; file loader thread to write), the file loader thread goes into 'yield mode' after each file is loaded & texture is prepared, where the main thread can then do what it has to do & set the file loader thread's status back to running.

Waiting for review, I think that this is the final iteration of the file loader itself so that would probably be the best thing to look at. Skins loading also works and could use some review, not 100% sure if that's how it'll end up looking but it works.

Still have to do the assets loading.


std::vector<std::string> m_RequestedPaths;
std::unordered_map<std::string, std::vector<std::string> *> m_PathCollection;
char *m_pExtension = nullptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m_aExtension[IO_MAX_PATH_LENGTH]

There's no need for dynamic memory allocation for this. We know the extension can only have a certain maximum size.

@heinrich5991
Copy link
Member

Status: Not done, waiting for author.

@ewancg
Copy link
Contributor Author

ewancg commented Nov 18, 2023

Doesn't seem like a high demand feature. Especially considering how much work it is just to get it working. I will consider a simpler/less versatile approach for async skins and assets in the future. Closing for now

@ewancg ewancg closed this Nov 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants