Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filesystem [1/N] #2189

Merged
merged 20 commits into from
Aug 16, 2020
Merged

Filesystem [1/N] #2189

merged 20 commits into from
Aug 16, 2020

Conversation

Ghabry
Copy link
Member

@Ghabry Ghabry commented May 2, 2020

This is a partial patchset from the VFS changes rebased to upstream.
It only changes how File input/output works across Player in a consistent way by now. You don't get any new features by this but it makes future VFS work muuuuuch easier when these stream changes are kept being updated upstream.

For reviewing I recommend to look at the change as a whole. The changes by ChrisBreiti are a bit old :)

  • All uses of FILE* replaced with C++ streams
  • All C++ streams use now Filesystem::InputStream/Filesystem::OutputStream which is by now a shallow wrapper around std::filebuf.

Basicly the VFS code will do the following later:

  1. You open a file like /home/user/game.zip/RPG_RT.ldb
  2. This returns a Filesystem::InputStream that inherits from istream. So most of C++ fstream API will work as expected.
  3. Use normal istream read funcs on it.

Is a draft. Need to test if I broke any Read/Write code. All these raw buffer APIs are hard to get right when a shared pointer is mixed in ^^'. Type confusion.

@Ghabry Ghabry marked this pull request as draft May 2, 2020 14:36
@Ghabry Ghabry changed the title WIP: Filesystem [1/N] Filesystem [1/N] May 6, 2020
@Ghabry Ghabry marked this pull request as ready for review May 6, 2020 23:29
@Ghabry
Copy link
Member Author

Ghabry commented May 6, 2020

There are still some crashes to resolve e.g. when project/save path are invalid.

But besides this: rdy for review

src/image_png.cpp Outdated Show resolved Hide resolved
src/image_png.cpp Outdated Show resolved Hide resolved
src/player.cpp Outdated Show resolved Hide resolved
src/filefinder.cpp Outdated Show resolved Hide resolved
src/filesystem.h Outdated Show resolved Hide resolved
src/image_xyz.cpp Outdated Show resolved Hide resolved
src/input.cpp Outdated Show resolved Hide resolved
src/output.cpp Outdated Show resolved Hide resolved
src/output.cpp Outdated Show resolved Hide resolved
char magic[4] = { 0 };
if (fread(magic, 4, 1, file) != 1)
if (stream->read(magic, sizeof(magic)).gcount() == 0) {
Copy link
Contributor

@fmatthew5876 fmatthew5876 May 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be != sizeof(magic) what if I tried to read a 1, 2 or 3 byte file?

Stuff like this could be worth adding some kind of templated read method which returns true or false. To reduce the noise of reading these kinds of things and the all to easy to write bugs.

Something Like:

template <typename T>
bool InputStream::ReadIntoObj(T& obj) {
   return read(&obj, sizeof(obj)).gcount() == sizeof(obj);
}

This style of interface also lets you do things like read directly into integer types, do byte swapping, etc..

Copy link
Member Author

@Ghabry Ghabry May 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added now such a ReadIntoObj, just not using it alot by now

src/filesystem.h Outdated Show resolved Hide resolved
src/filesystem.h Outdated Show resolved Hide resolved
Filesystem::InputStream FileFinder::OpenInputStream(const std::string& name,
std::ios_base::openmode m)
{
std::streamsize size = FileFinder::GetFileSize(name);
Copy link
Contributor

@fmatthew5876 fmatthew5876 May 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to get the file size from the filebuf, this can save you from having to open the file twice. You could also move that logic into the InputStreamRaw constructor.

That being said, in iostream world that means seek(end), tell(), seek(begin). 3 system calls instead of one call to stat().. Maybe it's worse?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit indifferent here. Which of the ways if faster should be also explored via benchmarking.
At least for compressed streams seek-to-end and back to begin is bad because it forces a full decompression that could be avoided in some cases.

Also the file size isn't really used a lot except for some buffer preallocation optimisations. Imo we can remove it completely later and just read until eof (but needs more testing if this breaks some libs)

src/bitmap.cpp Outdated Show resolved Hide resolved
src/decoder_fmmidi.cpp Outdated Show resolved Hide resolved
src/decoder_opus.cpp Outdated Show resolved Hide resolved
uint16_t bitspersample;
fread(&bitspersample, 1, 2, file_);
stream->read(reinterpret_cast<char*>(&bitspersample), sizeof(bitspersample));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good case where doing a single read into a buffer and copying the pieces out would be faster than all these individual reads and seeks.

@fmatthew5876
Copy link
Contributor

fmatthew5876 commented May 7, 2020

So one general comment about the overall approach. I really like what you're doing here and I see the goal is to encapsulate I/O so we can load resources from anything. This is great.

We should carefully consider whether we really wanted shared ownership semantics here (i.e. shared_ptr).

For resources like files, you really want to know when they are opened and when they get closed. When you have a shared_ptr, this becomes difficult because anything could be holding a reference. It's possible to write a bug where you keep the file around too long. We should consider using unique_ptr instead. Not only will this make the resource lifetime more explicit, it'll also give us better performance as we don't pay for atomic ref counts anymore.

The above is why streams aren't copyable and why in python for example the recommended way to do file IO is with a context manager.

In fact we don't event need unique_ptr.

The istream interface as of C++11 is movable.

https://en.cppreference.com/w/cpp/io/basic_iostream/basic_iostream
https://en.cppreference.com/w/cpp/io/basic_istream/operator%3D

The istream/ostream itself already encapsulates the real I/O inside of filebuf. Essentially, the istream itself can act as a unique_ptr for the filebuf it contains. So using pointers is adding another level of indirection we don't actually need. The istream inheritance hierarchy is intended to already provide this abstraction over the underlying file implementation.

By doing this you would remove one level of indirection (performance) and we don't have to deal with pointers and null checks in our code.

Consider getting rid of the pointers and RawInputStream and just using InputStream directly by implementing the move operations and std::moveing the stream object itself when you want to pass ownership to another entity.

I understand you probably have a lot of commits after this set. So if such a refactor adds rebase pain I'm fine punting it to the end.

@Ghabry
Copy link
Member Author

Ghabry commented May 7, 2020

Thanks for the long and detailed review appreciate it.
Because this is a huge and invasive change it is important to get the overall API correct in the first attempt to avoid more gigantic refactoring.

The usage of shared_ptr was recommended in a stackoverflow answer but your explanation makes sense. Shared ptr is not really needed here as there is always exclusive ownership on the stream. Will try to get everything (std)moved.

@Ghabry Ghabry added this to the 0.6.3 milestone Jun 14, 2020
@fmatthew5876
Copy link
Contributor

fmatthew5876 commented Jun 14, 2020

Optimization which can be done later:

Utils::ReadStream should take the vector by reference and fill it in. This will allow clients to reuse a single buffer and not allocate every time. You can provide 2 overloads, an optimal one which the user passes the buffer, and a convenience one which provides the interface you have now.

std::getline uses this approach: https://en.cppreference.com/w/cpp/string/basic_string/getline

@Ghabry
Copy link
Member Author

Ghabry commented Jun 14, 2020

The XMP IsModule test does not go through the Filesystem stuff.
I submitted a patch to them in January 2019 which is not merged yet :(
libxmp/libxmp#134

ChristianBreitwieser and others added 20 commits June 18, 2020 15:36
…itional filesize field)

* First charge of audio decoders and sdl audio are converted to use this stream
* Windows build is buildable - Linux not so as not all decoders are converted yet
…lled with a istream, decoder_wav does now work with streams, decoder_wildmidi is (apart from the interface) unchanged.
…tor easier later when the type is already correct.
@fdelapena fdelapena merged commit f9ee510 into EasyRPG:master Aug 16, 2020
@Ghabry
Copy link
Member Author

Ghabry commented Nov 20, 2020

XMP is finally being developed again and my patch was integrated 👍

@Ghabry Ghabry deleted the filesystem branch February 22, 2022 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants