Adding Archive format complete ArchiveWriter implementation #16052
Conversation
The `edit-engine-properties` and `edit-project-properties` commands have been updated to sort the gem names array inside of their respective manifest .json files for consistency. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
…mat specifiers Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
UnitTest A new `AddActiveGems` function has been added that accepts a list of gem names; for each gem it queries the gem root location using the o3de manifest.json files and adds the gem root location to the Settings Registry and as a FileIO `@gemroot:<gem-name>@` alias. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
The ArchiveWriter implementation can mount an archive file, where it can add/remove content files from the archive, as well as optionally compress the content files being added. It also supports compressing the Archive Table of Contents using a compression algorithm registered with the Compression gem. The ability to write to a new archive or an existing archive is supported by the implementation. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
That test found a bug in the AddFileToResult::operator bool logic where it would return false when a file was successfully added. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
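The fix can be illustrated with a minimal result type. `AddFileResult` below is a hypothetical stand-in, not the real O3DE structure; only the `operator bool` logic mirrors the bug described (the buggy version returned the inverse of the success condition).

```cpp
#include <cassert>
#include <string>

// Sketch of the operator bool fix described above; names are illustrative.
struct AddFileResult
{
    std::string m_errorMessage; // empty on success

    // The buggy version effectively returned `!m_errorMessage.empty()`,
    // reporting failure for successfully added files.
    explicit operator bool() const
    {
        return m_errorMessage.empty(); // success <=> no error message
    }
};
```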
When the `TaskGraphEvent::Wait` function is used to wait for the completion of tasks, if there are no tasks added to the TaskGraph, a deadlock will occur. This change fixes the issue by invoking `TaskGraphEvent::Signal` if there are no tasks on the Compiled Task Graph when it is submitted to the TaskExecutor. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
content file as part of an Archive. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
The start of each compressed block was being overwritten with \0 bytes instead of padding bytes being appended to the compressed block to reach 512-byte alignment. Set the compression algorithm ID in the AddFileToArchive result value structure; it is now properly returned as part of the result. Renamed the `ArchiveBlockSize` constant to `ArchiveBlockSizeForCompression` to indicate that the block size is for viewing a 2 MiB chunk of an uncompressed file when sending it through a compression algorithm. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
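The 512-byte alignment rule mentioned here can be expressed as a small helper. This is a sketch under the assumption that a compressed block is padded up to the next 512-byte boundary; `BlockAlignment` and `PaddedBlockSize` are illustrative names, not O3DE identifiers.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative constant: the alignment the padding targets.
constexpr std::size_t BlockAlignment = 512;

// Rounds a compressed block size up to the next 512-byte boundary,
// so the gap is filled with padding instead of clobbering block data.
constexpr std::size_t PaddedBlockSize(std::size_t compressedSize)
{
    return (compressedSize + BlockAlignment - 1) / BlockAlignment * BlockAlignment;
}
```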
The test writes a single file to the archive using the LZ4 compression algorithm Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
…em is part of the "gem_names" array for the project.json and engine.json files Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
auto& compiledTasks = graph.Tasks();
if (compiledTasks.empty() && event)
{
    event->Signal();
    return;
}
nice catch
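The guard in the diff above can be sketched in isolation. The following is a minimal, self-contained illustration of the pattern — signal the event immediately when a submitted graph has zero tasks so a later Wait() returns instead of deadlocking. `GraphEvent` and `SubmitGraph` are hypothetical stand-ins, not the O3DE TaskGraph API.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <vector>

// Toy event standing in for TaskGraphEvent.
class GraphEvent
{
public:
    void Signal()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_signaled = true;
        m_cv.notify_all();
    }
    void Wait()
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return m_signaled; });
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_signaled{ false };
};

// Submitting a graph with zero tasks must still signal the event;
// otherwise a later Wait() blocks on a signal that never arrives.
inline void SubmitGraph(const std::vector<std::function<void()>>& tasks, GraphEvent* event)
{
    if (tasks.empty())
    {
        if (event)
        {
            event->Signal(); // the fix: signal immediately so Wait() returns
        }
        return;
    }
    for (const auto& task : tasks)
    {
        task();
    }
    if (event)
    {
        event->Signal();
    }
}
```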
…::vector An ambiguity would occur if an `AZStd::vector` was used with a std:: namespace type such as `std::byte` (AZStd::byte is just an alias, not a distinct type). Therefore ADL brings in both AZStd::equal and std::equal when making an unqualified call to `equal` in the AZStd::vector `operator=` function. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
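A minimal stand-alone illustration of the ADL pitfall, using two toy namespaces in place of AZStd and std (all names here are hypothetical). The fix shown is the same one the commit applies: qualify the call so only one `equal` overload is considered.

```cpp
#include <cassert>

// Toy namespace standing in for std: owns the element type.
namespace LibA
{
    struct Byte { unsigned char v; };

    template <class It1, class It2>
    bool equal(It1 first1, It1 last1, It2 first2)
    {
        for (; first1 != last1; ++first1, ++first2)
        {
            if (first1->v != first2->v) { return false; }
        }
        return true;
    }
}

// Toy namespace standing in for AZStd: has its own equal overload.
namespace LibB
{
    template <class It1, class It2>
    bool equal(It1 first1, It1 last1, It2 first2)
    {
        for (; first1 != last1; ++first1, ++first2)
        {
            if (first1->v != first2->v) { return false; }
        }
        return true;
    }

    inline bool Compare(const LibA::Byte* a, const LibA::Byte* aEnd, const LibA::Byte* b)
    {
        // An unqualified call `equal(a, aEnd, b)` would be ambiguous here:
        // ADL finds LibA::equal (via the LibA::Byte pointee type) while
        // ordinary lookup finds LibB::equal, and neither is a better match.
        return LibB::equal(a, aEnd, b); // qualifying the call removes the ambiguity
    }
}
```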
open/close state The boolean is just used to track whether a call to `Close()` has occurred on a ByteContainerStream Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
Added a test to validate that an uncompressed file > 6 MiB creates 2 block line entries. Added a test to validate that a file can be added and removed from the same mounted archive instance in a single ArchiveWriter. Also validated file addition and removal across multiple ArchiveWriter instances, where the first instance adds a file and the next instance removes a file from the archive. A test was added to make sure that an existing file in the Archive can be updated using the `ArchiveWriter::AddFileToArchive` function. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
Public API Files that can be used in both the Client and Tool modules of other Gems have been moved to the `Include/Archive/Clients` directory. Public API that can be used only in the Tools modules has been moved to the `Include/Archive/Tools` directory. The ArchiveToc* files have been moved to the Archive.Private.Object target as they only need to be used as part of implementing the Archive Reader and Writer logic and should not be part of the public API. They have been relocated to the `Source/Clients` directory. Add a new ArchiveWriter test to validate that it can write out a compressed table of contents (TOC) to a GenericStream. Reading of an archive with a compressed TOC was also tested. The testing found issues in the Compression and Decompression logic of the Table of Contents in ArchiveWriter.cpp and those issues have been addressed. Updated the Archive format to write out an 8-byte magic byte sequence at the beginning of the Table of Contents section in order to detect when an invalid archive TOC has been read. This is used to avoid crashes within the Archive code and instead inform the user of an invalid Archive table of contents. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
virtual ArchiveAddFileResult AddFileToArchive(AZStd::span<const AZStd::byte> inputSpan,
    const ArchiveWriterFileSettings& fileSettings = {}) = 0;

//! Searches for a relative path within the archive
//! @param relativePath Relative path within archive to search for
//! @return A token that identifies the Archive file if it exists;
//! if a file with the specified path doesn't exist, InvalidArchiveFileToken is returned
- virtual ArchiveFileToken FindFile(AZ::IO::PathView relativePath) = 0;
+ virtual ArchiveFileToken FindFile(AZ::IO::PathView relativePath) const = 0;
I was actually going to recommend const initially, but these days I'm on the fence about using const in early API iterations because of the situations when a dev may want to do something non-consty like say caching requests to FindFile or ContainsFile, or something else I'm not thinking of. Thoughts?
Well the API here for the IArchiveWriter is meant to allow access to the ArchiveWriter across DLL boundaries and be available as part of the Archive.Editor.API target.
It is not an abstract interface meant for other Gems or libraries to implement, so the API is made with the current implementation in mind.
I prefer using const with these APIs, as to the outside caller it should appear constant to them.
For a caching request, what should be done is NOT to make the function non-const, but instead add a mutable member that performs the caching.
I am of the belief that the internal implementation details of an interface should not leak into the signature of the interface.
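The mutable-member caching pattern argued for above can be sketched as follows. `ArchiveIndex`, the token type, and the cache layout are hypothetical stand-ins for illustration, not O3DE types.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Illustrative token type; InvalidArchiveFileToken marks "not found".
using ArchiveFileToken = std::uint64_t;
constexpr ArchiveFileToken InvalidArchiveFileToken = 0;

class ArchiveIndex
{
public:
    void AddFile(const std::string& path, ArchiveFileToken token)
    {
        m_files[path] = token;
    }

    // FindFile stays const for callers; the cache is an internal detail,
    // so it is marked mutable instead of leaking non-constness into the API.
    ArchiveFileToken FindFile(const std::string& relativePath) const
    {
        if (auto cached = m_findCache.find(relativePath); cached != m_findCache.end())
        {
            return cached->second;
        }
        auto it = m_files.find(relativePath);
        ArchiveFileToken token = (it != m_files.end()) ? it->second : InvalidArchiveFileToken;
        m_findCache[relativePath] = token; // legal in a const function: the member is mutable
        return token;
    }

private:
    std::unordered_map<std::string, ArchiveFileToken> m_files;
    mutable std::unordered_map<std::string, ArchiveFileToken> m_findCache;
};
```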
{
    // Resize the compress block buffer to be the same size as the input buffer
    AZStd::vector<AZStd::byte> tocCompressBuffer;
    tocCompressionAlgorithmId = m_settings.m_tocCompressionAlgorithm.value();
m_settings can override the toc compression algo id from the m_archiveheader? is it possible for the archive header to specify a different toc compression algo than m_settings?
Yes, the Compression Algorithm is set by the previous instance of the ArchiveWriterSettings that was used to write the Archive before.
So if a previous use of the ArchiveWriterSettings set the table of contents (TOC) compression algorithm to Zlib, but the next use of the ArchiveWriterSettings sets the TOC compression algorithm to LZ4, then the compression algorithm will be different from the one set in the archiveHeader.
//! Used to track if the ByteContainerStream has received a Close() call
//! By default the ByteContainerStream is open
bool m_opened{ true };
Why is this necessary?
It seems to me that if this container is going to change to support opening and closing, the following should happen:
- It should probably get set to true by the constructor that takes in a buffer pointer instead of defaulting to open, so that way if other constructors (like default constructors) get added, they would correctly start as closed, same as the other types of streams.
- All of the operations (seek / read / write) should handle the case that they're called when the stream isn't open. Either assert, return back an error or noop, or both.
This is necessary because checking if the stream is open via the AZ::IO::GenericStream::IsOpen function would return true even if a Close() call was performed.
This was causing an issue in ArchiveWriter::UnmountArchive, as it invokes Commit to write the Archive Header and Table of Contents if the file was open, which it would continuously do even if a previous unmount call had occurred.

- It should probably get set to true by the constructor that takes in a buffer pointer instead of defaulting to open, so that way if other constructors (like default constructors) get added, they would correctly start as closed, same as the other types of streams.

Well I defaulted it to true in the member initializer to maintain the current ByteContainerStream API behavior of being associated with a single stream from the constructor.
I do like the flexibility, however, of supporting an Open function that would take in a reference to the buffer type, in which case the state of a container being "open" is no longer tied to construction.
With that change it would make sense to no longer default m_opened to true and instead set it in the constructor/Open.

- All of the operations (seek / read / write) should handle the case that they're called when the stream isn't open. Either assert, return back an error or noop, or both.

This was kept in line with the other GenericStream implementations, where their Seek, Read and Write don't check if a file is open before performing the operations: https://github.com/o3de/o3de/blob/development/Code/Framework/AzCore/AzCore/IO/GenericStreams.cpp#L223-L238
Yeah, it's not done in GenericStream, but there's some amount of checking in general:
https://github.com/o3de/o3de/blob/development/Code/Framework/AzCore/AzCore/IO/SystemFile.cpp#L118
https://github.com/o3de/o3de/blob/development/Code/Framework/AzCore/AzCore/IO/GenericStreams.cpp#L278-L281
I guess what I was thinking is that Close() should probably clear out the m_buffer pointer, and read/write/seek should be accounting for this, because otherwise you can keep reading/writing even though it's "closed" making the tracking of "closed" not really correct or functional.
I'll update the ByteContainerStream, to have proper lifetime semantics, so that when the stream is not in the "open state", it will no-op if trying to seek, read or write.
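A minimal sketch of that agreed behavior, assuming Close() should both drop the buffer pointer and make later operations no-ops. `ByteBufferStream` is an illustrative name, not the real ByteContainerStream.

```cpp
#include <cstddef>
#include <vector>

// Toy stream: once Close() is called, Write becomes a no-op reporting zero bytes.
class ByteBufferStream
{
public:
    explicit ByteBufferStream(std::vector<unsigned char>* buffer)
        : m_buffer(buffer)
        , m_opened(buffer != nullptr)
    {
    }

    bool IsOpen() const { return m_opened; }

    void Close()
    {
        m_opened = false;
        m_buffer = nullptr; // drop the buffer so a "closed" stream cannot touch it
    }

    std::size_t Write(const void* data, std::size_t byteCount)
    {
        if (!m_opened) { return 0; } // no-op on a closed stream
        const auto* bytes = static_cast<const unsigned char*>(data);
        m_buffer->insert(m_buffer->end(), bytes, bytes + byteCount);
        return byteCount;
    }

private:
    std::vector<unsigned char>* m_buffer;
    bool m_opened;
};
```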
@@ -15,12 +15,26 @@ namespace AZ::IO
    enum class OpenMode : AZ::u32
    {
        Invalid = 0,
        //! Open the file in read mode ("r")
        //! if ModeRead is specified without ModeWrite, the file is open in file share write mode
        //! when on platforms that uses WinAPI
- //! when on platforms that uses WinAPI
+ //! when on platforms that use WinAPI
ModeWrite = (1 << 1),
//! Opens the file in append mode ("a")
//! The file is opened and and set the write offset to the end of the file
- //! The file is opened and and set the write offset to the end of the file
+ //! The file is opened and sets the write offset to the end of the file
ModeBinary = (1 << 3),
//! Opens the file with the text option("t") set
//! This may affect how newlines are handled on Windows platforms
Just a suggestion:
- //! This may affect how newlines are handled on Windows platforms
+ //! This affects how newlines are handled on different platforms ("\n" vs "\r\n")
@@ -764,4 +764,8 @@ namespace AZ
    AZ_TYPE_INFO_SPECIALIZE(AZ::IO::FixedMaxPath, "{FA6CA49F-376A-417C-9767-DD50744DF203}");
}

//! Use this macro to simplify safe printing of a PathView or BasicPath* which may not be null-terminated.
//! Example: AZStd::fixed_string<1024>::format("Safely formatted: %.*s", AZ_PATH_ARG(myString));
Just a suggestion:
- //! Example: AZStd::fixed_string<1024>::format("Safely formatted: %.*s", AZ_PATH_ARG(myString));
+ //! Example: AZStd::fixed_string<1024>::format("Safely formatted: " AZ_STRING_FORMAT, AZ_PATH_ARG(myPathView));
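The `%.*s` pattern this macro wraps can be demonstrated in isolation: printf-family functions need an explicit length for views that may not be null-terminated. `PATH_ARG_SKETCH` and `FormatPath` below are hypothetical names for illustration, not the real AZ_PATH_ARG definition.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Expands to the (length, pointer) pair that %.*s expects, so printing stops
// at the view's size instead of scanning for a null terminator.
#define PATH_ARG_SKETCH(view) static_cast<int>((view).size()), (view).data()

inline std::string FormatPath(std::string_view path)
{
    char buffer[256] = {};
    std::snprintf(buffer, sizeof(buffer), "Safely formatted: %.*s", PATH_ARG_SKETCH(path));
    return buffer;
}
```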
};

//! vector storing a copy of each file path in memory
//! It's length matches the value of m_fileCount
- //! It's length matches the value of m_fileCount
+ //! Its length matches the value of m_fileCount
Yeah that alignment issue is bad.
Will fix.
tocValidationResult.m_errorMessage = ArchiveTocValidationResult::ErrorString::format(
    "The Archive TOC has an invalid magic byte sequence of %llu", tocView.m_magicBytes);
Suggestion: %llx for byte sequence?
Will change to use hex.
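For comparison, a sketch of the hex formatting being adopted: %llx prints the byte pattern directly instead of an opaque decimal value. The message text mirrors the snippet above, but the helper name and magic constant in the test are illustrative.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Formats the error message with the magic bytes in hex (%llx) so the
// byte sequence is readable, unlike the original decimal %llu.
inline std::string DescribeMagic(std::uint64_t magicBytes)
{
    char buffer[128] = {};
    std::snprintf(buffer, sizeof(buffer),
        "The Archive TOC has an invalid magic byte sequence of 0x%llx",
        static_cast<unsigned long long>(magicBytes));
    return buffer;
}
```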
Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
* Added support to the ByteContainerStream class to open a new stream post construction. With this change, a default constructor has been added to the ByteContainerStream class. The Seek, Read, Write, GetCurPos, GetLength and Truncate functions have been updated to now trigger an `AZ_Error` when attempting to perform their operations on a closed byte container stream. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
* Moved increment of the TaskExecutor::m_graphsRemaining member. The number of graphs being processed now only increments if there is at least one compiled task to run. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
* Addressed feedback in PR #16052 around comments and documentation. Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
* Update Gems/Archive/Code/Include/Archive/Tools/ArchiveWriterAPI.h. Co-authored-by: Mike Balfour <82224783+mbalfour-amzn@users.noreply.github.com> Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
* Update Gems/Archive/Code/Source/Clients/ArchiveTOC.h. Co-authored-by: Mike Balfour <82224783+mbalfour-amzn@users.noreply.github.com> Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
* Update Gems/Archive/Code/Source/Clients/ArchiveTOC.h. Co-authored-by: Mike Balfour <82224783+mbalfour-amzn@users.noreply.github.com> Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
* Update Code/Framework/AzCore/AzCore/IO/ByteContainerStream.h. Co-authored-by: Alex Peterson <26804013+AMZN-alexpete@users.noreply.github.com> Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com>
---------
Signed-off-by: lumberyard-employee-dm <56135373+lumberyard-employee-dm@users.noreply.github.com> Co-authored-by: Mike Balfour <82224783+mbalfour-amzn@users.noreply.github.com> Co-authored-by: Alex Peterson <26804013+AMZN-alexpete@users.noreply.github.com>
Archive Writer Changes
The ArchiveWriter implementation can mount an archive file, where it can
add/remove content files from the archive, as well as optionally compress the
content files being added.
It also supports compressing the Archive Table of Contents using a
compression algorithm registered with the Compression gem.
The ability to write to a new archive or an existing archive is
supported by the implementation.
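As a rough mental model of the mount/add/remove/commit flow described above, here is a toy in-memory class. It is purely illustrative and not the O3DE ArchiveWriter API; all names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Highly simplified stand-in for an archive writer: files keyed by path,
// with "commit" reduced to reporting the table-of-contents entry count.
class ToyArchiveWriter
{
public:
    void AddFile(const std::string& path, std::vector<std::uint8_t> content)
    {
        m_files[path] = std::move(content); // adding an existing path updates it
    }

    bool RemoveFile(const std::string& path)
    {
        return m_files.erase(path) != 0;
    }

    std::size_t Commit() const
    {
        return m_files.size(); // number of TOC entries that would be written
    }

private:
    std::map<std::string, std::vector<std::uint8_t>> m_files;
};
```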
Test Environment changes
Updated the GemTestEnvironment to support specifying gems as active in UnitTest
A new AddActiveGems function has been added that accepts a list of gem names; for each gem it queries the gem root location using the o3de manifest.json files and adds the gem root location to the Settings Registry and as a FileIO @gemroot:<gem-name>@ alias.
Path API changes
Adding AZ_PATH_ARG macro to make it easier to use AZ::IO::Path in format specifiers
AutomatedTesting Active Gem changes
Adding the Compression and Archive gems to the AutomatedTesting project
o3de CLI changes
The edit-engine/project-properties command now sorts its "gem_names" array field.
The edit-engine-properties and edit-project-properties commands have been updated to sort the gem names array inside of their respective manifest .json files for consistency.
Task System changes
Fixed deadlock when Submitting a TaskGraph with zero tasks and a TaskGraphEvent.
When the TaskGraphEvent was waited on using the TaskGraphEvent::Wait call, it would attempt to acquire a semaphore that would never be released, as there were no tasks that could complete to signal the semaphore.
How was this PR tested?
Unit tests have been added for the Archive Writer implementation.
More unit tests are on the way, as the current set of tests is nowhere near sufficient.
For the edit-engine-properties / edit-project-properties changes, the enable-gem command was used to update the AutomatedTesting project's list of active gems. As can be seen in the project.json change, the list of gems is sorted.
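The sorting behavior described can be sketched as follows. The real o3de CLI does this in Python over the manifest JSON; this C++ helper is purely illustrative of the normalization, with hypothetical names.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts a "gem_names"-style array so repeated edits produce a stable,
// diff-friendly ordering in project.json / engine.json.
inline std::vector<std::string> SortGemNames(std::vector<std::string> gemNames)
{
    std::sort(gemNames.begin(), gemNames.end());
    return gemNames;
}
```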