Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduced MediaGalleryMetadata services with XMP support #1514

Merged
merged 12 commits into from
Jul 4, 2020

Conversation

sivaschenko
Copy link
Member

@sivaschenko sivaschenko commented Jun 23, 2020

Story: #1183

Intorudced the AddMetadataInterface and GetMetadataInterface services according to the technical design: https://github.com/magento/adobe-stock-integration/wiki/Technical-Design:-Images-metadata-processing
Added XMP read/write support.
Covered with integration tests.

Closes #1453
Closes #1450
Closes #1448
Closes #1447
Closes #1237

Copy link
Contributor

@gabrieldagama gabrieldagama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @sivaschenko, thanks for the pull request, it looks great! Please follow some comments below.

Comment on lines 57 to 73
if (!$fileHandle) {
throw new LocalizedException(__('Cannot open file'));
}

$data = $this->read($fileHandle, 2);

if ($data != self::MARKER_IMAGE_PREFIX . self::MARKER_IMAGE_FILE_START) {
fclose($fileHandle);
throw new LocalizedException(__('Not an image'));
}

$data = $this->read($fileHandle, 2);

if ($data[0] != self::MARKER_IMAGE_PREFIX) {
fclose($fileHandle);
throw new LocalizedException(__('File is corrupted'));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add these validations to a private method?

Comment on lines 77 to 78
while (($data[1] != self::MARKER_IMAGE_END) && (!feof($fileHandle))) {
if ((ord($data[1]) < 0xD0) || (ord($data[1]) > 0xD7)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it worth to wrap these conditions on methods to make the behavior explicit, especially because we are dealing with functions and operations that are not used that often (by us at least).

Comment on lines 79 to 90
$sizestr = $this->read($fileHandle, 2);

$decodedsize = unpack("nsize", $sizestr);

$segmentDataStartPosition = ftell($fileHandle);

$segmentData = $this->read($fileHandle, $decodedsize['size'] - 2);

$segments[] = $this->segmentFactory->create([
'name' => $this->segmentNames->getSegmentName(ord($data[1])),
'dataStart' => $segmentDataStartPosition,
'data' => $segmentData,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we wrap this to a private function? Something like createSegment($segmentName, $fileHandle)

Comment on lines 95 to 109
$compressedImage = '';
do {
$compressedImage .= $this->read($fileHandle, 1048576);
} while (!feof($fileHandle));

$endOfImageMarkerPosition = strpos($compressedImage, self::MARKER_IMAGE_PREFIX . self::MARKER_IMAGE_END);
$compressedImage = substr($compressedImage, 0, $endOfImageMarkerPosition);

fclose($fileHandle);

return $this->fileFactory->create([
'path' => $path,
'compressedImage' => $compressedImage,
'segments' => $segments,
]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar thing here, can we wrap this on private function? Maybe also add constant for values like 1048576 and 2, that are used on read method, mostly to explain what they are.

public function execute(FileDataObject $file): void
{
foreach ($file->getSegments() as $segment) {
if (strlen($segment->getData()) > 0xfffd) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use hexadecimal here, wouldn't decimal work as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hexidecimal looks much more cool

*/
class SegmentNames
{
private const SEGMENT_TYPE_TO_NAME = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to use hexadecimal values here? I guess it makes it harder to understand and when debugging the application.

fwrite($fileHandler, self::MARKER_IMAGE_PREFIX . self::MARKER_IMAGE_FILE_START);
foreach ($file->getSegments() as $segment) {
fwrite($fileHandler, self::MARKER_IMAGE_PREFIX . $this->segmentNames->getSegmentType($segment->getName()));
fwrite($fileHandler, pack("n", strlen($segment->getData()) + 2));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same thing with the 2 here, having it as constant would make it easier to understand and change in the future (if needed).

throw new LocalizedException(__('Could not open file.'));
}

fwrite($fileHandler, self::MARKER_IMAGE_PREFIX . self::MARKER_IMAGE_FILE_START);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For fwrite function, we can use Magento\Framework\Filesystem\DriverInterface::fileWrite()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we can use the driver for all file operations, thanks

$data = $this->read($fileHandle, 2);

if ($data != self::MARKER_IMAGE_PREFIX . self::MARKER_IMAGE_FILE_START) {
fclose($fileHandle);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For fclose function, we can use Magento\Framework\Filesystem\DriverInterface::fileClose()

@sivaschenko
Copy link
Member Author

@gabrieldagama @konarshankar07 thanks for your suggestions! I have added the approximate structure of the module, please take a look

@sivaschenko
Copy link
Member Author

@magento run all tests

@sivaschenko sivaschenko changed the title [WIP] Image metadata parsing prototype Introduced MediaGalleryMetadata services with XMP support Jun 29, 2020
@gabrieldagama
Copy link
Contributor

@magento run all tests

@sivaschenko
Copy link
Member Author

@magento run all tests

@sivaschenko sivaschenko marked this pull request as ready for review July 1, 2020 17:44
Copy link
Contributor

@gabrieldagama gabrieldagama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @sivaschenko thanks for the PR it looks awesome, well done. Please follow some suggestions below.

Comment on lines 48 to 56
foreach ($this->readerPool->get() as $reader) {
$data = $reader->execute($path);
$title = $data->getTitle() ?? $title;
$description = $data->getDescription() ?? $description;
$keywords = array_unique(array_merge($keywords, $data->getKeywords()));
if (!empty($title) && !empty($description) && !empty($keywords)) {
break;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally find this hard to understand, we are executing the reader from all formats on one file, each of those will return a MetadataInterface, after extracting the values and validating them we create a new MetadataInterface to return.

I would prefer to have a class that receives the path and returns the correct ExtractMetadataInterface for that file, also I think readerPool is misleading, I think the name extractMetadataPool would be more straight forward, I got confused and thought the readerPool was a pool of FileReader classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning the correct ExtractMetadataInterface will:

  1. Expose internal implementation
  2. Make the client implementation more complex
  3. Will not allow using multiple ExtractMetadataInterface implementation to process a single image from different angles)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aggree with extractMetadataPool. Btw, as for the naming, do you think we should use extractMetadata term or getMetadata term for all classes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think extractMetadata makes sense, we are actually doing more than just getting it, I mean, usually gets are simple functions, and here we are a more complex functionality.

Comment on lines 58 to 64
if (!$this->fileReader->isApplicable($path)) {
return $this->metadataFactory->create([
'title' => $title,
'description' => $description,
'keywords' => $keywords
]);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of returning an empty MetaDataInterface wouldn't be better to throw an exception and on your client, you just handle that and continue?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there are any differences for us if the metadata fields are empty or metadata is missing. To have only one flow of processing this situation I decided to always return a MetadataInterface. The MetadataInterface contains the information that the reader was able to retrieve.

Also, this approach provides more flexibility to extensions using plugins

{
$xmp = substr($segment->getData(), 13);

if (substr($xmp, -257, 3) !== "\x01\xFF\xFE" || substr($xmp, -4) !== "\x03\x02\x01\x00") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth adding a comment why this condition means corrupt data

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will work on improving exception messages and exception handling in general

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduced constants with comments

/**
* @inheritdoc
*/
public function execute(string $path): MetadataInterface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the ExtractMetadata classes seem to be doing exactly the same thing, to avoid code duplication wouldn't be better to use virtual type or have a builder for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still not exactly sure the code for each extract metadata implementation is going to be exactly the same, but yes that's something I was also thinking about, probably worth going for it, thanks!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added virtual types

@sivaschenko
Copy link
Member Author

@magento run all tests

@sivaschenko
Copy link
Member Author

@sivaschenko
Copy link
Member Author

@magento run all tests

gabrieldagama
gabrieldagama previously approved these changes Jul 2, 2020
@sivaschenko
Copy link
Member Author

@sivaschenko
Copy link
Member Author

@magento run all tests

@sivaschenko
Copy link
Member Author

@sivaschenko
Copy link
Member Author

@magento run Integration Tests

@sivaschenko
Copy link
Member Author

@sivaschenko
Copy link
Member Author

@magento run Integration Tests

@sivaschenko sivaschenko merged commit 6d29819 into magento:2.0-develop Jul 4, 2020
@ghost
Copy link

ghost commented Jul 4, 2020

Hi @sivaschenko, thank you for your contribution!
Please, complete Contribution Survey, it will take less than a minute.
Your feedback will help us to improve contribution process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants