Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add product metadata check audio/video files #665

Merged
merged 3 commits into from
Jun 26, 2023
Merged

Add product metadata check audio/video files #665

merged 3 commits into from
Jun 26, 2023

Conversation

al-niessner
Copy link
Contributor

@al-niessner al-niessner commented Jun 21, 2023

🗒️ Summary

Added checks in FileReferenceValidationRule that if an A/V file then to check with its metadata that there is the correct video and/or audio track. Like using FFMPEG to find errors it will process any valid A/V file. It cannot really tell the difference between H.264 and H.265 or 266 or other supported encoding. Same applies with audio. The mp4parser library used for this work currently only supports MP4 H.264 and AAC. Not to say that will not change in the future.

⚙️ Test Data and/or Report

See unit tests (604 and 605 specifically)

♻️ Related Issues

Closes #664

Al Niessner added 2 commits June 20, 2023 17:08
Found a library that opes the MP4/M4A container, parsers it, and produces an ISO representation of the metadata in memory. Now browsing that to check if it can be verified that the encoding types are correct.
@sonatype-lift
Copy link
Contributor

sonatype-lift bot commented Jun 21, 2023

🛠 Lift Auto-fix

Some of the Lift findings in this PR can be automatically fixed. You can download and apply these changes in your local project directory of your branch to review the suggestions before committing.1

# Download the patch
curl https://lift.sonatype.com/api/patch/github.com/NASA-PDS/validate/665.diff -o lift-autofixes.diff

# Apply the patch with git
git apply lift-autofixes.diff

# Review the changes
git diff

Want it all in a single command? Open a terminal in your project's directory and copy and paste the following command:

curl https://lift.sonatype.com/api/patch/github.com/NASA-PDS/validate/665.diff | git apply

Once you're satisfied, commit and push your changes in your project.

Footnotes

  1. You can preview the patch by opening the patch URL in the browser.

@al-niessner
Copy link
Contributor Author

@jordanpadams @thareUSGS

Here is the solution that uses simple metadata. It checks that the library can find a video and audio track. Supposedly it only handles MP4 containers with H.264 video and will not return any information if the file cannot be parsed. If the library is more modern than suggested or it is updated to be more complete like FFMPEG it process files that it understands beyond what we currently want. Let me know if this is sufficient.

Tried to sneak an text file by it and it did catch it.

@jordanpadams
Copy link
Member

@al-niessner just to verify,

will not return any information if the file cannot be parsed.

we throw an error when that happens, correct?

@jordanpadams jordanpadams changed the title Issue 664: metadata check audio/video files Add product metadata check audio/video files Jun 23, 2023
@jordanpadams
Copy link
Member

actually @al-niessner it looks like we throw a WARNING for non-compliant files. If we are pretty sure when this is caught, the file is actually not compliant, I think this should be an ERROR.

@al-niessner
Copy link
Contributor Author

actually @al-niessner it looks like we throw a WARNING for non-compliant files. If we are pretty sure when this is caught, the file is actually not compliant, I think this should be an ERROR.

@jordanpadams

I followed the PNG check which throws a warning if it thinks it is not a PNG. Would you prefer an error?

@jordanpadams
Copy link
Member

@al-niessner It looks like this may have been confusion in requirements before. I did not realize we were actually checking the product binary content for JPEGs and PNGs. Those should also probably be ERRORs since we are actually verifying more than just the filename suffix (like we do for a lot of the other encoding standards).

thoughts?

@al-niessner
Copy link
Contributor Author

al-niessner commented Jun 23, 2023

@jordanpadams

I assumed they were warning because some "valid" images were being generated but the library we use did not like them. By "valid" end user tool glosses over errors and processes images anyway. Hence they were made warnings in validate.

If there is no reason other than random developer choice for them being warnings, then I agree that they should be errors. If they are for limited library reasons, then leave them warnings. Either way, they should all be consistent.

@jordanpadams
Copy link
Member

@al-niessner agreed. I would rather we throw errors and if they are inaccurate, then how library is not sufficient and we need to try something else.

Copy link
Member

@nutjob4life nutjob4life left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@al-niessner
Copy link
Contributor Author

@jordanpadams

warnings are now errors if you want to call this done

@jordanpadams jordanpadams merged commit 17a9a64 into main Jun 26, 2023
2 checks passed
@jordanpadams jordanpadams deleted the issue_664 branch June 26, 2023 21:36
jordanpadams added a commit that referenced this pull request Jun 26, 2023
issue #665 658: control PDF error dir
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement content validation for audio/video formats
3 participants