Modified DropOffUtil to put all file extensions in FILEXT #160

Merged

Conversation

dlmarion
Collaborator

Closes #159

@fbruton
Collaborator

fbruton commented Jun 23, 2021

In the case of multiple file extensions, should the last file extension be preserved as a separate value?

@dlmarion
Collaborator Author

That was not mentioned in the internal ticket.

@@ -1019,12 +1019,12 @@ public void processMetadata(final List<IBaseDataObject> payloadList) {
parentTypes.put("" + level, p.getFileType());

final String fn = p.getStringParameter("Original-Filename");
Collaborator

I might be misinterpreting the ticket intent, but I understand it as handling multiple values in the Original-Filename parameter, more like:

Suggested change
- final String fn = p.getStringParameter("Original-Filename");
+ if (p.hasParameter("Original-Filename")) {
+     List<Object> fileNames = p.getParameter("Original-Filename");
+     for (Object filename : fileNames) {
+         final String fn = (String) filename;

Collaborator Author

I re-read the issue and modified the code in dedf4c9

if (fext.length() > 0 && fext.length() <= this.maxFilextLen) {
p.setParameter("FILEXT", fext.toLowerCase());
p.setParameter("FILEXT", fext);
Collaborator

If the above comment re: multiple Original-Filename values is correct, this will need to change so the extensions from all Original-Filename values are preserved.

Collaborator Author

This has been addressed in dedf4c9

@drivenflywheel
Collaborator

> In the case of multiple file extensions, should the last file extension be preserved as a separate value?

That's a good question, and probably needs clarification. My first instinct is that this (last extension) would be more aligned with how the data might be searched for.

@jpdahlke jpdahlke added this to the v6.7.0 milestone Jun 25, 2021
@dlmarion
Collaborator Author

Re-reading the issue, it seems I may have misunderstood the original requirements. It seems "Original-Filename" may contain multiple file names; I was thinking the issue was with multiple file extensions. I'll rework.

@dlmarion
Collaborator Author

Note that there is a change in the preserved file extension in this change. Prior to this change the code used the last occurrence of `.` and this change uses the first. For a file `foo.tar.gz` the file extension prior to this change would have been `.gz` and this change makes it `tar.gz`.
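
To make the difference concrete, here is a minimal standalone sketch (not the DropOffUtil code) that splits a name at the first versus the last period. The class and method names are hypothetical and chosen only for illustration, and the returned extension omits the leading period.

```java
public class ExtensionSplitDemo {

    /** Returns everything after the FIRST '.', or "" if the name has no '.'. */
    public static String afterFirstDot(String fileName) {
        int pos = fileName.indexOf('.');
        return pos < 0 ? "" : fileName.substring(pos + 1);
    }

    /** Returns everything after the LAST '.', or "" if the name has no '.'. */
    public static String afterLastDot(String fileName) {
        int pos = fileName.lastIndexOf('.');
        return pos < 0 ? "" : fileName.substring(pos + 1);
    }

    public static void main(String[] args) {
        String name = "foo.tar.gz";
        System.out.println(afterFirstDot(name)); // tar.gz
        System.out.println(afterLastDot(name));  // gz
    }
}
```

Both helpers return an empty string when the period is the final character, so a caller that checks for a non-empty extension would skip such names.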

Collaborator

@drivenflywheel drivenflywheel left a comment

> Note that there is a change in the preserved file extension in this change. Prior to this change the code used the last occurrence of `.` and this change uses the first. For a file `foo.tar.gz` the file extension prior to this change would have been `.gz` and this change makes it `tar.gz`.

Verbal feedback I've gotten from the original requestor is that this should continue to extract only after the last occurrence of `.`. Same extraction as pre-PR, but now extracting the "last file extension" from each of the values stored under the "Original-filename" key
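
As a rough sketch of that behavior, assuming the file names arrive as a plain list of values: take the text after the last period of each name and keep it only if it is non-empty and within a maximum length. `FilextSketch`, `lastExtensions`, and `maxLen` are hypothetical names. The length check and lower-casing mirror the fragments visible in the diff, but whether the merged code lower-cases is an assumption here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FilextSketch {

    /**
     * Collects the lower-cased last extension of each file name, skipping
     * names with no extension or with an extension longer than maxLen.
     */
    public static List<String> lastExtensions(List<?> fileNames, int maxLen) {
        List<String> extensions = new ArrayList<>();
        for (Object value : fileNames) {
            String fn = String.valueOf(value);
            int pos = fn.lastIndexOf('.');
            if (pos < 0) {
                continue; // no period, so no extension
            }
            String fext = fn.substring(pos + 1);
            // An empty fext means the period was the last character.
            if (fext.length() > 0 && fext.length() <= maxLen) {
                extensions.add(fext.toLowerCase(Locale.ROOT));
            }
        }
        return extensions;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("report.PDF", "foo.tar.gz", "README", "trailing.");
        System.out.println(lastExtensions(names, 10)); // [pdf, gz]
    }
}
```

The caller would then store the resulting list under FILEXT; how that is done against the real data object is left out, since it depends on the project's own parameter API.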

@dlmarion
Collaborator Author

> Verbal feedback I've gotten from the original requestor is that this should continue to extract only after the last occurrence of `.`. Same extraction as pre-PR, but now extracting the "last file extension" from each of the values stored under the "Original-filename" key

Addressed in d80fff0

if (fext.length() > 0 && fext.length() <= this.maxFilextLen) {
p.setParameter("FILEXT", fext.toLowerCase());
if (p.hasParameter("Original-Filename")) {
final List<String> extensions = new ArrayList<>();
Collaborator

Recommend extracting lines 1022:1034 to a new method.

Collaborator Author

What's the rationale for the recommendation? Is this duplicated elsewhere? Were you looking for an associated test for this block of code?

Collaborator

Agreed that this method extraction would make the processMetadata method more coherent, but I don't consider that a blocker for the PR. We see virtually the same pattern immediately after this FILEXT processing (https://github.com/NationalSecurityAgency/emissary/pull/160/files#diff-5e7393b5cf28de1fa844df12051fecba96c6962dcbf4aaa3be6b7c233677e410R1040-R1060 would also benefit from a similar method extraction).

Collaborator

Just concerned about the overall length and complexity of processMetadata. I believe extracting this to a method will make it more readable.

Collaborator Author

Addressed in 78aa260

@jpdahlke jpdahlke removed this from the v7.1.0 milestone Sep 29, 2021
@drivenflywheel
Collaborator

drivenflywheel commented Oct 1, 2021

@dlmarion - can you rebase this onto the latest master at your convenience? (rebasing will remove most of the CodeQL security alerts)
@fbruton - can you give this a final look after it's been rebased?

No reason this little PR is still languishing.

@jpdahlke jpdahlke added this to the 7.2.0 milestone Oct 13, 2021
@dev-mlb dev-mlb merged commit 2a59ae5 into NationalSecurityAgency:master Nov 6, 2021
andrewbp pushed a commit to andrewbp/emissary that referenced this pull request Jul 5, 2022
…curityAgency#160)

* Modified DropOffUtil to put all file extensions in FILEXT

Closes NationalSecurityAgency#159

* Fixed assert message

* Guard against the position of the first period being the last character in the file name

* Modified DropOffUtil to add a file extension for each entry in Original-Filename

* Modified to only use last file extension when there are multiple extensions on a file

* Move file extension extraction code to its own method
Successfully merging this pull request may close these issues.

FILEXT should contain all file extensions, not just the last one.
5 participants