Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clipper: Increase protection of files downloaded by webclipper and cloudRestrict download media file #9676

Merged
merged 14 commits into from Jan 22, 2024

Conversation

pedr
Copy link
Collaborator

@pedr pedr commented Jan 7, 2024

I'm changing the current behavior from downloadMediaFile from a invalidProtocols list to an allowedProtocols list, which have http, https, data and file as the default values.

I also added more automated tests to the downloadMediaFile function and refactored the code to be more readable, basically grouping things together in functions.

I think the best way to read this PR is by looking at each individual commit.

@pedr pedr added the clipper Anything that concerns the Web Clipper label Jan 7, 2024
@pedr pedr requested a review from laurent22 January 7, 2024 01:47
@pedr pedr self-assigned this Jan 7, 2024
Copy link
Owner

@laurent22 laurent22 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Pedro, that looks good. I've just left a few comments regarding the test units.

Also please could you fix the pull request title?

@@ -78,4 +78,44 @@ describe('routes/notes', () => {

expect(response).toBe('');
});

test('should not copy content from invalid protocls', async () => {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protocls => protocols

Comment on lines 84 to 89
const shimFsDriverCopySpy = jest.fn();
jest.spyOn(shim, 'fsDriver').mockImplementation(() => {
return {
copy: shimFsDriverCopySpy,
} as any;
});
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if we avoid using spy as much as possible because it tests implementation details, and the more we do this the harder it becomes to refactor the code. For example if we start using fs-extra instead of fsDriver this test would be broken but there's no reason for that to happen since the input/output of the function hasn't changed.

In that particular case, isn't it possible to verify that the function simply did nothing (didn't download any media file)?

Comment on lines 101 to 107
jest.spyOn(shim, 'fetchBlob').mockImplementation(() => {
return {
headers: {
'content-type': 'image/jpg',
},
};
});
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it makes sense to mock fetchBlob to avoid http requests in tests, but for the other mocks I'm not sure, I think they can be avoided. It's nice to know in advance the UUID but again if we use another uuid lib, this test will be broken. Maybe you can check what's in the directory before and after to find what you need.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is trickier to do without spyOn.

What I'm testing is when the image type can't be found by the URL.

We fetch the image and save it on filesystem with fetchBlob function, since we don't want a HTTP request in the test I'm using a mock for it, so no file gets created. Then the code will call tryToGuessExtFromMimeType, since the URLs that I used for the test don't have an extension, which will try to move the file that was never created (that is why I have a mock for shim.fsDriver.move)

In the end we can't even look at the temp directory to see if the test worked because no files has ever been created in this test. Maybe it would be best to remove this test altogether?

@laurent22
Copy link
Owner

laurent22 commented Jan 8, 2024

@pedr, another issues with spy is that it seems they are not cleared properly after the test, even though jest.resetAllMocks() is in beforeEach. After adding database setup to the tests, they started failing I think due to uuid.create not being restored properly (some object that should have an ID didn't have one anymore). I could only get them to work again using mockRestore, like so:

const fetchBlobSpy = jest.fn();
const spy1 = jest.spyOn(shim, 'fetchBlob').mockImplementation(fetchBlobSpy);
const spy2 = jest.spyOn(uuid, 'create').mockReturnValue('mocked_uuid_value');

const response = await downloadMediaFile(url);

expect(response.endsWith('mocked_uuid_value.png')).toBe(true);
expect(fetchBlobSpy).toBeCalledTimes(1);

spy1.mockRestore();
spy2.mockRestore();

I don't understand why because you've added jest.resetAllMocks() but anyway to be safe let's add mockRestore within the test itself. Or better, let's avoid spy methods when possible

@pedr
Copy link
Collaborator Author

pedr commented Jan 9, 2024

I don't understand why because you've added jest.resetAllMocks() but anyway to be safe let's add mockRestore within the test itself. Or better, let's avoid spy methods when possible

I didn't notice this problem. This was probably one of the reasons that I choose to use mock so much, since I couldn't use the fsDriver() function (I had mocked before, but it wasn't being reset).

I rewrote some tests to remove some mocks and use the Setting.value('tempDir') to check for the file creation.

jest.spyOn(shim, 'fetchBlob').mockImplementation(fetchBlobSpy);
jest.spyOn(uuid, 'create').mockReturnValue('mocked_uuid_value');
const spy1 = jest.spyOn(shim, 'fetchBlob').mockImplementation(fetchBlobSpy);
const spy2 = jest.spyOn(uuid, 'create').mockReturnValue('mocked_uuid_value');
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's necessary to mock uuid.create - you can simply check that "some file" exists. You can also check the file content to verify it's the right file. fetchBlob makes more sense

Comment on lines 106 to 111
const shimFsDriverMoveSpy = jest.fn();
const spy2 = jest.spyOn(shim, 'fsDriver').mockImplementation(() => {
return {
move: shimFsDriverMoveSpy,
} as any;
});
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spy shouldn't be needed since you're mocking fetchBlob and can make it create real files?

@pedr
Copy link
Collaborator Author

pedr commented Jan 10, 2024

Hi, I think now it is closer to your vision. At first, I thought that creating a file inside fetchBlob would be weird for the test, but since it is pretty simple and makes the test more meaningful I think it is the correct choice.

Ready for review again.

@laurent22 laurent22 merged commit d4d4002 into laurent22:dev Jan 22, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clipper Anything that concerns the Web Clipper
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants