Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Block External Content from Zim Web Pages #1116

Merged
merged 1 commit into from
Jun 1, 2024

Conversation

ShaopengLin
Copy link
Collaborator

@ShaopengLin ShaopengLin commented May 23, 2024

Fix #904
Changes:

  • Added the toggle button to disable external content
  • Refresh all opened articles when the button is toggled.
  • Requests that do not start with "zim://" are blocked.

You can add this code to TabBar::openUrl to add an image, sourced online, to the top of each article.

connect(webView, &QWebEngineView::loadFinished, [=](){
        webView->page()->runJavaScript(
            "var img = document.createElement(\"img\");"
            "img.src = \"https://picsum.photos/200/300\";"
            "var body = document.body;"
            "body.insertBefore(img, body.firstChild);"
        );
    });

@ShaopengLin ShaopengLin force-pushed the Issue#904-block-non-zim-external-requests branch 2 times, most recently from 0558bd9 to f46e0df Compare May 23, 2024 08:04
@kelson42
Copy link
Collaborator

@veloman-yunkan A review please?

Copy link
Collaborator

@veloman-yunkan veloman-yunkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the history of this PR, I think that blocking of external resources must be introduced unconditionally in the first commit (so that it is immediately obvious how ExternReqInterceptor is going to be utilized) and then made optional in subsequent commit.

src/webview.h Outdated
{
Q_OBJECT
public:
ExternalReqInterceptor(bool isActive, QObject* parent=nullptr) : QWebEngineUrlRequestInterceptor(parent), m_isActive(isActive) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Make explicit
  2. Fit into line length of 80 chars

src/webview.h Outdated Show resolved Hide resolved
src/webview.h Outdated
Comment on lines 101 to 102
qDebug() << "Blocked external request to URL >"
<< reqUrl << "<";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a strange way of quoting the URL (as >URL<)

src/tabbar.cpp Outdated Show resolved Hide resolved
src/tabbar.cpp Outdated Show resolved Hide resolved
src/portutils.h Outdated
Comment on lines 39 to 47
#if QT_VERSION < QT_VERSION_CHECK(5, 13, 0) // Earlier than Qt 5.13
inline void setWebViewRequestInterceptor(WebView* webView, QWebEngineUrlRequestInterceptor* interceptor) {
webView->page()->profile()->setRequestInterceptor(interceptor);
}
#else // Qt 5.13 and later
inline void setWebViewRequestInterceptor(WebView* webView, QWebEngineUrlRequestInterceptor* interceptor) {
webView->page()->setUrlRequestInterceptor(interceptor);
}
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a non-portable functionality is used in a single place (and is not likely to be used somewhere else in the future) it doesn't need to go into portutils

src/webview.h Outdated
Comment on lines 93 to 91
virtual void interceptRequest(QWebEngineUrlRequestInfo &info) override
{
if (!m_isActive)
return;

const QString reqUrl = info.requestUrl().toString();
if (!reqUrl.startsWith("zim://"))
{
qDebug() << "Blocked external request to URL >"
<< reqUrl << "<";
info.block(true);
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the implementation into webview.cpp.

src/settingsmanager.cpp Outdated Show resolved Hide resolved
@veloman-yunkan
Copy link
Collaborator

You can add this code to TabBar::openUrl to add an image, sourced online, to the top of each article.

connect(webView, &QWebEngineView::loadFinished, [=](){
        webView->page()->runJavaScript(
            "var img = document.createElement(\"img\");"
            "img.src = \"https://picsum.photos/200/300\";"
            "var body = document.body;"
            "body.insertBefore(img, body.firstChild);"
        );
    });

I'd rather test using a ZIM file with references to external resources.

@ShaopengLin
Copy link
Collaborator Author

You can add this code to TabBar::openUrl to add an image, sourced online, to the top of each article.

connect(webView, &QWebEngineView::loadFinished, [=](){
        webView->page()->runJavaScript(
            "var img = document.createElement(\"img\");"
            "img.src = \"https://picsum.photos/200/300\";"
            "var body = document.body;"
            "body.insertBefore(img, body.firstChild);"
        );
    });

I'd rather test using a ZIM file with references to external resources.

Could you point me to one? I am not aware of how to filter Zims to find ones with external resources.

@veloman-yunkan
Copy link
Collaborator

I'd rather test using a ZIM file with references to external resources.

Could you point me to one? I am not aware of how to filter Zims to find ones with external resources.

I don't know of any such ZIM file off the top of my head and I don't even remember if I have seen one. As long as Kiwix is concerned a non-fully-self-contained ZIM file could only be created due to a bug in the scraper. @kelson42 and/or @rgaudin are more likely to point to such a file. If no such test data is available it can be created using zim-tools:

  1. Extract contents of a ZIM file using zimdump
  2. Modify some of its HTMLs
  3. Create a new ZIM file using zimwriterfs.

Or you can take https://github.com/openzim/zim-tools/blob/main/test/data/zimfiles/create_test_zimfiles as a starting point and modify its source data to meet your requirements.

@rgaudin
Copy link
Member

rgaudin commented May 29, 2024

We surely have some zimit files that includes this but I can't point to one in particular.

Best is probably as suggested to create a custom one. zimwriterfs might even be simpler for this.

@kelson42
Copy link
Collaborator

kelson42 commented May 29, 2024

Added the toggle button to disable external content
Refresh all opened articles when the button is toggled.

@ShaopengLin Sorry but I only see that properly now. I see no scenario where this option should allow external loading. Do you? I believe this should not be any option. Sorry I shoukd have seen this "detail" in the original feature request.

@ShaopengLin
Copy link
Collaborator Author

Added the toggle button to disable external content
Refresh all opened articles when the button is toggled.

@ShaopengLin Sorry but I only see that properly now. I see no scenario where this option should allow external loading. Do you? I believe this should not be any option. Sorry I shoukd have seen this "detail" in the original feature request.

I agree. we shouldn't really allow external loading in the first place. Zim files are supposed to be the 'offline' versions of websites after all and are not supposed to generate any network traffic.

@ShaopengLin ShaopengLin force-pushed the Issue#904-block-non-zim-external-requests branch from f46e0df to ce8aef3 Compare May 30, 2024 21:01
@ShaopengLin
Copy link
Collaborator Author

I used singleton for the class since it is more space-efficient.

Here is the test file, where the main page should try to load an image externally.
test_external.zip

src/tabbar.cpp Outdated Show resolved Hide resolved
@veloman-yunkan veloman-yunkan changed the title Added Setting to Block External Content from Zim Web Pages Block External Content from Zim Web Pages May 31, 2024
@veloman-yunkan
Copy link
Collaborator

veloman-yunkan commented May 31, 2024

The commit description better be changed too (in this case it will be quite appropriate to tell what you have achieved instead of how you did it).

@ShaopengLin ShaopengLin force-pushed the Issue#904-block-non-zim-external-requests branch from ce8aef3 to fcfa124 Compare June 1, 2024 06:01
@ShaopengLin ShaopengLin force-pushed the Issue#904-block-non-zim-external-requests branch from fcfa124 to 33582f3 Compare June 1, 2024 06:02
Webpages blocks all requests with Url not starting with "zim://", i.e. native to the Zim file.
@kelson42 kelson42 force-pushed the Issue#904-block-non-zim-external-requests branch from 33582f3 to f8d646d Compare June 1, 2024 12:47
@kelson42
Copy link
Collaborator

kelson42 commented Jun 1, 2024

CI broken because of #1121, but this has nothing to do with this PR. Therefore merging.

@kelson42 kelson42 merged commit 79efd58 into main Jun 1, 2024
1 of 4 checks passed
@kelson42 kelson42 deleted the Issue#904-block-non-zim-external-requests branch June 1, 2024 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature request: option to block all external elements
4 participants