Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] replace embedded HTML by data_src #882

Merged
merged 9 commits into from
Feb 18, 2022
Merged

Conversation

return42
Copy link
Member

@return42 return42 commented Feb 7, 2022

What does this PR do?

[mod] result_templates/videos.html replace embedded HTML by data_src
[mod] result_templates/default.html replace embedded HTML by data_src

Embedded HTML breaks SearXNG architecture. To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src', an URL for embedded content (<iframe>).

[mod] hostname_replace: replace hostnames in result's data_src param

How to test this PR locally?

The (iframe & audio) templates of the following engines has been touched (oscar & simple theme)

  • searx/engines/dailymotion.py
  • searx/engines/invidious.py
  • searx/engines/peertube.py --> Peertube URL is offline #881
  • searx/engines/sepiasearch.py
  • searx/engines/vimeo.py
  • searx/engines/youtube_api.py
  • searx/engines/bandcamp.py
  • searx/engines/deezer.py
  • searx/engines/freesound.py
  • searx/engines/mixcloud.py
  • searx/engines/soundcloud.py

To test you need to redirect embedded videos (e.g.) from youtube to a invidios
instance. Search for videos using engine !youtube lebowski. The result URLs
and the embedded videos should link to the invidios instance.

Here is an example of such a hostname_replace configuration::

    hostname_replace:

      # youtube --> Invidious

      '(.*\.)?youtube-nocookie\.com': 'invidio.xamh.de'
      '(.*\.)?youtube\.com$': 'invidio.xamh.de'
      '(.*\.)?invidious\.snopyta\.org$': 'invidio.xamh.de'
      '(.*\.)?vid\.puffyan\.us': 'invidio.xamh.de'
      '(.*\.)?invidious\.kavin\.rocks$': 'invidio.xamh.de'
      '(.*\.)?inv\.riverside\.rocks$': 'invidio.xamh.de'

Related issues

Closes: #873

@return42 return42 assigned dalf, unixfox and mrpaulblack and unassigned dalf, unixfox and mrpaulblack Feb 7, 2022
@return42 return42 force-pushed the fix-873 branch 2 times, most recently from ca40537 to 6df5295 Compare February 7, 2022 21:40
@return42 return42 changed the title replace embedded HTML by data_src [fix] replace embedded HTML by data_src Feb 8, 2022
@dalf
Copy link
Member

dalf commented Feb 12, 2022

To modularize, HTML is generated in the templates (oscar & simple)

💯 👍

result parameter 'embedded' is replaced by 'data_src', an URL for embedded content (<iframe>).

IMO: embedded is clearer than data_src. Why choose data_src?


The vertical video are fine:
image


There are issues with the height of the audio media:

In the case of audio, there is a blank space at the bottom of the iframe:
image

In some other cases the iframe is too small:
image

Possible solution: the engine returns an additional field named embedded_height.

Copy link
Member

@dalf dalf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comments

searx/engines/vimeo.py Outdated Show resolved Hide resolved
searx/plugins/hostname_replace.py Outdated Show resolved Hide resolved
searx/plugins/hostname_replace.py Outdated Show resolved Hide resolved
searx/templates/oscar/result_templates/default.html Outdated Show resolved Hide resolved
searx/templates/oscar/result_templates/videos.html Outdated Show resolved Hide resolved
searx/templates/simple/result_templates/default.html Outdated Show resolved Hide resolved
@return42
Copy link
Member Author

return42 commented Feb 13, 2022

result parameter 'embedded' is replaced by 'data_src', an URL for embedded content (<iframe>).

IMO: embedded is clearer than data_src. Why choose data_src?

IMO embedded is not clear, at least when we look at the default.html template ...

{% if result.data_src -%}
<div id="result-media-{{ index }}" class="embedded-content invisible">
  <iframe data-src="{{result.data_src}}" frameborder="0" allowfullscreen></iframe>
</div>
{%- endif %}
{% if result.audio_src -%}
<div id="result-media-{{ index }}" class="audio-control">
  <audio controls><source src="{{result.audio_src}}"></audio>
</div>
{%- endif %}

... where we have an embedded <iframe> and an embedded audio control.

That's why I was looking for a name that better fits to the template. I don't know if data_src and audio_src are good names. If you have better suggestions, we can change the names, but whatever names we choose they should point out their usage in the templates. And we should consider that we might have other embedded content in the future, content we do not think about today. All in all I think embedded as a name is to general.

@dalf
Copy link
Member

dalf commented Feb 13, 2022

what about iframe_src and audio_src?
or embedded_iframe and embedded_audio?

data-src is already used to lazy load images:

<img src="" data-src="{{ image_proxify(result.img_src) }}" alt="{{ result.title|striptags }}">

Embedded HTML breaks SearXNG architecture.  To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src', an URL for embedded content (<iframe>).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
To test you need to redirect embeded videos (e.g.) from youtube to a invidios
instance.  Search for videos using engine `!youtube lebowski`.  The result URLs
and the embeded videos should link to the invidios instance.

Here is an example of such a `hostname_replace` configuration::

    hostname_replace:

      # youtube --> Invidious

      '(.*\.)?youtube-nocookie\.com': 'invidio.xamh.de'
      '(.*\.)?youtube\.com$': 'invidio.xamh.de'
      '(.*\.)?invidious\.snopyta\.org$': 'invidio.xamh.de'
      '(.*\.)?vid\.puffyan\.us': 'invidio.xamh.de'
      '(.*\.)?invidious\.kavin\.rocks$': 'invidio.xamh.de'
      '(.*\.)?inv\.riverside\.rocks$': 'invidio.xamh.de'

Closes: searxng#873
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
… audio_src

Embedded HTML breaks SearXNG architecture.  To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src' (and 'audio_src'), an URL for embedded content (<iframe>).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
return42 added a commit to return42/searxng that referenced this pull request Feb 13, 2022
Suggested-by: @dalf searxng#882 (comment)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
return42 added a commit to return42/searxng that referenced this pull request Feb 13, 2022
Suggested-by: @dalf searxng#882 (comment)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
return42 added a commit to return42/searxng that referenced this pull request Feb 13, 2022
Rename result field data_src to iframe_src

Suggested-by: @dalf searxng#882 (comment)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
@return42
Copy link
Member Author

what about iframe_src and audio_src?

I renamed data_src to iframe_src in 9c46856

Possible solution: the engine returns an additional field named embedded_height.

I don't want the layout hard coded by the engines / therefore I added 1d72ada


@dalf: thanks for your review and helping comments .. I fixed all your requests or leaved a comment at least.

PR is now ready for the next review cycle ..

@return42 return42 requested a review from dalf February 13, 2022 15:30
return42 added a commit to return42/searxng that referenced this pull request Feb 16, 2022
Rename result field data_src to iframe_src

Suggested-by: @dalf searxng#882 (comment)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
return42 added a commit to return42/searxng that referenced this pull request Feb 16, 2022
Not all configurable aspects can be compiled into `css/searxng.min.css`, some
kind of custom-layer to CSS is needed.  One example is discussed in [1], the
layout of embedded content like audio or movie players.  In the
`css/searxng.min.css` we can layout the players SearXNG knows, but what when
some SearXNG admin wants to replace/redirect embedded content to a different
player with a different layout by using the hostname-replace plugin?

This is where css/custom.css comes into play, in which the admin is able to set
CSS layout for such a different player (URL/hostname) that is unknown to the
SearXNG project.

[1] searxng#882 (review)

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Copy link
Member

@dalf dalf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After few days rereading the code, I end up to think there is no need to for a custom.less file (for now).

The code inside this file can simplified, see the comments.

After that, I think the PR is ready.

searx/webapp.py Outdated Show resolved Hide resolved
@@ -1,4 +1,4 @@
iframe.soundcloud\.com {
iframe.w\.soundcloud\.com {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The host names and other CSS class names are overlapping. While the chance there is an actual overlap is small, it would be better to avoid to take the risk.

It is possible to use CSS attribute selector: https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors

iframe[src^="https://w.soundcloud.com"] {
  height: 120px;
}

iframe[src^="https://www.deezer.com"] {
  // The real size is 92px, but 94px are needed to avoid an inner scrollbar of
  // the embedded HTML.
  height: 94px;
}

iframe[src^="https://www.mixcloud.com"] {
  // the embedded player from mixcloud has some quirks: initial there is an
  // issue with an image URL that is blocked since it is an a Cross-Origin
  // request. The alternative text (<img alt='Mixcloud Logo'> then cause an
  // scrollbar in the inner of the iframe we can't avoid.  Another quirk comes
  // when pressing the play button, somtimes the shown player has an height of
  // 200px, somtimes 250px.
  height: 250px;
}

So there is no need

  • to add kwargs['urlparse'] = urllib.parse.urlparse
  • to call urlparse from video.html and default.html (iframe and audio element).

( changes done in commit 1d72ada )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, implemented in 8acf540

result[parsed] = result[parsed]._replace(netloc=pattern.sub(replacement, result[parsed].netloc))
result['url'] = urlunparse(result[parsed])

if not replacement:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it is okay.
However, some engines return YouTube videos (like qwant, google video, bing video ?)
Currently the iframe_src attribute is not set for these engines.
If in the future, this is attribute is set , this condition is going to prevent the hostname replacement in the iframe_src attribute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote this commit, see 795e8af

searx/plugins/hostname_replace.py Outdated Show resolved Hide resolved
searx/static/themes/simple/gruntfile.js Outdated Show resolved Hide resolved
This is a rewrite of the hostname_replace.py that:

- don't stop to replace URL in fields ('data_src', 'audio_src') if there isn't a
  'parsed_url',
- adds a comment about keep or remove a result from the result list
- adds a loop over ['data_src', 'audio_src'] instead of doubling code lines

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Suggested-by: @dalf searxng#882 (comment)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Suggested-by: @dalf searxng#882 (comment)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Rename result field data_src to iframe_src

Suggested-by: @dalf searxng#882 (comment)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
This commit sets appropriate height of the (embedded) player from:

- soundcloud
- mixcloud
- deezer

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
@return42
Copy link
Member Author

The code inside this file can simplified, see the comments.
After that, I think the PR is ready.

@dalf I dropped several commits and rewrote others, all I changed is documented in the answers to your comments above.

@return42 return42 requested a review from dalf February 18, 2022 18:25
Copy link
Member

@dalf dalf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this contribution.
It fixes an long standing issue.


In addition to the test described, I have check this configuration:

hostname_replace:
  '(.*\.)?youtube-nocookie\.com': false

👍 it works

@dalf dalf merged commit bf2a2ed into searxng:master Feb 18, 2022
@return42 return42 deleted the fix-873 branch February 18, 2022 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Embeded HTML breaks SearXNG architecture (e.g. hostname_replace)
4 participants